Blocking Bad Bots
amber222
Graduate

Joined: 07/05/2004 21:13:07
Messages: 586

Here is some info I found on the Internet to help stop bad bots from indexing the Guestbook, harvesting emails and using up bandwidth.

On this forum, someone has posted their code for a "PHP spider trap". It uses robots.txt, .htaccess and getout.php:
http://www.webmasterworld.com/forum88/3104.htm
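
For anyone wondering how a spider trap works in general, here is a rough Python sketch of the idea. It is not the code from the linked thread. The trap directory name, the function names and the sample IP addresses are all made up for illustration, and the generated lines use the older Apache 2.2 style "Deny from" syntax. The idea: robots.txt tells bots to stay out of a directory, well-behaved bots obey, and any visitor that goes in anyway gets its IP added to a deny list.

```python
import ipaddress

# Hypothetical trap directory; robots.txt asks every bot to stay out of it.
TRAP_PATH = "/trap/"
ROBOTS_TXT = "User-agent: *\nDisallow: " + TRAP_PATH + "\n"


def is_trap_hit(request_path):
    """Return True when the requested path is inside the trap directory."""
    return request_path.startswith(TRAP_PATH)


def handle_request(request_path, ip, banned):
    """Ban the visitor's IP if it entered the trap.

    Returns True when the request should be refused.
    """
    if is_trap_hit(request_path):
        banned.add(ip)
    return ip in banned


def htaccess_block(banned):
    """Build Apache 2.2 style deny lines for every banned IP."""
    lines = ["Order Allow,Deny", "Allow from all"]
    for ip in sorted(banned):
        # ip_address() raises ValueError on anything that is not a valid IP,
        # so junk never ends up in the generated lines.
        lines.append("Deny from " + str(ipaddress.ip_address(ip)))
    return "\n".join(lines)


if __name__ == "__main__":
    banned = set()
    handle_request("/index.html", "192.0.2.10", banned)        # normal visitor
    handle_request("/trap/getout.php", "203.0.113.5", banned)  # ignored robots.txt
    print(htaccess_block(banned))
```

In a real setup the trap script would append the deny line to .htaccess itself; the sketch only shows the decision and the text that would be written.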

The Perl version of the above script, named "trap.cgi":
http://www.webmasterworld.com/forum13/1823.htm

This is a slick article with instructions for using mod_rewrite. I don't understand the concept or know if it's possible under most hosts, but maybe some of the experts here can translate it for us:
http://diveintomark.org/archives/2003/02/26/how_to_block_spambots_ban_spybots_and_tell_unwanted_robots_to_go_to_hell
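
As far as I can tell, the mod_rewrite approach boils down to this: Apache checks the User-Agent header of each request against a list of patterns, and if any one matches, it answers with "403 Forbidden" instead of the page. Below is a small Python sketch that writes out rules of that shape from a list of bot names. It is not taken from the article. The function name is made up, the two bot names are only examples, and it assumes the names contain nothing but letters, digits and spaces. Whether your host allows mod_rewrite in .htaccess is something you would have to check with them.

```python
def rewrite_rules(agents):
    """Build mod_rewrite lines that refuse requests from the given user agents.

    Assumes each name contains only letters, digits and spaces, so spaces are
    the only characters that need escaping in the pattern.
    """
    if not agents:
        return ""
    lines = ["RewriteEngine On"]
    last = len(agents) - 1
    for i, agent in enumerate(agents):
        # Every condition except the last carries OR, so a match on any one
        # name is enough. NC makes the match case-insensitive.
        flags = "[NC]" if i == last else "[NC,OR]"
        pattern = "^" + agent.replace(" ", "\\ ")
        lines.append("RewriteCond %{HTTP_USER_AGENT} " + pattern + " " + flags)
    # "-" means no substitution; F sends 403 Forbidden, L stops rule processing.
    lines.append("RewriteRule .* - [F,L]")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example names only; substitute the ones from a maintained list.
    print(rewrite_rules(["EmailSiphon", "EmailCollector"]))
```

The detail that is easy to get wrong by hand is the last condition: it must not have the OR flag, otherwise the rule does not behave as intended.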

A sample .htaccess spider-blocking script (using mod_rewrite) with a long list of bots already added:
http://techpatterns.com/downloads/scripts/sample_wbmw.txt

A robots.txt tutorial with lists of spambots, harvesters and bots searching for plagiarism:
http://www.clockwatchers.com/robots_list.html
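
If you want to check what a robots.txt file actually permits before uploading it, Python's standard urllib.robotparser module can read the rules and answer "may this bot fetch this URL?". The sketch below is only an illustration: the robots.txt content, the /guestbook/ path and the bot names are assumptions, not taken from the tutorial. Keep in mind that robots.txt is only a request, so it stops polite bots and not the ones that ignore it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut one harvester out completely and keep
# every other bot out of the guestbook.
SAMPLE_ROBOTS_TXT = """\
User-agent: EmailCollector
Disallow: /

User-agent: *
Disallow: /guestbook/
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    parser = build_parser(SAMPLE_ROBOTS_TXT)
    for agent, url in [
        ("EmailCollector", "/index.html"),
        ("Googlebot", "/index.html"),
        ("Googlebot", "/guestbook/sign.html"),
    ]:
        verdict = "allowed" if parser.can_fetch(agent, url) else "blocked"
        print(agent, url, verdict)
```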

A ready-made robots.txt file that can be downloaded from phpbbhacks.com. It can be used for any site, not just phpBB:
http://www.phpbbhacks.com/download/3182
 