Thread: Block Robots and Web Downloaders with robots.txt

  1. #1
    Will.Spencer is offline Retired
    Join Date
    Dec 2008
    Posts
    5,033
    Blog Entries
    1
    Thanks
    1,010
    Thanked 2,329 Times in 1,259 Posts

    Block Robots and Web Downloaders with robots.txt

    robots.txt is of limited use because badly behaved robots simply ignore it. Still, some robots do honor the directives in robots.txt, so it saves some bandwidth and annoyance.

    I am pretty severe with my robots.txt. Here are all of the User-agents I am currently denying:
    Code:
    User-agent: OutfoxBot/0.5
    User-agent: complex_network_group
    User-agent: Alexibot
    User-agent: Aqua_Products
    User-agent: BackDoorBot
    User-agent: BackDoorBot/1.0
    User-agent: BPImageWalker/2.0
    User-agent: Black.Hole
    User-agent: BlackWidow
    User-agent: BlowFish
    User-agent: BlowFish/1.0
    User-agent: Bookmark search tool
    User-agent: Bot mailto:craftbot@yahoo.com
    User-agent: BotALot
    User-agent: BotRightHere
    User-agent: BuiltBotTough
    User-agent: Bullseye
    User-agent: Bullseye/1.0
    User-agent: BunnySlippers
    User-agent: Cegbfeieh
    User-agent: CheeseBot
    User-agent: CherryPicker
    User-agent: CherryPickerElite/1.0
    User-agent: CherryPickerSE/1.0
    User-agent: ChinaClaw
    User-agent: Copernic
    User-agent: CopyRightCheck
    User-agent: Crescent
    User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
    User-agent: Custo
    User-agent: DISCo
    User-agent: DISCo Pump 3.0
    User-agent: DISCo Pump 3.2
    User-agent: DISCoFinder
    User-agent: DittoSpyder
    User-agent: Download Demon
    User-agent: Download Demon/3.2.0.8
    User-agent: Download Demon/3.5.0.11
    User-agent: EirGrabber
    User-agent: EmailCollector
    User-agent: EmailSiphon
    User-agent: EmailWolf
    User-agent: EroCrawler
    User-agent: Express WebPictures
    User-agent: Express WebPictures (www.express-soft.com)
    User-agent: ExtractorPro
    User-agent: EyeNetIE
    User-agent: FairAd Client
    User-agent: Flaming AttackBot
    User-agent: FlashGet
    User-agent: FlashGet WebWasher 3.2
    User-agent: Foobot
    User-agent: FrontPage
    User-agent: Gaisbot
    User-agent: GetRight
    User-agent: GetRight/2.11
    User-agent: GetRight/3.1
    User-agent: GetRight/3.2
    User-agent: GetRight/3.3
    User-agent: GetRight/3.3.3
    User-agent: GetRight/3.3.4
    User-agent: GetRight/4.0.0
    User-agent: GetRight/4.1.0
    User-agent: GetRight/4.1.1
    User-agent: GetRight/4.1.2
    User-agent: GetRight/4.2
    User-agent: GetRight/4.2b (Portuguxeas)
    User-agent: GetRight/4.2c
    User-agent: GetRight/4.3
    User-agent: GetRight/4.5
    User-agent: GetRight/4.5a
    User-agent: GetRight/4.5b
    User-agent: GetRight/4.5b1
    User-agent: GetRight/4.5b2
    User-agent: GetRight/4.5b3
    User-agent: GetRight/4.5b6
    User-agent: GetRight/4.5b7
    User-agent: GetRight/4.5c
    User-agent: GetRight/4.5d
    User-agent: GetRight/4.5e
    User-agent: GetRight/5.0beta1
    User-agent: GetRight/5.0beta2
    User-agent: GetWeb!
    User-agent: Go!Zilla
    User-agent: Go!Zilla (www.gozilla.com)
    User-agent: Go!Zilla 3.3 (www.gozilla.com)
    User-agent: Go!Zilla 3.5 (www.gozilla.com)
    User-agent: Go-Ahead-Got-It
    User-agent: GrabNet
    User-agent: Grafula
    User-agent: HMView
    User-agent: HTTrack
    User-agent: HTTrack 3.0
    User-agent: HTTrack 3.0x
    User-agent: Harvest
    User-agent: Harvest/1.5
    User-agent: Image Stripper
    User-agent: Image Sucker
    User-agent: Indy Library
    User-agent: InfoNaviRobot
    User-agent: InterGET
    User-agent: Internet Ninja
    User-agent: Internet Ninja 4.0
    User-agent: Internet Ninja 5.0
    User-agent: Internet Ninja 6.0
    User-agent: Iron33/1.0.2
    User-agent: JOC Web Spider
    User-agent: JennyBot
    User-agent: JetCar
    User-agent: Kenjin Spider
    User-agent: Kenjin.Spider
    User-agent: Keyword Density/0.9
    User-agent: Keyword.Density
    User-agent: LNSpiderguy
    User-agent: LeechFTP
    User-agent: LexiBot
    User-agent: LinkScan/8.1a Unix
    User-agent: LinkScan/8.1a.Unix
    User-agent: LinkWalker
    User-agent: LinkextractorPro
    User-agent: MIDown tool
    User-agent: MIIxpc
    User-agent: MIIxpc/4.2
    User-agent: MSIECrawler
    User-agent: Mass Downloader
    User-agent: Mass Downloader/2.2
    User-agent: Mata Hari
    User-agent: Mata.Hari
    User-agent: Microsoft URL Control
    User-agent: Microsoft URL Control - 5.01.4511
    User-agent: Microsoft URL Control - 6.00.8169
    User-agent: Microsoft.URL
    User-agent: Mister PiX
    User-agent: Mister PiX version.dll
    User-agent: Mister Pix II 2.01
    User-agent: Mister Pix II 2.02a
    User-agent: Mister.PiX
    User-agent: NICErsPRO
    User-agent: NPBot
    User-agent: NPbot
    User-agent: Navroad
    User-agent: NearSite
    User-agent: Net Vampire
    User-agent: Net Vampire/3.0
    User-agent: NetAnts
    User-agent: NetAnts/1.10
    User-agent: NetAnts/1.23
    User-agent: NetAnts/1.24
    User-agent: NetAnts/1.25
    User-agent: NetMechanic
    User-agent: NetSpider
    User-agent: NetZIP
    User-agent: NetZip Downloader 1.0 Win32(Nov 12 1998)
    User-agent: NetZip-Downloader/1.0.62 (Win32; Dec 7 1998)
    User-agent: NetZippy+(http:/www.innerprise.net/usp-spider.asp)
    User-agent: Octopus
    User-agent: Offline Explorer
    User-agent: Offline Explorer/1.2
    User-agent: Offline Explorer/1.4
    User-agent: Offline Explorer/1.6
    User-agent: Offline Explorer/1.7
    User-agent: Offline Explorer/1.9
    User-agent: Offline Explorer/2.0
    User-agent: Offline Explorer/2.1
    User-agent: Offline Explorer/2.3
    User-agent: Offline Explorer/2.4
    User-agent: Offline Explorer/2.5
    User-agent: Offline Navigator
    User-agent: Offline.Explorer
    User-agent: Openbot
    User-agent: Openfind
    User-agent: Openfind data gatherer
    User-agent: Oracle Ultra Search
    User-agent: PageGrabber
    User-agent: Papa Foto
    User-agent: PerMan
    User-agent: ProPowerBot/2.14
    User-agent: ProWebWalker
    User-agent: Python-urllib
    User-agent: QueryN Metasearch
    User-agent: QueryN.Metasearch
    User-agent: RMA
    User-agent: Radiation Retriever 1.1
    User-agent: ReGet
    User-agent: RealDownload
    User-agent: RealDownload/4.0.0.40
    User-agent: RealDownload/4.0.0.41
    User-agent: RealDownload/4.0.0.42
    User-agent: RepoMonkey
    User-agent: RepoMonkey Bait & Tackle/v1.01
    User-agent: SiteSnagger
    User-agent: SlySearch
    User-agent: SmartDownload
    User-agent: SmartDownload/1.2.76 (Win32; Apr 1 1999)
    User-agent: SmartDownload/1.2.77 (Win32; Aug 17 1999)
    User-agent: SmartDownload/1.2.77 (Win32; Feb 1 2000)
    User-agent: SmartDownload/1.2.77 (Win32; Jun 19 2001)
    User-agent: SpankBot
    User-agent: Sqworm/2.9.85-BETA (beta_release; 20011115-775; i686-pc-linux
    User-agent: SuperBot
    User-agent: SuperBot/3.0 (Win32)
    User-agent: SuperBot/3.1 (Win32)
    User-agent: SuperHTTP
    User-agent: SuperHTTP/1.0
    User-agent: Surfbot
    User-agent: Szukacz/1.4
    User-agent: Teleport
    User-agent: Teleport Pro
    User-agent: Teleport Pro/1.29
    User-agent: Teleport Pro/1.29.1590
    User-agent: Teleport Pro/1.29.1634
    User-agent: Teleport Pro/1.29.1718
    User-agent: Teleport Pro/1.29.1820
    User-agent: Teleport Pro/1.29.1847
    User-agent: TeleportPro
    User-agent: Telesoft
    User-agent: The Intraformant
    User-agent: The.Intraformant
    User-agent: TheNomad
    User-agent: TightTwatBot
    User-agent: Titan
    User-agent: True_Robot
    User-agent: True_Robot/1.0
    User-agent: TurnitinBot
    User-agent: TurnitinBot/1.5
    User-agent: URL Control
    User-agent: URL_Spider_Pro
    User-agent: URLy Warning
    User-agent: URLy.Warning
    User-agent: VCI
    User-agent: VCI WebViewer VCI WebViewer Win32
    User-agent: VoidEYE
    User-agent: WWW-Collector-E
    User-agent: WWWOFFLE
    User-agent: Web Image Collector
    User-agent: Web Sucker
    User-agent: Web.Image.Collector
    User-agent: WebAuto
    User-agent: WebAuto/3.40 (Win98; I)
    User-agent: WebBandit
    User-agent: WebBandit/3.50
    User-agent: WebCapture 2.0
    User-agent: WebCopier
    User-agent: WebCopier v.2.2
    User-agent: WebCopier v2.5
    User-agent: WebCopier v2.6
    User-agent: WebCopier v2.7a
    User-agent: WebCopier v2.8
    User-agent: WebCopier v3.0
    User-agent: WebCopier v3.0.1
    User-agent: WebCopier v3.2
    User-agent: WebCopier v3.2a
    User-agent: WebEMailExtrac.*
    User-agent: WebEnhancer
    User-agent: WebFetch
    User-agent: WebGo IS
    User-agent: WebLeacher
    User-agent: WebReaper
    User-agent: WebReaper [info@webreaper.net]
    User-agent: WebReaper [webreaper@otway.com]
    User-agent: WebReaper v9.1 - www.otway.com/webreaper
    User-agent: WebReaper v9.7 - www.webreaper.net
    User-agent: WebReaper v9.8 - www.webreaper.net
    User-agent: WebReaper vWebReaper v7.3 - www,otway.com/webreaper
    User-agent: WebSauger
    User-agent: WebSauger 1.20b
    User-agent: WebSauger 1.20j
    User-agent: WebSauger 1.20k
    User-agent: WebStripper
    User-agent: WebStripper/2.03
    User-agent: WebStripper/2.10
    User-agent: WebStripper/2.12
    User-agent: WebStripper/2.13
    User-agent: WebStripper/2.15
    User-agent: WebStripper/2.16
    User-agent: WebStripper/2.19
    User-agent: WebWhacker
    User-agent: WebZIP
    User-agent: WebZIP/2.75 (http:/www.spidersoft.com)
    User-agent: WebZIP/3.65 (http:/www.spidersoft.com)
    User-agent: WebZIP/3.80 (http:/www.spidersoft.com)
    User-agent: WebZIP/4.0 (http:/www.spidersoft.com)
    User-agent: WebZIP/4.1 (http:/www.spidersoft.com)
    User-agent: WebZIP/4.21
    User-agent: WebZIP/4.21 (http:/www.spidersoft.com)
    User-agent: WebZIP/5.0
    User-agent: WebZIP/5.0 (http:/www.spidersoft.com)
    User-agent: WebZIP/5.0 PR1 (http:/www.spidersoft.com)
    User-agent: WebZip
    User-agent: WebZip/4.0
    User-agent: WebmasterWorldForumBot
    User-agent: Website Quester
    User-agent: Website Quester - www.asona.org
    User-agent: Website Quester - www.esalesbiz.com/extra/
    User-agent: Website eXtractor
    User-agent: Website eXtractor (http:/www.asona.org)
    User-agent: Website.Quester
    User-agent: Webster Pro
    User-agent: Webster.Pro
    User-agent: Wget
    User-agent: Wget/1.10.2
    User-agent: Wget/1.5.2
    User-agent: Wget/1.5.3
    User-agent: Wget/1.6
    User-agent: Wget/1.7
    User-agent: Wget/1.8
    User-agent: Wget/1.8.1
    User-agent: Wget/1.8.1+cvs
    User-agent: Wget/1.8.2
    User-agent: Wget/1.9-beta
    User-agent: Widow
    User-agent: Xaldon WebSpider
    User-agent: Xaldon WebSpider 2.5.b3
    User-agent: Xenu's
    User-agent: Xenu's Link Sleuth 1.1c
    User-agent: Zeus
    User-agent: Zeus 11389 Webster Pro V2.9 Win32
    User-agent: Zeus 11652 Webster Pro V2.9 Win32
    User-agent: Zeus 18018 Webster Pro V2.9 Win32
    User-agent: Zeus 26378 Webster Pro V2.9 Win32
    User-agent: Zeus 30747 Webster Pro V2.9 Win32
    User-agent: Zeus 32297 Webster Pro V2.9 Win32
    User-agent: Zeus 39206 Webster Pro V2.9 Win32
    User-agent: Zeus 41641 Webster Pro V2.9 Win32
    User-agent: Zeus 44238 Webster Pro V2.9 Win32
    User-agent: Zeus 51070 Webster Pro V2.9 Win32
    User-agent: Zeus 51674 Webster Pro V2.9 Win32
    User-agent: Zeus 51837 Webster Pro V2.9 Win32
    User-agent: Zeus 63567 Webster Pro V2.9 Win32
    User-agent: Zeus 6694 Webster Pro V2.9 Win32
    User-agent: Zeus 71129 Webster Pro V2.9 Win32
    User-agent: Zeus 82016 Webster Pro V2.9 Win32
    User-agent: Zeus 82900 Webster Pro V2.9 Win32
    User-agent: Zeus 84842 Webster Pro V2.9 Win32
    User-agent: Zeus 90872 Webster Pro V2.9 Win32
    User-agent: Zeus 94934 Webster Pro V2.9 Win32
    User-agent: Zeus 95245 Webster Pro V2.9 Win32
    User-agent: Zeus 95351 Webster Pro V2.9 Win32
    User-agent: Zeus 97371 Webster Pro V2.9 Win32
    User-agent: Zeus Link Scout
    User-agent: asterias
    User-agent: b2w/0.1
    User-agent: cosmos
    User-agent: eCatch
    User-agent: eCatch/3.0
    User-agent: hloader
    User-agent: httplib
    User-agent: humanlinks
    User-agent: larbin
    User-agent: larbin (samualt9@bigfoot.com)
    User-agent: larbin samualt9@bigfoot.com
    User-agent: larbin_2.6.2 (kabura@sushi.com)
    User-agent: larbin_2.6.2 (larbin2.6.2@unspecified.mail)
    User-agent: larbin_2.6.2 (listonATccDOTgatechDOTedu)
    User-agent: larbin_2.6.2 (vitalbox1@hotmail.com)
    User-agent: larbin_2.6.2 kabura@sushi.com
    User-agent: larbin_2.6.2 larbin2.6.2@unspecified.mail
    User-agent: larbin_2.6.2 larbin@correa.org
    User-agent: larbin_2.6.2 listonATccDOTgatechDOTedu
    User-agent: larbin_2.6.2 vitalbox1@hotmail.com
    User-agent: libWeb/clsHTTP
    User-agent: lwp-trivial
    User-agent: lwp-trivial/1.34
    User-agent: moget
    User-agent: moget/2.1
    User-agent: pavuk
    User-agent: pcBrowser
    User-agent: psbot
    User-agent: searchpreview
    User-agent: spanner
    User-agent: suzuran
    User-agent: tAkeOut
    User-agent: toCrawl/UrlDispatcher
    User-agent: turingos
    User-agent: webfetch/2.1.0
    User-agent: wget
    Disallow: /
    This list is obviously out of date. Updating it is another item on the ToDo list.
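    One way to make a list like this easier to keep up to date is to collapse the versioned entries down to bare names. The sketch below is only an illustration. It assumes that a robot which honors robots.txt matches on the name before the first "/", which is worth checking for any robot you care about, and `collapse_agents` is a hypothetical helper, not part of any library.

```python
def collapse_agents(lines):
    """Return unique agent names with any "/version" suffix removed.

    Assumption: compliant robots match on the name before the first "/".
    Entries that carry a version after a space (e.g. "WebCopier v2.5")
    are left untouched, because there is no safe way to guess the split.
    Entries that contain a URL should be reviewed by hand.
    """
    seen = set()
    names = []
    for line in lines:
        field, _, value = line.partition(":")
        if field.strip().lower() != "user-agent":
            continue  # skip Disallow lines, blanks, and anything else
        name = value.strip().split("/")[0].strip()
        key = name.lower()  # treat "Wget" and "wget" as the same agent
        if name and key not in seen:
            seen.add(key)
            names.append(name)
    return names


sample = [
    "User-agent: Wget",
    "User-agent: Wget/1.5.3",
    "User-agent: Wget/1.6",
    "User-agent: wget",
    "User-agent: WebZIP/4.21",
    "User-agent: WebCopier v2.5",
    "Disallow: /",
]

for name in collapse_agents(sample):
    print("User-agent:", name)
print("Disallow: /")
```

    Keeping the first spelling seen and comparing in lower case is a design choice for this sketch; it keeps the output stable when the same agent appears with different capitalization.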
    Submit Your Webmaster Related Sites to the NB Directory
    I swear, by my life and my love of it, that I will never live for the sake of another man, nor ask another man to live for mine.

  2. #2
    Mr.Bill is offline One is glad to be of service
    Join Date
    Dec 2008
    Location
    Redmond, Oregon
    Posts
    828
    Blog Entries
    1
    Thanks
    72
    Thanked 350 Times in 182 Posts
    Here is one from a site of mine; it has a couple of others in it.

    Code:
    User-agent: WebmasterWorldForumBot
    Disallow: /
    
    User-agent: URL_Spider_Pro
    Disallow: /
    
    User-agent: CherryPicker
    Disallow: /
    
    User-agent: EmailCollector
    Disallow: /
    
    User-agent: EmailSiphon
    Disallow: /
    
    User-agent: WebBandit
    Disallow: /
    
    User-agent: EmailWolf
    Disallow: /
    
    User-agent: ExtractorPro
    Disallow: /
    
    User-agent: CopyRightCheck
    Disallow: /
    
    User-agent: Crescent
    Disallow: /
    
    User-agent: SiteSnagger
    Disallow: /
    
    User-agent: ProWebWalker
    Disallow: /
    
    User-agent: CheeseBot
    Disallow: /
    
    User-agent: LNSpiderguy
    Disallow: /
    
    User-agent: Black Hole
    Disallow: /
    
    User-agent: Titan
    Disallow: /
    
    User-agent: WebStripper
    Disallow: /
    
    User-agent: NetMechanic
    Disallow: /
    
    User-agent: CherryPicker
    Disallow: /
    
    User-agent: EmailCollector
    Disallow: /
    
    User-agent: EmailSiphon
    Disallow: /
    
    User-agent: WebBandit
    Disallow: /
    
    User-agent: EmailWolf
    Disallow: /
    
    User-agent: ExtractorPro
    Disallow: /
    
    User-agent: CopyRightCheck
    Disallow: /
    
    User-agent: Crescent
    Disallow: /
    
    User-agent: Wget
    Disallow: /
    
    User-agent: SiteSnagger
    Disallow: /
    
    User-agent: ProWebWalker
    Disallow: /
    
    User-agent: CheeseBot
    Disallow: /
    
    User-agent: mozilla/4
    Disallow: /
    
    User-agent: mozilla/5
    Disallow: /
    
    User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows NT)
    Disallow: /
    
    User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 95)
    Disallow: /
    
    User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 98)
    Disallow: /
    
    User-agent: Teleport
    Disallow: /
    
    User-agent: TeleportPro
    Disallow: /
    
    User-agent: MIIxpc
    Disallow: /
    
    User-agent: Telesoft
    Disallow: /
    
    User-agent: Website Quester
    Disallow: /
    
    User-agent: WebZip
    Disallow: /
    
    User-agent: moget/2.1
    Disallow: /
    
    User-agent: WebZip/4.0
    Disallow: /
    
    User-agent: WebSauger
    Disallow: /
    
    User-agent: WebCopier
    Disallow: /
    
    User-agent: NetAnts
    Disallow: /
    
    User-agent: Mister PiX
    Disallow: /
    
    User-agent: WebAuto
    Disallow: /
    
    User-agent: TheNomad
    Disallow: /
    
    User-agent: WWW-Collector-E
    Disallow: /
    
    User-agent: RMA
    Disallow: /
    
    User-agent: libWeb/clsHTTP
    Disallow: /
    
    User-agent: asterias
    Disallow: /
    
    User-agent: httplib
    Disallow: /
    
    User-agent: turingos
    Disallow: /
    
    User-agent: spanner
    Disallow: /
    
    User-agent: InfoNaviRobot
    Disallow: /
    
    User-agent: Harvest/1.5
    Disallow: /
    
    User-agent: Bullseye/1.0
    Disallow: /
    
    User-agent: Mozilla/4.0 (compatible; BullsEye; Windows 95)
    Disallow: /
    
    User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
    Disallow: /
    
    User-agent: CherryPickerSE/1.0
    Disallow: /
    
    User-agent: CherryPickerElite/1.0
    Disallow: /
    
    User-agent: WebBandit/3.50
    Disallow: /
    
    User-agent: NICErsPRO
    Disallow: /
    
    User-agent: Microsoft URL Control - 5.01.4511
    Disallow: /
    
    User-agent: DittoSpyder
    Disallow: /
    
    User-agent: Foobot
    Disallow: /
    
    User-agent: SpankBot
    Disallow: /
    
    User-agent: BotALot
    Disallow: /
    
    User-agent: lwp-trivial/1.34
    Disallow: /
    
    User-agent: lwp-trivial
    Disallow: /
    
    User-agent: Wget/1.6
    Disallow: /
    
    User-agent: BunnySlippers
    Disallow: /
    
    User-agent: Microsoft URL Control - 6.00.8169
    Disallow: /
    
    User-agent: URLy Warning
    Disallow: /
    
    User-agent: Wget/1.5.3
    Disallow: /
    
    User-agent: LinkWalker
    Disallow: /
    
    User-agent: cosmos
    Disallow: /
    
    User-agent: moget
    Disallow: /
    
    User-agent: hloader
    Disallow: /
    
    User-agent: humanlinks
    Disallow: /
    
    User-agent: LinkextractorPro
    Disallow: /
    
    User-agent: Offline Explorer
    Disallow: /
    
    User-agent: Mata Hari
    Disallow: /
    
    User-agent: LexiBot
    Disallow: /
    
    User-agent: Web Image Collector
    Disallow: /
    
    User-agent: The Intraformant
    Disallow: /
    
    User-agent: True_Robot/1.0
    Disallow: /
    
    User-agent: True_Robot
    Disallow: /
    
    User-agent: BlowFish/1.0
    Disallow: /
    
    User-agent: JennyBot
    Disallow: /
    
    User-agent: MIIxpc/4.2
    Disallow: /
    
    User-agent: BuiltBotTough
    Disallow: /
    
    User-agent: ProPowerBot/2.14
    Disallow: /
    
    User-agent: BackDoorBot/1.0
    Disallow: /
    
    User-agent: toCrawl/UrlDispatcher
    Disallow: /
    
    User-agent: WebEnhancer
    Disallow: /
    
    User-agent: TightTwatBot
    Disallow: /
    
    User-agent: suzuran
    Disallow: /
    
    User-agent: VCI WebViewer VCI WebViewer Win32
    Disallow: /
    
    User-agent: VCI
    Disallow: /
    
    User-agent: Szukacz/1.4
    Disallow: /
    
    User-agent: QueryN Metasearch
    Disallow: /
    
    User-agent: Openfind data gatherer
    Disallow: /
    
    User-agent: Openfind
    Disallow: /
    
    User-agent: Xenu's Link Sleuth 1.1c
    Disallow: /
    
    User-agent: Xenu's
    Disallow: /
    
    User-agent: Zeus
    Disallow: /
    
    User-agent: RepoMonkey Bait & Tackle/v1.01
    Disallow: /
    
    User-agent: RepoMonkey
    Disallow: /
    
    User-agent: Zeus 32297 Webster Pro V2.9 Win32
    Disallow: /
    
    User-agent: Webster Pro
    Disallow: /
    
    User-agent: EroCrawler
    Disallow: /
    
    User-agent: LinkScan/8.1a Unix
    Disallow: /
    
    User-agent: Keyword Density/0.9
    Disallow: /
    
    User-agent: Kenjin Spider
    Disallow: /
    
    User-agent: Cegbfeieh
    Disallow: /

    Reverse IP Check ಠ_ಠ Proxy Sites
    <?php if ($youask == 'stupid question') { echo ('stupid answer'); } ?>

  3. #3
    Cash Nebula is offline Newbie Net Builder
    Join Date
    Dec 2008
    Location
    Sydney, Australia
    Posts
    17
    Thanks
    0
    Thanked 1 Time in 1 Post
    So, it's not enough just to block the agent by name, you have to block each version as well?

  4. #4
    Will.Spencer is offline Retired
    Join Date
    Dec 2008
    Posts
    5,033
    Blog Entries
    1
    Thanks
    1,010
    Thanked 2,329 Times in 1,259 Posts
    Quote: Originally Posted by Cash Nebula
    So, it's not enough just to block the agent by name, you have to block each version as well?
    I think so.
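    Whether each version needs its own line depends on how the robot matches names. As one data point, Python's standard-library parser (`urllib.robotparser`) compares only the part of the agent string before the first "/", so a bare name covers the versioned forms there. The sketch below uses a cut-down robots.txt in the same grouped style as the first post; `ExampleBrowser/1.0` is a made-up agent used for contrast. Other robots may match differently, so listing versions does no harm.

```python
from urllib.robotparser import RobotFileParser

# Several User-agent lines sharing one Disallow rule, as in the first post.
ROBOTS_TXT = """\
User-agent: GetRight
User-agent: Wget
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Only the bare names are listed above; check how versioned agents fare.
for agent in ("GetRight", "GetRight/4.5b", "Wget/1.10.2", "ExampleBrowser/1.0"):
    verdict = "allowed" if parser.can_fetch(agent, "/index.html") else "blocked"
    print(f"{agent}: {verdict}")
```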

  5. #5
    Aquinas is offline Newbie Net Builder
    Join Date
    Jan 2009
    Posts
    59
    Thanks
    1
    Thanked 8 Times in 4 Posts

    much better way

    Actually Will,

    I know a far better way to do that. I will PM you the details, as I am unsure whether I can post it up here.

    I can even give you a ready-made one I used, so it needs only a tiny bit of editing on your end.

  6. #6
    Mr.Bill is offline One is glad to be of service
    Join Date
    Dec 2008
    Location
    Redmond, Oregon
    Posts
    828
    Blog Entries
    1
    Thanks
    72
    Thanked 350 Times in 182 Posts
    Aquinas, I would be interested in seeing this better way.


  7. #7
    Aquinas is offline Newbie Net Builder
    Join Date
    Jan 2009
    Posts
    59
    Thanks
    1
    Thanked 8 Times in 4 Posts

    Block Robots and Web Downloaders : A better way

    Given the amount of interest in this, I decided to make publicly available in this forum what is, in my opinion, the best method: one that is far more powerful and easier to manage.

    I have done what Will was basically doing, but in .htaccess.

    I made this, and Will S. should be able to use it with very few changes.

    Everyone else can of course use it too.



    **Just remember: when you have finished playing with it, make sure you save it as .htaccess with no extension, then drop it in the root folder of your web server.**
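    The attached file is not reproduced in this thread, so the following is a generic sketch of the approach and not the attached file itself. It assumes Apache with mod_rewrite enabled and .htaccess overrides allowed, and `htaccess_rules` is a hypothetical helper that turns a list of agent names into rules answering 403 Forbidden.

```python
import re


def htaccess_rules(agent_names):
    """Build mod_rewrite rules that refuse requests from the given agents.

    Each name is matched case-insensitively anywhere in the User-Agent
    header. re.escape() is meant to keep characters such as "." literal;
    names containing spaces should be checked by hand after generation.
    """
    names = list(agent_names)
    if not names:
        return ""  # no conditions would make the rule block every request
    lines = ["RewriteEngine On"]
    for index, name in enumerate(names):
        # Every condition except the last is chained with OR.
        flags = "[NC]" if index == len(names) - 1 else "[NC,OR]"
        lines.append(f"RewriteCond %{{HTTP_USER_AGENT}} {re.escape(name)} {flags}")
    lines.append("RewriteRule .* - [F,L]")
    return "\n".join(lines) + "\n"


# Hypothetical subset of the list from the first post.
print(htaccess_rules(["BlackWidow", "HTTrack", "Wget"]))
```

    The printed text is what would be saved as .htaccess. Returning an empty string for an empty list is deliberate: a bare RewriteRule with no conditions would forbid every visitor.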
    Attached Files

  8. Thanked by:

    Aquarezz (24 May, 2009), Loko (28 May, 2009), Mr.Bill (24 May, 2009), Will.Spencer (24 May, 2009)

  9. #8
    Will.Spencer is offline Retired
    Join Date
    Dec 2008
    Posts
    5,033
    Blog Entries
    1
    Thanks
    1,010
    Thanked 2,329 Times in 1,259 Posts
    Blocking in .htaccess tends to be better than blocking in robots.txt, because the server enforces it on every request, so a robot cannot simply ignore it.
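    The difference can be sketched outside Apache as well. robots.txt only works if the robot chooses to read and obey it, whereas a server-side check runs whether the robot cooperates or not. The Python WSGI wrapper below is a minimal illustration of that idea, not a replacement for .htaccess; `block_agents`, `site`, `status_for`, and the short blocked list are all hypothetical. A robot that fakes its User-Agent header still gets through either way.

```python
BLOCKED_AGENTS = ("wget", "httrack", "webzip")  # hypothetical subset of the list


def block_agents(app, blocked=BLOCKED_AGENTS):
    """Wrap a WSGI app so that requests from listed agents get a 403."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(name in agent for name in blocked):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware


def site(environ, start_response):
    """Stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Welcome\n"]


def status_for(agent):
    """Call the wrapped app directly and report the status line it sends."""
    seen = []
    app = block_agents(site)
    app({"HTTP_USER_AGENT": agent}, lambda status, headers: seen.append(status))
    return seen[0]


print(status_for("Wget/1.10.2"))
print(status_for("ExampleBrowser/1.0"))
```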

  10. #9
    Aquinas is offline Newbie Net Builder
    Join Date
    Jan 2009
    Posts
    59
    Thanks
    1
    Thanked 8 Times in 4 Posts
    On the same topic, but with a slightly different purpose, all of you may be interested in this:

    Getting spammed in blog comments by the same Jack ASS. How to block the spammer's IP?

  11. #10
    1901gt is offline Web Designer
    Join Date
    Jan 2009
    Location
    Singapore
    Posts
    235
    Blog Entries
    2
    Thanks
    35
    Thanked 32 Times in 24 Posts
    Actually, what is the purpose of this robots.txt?
