Thread: How to block all spiders?

  1. #1

    Guys, how do I block all spiders with .htaccess?
    Yes, everyone, including all the major search engines. I have set up a few live test sites with a lot of content and would prefer not to show them to any bot.

    I tried the WordPress setting that blocks all search engines, but Google still indexes the site, showing the title with no description ...


  2. #2
    I would suggest adding entries to your .htaccess file.

    I wrote an article on a similar task, though not exactly for all bots.

    However, with a bit of tweaking this should work.

    htaccess based spamBot and Leacher Blocking Code | Anant Shrivastava : Techno Enthusiast

  3. #3
    Bad spiders will ignore the robots.txt file. Only legitimate spiders use the robots.txt.

    For bad spiders, the best solution is to forcefully block them using the .htaccess file.

    You can use the robots.txt file to block Google, Yahoo, Bing/MSN and others. Just use the following in the file.

    User-agent: * 
    Disallow: /

  4. #4
    Popup blocker?


    Thanks, I needed a laugh!

  5. #5

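    You need to use robots.txt like TopDogger wrote. Then you should place this meta tag in the <head> of each page:
    <meta name="robots" content="noindex, nofollow">

    Keep in mind that a crawler blocked by robots.txt never fetches the page, so it only sees this tag on pages it is still allowed to crawl. The "nofollow" is not necessary in your case; you can replace it with "follow". For bad spiders, use your .htaccess to block them.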
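
    The same noindex signal can also be sent as an HTTP header from .htaccess, which covers non-HTML files such as images and PDFs too. This is only a sketch: it assumes mod_headers is enabled and that the host allows header directives in .htaccess, neither of which is stated in the thread.

    ```apache
    # Assumption: mod_headers is loaded; the IfModule guard avoids a 500 error if it is not.
    <IfModule mod_headers.c>
        # Ask compliant crawlers not to index anything served from this directory tree.
        Header set X-Robots-Tag "noindex, nofollow"
    </IfModule>
    ```

    Like the meta tag, this only affects well-behaved crawlers that are allowed to fetch the URL in the first place.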

  6. #6
    Hey Sonny .. If your IP address is static, you could set up a rule in a .htaccess file to allow only your IP. The only problem is that if your IP is dynamic, you would have to update the rule each time it changes.

    This will automatically block everyone else, and you could also redirect them to a coming soon / maintenance page until the site is ready.

    Exclude your IP only and give everyone else a 403 Forbidden.

    RewriteEngine On
    RewriteCond %{REMOTE_ADDR} !^111\.222\.33\.4$
    RewriteRule ^(.*)$ - [F,L]

    Exclude your IP and 302 redirect everyone else to a temporary holding page such as index.html.

    RewriteEngine On
    RewriteCond %{REMOTE_ADDR} !^111\.222\.33\.4$
    RewriteCond %{REQUEST_URI} !/index\.html$ [NC]
    RewriteRule ^(.*)$ /index.html [R=302,L]

    There are far too many bots to block them all in .htaccess. You would get better results using some kind of anti-bot script.
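
    On Apache 2.4 and later, the "allow only my IP" idea can probably be written without mod_rewrite, using the authorization directives instead. A minimal sketch, assuming Apache 2.4+, that the host permits authorization directives in .htaccess, and reusing the placeholder address 111.222.33.4 from the rules above:

    ```apache
    # Apache 2.4+ only: grant access to a single address.
    # Every other client gets 403 Forbidden.
    Require ip 111.222.33.4
    ```

    This matches the address exactly rather than by regular expression, so there is no pattern to get wrong.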

  7. #7
    It's been a long time since I came across such a good question. There are two methods to block spiders through .htaccess.

    1. Use the SetEnvIfNoCase directive in combination with FilesMatch.

    SetEnvIfNoCase User-Agent "^Custo" bad_bot=1
    SetEnvIfNoCase User-Agent "^Bot " bad_bot=1
    <FilesMatch "(.*)">
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot
    </FilesMatch>

    Just define all the bad bots with SetEnvIfNoCase.

    2. mod_rewrite method

    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Bot "
    RewriteRule ^.* - [F,L]
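
    The Order / Allow / Deny lines in method 1 are Apache 2.2 syntax. On Apache 2.4 they only work if mod_access_compat is loaded, so a 2.4-style version might look like the sketch below. It assumes Apache 2.4+ with mod_setenvif enabled, and the single "^Custo" pattern is just the example from this post, not a complete bot list.

    ```apache
    # Flag requests whose User-Agent starts with "Custo" (case-insensitive).
    SetEnvIfNoCase User-Agent "^Custo" bad_bot

    # Apache 2.4+: allow everyone except requests carrying the bad_bot flag.
    <RequireAll>
        Require all granted
        Require not env bad_bot
    </RequireAll>
    ```

    Add one SetEnvIfNoCase line per bot you want to refuse; flagged requests receive a 403.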
