Thread: Need help, google bot going crazy

  1. #1
    Kaustubh is offline Unknown Net Builder
    Join Date
    Jan 2011
    Posts
    38
    Thanks
    6
    Thanked 2 Times in 2 Posts

    Need help, google bot going crazy

    Hello,
    For the past 3 days I have been hit so hard by Googlebot that it is consuming all my memory and causing 500 errors. I don't know what is going wrong, but Googlebot is continuously accessing proxified pages, usually some Twitter profiles. I added a rule to robots.txt to exclude browse.php, but it is not honoring that and keeps hitting browse.php. I banned a few Googlebot IPs in .htaccess, but it keeps coming back. This never happened before.

    And for some reason, the Yandex bot is doing the same: hitting proxified pages.

    Something wrong is going on.

    Please help.

  2. #2
    NickW is offline Unknown Net Builder
    Join Date
    Aug 2010
    Posts
    20
    Thanks
    0
    Thanked 3 Times in 3 Posts
    Can you show us exactly what your robots.txt file looks like?

  3. Thanked by:

    Will.Spencer (24 May, 2011)

  4. #3
    UncleP's Avatar
    UncleP is online now The perfect face for radio
    Join Date
    Nov 2009
    Location
    Blighty
    Posts
    217
    Thanks
    20
    Thanked 91 Times in 61 Posts
    A temporary fix might be to rename your browse.php file if you can. You will need to change the name referenced in /includes/init.php on line 25, "define('SCRIPT_NAME', 'browse.php');". That might at least buy you some time if the traffic is automated. It's something I do anyway to try to delay detection by filtering networks; well worth doing imho. Don't forget to change any quick links or pre-made proxify shortcuts to the new name too. You could even make a new browse.php file that redirects to Bing, lol. The user agent may be faked, so blocking Google might not stop the attacks anyway.
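
    The rename-and-decoy idea above could look roughly like the sketch below. It assumes the real script has already been renamed and SCRIPT_NAME in /includes/init.php updated to match; 'surf.php' and the helper decoyLocationHeader() are hypothetical names used only for illustration.

    ```php
    <?php
    // Decoy browse.php: stays at the old path after the real proxy script has
    // been renamed (e.g. to 'surf.php', a hypothetical name) and the define in
    // /includes/init.php has been changed to: define('SCRIPT_NAME', 'surf.php');

    /** Builds the Location header line for the decoy redirect. */
    function decoyLocationHeader($target)
    {
        return 'Location: ' . $target;
    }

    // Send the redirect only when served over HTTP, so the file can also be
    // included from a command-line test without exiting.
    if (PHP_SAPI !== 'cli') {
        header(decoyLocationHeader('https://www.bing.com/'), true, 302);
        exit;
    }
    ```

    Anything still requesting the old browse.php URL gets bounced away without touching the proxy code, while real visitors use the renamed script.
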
    If I can't be a good example, I'll just have to be a terrible warning...

  5. Thanked by:

    Will.Spencer (24 May, 2011)

  6. #4
    Kaustubh is offline Unknown Net Builder
    Join Date
    Jan 2011
    Posts
    38
    Thanks
    6
    Thanked 2 Times in 2 Posts
    Quote: Originally posted by NickW
    Can you show us exactly what your robots.txt file looks like?

    User-agent: *
    Disallow: /browse.php

    User-agent: Googlebot-Mobile
    Disallow: /

  7. #5
    Mike-XS's Avatar
    Mike-XS is offline XeroAgent
    Join Date
    Sep 2009
    Location
    OZ
    Posts
    209
    Thanks
    30
    Thanked 109 Times in 71 Posts
    Googlebot usually follows robots.txt. If you haven't checked already, have a quick look at whether the IP addresses actually belong to Google and not someone else.
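
    One way to do that check in code is a reverse DNS lookup followed by a forward lookup, which is the approach Google describes for verifying its crawlers. The sketch below is only an illustration: the function name isVerifiedGooglebot() is made up, the accepted domains (googlebot.com and google.com) are an assumption based on Google's published guidance, and gethostbynamel() only covers IPv4.

    ```php
    <?php
    /**
     * Returns true when $ip reverse-resolves to a Google crawler hostname and
     * that hostname forward-resolves back to the same IP.
     * $reverse and $forward are injectable so the logic can be tested offline.
     */
    function isVerifiedGooglebot($ip, $reverse = 'gethostbyaddr', $forward = 'gethostbynamel')
    {
        $host = call_user_func($reverse, $ip);
        // gethostbyaddr() returns the unmodified IP on failure, false on malformed input.
        if ($host === false || $host === $ip) {
            return false;
        }
        // Assumption: genuine crawler hostnames end in .googlebot.com or .google.com.
        if (!preg_match('/\.(googlebot|google)\.com$/i', $host)) {
            return false;
        }
        // gethostbynamel() returns an array of IPv4 addresses, or false on failure.
        $ips = call_user_func($forward, $host);
        return is_array($ips) && in_array($ip, $ips, true);
    }

    // Usage sketch: treat a "googlebot" user agent as fake unless the IP checks out.
    // $isFake = stripos($ua, 'googlebot') !== false
    //     && !isVerifiedGooglebot($_SERVER['REMOTE_ADDR']);
    ```

    DNS lookups are slow, so on a busy proxy it would make sense to cache the result per IP rather than run this on every request.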

    Here's something you can use to block that s#!t until you get it sorted out.

    PHP Code:
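    // Read the user agent, defaulting to an empty string if the header is missing.
    $ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
    // Substrings (matched case-insensitively) that identify the bots to block.
    $detect = array('googlebot', 'yandex');

    if (!empty($ua))
    {
      foreach ($detect as $d)
      {
        if (stripos($ua, $d) !== false)
        {
          exit("<strong>$d</strong> detected - No proxy for you !");
        }
      }
    }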

    If you add this to the top of your browse.php page, it should stop the browse page from loading and block 'anyone' trying to go through your proxy with a Googlebot or Yandex user agent.

    You could also send them to the default Glype banned page:

    PHP Code:
    header('HTTP/1.1 403 Forbidden', true, 403);
    echo loadTemplate('banned.page');
    exit;
    or redirect them to another page with a message if you'd prefer to do something else to/with them.

    PHP Code:
    header('Location: ./some-other-page.php');
    exit;
    If you use it, let me know if you have any problems.
    I added a little more info in this guide on How to block bots from accessing pages through your proxy.
    Last edited by Mike-XS; 24 May, 2011 at 14:41 PM.

  8. #6
    IProx is offline Unknown Net Builder
    Join Date
    Jun 2009
    Posts
    20
    Thanks
    1
    Thanked 5 Times in 5 Posts
    Your issue is surely not with Google, but with a bad bot pretending to be Google. There are a lot of bad bots (BlockScript tracks over 643,228 bad bot IPs), and they won't respect your robots.txt file.

  9. #7
    Will.Spencer's Avatar
    Will.Spencer is offline Retired
    Join Date
    Dec 2008
    Posts
    5,033
    Blog Entries
    1
    Thanks
    1,010
    Thanked 2,327 Times in 1,258 Posts
    Quote: Originally posted by Kaustubh
    User-agent: *
    Disallow: /browse.php

    User-agent: Googlebot-Mobile
    Disallow: /

    Umm... that's "Googlebot-Mobile", not "Googlebot".
    I swear, by my life and my love of it, that I will never live for the sake of another man, nor ask another man to live for mine.

  10. #8
    Kaustubh is offline Unknown Net Builder
    Join Date
    Jan 2011
    Posts
    38
    Thanks
    6
    Thanked 2 Times in 2 Posts
    Alright, the problem was that Googlebot was stuck in some kind of loop.
