Will, does your white list cover all of the Google spiders?
It blocks every one I've seen. Each time an IP gets blocked by bot-trap, I get an email. I look at the email to see if the User-Agent is a bot that should not be blocked. If it is, I check its IP address in WHOIS to make sure it's not some content thief faking their User-Agent. If the User-Agent and the WHOIS data match, I add the new IP address range to my white list.
If I understand this correctly, in addition to copying the other files to the server and modifying the robots.txt file, I need to set 777 permissions on the .htaccess file in the root directory and add the whitelist IPs. Is that correct?
I currently have the following list of spammer IPs blocked in the .htaccess on the test site. Do I just add the whitelist IPs to this block or keep them in a separate group?
Code:
<Limit GET HEAD POST>
order allow,deny
deny from 24.129.33.46
deny from 69.94.108.180
deny from 82.128.
deny from 208.66.195.
allow from all
</Limit>
Where is the blacklist being created? Could this system fill up the .htaccess file with blacklisted IPs over time?
Are you using a single pixel image for the link trap or did you place a larger image on the page somewhere?
BTW, the .htaccess protocol does not allow comments on the same line as a directive. I was making notations just like those in your whitelist for IPs that I was manually banning and my server error log filled up with error messages. You have to place comments on separate lines.
Quote:
Originally Posted by TopDogger
If I understand this correctly, in addition to copying the other files to the server and modifying the robots.txt file, I need to set 777 permissions on the .htaccess file in the root directory and add the whitelist IPs. Is that correct?
My .htaccess files seem to work with 644 (rw-r--r--) permissions, but that's because they are owned by the same user id as the web server runs under.
Quote:
Originally Posted by TopDogger
I currently have the following list of spammer IPs blocked in the .htaccess on the test site. Do I just add the whitelist IPs to this block or keep them in a separate group?
Hmmm... you got me... I think they can be separate.
Quote:
Originally Posted by TopDogger
Where is the blacklist being created? Could this system fill up the .htaccess file with blacklisted IPs over time?
It's appended to the end of .htaccess. The .htaccess can grow large over time. Every month or two I delete the older deny statements.
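The periodic cleanup described above could be scripted. This is only a rough sketch, assuming each blacklisted address is a single line beginning with "deny from" and that the newest entries are at the bottom of the file. The function name prune_deny_lines and the sample addresses are made up for illustration, and it treats every "deny from" line the same, so manually maintained bans would need to be handled separately.

```python
def prune_deny_lines(htaccess_text, keep_last):
    """Return the text with all but the last `keep_last` 'deny from' lines removed.

    Assumption: one ban per line, newest bans appended at the end.
    """
    lines = htaccess_text.splitlines()
    # Positions of every line that looks like a deny statement.
    deny_indexes = [
        i for i, line in enumerate(lines)
        if line.strip().lower().startswith("deny from ")
    ]
    if keep_last > 0:
        to_drop = set(deny_indexes[:-keep_last])
    else:
        to_drop = set(deny_indexes)
    kept = [line for i, line in enumerate(lines) if i not in to_drop]
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__":
    # Hypothetical file contents using documentation-range addresses.
    sample = (
        "allow from all\n"
        "deny from 192.0.2.1\n"
        "deny from 192.0.2.2\n"
        "deny from 192.0.2.3\n"
    )
    print(prune_deny_lines(sample, 2))
```

Back up the .htaccess before writing the pruned text over it.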
Quote:
Originally Posted by TopDogger
Are you using a single pixel image for the link trap or did you place a larger image on the page somewhere?
I'm using a single pixel image.
Quote:
Originally Posted by TopDogger
BTW, the .htaccess protocol does not allow comments on the same line as a directive. I was making notations just like those in your whitelist for IPs that I was manually banning and my server error log filled up with error messages. You have to place comments on separate lines.
That's odd -- it seems to be working fine here under Apache 2.2. Are you running Apache 1.3 or Apache 2.2?
Quote:
Originally Posted by Will.Spencer
My .htaccess files seem to work with 644 (rw-r--r--) permissions, but that's because they are owned by the same user id as the web server runs under.
644 is the standard permission setting for .htaccess. The instructions say to, "Make blacklist.dat and .htaccess writable by the web server user." I interpret that to mean that the file needs to be writable by the script being run by a user, which 644 doesn't cover. Maybe I'm interpreting this wrong. I'm looking at it as being similar to a cache directory, where the permissions typically need to be set to 666 or 777. Are the permissions on the blacklist.dat file set to 644 also?
Quote:
Originally Posted by Will.Spencer
That's odd -- it seems to be working fine here under Apache 2.2. Are you running Apache 1.3 or Apache 2.2?
I'm running Apache 2.2.10. One of the techs at my hosting company pointed that out a few months back while we were troubleshooting a server issue. He noticed that the Apache error log was pretty fat and packed with hundreds of messages pointing to the .htaccess files. He took a look at one of the .htaccess files and said that comments had to be on separate lines. I never saw an error with my sites, so the messages may have been warnings. It was something new to me.
Quote:
Originally Posted by TopDogger
644 is the standard permission setting for .htaccess. The instructions say to, "Make blacklist.dat and .htaccess writable by the web server user." I interpret that to mean that the file needs to be writable by the script being run by a user, which 644 doesn't cover. Maybe I'm interpreting this wrong. I'm looking at it as being similar to a cache directory, where the permissions typically need to be set to 666 or 777. Are the permissions on the blacklist.dat file set to 644 also?
I think it means "the script being run by the web server." On my system, the web server runs as the user www and .htaccess is owned by the user www.
But, 777 works too -- no matter who owns the file.
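To make the 644 versus 777 point concrete, here is a small sketch of how the basic Unix write check plays out. It is a simplification (it ignores root, ACLs, and read-only filesystems), and the function name can_write and the numeric user and group ids are invented for the example.

```python
import stat


def can_write(mode, file_uid, file_gid, proc_uid, proc_gids):
    """Approximate the classic Unix check for write access.

    The owner bits apply if the process owns the file, otherwise the
    group bits if the process is in the file's group, otherwise the
    "other" bits. Only one class is consulted.
    """
    if proc_uid == file_uid:
        return bool(mode & stat.S_IWUSR)
    if file_gid in proc_gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


if __name__ == "__main__":
    WWW_UID, WWW_GID = 80, 80        # hypothetical ids for the web server user
    FTP_UID, FTP_GID = 1000, 1000    # hypothetical ids for an upload account

    # 644 and owned by the web server user: the server can write.
    print(can_write(0o644, WWW_UID, WWW_GID, WWW_UID, {WWW_GID}))
    # 644 but owned by the upload account: the server cannot write.
    print(can_write(0o644, FTP_UID, FTP_GID, WWW_UID, {WWW_GID}))
    # 777: anyone can write, whoever owns the file.
    print(can_write(0o777, FTP_UID, FTP_GID, WWW_UID, {WWW_GID}))
```

So whether 644 is enough depends on which account owns .htaccess and blacklist.dat on a given host.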
Quote:
Originally Posted by TopDogger
I'm running Apache 2.2.10. One of the techs at my hosting company pointed that out a few months back while we were troubleshooting a server issue. He noticed that the Apache error log was pretty fat and packed with hundreds of messages pointing to the .htaccess files. He took a look at one of the .htaccess files and said that comments had to be on separate lines. I never saw an error with my sites, so the messages may have been warnings. It was something new to me.
Very odd -- not a single mention of this in my error logs.
But I did some Googling and found other people with similar issues -- particularly when the comments contained forward slashes. So, it seems best to move the comments to separate lines.
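Moving the comments by hand gets tedious on a long whitelist, so here is a rough sketch of doing it with a script. It is naive on purpose: it treats the first " #" on a line as the start of a comment, which would be wrong for any directive that legitimately contains " #" inside a quoted argument or pattern. The function name and the sample line are made up for illustration.

```python
def split_inline_comment(line):
    """Move a trailing '# comment' onto its own line above the directive.

    Returns a list of one or two lines. Lines that are blank, are already
    pure comments, or have no trailing comment are returned unchanged.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return [line]
    directive, sep, comment = line.partition(" #")
    if not sep:
        return [line]
    return ["#" + comment, directive.rstrip()]


if __name__ == "__main__":
    # Hypothetical whitelist entry with a notation on the same line.
    for out in split_inline_comment("allow from 64.233.160.0/19 # Google"):
        print(out)
```

Run each line of the file through it and write the results back out, after making a backup.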
Quote:
Originally Posted by Will.Spencer
I think it means "the script being run by the web server." On my system, the web server runs as the user www and .htaccess is owned by the user www.
OK. After I see how it logs the first spider, I will try setting the permissions to 644 to see what happens. If it does not work, an error will probably show up in the site's error log. Personally, I prefer not setting anything to 777.
Quote:
Originally Posted by Will.Spencer
Very odd -- not a single mention of this in my error logs.
But I did some Googling and found other people with similar issues -- particularly when the comments contained forward slashes. So, it seems best to move the comments to separate lines.
All I use is a hash for comments in the .htaccess. There might be some kind of obscure server configuration issue that causes it to log errors on some servers, but not on others.
Quote:
Originally Posted by TopDogger
OK. After I see how it logs the first spider, I will try setting the permissions to 644 to see what happens. If it does not work, an error will probably show up in the site's error log. Personally, I prefer not setting anything to 777.
As an old Unix guy, 777 makes me feel weird.
Quote:
Originally Posted by TopDogger
All I use is a hash for comments in the .htaccess. There might be some kind of obscure server configuration issue that causes it to log errors on some servers, but not on others.
The forward slashes were apparently being misinterpreted as being part of a CIDR (Classless Inter-Domain Routing) statement, like 192.168.0.0/24.
I do not understand how the CIDR number works. What range of IPs does 64.233.160.0/19 cover?
CIDR notation uses binary math. /19 means "the first 19 binary digits identify the network and the remaining 13 identify addresses within it." So 64.233.160.0/19 covers 64.233.160.0 to 64.233.191.255, which is 8,192 addresses. When you make the number after the slash larger, the networks get smaller. When you make the number after the slash smaller, the networks get larger.
10.0.0.0/8 is a traditional Class A network, i.e. 10.0.0.0 to 10.255.255.255.
192.168.0.0/24 is a traditional Class C network, i.e. 192.168.0.0 to 192.168.0.255.
Raising the number after the slash by one cuts the size of the network in half. Lowering the number after the slash by one doubles the size of the network.
I've never liked doing math, so I use a CIDR calculator for these calculations.
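If you would rather not rely on a calculator site, Python's standard ipaddress module can do the same arithmetic. This is just a sketch; the helper name cidr_range is made up, and the blocks are the ones mentioned in this thread.

```python
import ipaddress


def cidr_range(cidr):
    """Return (first address, last address, address count) for a CIDR block."""
    net = ipaddress.ip_network(cidr)
    return str(net.network_address), str(net.broadcast_address), net.num_addresses


if __name__ == "__main__":
    for block in ("64.233.160.0/19", "10.0.0.0/8", "192.168.0.0/24"):
        first, last, count = cidr_range(block)
        print(f"{block}: {first} to {last} ({count} addresses)")

    # Checking whether a single address falls inside a whitelisted block.
    google = ipaddress.ip_network("64.233.160.0/19")
    print(ipaddress.ip_address("64.233.170.5") in google)
```

The first line of output should read 64.233.160.0/19: 64.233.160.0 to 64.233.191.255 (8192 addresses), which matches the "13 leftover bits" arithmetic, since 2 to the 13th power is 8192.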