Ravens PHP Scripts: Forums
 

 

Ravens PHP Scripts And Web Hosting Forum Index -> NukeSentinel(tm) v2.6.x
dad7732
RavenNuke(tm) Development Team



Joined: Mar 18, 2007
Posts: 1242

Posted: Thu Apr 22, 2010 11:59 am

A few weeks ago I blocked "betaBot" for example, but it is still accessing and harvesting - 25 hits today. What am I missing?
 
montego
Site Admin



Joined: Aug 29, 2004
Posts: 9457
Location: Arizona

Posted: Sun May 09, 2010 10:00 am

Can't think of a code reason why it wouldn't be blocked unless:

1) you didn't spell it correctly,

2) these are coming from different IP addresses and you have your blocker set to block at an individual IP address (rather than a higher subnet), or

3) you forgot to set the blocker to "Block".

But, knowing you, I am sure none of the above apply, so, yes, that is puzzling.

_________________
Where Do YOU Stand?
HTML Newsletter::ShortLinks::Mailer::Downloads and more... 
dad7732







Posted: Sun May 09, 2010 10:07 am

Ok, what I do/did:

1. Bring up tracked IP Addresses
2. Tracked Agents
3. Select a UA string
4. Check IP - same IP but don't know why that would be important as it's the "agent" that's being blocked, not the IP - unless I'm wrong on this.
5. Click the block icon (red shield)

When I check back a few days later, the supposedly blocked agent is in the list again, showing more than just a few hits. This happens with more than just one pesky agent.

I also noticed that if I remove an agent from the list, some are still being blocked. I'll have to gather more specifics on this, possibly in the DB, to see if those agents are actually removed, etc.

Cheers
 
dad7732







Posted: Sun May 09, 2010 10:15 am

More ....

Just checked the DB for NSNST/harvesters and my example betaBot was spelled in all lowercase - betabot - edited it to betaBot which is the actual spelling in the agent. I did not add this entry manually btw, just clicked on the red shield initially.

I also had php/5.2.1 blocked, but agents with just php in the string were being blocked even though "php" alone was NOT in the DB list. I removed the entry from the DB. Interesting ...

Cheers
 
dad7732







Posted: Sun May 09, 2010 10:34 am

Aha!!!

I removed betaBot from the DB list, and when I added it back again it came up as "betabot", lowercase again. After clicking on the "gold" shield to block it, I noticed that the entry box was auto-filled in lowercase, and that is what was added to the DB.

To verify this a bit further, one of the agents was PycURL .. Clicking on the shield to block it, "pycurl" was the entry in the box and was entered as lowercase in the DB.

Do I need to enter this as a bug, or is more discussion/testing needed beforehand?

Cheers
 
montego







Posted: Tue May 11, 2010 7:16 am

I was pretty certain that the user agent blocker is not case sensitive. Going back to the very beginning: is it possible that these agents were getting blocked, but because so many are coming from different IP addresses, NS is having to block each one as it sees it? I have never blocked a user agent from the tracking. I ALWAYS add it in via the Harvester Menu.
 
dad7732







Posted: Tue May 11, 2010 7:59 am

betaBot - my example is coming from only one IP address. If I use the agent blocking function it adds it to the harvester database but in lowercase - betabot. Next day, betaBot is listed with more than one or two hits and the agent UA is betaBot, not betabot. Now, if I edit it to betaBot in the DB that takes care of it, no more visits. This is also true of other bots as well. It appears to not be a case of multiple IP addresses but rather the agent string is, in actuality, case sensitive. I'm running FreeBSD if that matters .. it may.

Here's another one:

BabalooSpider is in the string but babaloospider is what is entered in the DB and return visits are not denied, no email.

When either of these two, for instance, access the site, I don't get an email but when I edit the string to change the case to what it is supposed to be then I DO get the email that the bot(s) have been blocked.

Cheers
 
Palbin
Site Admin



Joined: Mar 30, 2006
Posts: 2583
Location: Pittsburgh, Pennsylvania

Posted: Tue May 11, 2010 8:44 am

Sounds like a regex problem.

_________________
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." — Brian W. Kernighan. 
dad7732







Posted: Tue May 11, 2010 9:14 am

In summary ..

Example:

The actual UA that shows up in the list of Tracked Agents:

WWW-Mechanize/0.9.2 (http://rubyforge.org/projects/mechanize/)

Click on the shield to block this agent

What shows up in the field is:

www-mechanize/0.9.2 (http://rubyforge.org/projects/mechanize/)

Click on Save Changes

The above is what is saved to the harvester db

Next time that WWW-Mechanize/0.9.2 (http://rubyforge.org/projects/mechanize/) accesses the site, it will not be blocked.

The end result of this anomaly is that if a bot programmer uses mixed case in the UA string, the bot will not be blocked.
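
To make the symptom concrete, here is a minimal PHP sketch. It assumes, purely for illustration, that the entry is lowercased on save and that the later check is case sensitive. The function names `save_blocked_agent()` and `is_blocked_case_sensitive()` are made up, and nothing here is taken from the NukeSentinel source.

```php
<?php
// Hypothetical helpers for illustration only; NOT NukeSentinel's actual code.

// Mimics what the block form appears to do: lowercase the entry before saving.
function save_blocked_agent(string $agent): string
{
    return strtolower($agent);
}

// Assumed (unconfirmed) case-sensitive substring check. strpos() is case sensitive.
function is_blocked_case_sensitive(string $userAgent, string $stored): bool
{
    return strpos($userAgent, $stored) !== false;
}

$ua     = 'WWW-Mechanize/0.9.2 (http://rubyforge.org/projects/mechanize/)';
$stored = save_blocked_agent($ua); // 'www-mechanize/0.9.2 (http://...)'

var_dump(is_blocked_case_sensitive($ua, $stored)); // bool(false): lowercased entry misses
var_dump(is_blocked_case_sensitive($ua, $ua));     // bool(true): exact-case entry matches
```

If the real check behaved like this somewhere along the way, it would explain why editing the entry back to the original case makes the block work.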


Cheers
 
sixonetonoffun
Spouse Contemplates Divorce



Joined: Jan 02, 2003
Posts: 2496

Posted: Tue May 11, 2010 9:38 am

I haven't looked at the code at all for a long time, but it sure sounds like saved agents are getting strtolower'd. I wouldn't go changing anything; let the dev team dig into this a little. Changing one thing can break other integrity tests.

_________________
openSUSE 11.4-x86 | Linux 2.6.37.1-1.2desktop i686 | KDE: 4.6.41>=4.7 | XFCE 4.8 | AMD Athlon(tm) XP 3000+ | MSI K7N2 Delta-L | 3GB Black Diamond DDR | GeForce 6200@433Mhz 512MB | Xorg 1.9.3 | NVIDIA 270.30
montego







Posted: Sat May 15, 2010 7:37 am

The tests are definitely meant to be case insensitive, so it shouldn't matter if the string is forced to all lower case. However, I recently found similar cases on my own site, where I have my blocker set to "Write to .htaccess" at the "12.0.0.*" level: I am seeing several block messages at the last-node level, and not one immediately after the other. (Btw, if the accesses come too quickly, the system will take a bit of time to get the IP written, and then eventually future ones should get blocked; this applies to all the blockers in general.)

Therefore, I am convinced NS does have an issue here.
 
dad7732







Posted: Sat May 15, 2010 7:52 am

When I look at the auto-entry when choosing to block an agent and if it's all lowercase where some characters should be upper then I manually change it before saving the change to the DB. Although it works it's a band-aid for now.

Thanks for looking at this a little closer.

Cheers
 
montego







Posted: Sat May 15, 2010 7:56 am

Odd. My symptoms are that I have a string of "webcapture" and I am getting block notices for an agent with "Webcapture" in it. Therefore, the string check is case insensitive, but maybe somehow the actual blocking is the problem... unfortunately I won't have any time to look at this for quite some time. :(
 
dad7732







Posted: Sat May 15, 2010 8:06 am

It may be OS related. I'm running FreeBSD, which is case-sensitive in most respects. Could be the issue, who knows. But at present I can't see UNIX interfering with the MySQL DB in that respect, dunno at this point.

Haven't seen this issue presented here before, maybe it's going unnoticed or maybe it's so selective that it works in 99% of cases, just that mine fits the 1% case - as usual.

Cheers
 
Guardian2003
Site Admin



Joined: Aug 28, 2003
Posts: 6799
Location: Ha Noi, Viet Nam

Posted: Sun May 16, 2010 2:15 pm

*nix is, for the most part, case sensitive, and your FreeBSD OS is no different in that respect, so I'm inclined to rule it out as the culprit (for now).
I would imagine that the UA is forced to lower case (strtolower) and stored that way in the DB for consistency. Likewise, I'm pretty sure that data strings for an incoming UA are similarly altered, so in effect the 'compare' is being done with all alpha characters in lowercase.
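
As a sketch of what that kind of compare would look like in PHP, both strings are lowercased before the substring check, so the case stored in the DB should not matter. The helper name `agent_matches()` is hypothetical, and this is not taken from the NukeSentinel source.

```php
<?php
// Hypothetical helper illustrating "lowercase both sides, then compare".
// Not NukeSentinel's actual code.
function agent_matches(string $userAgent, string $stored): bool
{
    if ($stored === '') {
        return false; // an empty entry should never match every agent
    }
    // Both sides are normalized, so only the letters matter, not their case.
    return strpos(strtolower($userAgent), strtolower($stored)) !== false;
}

$ua = 'Mozilla/5.0 (compatible; RoBot/1.0)';

var_dump(agent_matches($ua, 'robot')); // bool(true)
var_dump(agent_matches($ua, 'RoBot')); // bool(true)
```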

What I find interesting and somewhat puzzling is that if you are manually altering the UA text in the DB to match the original mixed case string it seems to have been beneficial for you; when in reality it shouldn't make any difference.

It is certainly something worth investigating further though.
 
dad7732







Posted: Sun May 16, 2010 2:42 pm

If I block an uppercase UA and it stays lowercase without modification, it will not be blocked. If I edit the UA prior to saving then it is blocked, already verified that. What actually led me to this discovery is the UA with Webcapture in the string. When blocked it is saved as webcapture and subsequent hits are not blocked. Editing the string to Webcapture prior to saving will block any further hits. Strange but that's the case.

Cheers
 
Palbin







Posted: Sun May 16, 2010 5:43 pm

I am not seeing how case would matter no matter what OS you are using since it is PHP that is handling everything.
 
dad7732







Posted: Sun May 16, 2010 6:17 pm

If the UA string contains, for example:

RoBot

When the shield is clicked to block the UA containing "RoBot", what actually gets entered in the DB is "robot", and therefore it isn't blocked. I would think that since *nix is the underlying handler, case does matter when the DB is polled. All I know is that if the blocker entry is "robot" and the UA is "RoBot", the agent isn't blocked, at least not here, unless it is represented in the DB exactly as it is in the UA string.
 
Palbin







Posted: Sun May 16, 2010 6:29 pm

I understand the problem :). Just saying that I don't see how OS could be the culprit.
 
dad7732







Posted: Sun May 16, 2010 6:46 pm

Ok, I may need some instruction here ....

Does MySQL running on FreeBSD adhere to the case sensitive rules of UNIX?

I have mod_speling enabled, does that come into play here?

Cheers
 
64bitguy
The Mouse Is Extension Of Arm



Joined: Mar 06, 2004
Posts: 1164

Posted: Mon May 24, 2010 6:27 pm

I thought I should mention that UA discovery and isolation is in fact Case-Sensitive.

UA discovery and blocking could be made case-insensitive if, for example, the NC (no-case) flag were used in the .htaccess.

I would particularly refer to: http://httpd.apache.org/docs/2.2/mod/mod_setenvif.html

_________________
Steph Benoit
100% Section 508 and W3C HTML5 and CSS Compliant (Truly) Code, because I love compliance. 
montego







Posted: Thu Jun 03, 2010 3:14 pm

Well, the real question at hand is how NukeSentinel(tm) is handling this. I have checked the Harvester code within includes/nukesentinel.php, and it is case insensitive, using this code here (only a part of the code):

stristr($nsnst_const['user_agent'], $harvest)

Therefore, if the string in the Harvester list is "robot", it will catch "RoBot" in the User Agent string. I suspect there is a bug somewhere else in the NS code causing the originally noted issue, which the RN team is going to have to hunt down.
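
Here is a small standalone sketch of that check, for anyone who wants to try it outside of NS. `$userAgent` stands in for `$nsnst_const['user_agent']`; the `find_harvester()` helper and the sample list are made up for illustration and are not the actual NS loop. The case-sensitive `strstr()` is included for contrast.

```php
<?php
// Illustration only: find_harvester() and the sample list are hypothetical.
// Returns the first list entry found in the user agent, or null if none match.
function find_harvester(string $userAgent, array $harvesters): ?string
{
    foreach ($harvesters as $harvest) {
        // stristr() is the case-insensitive counterpart of strstr().
        if ($harvest !== '' && stristr($userAgent, $harvest) !== false) {
            return $harvest;
        }
    }
    return null;
}

$harvesters = ['webcapture', 'robot'];
$userAgent  = 'Mozilla/4.0 (RoBot)';

var_dump(find_harvester($userAgent, $harvesters)); // string(5) "robot"
var_dump(strstr($userAgent, 'robot'));             // bool(false): case-sensitive miss
```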
 
dad7732







Posted: Thu Jun 03, 2010 3:53 pm

Not flying completely blind here, but would it depend on the server OS being case-sensitive? I'm running FreeBSD with Apache, and if the string is "robot" it does NOT catch RoBot. I'm also running with mod_speling activated, if that makes any difference, dunno.

Cheers
 
montego







Posted: Mon Jun 07, 2010 6:32 am

No, as Guardian mentioned already, server OS has nothing to do with this. Also, as noted previously, I experienced a similar issue, leading me to believe this is an NS issue.
 
dad7732







Posted: Mon Jun 07, 2010 6:48 am

Good, since you have similar issues then either BOTH of us are going nutz together OR there is a real issue. I'm on your side re: real issue !! :)

Cheers
 
All times are GMT - 6 Hours
 