dad7732
RavenNuke(tm) Development Team


Joined: Mar 18, 2007
Posts: 1191

Posted: Mon Nov 30, 2009 7:22 pm

MSNBOT/2.0b does not honor the robots.txt file, so here is what to add to the .htaccess file in your site's root directory:

Code:

# Match on the User-Agent header, which is where the bot sends its name.
RewriteCond %{HTTP_USER_AGENT} ^msnbot/2\.0b [NC]
RewriteRule .* - [F,L]

# Returns a 403 Forbidden response and no content.


Works well. My site was being inundated with MSNBOT hits daily. This is better than adding the bot to the harvester list in NukeSentinel (NS).

If this doesn't belong here, please move it; I couldn't find a more appropriate place.

Cheers

unicornio
Involved

Joined: Aug 13, 2009
Posts: 432

Posted: Mon Nov 30, 2009 8:23 pm

Hi dad7732,

I tried this trick, but unfortunately I get a 500 error and the page doesn't load.

dad7732
RavenNuke(tm) Development Team

Joined: Mar 18, 2007
Posts: 1191

Posted: Mon Nov 30, 2009 8:36 pm

Did you copy and paste it into your .htaccess file exactly as shown? It doesn't matter where it goes as long as it's below the opening lines; mine is somewhere in the middle. It looks like it may be a syntax error. Can you post your .htaccess file here, leaving out any login or password information?

dad7732
RavenNuke(tm) Development Team

Joined: Mar 18, 2007
Posts: 1191

Posted: Tue Dec 01, 2009 10:56 pm

What a sneaky bunch of "you know whats" at MSN. Looking at my tracked IPs, I noticed several hundred hits from the MSN bot tonight. Say what??? There must be some auto-detection of block attempts against their bots, because guess what they did: they changed the user-agent string to drop MSNBOT/2.0b! So, seeing as the IP block 65.55.xxx.xxx is dedicated to the MSN bot, I added 65.55.*.* as a "flood" blocker, and to avoid being emailed several hundred times I set it to just "block, default page". We'll see what happens now. And yes, I could just as easily block the IP range at the server level in hosts.allow and hosts.deny, as on some BSD servers.
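
For anyone without the NukeSentinel blocker, roughly the same range block can be done in .htaccess. This is only a minimal sketch and not the setup described above: it assumes mod_rewrite is enabled, that the server sees the bot's real address in REMOTE_ADDR (no proxy or load balancer in front), and that you are willing to refuse everything from 65.55.*.*.

```apache
# Sketch only: refuse every request from the 65.55.*.* range.
# Assumes mod_rewrite is available and .htaccess overrides are allowed.
RewriteEngine On

# REMOTE_ADDR is the client address as seen by Apache.
RewriteCond %{REMOTE_ADDR} ^65\.55\.
# "-" means no substitution; F answers 403 Forbidden, L stops rule processing.
RewriteRule .* - [F,L]
```

Because the rule looks only at the address, a changed user-agent string makes no difference to it.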

Cheers

Raven
Site Admin/Owner

Joined: Aug 27, 2002
Posts: 16986
Location: Kansas

Posted: Tue Dec 01, 2009 11:13 pm

According to this article, it should now be honoring the robots.txt file:
[link not shown to unregistered users]

unicornio
Involved

Joined: Aug 13, 2009
Posts: 432

Posted: Wed Dec 02, 2009 1:30 am

It is working fine now, dad7732. I just changed the comment line in my .htaccess like this:

Before:
//Returns a 403-Forbidden response and no content.
After:
#Returns a 403-Forbidden response and no content.

I had forgotten to comment out this line properly. LOL. Thanks!
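
The reason for the 500 error: Apache treats a line as a comment only when its first non-blank character is #, so a line starting with // is read as an unknown directive and the whole .htaccess file is rejected. A cleaned-up version of the snippet might look like the sketch below. It assumes mod_rewrite is enabled, and it tests HTTP_USER_AGENT because that is the header in which a crawler sends its name.

```apache
# Sketch only. Assumes mod_rewrite is available and .htaccess overrides are allowed.
RewriteEngine On

# Comments go on their own line and start with "#".
# Match requests whose User-Agent starts with msnbot/2.0b, ignoring case.
RewriteCond %{HTTP_USER_AGENT} ^msnbot/2\.0b [NC]
# "-" means no substitution; F answers 403 Forbidden, L stops rule processing.
RewriteRule .* - [F,L]
```

With this in place, a request whose user agent starts with msnbot/2.0b should get a 403, and everything else should be served as before.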

dad7732
RavenNuke(tm) Development Team

Joined: Mar 18, 2007
Posts: 1191

Posted: Wed Dec 02, 2009 7:54 am

Maybe MSN hasn't read the article!! As of this morning, adding the MSN bot to robots.txt has no effect here on any domain. The IP blocker is the only thing that works, simply because, as I mentioned, they changed their UA string. I haven't blocked their IP range server-wide because some of my users (30+ domains) may actually want the visits.

Cheers, and thanks for the article.

PS: unicornio, that was going to be my next suggestion. :)

dad7732
RavenNuke(tm) Development Team

Joined: Mar 18, 2007
Posts: 1191

Posted: Wed Dec 02, 2009 8:52 am

Forgot to add that:

65.55.xxx.xxx (the MSN bot)

was denied domain access 1688 times between midnight last night and the time of this post, which is only about 9 hours. At that rate it comes to roughly 4,500 requests over a 24-hour period, so this bot can be responsible for a pretty hefty resource drain.

Cheers

dad7732
RavenNuke(tm) Development Team

Joined: Mar 18, 2007
Posts: 1191

Posted: Wed Dec 02, 2009 3:38 pm

As of a few minutes ago, we're up to 2940 MSN bot hits since midnight, all of which are now being rejected. Robots.txt isn't working. These people are ruthless, to say the least. If they want my site that badly, they can PAY for it. :)

Cheers

Guardian2003
Site Admin

Joined: Aug 28, 2003
Posts: 6373
Location: Vsetin, Czech Republic

Posted: Fri Dec 04, 2009 2:19 pm

This doesn't sound right to me; there is something else going on here. There's no reason for a bot to change its UA. Hmmm...

Guardian2003
Site Admin

Joined: Aug 28, 2003
Posts: 6373
Location: Vsetin, Czech Republic

Posted: Fri Dec 04, 2009 2:22 pm

FYI
[link not shown to unregistered users]

dad7732
RavenNuke(tm) Development Team

Joined: Mar 18, 2007
Posts: 1191

Posted: Fri Dec 04, 2009 2:36 pm

Old list. Where is MSNBOT 2.0b? I've done a lot of research on this, and the primary IP range for the MSN bots is 65.55.xxx.xxx regardless of the UA. I saw the UA change immediately after blocking MSNBOT 2.0b in the harvester menu.

Cheers

unicornio
Involved

Joined: Aug 13, 2009
Posts: 432

Posted: Sat Dec 05, 2009 4:38 pm

Because "msnbot/2.0b" continued to crawl numerous pages and directories that are officially off limits via META tags, robots.txt rules, and X-Robots-Tag directives, I have now officially blocked it.

Open your .htaccess file, then copy and paste the following:

Code:
RewriteCond %{HTTP_USER_AGENT} ^msnbot/2\.0b [NC]
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteRule .* /robots.txt [L,R=301]


Upload it again. It is working for me. Could you please test it so we can see how to improve on this?
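
Since the thread reports the bot coming back under a different user agent from 65.55.xxx.xxx, the two checks can also be combined. The following is a sketch only, assuming mod_rewrite is enabled and that refusing the whole 65.55.*.* range is acceptable on your site. The [OR] flag and the 403 response are choices made for this sketch and are not part of the rules posted above.

```apache
# Sketch only. Assumes mod_rewrite is available and .htaccess overrides are allowed.
RewriteEngine On

# Condition 1: the User-Agent starts with msnbot/2.0b (case-insensitive) ...
RewriteCond %{HTTP_USER_AGENT} ^msnbot/2\.0b [NC,OR]
# ... OR condition 2: the client address is in the 65.55.*.* range.
RewriteCond %{REMOTE_ADDR} ^65\.55\.
# AND the request is not for robots.txt, so the bot can still read it.
RewriteCond %{REQUEST_URI} !^/robots\.txt$
# Refuse everything else with 403 Forbidden.
RewriteRule .* - [F,L]
```

The robots.txt exception plays a similar role to the one in the redirect rules: there it keeps the redirect from looping, here it keeps robots.txt reachable while every other request is refused.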