Ravens PHP Scripts: Forums
 

 

Ravens PHP Scripts And Web Hosting Forum Index -> Raven's RavenNuke(tm) v2.02.02 Distro
Bluezzz
Involved



Joined: Feb 08, 2005
Posts: 290
Location: USA

Posted: Sat Jan 13, 2007 6:17 am

How do I block Googlebot from Private Messages? I'm pretty sure I have robots.txt set to disallow that module ... so how is it that it's going there anyway?

_________________
Bluezzz
~ Stop & smell the roses, while you can! ~ 
kguske
Site Admin



Joined: Jun 04, 2004
Posts: 6432

Posted: Sat Jan 13, 2007 8:41 am

How do you have robots.txt set to block it?

_________________
I search, therefore I exist...
nukeSEO - nukeFEED - nukePIE - nukeSPAM - nukeWYSIWYG
 
Guardian2003
Site Admin



Joined: Aug 28, 2003
Posts: 6799
Location: Ha Noi, Viet Nam

Posted: Sat Jan 13, 2007 10:58 am

If you are using the default robots.txt, then Google is permitted in Private Messages inasmuch as it is allowed to index legitimate links to it from within your site.
For example, your home page menu has links to Private Messages, and so do the Member List module (if it is publicly accessible) and the forum pages.

As Private Messages cannot be sent by 'anonymous', you may as well make sure the Private Messages module's access permission is set to 'Registered Only' in Nuke's module administration area, and hope Google will drop the links it has already indexed.
 
kguske







Posted: Sat Jan 13, 2007 11:04 am

It will respect robots.txt if you have added a restriction against looking at the Private Messages module.
 
Bluezzz







Posted: Sat Jan 13, 2007 3:38 pm

Mine looks like this ... not sure it's right mind you LOL

User-agent: Googlebot-Image
Disallow: /

User-agent: *
Crawl-delay: 20
Disallow: /modules.php?name=Top&querylang=union%20select%200,pwd?id=honeytrap
Disallow: /_private/
Disallow: /_vti_bin/
Disallow: /_vti_cnf/
Disallow: /_vti_log/
Disallow: /_vti_pvt/
Disallow: /_vti_txt/
Disallow: /abuse/
Disallow: /admin/
Disallow: /blocks/
Disallow: /cgi-bin/
Disallow: /db/
Disallow: /images/
Disallow: /includes/
Disallow: /language/
Disallow: /modules/
Disallow: /themes/
Disallow: /admin.php
Disallow: /config.php
Disallow: /demohack.php
Disallow: /cplogin.php
Disallow: /hackattempt.php
Disallow: /login.php

Doesn't Disallow: /modules block the bot? If not what should my robots.txt look like please? Thanks : o}
 
Bluezzz







Posted: Sat Jan 13, 2007 3:50 pm

I've also seen the bot in Your Account and Profiles, I need to block it from those as well!
 
Bluezzz







Posted: Sat Jan 13, 2007 4:22 pm

PMs are set to registered only. But Your Account has to be open to all. Profiles, I assume, is another way of saying Your Account? Anyway, I don't want the bots into anything *private*. My Member List is Admin only now, so I'd rather the bots weren't into that either (to keep people's emails private).

How do I add on to the robots.txt to keep them out of those three areas?
 
jakec
Site Admin



Joined: Feb 06, 2006
Posts: 3048
Location: United Kingdom

Posted: Sat Jan 13, 2007 4:51 pm

Once you set them to registered or admin only, Googlebot should not be able to access them, so you should be safe.
 
Bluezzz







Posted: Sat Jan 13, 2007 5:33 pm

It is accessing them, and it is Registered only ... that's why I'm posting LOL. I don't want the bot in any of the private stuff like Member List, Private Messages, Your Account or Profiles. The only one of those that is public is Your Account, and it has to be so people can join. So, how to keep the bot out of those private areas?

Or, are you saying even though they are getting into those areas they are not logging what they find? If that's the case why are they in them at all then?
 
Guardian2003







Posted: Sat Jan 13, 2007 8:48 pm

What links is Google getting to exactly?
I am assuming you have some data which tells you where Google is going, so it might help if you can give a couple of example links.
 
kguske







Posted: Sat Jan 13, 2007 8:49 pm

Disallow: /modules/ will not block bots from those pages, since their URLs start with /modules.php, not /modules/. But
Code:
Disallow: /modules.php?name=Private_Messages
should.
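
To see why the directory rule misses these URLs, here is a minimal Python sketch of the prefix matching that Disallow lines rely on. The helper name is_disallowed is made up for illustration, and the sketch deliberately ignores user-agent groups, Allow lines and wildcard extensions, so treat it as a simplification rather than how any particular crawler is implemented.

```python
def is_disallowed(url_path, disallow_rules):
    """Return True if url_path starts with any non-empty Disallow value.

    Simplified: plain prefix matching only, with no Allow lines,
    wildcards, or per-user-agent groups.
    """
    return any(rule and url_path.startswith(rule) for rule in disallow_rules)


# A few of the rules from the robots.txt posted earlier in this thread.
posted_rules = ["/modules/", "/admin.php", "/config.php"]

# A Private Messages URL, assuming Nuke is installed at the web root.
pm_url = "/modules.php?name=Private_Messages&file=index&mode=post&u=56"

print(is_disallowed(pm_url, posted_rules))  # False
print(is_disallowed(pm_url, posted_rules + ["/modules.php?name=Private_Messages"]))  # True
print(is_disallowed("/modules/Private_Messages/index.php", posted_rules))  # True
```

The directory rule only catches direct requests for files under /modules/, while the pages visitors and bots actually load all go through modules.php.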
 
Bluezzz







Posted: Sat Jan 13, 2007 9:47 pm

Where should the robots.txt file be btw? I currently have two, one in the domain root and one in the nuke root ... do I need two? If only one is necessary, which should I keep?
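
For what it's worth, crawlers only request robots.txt from the root of the host, so a copy sitting inside a subfolder is never consulted. Below is a small Python sketch of how the robots.txt URL is derived from any page URL. The host www.example.com and the folder name nukefolder are placeholders, and robots_url_for is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url_for(page_url):
    """Build the robots.txt URL a crawler would check for page_url.

    Only the scheme and host are kept; the path is always /robots.txt.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Placeholder URL for a Nuke site installed in a subfolder.
page = "http://www.example.com/nukefolder/modules.php?name=Private_Messages"
print(robots_url_for(page))  # http://www.example.com/robots.txt
```

So if Nuke lives in a subfolder, the rules in the domain-root file need the folder in the path, for example Disallow: /nukefolder/modules.php?name=Private_Messages.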
 
Bluezzz







Posted: Sat Jan 13, 2007 9:51 pm

Googlebot samples from IP Tracking ...

/nukefolder/modules.php?name=Private_Messages&file=index&mode=post&u=56 2007-01-13 03:44:10
/nukefolder/modules.php?name=Your_Account&redirect=posting&mode=quote&p=681
 
evaders99
Former Moderator in Good Standing



Joined: Apr 30, 2004
Posts: 3221

Posted: Sat Jan 13, 2007 11:55 pm

Mm, that's interesting. Can we get the IPs on those? It may not really be Googlebot, especially if there are no public links on your site to those pages.

_________________
- Star Wars Rebellion Network -

Need help? Nuke Patched Core, Coding Services, Webmaster Services 
Bluezzz







Posted: Sun Jan 14, 2007 1:21 am

This is only a small list of where this one has been but the Your Account and Private Messages concern me ...

66.249.72.237 crawl-66-249-72-237.googlebot.com
/nukefolder/modules.php?name=Your_Account&redirect=posting&mode=quote&p=639 2007-01-11 10:48:23
/nukefolder/modules.php?name=Your_Account&redirect=posting&mode=quote&p=638 2007-01-11 10:43:20
/nukefolder/modules.php?name=Your_Account&redirect=posting&mode=quote&p=645 2007-01-11 10:38:21
/nukefolder/modules.php?name=Your_Account&redirect=posting&mode=quote&p=641 2007-01-11 10:23:21
/nukefolder/modules.php?name=Private_Messages&file=index&mode=post&u=53

Here are two more IPs from the same domain
66.249.66.163 crawl-66-249-66-163.googlebot.com
66.249.66.40 crawl-66-249-66-40.googlebot.com

A Whois search brings up google for these IPs.
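
A reverse lookup plus Whois is a reasonable start. The check usually suggested for confirming Googlebot is a reverse DNS lookup followed by a forward lookup on the returned name. Here is a rough Python sketch of that idea, using only the standard socket module. The function names are invented for this example, and the suffix list is an assumption limited to the two domains seen or implied in this thread.

```python
import socket

# Assumed list of domains that genuine Google crawler hostnames end in.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def hostname_is_google(hostname):
    """True if a reverse-DNS name ends in one of the assumed Google domains."""
    return hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES)


def verify_googlebot(ip):
    """Reverse-resolve ip, check the domain, then forward-resolve to confirm.

    Needs working DNS; returns False on any lookup failure.
    """
    try:
        hostname = socket.gethostbyaddr(ip)[0]
        if not hostname_is_google(hostname):
            return False
        # The forward lookup must map back to the same IP, otherwise the
        # reverse record could have been set by whoever controls the IP block.
        return ip in socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return False


print(hostname_is_google("crawl-66-249-72-237.googlebot.com"))  # True
print(hostname_is_google("crawl-66-249-72-237.googlebot.com.example.net"))  # False
```

Calling verify_googlebot("66.249.72.237") would run the full check against live DNS; the result depends on what the resolver returns at the time.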
 
evaders99







Posted: Mon Jan 15, 2007 2:45 am

Well, the first ones are valid redirects. Google sees public forum links such as
modules.php?name=Forums&file=posting&mode=quote&p=639
and, because it is not logged in, gets redirected to Your Account. The Private Messages links are public in the same way, so it sees those too.

These are public URLs that phpBB gives out. They do not indicate that Googlebot is actually reading any private messages or trying to post something.
 
Bluezzz







Posted: Thu Jan 18, 2007 12:19 am

OK, so I shouldn't be concerned about it showing as Private Messages or Your Account? Or should I disallow both of these features in robots.txt just to be safe? Thanks for your help btw!
 
evaders99







Posted: Thu Jan 18, 2007 12:48 am

You could disallow them. It really doesn't matter, though. I don't think Google will index them, because there is no original content on them.
 
All times are GMT - 6 Hours
 