Ravens PHP Scripts: Forums
 

 

testy1
Involved



Joined: Apr 06, 2008
Posts: 484

Posted: Sat Nov 07, 2009 5:06 am

So I have dug out the Akismet module I was working on a long time back and decided to investigate it a little further (gives me something to do while I'm away :) ).

Basically I'm after some thoughts, ideas, and abuse on what should be there and what shouldn't be there; any thoughts that may help me. (And don't be shy to yell out if you're interested in helping.)

The easiest way is a list of thoughts so far, I suppose.




  • a way to submit missed spam
  • a way to submit false positives (ham)
  • Key Verification
  • Include all of $_SERVER, since Akismet would like this data



I was using a class I found but decided to start fresh with functions. So far I have got as far as:


  • checking that a connection to the Akismet servers is possible
  • checking key verification
  • checking whether a comment is spam or ham (all of $_SERVER is included)


Sample output:
Code:


resource(3) of type (stream)

string(243) "POST /1.1/verify-key HTTP/1.0
Host: rest.akismet.com
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Content-Length: 52
User-Agent: Ravennuke/2.4.0 | akismet_module/1.0

key=6c6f385dd095&blog=http://localhost/test/akismet/"

Connection OK

resource(4) of type (stream)

string(243) "POST /1.1/verify-key HTTP/1.0
Host: rest.akismet.com
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Content-Length: 52
User-Agent: Ravennuke/2.4.0 | akismet_module/1.0

key=6c6f385dd095&blog=http://localhost/test/akismet/"

Key Verification Successfull

resource(5) of type (stream)

string(1094) "POST /1.1/comment-check HTTP/1.0
Host: 6c6f385dd095.rest.akismet.com
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Content-Length: 886
User-Agent: Ravennuke/2.4.0 | akismet_module/1.0

blog=http://localhost/test/akismet/&HTTP_USER_AGENT=Mozilla%2F5.0+%28Windows%3B+U%3B+Windows+NT+5.1%3B+en-GB%3B+rv%3A1.9.1.5%29+Gecko%2F20091102+Firefox%2F3.5.5+%28.NET+CLR+3.5.30729%29&HTTP_ACCEPT=text%2Fhtml%2Capplication%2Fxhtml%2Bxml%2Capplication%2Fxml%3Bq%3D0.9%2C%2A%2F%2A%3Bq%3D0.8&HTTP_ACCEPT_LANGUAGE=en-gb%2Cen%3Bq%3D0.5&HTTP_ACCEPT_ENCODING=gzip%2Cdeflate&HTTP_ACCEPT_CHARSET=ISO-8859-1%2Cutf-8%3Bq%3D0.7%2C%2A%3Bq%3D0.7&SERVER_NAME=localhost&SERVER_ADDR=127.0.0.1&REMOTE_ADDR=127.0.0.1&REMOTE_PORT=1755&REQUEST_METHOD=GET&REQUEST_URI=%2Ftest%2Fakismet%2Findex.php&SCRIPT_NAME=%2Ftest%2Fakismet%2Findex.php&user_ip=127.0.0.1&user_agent=Mozilla%2F5.0+%28Windows%3B+U%3B+Windows+NT+5.1%3B+en-GB%3B+rv%3A1.9.1.5%29+Gecko%2F20091102+Firefox%2F3.5.5+%28.NET+CLR+3.5.30729%29&referer=&comment_author=Testy1&comment_author_email=&comment_author_url=&comment_content=viagra-test-123"

Comment Is SPAM
 
eldorado
Involved



Joined: Sep 10, 2008
Posts: 424
Location: France,Translator

Posted: Sat Nov 07, 2009 5:48 am

I've thought of it, and of a simple class to implement it, but failed miserably.

If you have something good, I'm interested for my Simple_Forum module, because I don't want client-side regulations :)

_________________
United-holy-dragons.net (My RN site)- Rejekz(cod4 clan) - gamerslounge 
nuken
RavenNuke(tm) Development Team



Joined: Mar 11, 2007
Posts: 2024
Location: North Carolina

Posted: Sat Nov 07, 2009 8:45 am

That would be a very nice addon. Are you integrating it into the whole site or just the forums? Either way it would be a major plus.

_________________
Tricked Out News 
testy1







Posted: Sat Nov 07, 2009 10:57 pm

nuken wrote:
That would be a very nice addon. Are you integrating it into the whole site or just the forums? Either way it would be a major plus.


The idea was primarily for RavenNuke as a whole, as there was talk of the forums becoming modular.

OK, according to Akismet:

Quote:

If it is at all possible, please modify the user agent string you request with to be of the following format:

Application Name/Version | Plugin Name/Version


As per Akismet, we would need the following user agent:

Code:


define('AKISMET_API_USERAGENT', 'Ravennuke/'. RAVENNUKE_VERSION_FRIENDLY .' | akismet_module/'. AKISMET_MODULE_VERSION);


Outputs
Quote:

Ravennuke/02.40.00 | akismet_module/1.0



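This function sends the POST request to Akismet with a connection timeout of 3 seconds (the timeout could be made configurable?). The HTTP request is constructed with full headers. The response is split into headers and body: the function returns a two-element array with the headers at index 0 and the body at index 1, so callers read the body from index 1. If the connection fails, it returns an empty string instead.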


Code:


function rnAkismet_http_post($request, $host, $path) {
  // Build the full HTTP/1.0 request: headers, a blank line, then the body.
  $http_request  = "POST $path HTTP/1.0\r\n";
  $http_request .= "Host: $host\r\n";
  $http_request .= "Content-Type: application/x-www-form-urlencoded; charset=utf-8\r\n";
  $http_request .= "Content-Length: " . strlen($request) . "\r\n";
  $http_request .= "User-Agent: " . AKISMET_API_USERAGENT . "\r\n";
  $http_request .= "\r\n";
  $http_request .= $request;

  // On connection failure $response stays an empty string, so callers
  // should test isset($response[1]) before reading the body.
  $response = '';
  if (false !== ($fs = @fsockopen($host, AKISMET_API_PORT, $errno, $errstr, 3))) {
    fwrite($fs, $http_request);
    while (!feof($fs)) {
      $response .= fgets($fs, 1160); // One TCP-IP packet
    }
    fclose($fs);
    // Split into array(0 => headers, 1 => body).
    $response = explode("\r\n\r\n", $response, 2);
  }

  return $response;
}
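
On the "configurable timeout" question: as far as I know, the timeout argument of fsockopen() only covers the connection attempt, and stream_set_timeout() is what limits the reads. A minimal sketch of both ideas, assuming two hypothetical helpers (rnAkismet_response_body and rnAkismet_fetch_raw are made-up names, not part of the module):

```php
<?php
// Hypothetical helper: return only the body of a raw HTTP response string.
function rnAkismet_response_body($raw) {
    $parts = explode("\r\n\r\n", $raw, 2);
    return isset($parts[1]) ? trim($parts[1]) : '';
}

// Hypothetical variant of the socket code with a configurable timeout.
// $timeout is used for the connect AND, via stream_set_timeout(), for reads.
function rnAkismet_fetch_raw($http_request, $host, $port = 80, $timeout = 3) {
    $fs = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($fs === false) {
        return ''; // connection failed
    }
    stream_set_timeout($fs, $timeout);
    fwrite($fs, $http_request);
    $raw = '';
    while (!feof($fs)) {
        $chunk = fgets($fs, 1160);
        if ($chunk === false) {
            break; // read error or read timeout
        }
        $raw .= $chunk;
    }
    fclose($fs);
    return $raw;
}
```

The timeout could then come from the module configuration instead of being hard-coded.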



Key Verification
Quote:

Key Verification — rest.akismet.com/1.1/verify-key

The key verification call should be made before beginning to use the service. It requires two variables, key and blog.

key (required)
The API key being verified for use with the API
blog (required)
The front page or home URL of the instance making the request. For a blog, site, or wiki this would be the front page. Note: Must be a full URI, including http://.

The call returns "valid" if the key is valid. This is the one call that can be made without the API key subdomain.



Raven Key Verification Function

Code:


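function rnAkismet_verify_key($key, $ip = null) {
  global $nukeurl;
  $response = '';
  if (!empty($key)) {
    // URL-encode both values so characters such as ":" and "/" survive the POST body.
    $request = 'key='. urlencode($key) .'&blog='. urlencode($nukeurl);
    // rnAkismet_http_post() takes three arguments; $ip is not used here
    // and is kept only so existing callers do not break.
    $response = rnAkismet_http_post($request, AKISMET_API_HOST, '/'. AKISMET_API_VERSION .'/verify-key');
  }
  // Index 1 holds the response body, which is "valid" for a good key.
  return (isset($response[1]) && 'valid' == $response[1]);
}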
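
Since the key only needs verifying before the service is used, it probably does not need a fresh HTTP round trip for every comment. A rough sketch of remembering the result for the current request, assuming a hypothetical wrapper (rnAkismet_verify_key_cached is a made-up name) that takes the real verifier as a callback:

```php
<?php
// Hypothetical wrapper: remember the verification result per key
// for the lifetime of the current PHP request.
function rnAkismet_verify_key_cached($key, $verifier) {
    static $cache = array();
    if (!array_key_exists($key, $cache)) {
        // Only the first call for a given key reaches the verifier.
        $cache[$key] = (bool) call_user_func($verifier, $key);
    }
    return $cache[$key];
}
```

In the module this would be called as rnAkismet_verify_key_cached($akismet_cfg['api_key'], 'rnAkismet_verify_key'). Keeping the result across requests would need the database or the config table, which is not shown here.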



Comment Check

OK, this is where I'm having design difficulties. Akismet requests the following:


Quote:

This is basically the core of everything. This call takes a number of arguments and characteristics about the submitted content and then returns a thumbs up or thumbs down. Almost everything is optional, but performance can drop dramatically if you exclude certain elements. I would recommend erring on the side of too much data, as everything is used as part of the Akismet signature.

blog (required)
The front page or home URL of the instance making the request. For a blog or wiki this would be the front page. Note: Must be a full URI, including http://.
user_ip (required)
IP address of the comment submitter.
user_agent (required)
User agent information.
referrer (note spelling)
The content of the HTTP_REFERER header should be sent here.
permalink
The permanent location of the entry the comment was submitted to.
comment_type
May be blank, comment, trackback, pingback, or a made up value like "registration".
comment_author
Submitted name with the comment
comment_author_email
Submitted email address
comment_author_url
Commenter URL.
comment_content
The content that was submitted.
Other server enviroment variables
In PHP there is an array of enviroment variables called $_SERVER which contains information about the web server itself as well as a key/value for every HTTP header sent with the request. This data is highly useful to Akismet as how the submited content interacts with the server can be very telling, so please include as much information as possible.

This call returns either "true" or "false" as the body content. True means that the comment is spam and false means that it isn't spam. If you are having trouble triggering you can send "viagra-test-123" as the author and it will trigger a true response, always.



Idea: use the function akismet_prepare_comment_data to prepare data such as the user IP, referrer and username; the function also checks whether the user is registered and fills in some of the info automatically. We could then send the data via the function rnAkismet_comment_check. The $sLink would be the link to the particular content, e.g. if you were looking at a news article here:

Code:


http://localhost/dfwmods/modules.php?name=News&file=article&sid=1


The $sLink would be:

Code:


article1.html
or, more likely, built from the article ID:
'article' . $sid . '.html'


The function:
Code:


function akismet_prepare_comment_data($sAuthor, $sEmail, $sLink, $sComment) {
  // $nukeurl is needed to build the permalink below.
  global $user, $nukeurl;

  // Registered users: name, email and website come from their profile.
  $userinfo = is_user($user) ? getusrinfo($user) : array();
  if (empty($sAuthor)) {
    $sAuthor = 'Anonymous';
  }

  // Prepare data that is common to nodes/comments.
  $comment_data = array();
  // IP address of the comment submitter.
  $comment_data['user_ip'] = $_SERVER['REMOTE_ADDR'];
  // User agent information of the comment submitter.
  $comment_data['user_agent'] = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
  // The content of the HTTP_REFERER header. Akismet spells this parameter "referrer".
  $comment_data['referrer'] = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
  // Submitted name with the comment.
  $comment_data['comment_author'] = isset($userinfo['username']) ? $userinfo['username'] : $sAuthor;
  // The permanent location of the entry the comment was submitted to.
  $comment_data['permalink'] = $nukeurl . $sLink;
  $comment_data['comment_author_email'] = isset($userinfo['user_email']) ? $userinfo['user_email'] : $sEmail;
  $comment_data['comment_author_url'] = isset($userinfo['user_website']) ? $userinfo['user_website'] : '';
  $comment_data['comment_content'] = $sComment;
  return $comment_data;
}
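
A side note on the permalink: if $nukeurl has no trailing slash and $sLink has no leading slash, plain concatenation runs the two together (the test output further down shows "dfwmodsmodules.php"). A minimal sketch of a join helper that avoids this; rnAkismet_join_url is a hypothetical name, not an existing RavenNuke function:

```php
<?php
// Hypothetical helper: join a base URL and a relative link with exactly one slash.
function rnAkismet_join_url($baseUrl, $sLink) {
    return rtrim($baseUrl, '/') . '/' . ltrim($sLink, '/');
}
```

With this, the permalink line would read $comment_data['permalink'] = rnAkismet_join_url($nukeurl, $sLink); and it would not matter how $nukeurl is stored.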


The following function would send the data via rnAkismet_http_post. The rnAkismet_comment_check function also ties in the server variables Akismet requested (see further down).
Code:


function rnAkismet_comment_check($comment_data) {
  global $akismet_cfg;
  if (!empty($akismet_cfg['api_key'])) {
    $comment_data = array_merge(rnAkismet_include_request(), $comment_data);
    $query_string = rnAkismet_build_string($comment_data);
    $host = $akismet_cfg['api_key'] .'.'. AKISMET_API_HOST;
    $response = rnAkismet_http_post($query_string, $host, '/'. AKISMET_API_VERSION .'/comment-check');
  }
  if (!isset($response[1])) {
    return AKISMET_API_RESULT_ERROR;
  }
  return ('true' == $response[1] ? AKISMET_API_RESULT_IS_SPAM : AKISMET_API_RESULT_IS_HAM);
}
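
rnAkismet_build_string is not shown in this thread. One way it could look, as a sketch only, is a thin wrapper around PHP's http_build_query(); the name rnAkismet_build_string_sketch is made up so it does not clash with the real helper:

```php
<?php
// Hypothetical stand-in for rnAkismet_build_string(): turn the comment data
// array into an application/x-www-form-urlencoded POST body.
function rnAkismet_build_string_sketch($data) {
    // Pass '&' explicitly so the arg_separator.output ini setting cannot
    // change the separator; RFC 1738 encoding writes spaces as "+".
    return http_build_query($data, '', '&', PHP_QUERY_RFC1738);
}
```

Note that http_build_query() drops entries whose value is null but keeps empty strings, so optional fields should be set to '' rather than null if they are meant to be sent.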


Akismet also asks:

Quote:

Other server enviroment variables
In PHP there is an array of enviroment variables called $_SERVER which contains information about the web server itself as well as a key/value for every HTTP header sent with the request. This data is highly useful to Akismet as how the submited content interacts with the server can be very telling, so please include as much information as possible.


We could achieve this with the following function, which I borrowed from egroupware.

Code:


function rnAkismet_include_request() {
  // You may add more elements here, but they are often related to internal server
  // data that makes little sense to check whether a comment is spam or not.
  // Be sure to not send HTTP_COOKIE as it may compromise your user's privacy!
  static $safe_to_send = array(
    'CONTENT_LENGTH',
    'CONTENT_TYPE',
    'HTTP_ACCEPT',
    'HTTP_ACCEPT_CHARSET',
    'HTTP_ACCEPT_ENCODING',
    'HTTP_ACCEPT_LANGUAGE',
    'HTTP_REFERER',
    'HTTP_USER_AGENT',
    'REMOTE_ADDR',
    'REMOTE_PORT',
    'SCRIPT_URI',
    'SCRIPT_URL',
    'SERVER_ADDR',
    'SERVER_NAME',
    'REQUEST_METHOD',
    'REQUEST_URI',
    'SCRIPT_NAME'
  );

  // The contents of $_SERVER don't change during a request,
  // so we can cache the filtered result in static storage.
  static $server_data;
  if (!$server_data) {
    $server_data = array();
    foreach ($_SERVER as $key => $value) {
      if (in_array($key, $safe_to_send)) {
        $server_data[$key] = $value;
      }
    }
  }
  return $server_data;
}
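
The same whitelist filter can be written without the loop. A small sketch, assuming a hypothetical helper (rnAkismet_filter_server is a made-up name) that takes the source array as a parameter so it can be tried without a web request:

```php
<?php
// Hypothetical helper: keep only the whitelisted keys of a $_SERVER-style array.
function rnAkismet_filter_server($server, $allowed) {
    // array_flip() turns the list of names into keys;
    // array_intersect_key() keeps the entries whose key is in that set.
    return array_intersect_key($server, array_flip($allowed));
}
```

Calling rnAkismet_filter_server($_SERVER, $safe_to_send) would give the same data as the loop above, and HTTP_COOKIE stays out because it is not in the list.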




Next, all comments classified as spam would be held in the database so that an administrator could resubmit a comment as ham (not spam) in case of a false positive. Likewise, each comment should have a new link that allows administrators to mark it as spam in case it gets through.
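
For the resubmission part, the Akismet API has submit-spam and submit-ham calls that take the same parameters as comment-check. A sketch of picking the right host and path, assuming a hypothetical helper (rnAkismet_feedback_target is a made-up name; the default host and version mirror the sample requests earlier in this thread):

```php
<?php
// Hypothetical helper: work out where a missed-spam or false-positive
// report should be sent.
function rnAkismet_feedback_target($apiKey, $isSpam, $apiHost = 'rest.akismet.com', $apiVersion = '1.1') {
    return array(
        // Like comment-check, these calls use the API key as a subdomain.
        'host' => $apiKey . '.' . $apiHost,
        'path' => '/' . $apiVersion . '/' . ($isSpam ? 'submit-spam' : 'submit-ham'),
    );
}
```

The stored comment data would then be posted to that host and path with rnAkismet_http_post, which is why the original request data needs to be kept in the database alongside the comment.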

I guess we could now tie all this in with a function in mainfile.

Is there anything I'm missing, or can anyone see any problems with this?
 
testy1







Posted: Sat Nov 07, 2009 11:58 pm

OK, here is the mainfile function so far:

Code:


function check_spam($sAuthor, $sEmail, $sLink, $sComment) {
  global $akismet_cfg, $nukeurl, $user;
  include_once (INCLUDE_PATH . 'includes/akismet.inc.php');
  // First check server connectivity.
  $Connect = akismet_check_server_connectivity();
  if ($Connect === true) {
    // Next verify the key.
    $VerifyKey = rnAkismet_verify_key($akismet_cfg['api_key']);
    if ($VerifyKey === true) {
      // Registered users: prefer the profile data over the submitted values.
      $userinfo = is_user($user) ? getusrinfo($user) : array();
      if (isset($userinfo['username'])) {
        $sAuthor = $userinfo['username'];
      } elseif (empty($sAuthor)) {
        $sAuthor = 'Anonymous';
      }
      $sEmail = isset($userinfo['user_email']) ? $userinfo['user_email'] : $sEmail;
      // Validate the email address. The local-part pattern has its own
      // variable so that it does not overwrite the global $user.
      $localPart = '[a-zA-Z0-9_\-\.\+\^!#\$%&*+\/\=\?\`\|\{\}~\']+';
      $domain = '(?:(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.?)+';
      $ipv4 = '[0-9]{1,3}(\.[0-9]{1,3}){3}';
      $ipv6 = '[0-9a-fA-F]{1,4}(\:[0-9a-fA-F]{1,4}){7}';
      $VerifyEmail = (bool)preg_match("/^$localPart@($domain|(\[($ipv4|$ipv6)\]))$/", $sEmail);
      $sEmail = $VerifyEmail ? $sEmail : '';
      $sLink = htmlentities($sLink, ENT_QUOTES);
      $sComment = htmlentities(strip_tags($sComment), ENT_QUOTES);
      // Next prepare the data (the helper looks up the user's website itself).
      $Prepare = akismet_prepare_comment_data($sAuthor, $sEmail, $sLink, $sComment);
      $CheckComment = rnAkismet_comment_check($Prepare);
      // Compare against the named constants and report API errors separately.
      if ($CheckComment === AKISMET_API_RESULT_IS_SPAM) {
        echo 'Comment Is SPAM!<br />';
      } elseif ($CheckComment === AKISMET_API_RESULT_IS_HAM) {
        echo 'Comment Is HAM!<br />';
      } else {
        echo 'Akismet Check Failed<br />';
      }
    } else {
      echo 'Key Verification Failed<br />';
    }
  } else {
    echo 'Connection Failed<br />';
  }
}
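
As an alternative to the hand-written email pattern, PHP's filter extension (PHP 5.2 and later) has a built-in validator. This is a swapped-in technique, not what the function above does, and it may accept or reject slightly different edge cases than the regex. A sketch with a hypothetical helper name:

```php
<?php
// Hypothetical helper: return the address if it validates, otherwise ''.
function rnAkismet_clean_email($email) {
    $email = trim($email);
    // filter_var() returns the filtered string on success and false on failure.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false ? $email : '';
}
```

Inside check_spam this would replace the four pattern variables and the preg_match call with a single line: $sEmail = rnAkismet_clean_email($sEmail);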



Output Test 1;
Code:


Ravennuke/02.40.00 | akismet_module/1.0

array(8) {
  ["user_ip"]=>
  string(9) "127.0.0.1"
  ["user_agent"]=>
  string(109) "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)"
  ["referer"]=>
  string(45) "http://localhost/dfwmods/admin.php?op=akismet"
  ["comment_author"]=>
  string(6) "Testy1"
  ["permalink"]=>
  string(72) "http://localhost/dfwmodsmodules.php?name=News&file=article&sid=1"
  ["comment_author_email"]=>
  string(16) "egmade@local.com"
  ["comment_author_url"]=>
  string(23) "http://www.spamsite.com"
  ["comment_content"]=>
  string(29) "hi there people nice function"
}

Comment Is HAM!


Output Test 2;
Code:


Ravennuke/02.40.00 | akismet_module/1.0

array(8) {
  ["user_ip"]=>
  string(9) "127.0.0.1"
  ["user_agent"]=>
  string(109) "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)"
  ["referer"]=>
  string(45) "http://localhost/dfwmods/admin.php?op=akismet"
  ["comment_author"]=>
  string(6) "Testy1"
  ["permalink"]=>
  string(72) "http://localhost/dfwmodsmodules.php?name=News&file=article&sid=1"
  ["comment_author_email"]=>
  string(16) "egmade@local.com"
  ["comment_author_url"]=>
  string(23) "http://www.spamsite.com"
  ["comment_content"]=>
  string(15) "viagra-test-123"
}

Comment Is SPAM!
 
Guardian2003
Site Admin



Joined: Aug 28, 2003
Posts: 6799
Location: Ha Noi, Viet Nam

Posted: Sun Nov 08, 2009 5:33 am

This is what I have been using: [link hidden by the forum for unregistered users]
 
testy1







Posted: Sun Nov 08, 2009 5:44 am

I was using something similar but decided to go with a rewrite. Do you think a class is the way to go?
 
Guardian2003







Posted: Sun Nov 08, 2009 5:55 am

I would, because you can just 'include' it whenever you need it; you don't have to have it loaded all the time. All the code is in one place, so if Akismet ever changes, it's easier to find the bit of code you need to change.
I used it in a couple of modules, but since I have not had any spam since RN 2.x it didn't seem worthwhile incorporating it any more.
 
testy1







Posted: Sun Nov 08, 2009 4:10 pm

OK, I'll check it out, thanks.
 
kguske
Site Admin



Joined: Jun 04, 2004
Posts: 6432

Posted: Sat Jul 10, 2010 9:12 am

Sorry I missed this originally. I was looking at this for a series of standalone form-to-email applications. I reviewed 4 libraries (either class, include functions, or both).

I remembered that Guardian was working on this, and asked him about it. His response might help others, so I am re-posting part of it here (thanks, G):

Guardian2003 wrote:
I think I would probably look at Project Honeypot as an initial filter and then add a manually updated blacklist, as I should think it would be more viable in terms of the man-hours spent updating the blacklist once a month.

If you're using a class or developing your own code to access the Akismet API, use the HTTP 1.0 protocol, NOT HTTP 1.1; 1.0 is much faster and works just as well.

Since Project Honeypot is IP-based, and IPs are so easy to spoof, I'm wondering how effective that would be. As far as stopping feedback / comment / forum spam, a content analysis approach like Akismet seems more effective. Checking the IP first might weed out a few spammers, but it would not be long before even those bottom-feeders circumvent that. Of course, Guardian has been looking at this MUCH longer than I - so I'd be especially interested in further discussion on this (hence this post)...
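
For reference, Project Honeypot's http:BL check is a DNS lookup rather than an HTTP call. As I understand its documentation, the query name is the access key, then the visitor's IPv4 octets in reverse order, then dnsbl.httpbl.org. A sketch of building that name; rnHttpbl_query_name is a hypothetical helper, and the key in the usage note is a placeholder:

```php
<?php
// Hypothetical helper: build the http:BL DNS query name for an IPv4 address.
// Returns false when $ip is not a valid IPv4 address.
function rnHttpbl_query_name($accessKey, $ip) {
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        return false;
    }
    // 127.9.1.2 becomes 2.1.9.127
    $reversed = implode('.', array_reverse(explode('.', $ip)));
    return $accessKey . '.' . $reversed . '.dnsbl.httpbl.org';
}
```

The resulting name would be resolved with something like gethostbyname(rnHttpbl_query_name('abcdefghijkl', $_SERVER['REMOTE_ADDR'])), and only visitors that pass this IP check would go on to the slower Akismet content check.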

Also, I'm interested in further feedback on Akismet 1.0 vs. 1.1. Everything I've seen uses 1.1, though that could be just a "latest-is-greatest" mentality. Is there any difference in the interface, or is it just the Akismet version parameter that gets passed?

_________________
I search, therefore I exist...
nukeSEO - nukeFEED - nukePIE - nukeSPAM - nukeWYSIWYG
 
sixonetonoffun
Spouse Contemplates Divorce



Joined: Jan 02, 2003
Posts: 2496

Posted: Sun Jul 11, 2010 7:04 am

I think it would be a nice pre-registration mod on any site, which also takes real-time operation out of the equation to some extent. I've been too lazy to pursue it, as it's not been an issue for me either.

_________________
openSUSE 11.4-x86 | Linux 2.6.37.1-1.2desktop i686 | KDE: 4.6.41>=4.7 | XFCE 4.8 | AMD Athlon(tm) XP 3000+ | MSI K7N2 Delta-L | 3GB Black Diamond DDR
| GeForce 6200@433Mhz 512MB | Xorg 1.9.3 | NVIDIA 270.30
kguske







Posted: Sun Jul 11, 2010 7:47 am

Six, to clarify, by "it" do you mean Project Honeypot or Akismet? And what do you mean by real-time operation? Thanks...
 
sixonetonoffun







Posted: Sun Jul 11, 2010 8:42 am

I'd say either could apply, but I was thinking of Akismet.
Real time as in checking the validity of a poster's email on an anonymous reply to an active comment thread. The operation could be way too slow, whereas an onsite database would be feasible as an option.
(Not that allowing anonymous posters doesn't lead to trolls and flame wars, but it is done frequently.)

A delay at registration confirmation, on the other hand, would hardly be noticed.

Am I just confusing the use of Akismet and the topic?

Could be I'm just too tired to comprehend anything!

Edit some hours of sleep later:
Ok so someone registered on my site using a suspected spammer name/email so I decided to look at another option for fun.

Much like NukeSentinel is [link hidden by the forum for unregistered users]; I'm testing it now.
Likes:
Easy to update csv file for ip bans.
Fast load

Dislikes:
The installer seems quirky; it is easier to upload the files to the live site after setting them up locally.
Not a native solution; it duplicates some native checking.

Off topic will start a new thread after some testing.
 