testy1
Involved
Joined: Apr 06, 2008
Posts: 483

Posted: Sat Nov 07, 2009 5:06 am

So I have dug out the Akismet module I was working on a long time back and decided to investigate it a little further (gives me something to do while I'm away :) ).

Basically I'm after some thoughts, ideas, or abuse on what should or shouldn't be there. Any thoughts that may help me are welcome (and don't be shy to yell out if you're interested in helping).

The easiest way is a list of thoughts so far, I suppose:

  • a way to submit missed spam
  • a way to submit false positives (ham)
  • key verification
  • include the $_SERVER data, since Akismet would like this

I was using a class I found but decided to start fresh with functions. I have got as far as:

  • checking that a connection to the Akismet servers is possible
  • checking key verification
  • checking whether a comment is spam or ham (the $_SERVER data is included)

Sample Output;
Code:

resource(3) of type (stream)

string(243) "POST /1.1/verify-key HTTP/1.0
Host: rest.akismet.com
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Content-Length: 52
User-Agent: Ravennuke/2.4.0 | akismet_module/1.0

key=6c6f385dd095&blog=http://localhost/test/akismet/"

Connection OK

resource(4) of type (stream)

string(243) "POST /1.1/verify-key HTTP/1.0
Host: rest.akismet.com
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Content-Length: 52
User-Agent: Ravennuke/2.4.0 | akismet_module/1.0

key=6c6f385dd095&blog=http://localhost/test/akismet/"

Key Verification Successfull

resource(5) of type (stream)

string(1094) "POST /1.1/comment-check HTTP/1.0
Host: 6c6f385dd095.rest.akismet.com
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Content-Length: 886
User-Agent: Ravennuke/2.4.0 | akismet_module/1.0

blog=http://localhost/test/akismet/&HTTP_USER_AGENT=Mozilla%2F5.0+%28Windows%3B+U%3B+Windows+NT+5.1%3B+en-GB%3B+rv%3A1.9.1.5%29+Gecko%2F20091102+Firefox%2F3.5.5+%28.NET+CLR+3.5.30729%29&HTTP_ACCEPT=text%2Fhtml%2Capplication%2Fxhtml%2Bxml%2Capplication%2Fxml%3Bq%3D0.9%2C%2A%2F%2A%3Bq%3D0.8&HTTP_ACCEPT_LANGUAGE=en-gb%2Cen%3Bq%3D0.5&HTTP_ACCEPT_ENCODING=gzip%2Cdeflate&HTTP_ACCEPT_CHARSET=ISO-8859-1%2Cutf-8%3Bq%3D0.7%2C%2A%3Bq%3D0.7&SERVER_NAME=localhost&SERVER_ADDR=127.0.0.1&REMOTE_ADDR=127.0.0.1&REMOTE_PORT=1755&REQUEST_METHOD=GET&REQUEST_URI=%2Ftest%2Fakismet%2Findex.php&SCRIPT_NAME=%2Ftest%2Fakismet%2Findex.php&user_ip=127.0.0.1&user_agent=Mozilla%2F5.0+%28Windows%3B+U%3B+Windows+NT+5.1%3B+en-GB%3B+rv%3A1.9.1.5%29+Gecko%2F20091102+Firefox%2F3.5.5+%28.NET+CLR+3.5.30729%29&referer=&comment_author=Testy1&comment_author_email=&comment_author_url=&comment_content=viagra-test-123"

Comment Is SPAM

eldorado
Involved
Joined: Sep 10, 2008
Posts: 414
Location: France,Translator

Posted: Sat Nov 07, 2009 5:48 am

I've thought about it, and about a simple class to implement it, but failed miserably.

If you have something good, I'm interested in it for my Simple_Forum module, because I don't want client-side regulations :)

nuken
RavenNuke(tm) Development Team
Joined: Mar 11, 2007
Posts: 1536
Location: North Carolina

Posted: Sat Nov 07, 2009 8:45 am

That would be a very nice addon. Are you integrating it into the whole site or just the forums? Either way it would be a major plus.

testy1
Involved
Joined: Apr 06, 2008
Posts: 483

Posted: Sat Nov 07, 2009 10:57 pm

nuken wrote:
That would be a very nice addon. Are you integrating it into the whole site or just the forums? Either way it would be a major plus.


The idea was primarily for Raven, as there was talk of the forums becoming modular.

OK, according to Akismet:

Quote:

If it is at all possible, please modify the user agent string you request with to be of the following format:

Application Name/Version | Plugin Name/Version


As per Akismet, we would need the following user agent:

Code:

define('AKISMET_API_USERAGENT', 'Ravennuke/'. RAVENNUKE_VERSION_FRIENDLY .' | akismet_module/'. AKISMET_MODULE_VERSION);


Outputs
Quote:

Ravennuke/02.40.00 | akismet_module/1.0



This function sends the POST request to Akismet with a timeout of 3 seconds (the timeout could be made configurable?). The HTTP request is constructed with full headers. The response is split into headers and body, and the function returns both as an array; callers read the body from $response[1] and ignore the headers.


Code:

function rnAkismet_http_post($request, $host, $path) {
  // Build the full HTTP/1.0 request: headers, blank line, then the form body.
  $http_request  = "POST $path HTTP/1.0\r\n";
  $http_request .= "Host: $host\r\n";
  $http_request .= "Content-Type: application/x-www-form-urlencoded; charset=utf-8\r\n";
  $http_request .= "Content-Length: " . strlen($request) . "\r\n";
  $http_request .= "User-Agent: " . AKISMET_API_USERAGENT . "\r\n";
  $http_request .= "\r\n";
  $http_request .= $request;

  // An empty array is returned when the connection fails.
  $response = array();
  // Connect with a 3 second timeout; @ suppresses the warning on failure.
  if (false !== ($fs = @fsockopen($host, AKISMET_API_PORT, $errno, $errstr, 3))) {
    fwrite($fs, $http_request);
    $raw = '';
    while (!feof($fs)) {
      $raw .= fgets($fs, 1160); // Read line by line, at most 1159 bytes per call
    }
    fclose($fs);
    // Split into array(0 => headers, 1 => body).
    $response = explode("\r\n\r\n", $raw, 2);
  }

  return $response;
}
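
Since the function hands back the raw header block alongside the body, a caller could also look at the HTTP status before trusting the body. A minimal sketch of that idea follows; example_split_http_response is a hypothetical helper and not part of the module posted in this thread.

```php
<?php
/**
 * Hypothetical helper: split a raw HTTP response into status code,
 * header block and body.
 */
function example_split_http_response($raw) {
    $parts   = explode("\r\n\r\n", $raw, 2);
    $headers = $parts[0];
    $body    = isset($parts[1]) ? $parts[1] : '';

    // The status line looks like "HTTP/1.0 200 OK".
    $status = 0;
    if (preg_match('#^HTTP/\d\.\d\s+(\d{3})#', $headers, $m)) {
        $status = (int) $m[1];
    }

    return array(
        'status'  => $status,
        'headers' => $headers,
        'body'    => trim($body),
    );
}

$raw    = "HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\nvalid";
$parsed = example_split_http_response($raw);
echo $parsed['status'], ' ', $parsed['body'], "\n"; // prints "200 valid"
```

A status of 0 here simply means no status line was found, for example when the connection failed and the raw response is empty.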



Key Verification
Quote:

Key Verification — rest.akismet.com/1.1/verify-key

The key verification call should be made before beginning to use the service. It requires two variables, key and blog.

key (required)
The API key being verified for use with the API
blog (required)
The front page or home URL of the instance making the request. For a blog, site, or wiki this would be the front page. Note: Must be a full URI, including [link hidden by the forum]


The call returns "valid" if the key is valid. This is the one call that can be made without the API key subdomain.



Raven Key Verification Function

Code:

function rnAkismet_verify_key($key) {
  global $nukeurl;
  if (empty($key)) {
    return false;
  }
  // URL-encode both values so special characters cannot break the request body.
  $request = 'key=' . urlencode($key) . '&blog=' . urlencode($nukeurl);
  $response = rnAkismet_http_post($request, AKISMET_API_HOST, '/' . AKISMET_API_VERSION . '/verify-key');
  // $response[1] holds the response body; Akismet answers "valid" for a good key.
  return isset($response[1]) && 'valid' == trim($response[1]);
}
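
As an alternative to concatenating the request body by hand, PHP's http_build_query can encode the two required fields in one step. This is only a sketch; example_verify_key_body and the key value 'abc123' are made up for illustration.

```php
<?php
/**
 * Hypothetical helper: build the form-encoded body for the verify-key call.
 * The two field names (key, blog) come from the Akismet docs quoted above.
 */
function example_verify_key_body($key, $siteUrl) {
    // http_build_query URL-encodes every value and joins the pairs with "&".
    return http_build_query(array('key' => $key, 'blog' => $siteUrl), '', '&');
}

echo example_verify_key_body('abc123', 'http://localhost/test/akismet/'), "\n";
// prints: key=abc123&blog=http%3A%2F%2Flocalhost%2Ftest%2Fakismet%2F
```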



Comment Check

OK, this is where I'm having design difficulties. Akismet requests the following:


Quote:

This is basically the core of everything. This call takes a number of arguments and characteristics about the submitted content and then returns a thumbs up or thumbs down. Almost everything is optional, but performance can drop dramatically if you exclude certain elements. I would recommend erring on the side of too much data, as everything is used as part of the Akismet signature.

blog (required)
The front page or home URL of the instance making the request. For a blog or wiki this would be the front page. Note: Must be a full URI, including [link hidden by the forum]

user_ip (required)
IP address of the comment submitter.
user_agent (required)
User agent information.
referrer (note spelling)
The content of the HTTP_REFERER header should be sent here.
permalink
The permanent location of the entry the comment was submitted to.
comment_type
May be blank, comment, trackback, pingback, or a made up value like "registration".
comment_author
Submitted name with the comment
comment_author_email
Submitted email address
comment_author_url
Commenter URL.
comment_content
The content that was submitted.
Other server enviroment variables
In PHP there is an array of enviroment variables called $_SERVER which contains information about the web server itself as well as a key/value for every HTTP header sent with the request. This data is highly useful to Akismet as how the submited content interacts with the server can be very telling, so please include as much information as possible.

This call returns either "true" or "false" as the body content. True means that the comment is spam and false means that it isn't spam. If you are having trouble triggering you can send "viagra-test-123" as the author and it will trigger a true response, always.



Idea: use the function akismet_prepare_comment_data to prepare the data (user IP, referrer, username, etc.). The function also checks whether the user is registered and fills in some info automatically. We could then send the data via the function rnAkismet_comment_check. The $sLink would be the link to the particular content, e.g. if you were looking at a news article here

Code:

http://localhost/dfwmods/modules.php?name=News&file=article&sid=1


the $sLink would be;

Code:

article1.html

or, built from the article id, probably more like;

Code:

'article' . $sid . '.html'
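
Because the permalink is the site URL plus this short link, it matters whether the site URL ends in a slash. A small sketch of a join that works either way; example_join_url is a hypothetical helper, and the base URL is taken from the example above.

```php
<?php
/**
 * Hypothetical helper: join a base URL and a relative path with exactly
 * one slash between them, whether or not the base ends in "/".
 */
function example_join_url($base, $path) {
    return rtrim($base, '/') . '/' . ltrim($path, '/');
}

$sid = 1;
echo example_join_url('http://localhost/dfwmods', 'article' . $sid . '.html'), "\n";
// prints: http://localhost/dfwmods/article1.html
```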


The function;
Code:

function akismet_prepare_comment_data($sAuthor, $sEmail, $sLink, $sComment) {
  global $user, $nukeurl;
  // Registered users supply their own name, e-mail and website.
  $userinfo = is_user($user) ? getusrinfo($user) : array();
  if (empty($sAuthor)) {
    $sAuthor = 'Anonymous';
  }

  // Prepare data that is common to nodes/comments.
  $comment_data = array();
  // IP address of the comment submitter.
  $comment_data['user_ip'] = $_SERVER['REMOTE_ADDR'];
  // User agent information of the comment submitter.
  $comment_data['user_agent'] = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
  // Content of the HTTP_REFERER header. Akismet spells this key "referrer".
  $comment_data['referrer'] = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
  // Submitted name with the comment.
  $comment_data['comment_author'] = isset($userinfo['username']) ? $userinfo['username'] : $sAuthor;
  // Permanent location of the entry, joined with exactly one slash.
  $comment_data['permalink'] = rtrim($nukeurl, '/') . '/' . ltrim($sLink, '/');
  $comment_data['comment_author_email'] = isset($userinfo['user_email']) ? $userinfo['user_email'] : $sEmail;
  $comment_data['comment_author_url'] = isset($userinfo['user_website']) ? $userinfo['user_website'] : '';
  $comment_data['comment_content'] = $sComment;
  return $comment_data;
}


The following function would send the data via the function rnAkismet_http_post. The rnAkismet_comment_check function also ties in the server variables Akismet requested (see further down).
Code:

function rnAkismet_comment_check($comment_data) {
  global $akismet_cfg;
  if (!empty($akismet_cfg['api_key'])) {
    // Values in $comment_data take precedence over the $_SERVER values.
    $comment_data = array_merge(rnAkismet_include_request(), $comment_data);
    $query_string = rnAkismet_build_string($comment_data);
    // comment-check is sent to the key-specific subdomain.
    $host = $akismet_cfg['api_key'] .'.'. AKISMET_API_HOST;
    $response = rnAkismet_http_post($query_string, $host, '/'. AKISMET_API_VERSION .'/comment-check');
  }
  // No body means no key, no connection, or a malformed reply.
  if (!isset($response[1])) {
    return AKISMET_API_RESULT_ERROR;
  }
  // Akismet answers "true" for spam and "false" for ham.
  return ('true' == $response[1] ? AKISMET_API_RESULT_IS_SPAM : AKISMET_API_RESULT_IS_HAM);
}
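
The helper rnAkismet_build_string is called above but not shown in this thread. A minimal sketch of what such a helper might look like is given below under a different, hypothetical name (example_build_string); the real one may differ, for instance in how it treats the blog field.

```php
<?php
/**
 * Hypothetical stand-in for the unseen rnAkismet_build_string():
 * turn a flat array into an application/x-www-form-urlencoded string.
 */
function example_build_string(array $data) {
    $pairs = array();
    foreach ($data as $key => $value) {
        // This sketch only handles flat key/value pairs; nested arrays are skipped.
        if (is_array($value)) {
            continue;
        }
        $pairs[] = urlencode($key) . '=' . urlencode((string) $value);
    }
    return implode('&', $pairs);
}

echo example_build_string(array(
    'blog'            => 'http://localhost/',
    'comment_content' => 'hi there',
)), "\n";
// prints: blog=http%3A%2F%2Flocalhost%2F&comment_content=hi+there
```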


Akismet also asks;

Quote:

Other server enviroment variables
In PHP there is an array of enviroment variables called $_SERVER which contains information about the web server itself as well as a key/value for every HTTP header sent with the request. This data is highly useful to Akismet as how the submited content interacts with the server can be very telling, so please include as much information as possible.


We could achieve this with the following function, which I borrowed from eGroupWare.

Code:

function rnAkismet_include_request() {
  // You may add more elements here, but they are often related to internal server
  // data that makes little sense to check whether a comment is spam or not.
  // Be sure to not send HTTP_COOKIE as it may compromise your user's privacy!
  static $safe_to_send = array(
    'CONTENT_LENGTH',
    'CONTENT_TYPE',
    'HTTP_ACCEPT',
    'HTTP_ACCEPT_CHARSET',
    'HTTP_ACCEPT_ENCODING',
    'HTTP_ACCEPT_LANGUAGE',
    'HTTP_REFERER',
    'HTTP_USER_AGENT',
    'REMOTE_ADDR',
    'REMOTE_PORT',
    'SCRIPT_URI',
    'SCRIPT_URL',
    'SERVER_ADDR',
    'SERVER_NAME',
    'REQUEST_METHOD',
    'REQUEST_URI',
    'SCRIPT_NAME'
  );

  // The contents of $_SERVER do not change during a request,
  // so the filtered copy can be cached in static storage.
  static $server_data;
  if (!$server_data) {
    $server_data = array();
    foreach ($_SERVER as $key => $value) {
      if (in_array($key, $safe_to_send)) {
        $server_data[$key] = $value;
      }
    }
  }
  return $server_data;
}
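
The same whitelist filtering can be written without the loop by using array_intersect_key. The sketch below takes the server array as a parameter so it can be tried outside a web request; example_filter_server and the sample values are hypothetical.

```php
<?php
/**
 * Hypothetical variant of the whitelist filter: keep only the entries of
 * $server whose keys appear in $allowed.
 */
function example_filter_server(array $server, array $allowed) {
    // array_flip turns the list of names into keys so the two arrays can be intersected.
    return array_intersect_key($server, array_flip($allowed));
}

$server = array(
    'HTTP_ACCEPT' => 'text/html',
    'HTTP_COOKIE' => 'session=secret', // must never be sent
    'REMOTE_ADDR' => '127.0.0.1',
);
print_r(example_filter_server($server, array('HTTP_ACCEPT', 'REMOTE_ADDR')));
```

HTTP_COOKIE is dropped because it is not in the allowed list, which matches the privacy warning in the function above.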




Next, all comments classified as spam would be held in the database so an administrator could resubmit a comment as ham (not spam) in case of false positives. Accordingly, each published comment should have a new link that allows administrators to mark it as spam in case it gets through.
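
For the two feedback actions, the Akismet API has submit-spam and submit-ham calls that sit next to comment-check on the key-specific host. A rough sketch of choosing the right target follows; example_feedback_request is a hypothetical helper, 'yourkey' is a placeholder, and the host and version strings are written out literally rather than taken from the module's constants.

```php
<?php
/**
 * Hypothetical helper: pick host and path for reporting a misclassified
 * comment back to Akismet.
 *
 * @param string $apiKey Akismet API key (placeholder in this sketch)
 * @param bool   $isSpam true to report missed spam, false to report a false positive
 */
function example_feedback_request($apiKey, $isSpam) {
    return array(
        'host' => $apiKey . '.rest.akismet.com',
        'path' => '/1.1/' . ($isSpam ? 'submit-spam' : 'submit-ham'),
    );
}

$req = example_feedback_request('yourkey', false);
echo $req['host'], $req['path'], "\n"; // prints "yourkey.rest.akismet.com/1.1/submit-ham"
```

The stored comment data would then be encoded and posted the same way as for the original check, e.g. rnAkismet_http_post($query_string, $req['host'], $req['path']).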

I guess we could now tie all this in with a function in mainfile.

Anything I'm missing, or can anyone see any problems with this?

testy1
Involved
Joined: Apr 06, 2008
Posts: 483

Posted: Sat Nov 07, 2009 11:58 pm

ok here is the mainfile function so far

Code:

function check_spam($sAuthor, $sEmail, $sLink, $sComment) {
  global $akismet_cfg, $nukeurl, $user;
  include_once (INCLUDE_PATH . 'includes/akismet.inc.php');
  // First check server connectivity.
  $Connect = akismet_check_server_connectivity();
  if ($Connect === true) {
    // Next verify the key.
    $VerifyKey = rnAkismet_verify_key($akismet_cfg['api_key']);
    if ($VerifyKey === true) {
      // Registered users override the submitted name and e-mail.
      $userinfo = is_user($user) ? getusrinfo($user) : array();
      if (isset($userinfo['username'])) {
        $sAuthor = $userinfo['username'];
      } elseif (empty($sAuthor)) {
        $sAuthor = 'Anonymous';
      }
      $sEmail = isset($userinfo['user_email']) ? $userinfo['user_email'] : $sEmail;
      // Validate the e-mail address. The local-part pattern is stored in
      // $localPart so that it does not overwrite the global $user.
      $localPart = '[a-zA-Z0-9_\-\.\+\^!#\$%&*+\/\=\?\`\|\{\}~\']+';
      $domain = '(?:(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.?)+';
      $ipv4 = '[0-9]{1,3}(\.[0-9]{1,3}){3}';
      $ipv6 = '[0-9a-fA-F]{1,4}(\:[0-9a-fA-F]{1,4}){7}';
      $VerifyEmail = (bool)preg_match("/^$localPart@($domain|(\[($ipv4|$ipv6)\]))$/", $sEmail);
      $sEmail = $VerifyEmail ? $sEmail : '';
      $sLink = htmlentities($sLink, ENT_QUOTES);
      $sComment = htmlentities(strip_tags($sComment), ENT_QUOTES);
      // Next prepare the data. The author's website is looked up inside
      // akismet_prepare_comment_data, so only four arguments are passed.
      $Prepare = akismet_prepare_comment_data($sAuthor, $sEmail, $sLink, $sComment);
      $CheckComment = rnAkismet_comment_check($Prepare);
      if ($CheckComment === AKISMET_API_RESULT_IS_SPAM) {
        echo 'Comment Is SPAM!<br />';
      } else {
        echo 'Comment Is HAM!<br />';
      }
    } else {
      echo 'Key Verification Failed<br />';
    }
  } else {
    echo 'Connection Failed<br />';
  }
}
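
On PHP 5.2 or later, the hand-written e-mail pattern could be swapped for the built-in filter extension. This is a different technique from the regex above, shown only as a sketch; example_clean_email is a hypothetical name.

```php
<?php
/**
 * Hypothetical helper: return the address if it passes PHP's built-in
 * e-mail validation, otherwise an empty string.
 */
function example_clean_email($email) {
    $email = trim((string) $email);
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false ? $email : '';
}

var_dump(example_clean_email('egmade@local.com')); // string(16) "egmade@local.com"
var_dump(example_clean_email('not an address'));   // string(0) ""
```

The built-in filter is stricter in some places and looser in others than the pattern above, so the two will not agree on every address.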



Output Test 1;
Code:

Ravennuke/02.40.00 | akismet_module/1.0

array(8) {
  ["user_ip"]=>
  string(9) "127.0.0.1"
  ["user_agent"]=>
  string(109) "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)"
  ["referer"]=>
  string(45) "http://localhost/dfwmods/admin.php?op=akismet"
  ["comment_author"]=>
  string(6) "Testy1"
  ["permalink"]=>
  string(72) "http://localhost/dfwmodsarticle1.html"
  ["comment_author_email"]=>
  string(16) "egmade@local.com"
  ["comment_author_url"]=>
  string(23) "http://www.spamsite.com"
  ["comment_content"]=>
  string(29) "hi there people nice function"
}

Comment Is HAM!


Output Test 2;
Code:

Ravennuke/02.40.00 | akismet_module/1.0

array(8) {
  ["user_ip"]=>
  string(9) "127.0.0.1"
  ["user_agent"]=>
  string(109) "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)"
  ["referer"]=>
  string(45) "http://localhost/dfwmods/admin.php?op=akismet"
  ["comment_author"]=>
  string(6) "Testy1"
  ["permalink"]=>
  string(72) "http://localhost/dfwmodsarticle1.html"
  ["comment_author_email"]=>
  string(16) "egmade@local.com"
  ["comment_author_url"]=>
  string(23) "http://www.spamsite.com"
  ["comment_content"]=>
  string(15) "viagra-test-123"
}

Comment Is SPAM!

Guardian2003
Site Admin
Joined: Aug 28, 2003
Posts: 6373
Location: Vsetin, Czech Republic

Posted: Sun Nov 08, 2009 5:33 am

This is what I have been using: [link hidden by the forum]

testy1
Involved
Joined: Apr 06, 2008
Posts: 483

Posted: Sun Nov 08, 2009 5:44 am

I was using something similar but decided to go with a rewrite. Do you think a class is the way to go?

Guardian2003
Site Admin
Joined: Aug 28, 2003
Posts: 6373
Location: Vsetin, Czech Republic

Posted: Sun Nov 08, 2009 5:55 am

I would, because you can just 'include' it whenever you need it; you don't have to have it loaded all the time. All the code is in one place, so if Akismet ever changes, it's easier to find the bit of code you need to change.
I used it in a couple of modules, but since I have not had any spam since RN 2.x, it didn't seem worthwhile incorporating it any more.

testy1
Involved
Joined: Apr 06, 2008
Posts: 483

Posted: Sun Nov 08, 2009 4:10 pm

OK, I'll check it out, thanks.

kguske
Site Admin
Joined: Jun 04, 2004
Posts: 6044

Posted: Sat Jul 10, 2010 9:12 am

Sorry I missed this originally. I was looking at this for a series of standalone form-to-email applications. I reviewed 4 libraries (either class, include functions, or both).

I remembered that Guardian was working on this, and asked him about it. His response might help others, so I am re-posting part of it here (thanks, G):

Guardian2003 wrote:
I think I would probably look at Project Honeypot as an initial filter and then add a manually updated blacklist as it would be more viable in terms of the man hours spent updating the blacklist once a month I should think.

If you're using a class or developing your own code to access the Akismet API, use the HTTP 1.0 protocol, NOT HTTP 1.1. 1.0 is much faster and works just as well.

Since Project Honeypot is IP-based, and IPs are so easy to spoof, I'm wondering how effective that would be. As far as stopping feedback / comment / forum spam, a content analysis approach like Akismet seems more effective. Checking the IP first might weed out a few spammers, but it would not be long before even those bottom-feeders circumvent that. Of course, Guardian has been looking at this MUCH longer than I - so I'd be especially interested in further discussion on this (hence this post)...
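
For reference, Project Honeypot's http:BL is queried over DNS rather than HTTP. The sketch below reflects my understanding of that format (access key, then the reversed IP octets, then dnsbl.httpbl.org, with the answer encoded as 127.days.threat.type); both function names and the access key are hypothetical, and the details should be checked against the http:BL documentation before relying on them.

```php
<?php
/**
 * Hypothetical helper: build the DNS name to look up for an IPv4 address.
 * Assumed format: <access key>.<reversed IP octets>.dnsbl.httpbl.org
 */
function example_httpbl_hostname($accessKey, $ipv4) {
    $reversed = implode('.', array_reverse(explode('.', $ipv4)));
    return $accessKey . '.' . $reversed . '.dnsbl.httpbl.org';
}

/**
 * Hypothetical helper: decode an answer of the assumed form
 * 127.<days since last activity>.<threat score>.<visitor type>.
 * Returns null when the answer does not have that shape.
 */
function example_httpbl_parse($answer) {
    $octets = array_map('intval', explode('.', $answer));
    if (count($octets) !== 4 || $octets[0] !== 127) {
        return null;
    }
    return array('days' => $octets[1], 'threat' => $octets[2], 'type' => $octets[3]);
}

echo example_httpbl_hostname('abcdefghijkl', '127.9.1.2'), "\n";
// prints: abcdefghijkl.2.1.9.127.dnsbl.httpbl.org
print_r(example_httpbl_parse('127.3.5.1'));
```

In a live check the hostname would be passed to gethostbyname(), which returns the name unchanged when there is no listing.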

Also, I'm interested in further feedback on Akismet 1.0 vs. 1.1 - everything I've seen uses 1.1, though that could be just a "latest-is-greatest" mentality. Is there any difference in the interface - or just the Akismet version parameter that gets passed?
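
It may help to keep the two version numbers apart: in the sample request earlier in the thread, 1.1 is the Akismet API version in the path and 1.0 is the HTTP protocol version at the end of the request line. A tiny sketch that makes the two independent parameters explicit; example_request_line is a hypothetical helper.

```php
<?php
/**
 * Hypothetical helper: build the first line of the POST request.
 *
 * @param string $apiVersion  Akismet API version, part of the URL path
 * @param string $call        API call name, e.g. "comment-check"
 * @param string $httpVersion HTTP protocol version used on the wire
 */
function example_request_line($apiVersion, $call, $httpVersion) {
    return 'POST /' . $apiVersion . '/' . $call . ' HTTP/' . $httpVersion;
}

echo example_request_line('1.1', 'comment-check', '1.0'), "\n";
// prints: POST /1.1/comment-check HTTP/1.0
```

With HTTP/1.0 the server closes the connection once the response is sent, which is what lets a read-until-EOF loop like the one in rnAkismet_http_post finish. Under HTTP/1.1 connections are persistent by default, so such a loop would normally also need a "Connection: close" header.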

sixonetonoffun
Spouse Contemplates Divorce
Joined: Jan 02, 2003
Posts: 2499

Posted: Sun Jul 11, 2010 7:04 am

I think it would be a nice pre-registration mod on any site, which also takes real-time operation out of the equation to some extent. I've been too lazy to pursue it, as it's not been an issue for me either.

kguske
Site Admin
Joined: Jun 04, 2004
Posts: 6044

Posted: Sun Jul 11, 2010 7:47 am

Six, to clarify, by "it" do you mean Project Honeypot or Akismet? And what do you mean by real-time operation? Thanks...

sixonetonoffun
Spouse Contemplates Divorce
Joined: Jan 02, 2003
Posts: 2499

Posted: Sun Jul 11, 2010 8:42 am

I'd say either could apply, but I was thinking of Akismet.
Real-time as in checking the validity of a poster's email on an anonymous reply to an active comment thread. The operation could be way too slow there, whereas an on-site database would be a feasible option.
(Not that allowing anonymous posters doesn't lead to trolls and flame wars, but... it is done frequently.)

A delay in registration confirmation, on the other hand, would hardly be noticed.

Am I just confusing the use of Akismet and the topic?

Could be I'm just too tired to comprehend anything!

Edit, some hours of sleep later:
OK, so someone registered on my site using a suspected spammer name/email, so I decided to look at another option for fun.

Much like NukeSentinel is: [link hidden by the forum]
I'm testing it now.

Likes:
Easy to update CSV file for IP bans.
Fast load.

Dislikes:
The installer seems quirky; it is easier to upload the files to the live site after setting them up locally.
Not a native solution; it duplicates some native checking.

This is off topic, so I will start a new thread after some testing.