Comment Spam : Eliminate False Positives With Akismet + reCaptcha

The recent WeblogToolsCollection post about a new antispam plugin “WP Mollom” got me thinking. What’s the main problem with Akismet? It’s certainly good enough at catching spam – it only misses about 4 spam comments per month on this blog and has nearly 99.9% accuracy overall. However, the situation might not be so rosy when it comes to comments being incorrectly labeled as spam. I say “might” because there isn’t really any practical way to check – on a large site you can get over ten thousand spam comments per month; you can’t just check each of them manually to fish out a (hopefully) few false positives.

And you shouldn’t have to.

Enter the TanTanNoodles Simple Spam Filter. It’s a WordPress plugin that, among other things, gives comments caught by Akismet a second chance by asking the user to enter a reCAPTCHA. This brilliant compromise between fully-automated spam detection and bothersome CAPTCHA systems should virtually eliminate false positives. In fact, I’m surprised this approach wasn’t used in WP right from the start.

CAPTCHA for a blocked comment (a dramatic angle!)
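The "second chance" idea boils down to a small decision flow. Here is a minimal Python sketch of it, not the plugin's actual code: `moderate_comment`, `looks_like_spam` and `captcha_passed` are hypothetical names standing in for the spam filter (Akismet, say) and the reCAPTCHA check.

```python
from typing import Callable


def moderate_comment(
    text: str,
    looks_like_spam: Callable[[str], bool],  # hypothetical stand-in for the spam filter
    captcha_passed: Callable[[], bool],      # hypothetical stand-in for the CAPTCHA check
) -> str:
    """Return "publish" or "spam" for a submitted comment."""
    if not looks_like_spam(text):
        return "publish"  # most commenters never see a CAPTCHA
    if captcha_passed():
        return "publish"  # flagged by the filter, but a human solved the CAPTCHA
    return "spam"         # flagged and no CAPTCHA solved


# Usage with toy stand-in checks (assumptions, for illustration only)
flag_links = lambda text: "http://" in text
print(moderate_comment("Nice post!", flag_links, lambda: False))
print(moderate_comment("see http://example.com", flag_links, lambda: True))
print(moderate_comment("see http://example.com", flag_links, lambda: False))
```

The CAPTCHA is only ever consulted on the flagged branch, which is why regular commenters are not bothered by it.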

As for the plugin’s other features, there are:

  • a simple word blacklist,
  • a regexp blacklist,
  • an option to block comments with more than X external links (Huh? This has been in the WP core for years!),
  • and an option to block comments that are too similar to existing comments.
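To make the first three filters concrete, here is a rough Python sketch of how such checks could work. It is not taken from the plugin: the word list, the pattern, the link threshold and the `is_blocked` helper are all made up for illustration, and for simplicity it counts every link rather than only external ones.

```python
import re

# All values below are illustrative assumptions, not the plugin's defaults.
BLOCKED_WORDS = {"viagra", "casino"}
BLOCKED_PATTERNS = [re.compile(r"ch[e3]ap\s+p[i1]lls", re.IGNORECASE)]
MAX_LINKS = 2  # the "X" in "more than X links"

LINK_RE = re.compile(r"https?://", re.IGNORECASE)
WORD_RE = re.compile(r"[a-z0-9']+")


def is_blocked(comment: str) -> bool:
    """Apply the word blacklist, the regexp blacklist and the link limit."""
    words = set(WORD_RE.findall(comment.lower()))
    if words & BLOCKED_WORDS:
        return True  # contains a blacklisted word
    if any(pattern.search(comment) for pattern in BLOCKED_PATTERNS):
        return True  # matches a blacklisted regular expression
    # Too many links (this sketch does not distinguish external from internal)
    return len(LINK_RE.findall(comment)) > MAX_LINKS
```

The regexp list catches obfuscated spellings that a plain word list would miss, which is presumably why the plugin offers both.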

The plugin also discards posts that contain nothing but links. Overall, it lives up to its name by providing a number of simple yet sometimes useful functions. Personally, I would install it just for the “second chance” reCAPTCHA feature 🙂

A note of caution: I’d advise against enabling the “block similar comments” feature. The algorithm is extremely resource-intensive (O(N^3)) and could really bog down your server if your blog has a lot of comments.
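To see why this kind of check gets expensive, consider a naive similarity test written in Python. This is only an illustration and not the plugin's algorithm: `too_similar` and the 0.9 threshold are assumptions, and the comparison uses `difflib.SequenceMatcher` from the standard library.

```python
from difflib import SequenceMatcher
from typing import Iterable


def too_similar(new_comment: str, existing: Iterable[str], threshold: float = 0.9) -> bool:
    """Compare the new comment with every stored comment, one pair at a time."""
    return any(
        SequenceMatcher(None, new_comment, old).ratio() >= threshold
        for old in existing
    )


# Usage: a near-duplicate of an existing comment gets flagged
stored = ["Great post, thanks!!", "I disagree entirely."]
print(too_similar("Great post, thanks!", stored))
```

Every new comment is compared against every stored one, and each comparison is itself not cheap for long comments, so the work per submission grows with the size of the blog.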

6 Responses to “Comment Spam : Eliminate False Positives With Akismet + reCaptcha”

  1. Ajay says:

    I use Simple Spam Filter as well on my blog. In fact it is a perfect first line of defense.

    However, I don’t think this is a good idea to be using a CAPTCHA. It is extremely irritating to be entering an extra field to leave the comment.

  2. White Shadow says:

    I think you misunderstood my point. The beautiful thing is that the users usually don’t need to enter a CAPTCHA with this plugin. The CAPTCHA only shows up when Simple Spam Filter or Akismet detects that the comment looks spammy.

  3. Ajay says:

    Damn, I missed that. You’re right. It doesn’t show up unless someone is flagged as spam.

    I haven’t bothered about it on my blog. I do check and add spam words that should be blocked into the list. It’s been highly effective.

    However, what is your experience with manual spammers? Currently, that is a major problem bogging everyone down.

  4. White Shadow says:

    I haven’t encountered a lot of those yet, maybe one or two suspicious comments per month. Maybe this blog’s niche (if there is one 🙂) doesn’t attract manual spammers. But I can see that this could become a big problem eventually.

    Unfortunately, filtering human-written spam is probably one of those problems that would require a full-scale AI to solve – you’d need sophisticated natural language processing to understand if a comment is relevant to the post and judge the poster’s intent.

    On the other hand, there’s StupidFilter.

  5. Ajay says:

    I’m going to wait till they release the WP plugin. Currently I’m badly hit by the “stupid” comments. Sometimes it is really amusing.

  6. White Shadow says:

    It might be a long wait. Their site hasn’t been updated in months. If it stays like that for much longer I might just set up their beta-version code on a server somewhere and make a publicly accessible API. The source code is GPL so that should be legal.
