Spam Killed My Backups

Verily, my Photoshop skill doth suck. Background credit: kire1987 @ DeviantArt

Having up-to-date backups is an essential safeguard in case something goes wrong with your website. So some time ago I installed WP-DBManager and configured it to send a daily backup of my WordPress database to my Gmail account. All was well until last week, when the backup process failed five times in a row. Upon checking the PHP error log, I found several messages stating “memory limit exceeded”. What the…?

Turns out it was the comment spam.

This site gets hundreds of spam comments per day. Most of them are caught by Akismet, but even then they’re still stored in the database for 30 days. This can quickly add up to megabytes of worthless records. So when the backup plugin tries to process a database dump, it runs out of memory, even with a (relatively generous) 32 MB limit. To illustrate, here are some stats:

                                            Count   Average length   Used DB space
All legitimate comments                      3577        250 bytes         0.85 MB
Spam comments (first half of April only)     4505       5528 bytes        23.75 MB
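
For anyone who wants to check the numbers, here is a minimal sketch of the arithmetic behind the table. It assumes that "used space" is simply count times average length, and that 1 MB means 1024 × 1024 bytes; row and index overhead are ignored. The helper name `used_mb` is made up for this example.

```python
# Rough check of the stats: used space ~ count * average comment length.
# Assumption: 1 MB = 1024 * 1024 bytes; row/index overhead is ignored.
BYTES_PER_MB = 1024 * 1024


def used_mb(count: int, avg_bytes: int) -> float:
    """Estimate the space taken by `count` comments of `avg_bytes` each."""
    return count * avg_bytes / BYTES_PER_MB


legit_mb = used_mb(3577, 250)   # all legitimate comments
spam_mb = used_mb(4505, 5528)   # spam from the first half of April

print(f"legitimate: {legit_mb:.2f} MB")  # 0.85 MB
print(f"spam:       {spam_mb:.2f} MB")   # 23.75 MB
print(f"spam takes about {spam_mb / legit_mb:.0f}x the space of legitimate comments")
```

Two weeks of spam works out to roughly 28 times the space of every legitimate comment the site has ever received, mostly because the average spam comment is so much longer.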

Overall, spam comments accounted for more than half of the WP database size. After I deleted them, the daily backup routine started working again.
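
The cleanup itself boils down to one DELETE query. Below is a rough sketch of the idea, not the exact steps I took. WordPress marks spam with `comment_approved = 'spam'` in the `wp_comments` table; everything else here is an assumption for illustration: the `purge_spam` helper is hypothetical, the table is a cut-down copy with only a few columns, the rows are sample data, and an in-memory SQLite database stands in for the real MySQL one so the example can run anywhere.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def purge_spam(conn, older_than_days=0, now=None):
    """Delete comments flagged as spam that are older than the cutoff.

    Returns the number of deleted rows. Hypothetical helper, not part of
    WordPress or any plugin.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=older_than_days)).strftime("%Y-%m-%d %H:%M:%S")
    cur = conn.execute(
        "DELETE FROM wp_comments "
        "WHERE comment_approved = 'spam' AND comment_date_gmt <= ?",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount


# Stand-in database: a simplified wp_comments table with sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wp_comments ("
    "comment_ID INTEGER PRIMARY KEY, comment_approved TEXT, "
    "comment_date_gmt TEXT, comment_content TEXT)"
)
conn.executemany(
    "INSERT INTO wp_comments (comment_approved, comment_date_gmt, comment_content) "
    "VALUES (?, ?, ?)",
    [
        ("1", "2009-04-01 10:00:00", "Nice post!"),        # approved comment
        ("spam", "2009-04-02 11:00:00", "Buy pills"),      # old spam
        ("spam", "2009-04-15 12:00:00", "Cheap watches"),  # recent spam
    ],
)

sample_now = datetime(2009, 4, 16, tzinfo=timezone.utc)
deleted = purge_spam(conn, older_than_days=7, now=sample_now)
remaining = conn.execute("SELECT COUNT(*) FROM wp_comments").fetchone()[0]
print(f"deleted {deleted} spam comment(s), {remaining} comment(s) left")
```

Keeping a few days of recent spam (the `older_than_days` argument) leaves a window for rescuing false positives before they are gone for good.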

Avoiding Spam Overload

Of course, cleaning up the database is only a temporary solution. If you have a semi-popular site, spam comments will build up again and you may eventually run into the same problem. Here are some ideas on how to avoid that:

  • Configure Akismet to automatically discard spam comments on old posts (there’s a checkbox in Plugins -> Akismet Configuration). This will significantly decrease the amount of disk space wasted on storing spam.
  • Increase the PHP memory limit so that it can cope with large databases. Effective, but it’s usually not an option for people on cheap hosting. Most shared hosts will lock the memory limit at 8 MB.
  • Use an anti-spam plugin that doesn’t save detected spam comments. For example, WP-SpamFree is very effective and can work alongside Akismet. Personally, I’m not using it right now because of performance considerations, but that might change.

Environmental Sidenote

On a somewhat related note, McAfee recently released a study that demonstrates the environmental impact of e-mail spam. According to the study, unwanted e-mail leads to approximately 17 million tons of carbon dioxide being released as greenhouse gas each year.

Here’s a little calculation I did for fun. Let’s assume that processing a spam comment “costs” the same amount* of CO2 as delivering a spam email (0.3 g). Now, multiply that by the number of spam comments caught daily by Akismet (~17030800 comments today) and by 365 days. The result: roughly 1864 tons of carbon emissions are caused by comment spam each year. For perspective, this is how much carbon dioxide equivalent would be released by 360 passenger cars in the same time period. Scary.

* There is no reason to believe this is true ;)
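
The same back-of-the-envelope calculation as a small script, for anyone who wants to plug in their own numbers. It assumes metric tons (1,000,000 g) and a 365-day year, and it inherits the (unfounded) 0.3 g per comment assumption from above.

```python
# Back-of-the-envelope estimate of yearly CO2 from comment spam.
# Assumptions: 0.3 g per spam comment (borrowed from the e-mail figure),
# metric tons, 365-day year.
GRAMS_PER_SPAM = 0.3
SPAM_PER_DAY = 17_030_800      # Akismet's daily count quoted in the post
DAYS_PER_YEAR = 365
GRAMS_PER_TON = 1_000_000

tons_per_year = GRAMS_PER_SPAM * SPAM_PER_DAY * DAYS_PER_YEAR / GRAMS_PER_TON
tons_per_car = tons_per_year / 360  # the 360 passenger cars from the comparison

print(f"{tons_per_year:.1f} tons of CO2 per year")          # 1864.9
print(f"{tons_per_car:.1f} tons per passenger car per year")  # 5.2
```

So the comparison implies a little over 5 tons of CO2 per passenger car per year.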


6 Responses to “Spam Killed My Backups”

  1. Lets assume that processing a spam comment “costs” the same amount* of CO2 as delivering a spam email (0.3 g)

    Interesting thought :)
    I created a plugin called AVH First Defense Against Spam. It checks for spammers by IP before any content is served by WordPress. It also blocks spammers that access wp-comments-post.php directly, so no comment has to be processed by Akismet and, again, no content is served.
    Both these implementations have the advantage of eliminating bandwidth usage and CPU cycles thus reducing carbon emissions.

    I should have called the plugin AVH Green Defense Against Spam :)

  2. White Shadow says:

    Is checking by IP really still effective, what with botnets, proxies and all?

  3. The plugin checks the database at Stop Forum Spam, and you have the ability to block a potential spammer based on how often the spammer’s IP appears in their database.

    Proxies aren’t a problem; you can still get the user’s real IP unless they are behind an anonymous proxy, in which case you have to wonder why the user is hiding their true IP.

    Botnets are a bit of a problem, but my plugin does show a small message when access is being blocked. If a legitimate user is blocked they will know why and hopefully they’ll take action.

  4. White Shadow says:

    Ah well, that sounds good. I might try it out someday, though probably not on this site. I’ve got 33 active plugins here and the server might topple if I add more ;)

  5. Lester Chan says:

    Thanks for the shout out =D

  6. DoZ says:

    Good to know, thanks a lot :)
    By the way, I use WP-SpamFree AND Akismet, together.
