
Magic word comment spamtrap

A couple of thoughts about trapping comment spam: my first idea was to add a textfield requiring a significant word from the blog entry in question. Simple for a person to produce, and in practice you just need to check whether the word occurs in the entry at all, skipping a list of obvious stop words (“in”, “the”, etc.).
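A rough sketch of that check, in Python rather than anything WordPress-flavoured. The stop-word list is a tiny stand-in, and the names `STOP_WORDS`, `words` and `is_magic_word` are mine, invented for illustration:

```python
import re

# Assumed, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "in", "is", "it", "of", "on", "the", "to"}


def words(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def is_magic_word(candidate, entry_text):
    """True if candidate is a single non-stop word that occurs in the entry."""
    tokens = words(candidate)
    if len(tokens) != 1:
        return False  # empty field, or more than one word typed
    word = tokens[0]
    return word not in STOP_WORDS and word in set(words(entry_text))
```

Anything that is a single word, is not a stop word, and appears somewhere in the entry passes; everything else is treated as a robot.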

But of course a clever robot spider from hell can deal with that: the same technique that checks a magic word can just as easily generate one that passes the test.

So my second idea was a textfield already containing a word, with a note saying “If you submit this form without clearing this textfield, I’ll know you’re a robot spider from hell.” Better still, “Change this word to the one following it in the article above, to prove you’re not …” You get the idea.

For added fun, rotate these methods (and the simpler “Don’t fill in this textfield unless you’re a RSfH” on, e.g., the URL field) at random. If I didn’t have a thesis to write, I’d put together a WordPress plugin.
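Since the plugin isn't happening, here is a minimal Python sketch of the rotation instead. Everything in it is an assumption for illustration: the names `make_challenge` and `passes`, the placeholder word "robot", and the idea that the server remembers the challenge (in a session, say) between showing the form and receiving it. For the "next word" trap it only picks words that occur once in the entry, so the expected answer is unambiguous.

```python
import random
import re


def tokenize(text):
    """Split the entry into words, keeping the original capitalisation."""
    return re.findall(r"[A-Za-z']+", text)


def make_challenge(entry_text, rng=random):
    """Pick one trap at random and describe it as a dict."""
    kind = rng.choice(["leave_empty", "clear_field", "next_word"])
    if kind == "next_word":
        tokens = tokenize(entry_text)
        lowered = [t.lower() for t in tokens]
        # Only words that occur exactly once and have a successor qualify.
        candidates = [i for i in range(len(tokens) - 1)
                      if lowered.count(lowered[i]) == 1]
        if candidates:
            i = rng.choice(candidates)
            return {"kind": kind, "prefill": tokens[i], "expected": tokens[i + 1]}
        kind = "clear_field"  # entry too short or too repetitive: fall back
    if kind == "clear_field":
        return {"kind": kind, "prefill": "robot", "expected": ""}
    return {"kind": "leave_empty", "prefill": "", "expected": ""}


def passes(challenge, submitted):
    """Compare the submitted trap field with what a human should have sent."""
    return submitted.strip().lower() == challenge["expected"].lower()
```

The form shows `prefill` in the trap field along with the matching instruction, and the comment handler calls `passes` on whatever comes back.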

(Why am I thinking about this, given the obvious lack of comment spam on my blog? Because (a) I still occasionally moderate down posts advertising the-card-game-whose-name-we-do-not-speak, and (b) I’m terrified that one of the extremely infrequent genuine comments of my friends is going to get blitzed. It’s not that I don’t trust Spam Karma, it’s simply that I don’t understand it.)

4 Comments

  1. erik wrote:

    Why not just a rendered image with a password? Character recognition won’t be good enough in the coming years to recognise malformed letters … Since I installed it I haven’t had any comments, euh, spam anymore … e.g. http://uberdork.supertwist.net/2005/03/13/plug-it-in-plug-it-in/ See my comment forms for an example.

    Wednesday, June 29, 2005 at 2:00 pm | Permalink
  2. tikitu wrote:

    That is indeed true. I just happen to dislike them (my character recognition often isn’t good enough, either). (I could bleat about blind people, screen readers, etc, but I somehow doubt that anyone blind is reading my blog…)

    Wednesday, June 29, 2005 at 2:09 pm | Permalink
  3. Robin wrote:

    Some of your ideas won’t work, in particular the ‘clear this field’ ones. As I understand it (having written robots that do form submission, although not evil spammy ones), they don’t simulate a page being submitted by going to it, finding all the form details, constructing a suitable response, and submitting that. Instead they simply post straight to the comment handling script (note: they may have gotten more advanced due to people trying to avoid them)

    What I’d suggest:

    * Rename the comment handling script, so the bots can’t find it.
    * Put a hidden field in the page that is required by the comment script (similar to the ‘type this phrase’ thing, except the submitter doesn’t have to do any work, as a real browser will take care of it.) For bonus marks, have the field value be a) random for a specific page, or b) random for a time duration (sucky if people take a long long time to type the comment, but there are ways to deal with that)
    * Have the comment script do a blacklist lookup on any URLs. There are blacklists that track URLs that are being spamvertised, if the same thing has been sent out in mass mails, it may well be there.
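One possible way to do the hidden-field suggestion, sketched in Python. This swaps in a keyed hash (HMAC) of the post id and the current time bucket instead of a stored random value, so the server doesn't need to remember anything. `SECRET`, `WINDOW`, `make_token` and `token_is_valid` are hypothetical names, and accepting the previous time bucket is just one guess at how to "deal with" slow typists.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"  # hypothetical per-site secret, never sent to the browser
WINDOW = 3600          # assumed length of one time bucket, in seconds


def _sign(post_id, bucket):
    """Keyed hash of the post id and time bucket, as a hex string."""
    msg = f"{post_id}:{bucket}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def _bucket(now=None):
    """Index of the time bucket containing `now` (defaults to the current time)."""
    return int((time.time() if now is None else now) // WINDOW)


def make_token(post_id, now=None):
    """Value to put in the hidden field when rendering the comment form."""
    return _sign(post_id, _bucket(now))


def token_is_valid(token, post_id, now=None):
    """Accept tokens from the current or previous bucket, for slow typists."""
    current = _bucket(now)
    return any(hmac.compare_digest(token, _sign(post_id, b))
               for b in (current, current - 1))
```

A bot that posts straight to the handler without fetching the page has no token at all, and one that scraped a token long ago finds it has expired.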

    Any of these can be worked around by evil spammers, but that’s not the point. In rp parlance: “I don’t have to run faster than the troll, I just have to run faster than you!”. Make yourself harder to spam than the average joe, and you’ll find they go for the lower hanging fruit. (Make what you will of the implication that you’re a fruit :)

    Wednesday, July 13, 2005 at 2:43 pm | Permalink
  4. tikitu wrote:

    Ah, good points. Especially like the hidden field, given that point about spiders not actually reading pages at all. But surely the easiest thing of all would be a find-replace in the comment-handling script that just adds a prefix to the fields expected. The installation process for standard blogging tools should do this. (Of course the next-gen spiders then will read the pages to find out the prefix…)

    Wrt blacklisting: that’s more-or-less what Spam Karma does, only it builds its own blacklist. You get downgraded if you make lots and lots of comments real fast, or if you include too many links (this keeps tripping Erik up :-).

    Wednesday, July 13, 2005 at 5:06 pm | Permalink