New kinds of spam and how to deal with ‘em
I’ve been reading the messages captured by my anti-spam filter. I had so much fun reading those guys’ efforts to post totally and absolutely undesirable advertisements on my blogs that I’ll share with you the different ways I stop them.
Spammers are a bunch of guys and gals (mostly guys) who inspire in me feelings of both love and hate. I admire them, and at the same time I hope and want to drive them out of business. Yeah, they just want to make their bread and butter, just like any bloke; it’s just that they don’t want to do it the way you’d expect anyone else to do it. Spammers use dubious methods to achieve their goals, most of them almost, but not quite, entirely unlike legal ways.
What I find interesting is the fact that an increasing number of spammers seem to think that my blogs are ideal platforms for the products or services they advertise, and just when they try to post something for free, my systems stop them.
Spammers do what in the Queen’s language we refer to as “Intrusive Advertising by Electronic Mail.” You know the real-life equivalent of spam: flyers. Just as your home mailbox is full of flyers trying to sell you everything, from furniture to house stuff, your e-mail inbox is full of viagra pills or worse. As a matter of fact, my spammers know something about targeted advertising, because they insist on sending me ads for penis-enlargement pills. The difference is that while the pamphlets are real and someone paid several people to design them, print them and deliver them, allowing you to clean your windows with them, spammers cut all the costs. They don’t do marketing research; they just send the same mail to millions of e-mail addresses. They don’t spend their money to deliver their e-mails; they employ a lot of compromised computers belonging to unsuspecting people. They don’t even need to collect a list of prospects; they just harvest the lists of addresses in your chain e-mails. If you have sent or even received a sad letter telling you that Sandy needs a new brain (her sickness and name evolve), someone has your address in a list and made a profit just by selling it. That’s the reason you receive so much spam in your inbox.
My inbox, an extreme example, receives 95% spam, 4% chain letters and, on a good day, 1% valid mail. Modern e-mail software has integrated spam filters that sometimes block even 99% of spam and, what’s better, learn and improve their work. Web-based e-mail, like Google Mail, does an excellent job of filtering spam.
Thus, spammers had to find new ways and strategies to remain in business. And, well, blogs were right there. Spammers wrote bots, that is, little computer programs that read a blog, find the comment form, fill in the appropriate fields and submit ‘em. And if you don’t have appropriate ways to stop ‘em, they’ll overwhelm you and your readers. Let’s look at some real examples:
vEM87V hcjrmhjlsoec, [url=http://yqdfggwzmpco.com.xy/]yqdfggwzmpco[/url], [link=http://gvomnewfblqb.com.xy/]gvomnewfblqb[/link], http://violpjslvoyz.com.xy/,
buy viagra! purchase viagra! get viagra!
I’ve read your blog and, although it’s not a perfect match (I was looking for cheap travel agencies, like cheap-travel-agencies-dot-spam-link-full-of-ads-dot-com) I’ll return and read more of it.
Because people tend to dislike the owners of blogs aimed at children whose comments are full of ads for websites full of pornographic pictures, blog owners have to stop spammers before they do harm to their blogs. Spammers depend on numbers to make a profit, and if they fail to reach those numbers, their money fails to reach them.
Let’s do an example, just for fun and with surely wrong figures. Let’s say a webmaster known as V for Vancouver maintains a porn website, offering mostly erotic pictures dating back to 1820. Those pictures look like Little Red Riding Hood’s grandmother showing her ankles. There are 20 free photos to show the, well, customers, let’s say, what the site is all about. But because nobody knows it, the site is brand-new and the server spends its time yawning. How it is possible for a server to yawn doesn’t matter right now. To spice things up V calls a spacker, that is, a hacker who works for a spammer, to do some work for him: V will pay the spacker a penny for each click driven to V’s site, and a full dollar for each subscriber. The spacker prepares a list of 100 million e-mail addresses, sends 200 million mails, and sits in his easy chair with a glass of wine (or was it on his couch with a beer? whatever…) waiting for the clicks and money to arrive. Sadly, barely one hundred thousand people actually clicked on the ad; that’s 0.1% of the addresses. And of those hundred thousand barely 0.1% became subscribers, and that’s 100 persons. In real life an advertiser would have to deliver one flyer to each inhabitant of Mexico to reach those figures, and if only one hundred people actually purchased something, I’d say that the campaign was not successful and that the advertiser sucks (and would have to declare bankruptcy). But not our spacker. He earned a thousand dollars for the clicks and a hundred for the subscribers. V, on the other hand, made barely 1600 dollars from the new subscribers, each paying 16 bucks a month, and had to pay out 1100. Our spacker feels pity for V; after all, V will pay more if he makes more, and most of the mails went straight to the spam hole. So the spacker sends the message to all the blogs he knows and all the blogs he doesn’t know. He harvested the addresses and uses a bot to spread his work. And this is when we can stop him and force him out of business.
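If you want to check the arithmetic of that little story, here is a minimal Python sketch. All the numbers are the made-up figures from the example; the function name and its parameters are mine, invented only for illustration. Amounts are kept in cents so the sums stay exact.

```python
def campaign(addresses=100_000_000, clicks=100_000,
             cents_per_click=1, cents_per_subscriber=100,
             monthly_fee_cents=1600):
    """Redo the sums of the V-and-the-spacker story (made-up figures)."""
    subscribers = clicks // 1000  # 0.1% of the people who clicked
    spacker_cents = clicks * cents_per_click + subscribers * cents_per_subscriber
    v_income_cents = subscribers * monthly_fee_cents
    return {
        "click_rate_percent": 100 * clicks / addresses,
        "subscribers": subscribers,
        "spacker_dollars": spacker_cents / 100,
        "v_income_dollars": v_income_cents / 100,
        "v_first_month_dollars": (v_income_cents - spacker_cents) / 100,
    }


if __name__ == "__main__":
    for name, value in campaign().items():
        print(f"{name}: {value}")
```

With these figures V is left with 500 dollars after the first month, while the spacker pockets 1100 for an afternoon of work, which is why the numbers game is worth playing for him even at a click rate of 0.1% of the addresses.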
There are several methods you can use to stop spammers on your blog.
You can close the comments. Well, it works, but it also kills the main idea behind blogging: the users’ feedback. But you can’t simply leave the comments open in the wild.
You can moderate the comments. That means comment received, comment verified and, if found legitimate, comment published. It is not practical if the author has a life outside the series of tubes, or if there is a lot of traffic on the blog. You can moderate just the first message, but if the spammer passes the first one, either by mistake or by careful social engineering, it’s open season. And, of course, there is the possibility of someone yelling at you because his comment does not appear immediately.
To prevent that from happening, you can put a gatekeeper in the middle. Enter the captcha (Completely Automated Public Turing test to tell Computers and Humans Apart). In its simplest implementation, a captcha is just a word distorted in such a way that a computer can’t read it but a human can, and therefore the comment was written by a human being and not by a spambot. Just look at the phoney word smwm as presented by a simple captcha implementation:

A good and useful implementation of a captcha is reCaptcha, from Carnegie Mellon University. It takes real words from real books that OCR software couldn’t recognize, and uses them as the basis for the captcha. You need to type two words, one recognized, one unrecognized. If what you type matches the recognized word, then the unrecognized one must be valid too, and you just helped to digitize an old book from long ago. And your comment will be published, of course.
If you actually don’t want to help, or simply want more control, you can try other systems. Peter’s Custom Antispam is a good one. You can control fonts, colors, sizes, words and more. However, a badly configured system may leave the blog vulnerable to someone retrieving the list of words or trying a dictionary attack. Peter’s Math Antispam is an alternative: you are challenged to solve an equation, say 2+2, and if you can figure out the answer, say, 4, then you can post your comment. The disadvantage of this system is that it needs to be relatively easy so it won’t bug your commenters, and is thus vulnerable to dictionary attacks. You can solve that by using Peter’s Random Antispam, which displays a random set of letters and numbers, making it not vulnerable but also slightly annoying. But what happens to people with sight problems? It will be impossible for a blind person to see a captcha.
There are other ways to tell humans and spambots apart. You can ask a question whose answer only a human would know, such as “Who’s the son of my mother that ain’t my sibling?” the answer being “me.” This kind of captcha is usually limited to a single language, or to a few questions, and is again vulnerable to dictionary attacks. You can put a check box next to the phrase “Mark if human,” for example. It may be overridden by a good spambot, however. You can also introduce honeypot fields, asking the user to fill or not to fill a field, such as “type the second to last letter of gazorninplat in the third box” and moving that field around, or duplicating the fields, one visible to robots and the other to humans.
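To make the math challenge idea concrete, here is a minimal Python sketch. It is not how Peter’s Math Antispam is actually written; the function names `make_challenge` and `check_answer` are hypothetical, and a real blog would have to keep the expected answer on the server (in the session, for example) between showing the form and receiving the comment.

```python
import random


def make_challenge(rng=random):
    """Return a question for the form and the answer to keep server-side."""
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    return f"What is {a} + {b}?", a + b


def check_answer(expected, submitted):
    """Accept the comment only if the submitted text is the expected number."""
    try:
        return int(submitted.strip()) == expected
    except (ValueError, AttributeError):
        # Not a number (or not even text): treat it as a failed challenge.
        return False


if __name__ == "__main__":
    question, expected = make_challenge()
    print(question)
    print(check_answer(expected, input("> ")))
```

With operands from 1 to 9 there are only 17 possible answers, which shows the weakness mentioned above: a bot that simply guesses will get through every now and then.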
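The simplest honeypot of the “one field for robots, one for humans” kind can be sketched in a few lines of Python. The field names below are hypothetical, and the sketch assumes the decoy fields are hidden from human visitors with CSS, so only a bot that fills in everything it finds will put something in them.

```python
# Hypothetical decoy fields: present in the HTML form, hidden from humans.
HONEYPOT_FIELDS = ("homepage_confirm", "email_again")


def looks_like_bot(form):
    """Return True when any hidden honeypot field came back non-empty."""
    return any(form.get(name, "").strip() for name in HONEYPOT_FIELDS)


if __name__ == "__main__":
    human = {"comment": "Nice post!", "homepage_confirm": "", "email_again": ""}
    bot = {"comment": "buy viagra!", "homepage_confirm": "http://spam.example/"}
    print(looks_like_bot(human), looks_like_bot(bot))
```

A bot written specifically for your blog can learn to leave those fields alone, which is why renaming or moving them from time to time helps.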
Another variant is to check the data of the spammer-to-be against a list of known spammers. Every comment is considered doubtful and is scanned to see whether it fits a pattern: maybe the same IP address, the same website, or the same e-mail address as a previous spam comment. New commenters are sent to a moderation queue. If the message is a match, it is removed. If it is not, it is approved. As more and more messages are found to be spam, the system learns and improves its abilities. The drawback is that a spammer may send a harmless preemptive message and, once it is approved, launch a blitzkrieg attack. If the database is shared, the blitzkrieg will be reduced to a simple firecracker: maybe one or two comments before the system detects it as spam. One of the best resources of this type is Akismet, available for free to all WordPress users and a system you can adapt for use on other platforms. For commercial sites you can get a cheap license: if your blog makes more than 500 dollars a month, you can probably spend 5 bucks a month to remain spam-free. There is another drawback if you have users who usually send comments full of links (a spammer’s signature), such as the users of science blogs. This kind of comment, even if from a trusted party, goes straight to the moderation queue. Some users send two messages: the original, full of links, and another one to the admin, saying that their comment is in the queue.
There is another alternative: Bad Behavior, a fine piece of software that analyses the posting method: if the spammer hasn’t read the page first, or if any address appears to be forged, or if too little time has passed since the page was accessed, or if the page seems to be too old, anything that looks weird will be stopped. This method has its advantages: it can stop an attack before it can even start. However, it seems to be a problem for those poor souls who need to post a comment from an anonymous network or through proxies. But I think that if you need to be behind proxies or preserve your anonymity at any cost, commenting on a blog will be the least of your problems.
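The two ideas above, matching against known offenders and looking at how the comment was posted, can be sketched together in Python. This is only an illustration under my own assumptions: it is not the logic of Akismet or Bad Behavior, and the `Comment` class, the thresholds and the entries in the offender list are all hypothetical (the website is borrowed from the spam sample earlier in this post).

```python
from dataclasses import dataclass


@dataclass
class Comment:
    ip: str
    email: str
    website: str
    body: str
    seconds_on_page: float  # time between loading the page and posting


# Hypothetical list of known offenders; a shared database would be far larger.
KNOWN_OFFENDERS = {
    "ip": {"203.0.113.7"},
    "email": {"pills@example.com"},
    "website": {"http://violpjslvoyz.com.xy/"},
}
MIN_SECONDS_ON_PAGE = 3  # assumed: nobody reads and replies faster than this
MAX_LINKS = 2            # assumed: more links than this looks like an ad


def classify(comment, trusted_emails=frozenset()):
    """Return 'spam', 'moderate' or 'approve' for a comment."""
    if (comment.ip in KNOWN_OFFENDERS["ip"]
            or comment.email in KNOWN_OFFENDERS["email"]
            or comment.website in KNOWN_OFFENDERS["website"]):
        return "spam"
    if comment.seconds_on_page < MIN_SECONDS_ON_PAGE:
        return "spam"  # posted without reading the page first
    links = comment.body.count("http://") + comment.body.count("https://")
    if links > MAX_LINKS:
        return "moderate"  # link-heavy comments wait, even from regulars
    if comment.email not in trusted_emails:
        return "moderate"  # new commenters go to the queue
    return "approve"
```

The order of the checks is a design choice: the cheap, certain tests run first, and anything merely suspicious lands in the moderation queue instead of being thrown away.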
I have a forum. It is an anonymous forum, and I named it “A Perfect Match” (El Fósforo Perfecto) in honour of a funny yet badly translated spam comment. To keep it open, I have a complete spam protection system: it has a captcha, the message is then compared against a list of known offenders, and it also has three different honeypots, which make it difficult to use spambots to post. It has remained spam-free for a long time and I hope it will remain that way for a long time to come, even if the traffic is still too low to be a target for spackers.
There you go. Now you have ideas and tools, you know how these people think, and you are ready to implement your own protection for your blog, or at least you know where to start. All of this in a little bit less than two thousand and one hundred words.
Cheerio, partners.
V.
