Spam comments on blog posts

Hi there,

Hi Matt,

I'm sorry for the spam - there has been a persistent daily bunch of them coming through and seemingly they always target the same forum threads, blog entries and photos. It took us a little while to realise that they were also always commenting on blog entries, so we weren't automatically removing them with our own moderation efforts. We're a lot more aware of that now and blog comments should generally be removed by us within a few hours at most.

But you're right - there should also be an option for blog owners to block TP members from posting, at least when it's their first comment. I'll make it a point to tackle this particular persistent spammer next week with some better strategies. Sorry for the inconvenience in the meantime. We all detest those buggers.

That it's always posting to the same threads means it's almost certainly a bot; probably some version of XRumer.
I'd recommend altering the registration page to use per-session input field names: a hash of the real field name, the user's IP address, an hourly timestamp, and a salt. Be sure to check both the current hour and the previous one on receiving, so you don't lose users who load the form at :59 and submit at :01. Also add a "don't fill in this field" input hidden with CSS (positioned at a random location in between the other input fields), and automatically discard any registration that fills it in. (Or, if you want to put in the effort, give the bots a fake registration, with all posts they make automatically deleted, and maybe channeled into a queue to check for new blacklist keywords. But that's probably too much effort with all the different commenting systems on the site.)
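For anyone who wants to see the hashed field names in concrete form, here is a minimal sketch in Python. The names SECRET_SALT, field_name and decode_form are made up for illustration, and SHA-256 is just one reasonable choice of hash; none of this is the site's actual code.

```python
import hashlib
import time
from typing import Dict, List, Optional

# Hypothetical secret; in practice keep it out of the page source and the repo.
SECRET_SALT = "change-me"


def current_hour(now: Optional[float] = None) -> int:
    """Number of whole hours since the epoch (the 'hourly timestamp')."""
    return int((time.time() if now is None else now) // 3600)


def field_name(real_name: str, ip: str, hour: int) -> str:
    """Derive the name to put in the HTML form for one real field."""
    raw = f"{SECRET_SALT}|{real_name}|{ip}|{hour}".encode("utf-8")
    # Prefix with a letter so the name never starts with a digit.
    return "f" + hashlib.sha256(raw).hexdigest()[:16]


def decode_form(
    form: Dict[str, str],
    real_names: List[str],
    ip: str,
    now: Optional[float] = None,
) -> Optional[Dict[str, str]]:
    """Map hashed names back to real ones, or return None if nothing matches.

    Both the current hour and the previous one are tried, so a form loaded
    at :59 and submitted at :01 is still accepted.
    """
    hour = current_hour(now)
    for candidate in (hour, hour - 1):
        hashed = {field_name(name, ip, candidate): name for name in real_names}
        if all(key in form for key in hashed):
            return {hashed[key]: form[key] for key in hashed}
    return None
```

When rendering the form you would call field_name for each input; on submission, a None result from decode_form means the field names were stale or forged and the registration can be discarded.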

[ 02-Jun-2012, at 04:33 by Sander ]

The fact that it started after featuring also indicates a bot, and explains why it's always those same threads. It's basically following the featured links on the home page, it seems.

One thought was to always block all URL shorteners. They are really a haven for spam anyway, and this particular bot is getting around blacklisted keywords/URLs using them.

I like the per-session hashed input fields idea, Sander!

URL shorteners can have some legitimate purposes (I've used them myself when linking to gigantic Google Maps URLs), so it'd be a shame to block them outright; and I don't think it'd be very effective for long against bots. Though blocking them for new members might be worth the tradeoff.
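As a rough illustration of the "block shorteners for new members only" tradeoff, a check along these lines could run when a comment is submitted. It is only a sketch: the domain list is a small sample, the threshold of 5 posts is an arbitrary assumption, and the function names are invented here.

```python
import re
from typing import List

# A small sample of well-known shortener domains; a real list would be longer.
SHORTENER_DOMAINS = {"bit.ly", "tinyurl.com", "goo.gl", "t.co", "is.gd", "ow.ly"}

# Hypothetical threshold: members below this post count count as "new".
MIN_POSTS_FOR_SHORT_LINKS = 5

# Matches host names with or without a scheme or a leading "www.".
HOST_PATTERN = re.compile(
    r"(?:https?://)?(?:www\.)?([a-z0-9-]+(?:\.[a-z0-9-]+)+)", re.IGNORECASE
)


def shortener_links(text: str) -> List[str]:
    """Return the shortener host names mentioned in the text, lowercased."""
    hosts = (host.lower() for host in HOST_PATTERN.findall(text))
    return [host for host in hosts if host in SHORTENER_DOMAINS]


def should_block(text: str, author_post_count: int) -> bool:
    """Block shortened links only when the author is a new member."""
    is_new_member = author_post_count < MIN_POSTS_FOR_SHORT_LINKS
    return is_new_member and bool(shortener_links(text))
```

Established members could still post a shortened Google Maps link, while a freshly registered account dropping one would be stopped. Subdomains of shorteners and services missing from the list would slip through this version.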

I have had very positive results with the registration form hashing; a PunBB forum I'm an admin at went from 60 spam registrations per day, to 3 per week, to 0 for the last month. (I suspect that for a while someone was "training" the bot on the "new" registration form once a week, which allowed spam registrations for ~1.5 hours; but they have now given up on it.) If you go this way, and it still doesn't prevent things, shoot me an email for some more tricks.

Thanks very much for your responses.

I'm afraid, Sander, that the advice you give is a little beyond my computer know-how without a step-by-step walkthrough. I'm completely unfamiliar with the things you talk about!

Peter, exactly what I was thinking (from a user perspective). My main thought was that for Blog Comments only it would be good if none were automatically approved - that way I can filter them manually each time I log in.

Many thanks all.

Matt.

matthinc: Sorry, should've clarified that. My advice was for Sam and Peter; suggesting a possible way to keep the spammers completely out of the door, working on the assumption that this particular kind of spam is being created automatically, rather than by humans.

I've just flagged/deleted some spam in the forums that I think is being referred to here - that persistent one that comes up pretty much every day, but with the URL changing every second day or whatever. Something I noticed is that their post count doesn't come up, and nor do the threads the bot has posted in come up on their profile page - I'm sure the earlier posts by this bot resulted in those things being displayed.

With banning URL shorteners, I see more reasons to ban them than to allow them, to be honest. The only reason I can see to allow them is the example Sander gave, which is the huge URLs that Google Maps has. Other than that they provide an anonymous way of directing people to an address - and this can also make it difficult (i.e. it takes more time) to see/check if it's spam or genuine.

Is there a minimum post requirement before URLs can be posted (whether or not a hyperlink is included), and is there any way posts with links can be held back to be checked for users with fewer than a certain number of posts, as I've seen in other forums?

There isn't a number of minimum posts before being allowed to post URLs at the moment. Another thing to consider for sure.

I'm not sure I want to add in pre-moderation as a measure at this stage since it will mean a lot of extra effort, probably more than the effort required to clean up in their wake.

Sander, I've tried the suggestion to hash the signup fields in the past (when we had several hundred bots signing up per day late last year). For some reason it just didn't work at all. I just figured their bots were smart enough to work around that, so I abandoned the idea since it also complicated the code of course. That said, it might work for this particular one and probably would cut out a few other spammers along the way. Maybe worth giving it another go.

"There isn't a number of minimum posts before being allowed to post URLs at the moment. Another thing to consider for sure."

FWIW, I'm very much not in favour of this. I don't believe it'd stop either SEO or mass-spam. It might throw up a small hurdle for "visit my blog" promotional posts, but it'd do so at the cost of making SEO-spammers create even more drivel first, which makes them harder to recognize when they finally drop their link.

"Sander, I've tried the suggestion to hash the signup fields in the past (when we had several hundred bots signing up per day late last year). For some reason it just didn't work at all. I just figured their bots were smart enough to work around that"

They were probably too stupid instead, blindly filling in every single input field on the page. This is why the hidden-with-CSS "Don't fill in this field" is there. (Better yet: "Type 'human' here", and hide + fill it in with JavaScript.)
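On the server side, both hidden-field tricks boil down to a couple of comparisons. A minimal sketch, assuming made-up field names ("website" for the field that must stay empty, "confirm" for the one JavaScript fills in):

```python
from typing import Mapping

# Hypothetical field names, chosen only for this example.
HONEYPOT_FIELD = "website"   # hidden with CSS; a human never sees or fills it
CHALLENGE_FIELD = "confirm"  # hidden and set to "human" by JavaScript
CHALLENGE_VALUE = "human"


def looks_like_bot(form: Mapping[str, str]) -> bool:
    """Return True when a submission should be discarded as automated."""
    # A bot that blindly fills every input trips the honeypot.
    if form.get(HONEYPOT_FIELD, "").strip():
        return True
    # A bot that skips JavaScript leaves the challenge field empty or wrong.
    return form.get(CHALLENGE_FIELD, "").strip().lower() != CHALLENGE_VALUE
```

The catch with the JavaScript variant is that a real visitor with scripting disabled never gets the field filled in, so they would need a visible fallback where they can type the word themselves.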

