
[OH-Dev] Removed a bunch of spam (and users) from project pages

Asheesh Laroia asheesh at asheesh.org
Fri Dec 2 20:50:03 UTC 2011


Howdy all,

I just spent a bit of time cleaning up spam "Answers" to project 
involvement questions from the site.

I wrote some simple functions to do that, which I ran inside

     python manage.py shell_plus

on the deployment. If the spammers continue their spammerificness, we could 
maybe turn this into a management command, or add some spam detection code 
to the answer submission process.
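
For illustration only, here is a minimal sketch of what a check in the 
answer submission path might look like. The link-counting heuristic, the 
threshold, and the names looks_like_spam and submit_answer are all 
assumptions made for the example; none of this exists in the codebase.

```python
import re

# Matches http:// and https:// links; a deliberately crude signal.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def looks_like_spam(answer_text, max_links=2):
    """Return True when an answer contains more than max_links links."""
    return len(URL_PATTERN.findall(answer_text)) > max_links


def submit_answer(answer_text, save):
    """Save the answer unless the check flags it.

    `save` is a hypothetical callable standing in for whatever stores
    the answer. Returns True if the answer was saved, False otherwise.
    """
    if looks_like_spam(answer_text):
        return False
    save(answer_text)
    return True
```

A real version would probably hold flagged answers for review rather than 
drop them silently, since a link count alone will misfire on legitimate 
answers.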

I did some research on antispam backends, and it seems that Akismet and 
TypePad AntiSpam don't have high enough accuracy for what we're 
doing. I would suggest using http://code.google.com/p/django-spambayes/ 
instead.

The user deletion code I ran has two interesting properties:

* It saves the text of spam users' answers, so we can use it as 
training data down the road.

* It emails the users with a note saying that we deleted their accounts.

It's attached as "antispam.py".
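
Since the list archive scrubs attachments, here is a rough sketch of those 
two properties. It is not the attached script: the SpamUser class, the 
purge_spam_users function, the JSON-lines archive format, and the 
send_email callable are hypothetical stand-ins for the real Django models 
and mail code.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class SpamUser:
    # Hypothetical stand-in for a Django user plus their answers.
    username: str
    email: str
    answers: List[str] = field(default_factory=list)


def purge_spam_users(users, archive_path, send_email):
    """Archive each user's answers, notify them, and return who was removed."""
    removed = []
    with open(archive_path, "a", encoding="utf-8") as archive:
        for user in users:
            # Save the answer text first, so it survives the deletion
            # and can serve as training data later.
            for text in user.answers:
                record = {"user": user.username, "text": text}
                archive.write(json.dumps(record) + "\n")
            # Tell the account holder that the account was deleted.
            send_email(
                user.email,
                "Your account was removed",
                "We deleted your account because its answers looked like spam.",
            )
            # The actual deletion (for example user.delete() in Django)
            # would happen here; this sketch only records the name.
            removed.append(user.username)
    return removed
```

Archiving before deleting matters: if the email or the deletion fails 
partway through, the text is already on disk.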

Before doing the deletion, I made a backup that we can dig through if we 
need to restore any of these people, but I doubt it will come to that.

-- Asheesh.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: antispam.py
Type: text/x-python
Size: 1711 bytes
Desc: 
URL: <http://lists.openhatch.org/pipermail/devel/attachments/20111202/4d11447a/attachment.py>

