
[Devel] To make the map faster, first fix our public data dumps -- help request!

Asheesh Laroia asheesh at openhatch.org
Wed Sep 22 18:23:58 UTC 2010


Hey everyone,

I have a way that you can help improve the OpenHatch code as well as make
it easier for people to get started with the code. Would you help me out,
as well as everyone else who wants to make it better? (-:

Help request
------------

So right now, the map is painfully slow: http://openhatch.org/bugs/issue134

Zathras on #openhatch has been asking for dumps of the OpenHatch database.
I don't publish them because they include some private information, like
passwords and people's unpublished email addresses and locations. But he or
other people could do things like fix the map if it were easy to get a local
instance running that works like the real site -- data and all.

Data dumping
------------

So -- I wrote a management command that filters the database, only getting data
that is reasonable to share with the public, and saves it to standard output.
The idea is that every night, we can run the command on the production server,
and then everyone can get a reasonably up-to-date database if they want to try
fixing things that require real data. (This would also be useful for testing bug
tracker imports -- if you use our imported bugs, you don't have to download them
all yourself.)

I just pushed it, so if you have a local install, you'll need to:

    $ git pull

before you can try it out. Once you've done that, run the new management command:

    $ ./bin/mysite dump_public_user_data

You'll get a WHOLE lot of data on standard output. This is every row from the User
model, but with only a few columns.
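
To illustrate the idea of "all the rows, but only a few columns", here is a minimal
sketch in plain Python (no Django needed). It is not the code in the real command: the
column names in PUBLIC_USER_COLUMNS and the helper names are assumptions made up for
this example, and the real list of safe columns is whatever dump_public_user_data.py says.

```python
import json
import sys

# Hypothetical whitelist, for illustration only. The real set of columns
# that are safe to publish is defined in dump_public_user_data.py.
PUBLIC_USER_COLUMNS = ("id", "username", "date_joined")


def public_rows(rows, allowed=PUBLIC_USER_COLUMNS):
    """Yield a copy of each row that keeps only the whitelisted columns."""
    for row in rows:
        yield {key: row[key] for key in allowed if key in row}


def dump(rows, out=sys.stdout):
    """Write the filtered rows to `out` as a JSON list."""
    json.dump(list(public_rows(rows)), out, indent=2, sort_keys=True)
    out.write("\n")


if __name__ == "__main__":
    # A made-up row standing in for one entry of the User table.
    sample = [
        {
            "id": 1,
            "username": "zathras",
            "password": "not-a-real-hash",
            "email": "zathras@example.org",
            "date_joined": "2010-09-22",
        }
    ]
    dump(sample)
```

The point of a whitelist rather than a blacklist is that a column added to the model
later stays private until someone decides it is safe to publish.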

I wrote a basic test of the functionality at mysite/customs/tests.py and the management
command itself lives at:

mysite/customs/management/commands/dump_public_user_data.py

If you read that file, it explains fairly clearly what you should filter out.

What you need to do, and what you'll get
----------------------------------------

You have to create a local instance -- http://openhatch.org/wiki/Getting_started_with_the_OpenHatch_code
explains how.

If you can get the public data dumping to include the Person model in a privacy-preserving
way (as specified in dump_public_user_data.py) that would rock. It would mean other people
could play with fixing the map to be faster.

If you can submit a patch for that, along with a test that shows that it has the right
behavior, I'll send you chocolate in the mail. It would totally rock.

If you don't want chocolate, you'll get my everlasting thanks.
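
As a rough illustration of what "a test that shows that it has the right behavior" could
look like, here is a self-contained sketch using plain dictionaries and unittest. The
field names (user_id, location, location_is_public) and the public_person helper are
hypothetical, invented for this example; the real Person model and the real privacy
rules are the ones described in dump_public_user_data.py.

```python
import json
import unittest


def public_person(person):
    """Return only the parts of `person` that are safe to publish.

    The field names are hypothetical, not those of the real Person model.
    """
    safe = {"user_id": person["user_id"]}
    # Only include the location if the person has chosen to publish it.
    if person.get("location_is_public"):
        safe["location"] = person["location"]
    return safe


class PublicPersonDumpTest(unittest.TestCase):
    def test_private_location_is_not_dumped(self):
        person = {
            "user_id": 7,
            "location": "Somerville, MA",
            "location_is_public": False,
        }
        dumped = json.dumps(public_person(person))
        # Check the serialized text, since that is what gets published.
        self.assertNotIn("Somerville", dumped)

    def test_public_location_is_dumped(self):
        person = {
            "user_id": 7,
            "location": "Somerville, MA",
            "location_is_public": True,
        }
        self.assertEqual(
            public_person(person),
            {"user_id": 7, "location": "Somerville, MA"},
        )
```

Saved to a file, this can be run with "python -m unittest" followed by the file name.
Asserting on the serialized output, not just on the intermediate dictionary, checks the
thing that would actually end up in the public dump.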

Bonus points
------------

Really, the more data you can get us to dump successfully, the better. But I'm not going to
talk about that here. (Maybe in a separate email.)

-- Asheesh.

-- 
In the Spring, I have counted 136 different kinds of weather inside of
24 hours.
		-- Mark Twain, on New England weather

