
[OH-Dev] Status of scrapy-ification of bugimporters (and request for help)

John Morrissey jwm at horde.net
Sun Sep 16 20:38:10 UTC 2012


On Wed, Sep 05, 2012 at 03:21:19PM -0400, Asheesh Laroia wrote:
> I spent some time over Labor Day weekend refactoring the
> oh-bugimporters code to use scrapy, rather than our homebrew,
> not-quite-functioning async downloading framework.
> 
> It's here for now: (note I may eventually rebase this branch)
> https://github.com/paulproteus/oh-bugimporters/tree/experimental/scrapy-ify-trac
[snip]
> Things to be done: reasonably easy
> ----------------------------------
> 
> * Easy: Fixing up the patch set so that it doesn't have silly commit
> log messages like 'rofl'
> 
> * Easy: Fix up oh-mainline so it accepts this JSON data as input
> (for example, I changed the serialization system so that we output
> datetime.datetime.isoformat() strings rather than custom datetime
> objects). (Also, oh-mainline wants YAML files as input, but this
> generates JSON.)
> 
> * Reasonably easy: Modify the code so that in case of a remote 404,
> oh-bugimporters sets a flag on items.ParsedBug called _deleted. Then
> oh-mainline knows that when it sees that, it should delete the
> corresponding Bug in the database.
> 
> * Reasonably easy: The export of all the "TrackerModel" subclasses
> should include a list of current bug URLs, so that oh-bugimporters
> knows to download fresh data for those bugs.
> 
> Things to be done: Less easy
> ----------------------------
> 
> * Slightly harder: Make a trivial fake downloader function that
> takes a Request as input, and a configuration dictionary mapping
> URLs to filenames on disk, and returns a Response object that has
> that data. That way, we can continue to use something like
> fakeGetPage to let the test suite run offline and predictably.
> 
> * The other bug importers need to get ported as well. Right now, we
> only handle Trac instances. That's a large fraction of the bug
> importers, but there's GitHub, Roundup, Launchpad, and Google Code
> to port as well.
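
For reference, the fake downloader item quoted above could be sketched
roughly as follows. This is only an illustration, not the code in the
branch: FakeRequest, FakeResponse and fake_download are hypothetical
stand-ins (plain dataclasses rather than scrapy's own Request and
Response classes), and treating an unmapped URL as a 404 is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FakeRequest:
    """Hypothetical stand-in for a request; only the URL matters here."""
    url: str


@dataclass
class FakeResponse:
    """Hypothetical stand-in for a response object."""
    url: str
    body: bytes
    status: int = 200


def fake_download(request, url_to_path):
    """Return a canned response for request.url, read from a file on disk.

    url_to_path is the configuration dictionary mapping URLs to filenames.
    """
    try:
        path = url_to_path[request.url]
    except KeyError:
        # Assumption: an unmapped URL behaves like a remote 404, so the
        # test suite can also exercise the "_deleted" code path offline.
        return FakeResponse(url=request.url, body=b"", status=404)
    with open(path, "rb") as f:
        return FakeResponse(url=request.url, body=f.read())
```

A test would build the dictionary from its fixture files and call
fake_download() wherever the real downloader would otherwise be used, so
nothing touches the network.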

It looks like all of this is done except (a) porting the remaining importers
and (b) updating oh-mainline to accept the new JSON output from the
importers. Is anything else remaining besides those two items?
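
For (b), a rough sketch of what the consuming side might involve, assuming
the importers emit a JSON list of bug dictionaries with isoformat() date
strings and an optional _deleted flag, as described above. The function
name load_bugs and the field names canonical_bug_link, date_reported and
last_touched are assumptions for illustration, not a description of
oh-mainline's actual code.

```python
import json
from datetime import datetime

# Assumed names of the fields that carry isoformat() strings.
DATE_FIELDS = ("date_reported", "last_touched")


def load_bugs(json_text):
    """Split importer output into bugs to save and bug URLs to delete."""
    to_save, to_delete = [], []
    for item in json.loads(json_text):
        if item.get("_deleted"):
            # The importer saw a remote 404; the matching Bug should go.
            to_delete.append(item["canonical_bug_link"])
            continue
        bug = dict(item)
        for field in DATE_FIELDS:
            if bug.get(field):
                # Turn the isoformat() string back into a datetime.
                bug[field] = datetime.fromisoformat(bug[field])
        to_save.append(bug)
    return to_save, to_delete
```

The caller would then create or update a Bug for each entry in the first
list and delete the Bugs whose URLs appear in the second.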

john
-- 
John Morrissey          _o            /\         ----  __o
jwm at horde.net        _-< \_          /  \       ----  <  \,
www.horde.net/    __(_)/_(_)________/    \_______(_) /_(_)__

