[OH-Dev] Two different ways to refresh data on old bugs

Asheesh Laroia asheesh at asheesh.org
Thu Nov 1 18:37:32 UTC 2012


Hello, all OH-Devvers,

I've been working on oh-bugimporters, specifically on the changes needed
to support refreshing data about older bugs:
http://openhatch.org/bugs/issue772

I've hit an architectural issue I wanted some feedback on.

As background: for Bugzilla, Trac, and Roundup, we re-scrape every bug
that we've ever seen, once a day. (We have to do that because if, say, the
query we're given only shows new bugs and one of them gets marked as
resolved, it drops out of the query results and we would otherwise never
see the change.) For Trac and Roundup, that takes O(N) requests to the
remote bug tracker. For Bugzilla we can group those into one big query
that gets data about all the bugs.
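
To make the request counts concrete, here is a minimal sketch of the two
situations. The function names and URL shapes below are illustrative
assumptions, not what oh-bugimporters actually generates; real trackers
may need different paths or formats.

```python
from urllib.parse import urlencode


def per_bug_urls(base_url, bug_ids):
    """One URL per bug, so N bugs cost N requests (the Trac/Roundup case).

    The "/ticket/<id>" path is an assumed, Trac-like URL shape.
    """
    base = base_url.rstrip("/")
    return ["%s/ticket/%d" % (base, bug_id) for bug_id in bug_ids]


def grouped_bug_url(base_url, bug_ids):
    """One URL covering every bug, so N bugs cost one request (the Bugzilla case).

    Assumes a Bugzilla-style show_bug.cgi that accepts repeated id parameters.
    """
    base = base_url.rstrip("/")
    params = [("ctype", "xml")] + [("id", bug_id) for bug_id in bug_ids]
    return "%s/show_bug.cgi?%s" % (base, urlencode(params))
```

With three known bugs, per_bug_urls() returns three URLs to fetch, while
grouped_bug_url() returns a single URL that names all three.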

I've been looking at the Github and Google Code issue tracker APIs, and it
seems that both of them prefer that you make requests like this by asking
for "all bug data that has changed since a date".

(References:

Github: http://developer.github.com/v3/issues/

Google Code: https://code.google.com/p/support/wiki/IssueTrackerAPI )
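
As a rough illustration of a "changed since a date" request, here is a
sketch that builds such a URL for the Github issues API. It relies on the
"since" and "state" query parameters described in the Github docs linked
above; the function name is hypothetical, and the exact parameters should
be checked against those docs before relying on this.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def github_issues_since_url(owner, repo, since):
    """Build an issues URL limited to issues updated at or after `since`.

    `since` must be a timezone-aware datetime; it is converted to UTC and
    formatted as an ISO 8601 timestamp such as 2012-11-01T18:37:32Z.
    """
    stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # state=all is assumed here so that closed issues show up as well,
    # which is the whole point of refreshing older bugs.
    query = urlencode({"since": stamp, "state": "all"})
    return "https://api.github.com/repos/%s/%s/issues?%s" % (owner, repo, query)


example_url = github_issues_since_url(
    "openhatch", "oh-mainline",
    datetime(2012, 11, 1, 18, 37, 32, tzinfo=timezone.utc),
)
```

Note that urlencode() percent-encodes the colons in the timestamp, which
is fine for a query string.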

Given that, I think it makes sense to make the following changes:

* In oh-mainline, for the backends that support it, change
GoogleTrackerModel.as_dict() and GithubTrackerModel.as_dict() in
mysite/customs/models.py to also export a special bit of data called
get_older_bug_data. That will point to the URL of a query that, when
downloaded, will give all updates on that remote bug tracker since
whatever date you ask for. Pre-configure that query URL to get all
updates since the minimum last_polled date of all the bugs we know about
from the tracker.

* In oh-bugimporters: for Google Code and Github bug trackers, use the 
get_older_bug_data query to download all updates since that date, and then 
filter down the results so that we only export the data corresponding to 
bugs we are actually tracking.

(If we didn't do the filtering, we would eventually store a copy of all
bugs in the remote bug tracker, even if the project had asked us to
download only, e.g., the ones that are bitesized.)
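
The two steps above could look roughly like the following sketch. The
dictionary layout (a "url" and a "last_polled" key per tracked bug, a
"url" key per update) and the function names are assumptions made for
illustration; they are not the actual oh-mainline or oh-bugimporters data
structures.

```python
from datetime import datetime, timezone


def oldest_last_polled(tracked_bugs):
    """Return the earliest last_polled timestamp among the bugs we track.

    This is the date the get_older_bug_data query would be configured
    with. Raises ValueError if there are no tracked bugs.
    """
    return min(bug["last_polled"] for bug in tracked_bugs)


def filter_to_tracked(updates, tracked_bugs):
    """Keep only the updates that correspond to bugs we already track."""
    tracked_urls = {bug["url"] for bug in tracked_bugs}
    return [update for update in updates if update["url"] in tracked_urls]


# Hypothetical sample data: we track issues 1 and 7, but not issue 2.
tracked_bugs = [
    {"url": "https://tracker.example.org/issues/1",
     "last_polled": datetime(2012, 10, 30, tzinfo=timezone.utc)},
    {"url": "https://tracker.example.org/issues/7",
     "last_polled": datetime(2012, 10, 28, tzinfo=timezone.utc)},
]
updates = [
    {"url": "https://tracker.example.org/issues/1", "status": "closed"},
    {"url": "https://tracker.example.org/issues/2", "status": "open"},
    {"url": "https://tracker.example.org/issues/7", "status": "open"},
]

since = oldest_last_polled(tracked_bugs)
to_export = filter_to_tracked(updates, tracked_bugs)
```

Here the query would ask for everything since the older of the two
last_polled dates, and the update for issue 2 is dropped before export
because we were never tracking it.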

So...

(1) Does that strategy make sense?

(2) Is someone interested in taking a shot at that? (-:

I will aim to look into it on Monday if no one else does first, but as 
usual I'm very happy to mentor people or review code for oh-bugimporters.

-- Asheesh.