[OH-Dev] Two different ways to refresh data on old bugs
Asheesh Laroia
asheesh at asheesh.org
Thu Nov 1 18:37:32 UTC 2012
Hello, all OH-Devvers,
I've been working on oh-bugimporters, and I'm working through the changes
necessary to support refreshing data about older bugs:
http://openhatch.org/bugs/issue772
I've hit an architectural issue I wanted some feedback on.
As background: What we do for Bugzilla, Trac, and Roundup is to re-scrape
each bug that we've ever seen, once a day. (We have to do that because if,
say, the query we're given only shows new bugs, and a bug gets marked as
resolved, the bug drops out of the query results and we would never see
the change.) For Trac and Roundup, that's O(N) requests to the remote bug
tracker, one per bug. For Bugzilla we can group those into one big query
that gets data about all the bugs.
I've been looking at the GitHub and Google Code issue-tracker APIs, and it
seems that both of them prefer that you refresh data like this by asking
for "all bug data that has changed since a date".
(References:
Github: http://developer.github.com/v3/issues/
Google Code: https://code.google.com/p/support/wiki/IssueTrackerAPI )
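As a rough sketch of what a "changed since a date" request could look like
for GitHub: the v3 issues listing accepts a `since` parameter holding an
ISO 8601 timestamp. The function name below is hypothetical, and
`state=all` is an assumption based on the API as currently documented, so
treat this as an illustration rather than the importer's actual code.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def build_github_since_url(owner, repo, since):
    """Return a GitHub v3 issues URL for issues updated at or after `since`.

    `since` must be a timezone-aware datetime. (Hypothetical helper.)
    """
    # GitHub expects an ISO 8601 UTC timestamp, e.g. 2012-11-01T18:37:32Z.
    stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Assumption: state=all, so bugs closed since the last poll are included.
    query = urlencode({"state": "all", "since": stamp})
    return "https://api.github.com/repos/%s/%s/issues?%s" % (owner, repo, query)
```

One request of this shape replaces the per-bug re-scrape, at the cost of
also returning issues we may not care about.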
Given that, I think it makes sense to make the following changes:
* In oh-mainline, for the backends that support it, change
mysite/customs/models.py GoogleTrackerModel.as_dict() and
GithubTrackerModel.as_dict() to also export a special bit of data called
get_older_bug_data. That will point to the URL of a query that, when
downloaded, will give all updates on that remote bug tracker since
whatever date you ask for. Pre-configure that query URL to get all updates
since the earliest last_polled date of all the bugs we know about from
that tracker.
* In oh-bugimporters: for Google Code and Github bug trackers, use the
get_older_bug_data query to download all updates since that date, and then
filter down the results so that we only export the data corresponding to
bugs we are actually tracking.
(If we didn't do the filtering, we would eventually store a copy of all
bugs in the remote bug tracker, even if the project had, e.g., asked us to
only download the ones that are bitesized.)
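The two steps above (pick the cutoff date, then filter the results) could
be sketched roughly as follows. This is only an illustration under stated
assumptions: bugs and updates are modeled as plain dicts, the `url` key and
both function names are hypothetical, and only `last_polled` comes from the
description above.

```python
from datetime import datetime


def oldest_last_polled(tracked_bugs):
    """Return the earliest last_polled date among the bugs we track.

    This is the date the get_older_bug_data query would be configured with.
    """
    return min(bug["last_polled"] for bug in tracked_bugs)


def keep_tracked_updates(updates, tracked_bugs):
    """Drop updates for bugs we are not already tracking.

    `updates` is whatever the "changed since" query returned, already
    parsed into dicts with a hypothetical "url" key.
    """
    tracked_urls = {bug["url"] for bug in tracked_bugs}
    return [update for update in updates if update["url"] in tracked_urls]


tracked = [
    {"url": "http://example.com/issues/1", "last_polled": datetime(2012, 10, 30)},
    {"url": "http://example.com/issues/2", "last_polled": datetime(2012, 10, 28)},
]
updates = [
    {"url": "http://example.com/issues/2", "status": "closed"},
    {"url": "http://example.com/issues/99", "status": "open"},  # not tracked
]

cutoff = oldest_last_polled(tracked)
fresh = keep_tracked_updates(updates, tracked)
```

Using the earliest date means some bugs get re-downloaded even though they
were polled more recently, but it guarantees that no tracked bug's changes
are missed.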
So...
(1) Does that strategy make sense?
(2) Is someone interested in taking a shot at that? (-:
I will aim to look into it on Monday if no one else does first, but as
usual I'm very happy to mentor people or review code for oh-bugimporters.
-- Asheesh.