[OH-Dev] Scrapy migration process: Continuing

Asheesh Laroia asheesh at asheesh.org
Tue Sep 18 04:12:53 UTC 2012


Hey all,

The bug-downloading code that powers openhatch.org/search/ has been going 
through a major refactoring.

As part of that, oh-bugimporters is moving to a new, Scrapy-based backend.

http://linode2.openhatch.org/~paulproteus/scrapy-log-2012-09-18 is a log 
file of the most recent crawl. At the end of that, you can see the 
following stats:

* item_scraped_count: 858

* spider_exceptions/KeyError: 1482

So some of the existing importers clearly have bugs to fix. The log file 
has more details, and for any error found there, the best thing to do is 
to create a new test case within oh-bugimporters that reproduces it.
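
As a rough sketch of what such a test case might look like: the parse_bug 
function and the field names below are hypothetical stand-ins, not the 
actual oh-bugimporters API. The idea is to feed the parser data shaped like 
the input that triggered the KeyError and assert that it copes.

```python
import unittest


# Hypothetical stand-in for an importer's parsing step; this is NOT the
# real oh-bugimporters API, and the field names are made up.
def parse_bug(raw):
    return {
        'title': raw['title'],
        # Using .get() with a default avoids a KeyError when the
        # tracker omits the field entirely.
        'status': raw.get('status', 'unknown'),
    }


class TestParseBug(unittest.TestCase):
    def test_missing_status_does_not_raise(self):
        # Sample data missing the 'status' key, like the input that
        # would have caused the KeyError in the crawl.
        raw = {'title': 'Crash on startup'}
        parsed = parse_bug(raw)
        self.assertEqual(parsed['status'], 'unknown')

    def test_present_status_is_kept(self):
        raw = {'title': 'Crash on startup', 'status': 'open'}
        self.assertEqual(parse_bug(raw)['status'], 'open')
```

In practice the sample data would be copied from whatever the tracker 
actually returned, so the test fails before the fix and passes after it.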

The Bugzilla importer hasn't been moved over yet, and still needs to be.

Cheers for now,

-- Asheesh.

