[tei-council] identifying tickets
stuart yeates
stuart.yeates at vuw.ac.nz
Tue Apr 12 23:19:12 EDT 2011
On the suspicion that I wasn't the only one who has found tracking down
tickets in the last two days a bit of a pain, I have written a script
that converts the RSS feeds from sourceforge into TEI:
http://dl.dropbox.com/u/4479425/sourceforgeRSS2TEI/sourceforgeRSS2TEI.xsl
Current results are at:
http://dl.dropbox.com/u/4479425/sourceforgeRSS2TEI/bugs.xml
http://dl.dropbox.com/u/4479425/sourceforgeRSS2TEI/features.xml
I seriously considered generating wiki markup instead, so it could be
cut-n-pasted into the wiki (where by-column sorting functionality and
fancy layout options are 'free'), but figured that TEI could be
re-purposed by more people. I'm open to suggestions as to where this
script and the resulting data should be exposed publicly.
Feedback is welcome, as always.
Current known issues:
* Double escaping of <, > and & (that's how it comes in the RSS and I
didn't want to start shaving that yak)
* Limit of 100 bugs + 100 features (can use multiple requests + xinclude
to get more if needed?)
* Comments appear unavailable via API (and I'm not scraping the HTML)
* title field may be truncated if it contains a <
* Dates aren't in an easily sortable format
* The header is impoverished, as is usual for such documents
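Two of the issues above (the double escaping and the unsortable dates) could probably be worked around after the fact. What follows is only a minimal sketch and is not part of the script: the function names undouble_escape and rfc822_to_iso are hypothetical, the date helper assumes GNU date and assumes the feed uses the RFC 822 style dates that RSS 2.0 prescribes, and the sed pass would also alter any text that was deliberately double-escaped.

```shell
#!/bin/sh
# Hypothetical helpers; neither is part of sourceforgeRSS2TEI.xsl.

# Undo one level of escaping for <, > and & on stdin,
# e.g. "&amp;lt;" becomes "&lt;". Other entities are left alone.
undouble_escape() {
    sed -E 's/&amp;(lt|gt|amp);/\&\1;/g'
}

# Convert an RFC 822 style date (assumed feed format) into ISO 8601 UTC,
# which sorts correctly as plain text. Relies on GNU date's -d option.
rfc822_to_iso() {
    date -u -d "$1" '+%Y-%m-%dT%H:%M:%SZ'
}
```

The first helper could sit in a pipeline between xsltproc and xmllint; the second takes one date string as its argument, so rfc822_to_iso 'Tue, 12 Apr 2011 23:19:12 -0400' should print 2011-04-13T03:19:12Z.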
How I use the script:
wget --output-document bugs.rss \
  "http://sourceforge.net/api/artifact/index/tracker-id/644062/limit/1000/rss"
wget --output-document features.rss \
  "http://sourceforge.net/api/artifact/index/tracker-id/644065/limit/1000/rss"
xsltproc ./sourceforgeRSS2TEI.xsl ./bugs.rss | xmllint --format - \
  > bugs.xml
xsltproc ./sourceforgeRSS2TEI.xsl ./features.rss | xmllint --format - \
  > ./features.xml
cheers
stuart
--
Stuart Yeates
Library Technology Services http://www.victoria.ac.nz/library/