6. A Different Breed of Open
● Making data accessible:
● Built-in search
● Permanent URIs
● Standardized Feeds
● Real-time Alerts
● REST Architecture with Feed Publishing
● RSS/Atom => PubSubHubbub => Alerts
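The RSS/Atom => PubSubHubbub => Alerts chain can be sketched as a publisher "ping": when a feed changes, the publisher tells a hub, and the hub notifies subscribers. Below is a minimal sketch, assuming the form-encoded publish convention from the PubSubHubbub 0.3 spec (`hub.mode=publish`, `hub.url=<feed>`). The hub and feed URLs and the helper name `build_publish_ping` are placeholders, not the project's actual endpoints or code.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_publish_ping(hub_url, feed_url):
    """Build (but do not send) a POST telling a hub that feed_url has new content."""
    # Form-encoded body following the PubSubHubbub 0.3 publish convention.
    body = urlencode({"hub.mode": "publish", "hub.url": feed_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical URLs, for illustration only.
ping = build_publish_ping(
    "https://hub.example.org/",
    "https://example.org/feeds/bills.atom",
)
```

Passing `ping` to `urllib.request.urlopen` would send it; hubs that follow the 0.3 convention acknowledge an accepted ping with a 204 response.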
10. 1 Year Re-cap
● Open Sourced It (for real)
● Improved the API (XML/JSON)
● Decreased Load Times
● Restructured the Back-end
● Basic Documentation
● Wrapped into a build system
11. The next year
● In general...
● Data Quality and Documentation
● Usage Tracking and Statistics
● User Interface Improvements
● Further separation of the Platform and Service
● Right now
● Data Quality, Data Quality, Data Quality
● And a little bit of documentation
13. Well, not exactly
● Legislative Research Service has the data
● Big, ancient mainframe to boot
● They FTP us updates every 5 minutes
● In SOBI formats (what?)
● With some XML mixed in
● We parse it back into XML/JSON/SQL structure
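The last step, turning one parsed record into both JSON and XML, might look roughly like the sketch below. It does not parse SOBI at all. It starts from an already-parsed record, and the field names (`id`, `title`) and helper names are assumptions for illustration, not the project's real schema.

```python
import json
import xml.etree.ElementTree as ET


def to_json(record):
    """Serialize a flat record to JSON with stable key order."""
    return json.dumps(record, sort_keys=True)


def to_xml(tag, record):
    """Serialize a flat record to XML, one child element per field."""
    root = ET.Element(tag)
    for key in sorted(record):
        child = ET.SubElement(root, key)
        child.text = str(record[key])
    return ET.tostring(root, encoding="unicode")


# Hypothetical parsed record; real records carry many more fields.
bill = {"id": "S1234-2011", "title": "Example title"}
bill_json = to_json(bill)
bill_xml = to_xml("bill", bill)
```

Sorting the keys keeps both outputs stable from run to run, which matters once the files are diffed or put under version control.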
14. Reasons for Difficulty
● Poorly Documented SOBI behavior
● Formatted as a change log (sometimes)
● Finding sources of error can be hard
● LRS is not co-operative
15. Solutions
● Version Control
● Write objects to JSON/XML files
● With Git, commit each new version
– Commit message points to the source SOBI
● Use git to trace data errors back to SOBI files
● Unit Test known corner cases
● Periodically do a scrape check?
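The version-control idea above can be sketched as follows: write each object to a JSON file, then commit it with a message naming the SOBI file it came from. This is a minimal sketch, not the project's implementation. The function names, the one-file-per-object layout, and the example SOBI file name are all assumptions.

```python
import json
import subprocess
from pathlib import Path


def write_object(repo_dir, object_id, data):
    """Write one parsed object to <repo_dir>/<object_id>.json with stable formatting."""
    path = Path(repo_dir) / f"{object_id}.json"
    # sort_keys and a fixed indent keep diffs between versions small.
    text = json.dumps(data, indent=2, sort_keys=True) + "\n"
    path.write_text(text, encoding="utf-8")
    return path


def commit_commands(path, sobi_filename):
    """Return the git commands that record this version and its source SOBI file."""
    message = f"Update {path.name} from {sobi_filename}"
    return [
        ["git", "add", path.name],
        ["git", "commit", "-m", message],
    ]


def commit_object(repo_dir, object_id, data, sobi_filename, run=subprocess.run):
    """Write the object, then stage and commit it inside repo_dir.

    `run` is injectable so the flow can be exercised without a real repository.
    Note: `git commit` exits non-zero when nothing changed, so with check=True
    an unchanged object raises CalledProcessError.
    """
    path = write_object(repo_dir, object_id, data)
    for cmd in commit_commands(path, sobi_filename):
        run(cmd, cwd=repo_dir, check=True)
    return path
```

With commits shaped like this, running `git log -- S1234-2011.json` inside the repository lists every version of that object alongside the SOBI file named in each commit message, which is the trace-back path the slide describes.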
16. Progress
✔ Parsing has been overhauled
✔ Objects are written to file
✔ Bugs have been found and fixed
✔ Periodic Scrapes are approved
17. A short task list
✗ Integrate git into the parsing system.
✗ Document expected behavior
✗ Write a small test suite
✗ Try to avoid having to scrape.
18. HFOSS Symposium 2011
● Bryan Sivak – Civic Commons
● Mark Prutsalis – Sahana Foundation
● Many universities, Mozilla, Google
● David, Moorthy, Brian, and Myself!
● 1 Hour and a few 3' x 4' posters.