Open Legislation
  Spring 2011
Open Data
(Government)
Secondary Sources are nice
●   OpenCongress
●   GovTrack.US
●   OpenStates
●   FedSpending.org

●   Many more
Primary Sources are better
●   Data.gov
●   USAspending.gov
●   California
●   Oregon
●   Washington

●   Many more
Sometimes though...
Open Data is not Enough.

  We need Platforms.
A Different Breed of Open
●   Making data accessible:
    ●   Built-in search
    ●   Permanent URIs
    ●   Standardized Feeds
    ●   Real-time Alerts


●   REST Architecture with Feed Publishing
    ●   RSS/Atom => PubSubHubbub => Alerts
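
The RSS/Atom => PubSubHubbub step can be sketched as follows. This is a minimal illustration, not the platform's actual code: it assumes the publish ping described in the PubSubHubbub 0.3 spec (a form-encoded POST carrying `hub.mode=publish` and `hub.url`), and both URLs are hypothetical placeholders. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_publish_ping(hub_url: str, topic_url: str) -> Request:
    """Build (but do not send) the POST that tells a hub a feed has new entries."""
    # The hub then fetches the feed itself and pushes new entries to subscribers.
    body = urlencode({"hub.mode": "publish", "hub.url": topic_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical hub and feed URLs, for illustration only.
    ping = build_publish_ping("https://hub.example.org/", "https://example.org/bills.atom")
    print(ping.get_method(), ping.full_url)
    print(ping.data.decode("ascii"))
```

Passing the returned request to `urllib.request.urlopen` would send the ping; alerting services subscribed at the hub would then be notified without polling the feed.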
So back to
Open Legislation
Browse, Search, and Share
http://open.nysenate.gov/legislation
It's not a Service;
It's an Open Platform
1 Year Re-cap
●   Open Sourced It (for real)
●   Improved the API (xml/json)
●   Decreased Load Times
●   Restructured the Back-end
●   Basic Documentation
●   Wrapped into a build system
The next year
●   In general...
    ●   Data Quality and Documentation
    ●   Usage Tracking and Statistics
    ●   User Interface Improvements
    ●   Further separation of the Platform and Service

●   Right now
    ●   Data Quality, Data Quality, Data Quality
    ●   And a little bit of documentation
The Senate has Legislative
   Data Quality issues?
Well, not exactly
●   Legislative Research Service has the data
    ●   Big, ancient mainframe to boot


●   They FTP us updates every 5 minutes
    ●   In SOBI formats (what?)
    ●   With some XML mixed in

●   We parse it back into XML/JSON/SQL structure
Reasons for Difficulty
●   Poorly Documented SOBI behavior

●   Formatted as a change log (sometimes)
    ●   Finding sources of error can be hard


●   LRS is not co-operative
Solutions
●   Version Control
    ●   Write objects to JSON/XML files
    ●   With Git, commit each new version
        –   Commit message points to the source SOBI
    ●   Use git to trace data errors back to SOBI files


●   Unit Test known corner cases

●   Periodically do a scrape check?
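
The version-control idea above might look roughly like this. It is a sketch under stated assumptions, not the project's parser: the directory layout, the object IDs, and the SOBI file name are all hypothetical, and it assumes `repo` is an already-initialized git working tree.

```python
import json
import subprocess
from pathlib import Path


def write_object(repo: Path, kind: str, object_id: str, data: dict) -> Path:
    """Serialize one parsed object to a stable JSON file inside the repository."""
    path = repo / kind / f"{object_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Sorted keys and fixed indentation keep diffs limited to fields that changed.
    path.write_text(json.dumps(data, indent=2, sort_keys=True) + "\n", encoding="utf-8")
    return path


def commit_message(sobi_file: str) -> str:
    """Commit message that points back to the source SOBI file."""
    return f"Update from {sobi_file}"


def commit_version(repo: Path, path: Path, sobi_file: str) -> None:
    """Stage the file and commit it, skipping the commit when nothing changed."""
    subprocess.run(["git", "add", str(path.relative_to(repo))], cwd=repo, check=True)
    # `git diff --cached --quiet` exits non-zero only when something is staged.
    staged = subprocess.run(["git", "diff", "--cached", "--quiet"], cwd=repo)
    if staged.returncode != 0:
        subprocess.run(
            ["git", "commit", "-m", commit_message(sobi_file)], cwd=repo, check=True
        )
```

With one commit per SOBI file, running `git log -- bill/S1234-2011.json` (a hypothetical path) lists every update that touched that object, and each commit message names the SOBI file to inspect.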
Progress
✔   Parsing has been overhauled
✔   Objects are written to file
✔   Bugs have been found and fixed
✔   Periodic Scrapes are approved
A short task list
✗   Integrate git into the parsing system.
✗   Document expected behavior
✗   Write a small test suite
✗   Try to avoid having to scrape.
HFOSS Symposium 2011
●   Bryan Sivak – Civic Commons
●   Mark Prutsalis – Sahana Foundation
●   Many universities, Mozilla, Google

●   David, Moorthy, Brian, and Myself!
    ●   1 Hour and a few 3' x 4' posters.
