3. Why?
- transparency, openness: it's public data
- tap the creativity and enthusiasm of web developers
- stimulate applications for citizens & commerce: track crime in your area, understand where funding is going, plan travel, choose a school
4. Theme for this talk
- how to accelerate this uptake? reduce the cost of exploiting public data? stimulate an ecosystem of value-added services?
- data dumps and information intermediaries
- the linked data approach
- intermediaries for a linked data world
5. Traditional publication approach: data dumps
- publish individual datasets, typically CSV
- easy for the publisher
- consumer has complete control: no complex formats or query languages, manage data as they want to, familiar technology stack
- growing set of intermediaries: web services to help you work with datasets, not specific to public sector data
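The "familiar technology stack" point above can be made concrete: a consumer of a CSV dump needs nothing beyond a standard library. A minimal sketch, in which the column names and values are illustrative assumptions, not a real published dataset:

```python
import csv
import io

# Sketch: consuming a traditional data-dump publication (plain CSV).
# The columns (school, district, capacity) are illustrative assumptions.
dump = io.StringIO("""school,district,capacity
Oxford High School,Oxford,1250
Cherwell School,Oxford,1800
""")

# DictReader gives the consumer complete control over parsing and filtering.
rows = list(csv.DictReader(dump))
big = [r["school"] for r in rows if int(r["capacity"]) >= 1200]
```

The cost, as the later slides note, is that every application repeats this integration work against its own local copy.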
9. Limitations of data dumps
- silo design pattern: each application does its own data integration; hard to share or reuse efforts between applications
- static local stores which require management and update
10. Linked data: public sector data web
How:
- URIs to identify the things described
- dereference to RDF (& other formats)
- SPARQL endpoints for query
- vocabularies and patterns for statistics, versioning, provenance ...
- standard URI sets: time periods, regions, departments, schools ...
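The SPARQL protocol mentioned above is plain HTTP: a client URL-encodes a query as the `query` parameter of a GET request. A minimal sketch, where the endpoint URL and the vocabulary terms are illustrative assumptions, not the actual data.gov.uk services:

```python
from urllib.parse import urlencode

# Hypothetical SPARQL endpoint; a real deployment would publish its own URL.
ENDPOINT = "http://example.gov/sparql"

# List ten schools by label; school#School is an assumed class URI.
query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?school ?name WHERE {
  ?school a <http://example.gov/def/school#School> ;
          rdfs:label ?name .
} LIMIT 10
"""

def request_url(endpoint, sparql):
    """Build the GET URL a SPARQL protocol client would fetch."""
    return endpoint + "?" + urlencode({"query": sparql})

url = request_url(ENDPOINT, query)
```

Dereferencing an individual resource URI works the same way at the HTTP level, with content negotiation selecting RDF or another format.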
11. Public sector data web
[diagram: linked datasets including DCSF, Edubase, Ofsted, Schools, AdminGeography, TimePeriods, Gov.Bodies]
12. Benefits of the linked data approach
- integrated (linked!) data: standard identifiers enable linking other sets
- seed connections between third-party sets
- fine-grained addressing of data: annotations (e.g. provenance)
- fine-grained programmatic access: consume live or cache, not forced to use static data
- model directly linked from the data
13. But ...
- barrier to entry too high: "just give us CSV"
- alien data model
- alien query methods
- alien representation formats
- overall mismatch to the typical web developer's tool kit
14. Solution
- middleware to provide web-friendly access
- run at the publisher end or as an intermediary
- publish as linked data -> automatic API
- configure automatically from the ontology
- customize the configuration (e.g. URI patterns) if needed
15. Linked data API
Access:
- RESTful API design
- serve lists of resources or individual resources
- automatic sorting and paging of lists
- simple web API to control filtering and viewing
Formatting:
- developer-friendly JSON & XML
- retain the resource-centric model
- remove round-tripping requirements: rooted graph
16. Structure
[architecture diagram: a request such as GET /doc/schools/district/Oxford.json?min-capacity=1200 passes through an endpoint match, a selector (SELECT ?item WHERE { ... }), a viewer (DESCRIBE <x> <y>), a formatter, and a cache, backed by a SPARQL endpoint as the data source; behavior is driven by an API specification in the vocabulary of the data set]
17. Operation
- request: /doc/schools/district/Oxford.json?min-capacity=1200
- match endpoint: /doc/schools/district/{d}
- retrieve matches:
  SELECT ?r WHERE {
    ?r a school:School ;
       school:district [ rdfs:label 'Oxford' ] ;
       school:capacity ?c .
    FILTER (?c >= 1200)
  } OFFSET 0 LIMIT 10
- build response: metadata (query and configuration), list (page N-1, page N, page N+1)
- select format: JSON (school i, school i, ...)
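The translation step above can be sketched as a small function. This is not the actual middleware, just an illustration under stated assumptions: the prefixes, the `school:` property names, and the page size of 10 are taken from the slide's example.

```python
PAGE_SIZE = 10  # assumed default page size, as in the slide's LIMIT 10

def build_select(district, min_capacity=None, page=0):
    """Translate API parameters into the selector's SELECT query.

    Mirrors the slide's example: the URI template variable {d} binds the
    district label, and min-capacity becomes a FILTER clause.
    """
    patterns = [
        "?r a school:School ;",
        f"   school:district [ rdfs:label '{district}' ] ;",
        "   school:capacity ?c .",
    ]
    if min_capacity is not None:
        patterns.append(f"FILTER (?c >= {min_capacity})")
    body = "\n  ".join(patterns)
    return (f"SELECT ?r WHERE {{\n  {body}\n}} "
            f"OFFSET {page * PAGE_SIZE} LIMIT {PAGE_SIZE}")

q = build_select("Oxford", min_capacity=1200)
```

The selected resources are then passed to the viewer (a DESCRIBE over the matched URIs) and the formatter, which serializes one page plus its paging metadata.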
19. Linked data API: outcomes
- lowers the barrier to entry; very positive reception
- build linked data applications with e.g. jQuery, no need for a full RDF stack
- stepping stone to the linked data world: retain the concept of resources with URIs, retain the schema-less model, and inspect the SPARQL and API configuration behind each response
- open specification (Epimorphics, Talis, TSO)
- multiple implementations, including open source: http://code.google.com/p/linked-data-api/
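"No need for a full RDF stack" means a client treats a response page as ordinary JSON. A minimal sketch of consuming such a page; the field names (`result`, `items`, `_about`, `label`) follow the Linked Data API's JSON conventions loosely, and the sample values are invented for illustration:

```python
import json

# A hand-written stand-in for one JSON page from a linked data API;
# the URIs, labels, and capacities are illustrative assumptions.
sample = json.loads("""
{
  "result": {
    "page": 0,
    "items": [
      {"_about": "http://example.gov/id/school/100001",
       "label": "Oxford High School", "capacity": 1250},
      {"_about": "http://example.gov/id/school/100002",
       "label": "Cherwell School", "capacity": 1800}
    ]
  }
}
""")

# Plain dict/list access -- no triples, no graph library required.
names = [item["label"] for item in sample["result"]["items"]]
```

Each item still carries its resource URI (`_about`), so a client that later wants the full linked data view can dereference it.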
21. Conclusions
- intermediary services, such as the linked data API, can make the power and flexibility of linked data available to a broader range of developers
- they meet the public sector goal of stimulating a network of value-added applications for citizens and business
- lots more to do ...