Working with data.open.ac.uk, the Linked Data Platform of the Open University
1. Working with data.open.ac.uk, the linked data platform of the OU. Mathieu d’Aquin (@mdaquin) and the LUCERO team, Knowledge Media Institute, the Open University. LUCERO project: lucero-project.info – data.open.ac.uk
2. Linked Data: a set of principles and technologies for a Web of Data. Put the “raw” data online in a standard, web-enabled representation (RDF). Make the data Web addressable (URIs). Link with other data.
4. So Linked Data for the OU? Currently: OU public data sit in different systems – hard to discover, obtain, integrate by users. Exposed as linked data, our data interlink with each other and the external world, and become part of the “global data space” on the Web. [Diagram labels: ORO, OpenLearn Content, Archive of Course Material, Library’s Catalogue of Digital Content, A/V Material, Podcasts, iTunesU, Data from Research Outputs, RAE, DBPedia, geonames, data.gov.uk, BBC, DBLP]
5. Why is it important? The OU was the first university to expose its data as linked data: http://data.open.ac.uk. Now widely recognized as a critical step forward for the HE sector in the UK (and worldwide). Favors transparency and reuse of data, both externally and internally. Reduces the cost of dealing with our own public data: integration and reuse by design. Enables new kinds of applications, and makes the ones that are already feasible more cost effective. At least 3 other UK universities have now followed our example: http://data.online.lincoln.ac.uk/, http://data.ox.ac.uk/, http://data.southampton.ac.uk/. Others in other countries are setting up similar initiatives.
6. “if you are working in an IT department within a University you better read this report, as soon your department will need to be making these same decisions.” David Flanders, JISCExpo Programme Manager, http://code.google.com/p/jiscexpo/wiki/luceroproject#Site_Visit_Report
7. The data.open.ac.uk Stack Applications Institutional repository data Research Data (Arts) Organizational infrastructure Technical infrastructure
9. Technological principle: Everything has a URI. Examples: http://data.open.ac.uk/course/m366 – the course M366; http://data.open.ac.uk/oro/21166 – an article in ORO; http://data.open.ac.uk/page/person/ext-911ee9dfa3db572830b00bd8a9983e39 – a person, who authored the article above; http://xmlns.com/foaf/0.1/Person – the type person; http://purl.org/dc/terms/creator – the property that links an author to an article
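The URI patterns on this slide can be sketched as small minting functions. This is only an illustration inferred from the two example URIs shown (course code lower-cased, ORO item identified by its number); the function names `course_uri` and `oro_uri` are hypothetical and are not the platform's actual URI creation rules.

```python
# Hypothetical sketch: mint URIs following the patterns visible on the slide.
BASE = "http://data.open.ac.uk"


def course_uri(code: str) -> str:
    """Build a course URI; the slide's example shows M366 as .../course/m366."""
    return f"{BASE}/course/{code.strip().lower()}"


def oro_uri(item_id: int) -> str:
    """Build the URI of an ORO item from its numeric identifier."""
    return f"{BASE}/oro/{int(item_id)}"


print(course_uri("M366"))
print(oro_uri(21166))
```

Keeping the rules in one place like this means every dataset that mentions the same course or article ends up pointing at the same identifier.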
10. Technological principle: Content negotiation. Accept: text/html or Accept: application/rdf+xml. RDF/XML for http://data.open.ac.uk/oro/9719:
    <?xml version="1.0" encoding="UTF-8"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <rdf:Description rdf:about="http://data.open.ac.uk/oro/9719">
        <label xmlns="http://www.w3.org/2000/01/rdf-schema#" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Aptamers directed to MUC1</label>
        <authorList xmlns="http://purl.org/ontology/bibo/" rdf:resource="http://data.open.ac.uk/oro/9719#authors"/>
        <title xmlns="http://purl.org/dc/terms/" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Aptamers directed to MUC1</title>
        <abstract xmlns="http://purl.org/ontology/bibo/" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Aptamers against the glycosylated form of MUC1 are described, along with their use in treatment and diagnosis of conditions associated with elevated production of MUC1.</abstract>
        <isPartOf xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/oro/repository"/>
        <status xmlns="http://purl.org/ontology/bibo/" rdf:resource="http://purl.org/ontology/bibo/status/peerReviewed"/>
        <status xmlns="http://purl.org/ontology/bibo/" rdf:resource="http://purl.org/ontology/bibo/status/published"/>
        <creator xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/person/ext-07bcb3718cb0de7883dc7b8fde7e283d"/>
        <creator xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/person/b7fc322e6386517c5ebef3c09d13bd9e"/>
        <creator xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/person/ext-7c8b5252e28115f91640559c2fe64ca3"/>
        <date xmlns="http://purl.org/dc/terms/">2007-11-15</date>
        <rdf:type rdf:resource="http://purl.org/ontology/bibo/Article"/>
        <rdf:type rdf:resource="http://purl.org/ontology/bibo/Patent"/>
      </rdf:Description>
    </rdf:RDF>
11. RDF: the same RDF/XML description of http://data.open.ac.uk/oro/9719 as on slide 10 (label, title, abstract, author list, three creators, date, status and types).
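To see how such a description reads as subject/predicate/object triples, here is a minimal sketch using only Python's standard library. It assumes the flat `rdf:Description` layout shown on the slides and is not a general RDF/XML parser (a library such as RDFLib would be the usual choice). The `SAMPLE` string is an abridged copy of the slide's data and the `triples` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Abridged copy of the description of oro/9719 shown on the slides.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://data.open.ac.uk/oro/9719">
    <title xmlns="http://purl.org/dc/terms/">Aptamers directed to MUC1</title>
    <creator xmlns="http://purl.org/dc/terms/"
             rdf:resource="http://data.open.ac.uk/person/b7fc322e6386517c5ebef3c09d13bd9e"/>
    <date xmlns="http://purl.org/dc/terms/">2007-11-15</date>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Return (subject, predicate, object) tuples for flat rdf:Description blocks."""
    root = ET.fromstring(rdf_xml)
    out = []
    for desc in root.findall(f"{{{RDF}}}Description"):
        subject = desc.get(f"{{{RDF}}}about")
        for prop in desc:
            # ElementTree names tags "{namespace}local"; join them into one URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            obj = prop.get(f"{{{RDF}}}resource")
            if obj is None:  # no rdf:resource attribute: the value is a literal
                obj = (prop.text or "").strip()
            out.append((subject, predicate, obj))
    return out


for triple in triples(SAMPLE):
    print(triple)
```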
12. By the way… On Study at the OU: http://data.open.ac.uk/course/m366 – if HTML requested, goes to http://www3.open.ac.uk/study/undergraduate/course/m366.htm Try http://www3.open.ac.uk/study/undergraduate/course/m366.rdf
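Content negotiation comes down to sending a different Accept header to the same URI. The sketch below builds such requests with Python's standard library; the helper names `negotiated_request` and `fetch` are hypothetical, and whether the URIs from these slides still resolve today is not something the slides can confirm, so the network call is defined but left uncalled.

```python
import urllib.request


def negotiated_request(uri, media_type):
    """Build an HTTP GET request that asks for one specific representation."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


def fetch(uri, media_type, timeout=10):
    """Perform the request; returns (final URL after redirects, content type, body)."""
    req = negotiated_request(uri, media_type)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.geturl(), resp.headers.get("Content-Type"), resp.read()


# Same resource, two representations: only the Accept header differs.
html_req = negotiated_request("http://data.open.ac.uk/course/m366", "text/html")
rdf_req = negotiated_request("http://data.open.ac.uk/course/m366", "application/rdf+xml")
print(html_req.get_header("Accept"), "|", rdf_req.get_header("Accept"))
```

Calling `fetch(uri, "text/html")` against a live server would follow the redirect described on the slide and return the final page URL as the first element of the tuple.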
13. Technological principle: link… also to external datasets. Using URIs makes pieces of data directly addressable and linkable on the Web, independently of where the data is: http://data.open.ac.uk/course/m366 isAvailableIn http://sws.geonames.org/458258/ (Republic of Latvia); http://data.open.ac.uk/organization/the_open_university sameAs http://education.data.gov.uk/doc/school/133849; http://data.open.ac.uk/location/building/mbbn (Berrill Building North) postcode http://data.ordnancesurvey.co.uk/id/postcodeunit/MK76AA. And others can link to our data…
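The three links on this slide can be written down as plain triples and filtered for those that leave data.open.ac.uk. This is a sketch under assumptions: the slide abbreviates the properties, so the full predicate URIs below are guesses (`isAvailableIn` and `postcode` are taken from queries elsewhere in this deck, `sameAs` is assumed to be `owl:sameAs`), and the helper names are hypothetical.

```python
from urllib.parse import urlparse

# Assumed expansions of the property names abbreviated on the slide.
IS_AVAILABLE_IN = "http://data.open.ac.uk/saou/ontology#isAvailableIn"
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
POSTCODE = "http://data.ordnancesurvey.co.uk/ontology/postcode/postcode"
IS_PART_OF = "http://purl.org/dc/terms/isPartOf"

TRIPLES = [
    ("http://data.open.ac.uk/course/m366", IS_AVAILABLE_IN,
     "http://sws.geonames.org/458258/"),
    ("http://data.open.ac.uk/organization/the_open_university", SAME_AS,
     "http://education.data.gov.uk/doc/school/133849"),
    ("http://data.open.ac.uk/location/building/mbbn", POSTCODE,
     "http://data.ordnancesurvey.co.uk/id/postcodeunit/MK76AA"),
    # An internal link (from the ORO example) for contrast.
    ("http://data.open.ac.uk/oro/9719", IS_PART_OF,
     "http://data.open.ac.uk/oro/repository"),
]


def external_links(triples, home="data.open.ac.uk"):
    """Keep triples whose object is a URI hosted somewhere other than `home`."""
    return [t for t in triples if urlparse(t[2]).hostname not in (None, home)]


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


print(to_ntriples(external_links(TRIPLES)))
```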
14. SPARQL: the “SQL” of RDF and linked data. Fits the graph data model of RDF. Select [variables: ?x ?name, etc.] From [graph, or all graphs if nothing] Where [triple patterns and filters] Order by, limit, offset, etc. SPARQL protocol: simply based on HTTP. A SPARQL endpoint is a URL that takes a “query” parameter and returns results in the SPARQL XML format. See http://data.open.ac.uk
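The SPARQL XML results format is simple enough to read with a standard XML parser. The sketch below turns a result document into a list of dictionaries, one per row. The `SAMPLE` document is hand-written for illustration in the W3C results format (it is not captured output from data.open.ac.uk), and `parse_results` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hand-written example response with one variable and one row.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="course"/></head>
  <results>
    <result>
      <binding name="course"><uri>http://data.open.ac.uk/course/m366</uri></binding>
    </result>
  </results>
</sparql>"""


def parse_results(xml_text):
    """Return one dict per result row, mapping variable name to its value."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            value = binding[0]  # the <uri>, <literal> or <bnode> child element
            row[binding.get("name")] = value.text or ""
        rows.append(row)
    return rows


print(parse_results(SAMPLE))
```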
15. SPARQL: example queries Courses available in Nigeria select distinct ?course where {?course <http://data.open.ac.uk/saou/ontology#isAvailableIn> <http://sws.geonames.org/2328926/>. ?course a <http://purl.org/vocab/aiiso/schema#Module>} http://data.open.ac.uk/query?query=select%20distinct%20%3Fcourse%20where%20{%3Fcourse%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fsaou%2Fontology%23isAvailableIn%3E%20%3Chttp%3A%2F%2Fsws.geonames.org%2F2328926%2F%3E.%20%3Fcourse%20a%20%3Chttp%3A%2F%2Fpurl.org%2Fvocab%2Faiiso%2Fschema%23Module%3E}
17. SPARQL: example queries Video podcasts related to postgraduate courses in computing select ?x ?t where { ?c <http://purl.org/dc/terms/subject> <http://data.open.ac.uk/topic/computing>. ?c <http://data.open.ac.uk/saou/ontology#courseLevel> <http://data.open.ac.uk/saou/ontology#postgraduate>. ?x <http://data.open.ac.uk/podcast/ontology/relatesToCourse> ?c. ?x <http://purl.org/dc/terms/title> ?t. ?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://data.open.ac.uk/podcast/ontology/VideoPodcast>} http://data.open.ac.uk/query?query=select%20%3Fx%20%3Ft%0Awhere%20{%0A%20%20%20%3Fc%20%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Fsubject%3E%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Ftopic%2Fcomputing%3E.%0A%20%20%20%3Fc%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fsaou%2Fontology%23courseLevel%3E%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fsaou%2Fontology%23postgraduate%3E.%0A%20%20%20%3Fx%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fpodcast%2Fontology%2FrelatesToCourse%3E%20%3Fc.%0A%20%20%20%3Fx%20%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Ftitle%3E%20%3Ft.%0A%20%20%20%3Fx%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fpodcast%2Fontology%2FVideoPodcast%3E%0A}&limit=0
18. SPARQL: example queries Things related to “earthquake” select ?c ?desc where { ?c <http://purl.org/dc/terms/description> ?desc . { {?c <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://data.open.ac.uk/openlearn/ontology/OpenLearnUnit>} UNION {?c <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://data.open.ac.uk/podcast/ontology/VideoPodcast>} } FILTER regex(str(?desc), "earthquake", "i" )} http://data.open.ac.uk/query?query=select%20%3Fc%20%3Fdesc%20where%7B%0A%3Fc%20%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Fdescription%3E%20%3Fdesc%20.%0A%7B%7B%3Fc%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E%0A%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fopenlearn%2Fontology%2FOpenLearnUnit%3E%7D%0AUNION%0A%7B%3Fc%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E%0A%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fpodcast%2Fontology%2FVideoPodcast%3E%7D%7D%0AFILTER%20regex(str(%3Fdesc)%2C%20%22earthquake%22%2C%20%22i%22%20)%0A%7D&limit=0
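The long URLs on these slides are just the endpoint address plus the percent-encoded query. A minimal sketch of building them in Python follows, using the “courses available in Nigeria” query and the endpoint address from the slides; the helper name `query_url` is hypothetical.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

ENDPOINT = "http://data.open.ac.uk/query"

NIGERIA_COURSES = (
    "select distinct ?course where {"
    "?course <http://data.open.ac.uk/saou/ontology#isAvailableIn> "
    "<http://sws.geonames.org/2328926/>. "
    "?course a <http://purl.org/vocab/aiiso/schema#Module>}"
)


def query_url(sparql, endpoint=ENDPOINT):
    """Put the SPARQL text in the 'query' parameter, percent-encoding it."""
    # quote_via=quote encodes spaces as %20 (as on the slides) rather than '+'.
    return endpoint + "?" + urlencode({"query": sparql}, quote_via=quote)


url = query_url(NIGERIA_COURSES)
print(url)

# Decoding the URL again gives back the original query text.
assert parse_qs(urlsplit(url).query)["query"][0] == NIGERIA_COURSES
```

Unlike the hand-pasted URLs on the slides, this also encodes the curly braces, which is the safer form for HTTP clients that reject unencoded braces.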
19. [Architecture diagram of the data pipeline. Stage labels: Collect, Extract, Link, Store, Expose. Component labels: Scheduler; RSS Updater (ORO, podcast RSS feed); XML Updater (Lib, courses, loc); New items, Obsolete items; RSS Extractor, RDF Extractor; URI creation rules; RDF file (add), RDF file (delete); RDF Cleaner, Cleaning rules; Entity Name System; Ontologies; Triple Store with Delete (1), Add (2); SPARQL endpoint; Web Server, URL redirection rules; Index, Search; Planning + Logging. Legend: generic process, dataset specific process (each dataset).]
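The diagram labels “New items”, “Obsolete items”, “Delete (1)” and “Add (2)” suggest an incremental update in which removals are applied before additions. The sketch below is one possible reading of that step, not the LUCERO implementation: the store is a plain dictionary standing in for the triple store, and `plan_update`, `apply_update` and `extract` are hypothetical names.

```python
def plan_update(previous_ids, current_ids):
    """Compare two snapshots of a source and list obsolete and new item ids."""
    previous, current = set(previous_ids), set(current_ids)
    return {
        "delete": sorted(previous - current),  # obsolete items
        "add": sorted(current - previous),     # new items
    }


def apply_update(store, plan, extract):
    """Apply deletions first (1), then additions (2), as numbered in the diagram.

    `store` maps an item id to its extracted triples; `extract` is whatever
    dataset-specific function turns an item id into triples.
    """
    for item_id in plan["delete"]:
        store.pop(item_id, None)
    for item_id in plan["add"]:
        store[item_id] = extract(item_id)
    return store


store = {"a": ["triples for a"], "b": ["triples for b"]}
plan = plan_update(store.keys(), ["b", "c"])
apply_update(store, plan, extract=lambda item_id: [f"triples for {item_id}"])
print(plan, sorted(store))
```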
27. [Workflow diagram. Steps: Define URI Scheme, Data Modeling, Validation, Development of Extractor, URI Creation Rules Definition, Deployment. Actors: Lucero Core Team, Lucero members, Data Owner, Lucero KMi Team.]
28. Datasets. Already “officially” in place: ORO: more than 18,000 publications from OU researchers; Podcasts: 2,500 audio and video tracks from podcast.open.ac.uk, linked to the related courses; Study at the OU: more than 600 live module descriptions; OpenLearn: more than 550 units of course material; KMi Staff and Planet newsletter. Currently being processed: OU buildings in MK and regional centers; Library Catalogue; YouTube channel; Old Courses; “Reading Experience Database” project; People Profiles.
30. Building applications with Linked Data. Everything is based on HTTP/XML. In principle, just need a Web connection… Libraries are available in many languages to manipulate RDF data: Java: Jena (http://openjena.org/); PHP: ARC2 (https://github.com/semsol/arc2); Python: RDFLib (http://www.rdflib.net/); …
31. Example: Accessing data.open.ac.uk with PHP/ARC2
    <?php
    include_once("arc2/ARC2.php");

    // Declare the SPARQL endpoint
    $config = array('remote_store_endpoint' => 'http://data.open.ac.uk/query');
    $store = ARC2::getRemoteStore($config);

    // Execute a SPARQL query: all distinct postcodes in the data
    $postcodesq = 'select distinct ?p where {[] <http://data.ordnancesurvey.co.uk/ontology/postcode/postcode> ?p.}';
    $rows = $store->query($postcodesq, 'rows');

    // Display the results, one postcode per line
    foreach ($rows as $row) {
        echo $row['p'] . "<br/>";
    }
32. Applications. For education: mobile podcast explorer, podcast explorer on TV; OU Building Map, OU location tracker (cf. foursquare); OU Expert Search; connecting courses/OpenLearn to relevant podcasts; OU Course Profile Facebook app using the list of courses, “Study Buddy” app connecting Facebook users to relevant courses. For research: display connections in a research community; research data/impact analysis; connecting research datasets to external data.
36. Example application: Expert Search using publication information and connecting to contact information within the OU
37. Example application: Explore Information about a person in the “Reading Experience Database” based on data provided by DBPedia (Linked Data version of Wikipedia) New ways to look at humanities research data
39. The future More data… always more data More links, especially to external entities BBC Government agencies Other universities More applications: Integration into main OU websites (e.g., study at the OU) Integration into common OU applications (people profile, Facebook course profile, etc.) Support for common OU processes (REF audit, course recommendation, providing resources to AL and lecturers) Connecting to other Universities Many other universities in the UK and abroad are making the move to linked data (see linkeduniversities.org) Linked data has the potential to create connections across institutions, a data-based network on higher education course providers
40. Conclusion. Linked data is more than an emerging, academic trend. data.open.ac.uk and linked data in general are fast becoming very valuable resources for developers, internally and externally. We are very proud to have been the first university to really deploy a linked data platform. It needs to be sustained and to evolve as a core service at the OU… … and as a key component of the Web of University Linked Data.
41. Thank You Salman Elahi ((Ex)-Dev) Carlo Allocca (Dev) Jane Whild (Admin) Fouad Zablith (Dev) KMi Andriy Nikolov (linking) Enrico Motta (SGP) Mathieu d’Aquin (PD) Arts Suzanne Duncanson-Hunter John Wolfe Paul Lawrence Richard Nurse ((ex-)PM) Owen Stephens (PM) Stuart Brown Com./ Student Comp. Services Data Owners Non Scantlebury Library Specialists Arts Specialists OU Library
Editor's Notes
Usual pitch: - data on the web = every piece of data is web addressable, so data across different places/stores/systems become linkable: the Web = 1 data space