This document discusses the creation of a linked data dataset for Madrid's public transport authority (CRTM) to make their transport data more accessible and reusable. It outlines the motivation and benefits of open transport data, reviews existing methods of publishing open data, and proposes publishing CRTM's data as linked open data using semantic web standards to enable new applications and value-added services by combining the transport data with other public datasets. The methodology describes transforming CRTM's static and real-time transport datasets into RDF and providing SPARQL and SPARQL-Stream endpoints to access the data. Examples demonstrate sample URIs, queries to retrieve stop points, and visualizations of the linked data.
A Linked Data Dataset for Madrid Transport Authority's Datasets
1. A Linked Data Dataset for Madrid Transport Authority’s Datasets
XI Congreso de Ingeniería del Transporte (CIT 2014)
Santander, 11/06/2014
Oscar Corcho, Luis Criado, Javier Chamorro
ocorcho@fi.upm.es, {luis.criado,javier.chamorro}@crtm.es
@ocorcho
https://www.slideshare.com/ocorcho
Corcho, Criado, Chamorro - A Linked Data Dataset for Madrid Transport Authority’s Datasets
2. Motivation: open data for what?
• Many potential (re)uses of open transport data
• Provide travel information to people
• Allow better multimodal route planning
• Facilitate public transport management
• …
• Accessibility
• Which metro entrances are accessible to wheelchair users?
• At which bus stops is it safer and more convenient for a wheelchair user to wait?
• Is there an accessible parking space near a bus stop?
• etc.
3. Legal framework and open data initiatives. Europe and Spain
• Open Access Initiative (2001)
• Scientific information on the Web; > 510 organisations
• Aarhus Convention (1998)
• Right of participation and access; 41 countries and the EU
• PSI Directive
• Re-use of public sector information (2003/98/EC)
• Convention on Access to Official Documents (2009)
• Signed by 12 countries
• Belgium, Finland, Norway, Sweden, Hungary, Estonia, Lithuania, Slovenia, Georgia, Montenegro, Serbia and Macedonia
• Ley 37/2007. Re-use of public sector information
• Ley 11/2007. Citizens' access to public services, and right to quality of services
• RD 4/2010 Esquema Nacional de Interoperabilidad (National Interoperability Framework)
• Open standards
• Principle of technological neutrality
• Open source software
• RD 1495/2011 Implements Ley 37/2007
• Norma Técnica de Interoperabilidad (Technical Interoperability Standard; 19/02/2013, BOE 4/3/2013)
Adapted from Antonio Rodríguez Pascual (IGN)
4. Is there any open transport data already?
We are surrounded by it
5. Open data and how it is published
• 1) On notice boards
• For those who have a lot of free time
• Or those who are there at the right moment in time
Adapted from Antonio Rodríguez Pascual (IGN)
6. Open data and how it is published
• 2) In Web pages and mobile apps
• For people
Adapted from Antonio Rodríguez Pascual (IGN)
7. Open data and how it is published
• 2) In Web pages and mobile apps
• For people
Adapted from Antonio Rodríguez Pascual (IGN)
8. Open data and how it is published
• 3) As Web files
• So that they can be loaded by humans in their information systems (XML, HTML, CSV, etc.)
• Hopefully it is not a scanned PDF
Adapted from Antonio Rodríguez Pascual (IGN)
9. Open data and how it is published
• 4) Via Web services
• For humans and machines
• It allows generating added-value services
• And can be integrated in the application business logic
Adapted from Antonio Rodríguez Pascual (IGN)
10. Is there any open transport data already?
We are surrounded by it, however…
IT IS NOT EASY TO REUSE YET
11. We may need to move to 4- and 5-star Linked Data
12. And better if they are all mixed together
How? Using Linked Data techniques
13. What can we do with such Linked Open Transport Data?
• Calculate accessible routes
• Combined with geographical data (IGN)
• Which stop should I use if I have mobility problems?
• Commercial routes by bus
• Combined with Madrid’s shop census (from Ayto. Madrid)
• Geomarketing decisions for entrepreneurs
• Where should I open my shop? Based on the combination of the number of travellers per stop, demographic data, data about other businesses and shops around, etc.
• Personalised offers to travellers
• With real-time data and data about consumption patterns (e.g., credit card transactions)
• …
14. Some people have started working on it
15. Transforming CRTM data
Work in progress!!
[Figure: static data (bus, metro, …, train stops) is transformed into RDF data, which is then used by users and applications]
Sample static data record:
<Stop>
  <IdStop>28</IdStop><PMV>61247</PMV>
  <Name>P CASTELLANA-JUZGADOS</Name>
  <PostalAdress>P de la Castellana, 187</PostalAdress>
  <CoordinateX>-3.68972639781606</CoordinateX>
  <CoordinateY>40.4650604583015</CoordinateY>
</Stop>
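As an illustration of the transformation step, the sketch below turns the sample <Stop> record into N-Triples using only the Python standard library. It is not the CRTM pipeline. The stop URI pattern (STOP_BASE) is hypothetical, since the slides only show URIs for lines, and the choice of rdfs:label and the W3C WGS84 geo properties is an assumption. Only the naptan StopPoint class comes from the queries shown later in the deck.

```python
import xml.etree.ElementTree as ET

# Sample record copied from the slide.
STOP_XML = """<Stop>
<IdStop>28</IdStop><PMV>61247</PMV>
<Name>P CASTELLANA-JUZGADOS</Name>
<PostalAdress>P de la Castellana, 187</PostalAdress>
<CoordinateX>-3.68972639781606</CoordinateX>
<CoordinateY>40.4650604583015</CoordinateY>
</Stop>"""

# ASSUMPTION: hypothetical URI pattern for stops (not taken from the slides).
STOP_BASE = "http://example.org/crtm/stop/"
# ASSUMPTION: property choices below are illustrative, not the CRTM model.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
# Class used in the SPARQL queries of the deck.
STOP_POINT = "http://transport.data.gov.uk/def/naptan/StopPoint"


def literal(value):
    """Serialise a plain string as an N-Triples literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def stop_to_ntriples(xml_text):
    """Convert one <Stop> element into a list of N-Triples lines."""
    stop = ET.fromstring(xml_text)
    subject = f"<{STOP_BASE}{stop.findtext('IdStop')}>"
    triples = [
        (subject, f"<{RDF_TYPE}>", f"<{STOP_POINT}>"),
        (subject, f"<{RDFS_LABEL}>", literal(stop.findtext("Name"))),
        # In the sample, CoordinateX is the longitude and CoordinateY the latitude.
        (subject, f"<{GEO}long>", literal(stop.findtext("CoordinateX"))),
        (subject, f"<{GEO}lat>", literal(stop.findtext("CoordinateY"))),
    ]
    return [f"{s} {p} {o} ." for s, p, o in triples]
```

Calling print("\n".join(stop_to_ntriples(STOP_XML))) prints four triples for stop 28, one per line.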
16. Methodology for Linked Data Generation
Villazón-Terrazas, B., Vilches-Blázquez, L.M., Corcho, O., & Gómez-Pérez, A. (2011). Methodological guidelines for publishing government linked data. In Wood, D. (ed.), Linking Government Data (pp. 27-49). Springer-Verlag.
Hyland, B., Atemezing, G., & Villazón-Terrazas, B. (2014). Best Practices for Publishing Linked Data. W3C Working Group Note. http://www.w3.org/TR/ld-bp/
1. Prepare Stakeholders
2. Select a dataset
3. Model the data
4. Specify an appropriate open data license
5. Create good URIs for Linked Data
6. Use standard vocabularies
7. Convert data to a Linked Data representation
8. Provide machine access to data
9. Announce the new data sets on an authoritative domain
10. Recognize the social contract
17. Some sample URIs
• http://crtm.linkeddata.es/recurso/urbanismo-infraestructuras/transporte/MetroLigero/Linea/1
• curl -L -H "Accept: text/html" http://crtm.linkeddata.es/recurso/urbanismo-infraestructuras/transporte/MetroLigero/Linea/1
• curl -L -H "Accept: application/rdf+xml" http://crtm.linkeddata.es/recurso/urbanismo-infraestructuras/transporte/MetroLigero/Linea/1
• curl -L -H "Accept: text/turtle" http://crtm.linkeddata.es/recurso/urbanismo-infraestructuras/transporte/MetroLigero/Linea/1
• http://crtm.linkeddata.es/recurso/urbanismo-infraestructuras/transporte/Cercanias/Linea/C-2
• http://crtm.linkeddata.es/recurso/urbanismo-infraestructuras/transporte/BusInterurbano/Linea/180
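The curl calls above rely on HTTP content negotiation: the same resource URI returns HTML, RDF/XML or Turtle depending on the Accept header. A minimal Python equivalent, using only the standard library, might look like the sketch below. The helper names (build_request, fetch, demo) are made up for this example, and whether the server is still reachable is not something the slides guarantee.

```python
import urllib.request

# Resource URI taken from the slide.
LINE_URI = ("http://crtm.linkeddata.es/recurso/urbanismo-infraestructuras/"
            "transporte/MetroLigero/Linea/1")


def build_request(uri, media_type):
    """Build a GET request asking for one representation of a resource."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


def fetch(uri, media_type, timeout=10):
    """Dereference the URI; urlopen follows redirects, much like curl -L."""
    request = build_request(uri, media_type)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.headers.get("Content-Type"), resp.read()


def demo():
    """Request the three representations used in the curl examples."""
    for media_type in ("text/html", "application/rdf+xml", "text/turtle"):
        try:
            content_type, body = fetch(LINE_URI, media_type)
            print(media_type, "->", content_type, len(body), "bytes")
        except OSError as err:  # URLError and timeouts are OSError subclasses
            print(media_type, "-> request failed:", err)
```

Calling demo() issues the three requests; build_request can be inspected on its own without touching the network.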
18. Some examples of static and dynamic datasets
• SPARQL and SPARQL-Stream endpoints
• http://crtm.linkeddata.es/sparql (static data)
• http://streams.linkeddata.es/ (real-time data)
• A couple of simple questions:
• Give me all the stop points
  select distinct ?x where {
    ?x a <http://transport.data.gov.uk/def/naptan/StopPoint>
  }
• Give me all the stop points that belong to Metro Ligero line 1
  prefix estrn: <http://vocab.linkeddata.es/datosabiertos/def/urbanismo-infraestructuras/transporte#>
  select distinct ?x where {
    ?y a estrn:LineaMetroLigero .
    ?y estrn:parada ?x .
    ?x a <http://transport.data.gov.uk/def/naptan/StopPoint> .
    FILTER(?y = <http://crtm.linkeddata.es/recurso/urbanismo-infraestructuras/transporte/MetroLigero/Linea/1>)
  }
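A query such as "give me all the stop points" can also be sent programmatically. The sketch below follows the standard SPARQL Protocol (an HTTP GET with a query parameter) and parses the SPARQL JSON results format. It assumes, without confirmation from the slides, that the static endpoint accepts GET requests and can return application/sparql-results+json. The function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

# Static-data endpoint from the slide.
ENDPOINT = "http://crtm.linkeddata.es/sparql"

# First query from the slide.
STOP_POINTS_QUERY = """
SELECT DISTINCT ?x WHERE {
  ?x a <http://transport.data.gov.uk/def/naptan/StopPoint>
}
"""


def build_query_url(endpoint, query):
    """Encode the query as the `query` parameter of a SPARQL Protocol GET."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def extract_bindings(results_json, variable):
    """Pull the values bound to one variable out of SPARQL JSON results."""
    doc = json.loads(results_json)
    return [row[variable]["value"]
            for row in doc["results"]["bindings"]
            if variable in row]


def run_query(endpoint, query, variable="x", timeout=30):
    """Send the query and return the values bound to `variable`."""
    # ASSUMPTION: the endpoint supports JSON results via the Accept header.
    request = urllib.request.Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_bindings(resp.read().decode("utf-8"), variable)
```

If the endpoint is reachable, run_query(ENDPOINT, STOP_POINTS_QUERY) would return the stop point URIs as a list of strings. The Metro Ligero query can be passed in the same way.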
19. Some sample visualisations
http://girtab.dia.fi.upm.es:8180/crtm/