Linking the world with Python and Semantics


  1. Linking the world with Python and Semantics
      @tati_alchueyr (Globo.com), 25th July 2012, FISL 13
  2. how do you store your data?
  3. how do you store your data?
      [ ] data... what data?!
      [ ] raw files (csv, json, xml)
      [ ] database (e.g. relational database)
      [ ] graphs (e.g. Resource Description Framework)
      [ ] other...
  4. how do you search for...?
      - Apartments near English-Portuguese bilingual childcare in Rio de Janeiro state.
      - ERP service providers with offices in São Paulo and New York.
      - Researchers working on artificial intelligence in the Southeast of Brazil.
      - GNU GPL software for image processing developed from 2009 to 2010 with Brazilian developers among its authors.
  8. what do these (^) have in common?
  9. linked open data in 2007
  10. linked open data in 2008
  11. linked open data in 2009
  12. linked open data in 2011
  13. traditional RDBMS
  14. linked data graph
  15. linked data modelling
  16. modelling
  17. modelling
  18. querying RDB
          SELECT bookID, authorName
          FROM books, authors
          WHERE books.aid = authors.aid
            AND books.isbn = '006251587X';
  19. querying RDF
          SELECT ?authName ?authEmail
          WHERE {
            <amazon:book#006251587X> <amazon:hasAuthor> <foaf:name#TimBerners-Lee> .
            <foaf:name#TimBerners-Lee> <foaf:name> ?authName .
            <foaf:name#TimBerners-Lee> <foaf:email> ?authEmail
          }
  20. globo.com developers before using web semantics
  21. globo.com developers while learning web semantics (?w ?t ?f)
  22. globo.com developers after using web semantics
  23. sample of hard-to-test code
  24. approach 1 # queries isolation
  25. approach 2 # data as object DAO
  26. Y U NO make SPARQL queries?!
  27. Y U NO make data access easy?!
  28. Y U NO make things testable?!
  29. product developers evaluating web semantics
  30. fact 1: we don't have an out-of-the-box solution
  31. fact 2: but we do have some options
  32. some options
      #1: create a solution from scratch
      #2: study existing solutions and then
          [ ] contribute to them
          [ ] develop on top of them
          [ ] goto #1
  33. the final decision is not only ours
  34. but we chose to start from #2
      #2: study existing solutions and then (...)
  35. ok, lmgfy
  36. a few results from google
      ActiveRDF, PyRdfa, active-semantic, pysparql, Django4Store, RDFAlchemy, Django-RDF, RdfLib, Django-RDFAlchemy, Redland, Djubby, semantic-django, EasyRDF, SPARQLWrapper, Jena, FuXi, Sparrow, Oort, Sparta, Pymantic, SuRF
  37. click to know more
      (same tool list as slide 36)
  38. {?project :by_author ?author . ?author :works_at :globocom . }
      (same tool list as slide 36)
  39. {?project :use_language :python . }
      (same tool list as slide 36)
  40. {?project :use_language :python ; :last_commit ?commit . FILTER (?commit >= "2011-12-01"^^xsd:date) }
      (same tool list as slide 36)
  41. relation between these tools
  42. team filtering
      (same tool list as slide 36)
  43. SPARQLWrapper
      problem: list all predicates of a class
          from SPARQLWrapper import SPARQLWrapper, JSON

          # List all predicates of dbonto:Band
          # (OPTION (TRANSITIVE, ...) is a Virtuoso extension, not standard SPARQL)
          query = """
          PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
          SELECT DISTINCT ?subject
          FROM <http://dbpedia.org>
          WHERE {
              ?subject rdfs:domain ?object .
              <http://dbpedia.org/ontology/Band> rdfs:subClassOf ?object
                  OPTION (TRANSITIVE, t_distinct, t_step('step_no') as ?n, t_min (0) ).
          }"""

          # http://live.dbpedia.org/sparql also appears on the slide as an endpoint
          sparql = SPARQLWrapper("http://dbpedia.org/sparql")
          sparql.setQuery(query)
          sparql.setReturnFormat(JSON)
          results = sparql.query().convert()

          for result in results["results"]["bindings"]:
              print(result["subject"]["value"])
  44. SPARQLWrapper
      slide labels: abstract endpoint, returns dict
      (same code as slide 43)
  45. SPARQLWrapper: ok, not different from what we have...
  46. SPARQLWrapper: just a wrapper around a SPARQL server, well tested ;)
  47. SPARQLWrapper
      problem: list all subjects given ?p ?o
          from SPARQLWrapper import SPARQLWrapper, JSON

          # List all instances (e.g. bands) with genre Metal
          query = """
          PREFIX db: <http://dbpedia.org/resource/>
          PREFIX dbonto: <http://dbpedia.org/ontology/>
          SELECT DISTINCT ?who
          FROM <http://dbpedia.org>
          WHERE {
              ?who dbonto:genre db:Metal .
          }
          """

          sparql = SPARQLWrapper("http://dbpedia.org/sparql")
          sparql.setQuery(query)
          sparql.setReturnFormat(JSON)
          results = sparql.query().convert()

          for result in results["results"]["bindings"]:
              print(result["who"]["value"])
  48. RdfLib
      problem: list all subjects given ?p ?o
          import rdflib
          # 2012-era package; current rdflib ships SPARQLStore in
          # rdflib.plugins.stores.sparqlstore
          import rdfextras.store.SPARQL

          # SPARQL endpoint setup
          endpoint = "http://dbpedia.org/sparql"
          store = rdfextras.store.SPARQL.SPARQLStore(endpoint)
          graph = rdflib.Graph(store)

          # Definitions
          genre = rdflib.URIRef("http://dbpedia.org/ontology/genre")
          metal = rdflib.URIRef("http://dbpedia.org/resource/Metal")

          # Query: every subject whose genre is Metal
          for label in graph.subjects(genre, metal):
              print(label)
  49. RdfLib
      slide labels: abstract endpoint, returns dict, namespace
          import rdflib
          import rdfextras.store.SPARQL

          # SPARQL endpoint setup
          endpoint = "http://dbpedia.org/sparql"
          store = rdfextras.store.SPARQL.SPARQLStore(endpoint)
          graph = rdflib.Graph(store)

          # Namespaces to clear up definitions
          DBONTO = rdflib.Namespace("http://dbpedia.org/ontology/")
          DB = rdflib.Namespace("http://dbpedia.org/resource/")

          # Query
          for label in graph.subjects(DBONTO.genre, DB.Metal):
              print(label)
  50. RdfLib
      slide labels: abstract endpoint, returns dict, namespace
      (same code as slide 49)
      related Graph methods: subjects, predicates, objects, subject_predicates, subject_objects, predicate_objects
  51. RdfLib
      slide labels: abstract endpoint, returns dict, namespace
      (same setup as slide 49, with the query written as a triple pattern)
          # Using triples: None acts as a wildcard
          for musician, _, _ in graph.triples((None, DBONTO.genre, DB.Metal)):
              print(musician)
  52. RdfLib
      slide labels: abstract endpoint, returns dict, namespace, query by triples
      (same code as slide 49)
  53. RdfLib
      slide labels: abstract endpoint, returns dict, namespace, query by triples, add / remove
          import rdflib
          import rdfextras.store.SPARQL

          # Fixture file (N-Triples, a subset of N3)
          graph = rdflib.Graph()
          graph.parse("fixture_genre_metal.nt", format="nt")

          # Namespace
          DBONTO = rdflib.Namespace("http://dbpedia.org/ontology/")
          DB = rdflib.Namespace("http://dbpedia.org/resource/")

          # Add nodes
          graph.add((DB.AndrewsMedina, DBONTO.genre, DB.Metal))
          graph.add((DB.Siminino, DBONTO.genre, DB.Metal))
          graph.add((DB.Herman, DBONTO.genre, DB.Metal))

          # Remove nodes
          graph.remove((DB.AndrewsMedina, DBONTO.genre, DB.Metal))
  54. RdfLib: concentrates on providing the core RDF types and interfaces, through a plugin interface
  55. RdfLib: makes testing simple, allowing fixtures from n3 files and adding or removing triples
  56. RdfLib: but... each triple query requires a new connection to the SPARQL endpoint
  57. RdfLib: therefore too many accesses to the SPARQL endpoint
  58. RdfLib: and... doesn't provide an ORM (object-relational mapping)
  59. SuRF
      slide labels: abstract endpoint, returns dict, namespace, query by triples, add / remove, ORM
          from surf import Store, Session, ns, query

          store = Store(reader='sparql_protocol',
                        endpoint='http://dbpedia.org/sparql')
          session = Session(store, {})
          session.enable_logging = False

          ns.register(db='http://dbpedia.org/resource/')
          ns.register(dbonto='http://dbpedia.org/ontology/')

          # Map an RDF class to a Python class, then filter by attribute
          MusicalArtist = session.get_class(ns.DB['MusicalArtist'])
          artistas_metal = MusicalArtist.get_by(dbonto_genre=ns.DB["Metal"])
          print(artistas_metal)
  60. SuRF
      problem: list all subjects given ?p ?o
      slide labels: ORM, composed queries
          from surf import Store, Session, ns, query

          store = Store(reader='sparql_protocol',
                        endpoint='http://dbpedia.org/sparql')
          session = Session(store, {})

          ns.register(db='http://dbpedia.org/resource/')
          ns.register(dbonto='http://dbpedia.org/ontology/')

          # Build the SELECT programmatically instead of writing SPARQL text
          query_surf = query.select("?who").distinct()
          query_surf.where(("?who", ns.DBONTO.genre, ns.DB.Metal))
          metal_bands = session.default_store.execute(query_surf)

          for band in metal_bands:
              print(band)
  61. SuRF: various approaches (ORM, programmatically)
  62. SuRF: simple ORM, no need to redeclare TTL definitions
  63. SuRF: "complex" queries using lazy evaluation
  64. SuRF: documentation & community
  65. SuRF: but... no django-style models
  66. SuRF: verbose syntax
  67. RDFAlchemy
      problem: list all subjects given ?p ?o
          from rdfalchemy.sparql import SPARQLGraph
          from rdflib import Namespace

          endpoint = "http://dbpedia.org/sparql"
          graph = SPARQLGraph(endpoint)

          DB = Namespace("http://dbpedia.org/resource/")
          DBONTO = Namespace("http://dbpedia.org/ontology/")

          metal_bands = graph.subjects(predicate=DBONTO.genre, object=DB.Metal)
          for band in metal_bands:
              print(band)
  68. RDFAlchemy
      slide labels: abstract endpoint, returns dict, namespace, query by triples, add / remove, ORM, django-like
          from rdfalchemy.sparql import SPARQLGraph
          from rdfalchemy import rdfSubject, rdfSingle
          from rdflib import Namespace

          DB = Namespace('http://dbpedia.org/resource/')
          DBONTO = Namespace("http://dbpedia.org/ontology/")
          RDFS = Namespace('http://www.w3.org/2000/01/rdf-schema#')

          endpoint = "http://live.dbpedia.org/sparql"
          graph = SPARQLGraph(endpoint)
          rdfSubject.db = graph

          # Each RDF property is declared by hand as a class attribute
          class MusicalArtist(rdfSubject):
              rdfs_label = rdfSingle(RDFS.label, 'label')
              genre = rdfSingle(DBONTO.genre, 'genre')

          metal_artists = MusicalArtist.filter_by(genre=DB.Metal)
          for band in metal_artists:
              print(band)
  69. RDFAlchemy: django-like models
  70. RDFAlchemy: simple syntax
  71. RDFAlchemy: but... non-lazy
  72. RDFAlchemy: we have to redeclare as Python classes all the data already described in TTL files
  73. semantic-django
      slide labels: abstract endpoint, returns dict, namespace, query by triples, add / remove, ORM, django-like
          # Classes similar to Django models are created from TTL
          # files (using manage.py)
          class BaseLugar(BaseEntidade):
              latitude = models.UriField()
              longitude = models.UriField()
              geonameid = models.UriField()
              tem_mapa = models.UriField()
              apelido = models.UriField()
              ImagemMapa = models.UriField()
              genero_gramatical = models.UriField()

              class Meta:
                  semantic_graph = 'http://semantica.globo.com/base/Lugar'
  74. semantic-django: https://github.com/rfloriano/semantic-django
  75. semantic-django: the dream of many product developers
  76. semantic-django: but... development has only just started
  77. study existing solutions, and now?
      [ ] contribute to them
      [ ] develop on top of them
      [ ] create a solution from scratch
      [ ] other, _________________
  78. grab your post-it, it's review time!
                         =)                =(              comments
      SuRF               shows query       no models       my favorite
      RDFAlchemy         nice models API   not lazy        my choice
      RDFlib             namespace         low layer
      semantic-django    django like       just started
      (...)
  79. any questions...? @tati_alchueyr
  80. casting by (click to know more about each meme)
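
Slides 24, 25 and 28 name two approaches (queries isolation, data as object DAO) and a goal (testable code) without showing code for them. The following is only a minimal sketch of how the two could combine, not the Globo.com implementation. `BandDAO`, `QueryRunner`, `METAL_BANDS_QUERY` and `fake_endpoint` are hypothetical names, the `example.org` URIs are made-up test data, and the result layout is the `results` / `bindings` dict used on slide 47.

```python
from typing import Callable, Dict, List

# Approach 1 (queries isolation): the SPARQL text lives in one named constant.
METAL_BANDS_QUERY = """
PREFIX db: <http://dbpedia.org/resource/>
PREFIX dbonto: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?who
FROM <http://dbpedia.org>
WHERE { ?who dbonto:genre db:Metal . }
"""

# Hypothetical type: any callable that takes a SPARQL string and returns
# a dict in the results/bindings layout shown on slide 47.
QueryRunner = Callable[[str], Dict]


class BandDAO:
    """Approach 2 (data as object): callers ask for bands, not for SPARQL."""

    def __init__(self, run_query: QueryRunner) -> None:
        self._run_query = run_query

    def metal_bands(self) -> List[str]:
        results = self._run_query(METAL_BANDS_QUERY)
        return [row["who"]["value"] for row in results["results"]["bindings"]]


def fake_endpoint(query: str) -> Dict:
    """Test double standing in for a real endpoint; the data is made up."""
    return {
        "results": {
            "bindings": [
                {"who": {"type": "uri", "value": "http://example.org/BandA"}},
                {"who": {"type": "uri", "value": "http://example.org/BandB"}},
            ]
        }
    }


dao = BandDAO(fake_endpoint)
bands = dao.metal_bands()
print(bands)
```

Because the endpoint is injected, a test can pass `fake_endpoint`, while production code could pass a small function that wraps the SPARQLWrapper calls from slide 47.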

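The "query by triples" and "add / remove" ideas from slides 51 and 53 can be tried without rdflib or a network connection. The toy class below is a sketch for illustration only: `TinyGraph` is a hypothetical name, it is not how rdflib is implemented, and it stores plain strings instead of URI objects. As on slide 51, `None` in a pattern acts as a wildcard.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


class TinyGraph:
    """Toy in-memory triple set; illustrative only, not rdflib's implementation."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, triple: Triple) -> None:
        self._triples.add(triple)

    def remove(self, triple: Triple) -> None:
        # Exact-match removal only; a missing triple is silently ignored.
        self._triples.discard(triple)

    def triples(self, pattern: Pattern) -> Iterator[Triple]:
        """Yield stored triples matching the pattern, in sorted order."""
        for triple in sorted(self._triples):
            if all(want is None or want == got
                   for want, got in zip(pattern, triple)):
                yield triple

    def subjects(self, predicate: str, obj: str) -> Iterator[str]:
        """Yield the subject of every triple with this predicate and object."""
        for subject, _, _ in self.triples((None, predicate, obj)):
            yield subject


GENRE = "dbonto:genre"
METAL = "db:Metal"

# Same add / remove sequence as slide 53, on the toy graph.
graph = TinyGraph()
graph.add(("db:AndrewsMedina", GENRE, METAL))
graph.add(("db:Siminino", GENRE, METAL))
graph.add(("db:Herman", GENRE, METAL))
graph.remove(("db:AndrewsMedina", GENRE, METAL))

metal_subjects = list(graph.subjects(GENRE, METAL))
print(metal_subjects)
```

Every call to `triples` here is a local scan. Slides 56 and 57 point out that against a remote SPARQL store each such call costs a round trip to the endpoint.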