SPARQL UniProt.RDF




    Jerven Bolleman
      Developer
      Swiss-Prot Group
      Swiss Institute of Bioinformatics




Tuesday, December 4, 2012
A few notes before we begin


     • SPARQL 1.0
        – Somewhat useful
        – Standardized in 2008
     • SPARQL 1.1
        – Very useful
        – Currently a W3C Proposed Recommendation

     • Implementations still have incompatibilities
     • and features that are not yet implemented



    © 2012 SIB



Raise your hand if you have questions




Tutorial plan


     • Set up TopBraid Composer
        – Skipped in the talk
        – On the VM
     • Gather data from the UniProt website
        – Already there.
     • Learn SPARQL

     Note: You do not need TopBraid Composer to use UniProt RDF data
     or to run SPARQL queries. You can use beta.sparql.uniprot.org
     as well.
Download and install Topbraid composer


     • Requirements
        – Sun/Oracle JVM
     • Go to
        – http://www.topquadrant.com/products/TB_download.html
        – Register
        – Select any edition, free is ok for today




Start Topbraid




Setting up a workspace for this tutorial


     • http://www.topquadrant.com/products/TB_download.html




New project
     • File > New Project > General




Gather data from uniprot.org website




                 • In the navigator select the new project you just made.




Gather data from uniprot.org website
  Right click on your new project.
  Select “Import” in the drop down menu




          • Import RDF or OWL file from the web


Using the same process, download core.owl




                 You can see an HTML view of this schema
                 ontology at
                 http://www.uniprot.org/core/




Gather data from uniprot.org website




             You can see an HTML view of this entry at
                http://www.uniprot.org/taxonomy/40674




Gather data from uniprot.org website


     • Open the mammalia.rdf file by double-clicking it




You get a very helpful dialog.
      Hit Yes.




It's SPARQLy mammal time!




Let's look at a single taxon record




Let's navigate to it in TopBraid


     • Type the URI as is, including the angle brackets




Investigate the taxon record




The “Eastern Chipmunk” in turtle




Turtle is the RDF serialization aligned with
     SPARQL

     • Shorthand to avoid typing so much
        – . ‘dot’ ends the statement
        – ; ‘semicolon’ repeats the subject
        – , ‘comma’ repeats the subject and predicate
     • Prefix
        – the part before ‘:’ abbreviates a URI




Why don’t these queries work on the web?


     • PREFIX
        – TopBraid Composer uses the prefixes defined in the
          file's “Overview” tab.
        – On the web you often have to add these yourself.

                   PREFIX :<http://purl.uniprot.org/core/>
                   SELECT ?x
                   FROM <http://purl.uniprot.org/taxonomy/>
                   WHERE {?x a :Taxon}




a = rdf:type = <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>




rdfs:subClassOf
     taxon:45474 is a more specific classification than
     taxon:13712




rank => “The level, for nomenclatural purposes, of
     a taxon in a taxonomic hierarchy”




Why learn SPARQL


     • Standardized formal query language
        – Implementation independent
           • SPARQL ➔ SQL (via R2RML)
           • SPARQL ➔ web service (via SADI)
           • SPARQL ➔ LDAP (e.g. SquirrelRDF)
           • SPARQL ➔ RDF (triplestore, e.g. OWLIM-SE)
           • SPARQL ➔ Hadoop/Hive (e.g. SHARD)
        – How you query is independent of how you store!




Apparently it helps
      kill vampires !!!




Let's learn SPARQL


     • Queries over RDF data.
       – Four basic types
          • SELECT
              – Returns “tab delimited” results
          • CONSTRUCT
              – Makes new triples
          • DESCRIBE
              – Returns all triples mentioning a resource
          • ASK
              – Returns true if anything matches

SPARQL queries: triple patterns




                 taxon:9606 rdf:type core:Taxon .




SPARQL queries: triple patterns




                 ?anyTaxon rdf:type core:Taxon .




SPARQL queries: triple patterns




          SELECT ?anyTaxon
          WHERE {
            ?anyTaxon rdf:type core:Taxon .
          }




SPARQL queries: triple patterns




                 taxon:9606 rdf:type core:Taxon .
                 taxon:9606 core:reviewed "true" .




SPARQL queries: triple patterns




                 ?anyTaxon rdf:type core:Taxon .
                 ?anyTaxon core:reviewed "true" .




SPARQL queries: triple patterns




          SELECT ?anyTaxon
          WHERE {
            ?anyTaxon rdf:type core:Taxon .
            ?anyTaxon core:reviewed "true" .
          }




SPARQL queries: triple patterns




          SELECT ?anyTaxon
          WHERE {
            ?anyTaxon rdf:type core:Taxon .
            # ?anyTaxin is a different variable from ?anyTaxon, so the two patterns are not joined
            ?anyTaxin core:reviewed "true" .
          }




SPARQL queries: triple patterns




          SELECT ?anyTaxon
          WHERE {
            ?anyTaxon rdf:type core:Taxon .
            # $anyTaxon and ?anyTaxon name the same variable
            $anyTaxon core:reviewed "true" .
          }




Let's learn SPARQL




Shorthand a = rdf:type
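
A minimal sketch of the shorthand, assuming the core: prefix expands to the UniProt core namespace shown earlier. The comment gives the long form that "a" stands for.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# Same as: ?taxon rdf:type core:Taxon .
SELECT ?taxon
WHERE {
  ?taxon a core:Taxon .
}
```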




AND join (default)




Now you type




Remember ‘;’ shortcut
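
A small sketch of the ';' shortcut inside a query, assuming the core: prefix and the core:rank property that appear elsewhere in this deck. Both patterns share the subject ?taxon.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?rank
WHERE {
  ?taxon a core:Taxon ;      # ';' keeps ?taxon as the subject
         core:rank ?rank .   # '.' ends the statement
}
```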




Two variables, one output column




Optional


     • When values may be missing
        – yet interesting when they are there
     • Use it like a subquery
     • Values bound outside stay bound inside
        – ?x ?y ?z . OPTIONAL { ?x ?b ?c }
           • the same variable ?x means the same thing
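
As a sketch (assuming the core: prefix and the core:rank property used elsewhere in this deck), the query below lists every taxon and adds its rank only where one exists. Taxa without a rank still appear, with ?rank left unbound.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?rank
WHERE {
  ?taxon a core:Taxon .
  # ?taxon is already bound here, so the OPTIONAL part only looks up its rank
  OPTIONAL { ?taxon core:rank ?rank . }
}
```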




UNION


     • Allows you to combine query patterns as an OR
       operation.
     • Joins are still from outer to inner.
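
A sketch of UNION as an OR. It assumes that core:Species exists as a rank next to core:Genus; only core:Genus appears in these slides. The outer pattern is joined with whichever branch matches.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon
WHERE {
  ?taxon a core:Taxon .
  { ?taxon core:rank core:Genus . }
  UNION
  { ?taxon core:rank core:Species . }   # core:Species is an assumed rank name
}
```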




UNION




Negation


     • When you do not want a certain category of matches.

                            SELECT ?pet
                            WHERE {
                              ?pet a pets:Friendly .
                            }




Oooops




Not exists (Negation 1)
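
A sketch of the NOT EXISTS form, using core:rank and core:Genus as they appear in this deck; the prefix expansion is an assumption. It keeps the taxa for which the inner pattern finds no match.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon
WHERE {
  ?taxon a core:Taxon .
  # Keep ?taxon only if it has no genus rank
  FILTER (NOT EXISTS { ?taxon core:rank core:Genus . })
}
```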




Minus (Negation 2)
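
A sketch of the MINUS form, assuming the core: prefix, core:rank and core:Genus as used in this deck. Solutions of the inner pattern that share a ?taxon binding are subtracted from the outer results.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon
WHERE {
  ?taxon a core:Taxon .
  # Subtract every ?taxon that has a genus rank
  MINUS { ?taxon core:rank core:Genus . }
}
```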




MINUS{} or FILTER (NOT EXISTS{})


     • What's the difference?
       – MINUS subtracts results
       – NOT EXISTS tests whether the sub-pattern can match at all.
          • Normally the faster option.




MINUS all data




FILTER (NOT EXISTS{}) no results
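
The "all data" versus "no results" contrast can be reproduced with two queries whose outer and inner patterns share no variable. This is a sketch of the usual textbook case, not necessarily the query shown in the talk.

```sparql
# MINUS variant: nothing is subtracted because no variable is shared,
# so every triple comes back.
#   SELECT * WHERE { ?s ?p ?o . MINUS { ?x ?y ?z } }

# NOT EXISTS variant: the inner pattern matches as soon as the data
# holds any triple, so the filter rejects every row.
SELECT *
WHERE {
  ?s ?p ?o .
  FILTER (NOT EXISTS { ?x ?y ?z })
}
```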




Negation option 3
       SPARQL 1.0

                 SELECT ?subject ?rank
                 WHERE {
                    ?subject core:rank ?rank .
                    OPTIONAL { ?subject core:rank core:Genus .
                               ?subject core:rank ?genus . }
                    FILTER(! BOUND(?genus))
                 }




FILTERS


     • You just saw it twice
        – Once in the !BOUND
        – Once in the NOT EXISTS

     • FILTER narrows a result set by possibly removing values
        – FILTER does not add a value to the result
     • Inside the same graph pattern, the position of a FILTER does not matter.




Filter




Filter on not in




IN
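
A sketch of IN and its negation NOT IN inside a FILTER. core:Genus comes from this deck; core:Species is an assumed second rank used only for illustration.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?rank
WHERE {
  ?taxon core:rank ?rank .
  # Keep only the listed ranks (core:Species is an assumed name)
  FILTER (?rank IN (core:Genus, core:Species))
  # Negated form: FILTER (?rank NOT IN (core:Genus, core:Species))
}
```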




FILTER on numbers


     • <
        – FILTER (1 < 2)
     • >
        – FILTER (2 > 1)
     • =
        – FILTER (1 = 1)
     • !=
        – FILTER (1 != 2)



Filters


     • ?x = ?y does casting (value conversions)
        – "1.0"^^xsd:float = "1"^^xsd:int is true
     • sameTerm(?x, ?y) does not
        – sameTerm("1.0"^^xsd:float, "1"^^xsd:int) is false
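
A sketch that puts both comparisons side by side; the variable names are made up for the example. On a conforming SPARQL 1.1 engine ?equalValue should come back true and ?identicalTerm false.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?equalValue ?identicalTerm
WHERE {
  # '=' compares numeric values after type promotion
  BIND ("1.0"^^xsd:float = "1"^^xsd:int AS ?equalValue)
  # sameTerm compares the RDF terms themselves (lexical form and datatype)
  BIND (sameTerm("1.0"^^xsd:float, "1"^^xsd:int) AS ?identicalTerm)
}
```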




FILTER on strings


     • Functions
        – STRLEN            –   ENCODE_FOR_URI
        – SUBSTR            –   CONCAT
        – UCASE             –   langMatches
        – LCASE             –   REGEX
        – STRSTARTS         –   REPLACE
        – STRENDS
        – CONTAINS          – IRI
        – STRBEFORE
        – STRAFTER

STRLEN == String Length




CONTAINS is case sensitive: is it in there?




REGEX, just like Java regex
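
A sketch that combines a few of the string functions. It assumes a core:scientificName property holding the taxon name, which is not shown in these slides, and uses "tamias" (the chipmunk genus) as the search text.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?name
WHERE {
  ?taxon core:scientificName ?name .   # assumed property name
  # CONTAINS is case sensitive; lower-casing first makes the test case insensitive
  FILTER (CONTAINS(LCASE(?name), "tamias"))
  # REGEX with the "i" flag ignores case; '^' anchors at the start
  FILTER (REGEX(?name, "^tamias", "i"))
  # STRLEN counts characters
  FILTER (STRLEN(?name) > 6)
}
```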




BIND


     • Builds new Values
        – Closes the basic graph pattern
                 SELECT ?p WHERE {
                   {
                     ?taxon a :Taxon .
                   }
                   BIND (?taxon AS ?p)
                 }
     • Always declare before use.



BIND can assign any output
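
A sketch of BIND assigning computed values, assuming taxon URIs start with http://purl.uniprot.org/taxonomy/ as in the earlier FROM clause. Each variable is bound before it is used.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?id ?idLength
WHERE {
  ?taxon a core:Taxon .
  # Text of the URI after the namespace, e.g. the numeric taxon identifier
  BIND (STRAFTER(STR(?taxon), "http://purl.uniprot.org/taxonomy/") AS ?id)
  # ?id is bound above, so it can be used here
  BIND (STRLEN(?id) AS ?idLength)
}
```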




Aggregate functions


     • on select line
     • limited in number
         – count
         – sum
         – avg
         – min
         – max
         – group_concat
         – sample



count
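
A minimal sketch of COUNT on the select line, assuming the core: prefix used elsewhere in this deck. The result is a single row with the number of taxa.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT (COUNT(?taxon) AS ?taxonCount)
WHERE {
  ?taxon a core:Taxon .
}
```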




SAMPLE gives back an arbitrary value from the group (not necessarily a random one)




Follow the path




Path queries




Finding a grandparent using normal joins




Finding a grandparent using a path query
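
A sketch of the path form next to the join form, using taxon:45474 from the rdfs:subClassOf slide and assuming taxon: expands to http://purl.uniprot.org/taxonomy/. The '/' operator chains two steps without a helper variable.

```sparql
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>

SELECT ?grandParent
WHERE {
  # Normal joins need a helper variable:
  #   taxon:45474 rdfs:subClassOf ?parent .
  #   ?parent     rdfs:subClassOf ?grandParent .
  # The sequence path says the same in one pattern:
  taxon:45474 rdfs:subClassOf/rdfs:subClassOf ?grandParent .
}
```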




| is OR for predicates




Same result with UNION
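
A sketch of the '|' alternative path and its UNION equivalent. The property names core:scientificName and core:commonName are assumptions made for the example; they do not appear in these slides.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?name
WHERE {
  # Either predicate may link ?taxon to ?name (assumed property names)
  ?taxon core:scientificName|core:commonName ?name .
  # Equivalent UNION form:
  #   { ?taxon core:scientificName ?name . }
  #   UNION
  #   { ?taxon core:commonName ?name . }
}
```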




Finding any ancestor




Can use the variable in a normal join afterwards
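
A sketch of an ancestor search whose result feeds a normal join, assuming the taxon: and core: prefixes expand as elsewhere in this deck. The '+' operator follows rdfs:subClassOf one or more times.

```sparql
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
PREFIX core:  <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>

SELECT ?ancestor ?rank
WHERE {
  # Every taxon reachable by one or more subClassOf steps
  taxon:45474 rdfs:subClassOf+ ?ancestor .
  # ?ancestor is an ordinary variable, so it joins as usual
  ?ancestor core:rank ?rank .
}
```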




GROUP BY




GROUP BY


     • Needed for aggregate values
     • After closing the where clause
        – ... WHERE {?x ?y ?z} GROUP BY ?x
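
A sketch of GROUP BY with an aggregate, assuming the core: prefix and core:rank as used in this deck. It counts the taxa per rank.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?rank (COUNT(?taxon) AS ?taxonCount)
WHERE {
  ?taxon core:rank ?rank .
}
GROUP BY ?rank
```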




GROUP BY




HAVING




                            I have carrot !




HAVING


     • FILTER for aggregates
     • After the GROUP BY clause
        – ... GROUP BY ?x HAVING (count(?y) > 2)
        – ... GROUP BY ?x HAVING (min(?y) = 2)
        – etc...
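
A sketch of HAVING as a filter on groups, assuming the core: prefix and core:rank as used in this deck. Only ranks with more than two taxa are returned.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?rank (COUNT(?taxon) AS ?taxonCount)
WHERE {
  ?taxon core:rank ?rank .
}
GROUP BY ?rank
HAVING (COUNT(?taxon) > 2)
```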




HAVING




LIMIT & OFFSET




LIMIT and OFFSET

     • OFFSET skips the first x results
     • LIMIT returns no more than x results




ORDER
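
A sketch of ORDER BY together with LIMIT and OFFSET, assuming the core: prefix used elsewhere in this deck. Sorting first makes the paging stable: this asks for results 21 to 30 of the sorted list.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon
WHERE {
  ?taxon a core:Taxon .
}
ORDER BY ?taxon   # use DESC(?taxon) for descending order
LIMIT 10          # return at most 10 results
OFFSET 20         # after skipping the first 20
```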




VALUES


     • Super BIND
     • Provide inline data




Examples


     • Parameter lists are between ()


                   VALUES (?annotation) {
                     (core:Disease_Annotation)
                     (core:Disulfide_Bond_Annotation)
                   }




Examples


     • UNDEF means no value at all
        – the variable is not bound
                 VALUES (?annotation ?begin) {
                   (core:Disease_Annotation UNDEF)
                   (core:Disulfide_Bond_Annotation 2)
                 }




VALUES


     • After declaring a set of values you can use them in your
       query.

                 SELECT ?comment WHERE {
                   VALUES (?annotation ?begin) {
                     (core:Disease_Annotation UNDEF)
                     (core:Disulfide_Bond_Annotation 2)
                   }
                   ?annotation rdfs:comment ?comment .
                 }


SERVICE: Using other SPARQL endpoints


     • SERVICE <URL of other endpoint> { ... }
        – Runs a subquery on the other endpoint and merges the
          result back into your query.
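
A sketch of a federated query. The endpoint URL http://example.org/sparql is a placeholder, not a real service, and the remote data is assumed to hold rdfs:comment values for the same taxon URIs.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?taxon ?comment
WHERE {
  # Evaluated on the local endpoint
  ?taxon a core:Taxon .
  # Evaluated on the remote endpoint (placeholder URL), then joined on ?taxon
  SERVICE <http://example.org/sparql> {
    ?taxon rdfs:comment ?comment .
  }
}
```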




“Life is better with friends who understand you.”




SERVICE




SERVICE


     • Useful
        – Quick experimenting with combining multiple
          data sources
        – Quick for queries where not too much data is sent to
          the remote endpoint

     • Slow
        – When you ask for too much data
        – When the remote endpoint is not resourced for your questions



Let's make some triples




Construction


     • CONSTRUCT
        – New triples
           • downloads RDF
        – Does not update store




New triples




Constructing an owl:sameAs between two URIs
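
A sketch of a CONSTRUCT that links each taxon to a second URI. The target namespace http://example.org/taxon/ is hypothetical; the query only returns the new triples and does not change the store.

```sparql
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX core: <http://purl.uniprot.org/core/>

CONSTRUCT {
  ?taxon owl:sameAs ?other .
}
WHERE {
  ?taxon a core:Taxon .
  # Build the second URI from the taxon identifier (hypothetical namespace)
  BIND (IRI(CONCAT("http://example.org/taxon/",
                   STRAFTER(STR(?taxon), "http://purl.uniprot.org/taxonomy/")))
        AS ?other)
}
```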




INSERT


     • Adds data
        – like construct
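
A sketch of an INSERT with a WHERE clause. The template works like a CONSTRUCT template, but the triples are written to the store. The ex:isGenus property and the ex: namespace are made up for the example.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>
PREFIX ex:   <http://example.org/>

# Flag every taxon of rank genus (ex:isGenus is a hypothetical property)
INSERT { ?taxon ex:isGenus true . }
WHERE  { ?taxon core:rank core:Genus . }
```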




Modifies data




DELETE


     • Removes data
        – Matching triples are removed from the data
        – Triples can be bound using a WHERE clause
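
A sketch of a DELETE whose triples are bound by a WHERE clause. The ex:isGenus property and the ex: namespace are hypothetical and stand for any data you want to remove.

```sparql
PREFIX ex: <http://example.org/>

# Remove every ex:isGenus statement, whatever its value
DELETE { ?taxon ex:isGenus ?flag . }
WHERE  { ?taxon ex:isGenus ?flag . }
```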




DELETE




DELETE / INSERT

     • Single atomic operation.
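
A sketch of a combined DELETE and INSERT, assuming core:reviewed is stored as the plain string "true" as written in the earlier triple pattern slides. Both changes are applied together as one operation: the string is replaced by a boolean.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

DELETE { ?taxon core:reviewed "true" . }   # old value: plain string
INSERT { ?taxon core:reviewed true . }     # new value: xsd:boolean
WHERE  { ?taxon core:reviewed "true" . }
```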




Atomic operation




I’m exhausted now




Questions




Tuesday, December 4, 2012

Weitere ähnliche Inhalte

Was ist angesagt?

Modelo orientado a objetos
Modelo orientado a objetosModelo orientado a objetos
Modelo orientado a objetos
Daiana de Ávila
 

Was ist angesagt? (19)

Análise de ferramentas para gestão de regras de negócio em sistemas de inform...
Análise de ferramentas para gestão de regras de negócio em sistemas de inform...Análise de ferramentas para gestão de regras de negócio em sistemas de inform...
Análise de ferramentas para gestão de regras de negócio em sistemas de inform...
 
Querying Linked Data with SPARQL
Querying Linked Data with SPARQLQuerying Linked Data with SPARQL
Querying Linked Data with SPARQL
 
Linked Data Tutorial
Linked Data TutorialLinked Data Tutorial
Linked Data Tutorial
 
Introduction of Knowledge Graphs
Introduction of Knowledge GraphsIntroduction of Knowledge Graphs
Introduction of Knowledge Graphs
 
06 Modelagem de banco de dados: Modelo Lógico
06  Modelagem de banco de dados: Modelo Lógico06  Modelagem de banco de dados: Modelo Lógico
06 Modelagem de banco de dados: Modelo Lógico
 
Spark's Role in the Big Data Ecosystem (Spark Summit 2014)
Spark's Role in the Big Data Ecosystem (Spark Summit 2014)Spark's Role in the Big Data Ecosystem (Spark Summit 2014)
Spark's Role in the Big Data Ecosystem (Spark Summit 2014)
 
Spark SQL
Spark SQLSpark SQL
Spark SQL
 
An Introduction to SPARQL
An Introduction to SPARQLAn Introduction to SPARQL
An Introduction to SPARQL
 
FOOPS!: An Ontology Pitfall Scanner for the FAIR principles
FOOPS!: An Ontology Pitfall Scanner for the FAIR principlesFOOPS!: An Ontology Pitfall Scanner for the FAIR principles
FOOPS!: An Ontology Pitfall Scanner for the FAIR principles
 
RDF data validation 2017 SHACL
RDF data validation 2017 SHACLRDF data validation 2017 SHACL
RDF data validation 2017 SHACL
 
Jena
JenaJena
Jena
 
SHACL in Apache jena - ApacheCon2020
SHACL in Apache jena - ApacheCon2020SHACL in Apache jena - ApacheCon2020
SHACL in Apache jena - ApacheCon2020
 
Validating RDF data: Challenges and perspectives
Validating RDF data: Challenges and perspectivesValidating RDF data: Challenges and perspectives
Validating RDF data: Challenges and perspectives
 
Tuning and Debugging in Apache Spark
Tuning and Debugging in Apache SparkTuning and Debugging in Apache Spark
Tuning and Debugging in Apache Spark
 
Modelo orientado a objetos
Modelo orientado a objetosModelo orientado a objetos
Modelo orientado a objetos
 
Exchange and Consumption of Huge RDF Data
Exchange and Consumption of Huge RDF DataExchange and Consumption of Huge RDF Data
Exchange and Consumption of Huge RDF Data
 
Desarrollo de Software Guiado por Pruebas
Desarrollo de Software Guiado por PruebasDesarrollo de Software Guiado por Pruebas
Desarrollo de Software Guiado por Pruebas
 
No sql Orientado a documento
No sql Orientado a documentoNo sql Orientado a documento
No sql Orientado a documento
 
ESWC 2017 Tutorial Knowledge Graphs
ESWC 2017 Tutorial Knowledge GraphsESWC 2017 Tutorial Knowledge Graphs
ESWC 2017 Tutorial Knowledge Graphs
 

Andere mochten auch (9)

SPIN in Five Slides
SPIN in Five SlidesSPIN in Five Slides
SPIN in Five Slides
 
The uni prot knowledgebase
The uni prot knowledgebaseThe uni prot knowledgebase
The uni prot knowledgebase
 
Biological Database Systems
Biological Database SystemsBiological Database Systems
Biological Database Systems
 
PROTEIN STRUCTURE DATABANK
PROTEIN STRUCTURE DATABANKPROTEIN STRUCTURE DATABANK
PROTEIN STRUCTURE DATABANK
 
Presentation on Biological database By Elufer Akram @ University Of Science ...
Presentation on Biological database  By Elufer Akram @ University Of Science ...Presentation on Biological database  By Elufer Akram @ University Of Science ...
Presentation on Biological database By Elufer Akram @ University Of Science ...
 
Java and SPARQL
Java and SPARQLJava and SPARQL
Java and SPARQL
 
Proteome databases
Proteome databasesProteome databases
Proteome databases
 
SPARQL Cheat Sheet
SPARQL Cheat SheetSPARQL Cheat Sheet
SPARQL Cheat Sheet
 
Proteomics
ProteomicsProteomics
Proteomics
 

Ähnlich wie Learning sparql 2012 12

Análisis de ataques APT
Análisis de ataques APT Análisis de ataques APT
Análisis de ataques APT
linenoise
 
The Future of Apache CloudStack (Not So Cloudy) (Collab 2012)
The Future of Apache CloudStack (Not So Cloudy) (Collab 2012)The Future of Apache CloudStack (Not So Cloudy) (Collab 2012)
The Future of Apache CloudStack (Not So Cloudy) (Collab 2012)
Chiradeep Vittal
 
Avoiding API Waterfalls
Avoiding API WaterfallsAvoiding API Waterfalls
Avoiding API Waterfalls
Jakub Nesetril
 
Improving Front End Performance
Improving Front End PerformanceImproving Front End Performance
Improving Front End Performance
Joseph Scott
 
OWL: Yet to arrive on the Web of Data?
OWL: Yet to arrive on the Web of Data?OWL: Yet to arrive on the Web of Data?
OWL: Yet to arrive on the Web of Data?
Aidan Hogan
 
NOSQL also means RDF stores: an Android case study
NOSQL also means RDF stores: an Android case studyNOSQL also means RDF stores: an Android case study
NOSQL also means RDF stores: an Android case study
Fabrizio Giudici
 

Ähnlich wie Learning sparql 2012 12 (20)

STOP THE INSANITY - Juggle your classes!
STOP THE INSANITY - Juggle your classes!STOP THE INSANITY - Juggle your classes!
STOP THE INSANITY - Juggle your classes!
 
Análisis de ataques APT
Análisis de ataques APT Análisis de ataques APT
Análisis de ataques APT
 
The Future of Apache CloudStack (Not So Cloudy) (Collab 2012)
The Future of Apache CloudStack (Not So Cloudy) (Collab 2012)The Future of Apache CloudStack (Not So Cloudy) (Collab 2012)
The Future of Apache CloudStack (Not So Cloudy) (Collab 2012)
 
How to Design Indexes, Really
How to Design Indexes, ReallyHow to Design Indexes, Really
How to Design Indexes, Really
 
What\'s Hot, What\'s Not: Skills For SAS® Professionals (35 Minutes)
What\'s Hot, What\'s Not: Skills For SAS® Professionals (35 Minutes)What\'s Hot, What\'s Not: Skills For SAS® Professionals (35 Minutes)
What\'s Hot, What\'s Not: Skills For SAS® Professionals (35 Minutes)
 
Avoiding API Waterfalls
Avoiding API WaterfallsAvoiding API Waterfalls
Avoiding API Waterfalls
 
Developing RESTful Web APIs with Python, Flask and MongoDB
Developing RESTful Web APIs with Python, Flask and MongoDBDeveloping RESTful Web APIs with Python, Flask and MongoDB
Developing RESTful Web APIs with Python, Flask and MongoDB
 
Games for the Masses - Wie DevOps die Entwicklung von Architektur verändert (...
Games for the Masses - Wie DevOps die Entwicklung von Architektur verändert (...Games for the Masses - Wie DevOps die Entwicklung von Architektur verändert (...
Games for the Masses - Wie DevOps die Entwicklung von Architektur verändert (...
 
Scala
ScalaScala
Scala
 
Improving Front End Performance
Improving Front End PerformanceImproving Front End Performance
Improving Front End Performance
 
Best Practices in Theme Development - WordCamp Orlando 2012
Best Practices in Theme Development - WordCamp Orlando 2012Best Practices in Theme Development - WordCamp Orlando 2012
Best Practices in Theme Development - WordCamp Orlando 2012
 
Ilt forum 2 may 2012
Ilt forum 2 may 2012Ilt forum 2 may 2012
Ilt forum 2 may 2012
 
Thomas risberg mongosv-2012-spring-data-cloud-foundry
Thomas risberg mongosv-2012-spring-data-cloud-foundryThomas risberg mongosv-2012-spring-data-cloud-foundry
Thomas risberg mongosv-2012-spring-data-cloud-foundry
 
OWL: Yet to arrive on the Web of Data?
OWL: Yet to arrive on the Web of Data?OWL: Yet to arrive on the Web of Data?
OWL: Yet to arrive on the Web of Data?
 
Introduction to Apache Pig
Introduction to Apache PigIntroduction to Apache Pig
Introduction to Apache Pig
 
NOSQL also means RDF stores: an Android case study
NOSQL also means RDF stores: an Android case studyNOSQL also means RDF stores: an Android case study
NOSQL also means RDF stores: an Android case study
 
Triple Stores
Triple StoresTriple Stores
Triple Stores
 
11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”
11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”
11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”
 
RDFa
RDFaRDFa
RDFa
 
Presentation on Windows 8 Application at IIT, University of Dhaka
Presentation on Windows 8 Application at IIT, University of DhakaPresentation on Windows 8 Application at IIT, University of Dhaka
Presentation on Windows 8 Application at IIT, University of Dhaka
 

Mehr von Jerven Bolleman

Mehr von Jerven Bolleman (8)

Semantic Variation Graphs the case for RDF & SPARQL
Semantic Variation Graphs the case for RDF & SPARQLSemantic Variation Graphs the case for RDF & SPARQL
Semantic Variation Graphs the case for RDF & SPARQL
 
Why sparql tohu
Why sparql tohuWhy sparql tohu
Why sparql tohu
 
RDF: what and why plus a SPARQL tutorial
RDF: what and why plus a SPARQL tutorialRDF: what and why plus a SPARQL tutorial
RDF: what and why plus a SPARQL tutorial
 
UniProtKB/Swiss-Prot:Why sparql?
UniProtKB/Swiss-Prot:Why sparql?UniProtKB/Swiss-Prot:Why sparql?
UniProtKB/Swiss-Prot:Why sparql?
 
sparql,uniprot.org in production
sparql,uniprot.org in productionsparql,uniprot.org in production
sparql,uniprot.org in production
 
The UniProt SPARQL endpoint: 20 billion quads in production
The UniProt SPARQL endpoint: 20 billion quads in productionThe UniProt SPARQL endpoint: 20 billion quads in production
The UniProt SPARQL endpoint: 20 billion quads in production
 
Biohackathon2013: Tripling Bioinformatics Productivity
Biohackathon2013: Tripling Bioinformatics ProductivityBiohackathon2013: Tripling Bioinformatics Productivity
Biohackathon2013: Tripling Bioinformatics Productivity
 
Uni protsparqlcloud
Uni protsparqlcloudUni protsparqlcloud
Uni protsparqlcloud
 

Kürzlich hochgeladen

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Kürzlich hochgeladen (20)

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 

Learning sparql 2012 12

  • 1. SPARQL UniProt.RDF Jerven Bolleman Developer Swiss-Prot Group Swiss Institute of Bioinformatics Tuesday, December 4, 2012
  • 2. A few notes before we begin • SPARQL 1 – Some what useful – Standardized in 2008 • SPARQL 1.1 – Very useful – Currently in recommended standard • Still finding incompatibilities • Or not yet implemented features © 2012 SIB Tuesday, December 4, 2012
  • 3. Raise your hand if you have questions
  • 4. Tutorial plan • Set up TopBraid Composer – Skipped in talk – On VM • Gather data from the UniProt website – Already there. • Learn SPARQL. You do not need TopBraid Composer to use UniProt RDF data or run SPARQL queries. You can use beta.sparql.uniprot.org as well.
  • 5. Download and install TopBraid Composer • Requirements – Sun/Oracle JVM • Go to – http://www.topquadrant.com/products/TB_download.html – Register – Select any edition; the free one is OK for today
  • 6. Start TopBraid
  • 7. Setting up a workspace for this tutorial • http://www.topquadrant.com/products/TB_download.html
  • 8. New project • File > New Project > General
  • 9. Gather data from uniprot.org website • In the navigator, select the new project you just made.
  • 10. Gather data from uniprot.org website Right-click on your new project. Select “Import” in the drop-down menu. • Import RDF or OWL file from the web
  • 11. Using the same process, download core.owl. You can see an HTML view of this schema ontology at http://www.uniprot.org/core/
  • 12. Gather data from uniprot.org website You can see an HTML view of this entry at http://www.uniprot.org/taxonomy/40674
  • 13. Gather data from uniprot.org website • Open the mammalia.rdf file by double-clicking
  • 14. You get a very helpful dialog. Hit yes
  • 15. It's SPARQLy mammal time!!
  • 16. Let's look at a single taxon record
  • 17. Let's navigate to it in TopBraid • Type the URI as is, with the angle brackets
  • 18. Investigate the taxon record
  • 19. The “Eastern Chipmunk” in Turtle
  • 20. Turtle is the RDF serialization aligned with SPARQL • Shorthand to avoid typing so much – ‘.’ (dot) ends a statement – ‘;’ (semicolon) repeats the subject – ‘,’ (comma) repeats the subject and predicate • prefix – the part before ‘:’ is an abbreviation of a URI
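The same shorthand carries over to SPARQL graph patterns. The query below is a minimal sketch of that; the `core:` and `taxon:` prefixes follow the namespaces used elsewhere in this deck, while `core:scientificName` and the choice of taxon 9606 and 40674 are assumptions made for illustration and may not match your data.

```sparql
PREFIX core:  <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?name ?parent
WHERE {
  taxon:9606 a core:Taxon ;                  # ';' repeats the subject taxon:9606
             core:scientificName ?name ;     # assumed property name
             rdfs:subClassOf taxon:40674 ,   # ',' repeats subject and predicate
                             ?parent .       # '.' ends the statement
}
```

Written out in full, this is four separate triple patterns that all start with `taxon:9606`.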
  • 21. Why don’t these queries work on the web? • PREFIX – TopBraid Composer uses the prefixes defined in the file’s “overview” tab. – On the web you often have to add these yourself. PREFIX : <http://purl.uniprot.org/core/> SELECT ?x FROM <http://purl.uniprot.org/taxonomy/> WHERE { ?x a :Taxon }
  • 22. a = rdf:type = <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
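As a small illustration of that equivalence, the three patterns below all ask the same thing, so the query returns the same rows as any one of them alone. This is only a sketch; the `core:` prefix is the one from the PREFIX slide.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon
WHERE {
  ?taxon a core:Taxon .                                                   # keyword shorthand
  ?taxon rdf:type core:Taxon .                                            # prefixed name
  ?taxon <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> core:Taxon .   # full IRI
}
```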
  • 23. rdfs:subClassOf: taxon:45474 is a more specific classification than taxon:13712
  • 24. rank => “The level, for nomenclatural purposes, of a taxon in a taxonomic hierarchy”
  • 25. Why learn SPARQL • Standardized formal query language – implementation independent • SPARQL ➔ SQL (via R2RML) • SPARQL ➔ web service (via SADI) • SPARQL ➔ LDAP (e.g. SquirrelRDF) • SPARQL ➔ RDF (triplestore, e.g. OWLIM-SE) • SPARQL ➔ Hadoop/Hive (e.g. SHARD) – How you query is independent of how you store!
  • 26. Apparently it helps kill vampires!!!
  • 27. Let's learn SPARQL • Queries over RDF data. – Four basic types • SELECT – Returns “tab delimited” results • CONSTRUCT – Makes new triples • DESCRIBE – Returns all triples mentioning a resource • ASK – Returns true if anything matches
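To make the least familiar of the four forms concrete, here is a minimal ASK sketch. It assumes the taxonomy data loaded earlier contains taxon 9606; the prefixes are the ones used in this deck.

```sparql
PREFIX core:  <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>

# Answers true if at least one match exists, false otherwise
ASK { taxon:9606 a core:Taxon }
```

Replacing the last line with `DESCRIBE taxon:9606` would instead return triples about that resource; exactly which triples come back is up to the store.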
  • 28. SPARQL queries: triple pattern taxon:9606 rdf:type core:Taxon .
  • 29. SPARQL queries: triple pattern ?anyTaxon rdf:type core:Taxon .
  • 30. SPARQL queries: triple pattern SELECT ?anyTaxon WHERE { ?anyTaxon rdf:type core:Taxon . }
  • 31. SPARQL queries: triple pattern taxon:9606 rdf:type core:Taxon . taxon:9606 core:reviewed "true" .
  • 32. SPARQL queries: triple pattern ?anyTaxon rdf:type core:Taxon . ?anyTaxon core:reviewed "true" .
  • 33. SPARQL queries: triple pattern SELECT ?anyTaxon WHERE { ?anyTaxon rdf:type core:Taxon . ?anyTaxon core:reviewed "true" . }
  • 34. SPARQL queries: triple pattern SELECT ?anyTaxon WHERE { ?anyTaxon rdf:type core:Taxon . ?anyTaxin core:reviewed "true" . }
  • 35. SPARQL queries: triple pattern SELECT ?anyTaxon WHERE { ?anyTaxon rdf:type core:Taxon . $anyTaxon core:reviewed "true" . }
  • 36. Let's learn SPARQL
  • 37.
  • 38.
  • 39. Shorthand: a = rdf:type
  • 40. AND join (default)
  • 41. Now you type
  • 42. Remember the ‘;’ shortcut
  • 43. Two variables, one output column
  • 44. OPTIONAL • When values may be missing – yet interesting when they are there • Use as a sub-query • Values bound outside stay bound inside – ?x ?y ?z . OPTIONAL { ?x ?b ?c } • ?x: same variable = same thing
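A sketch of OPTIONAL on the taxonomy data might look like the following. The property names `core:scientificName` and `core:commonName` are assumptions about the UniProt core schema; check core.owl for the exact terms.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?scientificName ?commonName
WHERE {
  ?taxon a core:Taxon ;
         core:scientificName ?scientificName .
  # ?taxon is already bound here, so this only looks up its common name.
  # Taxa without one are still returned, with ?commonName left unbound.
  OPTIONAL { ?taxon core:commonName ?commonName }
}
```

Without the OPTIONAL keyword, every taxon lacking a common name would drop out of the results.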
  • 45.
  • 46. UNION • Allows you to combine query patterns as an OR operation. • Joins are still from outer to inner.
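As an illustration, the sketch below collects either kind of name into a single `?name` column. The two property names are assumed, not taken from the slides.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?name
WHERE {
  ?taxon a core:Taxon .                      # outer pattern, joined with each branch
  { ?taxon core:scientificName ?name }       # branch 1
  UNION
  { ?taxon core:commonName ?name }           # branch 2
}
```

A taxon with both kinds of name appears twice, once per matching branch.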
  • 47. UNION
  • 48. Negation • When you do not want a certain category of matches. SELECT ?pet WHERE { ?pet a pets:Friendly . }
  • 49. Oooops
  • 50. NOT EXISTS (Negation 1)
  • 51. MINUS (Negation 2)
  • 52. MINUS {} or FILTER (NOT EXISTS {}) • What's the difference? – MINUS subtracts results – NOT EXISTS tests if the sub-pattern is possible at all. • Normally the faster option.
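When the inner pattern shares a variable with the outer one, the two forms give the same answer. A minimal sketch, reusing `core:rank` and `core:Genus` from the negation example in this deck:

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?subject ?rank
WHERE {
  ?subject core:rank ?rank .
  # Keep only subjects for which the inner pattern has no match
  FILTER NOT EXISTS { ?subject core:rank core:Genus }
  # Equivalent here, because ?subject is shared with the outer pattern:
  # MINUS { ?subject core:rank core:Genus }
}
```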
  • 53. MINUS: all data
  • 54. FILTER (NOT EXISTS {}): no results
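The slide images are not part of this transcript, so the following is a guess at the kind of query they contrast: the well-known case where the inner pattern shares no variables with the outer one.

```sparql
SELECT ?s ?p ?o
WHERE {
  ?s ?p ?o .
  # No variable in common with the outer pattern, so nothing is
  # subtracted and every triple comes back:
  MINUS { ?x ?y ?z }
  # Swapping in the line below gives no results on a non-empty store,
  # because the inner pattern can always be matched:
  # FILTER NOT EXISTS { ?x ?y ?z }
}
```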
  • 55. Negation option 3 (SPARQL 1.0) SELECT ?subject ?rank WHERE { ?subject core:rank ?rank . OPTIONAL { ?subject core:rank core:Genus . ?subject core:rank ?genus . } FILTER (!BOUND(?genus)) }
  • 56.
  • 57. FILTERS • You just saw it twice – Once in the !BOUND – Once in the NOT EXISTS • FILTER narrows a result set by possibly removing solutions – FILTER does not add a value to the result • Inside the same graph pattern, order independent.
  • 58. Filter
  • 59. Filter on NOT IN
  • 60.
  • 61.
  • 62. IN
  • 63.
  • 64. FILTER on numbers • < – FILTER (1 < 2) • > – FILTER (2 > 1) • = – FILTER (1 = 1) • != – FILTER (1 != 2)
  • 65. Filters • ?x = ?y does casting (value conversions) – "1.0"^^xsd:float = "1"^^xsd:int is true • sameTerm(?x, ?y) does not – sameTerm("1.0"^^xsd:float, "1"^^xsd:int) is false
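The difference can be checked without any data, using a query that only binds constants. This is a sketch; the variable names are arbitrary.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?equal ?same
WHERE {
  BIND ("1.0"^^xsd:float AS ?x)
  BIND ("1"^^xsd:int AS ?y)
  BIND ((?x = ?y) AS ?equal)         # compares numeric values
  BIND (sameTerm(?x, ?y) AS ?same)   # compares the RDF terms themselves
}
```

On a conforming SPARQL 1.1 engine, `?equal` should come back true and `?same` false.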
  • 66. FILTER on strings • Functions – STRLEN – ENCODE_FOR_URI – SUBSTR – CONCAT – UCASE – langMatches – LCASE – REGEX – STRSTARTS – REPLACE – STRENDS – CONTAINS – IRI – STRBEFORE – STRAFTER
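A few of these functions combined in one query, as a sketch only. `core:scientificName` is an assumed property name, and "Tamias striatus" is the scientific name of the Eastern Chipmunk.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?name (STRLEN(?name) AS ?length) (UCASE(?name) AS ?upper)
WHERE {
  ?taxon core:scientificName ?name .
  # STRSTARTS is case sensitive; LCASE makes the CONTAINS test case insensitive
  FILTER (STRSTARTS(?name, "Tamias") && CONTAINS(LCASE(?name), "striatus"))
}
```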
  • 67. STRLEN == string length
  • 68. CONTAINS is a case-sensitive “is it in there?” test
  • 69. REGEX, much like Java regex
  • 70. BIND • Builds new values – Closes the basic graph pattern SELECT ?p WHERE { { ?taxon a :Taxon . } BIND (?taxon AS ?p) } • Always declare before use.
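BIND becomes more useful when it computes something new. The sketch below builds a display label from two existing values; both property names are assumptions about the schema.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?label
WHERE {
  ?taxon core:scientificName ?sci ;
         core:commonName ?common .
  # ?sci and ?common are bound above, before the BIND that uses them
  BIND (CONCAT(?sci, " (", ?common, ")") AS ?label)
}
```

Moving the BIND above the triple patterns would leave `?label` unbound, which is the reason for the "declare before use" rule.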
  • 71.
  • 72.
  • 73. BIND can assign any output
  • 74. Aggregate functions • On the SELECT line • Limited in number – COUNT – SUM – AVG – MIN – MAX – GROUP_CONCAT – SAMPLE
  • 75. COUNT
  • 76. SAMPLE gives back an arbitrary value from the group (not necessarily a random one)
  • 77. Follow the path
  • 78. Path queries
  • 79. Finding a grandparent using normal joins
  • 80. Finding a grandparent using a path query
  • 81. | is OR for predicates
  • 82. Same result with UNION
  • 83. Finding any ancestor
  • 84. Can use the variable in a normal join afterwards
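The path slides can be summarized in one sketch. It assumes the taxonomy hierarchy is expressed with `rdfs:subClassOf`, as in the earlier slide, and that `core:scientificName` exists.

```sparql
PREFIX core:  <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?ancestor ?name
WHERE {
  taxon:9606 rdfs:subClassOf+ ?ancestor .    # '+' means one or more steps up
  ?ancestor core:scientificName ?name .      # normal join on the path result
}
```

For exactly two steps (a grandparent), the path would be `rdfs:subClassOf/rdfs:subClassOf`; `p1|p2` matches either predicate.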
  • 85. GROUP BY
  • 86. GROUP BY • Needed for aggregate values • After closing the WHERE clause – ... WHERE {?x ?y ?z} GROUP BY ?x
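A minimal sketch of grouping on this dataset: count the taxa at each rank, using the `core:rank` property from earlier slides.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?rank (COUNT(?taxon) AS ?taxa)
WHERE { ?taxon core:rank ?rank }
GROUP BY ?rank
```

Every non-aggregated variable on the SELECT line (here `?rank`) has to appear in the GROUP BY.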
  • 87. GROUP BY
  • 88. HAVING I have carrot!
  • 89. HAVING • FILTER for aggregates • After the GROUP BY clause – ... GROUP BY ?x HAVING (count(?y) > 2) – ... GROUP BY ?x HAVING (min(?y) = 2) – etc.
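Put together as a full query, this could look like the sketch below, which keeps only parents with more than two recorded children. Treating `rdfs:subClassOf` as the parent link is an assumption, and the count depends on whether the store also holds inferred ancestors.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?parent (COUNT(?child) AS ?children)
WHERE { ?child rdfs:subClassOf ?parent }
GROUP BY ?parent
HAVING (COUNT(?child) > 2)
```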
  • 90. HAVING
  • 91. LIMIT & OFFSET
  • 92. LIMIT and OFFSET • OFFSET skips the first x results • LIMIT returns no more than x results
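A typical paging sketch follows; `core:scientificName` is an assumed property name. Without an ORDER BY the store may return rows in any order, so paging is only repeatable when the results are sorted.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?name
WHERE { ?taxon core:scientificName ?name }
ORDER BY ?name
LIMIT 10      # return at most 10 rows
OFFSET 20     # after skipping the first 20
```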
  • 93. ORDER
  • 94.
  • 95.
  • 96.
  • 97. VALUES • Super BIND • Provide inline data
  • 98.
  • 99. Examples • Parameter lists are between () VALUES (?annotation) { (core:Disease_Annotation) (core:Disulfide_Bond_Annotation) }
  • 100. Examples • UNDEF means no value at all – not bound VALUES (?annotation ?begin) { (core:Disease_Annotation UNDEF) (core:Disulfide_Bond_Annotation 2) }
  • 101. VALUES • After declaring a set of values you can use them in your query. SELECT ?comment WHERE { VALUES (?annotation ?begin) { (core:Disease_Annotation UNDEF) (core:Disulfide_Bond_Annotation 2) } ?annotation rdfs:comment ?comment . }
  • 102. SERVICE: Using other SPARQL endpoints • SERVICE <URL of other endpoint> – Runs a sub-query on the other endpoint and merges it back into your query.
  • 103. “Life is better with friends who understand you.”
  • 104. SERVICE
  • 105. SERVICE • Useful – Quick experimenting with combining multiple data sources – Quick for queries where not too much data is sent to the remote endpoint • Slow – When you ask for too much data – Remote endpoint not resourced for your questions
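A federated query might be sketched as follows. The endpoint IRI is an assumption built from the beta.sparql.uniprot.org host mentioned at the start of the talk; the exact URL, and whether it accepts federated requests, should be checked before use. `core:scientificName` is likewise assumed.

```sparql
PREFIX core:  <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>

SELECT ?taxon ?name
WHERE {
  # A short inline list keeps the amount of data sent to the remote side small
  VALUES ?taxon { taxon:9606 taxon:40674 }
  SERVICE <http://beta.sparql.uniprot.org/> {   # assumed endpoint URL
    ?taxon core:scientificName ?name .
  }
}
```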
  • 106. Let's make some triples
  • 107. Construction • CONSTRUCT – New triples • Downloads RDF – Does not update the store
  • 108. New triples
  • 109. Constructing an owl:sameAs between two URIs
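One way such a query could look is sketched below. The target namespace `http://example.org/taxon/` is purely hypothetical and stands in for whatever second dataset you want to link to.

```sparql
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX core: <http://purl.uniprot.org/core/>

CONSTRUCT { ?taxon owl:sameAs ?other }
WHERE {
  ?taxon a core:Taxon .
  # Take the local id from the UniProt IRI and append it to the hypothetical namespace
  BIND (IRI(CONCAT("http://example.org/taxon/",
                   STRAFTER(STR(?taxon), "http://purl.uniprot.org/taxonomy/")))
        AS ?other)
}
```

The result is a new RDF graph returned to the client; nothing in the store changes.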
  • 110. INSERT • Adds data – like CONSTRUCT
  • 111. Modifies data
  • 112. DELETE • Removes data – Matching triples are removed from the data – Triples can be bound using a WHERE clause
  • 113. DELETE
  • 114. DELETE INSERT • Single atomic operation.
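As a sketch of the combined form in SPARQL 1.1 Update, the request below swaps one value for another in a single operation. `core:commonName` and the literal "Human" are illustrative assumptions, and the request needs an endpoint that allows updates.

```sparql
PREFIX core:  <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>

DELETE { taxon:9606 core:commonName ?old }
INSERT { taxon:9606 core:commonName "Human" }
WHERE  { taxon:9606 core:commonName ?old }
```

If the WHERE clause matches nothing, neither the delete nor the insert happens.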
  • 115. Atomic operation
  • 116. I’m exhausted now