SlideShare ist ein Scribd-Unternehmen logo
1 von 45
Downloaden Sie, um offline zu lesen
Linked Data for 
   functional genomics

                            Mikel Egaña Aranguren

                       3205 School of Computer Science
                    Universidad Politécnica de Madrid (UPM)
                            28660 Boadilla del Monte
                                      Spain

                      Ontology Engineering Group (OEG)
                           http://www.oeg-upm.net

                               megana@fi.upm.es
                        http://mikeleganaaranguren.com

http://www.slideshare.net/MikelEganaAranguren/linked-data-functional-genomics




                                                                                1/12/2011
Index

         What is Linked Data?

         Publishing Linked Data

         Consuming Linked Data

         Issues with (Life Sciences) Linked Data

         Conclusions




Linked Data for functional genomics
What is Linked Data?




Linked Data for functional genomics
What is Linked Data?

      A first step towards the Semantic Web



   Query RDF
                                              “Schema” for RDF




                                               “The HTML for data”




Identify things on the net

Linked Data for functional genomics
What is Linked Data?

         Linked Data principles

         1) Use URIs as names for things


         2) Use HTTP URIs so that people can look up those names


         3) When someone looks up a URI, provide useful information, using the 
            standards (RDF, SPARQL)


         4) Include links to other URIs so that they can discover more things




                                                               http://www.w3.org/DesignIssues/LinkedData.html



Linked Data for functional genomics
What is Linked Data?

         With Linked Data we publish data

         Semantically: the data model is explicit for computers (RDF triple)

                                                Predicate

                                      Subject               Object




Linked Data for functional genomics
What is Linked Data?

         With Linked Data we publish data

         Semantically: the data model is explicit for computers (RDF triple)

                                                 Predicate

                                      Subject                     Object

                                                participates_in


                                      Q5SIF0                      ATP binding




Linked Data for functional genomics
What is Linked Data?

         With Linked Data we publish data

         Semantically: the data model is explicit for computers (RDF triple)

                                                         Predicate

                                        Subject                                      Object

                                                        participates_in


                                        Q5SIF0                                       ATP binding

         Inter-linked: the data is linked to data from other resources over the web

                                      participates_in                              part_of

                                         Predicate                                   Predicate
             Q5SIF0                                             ATP binding                             Signal transduction
             Subject                                            Subject / Object                        Object




Linked Data for functional genomics
What is Linked Data?
   Global network of Linked Data



                                                                         part_of

                                  participates_in
                                                           ATP binding
                                                                                                 Signal transduction
               Q5SIF0

                                                                                          is_regulated_by
                                                    has_part




                                                                                   Regulation of MAP
                                      Purinergic nucleotide                        kinase activity
                                      receptor activity




Linked Data for functional genomics
What is Linked Data?

         Internet of data rather than documents, a “universal DB”


         Find preciselly what we are looking for: direct queries rather than text 
            processing (SPARQL)


         Linking new data is as easy as linking a web page


         We can navigate through the data directly (RDF), rather than navigating 
          through documents that represent data in natural language (HTML)


         Build applications that exploit the data


         Apply automated reasoning on the data


Linked Data for functional genomics
What is Linked Data?
Linked Open Data (LOD) cloud




   http://richard.cyganiak.de/2007/10/lod/lod-datasets_2011-09-19_colored.html




  Linked Data for functional genomics
What is Linked Data?

                      A graph is a collection of RDF triples

                      A triple store holds different graphs




Linked Data for functional genomics
What is Linked Data?



                    Human             HTML 
                                              Content negotiation
                    Computer          RDF




Linked Data for functional genomics
What is Linked Data?



                    Human             HTML 
                                              Content negotiation
                    Computer          RDF




Linked Data for functional genomics
What is Linked Data?



                    Human             HTML 
                                              Content negotiation
                    Computer          RDF




                     Human            HTML 
                                              Content negotiation
                     Computer         RDF




Linked Data for functional genomics
What is Linked Data?

                                                       Biomolecule


                                                            rdfs:subClassOf

Vocabulary
                                                  Protein                                    Molecular Function
(Ontology)

Instances                                                   rdf:type                                rdf:type

                        rdf:type
                                                                         participates_in


                                                       Q5SIF0                               ATP binding
                                         owl:sameAs


                 CycB


                                                                       Dataset 
                                       Dataset 

 Linked Data for functional genomics
Consuming Linked Data




Linked Data for functional genomics
Consuming Linked Data




                                      Navigate

                                       Query

                                      Meshups


Linked Data for functional genomics
Consuming Linked Data

         Bio2RDF (http://bio2rdf.org/)

         OGOLOD (http://miuras.inf.um.es/~ogo/ogolod.html)

         LinkedLifeData (http://linkedlifedata.com/)

         HyQue* (http://semanticscience.org/projects/hyque/index.html)

         ArrayExpress and Gene expression atlas* 
           (http://www.ebi.ac.uk/arrayexpress/)

                                      * not really LD, close to LD, but (likely) soon will be full LD




Linked Data for functional genomics
Consuming Linked Data

         Navigate




Linked Data for functional genomics
Consuming Linked Data

         Navigate




Linked Data for functional genomics
Consuming Linked Data

         Query different resources combining the information

         Select all located in Y-chromosome, human genes with known molecular 
           interactions, which are analysed with 'Transfection'

         PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
         PREFIX gene: <http://linkedlifedata.com/resource/entrezgene/>
         PREFIX core: <http://purl.uniprot.org/core/>
         PREFIX biopax2: <http://www.biopax.org/release/biopax-level2.owl#>
         PREFIX lifeskim: <http://linkedlifedata.com/resource/lifeskim/>
         PREFIX umls: <http://linkedlifedata.com/resource/umls/>
         PREFIX pubmed: <http://linkedlifedata.com/resource/pubmed/>

         SELECT distinct ?genedescription ?prefLabel
         WHERE {
             ?p biopax2:PHYSICAL-ENTITY ?protein .
             ?protein skos:exactMatch ?uniprotaccession .
             ?uniprotaccession core:organism <http://purl.uniprot.org/taxonomy/9606> .
             ?geneid gene:uniprotAccession ?uniprotaccession .
             ?geneid gene:description ?genedescription .
             ?geneid gene:pubmed ?pmid .
             ?geneid gene:chromosome 'Y' .
             ?pmid lifeskim:mentions ?umlsid .
             ?umlsid skos:prefLabel 'Transfection' .
             ?umlsid skos:prefLabel ?prefLabel .
         }                                                  (http://linkedlifedata.com/sparql)

Linked Data for functional genomics
Consuming Linked Data

         Query different resources combining the information

         We will receive only the triples of that triple store (but we can follow the 
          links to the triples stored in other triple stores!)




Linked Data for functional genomics
Consuming Linked Data

         For retrieving triples from other triple stores we need federated 
           queries:

         SERVICE keyword in SPARQL 1.1
         PREFIX foaf:   <http://xmlns.com/foaf/0.1/>
         SELECT ?name
         FROM <http://example.org/myfoaf.rdf>
         WHERE
         {
           <http://example.org/myfoaf/I> foaf:knows ?person .
           SERVICE <http://people.example.org/sparql> { 
             ?person foaf:name ?name . } 
         }




                                 http://www.w3.org/TR/sparql11-federated-query/




Linked Data for functional genomics
Consuming Linked Data

         Hypotheses evaluation with HyQue*

         http://semanticscience.org/projects/hyque/
         PREFIX hybrow: <http://bio2rdf.org/hybrow:>
         PREFIX semsci: <http://semanticscience.org/resource/>
         PREFIX bio2rdf: <http://bio2rdf.org/ns/bio2rdf:>


         select DISTINCT * where {
                 ?event rdfs:label ?label .
                 ?event rdf:type ?event_type .
                 ?event_type rdfs:label ?event_type_label .
                 ?event hybrow:is_negated ?negated .
                 ?event hybrow:physical_context ?event_location .
                 ?event hybrow:physical_operator ?physical_operator .
                 ?event hybrow:agent_a ?actor .
                 ?event hybrow:agent_b ?target .
                 OPTIONAL {
                 { ?actor rdfs:subClassOf ?actor_type  } UNION { ?actor rdf:type ?actor_type }
                 }
                 OPTIONAL {
                   { ?target rdfs:subClassOf ?target_type  } UNION { ?target rdf:type ?target_type }        
                 }
                 ?actor semsci:isLocatedIn ?actor_gp_id_location .
                 ?actor_gp_id_location rdf:type ?actor_location_type .
                 ?target semsci:isLocatedIn ?target_gp_id_location .
                 ?target_gp_id_location rdf:type ?target_location_type .
                 ?actor semsci:hasFunction ?actor_gp_id_function .
                 ?actor_gp_id_function rdf:type ?actor_function_type .
         }




Linked Data for functional genomics
Consuming Linked Data

         Meshups: applications that consume LOD 

         Combining information from different datasets and/or non LOD resources 
           (e.g. Google maps)

         e.g. specific visualisations

         e.g. “follow your nose” applications




Linked Data for functional genomics
Publishing Linked Data




Linked Data for functional genomics
Publishing Linked Data
XML
Flat file

  DB
                                          Link 
                                       discovery




 Linked Data for functional genomics
Publishing Linked Data

         Announce your data

         Comprehensive Knowledge Archive Network (http://ckan.org/)

         Semantic Web index (http://sindice.com/)




Linked Data for functional genomics
Publishing Linked Data

         Why publish our data in the LOD?




Linked Data for functional genomics
Publishing Linked Data

         Why publish our data in the LOD?

         It's the links, stupid




Linked Data for functional genomics
Publishing Linked Data

         Why publish our data in the LOD?

         It's the links, stupid

         Only publish our data, reference the rest: don't need to 
          duplicate external DB in ours




Linked Data for functional genomics
Publishing Linked Data

         Why publish our data in the LOD?

         It's the links, stupid

         Only publish our data, reference the rest: don't need to 
          duplicate external DB in ours

         External info is updated independently and we get the benefit 
           to our dataset because it's linked to it, without extra effort




Linked Data for functional genomics
Publishing Linked Data

         Why publish our data in the LOD?

         It's the links, stupid (II)




Linked Data for functional genomics
Publishing Linked Data

         Why publish our data in the LOD?

         It's the links, stupid (II)

         By using (HTTP) URIs, others can link to us




Linked Data for functional genomics
Publishing Linked Data

         Why publish our data in the LOD?

         It's the links, stupid (II)

         By using (HTTP) URIs, others can link to us

         Increasing the potential for our data to be discovered




Linked Data for functional genomics
Publishing Linked Data

         Why publish our data in the LOD?

         It's the semantics, stupid




Linked Data for functional genomics
Publishing Linked Data

         Why publish our data in the LOD?

         It's the semantics, stupid

         The meaning of our data is easily machine processable due to 
           RDF (“instances”) and OWL (“schema”)




Linked Data for functional genomics
Issues with (Life Sciences) 
                                Linked Data




Linked Data for functional genomics
Issues with (Life Sciences) Linked Open Data

         Provenance (e.g. For Microarray data)

         Shared identifiers
         http://identifiers.org/
         http://sharedname.org

         Dataset quality

         Ontology modelling

         Consensus ontologies

         Lack of ontologies

         Inference
         To generate triples
         At query time

Linked Data for functional genomics
Conclusions




Linked Data for functional genomics
Conclusions

         Linked Data offers a straight method to publish data 
           semantically in the current web:

         Key 1: use URIs for each and every data item

         Key 2: link data items to internal and external data

         Key 3: represent data with RDF (and OWL)

         Already existing web technology (URI + HTTP) will do the rest 
           smoothly for us

         Knowledge discovery

         Knowledge exploitation




Linked Data for functional genomics
Conclusions

         Linked Data is here to stay

         Already used by many, including governments, BBC, ...

         A first usable version of the Semantic Web with great potential

         Still issues to be solved in the Life Sciences Linked Data 




Linked Data for functional genomics
More information

         Semantic Web Health Care and Life Sciences (HCLS) Interest 
          Group at W3C: http://www.w3.org/blog/hcls

         LD Best practices
         A. Hogan, A. Harth, A. Passant, S. Decker, and A. Polleres. Weaving
             the Pedantic Web. In Linked Data on the Web Workshop (LDOW2010)
             at WWW’2010, 2010.

         http://patterns.dataincubator.org/book/

         Ontology Engineering Group (oeg-upm.net)
              Gov. Information on LOD (GeoLinkedData, Aemet, ...)
              OGOLOD
              Linked Data tools (ODEmapster, ...)




Linked Data for functional genomics
Acknowledgements

         I'm funded by the Marie Curie Cofund programme (FP7) 

         I unashamedly recycled stuff from presentations by Marc-
            Alexandre Nolin and Eric Prud’hommeaux

         I'm learning a lot at the HCLS IG W3C

         NTNU provided the travel/accomodation due to Martin Kuiper's 
          invitation




Linked Data for functional genomics

Weitere ähnliche Inhalte

Andere mochten auch

Project Training Ppt
Project Training PptProject Training Ppt
Project Training Ppt
biinoida
 
Functional Genomics of Plant Pathogen interactions in Wheat Rust Pathosystem
Functional Genomics of Plant Pathogen interactions in Wheat Rust PathosystemFunctional Genomics of Plant Pathogen interactions in Wheat Rust Pathosystem
Functional Genomics of Plant Pathogen interactions in Wheat Rust Pathosystem
Senthil Natesan
 
6 weeks 6 months live project summer industrial training in cmc limited 2012
6 weeks  6 months live project summer industrial training in cmc limited  20126 weeks  6 months live project summer industrial training in cmc limited  2012
6 weeks 6 months live project summer industrial training in cmc limited 2012
CMC Limited
 
6 MONTHS INDUSTRIAL TRAINING PPT EE BY KHUSHI RAM BHARDWAJ(DAVIET JALANDHAR)
6 MONTHS INDUSTRIAL TRAINING PPT EE BY KHUSHI RAM BHARDWAJ(DAVIET JALANDHAR)6 MONTHS INDUSTRIAL TRAINING PPT EE BY KHUSHI RAM BHARDWAJ(DAVIET JALANDHAR)
6 MONTHS INDUSTRIAL TRAINING PPT EE BY KHUSHI RAM BHARDWAJ(DAVIET JALANDHAR)
KHUSHI9999
 
Industrial training presentation
Industrial training presentationIndustrial training presentation
Industrial training presentation
Sayotters
 

Andere mochten auch (20)

BIOL335: Functional genomics
BIOL335: Functional genomicsBIOL335: Functional genomics
BIOL335: Functional genomics
 
EpiVax FastVax 23Feb2010
EpiVax FastVax 23Feb2010EpiVax FastVax 23Feb2010
EpiVax FastVax 23Feb2010
 
Genomics
GenomicsGenomics
Genomics
 
Summer training ppt
Summer training pptSummer training ppt
Summer training ppt
 
Project Training Ppt
Project Training PptProject Training Ppt
Project Training Ppt
 
Functional Genomics of Plant Pathogen interactions in Wheat Rust Pathosystem
Functional Genomics of Plant Pathogen interactions in Wheat Rust PathosystemFunctional Genomics of Plant Pathogen interactions in Wheat Rust Pathosystem
Functional Genomics of Plant Pathogen interactions in Wheat Rust Pathosystem
 
6 weeks 6 months live project summer industrial training in cmc limited 2012
6 weeks  6 months live project summer industrial training in cmc limited  20126 weeks  6 months live project summer industrial training in cmc limited  2012
6 weeks 6 months live project summer industrial training in cmc limited 2012
 
6 MONTHS INDUSTRIAL TRAINING PPT EE BY KHUSHI RAM BHARDWAJ(DAVIET JALANDHAR)
6 MONTHS INDUSTRIAL TRAINING PPT EE BY KHUSHI RAM BHARDWAJ(DAVIET JALANDHAR)6 MONTHS INDUSTRIAL TRAINING PPT EE BY KHUSHI RAM BHARDWAJ(DAVIET JALANDHAR)
6 MONTHS INDUSTRIAL TRAINING PPT EE BY KHUSHI RAM BHARDWAJ(DAVIET JALANDHAR)
 
Proteomics
Proteomics Proteomics
Proteomics
 
Php string
Php stringPhp string
Php string
 
Structural genomics
Structural genomicsStructural genomics
Structural genomics
 
Project 0th Review
Project 0th ReviewProject 0th Review
Project 0th Review
 
Software Project Management
Software Project ManagementSoftware Project Management
Software Project Management
 
Zeroth review presentation
Zeroth review presentationZeroth review presentation
Zeroth review presentation
 
Proteomics
ProteomicsProteomics
Proteomics
 
PHP Summer Training Presentation
PHP Summer Training PresentationPHP Summer Training Presentation
PHP Summer Training Presentation
 
Industrial training presentation
Industrial training presentationIndustrial training presentation
Industrial training presentation
 
Example Presentation Latihan Industri
Example Presentation Latihan IndustriExample Presentation Latihan Industri
Example Presentation Latihan Industri
 
Proteomics ppt
Proteomics pptProteomics ppt
Proteomics ppt
 
Industrial training presentation
Industrial training presentation Industrial training presentation
Industrial training presentation
 

Ähnlich wie Linked data functional genomics

Apprendre Via les Objets Xin Chen
Apprendre Via les Objets  Xin ChenApprendre Via les Objets  Xin Chen
Apprendre Via les Objets Xin Chen
cecilechen85
 
Linked data introduction w exempel
Linked data introduction w exempelLinked data introduction w exempel
Linked data introduction w exempel
Kerstin Forsberg
 
ReVeaLD: A User-driven Domain Specific Interactive Search Platform for Biomed...
ReVeaLD: A User-driven Domain Specific Interactive Search Platform for Biomed...ReVeaLD: A User-driven Domain Specific Interactive Search Platform for Biomed...
ReVeaLD: A User-driven Domain Specific Interactive Search Platform for Biomed...
Maulik Kamdar
 
EDF2012 Peter Boncz - LOD benchmarking SRbench
EDF2012   Peter Boncz - LOD benchmarking SRbenchEDF2012   Peter Boncz - LOD benchmarking SRbench
EDF2012 Peter Boncz - LOD benchmarking SRbench
European Data Forum
 

Ähnlich wie Linked data functional genomics (20)

Linked Data Driven Data Virtualization for Web-scale Integration
Linked Data Driven Data Virtualization for Web-scale IntegrationLinked Data Driven Data Virtualization for Web-scale Integration
Linked Data Driven Data Virtualization for Web-scale Integration
 
Pragmatic Approaches to the Semantic Web
Pragmatic Approaches to the Semantic WebPragmatic Approaches to the Semantic Web
Pragmatic Approaches to the Semantic Web
 
Aaai2012
Aaai2012Aaai2012
Aaai2012
 
STI Summit 2011 - Linked data-services-streams
STI Summit 2011 - Linked data-services-streamsSTI Summit 2011 - Linked data-services-streams
STI Summit 2011 - Linked data-services-streams
 
Apprendre Via les Objets Xin Chen
Apprendre Via les Objets  Xin ChenApprendre Via les Objets  Xin Chen
Apprendre Via les Objets Xin Chen
 
From Linked Data to Semantic Applications
From Linked Data to Semantic ApplicationsFrom Linked Data to Semantic Applications
From Linked Data to Semantic Applications
 
Omitola birmingham cityuniv
Omitola birmingham cityunivOmitola birmingham cityuniv
Omitola birmingham cityuniv
 
ChemConnect: Poster for European Combustion Meeting 2017
ChemConnect: Poster for European Combustion Meeting 2017ChemConnect: Poster for European Combustion Meeting 2017
ChemConnect: Poster for European Combustion Meeting 2017
 
Ontotext Overview Winter 2012
Ontotext Overview Winter 2012Ontotext Overview Winter 2012
Ontotext Overview Winter 2012
 
Crowdsourcing tasks in Linked Data management
Crowdsourcing tasks in Linked Data managementCrowdsourcing tasks in Linked Data management
Crowdsourcing tasks in Linked Data management
 
Linked data introduction w exempel
Linked data introduction w exempelLinked data introduction w exempel
Linked data introduction w exempel
 
Rapid and flexible clinical data extraction and processing using Apache OODT
Rapid and flexible clinical data extraction and processing using Apache OODTRapid and flexible clinical data extraction and processing using Apache OODT
Rapid and flexible clinical data extraction and processing using Apache OODT
 
Resource Description Framework Approach to Data Publication and Federation
Resource Description Framework Approach to Data Publication and FederationResource Description Framework Approach to Data Publication and Federation
Resource Description Framework Approach to Data Publication and Federation
 
ReVeaLD: A User-driven Domain Specific Interactive Search Platform for Biomed...
ReVeaLD: A User-driven Domain Specific Interactive Search Platform for Biomed...ReVeaLD: A User-driven Domain Specific Interactive Search Platform for Biomed...
ReVeaLD: A User-driven Domain Specific Interactive Search Platform for Biomed...
 
Linked Data at the Open University: From Technical Challenges to Organization...
Linked Data at the Open University: From Technical Challenges to Organization...Linked Data at the Open University: From Technical Challenges to Organization...
Linked Data at the Open University: From Technical Challenges to Organization...
 
EDF2012 Peter Boncz - LOD benchmarking SRbench
EDF2012   Peter Boncz - LOD benchmarking SRbenchEDF2012   Peter Boncz - LOD benchmarking SRbench
EDF2012 Peter Boncz - LOD benchmarking SRbench
 
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
 
TripFS presentation at ldow 2010
TripFS presentation at ldow 2010TripFS presentation at ldow 2010
TripFS presentation at ldow 2010
 
AELA: An Adaptive Entity Linking Approach
AELA: An Adaptive Entity Linking ApproachAELA: An Adaptive Entity Linking Approach
AELA: An Adaptive Entity Linking Approach
 
Scalable and privacy-preserving data integration - part 1
Scalable and privacy-preserving data integration - part 1Scalable and privacy-preserving data integration - part 1
Scalable and privacy-preserving data integration - part 1
 

Mehr von Mikel Egaña Aranguren, Ph.D.

Mehr von Mikel Egaña Aranguren, Ph.D. (11)

Life Sciences Linked Data
Life Sciences Linked DataLife Sciences Linked Data
Life Sciences Linked Data
 
Populous swat4ls slides_slideshare
Populous swat4ls slides_slidesharePopulous swat4ls slides_slideshare
Populous swat4ls slides_slideshare
 
OPPL-Galaxy: Enhancing ontology exploitation in Galaxy with OPPL
OPPL-Galaxy: Enhancing ontology exploitation in Galaxy with OPPLOPPL-Galaxy: Enhancing ontology exploitation in Galaxy with OPPL
OPPL-Galaxy: Enhancing ontology exploitation in Galaxy with OPPL
 
Medioambiente Linked Data
Medioambiente Linked DataMedioambiente Linked Data
Medioambiente Linked Data
 
Applying sw mikel_egana
Applying sw mikel_eganaApplying sw mikel_egana
Applying sw mikel_egana
 
Mikel egana itbam_2010_ogo_system
Mikel egana itbam_2010_ogo_systemMikel egana itbam_2010_ogo_system
Mikel egana itbam_2010_ogo_system
 
Aplicación de la Web Semántica en Bioinformática
Aplicación de la Web Semántica en BioinformáticaAplicación de la Web Semántica en Bioinformática
Aplicación de la Web Semántica en Bioinformática
 
Métodos y Resultados Actuales en Bioinformática: know-how y know-what de las ...
Métodos y Resultados Actuales en Bioinformática: know-how y know-what de las ...Métodos y Resultados Actuales en Bioinformática: know-how y know-what de las ...
Métodos y Resultados Actuales en Bioinformática: know-how y know-what de las ...
 
Transforming the Axiomisation of Ontologies: The Ontology Pre-Processor Language
Transforming the Axiomisation of Ontologies: The Ontology Pre-Processor LanguageTransforming the Axiomisation of Ontologies: The Ontology Pre-Processor Language
Transforming the Axiomisation of Ontologies: The Ontology Pre-Processor Language
 
Ontology Design Patterns (ODPs) for bio-ontologies
Ontology Design Patterns (ODPs) for bio-ontologiesOntology Design Patterns (ODPs) for bio-ontologies
Ontology Design Patterns (ODPs) for bio-ontologies
 
Applying Ontology Design Patterns in bio-ontologies
Applying Ontology Design Patterns in bio-ontologiesApplying Ontology Design Patterns in bio-ontologies
Applying Ontology Design Patterns in bio-ontologies
 

Kürzlich hochgeladen

Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.
MateoGardella
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
PECB
 

Kürzlich hochgeladen (20)

Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 

Linked data functional genomics

  • 1. Linked Data for  functional genomics Mikel Egaña Aranguren 3205 School of Computer Science Universidad Politécnica de Madrid (UPM) 28660 Boadilla del Monte Spain Ontology Engineering Group (OEG) http://www.oeg-upm.net megana@fi.upm.es http://mikeleganaaranguren.com http://www.slideshare.net/MikelEganaAranguren/linked-data-functional-genomics 1/12/2011
  • 2. Index What is Linked Data? Publishing Linked Data Consuming Linked Data Issues with (Life Sciences) Linked Data Conclusions Linked Data for functional genomics
  • 4. What is Linked Data? A first step towards the Semantic Web Query RDF “Schema” for RDF “The HTML for data” Identify things on the net Linked Data for functional genomics
  • 5. What is Linked Data? Linked Data principles 1) Use URIs as names for things 2) Use HTTP URIs so that people can look up those names 3) When someone looks up a URI, provide useful information, using the  standards (RDF, SPARQL) 4) Include links to other URIs so that they can discover more things http://www.w3.org/DesignIssues/LinkedData.html Linked Data for functional genomics
  • 6. What is Linked Data? With Linked Data we publish data Semantically: the data model is explicit for computers (RDF triple) Predicate Subject Object Linked Data for functional genomics
  • 7. What is Linked Data? With Linked Data we publish data Semantically: the data model is explicit for computers (RDF triple) Predicate Subject Object participates_in Q5SIF0 ATP binding Linked Data for functional genomics
  • 8. What is Linked Data? With Linked Data we publish data Semantically: the data model is explicit for computers (RDF triple) Predicate Subject Object participates_in Q5SIF0 ATP binding Inter-linked: the data is linked to data from other resources over the web participates_in part_of Predicate Predicate Q5SIF0 ATP binding Signal transduction Subject Subject / Object Object Linked Data for functional genomics
  • 9. What is Linked Data? Global network of Linked Data part_of participates_in ATP binding Signal transduction Q5SIF0 is_regulated_by has_part Regulation of MAP Purinergic nucleotide kinase activity receptor activity Linked Data for functional genomics
  • 10. What is Linked Data? Internet of data rather than documents, a “universal DB” Find preciselly what we are looking for: direct queries rather than text  processing (SPARQL) Linking new data is as easy as linking a web page We can navigate through the data directly (RDF), rather than navigating  through documents that represent data in natural language (HTML) Build applications that exploit the data Apply automated reasoning on the data Linked Data for functional genomics
  • 11. What is Linked Data? Linked Open Data (LOD) cloud http://richard.cyganiak.de/2007/10/lod/lod-datasets_2011-09-19_colored.html Linked Data for functional genomics
  • 12. What is Linked Data? A graph is a collection of RDF triples A triple store holds different graphs Linked Data for functional genomics
  • 13. What is Linked Data? Human HTML  Content negotiation Computer RDF Linked Data for functional genomics
  • 14. What is Linked Data? Human HTML  Content negotiation Computer RDF Linked Data for functional genomics
  • 15. What is Linked Data? Human HTML  Content negotiation Computer RDF Human HTML  Content negotiation Computer RDF Linked Data for functional genomics
  • 16. What is Linked Data? Biomolecule rdfs:subClassOf Vocabulary Protein Molecular Function (Ontology) Instances rdf:type rdf:type rdf:type participates_in Q5SIF0 ATP binding owl:sameAs CycB Dataset  Dataset  Linked Data for functional genomics
  • 18. Consuming Linked Data Navigate Query Meshups Linked Data for functional genomics
  • 19. Consuming Linked Data Bio2RDF (http://bio2rdf.org/) OGOLOD (http://miuras.inf.um.es/~ogo/ogolod.html) LinkedLifeData (http://linkedlifedata.com/) HyQue* (http://semanticscience.org/projects/hyque/index.html) ArrayExpress and Gene expression atlas*  (http://www.ebi.ac.uk/arrayexpress/) * not really LD, close to LD, but (likely) soon will be full LD Linked Data for functional genomics
  • 20. Consuming Linked Data Navigate Linked Data for functional genomics
  • 21. Consuming Linked Data Navigate Linked Data for functional genomics
  • 22. Consuming Linked Data Query different resources combining the information Select all located in Y-chromosome, human genes with known molecular  interactions, which are analysed with 'Transfection' PREFIX skos: <http://www.w3.org/2004/02/skos/core#> PREFIX gene: <http://linkedlifedata.com/resource/entrezgene/> PREFIX core: <http://purl.uniprot.org/core/> PREFIX biopax2: <http://www.biopax.org/release/biopax-level2.owl#> PREFIX lifeskim: <http://linkedlifedata.com/resource/lifeskim/> PREFIX umls: <http://linkedlifedata.com/resource/umls/> PREFIX pubmed: <http://linkedlifedata.com/resource/pubmed/> SELECT distinct ?genedescription ?prefLabel WHERE {     ?p biopax2:PHYSICAL-ENTITY ?protein .     ?protein skos:exactMatch ?uniprotaccession .     ?uniprotaccession core:organism <http://purl.uniprot.org/taxonomy/9606> .     ?geneid gene:uniprotAccession ?uniprotaccession .     ?geneid gene:description ?genedescription .     ?geneid gene:pubmed ?pmid .     ?geneid gene:chromosome 'Y' .     ?pmid lifeskim:mentions ?umlsid .     ?umlsid skos:prefLabel 'Transfection' .     ?umlsid skos:prefLabel ?prefLabel . }  (http://linkedlifedata.com/sparql) Linked Data for functional genomics
  • 23. Consuming Linked Data Query different resources combining the information We will receive only the triples of that triple store (but we can follow the  links to the triples stored in other triple stores!) Linked Data for functional genomics
  • 24. Consuming Linked Data For retrieving triples from other triple stores we need federated  queries: SERVICE keyword in SPARQL 1.1 PREFIX foaf:   <http://xmlns.com/foaf/0.1/> SELECT ?name FROM <http://example.org/myfoaf.rdf> WHERE {   <http://example.org/myfoaf/I> foaf:knows ?person .   SERVICE <http://people.example.org/sparql> {      ?person foaf:name ?name . }  } http://www.w3.org/TR/sparql11-federated-query/ Linked Data for functional genomics
  • 25. Consuming Linked Data Hypotheses evaluation with HyQue* http://semanticscience.org/projects/hyque/ PREFIX hybrow: <http://bio2rdf.org/hybrow:> PREFIX semsci: <http://semanticscience.org/resource/> PREFIX bio2rdf: <http://bio2rdf.org/ns/bio2rdf:> select DISTINCT * where { ?event rdfs:label ?label . ?event rdf:type ?event_type . ?event_type rdfs:label ?event_type_label . ?event hybrow:is_negated ?negated . ?event hybrow:physical_context ?event_location . ?event hybrow:physical_operator ?physical_operator . ?event hybrow:agent_a ?actor . ?event hybrow:agent_b ?target . OPTIONAL {         { ?actor rdfs:subClassOf ?actor_type  } UNION { ?actor rdf:type ?actor_type } } OPTIONAL { { ?target rdfs:subClassOf ?target_type  } UNION { ?target rdf:type ?target_type }         } ?actor semsci:isLocatedIn ?actor_gp_id_location . ?actor_gp_id_location rdf:type ?actor_location_type . ?target semsci:isLocatedIn ?target_gp_id_location . ?target_gp_id_location rdf:type ?target_location_type . ?actor semsci:hasFunction ?actor_gp_id_function . ?actor_gp_id_function rdf:type ?actor_function_type . } Linked Data for functional genomics
  • 26. Consuming Linked Data Meshups: applications that consume LOD  Combining information from different datasets and/or non LOD resources  (e.g. Google maps) e.g. specific visualisations e.g. “follow your nose” applications Linked Data for functional genomics
  • 28. Publishing Linked Data XML Flat file DB Link  discovery Linked Data for functional genomics
  • 29. Publishing Linked Data Announce your data Comprehensive Knowledge Archive Network (http://ckan.org/) Semantic Web index (http://sindice.com/) Linked Data for functional genomics
  • 30. Publishing Linked Data Why publish our data in the LOD? Linked Data for functional genomics
  • 31. Publishing Linked Data Why publish our data in the LOD? It's the links, stupid Linked Data for functional genomics
  • 32. Publishing Linked Data Why publish our data in the LOD? It's the links, stupid Only publish our data, reference the rest: don't need to  duplicate external DB in ours Linked Data for functional genomics
  • 33. Publishing Linked Data Why publish our data in the LOD? It's the links, stupid Only publish our data, reference the rest: don't need to  duplicate external DB in ours External info is updated independently and we get the benefit  to our dataset because it's linked to it, without extra effort Linked Data for functional genomics
  • 34. Publishing Linked Data Why publish our data in the LOD? It's the links, stupid (II) Linked Data for functional genomics
  • 35. Publishing Linked Data Why publish our data in the LOD? It's the links, stupid (II) By using (HTTP) URIs, others can link to us Linked Data for functional genomics
  • 36. Publishing Linked Data Why publish our data in the LOD? It's the links, stupid (II) By using (HTTP) URIs, others can link to us Increasing the potential for our data to be discovered Linked Data for functional genomics
  • 37. Publishing Linked Data Why publish our data in the LOD? It's the semantics, stupid Linked Data for functional genomics
  • 38. Publishing Linked Data Why publish our data in the LOD? It's the semantics, stupid The meaning of our data is easily machine processable due to  RDF (“instances”) and OWL (“schema”) Linked Data for functional genomics
  • 39. Issues with (Life Sciences)  Linked Data Linked Data for functional genomics
  • 40. Issues with (Life Sciences) Linked Open Data Provenance (e.g. For Microarray data) Shared identifiers http://identifiers.org/ http://sharedname.org Dataset quality Ontology modelling Consensus ontologies Lack of ontologies Inference To generate triples At query time Linked Data for functional genomics
  • 42. Conclusions Linked Data offers a straight method to publish data  semantically in the current web: Key 1: use URIs for each and every data item Key 2: link data items to internal and external data Key 3: represent data with RDF (and OWL) Already existing web technology (URI + HTTP) will do the rest  smoothly for us Knowledge discovery Knowledge exploitation Linked Data for functional genomics
  • 43. Conclusions Linked Data is here to stay Already used by many, including governments, BBC, ... A first usable version of the Semantic Web with great potential Still issues to be solved in the Life Sciences Linked Data  Linked Data for functional genomics
  • 44. More information Semantic Web Health Care and Life Sciences (HCLS) Interest  Group at W3C: http://www.w3.org/blog/hcls LD Best practices A. Hogan, A. Harth, A. Passant, S. Decker, and A. Polleres. Weaving     the Pedantic Web. In Linked Data on the Web Workshop (LDOW2010)     at WWW’2010, 2010. http://patterns.dataincubator.org/book/ Ontology Engineering Group (oeg-upm.net) Gov. Information on LOD (GeoLinkedData, Aemet, ...) OGOLOD Linked Data tools (ODEmapster, ...) Linked Data for functional genomics
  • 45. Acknowledgements I'm funded by the Marie Curie Cofund programme (FP7)  I unashamedly recycled stuff from presentations by Marc- Alexandre Nolin and Eric Prud’hommeaux I'm learning a lot at the HCLS IG W3C NTNU provided the travel/accomodation due to Martin Kuiper's  invitation Linked Data for functional genomics