Triples & Access

Jan Velterop
“There is something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact.”
Mark Twain, Life on the Mississippi
Oh yeah?
We have far too few returns in terms of usable knowledge out of such an overwhelming investment of fact!
A lot of fact is deeply hidden!
Current Knowledge Transfer
A metaphor (is Greek for ‘truck’ after all)
Needle transport
Information overload?
Too much knowledge?
Stop acquiring it?
Just filtering it?
Or organisation underload?
Lack of conceptual structure?
Unprecedented opportunity?
Unprecedented opportunity!
Another metaphor: What is the use of water?
H2O
Drink (take in)
What is the use of information?
Age to Know
Read (take in)
Publish articles
Stretching the water metaphor: It’s already raining – we must build the ark
The ‘animals’ to come on board:
Slide by Carl Lagoze (Cornell) – from this presentation:
http://journal.webscience.org/112/3/orechem.pdf
Stretching the metaphor further: If you need water, rain is free
But if you want quality control and convenience:
[Diagram: from all triples to smart triples]
Triple model: < Source concept > (node 1, unique ID), < Relation (edge) >, < Target concept > (node 2, unique ID)
Edge attributes: class, date, value, owner, condition, DOI

All Triples (curated, co-occurrence): remove ambiguity and redundancy
Smart Triples: Curated, Observational, Inferred
Knowledge Space
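
The triple model on this slide (two uniquely identified concept nodes joined by a relation, carrying attributes such as class, date, value, owner, condition and DOI) could be sketched as a small data structure. The Python below is an illustration only, not the Concept Web Alliance's implementation: the `Triple` class, the `to_smart_triples` helper, the synonym table, the `associated_with` relation and the placeholder DOI are all hypothetical. It assumes that "removing ambiguity" means mapping synonyms onto one concept identifier and that "removing redundancy" means merging triples that assert the same source, relation and target.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    source: str          # node 1, unique concept ID
    relation: str        # edge label
    target: str          # node 2, unique concept ID
    klass: str = ""      # "class" attribute, e.g. curated / observational / inferred
    date: str = ""
    value: float = 0.0
    owner: str = ""
    condition: str = ""
    doi: str = ""


def to_smart_triples(triples, synonyms):
    """Merge triples that make the same assertion once synonyms are resolved.

    Returns a dict mapping (source, relation, target) to the list of
    original triples that support that assertion.
    """
    grouped = defaultdict(list)
    for t in triples:
        key = (
            synonyms.get(t.source, t.source),   # resolve ambiguity on node 1
            t.relation,
            synonyms.get(t.target, t.target),   # resolve ambiguity on node 2
        )
        grouped[key].append(t)                  # redundancy collapses onto one key
    return dict(grouped)


# Hypothetical example data, invented for illustration.
synonyms = {"IL7": "Interleukin-7", "IL-7": "Interleukin-7"}
triples = [
    Triple("IL7", "associated_with", "T-Cell Development",
           klass="curated", doi="10.0000/placeholder"),
    Triple("IL-7", "associated_with", "T-Cell Development",
           klass="observational"),
    Triple("Interleukin-7", "associated_with", "T-Cell Development",
           klass="inferred"),
]

smart = to_smart_triples(triples, synonyms)
for assertion, evidence in smart.items():
    print(assertion, [t.klass for t in evidence])
```

Keeping the original triples as evidence under each merged assertion is one possible design choice; it preserves the provenance attributes while exposing a single edge per assertion.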
[Diagram: triple types and graph building]
Triple model: < Source concept > (node 1, unique ID), < Relation (edge) >, < Target concept > (node 2, unique ID)
Edge attributes: class, date, value, author, condition, DOI

F+
  <Type F1> Database facts (multiple attributes)
  <Type F2> Community Annotations
C+
  <Type C1> Co-occurrence sentence (abstracts, e.g. PubMed)
  <Type C2> Co-occurrence Full Text (publisher, e.g. Springer)
A+
  <Type A1> Concept Profile Match
  <Type A3> Co-expression (gene expression databases)
  <Type A4> Modelling hypothesis (e.g. Plectix, InWeb)

Multiple Triples
Graph Building (e.g. WikiPathways)
Graph labels: T-Cell Development, Cancer Promoting Genes, Interleukin-7
Edge labels: Unique to 101668678, Unique to Springer, Unique to Plectix
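
The type codes on this slide fall into three families (F+, C+ and A+), and one edge can be backed by several triples of different types. A minimal Python sketch of that bookkeeping follows. The descriptions are copied from the slide, but the `family` and `summarize` helpers and the example list of supporting types are hypothetical, and the rule that a family is simply the first letter of the type code is an assumption read off the slide layout.

```python
from collections import Counter

# Type codes and descriptions as listed on the slide.
TRIPLE_TYPES = {
    "F1": "Database facts (multiple attributes)",
    "F2": "Community Annotations",
    "C1": "Co-occurrence sentence (abstracts, e.g. PubMed)",
    "C2": "Co-occurrence Full Text (publisher, e.g. Springer)",
    "A1": "Concept Profile Match",
    "A3": "Co-expression (gene expression databases)",
    "A4": "Modelling hypothesis (e.g. Plectix, InWeb)",
}


def family(type_code):
    """Return the family label (F+, C+ or A+) for a known type code."""
    if type_code not in TRIPLE_TYPES:
        raise KeyError(f"unknown triple type: {type_code}")
    return type_code[0] + "+"   # assumption: family = first letter of the code


def summarize(type_codes):
    """Count how many supporting triples an edge has in each family."""
    return Counter(family(code) for code in type_codes)


# Hypothetical edge supported by four triples of different types.
supporting_types = ["F1", "C1", "C2", "A4"]
counts = summarize(supporting_types)
for fam in ("F+", "C+", "A+"):
    print(fam, counts[fam])
```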
[Diagram: where value is added to the triples]
Triple model: < Source concept > (node 1, unique ID), < Relation (edge) >, < Target concept > (node 2, unique ID)
Edge attributes: class, date, value, owner, condition, etc.

Triples to Smart Triples:
  Curated: remove ambiguity and redundancy
  Observational: remove ambiguity and redundancy
  Inferred; constructed: remove ambiguity and redundancy
In these areas significant value is added to the triples
Knowledge Space
The ‘trustmark’ CWA™:

Triple ‘model’
Best practice
Interoperability
Et cetera
Download Concept Web Alliance certified triples

Includes edges from:

 PubMed (400,000,000 sentences, 5,000,000,000 concept co-occurrences) (from public data)
 Protein databases (UniProt, IntAct, PDB, HPRD – 75,000 human curated PPIs) (from public data)
 Gene co-expression databases (GEO, Express… – 25 square genes) (from public data)
 STRING edges (200,000 gene-gene edges) (from semi public data)
 InWeb edges (240,000 unique edges from 17 species) (from proprietary data)
 Reactome edges (240,000 unique edges from 17 species) (from proprietary data)
 ChemSpider edges (25,000,000 chemicals) (from semi public data)
 Wiki edges (WikEdge = WikiPathways, WikiProfessionals, Omegawiki, Wikigene)
 Plectix edges (5,000 extra edges, PPI modeling) (from proprietary data)
 Private expression data (3,000 extra edges, by Merck) (from proprietary data)
 Et cetera
