This document discusses the transition from BioMoby to SADI as a framework for semantic web services. It provides statistics on BioMoby usage and describes demonstrations of how SADI allows complex queries to be answered by discovering and executing relevant web services without a centralized database. The author's vision is for SADI to support the scientific method by enabling personal ontologies and hypotheses to be explicitly expressed and evaluated dynamically.
2. BioMoby Stats in a nutshell: >1800 services worldwide (~1300 "alive" at any given time). 4 major installations of the Moby Service registry: Genome Canada, SUN Center of Excellence, Calgary; Genome España, Barcelona Supercomputing Center; International Rice Research Institute, Philippines; Max Planck, Cologne. Canadian service registry brokers ~400,000 requests/month. Canadian BioMoby services receive ~700,000 uses/month. Canadian server just had a significant memory upgrade to improve performance. "The report of my death was an exaggeration" -- Mark Twain
3. Model Organism Bring Your-Own Database Interface Conference "MOBY-DIC", Emma Lake, Saskatchewan, Sept 21, 2001
6. The Holy Grail (this slide created circa 2002): Align the promoters of all serine threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels. Retrieve and align 2000nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M | A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% homologous in the active site to kinases known to be involved in cell-cycle regulation in any other species.
9. Imagine there is a "virtual database" containing all of the data from all of the databases, together with the output of every conceivable analysis
11. "SHARE": Semantic Health And Research Environment. SADI client application. http://biordf.net/cardioSHARE (Pellet) http://dev.biordf.net/cardioSHARE (Pellet 2)
12. What pathways does UniProt protein P47989 belong to?
    PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
    PREFIX ont: <http://ontology.dumontierlab.com/>
    PREFIX uniprot: <http://lsrn.org/UniProt:>
    SELECT ?gene ?pathway
    WHERE {
        uniprot:P47989 pred:isEncodedBy ?gene .
        ?gene ont:isParticipantIn ?pathway .
    }
16. Recap: what we just saw. A standard SPARQL query was entered into SHARE, a SADI-aware query engine.
17. Recap: what we just saw. The query was interpreted to extract the "triple" patterns (subject, predicate, object) being requested.
18. Recap: what we just saw. Triple-patterns are passed to SADI for Web Service discovery.
19. Recap: what we just saw. Services capable of generating those triple-patterns are automatically executed, the triples are stored, and the query is resolved.
20. Recap: what we just saw. We posed, and answered, a ~complex database query WITHOUT A DATABASE (in fact, the data didn't even have to exist...)
21. Recap: what we just saw. Note that there is no centralized ontology. Unlike BioMoby, SADI supports all (OWL) ontologies and does not invent any of its own.
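The recap steps (extract triple patterns, discover a service per predicate, execute it, store the triples, resolve the query) can be sketched in a few lines of Python. This is only an illustration of the idea, not SHARE's or SADI's actual implementation: the registry, the two service functions, and the gene and pathway identifiers are all hypothetical stand-ins, and the sketch only handles patterns whose subject is already bound.

```python
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# HYPOTHETICAL stand-ins for remote services. The identifiers they return
# are invented for illustration and are not real query results.
def encoded_by(subject: str) -> List[str]:
    return {"uniprot:P47989": ["gene:EXAMPLE1"]}.get(subject, [])

def participant_in(subject: str) -> List[str]:
    return {"gene:EXAMPLE1": ["pathway:EXAMPLE_A", "pathway:EXAMPLE_B"]}.get(subject, [])

# HYPOTHETICAL registry: predicate -> service able to generate that predicate.
REGISTRY: Dict[str, Callable[[str], List[str]]] = {
    "pred:isEncodedBy": encoded_by,
    "ont:isParticipantIn": participant_in,
}

def resolve(patterns: List[Triple]) -> Tuple[List[Dict[str, str]], Set[Triple]]:
    """Resolve triple patterns in order by discovering and calling services."""
    store: Set[Triple] = set()            # triples generated along the way
    bindings: List[Dict[str, str]] = [{}]  # one dict of variable bindings per row
    for s, p, o in patterns:
        service = REGISTRY.get(p)          # "discovery" by predicate
        if service is None:
            return [], store               # no service can generate this pattern
        next_bindings: List[Dict[str, str]] = []
        for row in bindings:
            subject = row.get(s, s)        # substitute an already-bound variable
            if subject.startswith("?"):
                continue                   # unbound subjects are out of scope here
            for obj in service(subject):   # "execution"
                store.add((subject, p, obj))
                if o.startswith("?"):
                    next_bindings.append({**row, o: obj})
                elif o == obj:
                    next_bindings.append(row)
        bindings = next_bindings
    return bindings, store

patterns = [
    ("uniprot:P47989", "pred:isEncodedBy", "?gene"),
    ("?gene", "ont:isParticipantIn", "?pathway"),
]
rows, store = resolve(patterns)
for row in rows:
    print(row["?gene"], row["?pathway"])
```

Nothing is looked up in a pre-existing database: every triple in `store` is produced on demand by whichever service advertises the requested predicate, which is the point the recap slides make.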
22. Holy Grail Demo #1 Align the promoters of all serine threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels. Retrieve and align 2000nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M |A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% homologous in the active site to kinases known to be involved in cell-cycle regulation in any other species.
24. Show me the latest Blood Urea Nitrogen and Creatinine levels of patients who appear to be rejecting their transplants
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#>
    PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#>
    SELECT ?patient ?bun ?creat
    FROM <http://sadiframework.org/ontologies/patients.rdf>
    WHERE {
        ?patient rdf:type patient:LikelyRejecter .
        ?patient l:latestBUN ?bun .
        ?patient l:latestCreatinine ?creat .
    }
25. Start burrowing through the LikelyRejecter OWL class and find that we need a regression model OWL class
26. Regression models have features like slopes and intercepts, and so on. The class is completely decomposed until a set of required Services are discovered that are capable of creating all these necessary properties
27. Decomposition of the OWL class uncovers the need for a Linear Regression analysis on the patient blood chemistry data
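A minimal sketch of the kind of analysis that decomposition leads to: fit a least-squares line to each patient's blood chemistry over time and use the slope to decide class membership. The transcript does not give the actual definition of the LikelyRejecter class, so the rule used here (a creatinine slope above a threshold) is purely an assumption for illustration, as are the threshold value, the function names, and the patient data.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (time, measurement)

def linear_fit(points: List[Point]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# ASSUMPTION: the real class definition and threshold are not in the slides.
SLOPE_THRESHOLD = 0.0

def likely_rejecters(series: Dict[str, List[Point]],
                     threshold: float = SLOPE_THRESHOLD) -> List[str]:
    """Patients whose fitted slope exceeds the (assumed) threshold."""
    return sorted(pid for pid, pts in series.items()
                  if linear_fit(pts)[0] > threshold)

# Invented creatinine readings (time in days, arbitrary units).
creatinine = {
    "patient_a": [(0.0, 1.0), (1.0, 1.5), (2.0, 2.0)],  # rising
    "patient_b": [(0.0, 1.2), (1.0, 1.1), (2.0, 1.0)],  # falling
}
print(likely_rejecters(creatinine))
```

In the workflow the slides describe, the slope and intercept would be attached to the patient as properties by a discovered service, and a reasoner would then decide membership from the class definition rather than from a hard-coded rule like this one.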
29. We just dynamically evaluated if individuals matching a particular high-level concept definition exist... or can exist
30. Holy Grail Demo #2 Align the promoters of all serine threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels. Retrieve and align 2000nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M |A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% homologous in the active site to kinases known to be involved in cell-cycle regulation in any other species.
37. The Scientific Method. Discourse: What do you believe? What do I believe? Disagreement: You're wrong! And I'm gonna prove it! Clarity: This is the experiment I am going to do. Reproducibility: This is how I did it ("provenance"). Clarity: This is my new hypothesis.
38. The Scientific Method. Discourse: What do you believe? What do I believe? Disagreement: You're wrong! And I'm gonna prove it! Clarity: This is the experiment I am going to do. Reproducibility: This is how I did it ("provenance"). Clarity: This is my new hypothesis. Workflows (e.g. myExperiment)
40. In opposition to the lessons we learnt from Web 2.0, the Semantic Web in Healthcare and Life Sciences is solving the problems of science... by forming institutions
41. Result: Large, centrally-designed and centrally-curated ontologies that enforce "community agreement" about "biological reality"
57. Sharing my ontology also gives opportunities for micro-attribution; "Citation" of me is transparent and automatic when someone extends my ontology
58. Using SADI and SHARE, my personal world-view is explicitly expressed and can be dynamically evaluated against global data and knowledge
65. The "Likely Rejecter" OWL Class is an explicitly-expressed hypothesis; Members of that class may or may not exist!
68. Ontologically-expressed Hypotheses drive the discovery, assembly, and analysis of data capable of evaluating their validity. [Diagram labels: Hypothesis; Ischemia; SADI + SHARE; Hypertension; Blood Pressure; Analytical Algorithm; Database 1; Database 2]
69. Join us! SADI and CardioSHARE are Open-Source projects. Come join us, we're having a lot of fun!! http://sadiframework.org SADI Semantic Web Services Page #SADIFramework
70. C-BRASS: Canadian Bioinformatics Resources As Semantic Services, together with Michel Dumontier, Chris Baker. ~$1M funding to help us deploy SADI services and provide training for new service providers. We can help you get started! "C-BRASS" is on Facebook!
73. Credits: Benjamin VanderValk (SADI & CardioSHARE), Luke McCarthy (SADI & CardioSHARE), Soroush Samadian (CardioSHARE), IO Informatics (Knowledge Explorer API), Microsoft Research. Fin. This presentation available on SlideShare: keywords "wilkinson" "bosc"