Más contenido relacionado

Similar a Exploration of a Data Landscape using a Collaborative Linked Data Framework.(20)


Exploration of a Data Landscape using a Collaborative Linked Data Framework.

  1. Exploration of a Data Landscape using a Collaborative Linked Data Framework Laurent Alquier, Project Lead, Informatics CoE
  2. Scientists are looking for answers to translational questions Background images from Flickr - CC Share Alike 2.0 Basic Research Prototype Design or Discovery Preclinical Development Clinical Development FDA Filing, Approval & Launch Prep Translational Research Critical Path Research
  3. Instead they research how to find and access data. Which data source should I choose ? Can I access it ? Who should I talk to ? Scientists want to do Science, not Data Management. Which interfaces are available ?
  4. knowIT helps scientists and IT manage what they know about Information Systems.
  5. Collaborative cataloging of data sources
  6. Structured data inside wiki pages
  7. Advantages of Semantic MediaWiki (SMW) Wiki pages are subjects for Semantic annotations (triples : page – property – value) MediaWiki knowledge management tools Many ways to input and output content.
  8. Overview of the Linked Data Framework RDF CSV XML RDBMS Docs RDB2RDF mapping TM J&J Ontol. Data catalog Data browser & Visualization Ontology Curation J&J RDF SPARQL Semantic Wiki
  9. Reviewing W3C’s Translational Medicine Ontology (TMO) as a base for J&J Ontology
  10. Ask questions using federated queries
  11. Integration with analytical tools
  12. Improved enterprise search
  13. Analysis of data sources content Relfinder
  14. Visualization of relationships between data sources
  15. Next steps Automation and integration with existing ontologies Enable self-describing data sources Dynamic, layered map of data landscape Incorporate additional data sources Drive broader awareness and adoption of the tool

Hinweis der Redaktion

  1. Laurent Alquier Project Lead in the Informatics CoE group, in J&J Pharma R&D. Between R&D and IT. We provide informatics solutions to help researchers.
  2. Scientists want to ask translational questions such as…. Identify patient subgroups based on genetics/genomics. Which biomarkers predict disease progression of people in their 50s and 60s
  3. But they end up spending too much time on questions such as these…
  4. knowIT was designed to help capture what we know about IT systems and facilitate answering real questions about scientific data.
  5. We have used knowIT to catalogue information about data sources…. ++++++++++++++++++++ Built on top of Semantic MediaWiki As a wiki, it is easy to use And allows customization and community features necessary to encourage participation Description of data sources using selected metadata Content updated directly by IT, scientists and data source owners 170 internal and external data sources catalogued
  6. As a Semantic Wiki Data can be accessed through search, faceted navigation, and SPARQL queries SMW gives structure to content of wiki pages And enables queries
  7. SMW  allows direct translation of page content into RDF triples Includes knowledge management tools (from MediaWiki’s experience with Wikipedia) Provides many ways to import and export content
  8. The data source catalog in knowIT is only a first step in a larger Linked Data Framework in J&J With a first application to Neuroscience Translational medicine and open innovation are key drivers Require better understanding of data sources
  9. Here are the details of the architecture Data source catalog and mapping of SPARQL endpoints (knowIT) Relational to RDF mapping (D2R) Triple stores (4store, OpenRDF) Repository for ontological concepts Ontologies uploaded into OpenRDF Sesame triple store
  10. We are using NIF, but exploring adding TMO as a high level ontology framework Used to dereference URIs discovered within data cloud
  11. The framework also provides a C# API for easy access within our application development platform
  12. This API allows the development of visual browsing and query applications 3DX can be used to find data sources
  13. Bring data into a spreadsheet environment, from which a rich array of analytics can be invoked
  14. The SPARQL interface to knowIT content is used to provide incremental updates for our enterprise search engine
  15. … and to analyze relationships within the content of each data source (RelFinder)
  16. ... as well as to visualize relationships between data sources
  17. Next steps will include automation, integration with existing ontologies, self describing data sources, dynamic visualizations as well as a broader adoption.
  18. Finally, I would like to take this occasion to thank …
  19. Questions ?