
Heiner Oberkampf: Semantics for Integrated Analytical Laboratory Processes – the Allotrope Perspective

http://2015.semantics.cc/heiner-oberkampf

  1. SEMANTiCS, Industry, Vienna 2015, 16-17 September. Semantics for Integrated Analytical Laboratory Processes: The Allotrope Perspective. Heiner Oberkampf
  2. Agenda
       • Introduction
       • Approach and IT-Solution: Allotrope Data Format, Domain Taxonomies, Data Cube Ontology
       • Integration Projects
  3. Laboratory Analytical Processes: sample, data, analytical process
  4. High Variability of Result Data: chromatography, pH, thermogravimetry, HPLC-MS-MS, …, mass spectroscopy, HPLC-MS, cell counter, NMR
  5. Laboratory Analytical Processes: application 1, application 2, application 3; result data and process meta-data
  6. Common Problems
       • It's hard to find data based on intuitive starting points (e.g. study, project, analyst, technique).
       • It's hard to integrate data from different labs, instruments, or online/offline sources because the file format is different.
       • It's hard to mine a collection of data because the details and the context of the experiment are stored somewhere else.
       • Data can't be interpreted later because the context is incomplete, inconsistent, and often free text.
       • Instrument and software interoperability is limited… at best.
  7. Landscape of Existing Standards
       "The nice thing about standards is that there are so many to choose from." (Andrew S. Tanenbaum)
       DISCLAIMER: This is work in progress. It is not a complete list of standards but a tool for researching the standards. Allotrope is investigating numerous standards, but this graphic is not intended to represent standards Allotrope is committing to include in the framework.
       Labels in the graphic (standards and issuing bodies, in extraction order): UN/CEFACT Core Components Technical Specification 3.0 Batch ML W3C OWL 2.0 ISO ISO 11179 (Metadata Registry) 1999 ISO 19763 (Metamodel Interoperability) 2013 RDF 1.0 SKOS 2012 OMG Allotrope Foundation Common Warehouse Metamodel 1.1 2003 Common Terminology Services 2 1.1 2013 ISO 25694 (Thesauri) Unified Modeling Language 2.4.1 2012 ASTM AnIML 2.0 HL7 HL7 ISO 12000 (MARTIF) MESA ISO 19773 (Metadata Registry Modules) IETF RFC 2421 (Voice Profile) 2 1998 ISO 1087 (Terminology Vocabulary) 2000 ISO 11404 (General Purpose Datatypes) 2007 ISO 20944 (MDRIB) 2013 UPU S42-1 (Postal address components) 2003 ISO 2832 (IT Vocabulary) 1996-2000 UPU ISO 9899 (Programming Languages C) 1999 ISO 9945 (Filenames) RFC 3986 (URI) 2005 ISO 10646 (Unicode) ISO 646 (IA5 character code) ISO 19107 (Geographic Information) ISO 16684-1 (XMP) 2012 Adobe ISO 639 (Language Codes) ISO 3166 (Country Codes) RFC 2046 (MIME Types) RFC 3066 (Language Codes) OASIS ebXML Registry Information Model 2 3.0 2005 ebXML Registry Services Specification 2.0 2001 genericode 1.0 2007 RFC 2119 (Requirement Keywords) 1997 CMIS 1.1 2012 RFC 2616 (HTTP) 1.1 1999 RFC 3023 (XML Media Types) 2001 RFC 2045 (MIME Format) RFC 4287 (Atom Syndication) RFC 5023 (Atom Publishing) RFC 4918 (WebDAV) XML Schema Datatypes 2004 OData 4.0 ebXML RegRep 4.0 2012 ISO 15000-3 (ebRIM) 2004 XPath 2.0 2.0 2007 XMLDSig 2001 XLink 1.1 1.1 1999 SOAP 1.2 1.2 2003 ISO 19915 (Geographic Information Metadata) ISO 19119 (Geographic Information Services) 2005 LC MARC 21 XML Schema 1.2 2009 MIX 2.0 2006 PREMIS 2.2 2012 NISO Metadata Object Description Standard 3.5 2013 Metadata Authority Description Standard 2.0 2012 ISO 25577 (Information and Documentation - MarcXchange) ISO 20775 (Information and Documentation - Schema for Holdings Information) searchRetrieve 1.0 2013 Search/Retrieval via URL 2.0 Contextual Query Language 1.2 Dublin Core Metadata Element Set 1.1 UKOLN Encoded Archival Description 2002 2002 Text Encoding Initiative DDI Codebook 2.5 OAI Protocol for Metadata Harvesting 2.0 2002 OAI OAI Object Reuse and Exchange 1.0 2008 SPARQL 1.1 2013 ISO 704 (Terminology - Principles and methods) 2000 UNECE ISO 19504 (Common Warehouse Metamodel) Statistical Data and Metadata Exchange 2.1 2011 Common Metadata Framework DDI Alliance DDI Lifecycle 3.1 UNSC EDIFACT Meta Object Facility 1.4.1 2005 Ontology Definition Metamodel 1.0 2009 Information Management Metamodel UML Profile & Metamodel for Services 1.0.1 2012 Semantics of Business Vocabulary and Business Rules 1.2 2013 ISO 6093 (Number Namespace) Metadata Encoding & Transmission Standard 1.10 2013 ISO 15000-4 (ebRS) 2004 ISO 15489 (Records Management) 2001 ISO 23081 (Metadata for records) 2006 ISO 16363 (Audit and Certification of Trustworthy Digital Repositories) 2011 ISO 14721 (OAIS) 2012 Dublin Core Metadata Initiative ISO 15836 (DCMES) SWORD 2.0 2008 JISC BagIt ARK Identifiers ISO 26324 (Digital Object Identifier) 2012 RFC 3652 (Handle System Protocol) 2.1 2003 RFC 3650 (Handle System Overview) 2003 RFC 3651 (Handle System Namespace and Service Definition) 2003 ISO 13120 (ClaML) 2013 ISO 27951 (CTS1) 2009 ISO 27527 (Provider Identification) 2010 ISO 27932 (HL7 Clinical Document Architecture) 2009 ISO 27931 (HL7) 2009 ISO 17115 (Vocabulary for terminological systems) 2007 LMER 1.2 DNB RFC 2141 (URN Syntax) 1997 RFC 1737 (URN Requirements) 1994 RFC 4122 (UUID URN Namespace) 2005 ISO 20652 (PAIMAS) 2006 IMS Content Packaging 1.2 IMS Global Z39.50 (Information Retrieval) 4 2003 ISO 2709 (Format for information exchange) 2008 MARC 21 EAD 2002 FOAF Vocabulary 0.99 2014 FOAF Project RDF Best Practices CoolURIs RDF Vocabulary Description Language 1.0 2004 Extensible Resource Identifier 2.0 2005 RFC 2234 (ABNF) 1997 RFC 3987 (IRI) 2005 RFC 3305 (URI,URL,URN Clarifications) 2002 RFC 2396 (URI) 1998 XRI Data Interchange 2.0 2005 ISO 14533-2 (XAdES) 2012 Canonical XML 1.0 2001 Universal Business Language 2.1 2013 ISO 14662 (Open-edi) 2010 ISO 15000-5 (CCTS) 2005 Z39.88 (OpenURL) 1 2004 Z39.85 (DCMES) 1 2001 ISO 8601 (Dates and Times) 2000 ISO 62264 (B2MML) 2003-2008 ISA 95 2001-2005 ISA 88 ANSI ISO 21000-2 (MPEG-21 DID) 2005 ISO 21000-6 (MPEG-21 RDD) 2004 ISO 21000-7 (MPEG-21 DIA) 2007 ISO 21000-9 (MPEG-21 Fileformat) 2005 ISO 21000-18 (MPEG-21 Streaming) 2007 ISO 14496-12 (base media file format) 2012 RFC 6481 (Codecs) 2011 ISO 21000-3 (MPEG-21 DII) 2003 TIFF 6.0 1992 ISO 15444-1 (JPEG2000) 2004 JPEG UnitsML 1.0 2011 NIST hData 1.0 2013 RLUS 1.0.1 2011 LECIS 1.0 2003 ISO 21090 (Health informatics data types) IHE XDS SVS XUA SAML 2.0 2008 XACML 3.0 2013 ASTM E1986 (Access Privileges to Health Info) 2013 ASTM E1869 (Confidentiality, Privacy, Access and Data Security) 2010 ISO 19005-1b (PDF/A) CDA 2 2008 ISO 19510 (BPMN 2.0) 2013 BPMN 2.0.1 2011 SAA CDISC BRIDG 3.2 Define-XML 2.0 2013 ADaM 2.1 SDM-XML 1.0 CDISC-ODM 1.3.2 SEND 3.0 LAB 1.0.1 ISO 28500 (WARC) 2009 RFC 3629 (UTF-8) 2003 ISO 17025 (Competence of laboratories) 2005
       Legend (issuing bodies): ISO, W3C, IETF, OASIS, OMG, LC, CDISC, NISO, OAI
  8. Allotrope Data Format
  9. Allotrope Foundation
       • Member Companies: Subject Matter Experts, Project Funding
       • Secretariat: Project Management, Legal & Logistical Support
       • Professional Software Firm: Framework Development, Technical Leadership
       • Partner Network: Requirements & Specifications, Contributions, PoC Applications
  10. Allotrope Foundation (same structure as the previous slide, with organizations named)
       • Member Companies: AbbVie, Amgen, Baxter, Bayer, Biogen, Boehringer Ingelheim, Bristol-Myers Squibb, Eli Lilly, Genentech/Roche, GlaxoSmithKline, Merck & Co, Pfizer
       • Partner Network: ACD/Labs, Agilent, BIOVIA, BSSN, Erasmus MC, IDBS, Mestrelab Research, Mettler Toledo, Persistent, Riffyn, Sartorius, Shimadzu, Thermo Scientific, Univ. Southampton, Waters
  11. Allotrope Data Format (ADF): a universal data container built on HDF5, a platform-independent file format specifically designed to store and organize large amounts of numerical data.
       • Data Description (RDF Model). Contains: method, instrument, sample, process, result, etc.; data cube metadata; binary file metadata; …
       • Data Cubes. Analytical data represented by one- or multidimensional arrays.
       • Data Package (virtual file system; use is optional). Analytical data represented by arbitrary formats, incl. native instrument formats, images, PDF, video, etc.
  12. API Stack
       • Allotrope Framework provides APIs to read and write data contained in ADF.
       • Developers do not have to concern themselves with RDF, SPARQL, semantics or complex graph patterns.
       • Components shown: platform-independent file format (HDF5), Data Package API, Data Cube API, Data Description API (Apache Jena), Analytical Data API, Triple Store API, Taxonomies
  13. Allotrope Foundation Taxonomies (AFT)
  14. Scope and Current Status. Implemented analytical techniques:
       • Small molecules: gas chromatography, Karl Fischer, liquid chromatography, mass spectrometry, nuclear magnetic resonance spectroscopy, thermogravimetric analysis, ultraviolet spectroscopy
       • Large molecules: capillary electrophoresis, cell counter, cell culture analyzer, blood gas analysis
       • Both: balance, pH
       • Number of classes (chart values; their labels did not survive extraction): 562, 168, 2272, 283
  15. Reused Vocabularies and Ontologies
       • Used: RDFS, OWL, SKOS; Shapes Constraint Language (SHACL)
       • Directly imported: Quantities, Units, Dimensions and Data Types Ontologies (QUDT); the W3C RDF Data Cube Vocabulary (QB)
       • Partly reused definitions: Chemical Methods Ontology (CHMO); Proteomics Standards Initiative – Mass Spectrometry (PSI-MS); International Union of Pure and Applied Chemistry (IUPAC); …
  16. Analytical Workflow
  17. Analytical Workflow: the basic analytical workflow and data flow are standardized.
  18. Process
  19. Result: … n-dimensional result data is represented through a qb:DataSet …
  20. Example: Mass Spectrum
       • Data set of rank 2.
       • Additional dimensions: sample, retention time, device, …
       • Metadata is expressed in RDF. Numeric data is natively represented in HDF5.
       • Labels in the figure: mass, intensity, af-m:AFM_0000350, af-r:AFR_0000495
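The split described on this slide can be sketched roughly as follows. This is only an illustration, not the Allotrope API: a plain Python list stands in for the HDF5 array, and tuples stand in for RDF triples. The `ex:` names, the sample values, and the helper `describe_cube` are hypothetical, and treating mass as the dimension and intensity as the measure is an assumption. Only the `qb:` terms come from the W3C RDF Data Cube Vocabulary.

```python
# Numeric values: (mass, intensity) pairs. A plain list stands in for an HDF5 dataset.
spectrum = [(100.1, 1200.0), (150.2, 830.0), (200.3, 4100.0)]  # made-up values


def describe_cube(dataset, dimensions, measures):
    """Return (subject, predicate, object) triples describing a data cube.

    Hypothetical helper; only the qb: terms are real vocabulary terms.
    """
    structure = dataset + "-structure"
    triples = [
        (dataset, "rdf:type", "qb:DataSet"),
        (dataset, "qb:structure", structure),
        (structure, "rdf:type", "qb:DataStructureDefinition"),
    ]
    for kind, link, names in (
        ("qb:DimensionProperty", "qb:dimension", dimensions),
        ("qb:MeasureProperty", "qb:measure", measures),
    ):
        for name in names:
            # One component specification (a blank node) per dimension or measure.
            component = "_:component-" + name.split(":")[-1]
            triples.append((structure, "qb:component", component))
            triples.append((component, link, name))
            triples.append((name, "rdf:type", kind))
    return triples


# Assumption: mass is the dimension, intensity is the measure.
triples = describe_cube("ex:spectrum1", ["ex:mass"], ["ex:intensity"])
for s, p, o in triples:
    print(s, p, o)
```

The point of the separation is that the description stays small and queryable while the bulk numbers stay in an array format suited to them.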
  21. ADF Data Cube Ontology
       • ADF Data Cube API: create and access data cubes.
       • ADF Data Cube Ontology: extends the RDF Data Cube Vocabulary by scales, slabs, order functions and complex data types.
       • RDF Data Cube Vocabulary
       • ADF-HDF5 Mapping: mapping between RDF metadata descriptions and the description of physical storage in HDF5.
       • HDF5 Ontology: vocabulary of HDF5 entities and data types.
       • HDF5: platform-independent file format.
  22. ADF Data Cube Ontology. Building blocks shown: W3C RDF Data Cube Vocabulary; HDF5 Ontology; W3C RDF, OWL, SHACL; ADF Data Cube Ontology; ADF-HDF5 Mapping
  23. ADF Data Cube Ontology. Data Slabs: selections on components
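To make the idea of a slab as a selection on components concrete, here is a minimal sketch on a two-dimensional cube held as nested lists. The function name `select_slab`, the half-open `(start, stop)` ranges, and the sample cube are assumptions for illustration; the slides do not show how ADF encodes a slab.

```python
def select_slab(cube, row_range, col_range):
    """Return the sub-block of a 2-D cube given half-open (start, stop) ranges."""
    r0, r1 = row_range
    c0, c1 = col_range
    return [row[c0:c1] for row in cube[r0:r1]]


# A 3 x 4 cube whose cell value encodes its position (row * 10 + column).
cube = [[r * 10 + c for c in range(4)] for r in range(3)]

# Select rows 1..2 and columns 0..1.
slab = select_slab(cube, (1, 3), (0, 2))
print(slab)  # [[10, 11], [20, 21]]
```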
  24. ADF Data Cube Ontology
       • Nominal Scale: sample, run, …
       • Ordinal Scale: sample index, quality (++, +, o, -, --), …
       • Interval Scale: temperature, date time, …
       • Ratio Scale: mass, duration, …
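The four scale types match the classical levels of measurement, where each level permits the operations of the levels below it plus one more. The sketch below encodes that textbook reading; the operation names and the `allowed_operations` helper are illustrative assumptions and are not taken from the ADF Data Cube Ontology itself.

```python
from enum import Enum


class Scale(Enum):
    NOMINAL = 1   # e.g. sample, run
    ORDINAL = 2   # e.g. sample index, quality
    INTERVAL = 3  # e.g. temperature, date time
    RATIO = 4     # e.g. mass, duration


# Each level adds one kind of meaningful operation (assumed naming).
OPERATIONS = [
    (Scale.NOMINAL, "equality"),
    (Scale.ORDINAL, "ordering"),
    (Scale.INTERVAL, "difference"),
    (Scale.RATIO, "ratio"),
]


def allowed_operations(scale):
    """List the operations that are meaningful on a component of this scale."""
    return [op for level, op in OPERATIONS if level.value <= scale.value]


print(allowed_operations(Scale.ORDINAL))  # ['equality', 'ordering']
```

Knowing the scale of a component is what lets a tool decide, for example, whether a range query or an average over that component makes sense.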
  25. ADF Data Cube Ontology. Order Functions: required for range selections
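A range selection needs to know how values compare, which is not obvious for an ordinal scale such as the quality grades (++, +, o, -, --) mentioned earlier in the deck. The following sketch uses an explicit order function for that scale. The names `QUALITY_ORDER`, `rank`, and `select_range`, and the sample observations, are assumptions for illustration.

```python
# Order of the quality grades from worst to best (assumed direction).
QUALITY_ORDER = ["--", "-", "o", "+", "++"]


def rank(label):
    """Order function: map a quality grade to its position on the scale."""
    return QUALITY_ORDER.index(label)


def select_range(observations, low, high):
    """Keep observations whose grade lies between low and high, inclusive."""
    lo, hi = rank(low), rank(high)
    return [(sample, grade) for sample, grade in observations if lo <= rank(grade) <= hi]


observations = [("s1", "++"), ("s2", "o"), ("s3", "--"), ("s4", "+")]
selected = select_range(observations, "o", "+")
print(selected)  # [('s2', 'o'), ('s4', '+')]
```

Comparing the raw strings would give a meaningless order here, which is why the order has to be declared alongside the scale.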
  26. ADF Data Cube Ontology. HDF Mapping: required to map the data structure from the functional to the physical perspective.
  27. ADF Data Cube Ontology. Complex Data Types: required mainly for measurements
  28. Complex Data Types. Four ways of representing the same kind of value, from simple to complex:
       • weight (mg): 1020; 655
       • weight: 1.020 g; 655 mg
       • weight (mg): 1020 +/- 15; 655 +/- 12
       • weight: tare: 25.3332 +/- 0.2 g, net: 20.219 +/- 0.2 g
       Complex data types are expressed using the Shapes Constraint Language (SHACL). https://w3c.github.io/data-shapes/shacl/
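The progression on this slide, from a bare number to a value with unit, uncertainty, and tare/net parts, can be mirrored with small record types. The deck states that ADF expresses these types in SHACL; the Python dataclasses below are only a stand-in to show the structure, and the names `Measurement`, `Weighing`, `in_mg`, and the conversion table are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Conversion factors to milligrams (only the units used on the slide).
TO_MG = {"mg": 1.0, "g": 1000.0}


@dataclass(frozen=True)
class Measurement:
    value: float
    unit: str
    uncertainty: Optional[float] = None  # same unit as value

    def in_mg(self):
        """Return an equivalent measurement expressed in milligrams."""
        factor = TO_MG[self.unit]
        unc = None if self.uncertainty is None else self.uncertainty * factor
        return Measurement(self.value * factor, "mg", unc)


@dataclass(frozen=True)
class Weighing:
    tare: Measurement
    net: Measurement


# Values taken from the slide.
simple = Measurement(1.020, "g")
with_uncertainty = Measurement(1020, "mg", 15)
weighing = Weighing(
    tare=Measurement(25.3332, "g", 0.2),
    net=Measurement(20.219, "g", 0.2),
)
print(simple.in_mg())
print(weighing.net.in_mg())
```

Carrying the unit and uncertainty with the value, instead of in a column header, is what allows values such as 1.020 g and 655 mg to sit in the same column and still be compared.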
  29. Integration Projects
  30. Company 1: Reference Data Project, Data Lake Project. Components shown: Lab Execution System, Instruments (multiple), Data Lake (Hadoop), ADF (multiple), AF Taxonomies
  31. Company 2: Analytical Chemistry in Discovery. Components shown: Sample Queue, Analytical Data Review, ADF HPLC-MS, ADF Methods, MS, HPLC
  32. Company 3: Stability and Release Testing, Manufacturing Domain. Components shown: ADF HPLC-UV, HPLC-UV, Balance, Electronic Lab Notebook, ADF Methods
  33. Conclusion: Why Semantics?
       • Semantics is a good framework for standardized but extendable data descriptions, which are needed to realize the potential of the available data.
       • Linked Data makes it possible to relate information stored in ADF to additional context, e.g. materials, devices, chemicals, processes, locations, etc.
       • Initially, experiments were run for drug approval. Today, experiments generate data that can be used in many different contexts.
  34. Questions? Heiner Oberkampf, heiner.oberkampf@osthus.com, www.osthus.com. Allotrope Foundation: www.allotrope.org
