Started in 2004 under ASTM Committee E13.15, the Analytical Information Markup Language (AnIML) is an XML-based standard for capturing, sharing, viewing, and archiving analytical instrument data from any analytical technique.
This paper discusses the AnIML standard in terms of philosophy, structure, usage, and the resources available to work with the standard. Examples will be given for different techniques as well as strategies for migration of legacy data. Finally, the current status of the standard and time frame for promulgation through ASTM will be reported.
AnIML: A New Analytical Data Standard
1. AnIML:
A New Analytical Data Standard
Stuart J. Chalk, Department of Chemistry, University of North Florida
schalk@unf.edu
ACS Meeting Boston 2015
2. Data Formats
Goals for Data Handling
Introduction to AnIML
Sections of an AnIML file
AnIML Schemas and Files
AnIML Technique Definitions
Publishing Instrument Data
Referencing Data Elements
Calculations on Data
Future Developments
Conclusion
Overview
3. Native Data Formats
Proprietary formats
"Metadata" separated from result data
Metadata and data in multiple files
Metadata not available electronically
No way to link metadata with result data
Interchange Data Formats
Available for only a few techniques
ANDI: GC, LC, MS
JCAMP-DX: UV/Vis, IR, NMR, IMS
Fixed order, fixed syntax, immutable formats
Content limitations
Inconsistent implementations
Current Data Formats
4. Extensible
Easy to add new elements without breaking existing applications
Flexible
Useful for diverse needs: Interchange, Interconversion, Archiving...
Usable & Maintainable
Easy to create, use, adapt, maintain...
Readily available tools
Acceptable
Use standard mechanisms accepted by mainstream computing
Human readable
eXtensible Markup Language
Goals for Data Handling
5. Extensible Markup Language (XML) specification
Development under ASTM E13.15 ‘AnIML Task Group’
Data standard to:
“Develop an analytical data standard that can be used to store data from any analytical instrument”
Introduction to AnIML
http://animl.sourceforge.net
6. JCAMP-DX
http://www.jcamp-dx.org/
ANDI (netCDF)
ThermoML (NIST)
SpectroML
Nguyen, A. D. T., Arslan, A., Travis, J., Smith, M., Schafer, R., & Kramer, G. W. (2004) ‘Molecular Spectrometry Data Interchange Applications for NIST's SpectroML’, JALA 9 (6), 346-354. doi:10.1016/j.jala.2004.09.001
Generalized Analytical Markup Language (GAML)
http://www.gaml.org/
First official meeting March 23, 2003 @ ASTM
Brief History of AnIML
7. Broad scope
Different types of data
Size of data sets
Everyone calls a ‘widget’ something different
Need for metadata dictionaries
One size does not fit all
Getting broad community involvement
Domain experts
User communities
What format?
Challenges for AnIML
8. AnIML XML elements are ‘pigeon holes’ for metadata
Minimal ‘required’ information
If it’s not required you don’t have to include the element
Extensible
Store raw data not processed data
(except for FT techniques)
Support for legacy data
Record of changes
Validatable
Signable (digital sense)
AnIML Design Philosophy
16. Data storage format
Not just for spectral data
Access
Data
Metadata
Manipulate
using XSLT
Validate
Signable
AnIML in an ELN
17. AnIML Viewer -> Jmol/JSpecView (http://jmol.sourceforge.net)
Publish Supplementary Data
18. Conversion of AnIML data to SVG using XSLT
Convert to Image File for Publication
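The slide performs this conversion with XSLT. As a rough illustration of the same idea only, the sketch below uses Python's standard library instead of XSLT and assumes the x and y values have already been read out of an AnIML file. The function name `series_to_svg`, the canvas size, and the sample values are hypothetical and do not come from the AnIML tooling.

```python
import xml.etree.ElementTree as ET


def series_to_svg(xs, ys, width=400, height=200):
    """Return an SVG document string that plots (xs, ys) as a polyline."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more matching x and y values")
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Avoid division by zero when all values on an axis are identical.
    x_span = (x_max - x_min) or 1.0
    y_span = (y_max - y_min) or 1.0
    points = []
    for x, y in zip(xs, ys):
        px = (x - x_min) / x_span * width
        # SVG's y axis points down, so flip the scaled value.
        py = height - (y - y_min) / y_span * height
        points.append(f"{px:.1f},{py:.1f}")
    svg = ET.Element("svg", xmlns="http://www.w3.org/2000/svg",
                     width=str(width), height=str(height))
    ET.SubElement(svg, "polyline", points=" ".join(points),
                  fill="none", stroke="black")
    return ET.tostring(svg, encoding="unicode")


# Made-up values standing in for a short UV/Vis trace (wavelength, absorbance).
svg_text = series_to_svg([200, 300, 400], [0.1, 0.9, 0.5])
print(svg_text)
```

The resulting string can be saved with an .svg extension and opened in a browser or vector graphics editor. An XSLT stylesheet would do the equivalent scaling declaratively, directly against the AnIML source.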
19. Expose an AnIML file at a URL
Optional: Define a DOI for that URL
Use XPath to reference a specific data point in an AnIML file
//ExperimentStepSet[1]/ExperimentStep[1]/Method[1]/Author[1]/Name[1]
Encode the XPath expression so it can be part of the URL
Open Instrument Data
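The referencing steps on this slide can be sketched in Python. This is a minimal illustration under stated assumptions: the XML fragment is a heavily simplified stand-in for an AnIML file (real files carry namespaces, attributes, and many more elements), the author name and base URL are placeholders, and the `xpath` query parameter is a hypothetical convention, not something defined by the AnIML standard.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, unquote

# Simplified stand-in for an AnIML file (assumption, not a valid AnIML document).
SAMPLE = """<AnIML>
  <ExperimentStepSet>
    <ExperimentStep>
      <Method>
        <Author>
          <Name>Jane Doe</Name>
        </Author>
      </Method>
    </ExperimentStep>
  </ExperimentStepSet>
</AnIML>"""

# The XPath expression from the slide.
XPATH = "//ExperimentStepSet[1]/ExperimentStep[1]/Method[1]/Author[1]/Name[1]"


def build_reference(base_url: str, xpath: str) -> str:
    """Append the percent-encoded XPath to base_url as a query parameter."""
    # safe="" also encodes "/", so the expression cannot be mistaken for a path.
    return f"{base_url}?xpath={quote(xpath, safe='')}"


def resolve(xml_text: str, xpath: str) -> str:
    """Return the text of the first element matched by the XPath."""
    root = ET.fromstring(xml_text)
    # ElementTree supports only an XPath subset and requires a relative
    # path, so the leading "//" is prefixed with ".".
    node = root.find("." + xpath)
    if node is None:
        raise LookupError(f"no element matches {xpath}")
    return node.text


url = build_reference("https://example.org/data/sample.animl", XPATH)
print(url)
print(unquote(url.split("xpath=", 1)[1]) == XPATH)
print(resolve(SAMPLE, XPATH))
```

A receiving service would decode the parameter and evaluate the expression against the file stored at that URL. A full XPath engine would be needed for expressions outside the subset that ElementTree supports.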
20. Part of a Data Management Plan
Federal agencies are mandating data be made available
Long term archive format for research data
Referenceable if available online
Searchable with XQuery
Publish data processing algorithms (XSLT)
Future-proof data -> conversion to future data formats
21. The Healthcare and Life Science (HCLS) Community Profile is a Note from the Semantic Web HCLS Interest Group
Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. This document describes a consensus among participating stakeholders in the Health Care and the Life Sciences domain on the description of datasets using the Resource Description Framework (RDF). This specification meets key functional requirements, reuses existing vocabularies to the extent that it is possible, and addresses elements of data description, versioning, provenance, discovery, exchange, query, and retrieval.
Data Descriptions:
HCLS Community Profile
http://www.w3.org/TR/hcls-dataset/
22. AnIML 1.0 Deliverables
Core Schema - Fundamental framework for AnIML documents
Technique Schema - Fundamental framework for technique definition and extension documents
AnIML Technique Definition Documents (ATDD) - Rules for content of specific technique file
AnIML Naming and Design Rules - Specifies rules about data element structure for interoperability
Standard Practice for AnIML Files - Describes how the specification is supposed to work
How to Create a Technique Definition Document - Guidelines for creating new technique definition documents
Other documents
Draft Requirements Specification for AnIML Version 1.0
Requirements and Goals of the Analytical Information Markup Language
AnIML Specification
http://animl.sourceforge.net
23. Documentation
Core specification
Technique and extension specification
Naming and design rules
Annotated technique definitions
(UV/Vis, IR, 1D NMR, MS, Chromatography)
Balloting through ASTM (end of 2015)
Vendor, User, Developer extensions
Semantic extension of AnIML metadata items
Future Developments
24. Conclusion
AnIML is a great solution for storing instrument data
Human readable (UTF-8)
Platform neutral
Archivable
Validatable
AnIML leverages the extensive XML ecosystem of tools
Software engineers know XML