In honor of JC Bradley, and the spirit of openness that he inspired, a new online resource called the Open Spectral Database (http://www.osdb.info) was announced in August of this year (alpha version). Built with open source tools and open code, and open to community input on design and functionality, the OSD lets anyone submit spectral data and make it available to the scientific community. This paper details the reaction to the website, looks at how the site has evolved since August (beta version), and offers a glimpse of what the future may bring for the site.
Reactions to the Open Spectral Database
1. Reactions to the Open Spectral Database
http://osdb.info
Stuart J. Chalk, Department of Chemistry
University of North Florida
schalk@unf.edu
Instigator: Tony Williams
SCTY 28 – Pacifichem 2015
2. What would Jean-Claude Bradley have wanted?
Share and Reuse Research Data!
How Do You Make Everything Open?
JCAMP Implementation
The Open Spectral Database
Data Model
Live Demo (fingers crossed)
Future Plans
Conclusion
Outline
3. What Would JCB Have Wanted?
Simple: Openness as the norm not the exception
Data made available, without restriction, so it's useful
Mechanisms/tools to make data available
Formats to allow others to get the data…
…but also so it's easy to use
Annotated data to make it easy to find
Community driven promotion of and action on these issues
4. Ryan P. Womack (2015) Research Data in Core Journals in Biology, Chemistry, Mathematics,
and Physics. PLoS ONE 10(12): e0143460. doi:10.1371/journal.pone.0143460
Share and Reuse Research Data!
5. You have to know/define what “everything” means
Open Data
Open Data Model
Open and useable data structures
Open Code
Open to input from the community on all aspects
Open to add, extend, change, and rethink all of this
How Do You Make Everything Open?
6. Spectral data – There are many formats but only one
open and generally accepted standard – JCAMP
It's not perfect…
…but it's an output format people can share
Let's export the data and metadata, and infer as much as possible, from JCAMP files
Not as easy as it seems…
First Attempt
8. Great data exchange format, however…
…not meant to be computer input…
…more a way to get data out so a human can process
Missing parameters (metadata)
Missing data
Incorrect values
Extra data
Incorrectly compressed
Challenges with JCAMP
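Several of the problems listed above (missing parameters, missing data, incorrect values, extra data) can be caught with simple checks once a file has been parsed. The following is a minimal Python sketch, not the OSD's own PHP code. The function name `check_jcamp`, the choice of required labels, and the assumption that labels have already been normalized (upper case, with spaces, dashes, slashes, and underscores dropped) are all illustrative.

```python
# Assumed set of labels to require; adjust to the JCAMP-DX version in use.
REQUIRED_LABELS = ("TITLE", "JCAMPDX", "DATATYPE", "XUNITS", "YUNITS",
                   "FIRSTX", "LASTX", "NPOINTS")


def check_jcamp(ldrs, y_values):
    """Return a list of human-readable problems found in one parsed spectrum.

    ldrs     -- dict mapping normalized label names to their string values
    y_values -- list of decoded ordinate values
    """
    problems = []

    # Missing parameters (metadata): label absent or present with no value
    for label in REQUIRED_LABELS:
        if not ldrs.get(label, "").strip():
            problems.append(f"missing parameter: {label}")

    # Incorrect values: numeric fields that do not parse as numbers
    numeric = {}
    for label in ("FIRSTX", "LASTX", "NPOINTS"):
        raw = ldrs.get(label, "").strip()
        if raw:
            try:
                numeric[label] = float(raw)
            except ValueError:
                problems.append(f"incorrect value: {label}={raw!r}")

    # Missing or extra data: declared point count versus points actually present
    if "NPOINTS" in numeric:
        declared = int(numeric["NPOINTS"])
        if len(y_values) < declared:
            problems.append(f"missing data: {len(y_values)} of {declared} points")
        elif len(y_values) > declared:
            problems.append(f"extra data: {len(y_values)} points, {declared} declared")

    return problems
```

An empty list means none of these particular checks failed; it does not show that the file is otherwise valid or correctly compressed.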
9. Upload JCAMP spectra
Data and metadata extracted
Organize metadata so it can be used to find data
Use a REST-based website and API to make data available
and allow searching; document the API
Make the website available as a project on GitHub and
invite the community to get involved
The Open Spectral Database
11. JCAMP file is imported into PHP as an array, then
Clean
Uncomment ($$)
Separate
Labeled Data Records (LDRs)
Parameters (##.)
User Defined Labels (##$)
Validate
Standardize
Decompress
Convert to output format or store in database
Ingestion Process
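The clean, uncomment, and separate steps above can be sketched in a few lines. This is an illustrative Python version, not the site's PHP implementation. The function names `parse_jcamp` and `decode_xydata` are hypothetical, and the decoder handles only uncompressed, whitespace-separated X++(Y..Y) data, so the ASDF-compressed forms that the Decompress step deals with are out of scope here.

```python
import re


def parse_jcamp(text):
    """Split JCAMP-DX text into core LDRs, data-type parameters, and user labels."""
    core, params, user = {}, {}, {}
    label, target = None, None

    for line in text.splitlines():
        # Clean / uncomment: drop everything after "$$" and surrounding whitespace
        line = line.split("$$", 1)[0].strip()
        if not line:
            continue

        match = re.match(r"##([^=]*)=(.*)", line)
        if match:
            raw_label, value = match.group(1), match.group(2).strip()
            # Separate: "##$" is user defined, "##." is data-type specific
            if raw_label.startswith("$"):
                target = user
            elif raw_label.startswith("."):
                target = params
            else:
                target = core
            # Standardize the label: upper case, separators discarded
            label = re.sub(r"[\s\-/_]", "", raw_label.lstrip("$.")).upper()
            target[label] = value
        elif label is not None:
            # Continuation line: belongs to the most recent label
            target[label] = (target[label] + "\n" + line).strip()

    return core, params, user


def decode_xydata(core):
    """Decode uncompressed X++(Y..Y) data; compressed (ASDF) forms are not handled."""
    lines = core["XYDATA"].splitlines()[1:]  # first line holds the "(X++(Y..Y))" form
    y_factor = float(core.get("YFACTOR", "1"))
    ys = []
    for line in lines:
        tokens = line.replace(",", " ").split()
        # tokens[0] is the X check value for the line; the rest are ordinates
        ys.extend(float(tok) * y_factor for tok in tokens[1:])
    return ys
```

Keeping the three groups in separate dictionaries mirrors the Separate step: core records, `##.` parameters, and `##$` user-defined labels can then be validated and stored independently.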
12. In order to organize the data and metadata it is
distributed across a number of tables in the database
This is a generic science data model that is being used
for multiple projects
Not limited to spectra or even just Chemistry data
Data Model
16. Enthusiastic Feedback with constructive comments…
Spectral list is boring; needs molecules linked to spectra
Less metadata on the spectral page with option to see more
Revise homepage to make it more inviting
Reactions to Alpha Version
17. Again Enthusiastic…
“Love the layout! Very clean…”
“Nice Work!” (Twitter comment)
… with constructive comments
Needs a spectrum zoom feature
Clicking on spectrum provides data that is not useful
Maybe you could use JSpecView rather than Flot?
Reactions to Beta Version
18. Handle more complicated JCAMP files
Handle file formats other than JCAMP
Export in AnIML format
Expand the API
Improve Flot viewer functionality (e.g. zoom)
Add JSpecView spectral viewer
Endpoint summary page
Document the website (GitHub)
Document how to contribute to the website (GitHub)
Solicit feature requests and encourage contributions
Things To Do
19. Take Home
The OSD is open for the community to develop and
implement ideas about open spectral data re:
Data Model
API features
Export Formats
Services
Community Involvement!
Use as a data source for other applications
Submission of feature requests
Participation as code contributor