2. Outline
• What we’re working on at the moment
• A names backbone
• Kew’s role
• What “services” we have just now
• … and why so few?
• Considerations
• Services on the names backbone & timescales
4. … and what we’re aiming for:
Authoritative data, reduced duplication, many more links
5. Names are key to linking the data: build a “names backbone”
== “an environment for the management of multiple overlapping classifications and tracking how these change over time”
Not a monolith:
• Built on a layered view of the domain – clearly separating names and taxonomy
• Names form the objective basis for higher layers
10. Names backbone is wider than Kew
• We need to draw in data curated elsewhere, both names and concepts:
• Vascular plants
• “Lower” plants
• Mycology
• ... and zoological names
… Kew’s role is as a service consumer as well as a service provider
11. What services we have at the moment
Various things for particular projects
… Used by known partners
... Answering specific, tactical needs
Are these really services?
• Not widely advertised
• Not opened up for anybody to use
...Not necessarily a strategic commitment
12. Service example: OpenUp
Name and concept checking for the data quality toolkit.
• Standard message format
But:
• Concepts not persistently identified
• No throughput management, so not widely available
13. i.e. a (short term) system view…
Many overlapping systems, few links
14. … rather than the long term
Authoritative data, persistently identified
16. A long-term, sustainable service:
1. Authoritative data
2. Persistently identified
3. Standards based
17. ... and it also needs:
• Robustness / sustainability
• Management of throughput
• Communications with end users
• Support
• Help
• Example code
• Usage monitoring
• Sharing usage logs
• Terms of use
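As an illustration of what "management of throughput" could mean in practice, here is a minimal token-bucket sketch. It is not taken from the talk: the class name, parameters and the choice of a token bucket are all assumptions, and a production service would more likely rely on an existing gateway or proxy for this.

```python
import time


class TokenBucket:
    """Hypothetical per-client limiter: `rate` requests/second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock              # injectable so the limiter can be tested
        self.tokens = float(capacity)   # start with a full bucket
        self.updated = clock()

    def allow(self) -> bool:
        """Return True if a request may proceed now, consuming one token."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A service would keep one bucket per client key and reject or queue a request when `allow()` returns False. Counting those decisions also gives a simple basis for usage monitoring.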
18. Analogy with collaborative development
• Technical considerations
vs
• Social / political considerations
19. All this should be service accessible…
…persistently identified data classes & inter-connections
20. Services: name occurrence layer
- Data input / output: DwCA (Darwin Core Archive)
- Linking and reviewing links
- RSS feeds to indicate activity
21. Services: names layer
- Data input / output: TCS (Taxonomic Concept Transfer Schema)
- Propose addition / edit of names
- RSS feeds to indicate activity
22. Services: concepts layer
- Data input / output: TCS
- Create classifications using names
- Propose addition / edit of names to names layer
- RSS feeds
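Each layer lists an RSS feed to indicate activity. A rough sketch of how such a feed might be generated with the Python standard library follows. The function name, the event dictionary keys and the example URLs are hypothetical. Only the RSS 2.0 element names are standard.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def activity_feed(title: str, link: str, events: list) -> str:
    """Build an RSS 2.0 document from events shaped like
    {"id": str, "summary": str, "when": timezone-aware datetime} (assumed shape)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Recent activity"
    for event in events:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = event["summary"]
        # The guid carries the identifier of the changed record, not a URL.
        ET.SubElement(item, "guid", isPermaLink="false").text = event["id"]
        ET.SubElement(item, "pubDate").text = format_datetime(event["when"])
    return ET.tostring(rss, encoding="unicode")


feed = activity_feed(
    "Names layer activity",
    "http://example.org/names",  # placeholder URL
    [{"id": "urn:example:name:1#v2",
      "summary": "Authorship edited",
      "when": datetime(2013, 3, 1, 12, 0, tzinfo=timezone.utc)}],
)
```

Putting the record identifier in the `guid` lets a consumer go from a feed entry straight to the record that changed.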
23. How the names backbone will support services
We’re working to enable service-level access to the data, by:
• Establishing authority
• Reducing duplication
• Data standards to represent well-known entities
• Persistent identifiers on those well-known entities
• Meaningful versioning – what changed, when
• Enabling remote curation
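To make "persistent identifiers" and "meaningful versioning – what changed, when" concrete, here is a small sketch of a record whose identifier stays fixed while its content is versioned. The class names, the `urn:example:` identifier and the placeholder name "Aus bus" are illustrative assumptions, not the backbone's actual data model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Version:
    number: int
    recorded: date   # when the change was made
    fields: dict     # full state of the record at this version


class VersionedRecord:
    def __init__(self, identifier: str):
        self.identifier = identifier  # persistent: never changes between versions
        self.versions = []

    def update(self, recorded: date, **fields) -> Version:
        """Record a new version by applying `fields` on top of the latest state."""
        state = dict(self.versions[-1].fields) if self.versions else {}
        state.update(fields)
        version = Version(len(self.versions) + 1, recorded, state)
        self.versions.append(version)
        return version

    def changes(self, number: int) -> dict:
        """Return {field: (old, new)} describing what version `number` changed."""
        new = self.versions[number - 1].fields
        old = self.versions[number - 2].fields if number > 1 else {}
        return {k: (old.get(k), v) for k, v in new.items() if old.get(k) != v}


record = VersionedRecord("urn:example:name:1")
record.update(date(2013, 1, 10), orthography="Aus bus", authorship="Auth.")
record.update(date(2013, 3, 2), authorship="Auth. & Other")
```

Here `record.changes(2)` reports only the authorship edit, and `record.versions[1].recorded` gives the date it was made.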
24. Timescales 2013
Till March:
First release: familial and generic classification
April – August:
Extend to name occurrence layer
Extend to species – incorporate WCS
Prioritise compilation process
September onwards (incl. TDWG):
Communications with service providers / consumers
Editor's notes
Many systems, few links. Huge overlap in data and functionality. A single scientific question can be answered in multiple different ways.
DEFRA-funded project – for Kew internal information management, but applicable more widely. Staffed with a development team of 5 and a data improvement team of 4, plus people working on project management and business change. Names are crucial to Kew’s scientific work and day-to-day management of the collections. We have many systems which hold nomenclatural and taxonomic information.
Name occurrence layer – any informal attempt at the transcription of a name. Some name occurrences are code-governed names, eligible to appear in the next layer. Names layer – holds all the objective published facts about a name: its orthography, authorship, protologue reference, type citation and objective synonymy. Concepts layer – hypotheses draw these names together to form concepts via heterotypic synonymy. Most people are interested in working with concepts. Unfortunately most people are only armed with name occurrences.
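The three layers described in this note can be sketched as a small data model. This is only one reading of the description: the class and field names, the naive string matching and the placeholder names "Aus bus" / "Aus cus" are assumptions for illustration, not the project's schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class NameOccurrence:
    """Any informal transcription of a name, e.g. from a label or a list."""
    verbatim: str
    source: str


@dataclass(frozen=True)
class Name:
    """Objective published facts about a code-governed name."""
    identifier: str   # hypothetical persistent identifier
    orthography: str
    authorship: str
    protologue: str


@dataclass
class Concept:
    """A taxonomic hypothesis: an accepted name plus its heterotypic synonyms."""
    identifier: str
    accepted: Name
    synonyms: list[Name] = field(default_factory=list)

    def names(self) -> list[Name]:
        return [self.accepted, *self.synonyms]


def resolve(occurrence: NameOccurrence, names: list[Name]) -> Name | None:
    """Link an occurrence to a name, ignoring case and extra whitespace."""
    key = " ".join(occurrence.verbatim.split()).casefold()
    for name in names:
        if name.orthography.casefold() == key:
            return name
    return None


def concepts_for(name: Name, concepts: list[Concept]) -> list[Concept]:
    """Return every concept that uses the name, as accepted name or synonym."""
    return [c for c in concepts if name in c.names()]


names = [Name("id:1", "Aus bus", "Auth.", "Pub. 1"),
         Name("id:2", "Aus cus", "Auth.", "Pub. 2")]
concept = Concept("c:1", accepted=names[0], synonyms=[names[1]])
hit = resolve(NameOccurrence("  aus   CUS ", "specimen label"), names)
```

The example follows the path most users need: from a name occurrence, via the objective name it transcribes, to the concepts that include that name. Real matching would have to handle authorship, misspellings and homonyms, which this sketch ignores.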
IPNI / IF / Zoobank
WSCP etc
Maybe ask about distribution / replicated system?
OpenUp as an example
Robust / quality / sustainable?
Social issues are hard to resolve... But it's not all doom and gloom:
We aim to open up our resources and allow their use as a platform upon which others can work.