Services, and Kew’s (names) data

     Nicky Nicolson, RBG Kew
Outline

• What we’re working on at the moment
   • A names backbone
• Kew’s role
• What “services” we have just now
   • … and why so few?
• Considerations
• Services on the names backbone & timescales
The current situation…
Many overlapping systems, few links
… and what we’re aiming for:
Authoritative data, reduced duplication, many more links
Names are key to linking the data:
   build a “names backbone”

== “an environment for the management of multiple
  overlapping classifications and tracking how these
  change over time”
Not a monolith:
   • Built on a layered view of the domain – clearly
     separating names and taxonomy
   • Names form the objective basis for higher layers
Names backbone: a layered environment
Name occurrence layer AKA “Nomen-clutter”

== any attempt at the transcription of a name
Names layer

Holds objective published facts about a name:
- Orthography
- Authorship
- Protologue reference
- Type citation
- Objective synonymy
Concepts layer

Hypotheses draw names together to form concepts via heterotypic synonymy
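
A minimal sketch of how the three layers might relate, assuming a simple in-memory model. The class names, field names and identifiers (`NameOccurrence`, `Name`, `Concept`, `resolve`, `name:1`, and so on) are hypothetical and are not taken from Kew's systems; "Aus bus" is a placeholder name.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Name:
    """Names layer: objective published facts about a code-governed name."""
    name_id: str                       # hypothetical persistent identifier
    orthography: str
    authorship: str
    protologue_reference: str
    type_citation: Optional[str] = None


@dataclass(frozen=True)
class NameOccurrence:
    """Name occurrence layer: any attempt at the transcription of a name."""
    verbatim: str
    source: str
    linked_name_id: Optional[str] = None   # None until linked and reviewed


@dataclass
class Concept:
    """Concepts layer: a hypothesis drawing names together via synonymy."""
    concept_id: str
    accepted_name_id: str
    according_to: str
    synonym_name_ids: set = field(default_factory=set)

    def includes(self, name_id: str) -> bool:
        return name_id == self.accepted_name_id or name_id in self.synonym_name_ids


def resolve(occurrence: NameOccurrence, concepts: list) -> list:
    """Return every concept that includes the name an occurrence is linked to."""
    if occurrence.linked_name_id is None:
        return []
    return [c for c in concepts if c.includes(occurrence.linked_name_id)]


# Placeholder data only: two names, one concept treating name:2 as a synonym.
accepted = Name("name:1", "Aus bus", "Author A", "Placeholder reference 1")
synonym = Name("name:2", "Aus cus", "Author B", "Placeholder reference 2")
concept = Concept("concept:1", accepted.name_id, "Hypothetical treatment",
                  {synonym.name_id})

label = NameOccurrence("Aus cus Auth. B", "specimen label", linked_name_id="name:2")
unlinked = NameOccurrence("Aus sp.", "field notebook")
```

Because names are held apart from concepts, several overlapping classifications can refer to the same `Name` records without duplicating the published facts.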
Names backbone is wider than Kew
• We need to draw in data curated elsewhere, both names and concepts:
  • Vascular plants
  • “Lower” plants
  • Mycology
  • ... and zoological names

…Kew’s role is as a service consumer as well as a service provider
What services we have at the moment
Various things for particular projects
… Used by known partners
... Answering specific, tactical needs

Are these really services?
• Not widely advertised
• Not opened up for anybody to use

...Not necessarily a strategic commitment
Service example: OpenUp

Name and concept checking for the data quality toolkit.

• Standard message format

But:
• Concepts not persistently identified
• No throughput management, so not widely available
i.e. a (short term) system view…
    Many overlapping systems, few links
… rather than the long term
 Authoritative data, persistently identified
Short term view == unhappy man in Glasgow
A long-term, sustainable service:

1. Authoritative data
2. Persistently identified
3. Standards based
... and it also needs:

• Robustness / sustainability
• Management of throughput
• Communications with end users
• Support
   • Help
   • Example code
• Usage monitoring
• Sharing usage logs
• Terms of use
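
As an illustration of "management of throughput" and "usage monitoring", here is a minimal sketch assuming a per-client token bucket placed in front of a service. `TokenBucket`, `ThrottledService` and the client identifier are hypothetical; the slides do not say how throughput would actually be managed.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity`, refilled at `rate_per_second`."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class ThrottledService:
    """Wrap a service with per-client throttling and a simple usage log."""

    def __init__(self, rate_per_second=1.0, capacity=2, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock
        self.buckets = {}
        self.usage_log = []   # (time, client_id, query, accepted) tuples

    def handle(self, client_id, query):
        if client_id not in self.buckets:
            self.buckets[client_id] = TokenBucket(self.rate, self.capacity,
                                                  self.clock)
        accepted = self.buckets[client_id].allow()
        self.usage_log.append((self.clock(), client_id, query, accepted))
        return accepted


# Usage with a fake clock, so the behaviour is deterministic.
fake_time = [0.0]
service = ThrottledService(rate_per_second=1.0, capacity=2,
                           clock=lambda: fake_time[0])
burst = [service.handle("partner-a", "Aus bus") for _ in range(3)]
fake_time[0] = 1.0   # one second later, one token has been refilled
later = service.handle("partner-a", "Aus bus")
```

The usage log kept here is the kind of record that could later be shared with data providers, as the list above suggests.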
Analogy with collaborative development
     • Technical considerations
vs
     • Social / political considerations
All this should be service accessible…
 …persistently identified data classes & inter-connections
Services: name occurrence layer

- Data input / output: DwCA
- Linking and reviewing links
- RSS feeds to indicate activity
Services: names layer

- Data input / output: TCS
- Propose addition / edit of names
- RSS feeds to indicate activity
Services: concepts layer

- Data input / output: TCS
- Create classifications using names
- Propose addition / edit of names to names layer
- RSS feeds
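
The RSS feeds mentioned for each layer could look roughly like the following sketch, which builds an RSS 2.0 document from a list of activity events using only the Python standard library. The function name `activity_feed`, the event fields, the identifiers and the `example.org` URL are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def activity_feed(title, link, events):
    """Serialise activity events as an RSS 2.0 feed (returned as a string)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Recent activity"
    for event in events:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = event["summary"]
        # The guid carries the (hypothetical) persistent identifier.
        ET.SubElement(item, "guid", isPermaLink="false").text = event["id"]
        # RSS 2.0 expects RFC 822 style dates.
        ET.SubElement(item, "pubDate").text = format_datetime(event["when"])
    return ET.tostring(rss, encoding="unicode")


events = [
    {"id": "name:1", "summary": "Edit proposed for name:1",
     "when": datetime(2013, 4, 1, 9, 0, tzinfo=timezone.utc)},
    {"id": "name:2", "summary": "Addition proposed: name:2",
     "when": datetime(2013, 4, 2, 9, 0, tzinfo=timezone.utc)},
]
feed_xml = activity_feed("Names layer activity",
                         "https://example.org/names", events)
```

A consumer can poll such a feed and fetch only the identified records that changed, rather than re-harvesting a whole dataset.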
How the names backbone will support services
We’re working to enable service level access to the data, by:
  • Establishing authority
  • Reducing duplication
  • Data standards to represent well-known entities
  • Persistent identifiers on those well-known entities
  • Meaningful versioning – what changed, when
  • Enabling remote curation
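
To make "meaningful versioning – what changed, when" concrete, here is a minimal sketch assuming records are keyed by a persistent identifier and every change is stored with a timestamp. `VersionedStore`, its methods and the sample record are hypothetical, not a description of the backbone's implementation.

```python
from datetime import datetime, timezone


class VersionedStore:
    """Keep every version of a record, keyed by a persistent identifier."""

    def __init__(self):
        self._history = {}   # pid -> list of (timestamp, record) in time order

    def put(self, pid, record, when=None):
        """Store a new version; return {field: (old, new)} for changed fields."""
        when = when or datetime.now(timezone.utc)
        versions = self._history.setdefault(pid, [])
        previous = versions[-1][1] if versions else {}
        changes = {
            key: (previous.get(key), record.get(key))
            for key in sorted(set(previous) | set(record))
            if previous.get(key) != record.get(key)
        }
        if versions and not changes:
            return {}        # nothing changed, so no new version is recorded
        versions.append((when, dict(record)))
        return changes

    def as_of(self, pid, when):
        """Return the record as it stood at `when`, or None if it did not exist."""
        current = None
        for stamp, record in self._history.get(pid, []):
            if stamp <= when:
                current = record
        return current


store = VersionedStore()
march = datetime(2013, 3, 1, tzinfo=timezone.utc)
august = datetime(2013, 8, 1, tzinfo=timezone.utc)

store.put("name:1", {"orthography": "Aus bus", "authorship": "Author A"}, march)
diff = store.put("name:1",
                 {"orthography": "Aus bus", "authorship": "Author B"}, august)
```

The identifier stays the same across versions, so external links remain valid while `diff` reports what changed and the stored timestamps report when.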
Timescales 2013

Until March:
   First release: familial and generic classification
April – August:
   Extend to name occurrence layer
   Extend to species – incorporate WCS
   Prioritise compilation process
September onwards (inc. TDWG):
   Communications with service providers / consumers

Editor's notes

  1. Many systems, few links. Huge overlap in data and functionality. A single scientific question can be answered in multiple different ways.
  2. DEFRA-funded project – for Kew internal information management, but more widely applicable. Staffed with a development team of 5 and a data improvement team of 4, plus people working on project management and business change. Names are crucial to Kew’s scientific work and day-to-day management of the collections. We have many systems which hold nomenclatural and taxonomic information.
  3. Name occurrence layer – any informal attempt at the transcription of a name. Some name occurrences are code-governed names, eligible to appear in the next layer. The names layer holds all the objective published facts about a name: its orthography, authorship, protologue reference, type citation and objective synonymy. Concepts layer – hypotheses draw these names together to form concepts via heterotypic synonymy. Most people are interested in working with concepts. Unfortunately most people are only armed with name occurrences.
  4. Name occurrence layer – any informal attempt at the transcription of a name. Some name occurrences are code-governed names, eligible to appear in the next layer. The names layer holds all the objective published facts about a name: its orthography, authorship, protologue reference, type citation and objective synonymy. Concepts layer – hypotheses draw these names together to form concepts via heterotypic synonymy. Most people are interested in working with concepts. Unfortunately most people are only armed with name occurrences.
  5. IPNI / IF / Zoobank
  6. WSCP etc
  7. Maybe ask about distribution / replicated system?
  8. OpenUp as an example
  9. OpenUp as an example
  10. Many systems, few links. Huge overlap in data and functionality. A single scientific question can be answered in multiple different ways.
  11. Robust / quality / sustainable?
  12. Social issues are hard to resolve... But it's not all doom and gloom:
  13. We aim to open up our resources and allow their use as a platform upon which others can work.