Services and Kew's (names) data

Nicky Nicolson, RBG Kew

Outline

• What we’re working on at the moment
• A names backbone
• Kew’s role
• What “services” we have just now
• … and why so few?
• Considerations
• Services on the names backbone & timescales

The current situation…
Many overlapping systems, few links

… and what we’re aiming for:
Authoritative data, reduced duplication, many more links

Names are key to linking the data:
build a “names backbone”

== “an environment for the management of multiple overlapping classifications and tracking how these change over time”

Not a monolith:
• Built on a layered view of the domain, clearly separating names and taxonomy
• Names form the objective basis for higher layers

Names backbone: a layered environment

Name occurrence layer, AKA “Nomen-clutter”

== any attempt at the transcription of a name

Names layer

Holds objective published facts about a name:
- Orthography
- Authorship
- Protologue reference
- Type citation
- Objective synonymy

Concepts layer

Hypotheses draw names together to form concepts via heterotypic synonymy
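
The three layers above can be sketched as a small data model. This is only an illustration of the layering, not Kew's schema: the class names, field names, identifier strings and the `concepts_for` helper are all assumptions, and "Aus bus" / "Aus cus" are placeholder names rather than real taxa.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Name:
    """Names layer: objective published facts about one code-governed name."""
    name_id: str                          # hypothetical identifier
    orthography: str
    authorship: str
    protologue_reference: str
    type_citation: Optional[str] = None
    basionym_id: Optional[str] = None     # objective synonymy via a shared basionym


@dataclass
class NameOccurrence:
    """Name occurrence layer: any attempt at the transcription of a name."""
    verbatim: str
    source: str
    name_id: Optional[str] = None         # filled in once linked to the names layer


@dataclass
class Concept:
    """Concepts layer: a hypothesis drawing names together (heterotypic synonymy)."""
    concept_id: str
    accepted_name_id: str
    according_to: str
    synonym_name_ids: List[str] = field(default_factory=list)

    def includes(self, name_id: str) -> bool:
        return name_id == self.accepted_name_id or name_id in self.synonym_name_ids


def concepts_for(occurrence: NameOccurrence, concepts: List[Concept]) -> List[Concept]:
    """Resolve an occurrence to the concepts that use its linked name."""
    if occurrence.name_id is None:
        return []
    return [c for c in concepts if c.includes(occurrence.name_id)]


# Placeholder data only: "Aus bus" and "Aus cus" are not real names.
aus_bus = Name("name:1", "Aus bus", "Smith", "Placeholder protologue 1")
aus_cus = Name("name:2", "Aus cus", "Jones", "Placeholder protologue 2")
concept = Concept("concept:1", aus_bus.name_id, "Hypothetical treatment", [aus_cus.name_id])
label = NameOccurrence("Aus cus Jones.", "specimen label", name_id=aus_cus.name_id)
unlinked = NameOccurrence("Aus sp.", "field notebook")

print([c.concept_id for c in concepts_for(label, [concept])])
print(concepts_for(unlinked, [concept]))
```

Keeping `Name` immutable while `Concept` stays editable mirrors the split between objective facts and hypotheses. An occurrence with no link resolves to no concepts, which is the gap most users start from.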

Names backbone is wider than Kew
• We need to draw in data curated elsewhere, both names and concepts:
  • Vascular plants
  • “Lower” plants
  • Mycology
  • ... and zoological names

… Kew’s role is as a service consumer as well as a service provider

What services we have at the moment
Various things for particular projects
… Used by known partners
… Answering specific, tactical needs

Are these really services?
• Not widely advertised
• Not opened up for anybody to use

… Not necessarily a strategic commitment

Service example: OpenUp

Name and concept checking for the data quality toolkit.

• Standard message format

But:
• Concepts not persistently identified
• No throughput management, so not widely available

i.e. a (short term) system view…
Many overlapping systems, few links

… rather than the long term
Authoritative data, persistently identified

Short term view == unhappy man in Glasgow

A long-term, sustainable service:

1. Authoritative data
2. Persistently identified
3. Standards based

... and it also needs:

• Robustness / sustainability
• Management of throughput
• Communications with end users
• Support
• Help
• Example code
• Usage monitoring
• Sharing usage logs
• Terms of use
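
Two items in the list above, management of throughput and usage monitoring, can be sketched together. The example uses a token bucket, a common rate-limiting technique chosen here for illustration; the slides do not say which approach is planned. `TokenBucket`, `ThrottledService`, `FakeClock` and the client id `partner-a` are all hypothetical names.

```python
import time
from collections import Counter
from typing import Callable, Dict


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        elapsed = now - self.updated
        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(float(self.capacity), self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class ThrottledService:
    """One bucket per client, plus a usage counter that could feed shared logs."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate, self._capacity, self._clock = rate, capacity, clock
        self._buckets: Dict[str, TokenBucket] = {}
        self.usage: Counter = Counter()

    def handle(self, client_id: str) -> int:
        bucket = self._buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self._rate, self._capacity, self._clock)
            self._buckets[client_id] = bucket
        if bucket.allow():
            self.usage[(client_id, "served")] += 1
            return 200   # request would be processed here
        self.usage[(client_id, "throttled")] += 1
        return 429       # HTTP "Too Many Requests"


class FakeClock:
    """Deterministic clock so the example does not need to sleep."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


clock = FakeClock()
service = ThrottledService(rate=1.0, capacity=2, clock=clock)
statuses = [service.handle("partner-a") for _ in range(3)]
clock.now += 1.0                      # one second later, one token has refilled
statuses.append(service.handle("partner-a"))
print(statuses)
print(dict(service.usage))
```

Per-client buckets mean one heavy consumer cannot starve the others. The same counter that drives throttling decisions doubles as a usage record.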

Analogy with collaborative development
• Technical considerations
vs
• Social / political considerations

All this should be service accessible…
…persistently identified data classes & inter-connections

Services: name occurrence layer

- Data input / output: DwCA (Darwin Core Archive)
- Linking and reviewing links
- RSS feeds to indicate activity
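
The "linking and reviewing links" service can be illustrated with a deliberately naive matcher. It proposes a link only when an occurrence equals a known name after whitespace, case and trailing punctuation are normalised; real name matching is considerably harder. The function names, the `ProposedLink` record and the placeholder names are all assumptions.

```python
import re
from typing import Dict, Iterable, List, NamedTuple, Optional


class ProposedLink(NamedTuple):
    verbatim: str
    name_id: Optional[str]
    status: str            # "proposed" links still await human review


def normalise(verbatim: str) -> str:
    """Collapse whitespace, ignore case and drop trailing punctuation."""
    text = re.sub(r"\s+", " ", verbatim).strip().casefold()
    return text.rstrip(" .,;")


def propose_links(occurrences: Iterable[str], names: Dict[str, str]) -> List[ProposedLink]:
    """Match verbatim occurrences against names-layer entries (id -> full name)."""
    index = {normalise(full_name): name_id for name_id, full_name in names.items()}
    links = []
    for verbatim in occurrences:
        name_id = index.get(normalise(verbatim))
        status = "proposed" if name_id else "unmatched"
        links.append(ProposedLink(verbatim, name_id, status))
    return links


# Placeholder names only; the identifiers are hypothetical.
names_layer = {"name:1": "Aus bus Smith"}
links = propose_links(["  aus  BUS smith.", "Aus cus Jones"], names_layer)
for link in links:
    print(link)
```

Returning unmatched occurrences alongside proposed links keeps the review queue explicit: nothing is silently dropped, and nothing is linked without a reviewer seeing it.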

Services: names layer
- Data input / output: TCS (Taxonomic Concept Transfer Schema)
- Propose addition / edit of names
- RSS feeds to indicate activity

Services: concepts layer
- Data input / output: TCS
- Create classifications using names
- Propose addition / edit of names to names layer
- RSS feeds
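
Each layer lists an RSS feed to indicate activity. A minimal sketch of such a feed, built as RSS 2.0 with the Python standard library, might look like the following. The feed title, the `example.org` link, the `Activity` record and the guid format are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Iterable, NamedTuple


class Activity(NamedTuple):
    record_id: str      # hypothetical persistent identifier of the changed record
    summary: str
    when: datetime


def activity_feed(title: str, link: str, activities: Iterable[Activity]) -> str:
    """Serialise activities as an RSS 2.0 document, newest first."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Recent changes"
    for activity in sorted(activities, key=lambda a: a.when, reverse=True):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = activity.summary
        # The guid is an opaque string here, not a resolvable URL.
        guid = ET.SubElement(item, "guid", isPermaLink="false")
        guid.text = f"{activity.record_id}@{activity.when.isoformat()}"
        # RSS 2.0 expects RFC 822 style dates.
        ET.SubElement(item, "pubDate").text = format_datetime(activity.when)
    return ET.tostring(rss, encoding="unicode")


feed = activity_feed(
    "Names layer activity (hypothetical)",
    "https://example.org/names",
    [
        Activity("name:1", "Proposed edit to authorship",
                 datetime(2013, 3, 1, 9, 0, tzinfo=timezone.utc)),
        Activity("name:2", "Proposed addition of a name",
                 datetime(2013, 3, 2, 9, 0, tzinfo=timezone.utc)),
    ],
)
print(feed)
```

Combining the record identifier with the change time in the guid lets a consumer tell two edits to the same record apart.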

How the names backbone will support services
We’re working to enable service-level access to the data, by:
• Establishing authority
• Reducing duplication
• Data standards to represent well-known entities
• Persistent identifiers on those well-known entities
• Meaningful versioning – what changed, when
• Enabling remote curation
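
"Meaningful versioning: what changed, when" can be sketched as a field-level change log keyed on a persistent identifier. This is one possible reading of that bullet, not a description of the backbone's actual versioning; the `Change` record, the `diff_versions` helper, the identifier and the record contents are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass(frozen=True)
class Change:
    record_id: str        # persistent identifier of the record that changed
    field_name: str       # what changed
    old: Any
    new: Any
    changed_at: datetime  # when it changed


def diff_versions(record_id: str, old: Dict[str, Any], new: Dict[str, Any],
                  changed_at: datetime) -> List[Change]:
    """List field-level differences between two versions of one record."""
    changes = []
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before != after:
            changes.append(Change(record_id, key, before, after, changed_at))
    return changes


# Placeholder record; "Aus bus" is not a real name.
version_1 = {"orthography": "Aus bus", "authorship": "Smith"}
version_2 = {"orthography": "Aus bus", "authorship": "Smith ex Jones",
             "type_citation": "Placeholder type"}
log = diff_versions("name:1", version_1, version_2,
                    datetime(2013, 3, 1, tzinfo=timezone.utc))
for change in log:
    print(change.field_name, repr(change.old), "->", repr(change.new))
```

Recording changes per field rather than per record tells a remote curator exactly what to re-check, and the same entries could feed the activity feeds listed for each layer.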

Timescales 2013

Until March:
• First release: familial and generic classification
April to August:
• Extend to name occurrence layer
• Extend to species, incorporate WCS
• Prioritise compilation process
September onwards (inc. TDWG):
• Comms with service providers / consumers

Editor's notes

Many systems, few links. Huge overlap in data and functionality. A single scientific question can be answered in multiple different ways.
DEFRA-funded project for Kew internal information management, but applicable more widely. Staffed with a development team of 5 and a data improvement team of 4, plus people working on project management and business change. Names are crucial to Kew’s scientific work and day-to-day management of the collections. We have many systems which hold nomenclatural and taxonomic information.
Name occurrence layer: any informal attempt at the transcription of a name. Some name occurrences are code-governed names, eligible to appear in the next layer, the names layer. This holds all the objective published facts about a name: its orthography, authorship, protologue reference, type citation and objective synonymy. Concepts layer: hypotheses draw these names together to form concepts via heterotypic synonymy. Most people are interested in working with concepts. Unfortunately most people are only armed with name occurrences.
IPNI / IF / ZooBank
WCSP etc.
Maybe ask about distribution / replicated system?
OpenUp as an example
Robust / quality /sustainable?
Social issues are hard to resolve... But it's not all doom and gloom:
We aim to open up our resources and allow their use as a platform upon which others can work.