Ringgold was excited to present at the 2015 Frankfurt Book Fair, Professional & Scientific Information Hot Spots Stage.
“Metadata & Standard Identifiers in Scholarly Publishing” showed how your organization can benefit from our data services in the ever-challenging scholarly landscape.
Metadata & Standards in Scholarly Communication
1. Metadata & Standards in
Scholarly Communication
Frankfurt Book Fair – Hot Spot
October 14 2015
@RinggoldInc
4. What problems did this lack of standardization
cause?
• Economic
• Logistical
• Military
People were unable to move freely about,
or to capitalize on new markets
Places were disconnected from each other,
and some towns became known as “break
of gauge” hubs, the place where railway
systems met up.
Things – goods – could not move freely
about, be sold into new markets.
Conversion technologies needed to be
developed, such as special dual gauge
railcars.
Result: Added expense, time, and
inefficiency.
5. Solution: Conversion to a Standard
• May 31, 1886
• 36 hours
• 3 inches
• 11,500 miles of track across
the southern United States
Result: Unified infrastructure to move goods & people
7. What are
we trying to
connect?
People: Authors,
Members, Editors,
Readers, Researchers
Places: Licensees,
Publishers,
Funders,
Intermediaries
Things: Ideas,
Content, Research
data, Grants,
Citations
8. Where are we trying to move our ideas & our
information?
• Around our company
• To/from external partners
• To/from scholars around
the globe
• Into the great unknown
9. What problems are we facing?
• Entity management:
Jens-Peter Mueller or J-P. Müller?
Uni Hannover or Hanover College?
• Discoverability: Users, librarians,
researchers, students.
• Interoperability: Systems, languages,
data silos based on functions.
11. Your passport to the
world of institutions
• More than 400,000 institutions
playing roles in scholarly
communications
• Disambiguate: Ringgold ID
• Describe: Up to 25 pieces of
structured metadata about each one
• Link: Organized hierarchically
Problems solved: Entity
management & interoperability
12. Identify Data Elements
• Ringgold Identifier
• Name: official & alternatives
• Location
• URL/domain
• Size metrics
• Tier assignments: JISC,
Carnegie, Ringgold
• Authentication: Athens, IPs
• Ringgold Type: sector & subject
• Links: Hierarchical & consortia
• ISNI matched to each Identify
record
• Expanded descriptive metadata:
• Granular subjects
• Reach, sites
• Economic model, governance
• Level within hierarchy
• Mission, description
• Activity status
13. Prepare your published
content for the journey
• Structured metadata
• Descriptive abstracts to drive
discovery
• Professionally written
• Systems-agnostic delivery
• Broad licensing
Problem solved: Discoverability
14. ProtoView Data
Elements
• Ringgold ID & ISNI for publisher
• Abstract for book and chapters
• Bibliographic info
• URL of work
• Ringgold Subjects
• TOC
• DOIs
• Chapter titles and page ranges
• Cover image
15. Bon Voyage: Journeys in
Scholarly Communications
Three examples of institutional identifiers & structured data in action
16. Taylor & Francis: Normalizing in-house data
• Challenge: Overworked customer
service, licensees frustrated
• Underlying cause: Duplicate and
inaccurate customer records
• Solution: Applied Ringgold
Identifiers to sold-to and
licensed-to accounts
"If we still had all the
previous problems, we'd
need a customer service
team that is double the
size.”
--Sarah Wright, Customer
Services Director
Results:
18. ORCID: Results
Stats: As of September 2015:
• 340,000+ ORCID records with educational affiliation
• 327,000+ ORCID records with employment affiliation
Benefits already being realized:
• Linking researchers to their thesis ID and degree-granting higher education
institution
• Tracking grantees and researchers across their research career
• Supporting access to institutional resources
• Enabling access to research findings supported by public funds
• Providing unambiguous affiliation data during manuscript submission or grant
application
• Enabling credit to be given for peer review and other contributions
ORCID sees a future where additional identifiers and registries are linked together,
further advancing the potential for connections.
19. Aries + Copyright Clearance Center
Connecting multiple organizations &
data sets
Challenge: Correct application
of APC rules & discounts
• Multiple systems & data sources involved
• Complex criteria + complex institutional
relationships
Solution: Get everyone
speaking the same language
20. Solving the challenge of APC discounts
Publisher
identifies
institutions
eligible for
discount
Holds &
administers
pricing rules
Author
affiliation
entered in
EM
Ringgold
ID 12266
21. Where are we going? Everywhere.
Ringgold’s Mission
To provide identifiers and
structured data to power the
efficient exchange of
information throughout the
scholarly research community.
Does anyone know what this is a map of?
5 minute railway story.
MAY DELETE THIS SLIDE IN FAVOR OF THE NEXT ONE ONLY.
Rail gauge, in other words, the distance between the two rails in a standard railroad track. This is a map of current gauges, and as you can see there is no global standard.
In the days when we moved hardgoods and people around from place to place, railroads became a key mode of transport. As miraculous as rail travel was, there was one thing that limited its potential: a lack of a standard gauge.
Train gauge: an example of an infrastructure standard that enables smooth, efficient, broad transmission of people and things.
As disconnected as this is, it used to be worse in the US.
In the US, multiple gauges were used, ranging from 2 feet to 6 feet wide. As the country became more interconnected and lines began to meet up in the concentrated Northeast, a standard gauge of approx. 4 feet 9 inches began to be adopted (thanks to many imported railcars from the UK, which used this gauge).
Following the Civil War, trade between the South and North grew and the break of gauge became a major economic nuisance. Competitive pressures had forced all the Canadian railways to convert to standard gauge by 1880, and Illinois Central converted its south line to New Orleans to standard gauge in 1881, putting pressure on the southern railways.
In 1886, the southern railroads agreed to coordinate changing gauge on all their tracks. After considerable debate and planning, most of the southern rail network was converted from 5 ft (1,524 mm) gauge to 4 ft 9 in (1,448 mm) gauge, then the standard of the Pennsylvania Railroad, over two remarkable days beginning on Monday, May 31, 1886. Over a period of 36 hours, tens of thousands of workers pulled the spikes from the west rail of all the broad gauge lines in the South, moved them 3 in (76 mm) east and spiked them back in place. The new gauge was close enough that standard gauge equipment could run on it without problem. By June 1886, all major railroads in North America were using approximately the same gauge. The final conversion to true standard gauge took place gradually as track was maintained. Now, the only broad-gauge rail systems in the United States are some city transit systems.
https://en.wikipedia.org/wiki/Track_gauge_in_the_United_States
http://southern.railfan.net/ties/1966/66-8/gauge.html
HANDOFF TO CHRISTINE: INFORMATION INFRASTRUCTURE
New standards & infrastructure are needed if we are to connect our scholarly world. But, we are still in essence trying to connect people, places, and things.
WHAT ARE WE TRYING TO TRANSMIT?: Ideas in whatever format they may be: books, journals, research data; the content itself.
Records about the business of scholarship: financial and other transactional information such as subscription fees, APCs.
We want to connect authors with reviewers and readers, and we want to connect those people with institutions: where they perform their research, where they got their education, the institution that funded their research, who published their article, and who should be paid royalties and owns the rights.
Now we are talking about the systems which house, transmit, and receive all that data:
Our internal systems: if we are a publisher we probably have more than a few: peer review, finance, fulfillment
And to all our publishing service partners, licensees, libraries, and users. We do not know all the data-driven systems which can possibly come in contact with our content & our data.
Every time one system talks to another, there is the possibility of a breakdown in communication: and what we are aiming for is a frictionless environment.
What form can those communication breakdowns take? Multiple:
Not understanding which person, place, or thing the data refers to.
Not realizing all the possible systems which might interact with our data. Most of us deal with a global user base, libraries around the world, and an array of publishing services systems. How can we minimize the possibility that we will be misunderstood?
What Ringgold specializes in is the ability to dramatically improve the infrastructure, by supplying standard institutional identifiers and structured metadata for both the world of institutions and scholarly content.
Metadata & IDs assure your content/information smooth passage on its journey, like a passport.
Unique identifiers, like passports, are universally understood & recognized, and offer smooth passage from one system to another. They are agnostic of language and territory.
A few identifiers which are broadly adopted include ORCIDs for individual researchers, ISSNs for journals, DOIs for digital content, and the Ringgold Identifier for institutions.
Our Ringgold IDs are part of the Identify Database, which covers all manner of institutions playing any sort of role in scholarly comms, such as universities, funders, publishers, commercial entities, hospitals, etc. In addition to applying a unique ID to each organization, we go further and describe them in great detail, and join all related records together into family trees or hierarchies. We provide this connective tissue between universities and their subject departments, or companies and their subsidiaries, so that our users can make sense of the complex networks.
In addition to solving the problems of entity management and data interoperability with unique IDs, the supplemental metadata & hierarchies power business intelligence for our clients and allow them to minimize manual entry of data (which really cuts down on the need for retrospective cleanup and error correction). In short, it acts as an institutional authority file, understood by more than 70 publishers and intermediaries in the scholarly space.
Here is the descriptive metadata we hold. So, in addition to having unique institutions disambiguated, this metadata allows for robust and multifaceted analysis of the institutional base.
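As a rough illustration of how descriptive metadata of this kind supports analysis, here is a minimal Python sketch. The class, its field names, and every sample value are assumptions made up for this example; they loosely mirror a few of the data elements listed on the slide and are not the actual Identify Database schema.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InstitutionRecord:
    """Illustrative record; field names are assumptions, not the Identify schema."""
    identifier: int
    official_name: str
    alternative_names: List[str] = field(default_factory=list)
    location: str = ""
    sector: str = ""
    parent_id: Optional[int] = None  # link to the parent in the hierarchy

    def matches_name(self, name: str) -> bool:
        """True if name equals the official or an alternative name, ignoring case."""
        wanted = name.strip().casefold()
        known = [self.official_name, *self.alternative_names]
        return any(wanted == candidate.strip().casefold() for candidate in known)


# Invented sample data for illustration only.
records = [
    InstitutionRecord(900010, "Example University", ["Example Univ"], "Exampletown", "academic"),
    InstitutionRecord(900011, "Example University Medical School", [], "Exampletown", "academic", 900010),
    InstitutionRecord(900030, "Example Research Funder", [], "Sampleville", "funder"),
]

# One simple facet of analysis: how many institutions fall into each sector.
sector_counts = Counter(record.sector for record in records)
print(sector_counts["academic"], records[0].matches_name("  example univ "))
```

Once each institution is a single structured record, a question such as "how many of our customers are academic?" becomes a simple count rather than a cleanup project.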
Metadata is a well-packed suitcase: we don’t know where we will wind up, but we want to be prepared for any user, any system, even ones which we don’t know about yet. Our users explore the world of scholarly content via knowledge bases, discovery services, web searching, and of course library catalogs – so our editors create and augment any metadata from the publisher, so that it is as complete as possible. One thing that distinguishes ProtoView is our creation of original, keyword-rich abstracts for titles and chapters. Our research shows that abstracts enhance the rate of purchase, which I’m sure will surprise no one who has ever bought a book off of Amazon.
In addition to creating this high-quality metadata, we go further and disseminate it via our network of licensees.
These data elements can vary depending on the needs of the publisher. Important to note that while we create data that will enable broad discovery, it also supports interoperability via the inclusion of Ringgold IDs, ISNIs, and DOIs.
So let’s take a look at a few examples of ways our partners and clients have applied these principles to solve common infrastructure problems.
This is an example of how a publisher used unique identifiers to solve issues within their own four walls:
T&F found themselves with a large volume of duplicate or inaccurate customer accounts, which resulted in the inability to quickly ascertain what content a given user was entitled to and resolve access denials.
So they decided to clean up their customer records using an external authority file, the Identify Database. They applied Ringgold IDs to all institutional accounts, which allowed them to confidently ID all those duplicates, and radically reduce the amount of time spent on the simplest of inquiries – they experienced a 70% reduction in the volume of such queries, now that they’ve got more trustworthy data in their customer service and authentication systems.
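The general idea of spotting duplicates once every account carries an institutional identifier can be sketched in a few lines of Python. This is only an assumption-laden illustration, not T&F's actual process: the account numbers, names, and identifier values below are invented.

```python
from collections import defaultdict

# Hypothetical customer accounts; every value here is invented for illustration.
accounts = [
    {"account": "A-001", "name": "Univ. of Example", "ringgold_id": 900001},
    {"account": "A-002", "name": "University of Example", "ringgold_id": 900001},
    {"account": "A-003", "name": "Example College", "ringgold_id": 900002},
]


def group_by_identifier(records):
    """Group account numbers by the institutional identifier they carry."""
    groups = defaultdict(list)
    for record in records:
        groups[record["ringgold_id"]].append(record["account"])
    return dict(groups)


def find_duplicates(records):
    """Return only the identifiers that appear on more than one account."""
    grouped = group_by_identifier(records)
    return {rid: accts for rid, accts in grouped.items() if len(accts) > 1}


duplicates = find_duplicates(accounts)
print(duplicates)
```

The two spellings of the same university would never match on name alone, but they collapse into one group as soon as both accounts share the same identifier.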
I’ve already mentioned ORCID, so I wanted to highlight how they have applied Identify in order to join up researchers with their affiliations: Rather than having users enter their employment and educational affiliations using free text – which would just need to get cleaned up at some point in the future – ORCID has embedded Identify’s institutions in the back-end of their registration system. Now, researchers can instantly create a clean link in their record to the right affiliation. It’s a researcher-friendly process, appearing to them as a simple drop-down menu, but powerfully embedding authoritative institutional metadata in the ORCID record. We’ve made a frictionless and unambiguous connection between a person and a place.
And finally, here’s a case where multiple players are transmitting clean data – using the passport of standard identifiers – to seamlessly connect systems and resolve an issue. The problem is one that is really scaling up, with the growth of open access: the application of institution-based discounts to article processing charges. Criteria for discounting or waiving OA charges are becoming more complex: they can be based on the type of institution or on the institution’s status as a subscriber or institutional affiliate. Authors now expect an instant, accurate calculation of that APC at the point they submit their manuscript.
What did we do? We began working with all the players, to remove some of the friction – the ambiguity – from these data transfers.
We have three different business partners involved in this workflow: the publisher, the peer review & manuscript system in Aries, and CCC handling the calculation & billing for APCs. With each of them using the same “passport” or set of standard identifiers, systems can transmit data easily, and make that connection between the author’s clean affiliation and the publisher’s set of eligible institutions. CCC is also able to travel up & down the Ringgold “family trees” of institutions, so when the publisher has determined that the Univ of Michigan gets a discount, but the author says they are from the UM Medical School, the proper connection is still made, and the publisher’s business rules are supported.
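The “family tree” lookup described here can be sketched as a walk up a parent chain. The Python below is a minimal sketch under stated assumptions, not how CCC or Aries implement it: the identifiers, the child-to-parent mapping, and the function names are all hypothetical.

```python
# Hypothetical child -> parent links between institutional identifiers.
# All identifier values are invented for illustration.
PARENT_OF = {
    900011: 900010,  # medical school -> university
    900012: 900011,  # department -> medical school
}

# Hypothetical set of institutions the publisher has marked as eligible.
ELIGIBLE = {900010}


def ancestors(inst_id, parent_of):
    """Yield inst_id itself, then each parent up the hierarchy."""
    seen = set()
    current = inst_id
    # The 'seen' set guards against an accidental cycle in the data.
    while current is not None and current not in seen:
        seen.add(current)
        yield current
        current = parent_of.get(current)


def qualifies_for_discount(inst_id, eligible, parent_of):
    """True if the institution or any of its ancestors is on the eligible list."""
    return any(candidate in eligible for candidate in ancestors(inst_id, parent_of))


print(qualifies_for_discount(900012, ELIGIBLE, PARENT_OF))
print(qualifies_for_discount(900020, ELIGIBLE, PARENT_OF))
```

Because the check climbs the hierarchy, the publisher only has to list the parent institution once; an author affiliated with a school or department beneath it is still matched.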
This is really an ideal case, and shows just the beginning of what’s possible when we all work using standards and structured metadata.
Because the only thing that is certain as we develop is that our data about people/places/things, and the content itself, will continue to be created, transmitted and stored in a digital infrastructure, and be intercepted by a growing number of systems. We will want to analyze our data, and parse our content in more granular ways, which will only be effectively done if we future-proof it with the use of proper identifiers and metadata.