The document discusses CrossRef's mission to enable easy identification and use of trustworthy electronic content. It does this through reference linking, cited by linking, metadata services, plagiarism detection, and new initiatives like CrossMark and Contributor ID. The document also discusses three identity problems: authentication and authorization, name variations, and disambiguation. It proposes using unique identifiers like DOIs and Contributor IDs to help solve these problems and enable knowledge discovery and authentication.
2. CrossRef’s Mission
To enable easy identification and
use of trustworthy electronic
content by promoting the
cooperative development and
application of a sustainable
infrastructure
6. How we fulfill our mission:
• Reference Linking
• CitedBy Linking
• CrossRef Metadata
Services
• CrossCheck
Plagiarism Detection
• New initiatives like
CrossMark and
Contributor ID
7. Internet Trust Antipattern
1. Experts build system
2. System touted as non-hierarchical
3. Users appear
4. Carp ensues
5. Regulation restores order
In Google We Trust? Geoffrey Bilder, Journal of Electronic Publishing, vol. 9, no. 1, Winter 2006
9. 1. Authentication and Authorization:
How do I get to what I want?
Uname/Passwords:
bank: 12345
retirement: 12345cam
editorial system: mom’s birthday
12. Carol Anne Meyer
Meyer, Carol Anne
Carol A. Meyer
Meyer, Carol A.
Meyer, C.A.
C.A. Meyer
Carol Meyer
Meyer, Carol
C. Meyer
Meyer, C.
Ms. Meyer
Mrs. Meyer
Carol Meyer Ruben
Mrs. Ruben
Mrs. Ruben-Meyer
Mommy
foodislove
30. Find out more...
• Researcher Identification Primer
www.gen2phen.org/researcher-identification-primer
• CrossTech Blog
http://www.crossref.org/CrossTech/
• Gobbledygook, 17 Feb 2009, Martin Fenner’s interview of Geoff Bilder
http://network.nature.com/people/mfenner/blog/2009/02/17/interview-with-geoffrey-bilder
• “Are You Ready to Become a Number?” Martin Enserink, Science 27 March 2009:
1662-1664, DOI: 10.1126/science.323.5922.1662
• “I Am Not a Scientist, I Am a Number.” Philip Bourne and PE Fink (2008) PLoS
Comput Biol 4(12): e1000247. DOI:10.1371/journal.pcbi.1000247
33. Photo : Tim Parkinson
A few more things
about CrossRef
There is a battle being waged between opposing camps about whether it is best to rely on experts as authoritative sources of information or whether the “wisdom of crowds” and web 2.0 magic will replace experts.
Did you hear about the time when a president went to visit a nursing home? He walked up to a lady in a wheelchair and tried to be polite, but found that he wasn't being very successful at carrying on a conversation with her. Finally, in desperation he said, "Ma'am, do you know who I am?" She answered, "No sir, I don't know who you are--but if you go up to that desk they can tell you."
2700 publishers and societies
Almost 20,000 journal titles
Close to 36 million DOIs registered.
Reference linking includes multiple content types, backfiles
1. System is started by a self-selecting core group of high-trust technologists (or specialists of some sort).
2. System is touted as authority-less, non-hierarchical, etc. But this is not true (see 1).
3. The general population starts using the system.
4. The system nearly breaks under the strain of untrustworthy users.
5. Regulatory controls are instituted to restore order. Sometimes they are automated, sometimes not.
6. If the regulatory controls work, the system is again touted as authority-less, non-hierarchical, etc. But this is not true (see 5).
To explain this trend I want to quickly review the Internet Trust Anti-pattern as outlined by Geoffrey Bilder – many of you may have heard him speak about this but he defined it very well in a blog posting and journal article from a couple of years ago.
I think that in the Internet Trust Anti-pattern lies hope for scholarly publishers.
Solutions include OpenID and Shibboleth
OpenID is user-centric: you claim your own ID
Shibboleth is institution-verified
Vince Smith http://vsmith.info/OpenID
These are all legitimate names for me.
Multiple authors can have the same name
Zoominfo is an example of claiming
The problem is aging and abandoned data.
Which John Smith wrote this paper?
The problem is much worse for the many Asian authors who share relatively few surnames.
Name changes
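The name problems in these notes can be made concrete with a small sketch. It assumes a deliberately crude matching key (lowercase surname plus first initial), which is an illustrative heuristic and not anything CrossRef is described as using. "Charles Meyer" is a hypothetical second person added only to show a collision.

```python
def name_key(name):
    """Crude matching key: lowercase surname plus first initial.

    Illustrative heuristic only (an assumption for this sketch).
    """
    name = name.replace(".", " ").strip()
    if "," in name:
        # "Surname, Given" form
        surname, given = [part.strip() for part in name.split(",", 1)]
    else:
        # "Given Surname" form: treat the last word as the surname
        parts = name.split()
        surname, given = parts[-1], " ".join(parts[:-1])
    return f"{surname.lower()}, {given[:1].lower()}"


# Variants of one person collapse to a single key...
variants = ["Carol Anne Meyer", "Meyer, C.A.", "C. Meyer", "Meyer, Carol"]
variant_keys = {name_key(v) for v in variants}

# ...but a name change splits the same person into a second key,
changed_key = name_key("Carol Meyer Ruben")

# and a different (hypothetical) person lands on the first key.
collision_key = name_key("Charles Meyer")

print(variant_keys, changed_key, collision_key)
```

Name matching alone both splits one person and merges two, which is the case for attaching a unique identifier to the person rather than to the string.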
Classic problems of identity – CR is in a position to do something cross-eyed.
Internal holding name – talked to you about some of the problems with name authority but publishers have some specific problems with use of manuscript tracking systems – so authentication is a real issue. So we must address both authorization and disambiguation.
Why not just authors? reviewers, editors, correspondents, bloggers, commenters
Use Cases?
Knowledge Discovery:
* Determine what IDs authored/edited/reviewed document X
* What documents were authored/edited/reviewed by ID Y
* What IDs are related to ID Z and what is the nature of that relationship (e.g. co-authored, edited, reviewed)
* What (subject to privacy settings) is the profile information for ID Z (e.g. institutional affiliation, email address, etc.)
* All the author IDs and their respective publications where the institutional affiliation recorded by the author is X
* Etc.
At this point I feel obliged to point out that the bulk of our requirements gathering has been focused on trying to understand the needs of our member publishers. The reason I mention this here is that most of the “authentication” use cases that we identified focus on making publisher back-office systems less cumbersome. So, for instance, publishers are interested in using a “contributor id” for:
* Single sign-on (SSO) for manuscript tracking systems
* Disambiguating contact information for use by editorial offices, royalty payments systems, copyright clearances, etc.
* Automatic updating of email addresses for table of contents (TOC) alerts and other automated email communications
* Automated tools for detecting potential reviewers, including tools for detecting potential conflicts of interest
* Synchronization with publisher web site user profiles and granting researchers customized, privileged access to content based on profiles
* Understanding all of the manifold ways in which an individual “contributes” to a publisher or a field (e.g. as an editor, reviewer, letter writer, conference chair, etc.)
Study of cracked pots. An author submits a manuscript, for the first time at this journal. So the journal says: who are you? Go get an ID from CrossRef. The author registers and gets an identity to provide to the journal. They can claim things like what they’ve published in the past, their institution, their homepage.
Interesting thing – article is published and metadata goes to CrossRef – author IDs can be sent too. Once CR has them then we can figure out what that author has published (not just what they claim they’ve published). Publishers are verifying author claims. Multiple levels of claims – some more authoritative than others. Series of different trust measures.
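The idea of claims with different levels of authority can be sketched as a small data structure. This is a minimal illustration under stated assumptions: the two trust levels, the `Claim` record, and the `strongest_claims` helper are hypothetical names invented here, and the DOIs and IDs are placeholders, not real records.

```python
from dataclasses import dataclass
from enum import IntEnum


class Trust(IntEnum):
    # Hypothetical levels; the notes only say some claims are more authoritative.
    SELF_CLAIMED = 1        # the contributor asserts it in their own profile
    PUBLISHER_VERIFIED = 2  # a publisher sent the ID along with article metadata


@dataclass(frozen=True)
class Claim:
    contributor_id: str
    doi: str
    trust: Trust


def strongest_claims(claims):
    """Keep, per (contributor, DOI) pair, only the most trusted claim."""
    best = {}
    for claim in claims:
        key = (claim.contributor_id, claim.doi)
        if key not in best or claim.trust > best[key].trust:
            best[key] = claim
    return best


claims = [
    Claim("contrib-001", "10.5555/example.1", Trust.SELF_CLAIMED),
    Claim("contrib-001", "10.5555/example.1", Trust.PUBLISHER_VERIFIED),
    Claim("contrib-001", "10.5555/example.3", Trust.SELF_CLAIMED),
]
best = strongest_claims(claims)
for (contributor_id, doi), claim in sorted(best.items()):
    print(contributor_id, doi, claim.trust.name)
```

A self-asserted claim is kept until a publisher deposit for the same work confirms it, at which point the stronger claim replaces the weaker one.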
1. Lookup all contributors associated with a particular DOI
2. Lookup all DOIs associated with a contributor
3. Lookup profile metadata associated with a specified contributor
4. Lookup all contributors associated with another contributor and the type of association that exists between them (co‐author, co‐cited, etc.)
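The four lookups above could be served by a simple two-way index between DOIs and contributor IDs. The following is a toy in-memory sketch, not a CrossRef API: the class, method names, roles, and relationship labels are all assumptions made for illustration, and the DOIs and IDs are placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ContributorIndex:
    """Toy in-memory index; all names are hypothetical, not a CrossRef API."""
    by_doi: dict = field(default_factory=lambda: defaultdict(set))
    by_contributor: dict = field(default_factory=lambda: defaultdict(set))
    profiles: dict = field(default_factory=dict)

    def register(self, doi, contributor_id, role="author"):
        # Record the link in both directions so either lookup is cheap.
        self.by_doi[doi].add((contributor_id, role))
        self.by_contributor[contributor_id].add((doi, role))

    def contributors_for_doi(self, doi):
        """Lookup 1: all contributors associated with a DOI."""
        return sorted({cid for cid, _ in self.by_doi.get(doi, ())})

    def dois_for_contributor(self, contributor_id):
        """Lookup 2: all DOIs associated with a contributor."""
        return sorted({doi for doi, _ in self.by_contributor.get(contributor_id, ())})

    def profile(self, contributor_id):
        """Lookup 3: profile metadata for a contributor (empty if none)."""
        return self.profiles.get(contributor_id, {})

    def related_contributors(self, contributor_id):
        """Lookup 4: other contributors sharing a DOI, with the relationship type."""
        related = defaultdict(set)
        for doi, role in self.by_contributor.get(contributor_id, ()):
            for other, other_role in self.by_doi.get(doi, ()):
                if other == contributor_id:
                    continue
                if role == other_role == "author":
                    related[other].add("co-author")
                else:
                    related[other].add(f"{role}/{other_role}")
        return {cid: sorted(kinds) for cid, kinds in related.items()}


index = ContributorIndex()
index.register("10.5555/example.1", "contrib-001")
index.register("10.5555/example.1", "contrib-002")
index.register("10.5555/example.2", "contrib-001", role="editor")
index.profiles["contrib-001"] = {"affiliation": "Example University"}

print(index.contributors_for_doi("10.5555/example.1"))
print(index.dois_for_contributor("contrib-001"))
print(index.profile("contrib-001"))
print(index.related_contributors("contrib-001"))
```

Relationships such as co-cited would need citation data as well, so this sketch only derives relationships from shared DOIs.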
Other organizations are interested in the identification problem--we are working with them.
Thomson Researcher ID--Disambiguation
Scopus Author Identifier--Disambiguation
Author Resolver--ProQuest disambiguation plus profiles from Scholar Universe
Gen2Phen--Researcher Identification
Prototyping
So, what’s happening now?
Discussions around "contributor IDs" (aka "Author ID", "Researcher ID", etc.) seem to be becoming quite popular. In the interview that I pointed to in my last post, I mentioned that CrossRef has been talking with a group of researchers who were very interested in creating some sort of authenticated contributor ID as a mechanism for controlling who gets trusted access to sensitive genome-wide aggregate genotype data.
Well, I'm delighted to say that said group of researchers (at the GEN2PHEN project) have created a "Researcher Identification Primer" website in which they outline the many use-cases and issues around creating a mechanism for unambiguously identifying and/or authenticating researchers. This looks like a great resource and I expect it will serve as a useful focus for further discussion around the issue.
Article in Yesterday’s Science by
Some Other Stuff:
Books Best Practices
Reciprocal Reference Linking
Backfiles
46 publishers
9 of the largest 10 publishers in CR
12.5 million documents indexed.
17K titles
Join us