CrossRef Overview and Initiatives, Copenhagen, June 2013
1. CrossRef Overview and Initiatives
Rachael Lammey
Product Manager
Denmark on the Road to Best Practice in Scholarly Publishing
June 2013
2. • Association of scholarly publishers
– 1500 Members
– Representing 4000 publishers
• 1700 Library Affiliates who can query system for DOIs and metadata
• DOI Registration Agency for Scholarly Publications (mostly in English
language)
• Over 60 million DOIs assigned
3. • Provides services publishers
cannot accomplish alone - they
require collaboration
• 16-member international board
of directors from membership
• Many types of publishers:
Commercial, societies, non-
profits, university presses, Open
Access publishers -
66% non-profit
• All subjects: STM, humanities,
social science, professional
More about CrossRef
4. ?
Why do publishers join?
To get persistent identifiers for their content
To drive more traffic to their content
To turn references into hyperlinks
To pull in cited-by links (who cites this?)
Participate in other collaborative services
(CrossCheck, CrossMark)
7. User clicks on
CrossRef DOI
reference link in
Journal A
Guo W, Wang ZY, Wang YL, Zhang ZP, Gui JF. Isolation and
characterization of six microsatellite markers in the large yellow croaker
(Pseucosciaena crocea Richardson). Mol Ecol Notes, 2005, 5(2): 369–371.
[CrossRef]
DOI
directory
returns URL
User accesses
cited article in
Journal B
9. "CrossRef's goal is to be a trusted
collaborative organization with broad
community connections; authoritative and
innovative in support of a persistent,
sustainable infrastructure for scholarly
communication."
16. Prefix:
Assigned to members
Format is 10.XXXX (or 10.XXXXX)
Identifies who initially created the
DOI
Prefix does not identify the
current owner of the DOI
Suffix:
Unique within a prefix –
a DOI can only be
assigned to one item
Consistent
Logical
Easily documented
Readily implemented
Assigning DOIs to your content
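The prefix/suffix structure above can be sketched in a few lines of Python. This is only an illustration under the slide's stated format (a prefix of "10." followed by four or five digits); the function name `split_doi` is hypothetical and is not part of any CrossRef tool.

```python
def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not suffix:
        raise ValueError(f"DOI has no suffix: {doi!r}")
    # Format from the slide: 10.XXXX or 10.XXXXX
    registrant = prefix[3:]
    well_formed = (
        prefix.startswith("10.")
        and registrant.isascii()
        and registrant.isdigit()
        and len(registrant) in (4, 5)
    )
    if not well_formed:
        raise ValueError(f"unexpected prefix format: {prefix!r}")
    return prefix, suffix


# The suffix may itself contain slashes; only the first slash separates the parts.
print(split_doi("10.5555/imadoi"))
print(split_doi("10.5555/journal/2013/001"))
```

Note that the prefix only tells you who initially created the DOI, so code like this should not be used to infer the current owner.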
17. Response page must include:
bibliographic information about the item
means to access full text
the DOI
DOI Display guidelines!
http://www.crossref.org/02publishers/doi_display_guidelines.html
CrossRef DOIs should always be displayed as permanent URLs in the
online environment.
YES: http://dx.doi.org/10.5555/imadoi
NO: doi: 10.5555/imadoi
Publish
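As a rough sketch of the display guideline above, a publisher's page template could normalise whatever form of the DOI it holds into the permanent URL form. The helper name `doi_to_display_url` is an assumption made for illustration, and the sketch does not handle suffix characters that would need percent-encoding.

```python
RESOLVER = "http://dx.doi.org/"  # resolver prefix given in the display guidelines


def doi_to_display_url(raw: str) -> str:
    """Turn 'doi: 10.5555/imadoi' or a bare DOI into the URL form."""
    doi = raw.strip()
    if doi.lower().startswith("doi:"):
        # Drop the "doi:" label and any space after it
        doi = doi[4:].strip()
    if doi.startswith(RESOLVER):
        # Already in the recommended form
        return doi
    return RESOLVER + doi


print(doi_to_display_url("doi: 10.5555/imadoi"))
```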
18. DOIs are required on the response page, recommended on other pages:
Tables of contents
Abstracts
Full text HTML and PDF articles and other scholarly documents
Citation downloads to reference management systems
Metadata feeds to third parties
“How to Cite This” instructions on content pages
Social networking links
Anywhere users are directed to a permanent, stable, or persistent
link to content.
22. Deposit
interfaces
The vast majority of
transactions are made via a
machine interface
Public interface:
Web deposit form:
http://www.crossref.org/webDeposit/
System interface:
http://doi.crossref.org
HTTP
26. CrossRef Cited-By Linking
Who’s Citing You?
Discover how your
publications are being
cited and incorporate
DOI links to the citing
content into your online
publication.
27. • 283 Members
• 305,917,275 Cited-By Links
• 23,243,270 DOIs with Cited-By Links
• 18,116,258 Documents with
References
Who’s Citing You?
33. 473 publishers
Over 37 million content items indexed
86,760 titles
80,000+ manuscripts checked each month
37. A logo that identifies a publisher-maintained copy
of a piece of content
Clicking the logo tells you
Whether there have been any updates
If this instance is being maintained by the
publisher
Where the publisher-maintained version is
Other important publication record information
What is CrossMark?
45. • 60,000 CrossMark records
450+ corrections
• 6000 “hits” on 1500 unique
records
Statistics
46. Where to find help:
Help documentation: http://help.crossref.org
CrossRef support: email support@crossref.org or visit
http://support.crossref.org
Webinars: http://www.crossref.org/01company/webinars.html
Staying up to date:
Announcements forum:
http://support.crossref.org/forums/147622-
announcements
subscribe via RSS or email
CrossRef Quarterly: CrossRef newsletter
CrossRef Blog
CrossTech Blog
Fees:
So at the end of 1999 a group of publishers got together and decided to collaborate to solve the problem, and CrossRef was set up as a strategic organization. CrossRef is a non-profit membership association of publishers, with all members being equal. We were founded to provide services to publishers that are best achieved collaboratively, that is, doing those things that publishers can’t do on their own. We are run by and for publishers, and we include all types of publishers.
What do I mean by this?
CrossRef was founded eleven years ago to solve the problem of broken links. The web is all about links, but links break. This is annoying if you’re browsing the web and want to follow an interesting link, but in the context of scholarly publishing it becomes more than annoying: if you can’t follow a citation from one paper to another, you’re being hampered in your research. Citation linking is one of the greatest benefits of online publishing, but it really does need to be reliable.
When a researcher is looking for high-quality scholarly content, they don’t want to retrieve the 404 “page not found” error. Having this happen undermines trust in the scholarly system and in scholarly publishers.
And it works like this: publishers use CrossRef DOIs to link to content, usually from the references at the end of articles. Users click on those DOI-based links and are referred via the CrossRef database to the cited article at its correct location on the web. If content moves, the publisher only has to update the CrossRef database once, and everyone following CrossRef DOI links from other publishers’ content will be redirected to the content in its new location.
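The “update once, every link keeps working” idea can be illustrated with a toy in-memory directory. This is a sketch only: the class `DoiDirectory` and the example.org URLs are made up for the illustration and do not describe how the CrossRef database or the DOI directory is actually implemented.

```python
class DoiDirectory:
    """Toy stand-in for the DOI directory: maps each DOI to its current URL."""

    def __init__(self) -> None:
        self._urls: dict[str, str] = {}

    def register(self, doi: str, url: str) -> None:
        self._urls[doi] = url

    def update(self, doi: str, new_url: str) -> None:
        # Only an already registered DOI can be pointed somewhere new
        if doi not in self._urls:
            raise KeyError(doi)
        self._urls[doi] = new_url

    def resolve(self, doi: str) -> str:
        return self._urls[doi]


directory = DoiDirectory()
directory.register("10.5555/imadoi", "http://journal-b.example.org/old/article1")

# Two citing journals store the DOI, never the URL.
references_in_journal_a = ["10.5555/imadoi"]
references_in_journal_c = ["10.5555/imadoi"]

# Journal B moves its content and updates the directory once.
directory.update("10.5555/imadoi", "http://journal-b.example.org/new/article1")

for doi in references_in_journal_a + references_in_journal_c:
    print(doi, "->", directory.resolve(doi))
```

Because the citing journals hold only the DOI, neither of them has to change anything when the cited article moves.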
Best way is to give an example. DOI takes me to the article homepage. Book Chapter etc. Lord of the Mines. I’ll come back to DOIs in a moment. This DOI will always can take me
For a bit more background, this is the CrossRef mission statement. We’re a not-for-profit membership association here to support the scholarly communications industry. As you can see, our mission is about more than just reference linking, even though linking is still the main thing that we do for publishers. As such, we have developed other services to meet the needs of our membership, originality screening being one of them. I’ll run you through some other initiatives in this presentation, including a very new one.
1. A DOI link consists of two parts: the DOI resolver URL, and the DOI itself. 2. When combined, the DOI is ‘made actionable’, that is, made into a link. 3. When you click on the link, you’re taken to the current URL of the item. The URL for the DOI in this example has been updated 5 times since it was initially deposited in June 2004. 4. We updated our DOI display guidelines in early August – we now ask that DOIs be represented as a link with the http://dx.doi.org prefix
Registration of content with CrossRef - reference matching and use of DOIs for linking. Hop between different publisher systems.
Multiple resolution--the DOI gives the user a choice of which links to follow.
A few numbers for you to give some idea of how CrossRef has grown in the ten years since its launch. Books are the fastest growing at the moment: most publishers have assigned DOIs to their journals and journal archives, but more and more are now starting to assign them to their books, and to register their book metadata with CrossRef. Publishers are also registering components, 274,000 so far. [...] reports, standards, dissertations, and supplementary materials, but we do have some flexibility, so if you want to assign DOIs to a content type not listed here, please ask.
In March 2013 there were over 90 million clicks on CrossRef DOI links, so 90 million citations resolved to content.
Basically it should be on the article homepage
This is your introduction to XML! In brief, XML offers a widely adopted standard way of representing text and data in a format that can be processed without much human or machine intelligence. Information formatted in XML can be exchanged across platforms, languages, and applications, and can be used with a wide range of development tools and utilities. You can see it’s fairly self-explanatory.
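To give a feel for what XML metadata looks like, here is a small sketch using Python’s standard library. The element names (`article_record`, `journal_title` and so on) and the sample values are invented for illustration; they are not the CrossRef deposit schema, which is described in the help documentation.

```python
import xml.etree.ElementTree as ET


def build_record(doi: str, url: str, journal: str, title: str, year: int) -> str:
    """Build a small XML record; element names are hypothetical."""
    record = ET.Element("article_record")
    ET.SubElement(record, "journal_title").text = journal
    ET.SubElement(record, "article_title").text = title
    ET.SubElement(record, "year").text = str(year)
    ET.SubElement(record, "doi").text = doi
    ET.SubElement(record, "resource").text = url
    return ET.tostring(record, encoding="unicode")


xml_text = build_record(
    doi="10.5555/imadoi",
    url="http://www.example.org/articles/imadoi",
    journal="Journal B",
    title="An example article",
    year=2013,
)
print(xml_text)

# Any XML-aware tool can read the same text back without custom parsing.
parsed = ET.fromstring(xml_text)
print(parsed.findtext("doi"))
```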
We call the process of sending in metadata to the CrossRef system ‘depositing’. We sometimes use the terms ‘deposit’ and ‘register’ interchangeably, but they’re slightly different. When a publisher ‘deposits’ a DOI, the metadata is added to the CrossRef database, making the DOI retrievable. The DOI is also registered with the Handle resolver, which records the DOI and URL only; no citation metadata is recorded by the Handle resolver. Immediately after the submission is processed, the system sends you a submission log. This is very important: data is often messy, and we try to keep the messy stuff out of our database, so there are many reasons your submission might fail.
Submission methods vary from very robust, complicated systems to one guy cutting and pasting stuff from Word into our web deposit form (which converts the data to XML). Most deposits are made via machine interfaces. Data is sent to us via HTTP POST. We do have a simple Java-based tool that can be used for uploads; it’s available in our help documentation. Many publishers prefer to create their own tools. We do not currently accept FTP deposits.
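For publishers building their own tools, sending XML over HTTP POST might look roughly like the sketch below. Everything specific here is an assumption: the endpoint is a placeholder on example.org, and the real deposit URL, parameters, and authentication are described in the CrossRef help documentation. The example only builds the request; the `submit` helper is defined but never called, so nothing is sent over the network.

```python
import urllib.request

# Placeholder endpoint for illustration only: NOT the real CrossRef deposit URL.
DEPOSIT_URL = "http://deposit.example.org/upload"


def build_deposit_request(xml_text: str, url: str = DEPOSIT_URL) -> urllib.request.Request:
    """Wrap an XML document in an HTTP POST request (hypothetical layout)."""
    payload = xml_text.encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def submit(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


request = build_deposit_request(
    "<article_record><doi>10.5555/imadoi</doi></article_record>"
)
print(request.get_method(), request.full_url)
```

Whatever tool is used, the submission log sent back after processing is the place to check whether the deposit actually succeeded.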
Basic web-deposit form.
We’re also working on a reference extraction tool that will strip references from PDFs. Explain how it works. Helps with XML aspect of things.
Clearest way is to give examples.
The first thing that I always say when I talk about CrossCheck is that although we call it a plagiarism detection service, it doesn’t actually detect plagiarism.
So to look at the process in a little more detail: you submit your manuscript to the iThenticate system, and it is by default checked against three databases of content. It is checked against web content: iThenticate indexes web pages in much the same way as a search engine, but with the added advantage that they keep an archive of web pages going back eight years. The manuscript is checked against the CrossCheck database, which contains the content from all of the participating CrossCheck publishers. And it’s also checked against a growing repository of online and offline content that iThenticate is gathering and indexing, including databases from Gale and Ebsco, and sites such as PubMed and arXiv.org. And as before, matches retrieved by comparison with these databases are pulled into a report for an editor to examine in more detail.
The progress of CrossCheck to date. Very comprehensive database - can see list of titles on our website.
And you get to this, which is the first of four different report manipulations available. This one is called the Similarity Report: manuscript on the left, matches on the right from highest to lowest. Scroll up and down to compare. URLs (plus date) or citation, depending on database. Links. Ability to exclude a match if you know it’s not relevant. Click on the left to see the side-by-side report. Show link to Document Viewer and touch on report view.
You might have spotted in the previous examples that the technology isn’t just looking for word-for-word matches. The way that it breaks the text down allows it to spot passages of text with word substitutions, so it is looking for similar as well as identical text. In this example you can see that some of the words have been very subtly substituted or moved, but the technology still picks them up. Ask publishers to let authors know they’re using it, with messages in submission systems or on homepages.
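To show why a comparison made at the level of words still catches text with a few substitutions, here is a toy sketch using Python’s standard `difflib` module. It is not the iThenticate algorithm, which is not described here; the function names and sample sentences are made up for the illustration.

```python
import re
from difflib import SequenceMatcher


def words(text: str) -> list[str]:
    """Lower-case the text and break it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def similarity(a: str, b: str) -> float:
    """Share of words that line up between the two texts (0.0 to 1.0)."""
    return SequenceMatcher(None, words(a), words(b)).ratio()


def shared_passages(a: str, b: str, min_words: int = 2) -> list[str]:
    """Runs of at least min_words consecutive words found in both texts."""
    a_words, b_words = words(a), words(b)
    matcher = SequenceMatcher(None, a_words, b_words)
    return [
        " ".join(a_words[m.a:m.a + m.size])
        for m in matcher.get_matching_blocks()
        if m.size >= min_words
    ]


original = "The quick brown fox jumps over the lazy dog near the river bank"
altered = "The fast brown fox leaps over the lazy dog near the river bank"

# Two substituted words barely dent the score, and the shared runs survive.
print(round(similarity(original, altered), 2))
print(shared_passages(original, altered))
```

As with the real service, a high score only flags overlapping text; deciding whether it amounts to plagiarism is left to the editor.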
UPDATES!
So CrossMark. At its simplest it’s a logo that publishers will apply to content that they publish. When a reader clicks on the logo they will quickly and easily be able to tell whether there have been any updates and whether this copy is being maintained by the publisher. The best way to explain it is to show some examples.
This example is an important one. This is a PDF that includes the CrossMark logo, a clickable logo. Provided the user is online, when they click on the logo it will pop up a webpage...
with the CrossMark dialogue box giving the latest status. This is an example where CrossMark is at its most useful, alerting the user to the fact that the document they have locally on their machine has updates. This is going to be the most common scenario in which CrossMark really provides the reader with a valuable service, as it’s alerting them to something that they would otherwise most probably have missed.
The second example is of a corrected article from another of our pilot publishers, the International Union of Crystallography. Here, clicking on the logo brings up the same CrossMark dialogue box...
..but with information that alerts the reader to changes. Updates are available for this document. It says that there is a correction and gives a link to the correction.
You may have noticed in that previous example that there is an additional tab appearing in the dialogue box at the top here - the record tab.
This is where you can show additional metadata about the piece of content if you choose to do so. The publisher decides what to put here and can use these fields to define publication practices. You don’t have to populate this tab at all if you prefer not to, and if you don’t supply any additional metadata the tab simply won’t show. The fields are defined and labelled by the publisher, and there can be as many or as few as you choose. This particular data is from another of our pilot participants, the International Union of Crystallography, and you can see that they are sharing some really useful information on the copyright, review process and publication history.