
Text and Data Mining with CrossRef


Text and Data Mining with CrossRef. At The British Library's "Text Mining: Opportunities and Tools" event.

Published in: Science


  1. Text and Data Mining with CrossRef. Joe Wass (CrossRef), www.crossref.org, jwass@crossref.org, @joewass. British Library, November 2014.
  2. Academic Life before Computers
  3. URLs for everyone!
  4. But linkrot! 3% of links unavailable after a year (source: https://en.wikipedia.org/wiki/Link_rot)
  5. DOI: Digital Object Identifier. http://dx.doi.org/10.5555/12345678. Persistent; unique; cross-publisher; industry standard; you can click them!
  6. [Logo] est. 2000 (footnote: other DOI Registration Agencies available)
  7. DOIs forever
  8. DOIs everywhere
  9. DOIs everywhere!
  10. DOIs everywhere!!
  11. DOIs everywhere!!!
  12. DOIs everywhere!!!!
  13. Metadata In, Metadata Out
  14. CrossRef: association of scholarly publishers; 15 years old this year; 70,416,598 DOIs. Not only links: CrossCheck (plagiarism detection), CrossMark (retraction notices), an API, and metadata (titles, tables of contents, authors, ISSN, datasets, funding information, license information, full-text links).
  15. What's this got to do with TDM? It's all about the links (and metadata). Workflow for Text and Data Mining: (1) identify corpus; (2) somehow get hold of corpus: (a) figure out the license for each document, (b) figure out where to get the document, (c) download it; (3) clever algorithms: (a) that's your problem. Repeat for very large numbers of documents.
  16. CrossRef Metadata: DOIs + license information + full-text URLs = corpus. Cross-publisher API; cross-publisher data schema.
  17. [Image-only slide, no text]
  18. api.crossref.org
  19. Demo time!
  20-28. [Demo screenshots, no slide text]
  29. More metadata: > 1,100,000 articles and counting; 11 million more coming soon; more publishers in the pipeline; listed on the slide: American Institute of Physics (AIP), American Physical Society (APS), Elsevier, HighWire Press, Institute of Physics (IoPP), Springer, Taylor & Francis, Walter de Gruyter, Wiley; 120,000 Creative Commons articles.
  30. Text and Data Mining with CrossRef. Joe Wass, www.crossref.org, jwass@crossref.org, @joewass. British Library, November 2014. Links: http://www.crossref.org, http://tdmsupport.crossref.org, http://api.crossref.org, https://github.com/CrossRef/rest-api-doc/blob/master/rest_api_tour.md
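
The "somehow get hold of corpus" steps from the workflow slide (figure out the license, figure out where the document lives) can be sketched roughly as below. This is only an illustration, not the code shown in the demo: the slides name api.crossref.org but no routes or fields, so the `/works` route, the `has-full-text` and `has-license` filters, and the `DOI`, `license` and `link` record fields are assumptions based on the public CrossRef REST API. The helper names are hypothetical, and the sample record in the usage note is made up apart from the DOI, which comes from the slides.

```python
import json
import urllib.parse
import urllib.request

# Assumed route; the slides only give the host api.crossref.org.
API_ROOT = "https://api.crossref.org/works"


def build_query_url(rows=5):
    """Build a query for works that carry both license and full-text metadata."""
    params = {
        # Assumed filter names from the CrossRef REST API documentation.
        "filter": "has-full-text:true,has-license:true",
        "rows": str(rows),
    }
    return API_ROOT + "?" + urllib.parse.urlencode(params)


def summarize_work(work):
    """Pull out the three things the workflow needs from one work record."""
    doi = work.get("DOI")
    return {
        "doi": doi,
        # Resolver prefix as shown on the DOI slide.
        "doi_url": "http://dx.doi.org/" + doi if doi else None,
        # Step (a): which license(s) apply to the document.
        "licenses": [lic["URL"] for lic in work.get("license", []) if "URL" in lic],
        # Step (b): where the full text can be fetched, with its content type.
        "full_text": [
            (link.get("content-type"), link["URL"])
            for link in work.get("link", [])
            if "URL" in link
        ],
    }


def fetch_works(rows=5):
    """Query the API over the network and summarize each returned record."""
    with urllib.request.urlopen(build_query_url(rows), timeout=30) as resp:
        payload = json.load(resp)
    return [summarize_work(item) for item in payload["message"]["items"]]


if __name__ == "__main__":
    for summary in fetch_works(rows=3):
        print(summary)
```

`summarize_work` can be tried offline on a hand-written record, for example `{"DOI": "10.5555/12345678", "license": [{"URL": "https://example.org/license"}], "link": [{"URL": "https://example.org/fulltext.pdf", "content-type": "application/pdf"}]}`. Step (c), the download itself, is left out on purpose, since whether a given full-text URL may be fetched depends on the license found in step (a).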
