CrossRef 2010 Annual Member Meeting - London
CrossRef Annual Meeting – London
Workshops
15 November 2010
Workshops Agenda
9:30-10:00 Coffee & Tea
10:00-11:30 System Update: Andrew Gilmartin, Senior Software Developer; Chuck Koscher, Director of Technology
11:30-12:00 CrossMark: Geoff Bilder, Director of Strategic Initiatives
12:00-12:30 CrossCheck: Kirsty Meddings, Product Manager
12:30-1:15 Lunch
1:15-2:15 Metadata Quality: Patricia Feeney, Product Support Manager
2:15-2:45 Cited-by Linking: Carol Anne Meyer, Business Development and Marketing Manager; Chuck Koscher
2:45-3:00 Break
3:00-4:00 DOI Workflow Issues, Working with Vendors: Carol Anne Meyer
4:00-4:45 Boot Camp: Carol Anne Meyer; Tim Pickard, System Support Analyst/Administrator
4:45-5:15 Books: Carol Anne Meyer
System Update
System status
Rewrite review
Rewrite implementation
Discussion
System status
[Figures. Labels on the last figure: Old system; New Q system; The switch]
• Deposit processing
  • Suspended for 2+ weekends for the Oracle DB upgrade (to 11g)
  • Processing times remain the same (50% under 5 minutes, a further 30% under 1 hour)
  • Large re-deposits (Elsevier plans for 2011)
  • Schema relatively unchanged in 2+ years (we keep adding MIME types)
• Deposit focus areas for 2011 (other than the rewrite)
  • Investigating a PDF upload option (for depositing a DOI and the article’s references)
  • Modify WebDeposit to allow users to edit an existing DOI’s metadata
  • Maintenance on the NLM DTD deposit tool
System rewrite
• The Query System (QS): where are we?
  • It's taking longer than we thought.
  • QS is 99% ready and has been periodically in service since mid-September.
  • Last vexing problem solved (database connection deadlock)?
  • Performance improvement is very encouraging.
  • Metrics and measurement capability greatly improved.
• The Deposit System (DS): where are we?
  • Initial design discussions have been held; documentation is under way.
  • Implementation to start in January.
  • Development will take until mid-year, followed by extensive testing.
  • Data cleanup will be part of the migration process (mainly titles).
System rewrite
Rewrite 2 Working Group – Final report, November 2008
⋅ Modularity of design
⋅ Utility of APIs where possible
⋅ Data stores that enable XML capabilities
⋅ Minimize dependency on proprietary systems
• That CrossRef should ultimately own the intellectual property in the software at the heart of its operations
• That CrossRef should not risk or jeopardize the reliability and throughput offered by the existing system
• That CrossRef should remain free to develop further applications for other purposes which need to interface to the reference-linking systems and/or its data
System rewrite
Rewrite 2 Working Group – Final report, November 2008
Now / Soon / Later
A Solve NFS issue
A Federate architecture
A Database redesign
A Redesign event notification model (replace email)
O Improved title management and control
O Better publisher/member management model
O Daily testing/monitoring (data integrity)
O Built-in health and status monitoring
O Performance improvements and queue management
O Unit testing (regression testing)
O Scriptable data ingestion workflow
F Richer metadata querying capability
F Integrated data harvesting capabilities
F Dealing with references using other character sets
F Crawling of content to ingest it vs. making deposits
F Depositing of non-journal content
F Matching unstructured references using full text of equiv
F Querying of non-journal content
F Real-time cited-by queries, with data-driven APIs
F More content types, including language variants
F More granular typing of journal articles
F Improved reporting facilities
F More useful user interface for members
System rewrite
• Technical Objectives
  • Rework a 9-year-old system
  • Address declining performance
  • Improve administrative aspects (better control and reporting)
  • Facilitate extensibility
  • Give staff the operational insight to respond better
• Business Objectives
  • Develop internal capabilities ($ for every change Atypon makes)
  • Secure an independent path (continuity)
  • Benefit of being on a ‘shared’ platform is nearing zero
  • Maintain access to technical expertise
System rewrite
Late 2010 through mid 2011
[Architecture diagram. Components: HTTP Traffic; HAProxy; FrontEnd QS (Spring, Tomcat); Lucene; MySQL; Berkeley DB; BackEnd Services; ActiveMQ (messaging); Deposit System (old Atypon EDS); Oracle (prime); Oracle (active-standby), with constant replication; External messaging (email, etc.). Group labels: New System; Oracle Group.]
System rewrite
Q3 2011
[Architecture diagram. Components: HTTP Traffic; HAProxy; FrontEnd QS (Spring, Tomcat); FrontEnd DS (Spring, Tomcat), offering File Upload and Deposit reports; Deposit Processing; Lucene; MySQL; Berkeley DB; BackEnd Services; ActiveMQ (messaging); Oracle (prime); Oracle (active-standby), with constant replication; External messaging (email, etc.). Group labels: New System; Oracle Group.]
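The difference between the two architecture slides can be read off by comparing their component labels. The sketch below does that with plain sets. It is only an illustration: treating each diagram as a flat set of labels is an assumption, the diagrams' connections are not modelled, and the names `LATE_2010`, `Q3_2011` and `phase_diff` are hypothetical.

```python
# Component labels copied from the two architecture slides.
# Assumption: each slide is treated as a flat set of labels; the
# arrows and groupings drawn on the diagrams are not represented.
LATE_2010 = frozenset({
    "HAProxy", "FrontEnd QS", "Lucene", "MySQL", "Berkeley DB",
    "BackEnd Services", "ActiveMQ", "Oracle (prime)",
    "Oracle (active-standby)", "External messaging",
    "Deposit System (old Atypon EDS)",
})

Q3_2011 = frozenset({
    "HAProxy", "FrontEnd QS", "Lucene", "MySQL", "Berkeley DB",
    "BackEnd Services", "ActiveMQ", "Oracle (prime)",
    "Oracle (active-standby)", "External messaging",
    "FrontEnd DS", "Deposit Processing",
})


def phase_diff(before, after):
    """Return (retired, introduced) component labels between two phases."""
    return before - after, after - before


retired, introduced = phase_diff(LATE_2010, Q3_2011)
print("retired:", sorted(retired))
print("introduced:", sorted(introduced))
```

Read this way, the only label that disappears by Q3 2011 is the old Atypon deposit system, and the only new ones are the deposit front end and deposit processing.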
System rewrite
[Database layout diagram. Labels: Primary Datacenter; Recovery Datacenter; New Deposit System; Database Updater; Deposit DB (prime); Deposit DB (standby); Query DB (prime); Query DB (secondary); Oracle Replication; Oracle Group.]
System rewrite
• Query system feature changes
  • Tweaks to the matching logic (discoveries made while porting the code)
  • Fixed some nagging characteristics
  • Aggregate email notices for alerts
  • Implement HTTP free-text matching (still needs work, ‘alpha’)
  • Process free-text references for cited-by (done, stable, uses refXpress)
  • Establish a better user model:
    1. Usernames and passwords for members (query and deposit)
    2. Registered email address for non-members (query only)
[Diagram: Use Registration Form; Receive Email; Use Validation Form]
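The three diagram steps (registration form, email, validation form) describe a common email-verification pattern. The sketch below shows one way such a flow could work, using a keyed hash of the address as the emailed token. This is an assumption for illustration only: the slides do not say how CrossRef generates or checks its validation codes, and `SECRET_KEY`, `make_token` and `validate` are hypothetical names.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real service would load it from
# configuration rather than generate it at start-up.
SECRET_KEY = secrets.token_bytes(32)


def make_token(email: str) -> str:
    """Token that would be emailed to the address entered in the registration form."""
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(SECRET_KEY, normalized, hashlib.sha256).hexdigest()


def validate(email: str, token: str) -> bool:
    """Check the token typed into the validation form against the address."""
    return hmac.compare_digest(make_token(email), token)


token = make_token("reader@example.org")
print(validate("reader@example.org", token))   # True
print(validate("someone@example.org", token))  # False
```

Deriving the token from the address means no pending-registration table is needed, at the cost of tokens that never expire unless the key is rotated.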
System rewrite
Simple Text Query
• Uses refXpress to break free text into XML suitable for running a metadata query
• Uses QS Formatted Citation Parse to break free text into XML suitable for running a metadata query. If that fails, uses QS Formatted Citation Search (with a high threshold) to search the Lucene index for a DOI.
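The parse-then-search fallback can be sketched as a small function. This is a minimal illustration, not the QS code: the callables `parse`, `metadata_query` and `index_search`, the 0.9 default threshold, and the DOIs in the toy stand-ins are all made up, and treating "fails" as "either the parse or the metadata query returns nothing" is an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple

Fields = Dict[str, str]


def resolve_citation(
    text: str,
    parse: Callable[[str], Optional[Fields]],
    metadata_query: Callable[[Fields], Optional[str]],
    index_search: Callable[[str], List[Tuple[str, float]]],
    threshold: float = 0.9,  # hypothetical value for the "high threshold"
) -> Optional[str]:
    """Try a structured parse first; fall back to a scored index search."""
    fields = parse(text)
    if fields is not None:
        doi = metadata_query(fields)
        if doi is not None:
            return doi
    hits = index_search(text)
    if not hits:
        return None
    doi, score = max(hits, key=lambda hit: hit[1])
    # Only accept the best hit when its score clears the threshold.
    return doi if score >= threshold else None


# Toy stand-ins so the sketch runs; none of them reflect the real QS.
def toy_parse(text: str) -> Optional[Fields]:
    return None  # pretend the parser could not structure the text


def toy_query(fields: Fields) -> Optional[str]:
    return None


def toy_search(text: str) -> List[Tuple[str, float]]:
    return [("10.5555/example-a", 0.95), ("10.5555/example-b", 0.40)]


print(resolve_citation("some free-text reference", toy_parse, toy_query, toy_search))
# prints 10.5555/example-a
```

A high threshold trades recall for precision: a weak best hit returns nothing rather than a possibly wrong DOI.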
But be careful!
<citation key="b53_366">
  <unstructured_citation>
    53. O.S. Gudmundsson, S.D.S. Jois, D.G. Vander Velde, T.J. Siahaan, B. Wang, and R.T.
    Borchardt (1999 ) The effect of conformation on the membrane permeability of coumarinic
    acid- and phenylpropionic acid-based cyclic prodrugs of opioid peptides.J. Pept. Res.53 , 383 -392 .
  </unstructured_citation>
</citation>

<doi type="journal_article">10.1034/j.1399-3011.1999.00076.x</doi>
<issn type="print">1397-002X</issn>
<issn type="electronic">1399-3011</issn>
<journal_title>Journal of Peptide Research</journal_title>
<contributors>
  <contributor sequence="first" contributor_role="author">
    <given_name>O.S.</given_name>
    <surname>Gudmundsson</surname>
  </contributor>
</contributors>
<volume>53</volume>
<issue>4</issue>
<first_page>383</first_page>
<last_page>392</last_page>
<year media_type="print">1999</year>
<publication_type>full_text</publication_type>
<article_title>
  The effect of conformation on the membrane permeation of
  coumarinic acid- and phenylpropionic acid-based cyclic
  prodrugs of opioid peptides
</article_title>

<doi type="journal_article">10.1034/j.1399-3011.1999.00077.x</doi>
<issn type="print">1397-002X</issn>
<issn type="electronic">1399-3011</issn>
<journal_title>Journal of Peptide Research</journal_title>
<contributors>
  <contributor sequence="first" contributor_role="author">
    <given_name>O.S.</given_name>
    <surname>Gudmundsson</surname>
  </contributor>
</contributors>
<volume>53</volume>
<issue>4</issue>
<first_page>403</first_page>
<last_page>413</last_page>
<year media_type="print">1999</year>
<publication_type>full_text</publication_type>
<article_title>
  The effect of conformation of the acyloxyalkoxy-based cyclic
  prodrugs of opioid peptides on their membrane permeability
</article_title>

Still yields this
But the correct answer is this
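The two records share journal, volume, issue, year and first author, and their titles overlap heavily, so text similarity alone may not separate them cleanly. The citation's page range (383-392) agrees with only the first record. The sketch below illustrates both observations with the standard library. It is not the QS matching logic; `title_similarity` and `pages_match` are hypothetical helpers.

```python
import re
from difflib import SequenceMatcher

CITATION = (
    "53. O.S. Gudmundsson, S.D.S. Jois, D.G. Vander Velde, T.J. Siahaan, "
    "B. Wang, and R.T. Borchardt (1999 ) The effect of conformation on the "
    "membrane permeability of coumarinic acid- and phenylpropionic acid-based "
    "cyclic prodrugs of opioid peptides.J. Pept. Res.53 , 383 -392 ."
)

CANDIDATES = [
    {
        "doi": "10.1034/j.1399-3011.1999.00076.x",
        "title": "The effect of conformation on the membrane permeation of "
                 "coumarinic acid- and phenylpropionic acid-based cyclic "
                 "prodrugs of opioid peptides",
        "first_page": "383",
        "last_page": "392",
    },
    {
        "doi": "10.1034/j.1399-3011.1999.00077.x",
        "title": "The effect of conformation of the acyloxyalkoxy-based cyclic "
                 "prodrugs of opioid peptides on their membrane permeability",
        "first_page": "403",
        "last_page": "413",
    },
]


def title_similarity(citation: str, title: str) -> float:
    """Rough character-level similarity between citation text and a title."""
    return SequenceMatcher(None, citation.lower(), title.lower()).ratio()


def pages_match(citation: str, candidate: dict) -> bool:
    """True when both page numbers of the candidate occur in the citation."""
    numbers = set(re.findall(r"\d+", citation))
    return candidate["first_page"] in numbers and candidate["last_page"] in numbers


for cand in CANDIDATES:
    score = title_similarity(CITATION, cand["title"])
    print(cand["doi"], round(score, 3), pages_match(CITATION, cand))
```

Because the similarity scores depend on the whole citation string, they are printed rather than asserted. The page check is the discriminating signal in this particular example.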
System rewrite
• Deposit system feature changes
  • Parse the XML prior to accepting the upload
  • Process XML and register DOIs regardless of metadata ingestion problems
  • Provide aggregated deposit reports (daily?)
  • Integrate Schematron checks into the deposit process
  • Robust title ownership model, not based on prefix, with shared ownership options
  • Separate deposit metadata organization from query metadata organization (e.g., allow title substitution)
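The first item, parsing the XML before accepting the upload, amounts to rejecting files that are not well-formed at upload time rather than later in processing. A minimal sketch with Python's standard library follows. It is an illustration only: `check_upload` is a hypothetical name, the sample payloads are made up, and the check covers well-formedness only, not schema or Schematron validation.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def check_upload(payload: bytes) -> Tuple[bool, Optional[str]]:
    """Return (accepted, error message). Only well-formedness is checked."""
    try:
        ET.fromstring(payload)
    except ET.ParseError as exc:
        # Reject immediately so the depositor sees the problem at upload time.
        return False, str(exc)
    return True, None


# Made-up payloads for illustration.
good = b"<doi_batch><head/><body/></doi_batch>"
bad = b"<doi_batch><head></doi_batch>"  # <head> is never closed

print(check_upload(good))     # (True, None)
print(check_upload(bad)[0])   # False
```

Returning the parser's message alongside the verdict lets the upload response tell the depositor where the file broke.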
Andrew
