Experience with MarkLogic at Elsevier

Bradley P. Allen and Darin McBeath, Elsevier Labs
Presentation at NoSQL Now 2011
San Jose, CA, USA
2011-08-25
Elsevier: who we are


 •   Elsevier, part of the Reed Elsevier group, is a world-leading publisher of
     scientific, technical and medical full-text literature. Its 7,000 employees in
     over 70 offices worldwide publish more than 2,500 journal titles and 11,000
     online books.
 •   Global community: 7,000 editors, 70,000 editorial board members,
     200,000 referees, 500,000+ authors
 •   Global audience: 15 million doctors, nurses and health professionals;
     10 million+ researchers in 4,500 institutes; 5 million students
 •   Global market: North America, Europe, Asia-Pacific




MarkLogic at Elsevier


 • MarkLogic is used pervasively throughout our
   business
   – Science and Technology
   – Health Sciences
   – Operations
 • It is also a strategic technology for our sister
   Reed Elsevier organization LexisNexis
 • We were an early adopter of MarkLogic
   – Began working with MarkLogic in 2001

Motivations for MarkLogic adoption


 • Company was committed to XML standard for
   content representation
 • Vision of building Web services on top of XML
   content repositories
 • Enabling new information solutions through
   reuse and mashup of existing journal and
   book content
 • Relational technologies not a good fit
MarkLogic applications at Elsevier
| Business | Product | Description | MarkLogic Features Used | Launched |
|---|---|---|---|---|
| Science & Technology | Scopus | The largest abstract and citation database, containing both peer-reviewed research literature and quality web sources. Contains 50+ million abstracts. Original application that used MarkLogic. | Repository, Transformation, and some extensions (such as fast/accurate counting) | 2005 |
| Science & Technology | Scopus Custom Data | Offline version of Scopus | Repository, Transformation | 2007 |
| Science & Technology | EMBASE | Biomedical database with over 24 million indexed records | Repository, Search, Transformation | 2008 |
| Science & Technology | Methods Navigator | Task-specific search for experimental methods and protocols across 40,000 articles | Repository, Content Processing Framework | 2010 |
| Science & Technology | HazMat Navigator | Chemical safety database based on Bretherick's Handbook of Reactive Chemical Hazards, others | Repository, Content Processing Framework | 2010 |
| Science & Technology | SciVal Funding | Database of current research funding opportunities and award information | Repository, Content Processing Framework | 2010 |
| Health Sciences | Books | 1000 books supporting multiple Health Sciences applications (HESI, NursingConsult, MDConsult) | Repository, ability to present content quickly/easily by chapter, section, paragraph | 2006 |
| Health Sciences | Health Connect | Health Sciences journal platform | Repository, Search, Transformation | 2007 |
| Health Sciences | Linked Data Repository | 500,000 content enhancement metadata documents. 100% XQuery application. | Repository, XPath and a handful of proprietary extensions | 2011 |
| Operations | ConSyn | Batch retrieval service for 10+ million journal articles | Search, Repository, Task Server, Zip, Security, Transformation | 2010 |
MarkLogic benefits and challenges at Elsevier


 • MarkLogic brings us two big benefits
   – Excellent fit with how we represent our content
   – Tools (XQuery, XSLT) that support working with that
     content representation
 • Those benefits come with challenges, some old,
   some new
   –   Developer productivity and adoption
   –   Standards and interoperability
   –   Software ecosystem
   –   Total solution fit
   –   TCO relative to other solutions

Developer productivity and adoption


 • XQuery can be a powerful language for rapid
   prototyping
   – Can support writing complete web applications
 • Experienced XQuery resources are difficult to
   find
   – Especially relative to emerging JSON/Web
     framework resources
 • Difficult to motivate developers committed to
   more mainstream frameworks, patterns, and
   languages

Standards and interoperability

 • Vendors view XQuery in different ways: some view it as a
   query language, some as a transformation language, some as
   a programming language, all of the above, etc.
 • These disparate views often lead to confusion in the
   community as to what XQuery really is
 • XQuery interoperability is currently difficult, and it is doubtful
   that it will ever be straightforward beyond simple applications
    – Groups such as EXPath will help tidy up some interfaces, but
      far more work needs to be done.
    – Elsevier Labs has investigated this issue in the context of the SciVal
      Showcase application using 4 different XQuery engines (MarkLogic,
      eXist, 28msec, and XQIB)
    – This experiment highlighted the differences in the implementations
      (and the looseness of the W3C recommendation)


Software ecosystem


 • The ecosystem around XQuery and
   MarkLogic is lacking
   – Few open source and/or 3rd-party modules
     or language bindings are available
 • The IDEs and debugging tools (while vastly
   improved) are still not on par with those for
   other query languages


Total solution fit


 • MarkLogic started out as an XML database
   solution
 • It has added functionality (e.g. free-text search) and
   matured over the years
    – This is a big part of its intended use at LexisNexis
 • We struggle to understand the tradeoffs
   between a single solution and a composition of
   best-of-breed solutions (e.g. MarkLogic
   standalone vs. MarkLogic integrated with Solr)

TCO relative to other solutions


 • Traditional enterprise software licensing can
   lead to significant costs
 • NoSQL document database solutions with
   business models based on open source plus
   support services are an emerging alternative
 • Still working on determining TCO tradeoff
   between the two in an enterprise context


MarkLogic in the context of NoSQL in general


 • NoSQL before it was cool
 • But there are emerging differences between
   the document stores for traditional vs.
   Internet publishing
   – XML/XQuery/XSLT vs. JSON/UnSQL/JavaScript
   – Manual scale-out vs. automated scale-out
 • Overhead of legacy standards can be a drag
   – Where is XML in its adoption lifecycle?
   – How does HTML5 fit in?
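
To make the XML/XQuery versus JSON/JavaScript contrast concrete, the sketch below reads the same article record from both serializations using only Python's standard library. The record, its field names, and the helper functions are hypothetical; the XML side uses ElementTree path lookups as a stand-in for XPath/XQuery, not MarkLogic itself.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical article record in both serializations.
ARTICLE_XML = (
    "<article><title>On Widgets</title>"
    "<author>A. Writer</author><author>B. Editor</author></article>"
)
ARTICLE_JSON = '{"title": "On Widgets", "authors": ["A. Writer", "B. Editor"]}'

def authors_from_xml(text):
    # Path expression over a parsed tree, in the spirit of XPath/XQuery.
    return [a.text for a in ET.fromstring(text).findall("author")]

def authors_from_json(text):
    # Plain key lookup on a native data structure.
    return json.loads(text)["authors"]

print(authors_from_xml(ARTICLE_XML))
print(authors_from_json(ARTICLE_JSON))
```

Both calls yield the same author list. The difference lies in the surrounding toolchain: tree queries and transformations on the XML side, native objects in the application language on the JSON side.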

Future use of MarkLogic at Elsevier

 • Persisting as foundation of content repository efforts
     – XML legacy drives continued use
 • Turnkey SaaS for publishing, newer NoSQL solutions competing for
   attention
     – Solutions that layer XML processing and query technologies on top of non-XML
       NoSQL stores are beginning to appear (e.g. Ambrosoft’s XML DB project)
 • Design choices driven by consumer Internet use cases may not yield as
   good a fit to information publishing as MarkLogic
     – Emphasis on join-free queries and use-case-driven indexing
 • We are watching to see how the emerging best practices and design patterns
   from the consumer Internet that are good fits will be supported
   moving forward
     – Auto-scaling
     – Web application frameworks
     – HTML5



Summary


• We were an early adopter of MarkLogic
• Over ten years it has become a mature
  product that we rely on extensively across our
  business
• The response of MarkLogic to the emergence
  of NoSQL document stores, non-XML
  document serializations and application
  design patterns from the consumer Internet is
  of keen interest to us


How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 

Experience with MarkLogic at Elsevier

  • 1. Experience with MarkLogic at Elsevier
      Bradley P. Allen and Darin McBeath, Elsevier Labs
      Presentation at NoSQL Now 2011
      San Jose, CA, USA
      2011-08-25
  • 2. Elsevier: who we are
      • Elsevier, part of the Reed Elsevier group, is a world-leading publisher of scientific, technical and medical full text literature. 7,000 employees in over 70 offices worldwide publish more than 2,500 journal titles and 11,000 online books.
      • Global community
        – 7,000 editors
        – 70,000 editorial board members
        – 200,000 referees
        – 500,000+ authors
      • Global audience
        – 15 million doctors, nurses and health professionals
        – 10 million+ researchers in 4,500 institutes
        – 5 million students
      • Global market
        – North America
        – Europe
        – Asia-Pacific
  • 3. MarkLogic at Elsevier
      • MarkLogic is used pervasively throughout our business
        – Science and Technology
        – Health Sciences
        – Operations
      • It is also a strategic technology for our sister Reed Elsevier organization LexisNexis
      • We were an early adopter of MarkLogic
        – Began working with MarkLogic in 2001
  • 4. Motivations for MarkLogic adoption
      • Company was committed to XML standard for content representation
      • Vision of building Web services on top of XML content repositories
      • Enabling new information solutions through reuse and mashup of existing journal and book content
      • Relational technologies not a good fit
  • 5. MarkLogic applications at Elsevier
      Business | Product | Description | MarkLogic Features Used | Launched
      Science & Technology | Scopus | The largest abstract and citation database containing both peer-reviewed research literature and quality web sources. Contains 50+ million abstracts. Original application that used MarkLogic | Repository, Transformation, and some extensions (such as fast/accurate counting) | 2005
      | Scopus Custom Data | Offline version of Scopus | Repository, Transformation | 2007
      | EMBASE | Biomedical database with over 24 million indexed records | Repository, Search, Transformation | 2008
      | Methods Navigator | Task-specific search for experimental methods and protocols across 40,000 articles | Repository, Content Processing Framework | 2010
      | HazMat Navigator | Chemical safety database based on Bretherick's Handbook of Reactive Chemical Hazards, others | Repository, Content Processing Framework | 2010
      | SciVal Funding | Database of current research funding opportunities and award information | Repository, Content Processing Framework | 2010
      Health Sciences | Books | 1000 books supporting multiple Health Sciences applications (HESI, NursingConsult, MDConsult) | Repository, ability to present content quickly/easily by chapter, section, paragraph | 2006
      | Health Connect | Health Sciences journal platform | Repository, Search, Transformation | 2007
      | Linked Data Repository | 500,000 content enhancement metadata documents. 100% XQuery application | Repository, XPath and a handful of proprietary extensions | 2011
      Operations | ConSyn | Batch retrieval service for 10+ million journal articles | Search, Repository, Task Server, Zip, Security, Transformation | 2010
  • 6. MarkLogic benefits and challenges at Elsevier
      • MarkLogic brings us two big benefits
        – Excellent fit with how we represent our content
        – Tools (XQuery, XSLT) that support working with that content representation
      • Those benefits come with challenges, some old, some new
        – Developer productivity and adoption
        – Standards and interoperability
        – Software ecosystem
        – Total solution fit
        – TCO relative to other solutions
  • 7. Developer productivity and adoption
      • XQuery can be a powerful language for rapid prototyping
        – Can support writing complete web applications
      • Experienced XQuery resources are difficult to find
        – Especially relative to emerging JSON/Web framework resources
      • Difficult to motivate developers committed to more mainstream frameworks, patterns, and languages
  • 8. Standards and interoperability
      • Vendors view XQuery in different ways: some view it as a query language, some as a transformation language, some as a programming language, some as all of the above, etc.
      • These disparate views often lead to confusion in the community as to what XQuery really is
      • XQuery interoperability is currently difficult, and it is doubtful that it will ever extend beyond simple applications
        – Groups such as eXPath will help tidy up some interfaces, but there is far more work that needs to be done
        – Elsevier Labs has investigated this issue in the context of the SciVal Showcase application using 4 different XQuery engines (MarkLogic, eXist, 28ms, and XQIB)
        – This experiment highlighted the differences in the implementations (and the looseness of the W3C recommendation)
  • 9. Software ecosystem
      • The ecosystem around XQuery and MarkLogic is lacking
        – Few open source and/or 3rd party modules or language bindings
      • The IDEs and debugging tools (while vastly improved) are still not on par with those for other query languages
  • 10. Total solution fit
      • MarkLogic started out as an XML database solution
      • It has added functionality (e.g. free text search) and matured over the years
        – This is a big part of its intended use at LexisNexis
      • We struggle to understand the tradeoffs between a single solution vs. a composition of best-of-breed solutions (e.g. MarkLogic standalone vs. MarkLogic integrated with Solr)
  • 11. TCO relative to other solutions
      • Traditional enterprise software licensing can lead to significant costs
      • NoSQL document database solutions with business models based on open source plus support services are an emerging alternative
      • Still working on determining TCO tradeoff between the two in an enterprise context
  • 12. MarkLogic in the context of NoSQL in general
      • NoSQL before it was cool
      • But there are emerging differences between the document stores for traditional vs. Internet publishing
        – XML/XQuery/XSLT vs. JSON/UnSQL/JavaScript
        – Manual scale-out vs. automated scale-out
      • Overhead of legacy standards can be a drag
        – Where is XML in its adoption lifecycle?
        – How does HTML5 fit in?
  • 13. Future use of MarkLogic at Elsevier
      • Persisting as foundation of content repository efforts
        – XML legacy drives continued use
      • Turnkey SaaS for publishing, newer NoSQL solutions competing for attention
        – Solutions that layer XML processing and query technologies on top of non-XML NoSQL stores are beginning to appear (e.g. Ambrosoft’s XML DB project)
      • Design choices driven by consumer Internet use cases may not yield as good a fit to information publishing as MarkLogic
        – Emphasis on join-free queries and use-case-driven indexing
      • We are watching to see how emerging best practices and design patterns associated with consumer Internet that are good fits are supported moving forward
        – Auto-scaling
        – Web application frameworks
        – HTML5
  • 14. Summary
      • We were an early adopter of MarkLogic
      • Over ten years it has become a mature product that we rely on extensively across our business
      • The response of MarkLogic to the emergence of NoSQL document stores, non-XML document serializations and application design patterns from the consumer Internet is of keen interest to us
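
Slide 5 mentions presenting content by chapter, section and paragraph, and slide 12 contrasts XML-based with JSON-based document stores. The following is a minimal sketch of both ideas using only the Python standard library. It is not taken from the presentation and does not use MarkLogic or XQuery. The sample book, its element names and the helper functions are all hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical book fragment; element names are illustrative only,
# not an Elsevier schema.
BOOK_XML = """
<book id="b1">
  <title>Sample Handbook</title>
  <chapter id="c1">
    <title>Introduction</title>
    <section id="s1">
      <para>First paragraph.</para>
      <para>Second paragraph.</para>
    </section>
  </chapter>
  <chapter id="c2">
    <title>Methods</title>
    <section id="s2">
      <para>Third paragraph.</para>
    </section>
  </chapter>
</book>
"""


def paragraphs_in_chapter(root, chapter_id):
    """Return the paragraph texts of one chapter via ElementTree's XPath subset."""
    path = f"./chapter[@id='{chapter_id}']/section/para"
    return [p.text for p in root.findall(path)]


def to_json_document(root):
    """Convert the same book into nested dicts, the shape a JSON store would hold."""
    return {
        "id": root.get("id"),
        "title": root.findtext("title"),
        "chapters": [
            {
                "id": ch.get("id"),
                "title": ch.findtext("title"),
                "sections": [
                    {
                        "id": sec.get("id"),
                        "paras": [p.text for p in sec.findall("para")],
                    }
                    for sec in ch.findall("section")
                ],
            }
            for ch in root.findall("chapter")
        ],
    }


root = ET.fromstring(BOOK_XML)

# XML side: a single path expression addresses paragraphs inside a chapter.
xml_paras = paragraphs_in_chapter(root, "c1")

# JSON side: the same lookup is written as explicit traversal of nested lists.
doc = to_json_document(root)
json_paras = [
    para
    for ch in doc["chapters"]
    if ch["id"] == "c1"
    for sec in ch["sections"]
    for para in sec["paras"]
]

print(xml_paras)
print(json.dumps(doc, indent=2))
```

Both representations hold the same content. The sketch only illustrates the difference in how a fragment is addressed: a declarative path on the XML side, application-level traversal on the JSON side.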