Shenghui Wang
Rob Koopman
Exploring a world of
networked information
built from free-text
metadata
OCLC Research EMEA
ELAG2015
What would you do if you were
interested in a topic?
Difficult to answer these questions:
• What are the different aspects of this topic?
• Are there related aspects missing in my search terms?
• Who are the most prominent authors about this topic?
• Which journals publish most about this topic?
• How have others — e.g. librarians — described and classified
this topic?
Demo
• http://thoth.pica.nl/relate?input=opac
How do we do this?
• OFFLINE: generate a semantic representation
for each entity
• ONLINE: find the most related entities and
display them using multidimensional scaling
Build semantic representation
• Basic assumptions
– An entity can be represented by its context
– Entities which share more context are more likely
to be related
• Context is the textual environment where an
entity occurs
• The effects of state prekindergarten programs on young
children’s school readiness in five states
• [author:jung kwanghee]
• [subject:readiness for school]
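A minimal sketch of how such context could be collected, assuming each article record has already been split into topical terms and bracketed entities. The function name `build_cooccurrence`, the tokenisation, and the second toy record are hypothetical and only illustrate the idea; the slides do not describe the actual pipeline.

```python
from collections import Counter, defaultdict

def build_cooccurrence(records):
    """Map each entity to a Counter of the topical terms it co-occurs with.

    Each record is a (terms, entities) pair: the topical terms of one
    article and the entities (authors, subjects, ...) attached to it.
    """
    context = defaultdict(Counter)
    for terms, entities in records:
        for entity in entities:
            context[entity].update(terms)
    return context

# Toy records: the first mirrors the example slide, the second is invented.
records = [
    (["prekindergarten", "school", "readiness"],
     ["author:jung kwanghee", "subject:readiness for school"]),
    (["school", "readiness", "assessment"],
     ["subject:readiness for school"]),
]

context = build_cooccurrence(records)
print(context["subject:readiness for school"])
```

Two entities that share many terms in their counters share context, which is the basis for treating them as related.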
Dataset
● ArticleFirst, 65 million articles
● Selected 4 million entities (topical terms,
authors, ISSNs, Dewey decimal codes)
● Represented by 1 million topical terms
But a matrix of 4M x 1M is too big to process
Dimension reduction based on Random Projection
C: a co-occurrence matrix
R: a random matrix of +/-1
C’: approximation of C
after random projection
-- Semantic matrix
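The projection can be sketched with NumPy on toy sizes. This assumes the reduced matrix is the product of C and R; the target dimension `k`, the Poisson toy data, and the 1/sqrt(k) scaling are assumptions for illustration, not values taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes only; the real matrix in the slides is 4M x 1M.
n_entities, n_terms, k = 50, 2000, 256

# C: toy co-occurrence counts (entities x terms), mostly zeros.
C = rng.poisson(0.05, size=(n_entities, n_terms)).astype(float)

# R: random matrix of +/-1 (terms x k).
R = rng.choice([-1.0, 1.0], size=(n_terms, k))

# C': reduced matrix (entities x k). The 1/sqrt(k) factor is an assumed
# scaling that keeps dot products comparable to those in C on average.
C_reduced = C @ R / np.sqrt(k)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(C_reduced.shape)
print(cosine(C[0], C[1]), cosine(C_reduced[0], C_reduced[1]))
```

The two printed cosines should be close but not identical: the projection trades a small error in similarity for a much smaller matrix.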
Online interface
• Find mutual nearest neighbors
• Use multidimensional scaling to display
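The display step can be illustrated with classical (Torgerson) multidimensional scaling. The slides do not say which MDS variant is used, so this choice and the helper name `classical_mds` are assumptions; the sketch only shows how pairwise distances become 2-D coordinates.

```python
import numpy as np

def classical_mds(D, dims=2):
    """Embed points in `dims` dimensions from a symmetric distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)         # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:dims]       # indices of the largest ones
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

# Toy input: the four corners of a unit square.
points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords = classical_mds(D)
D_embedded = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(np.allclose(D, D_embedded))
```

The recovered coordinates may be rotated or reflected relative to the input, but the pairwise distances are preserved, which is what matters for a visual layout.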
Nearest neighbors
Mutual nearest neighbors
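The difference between the two notions can be sketched as follows. The definition used here is an assumption: entity j is a mutual nearest neighbor of a query if j is among the query's top k by cosine similarity and the query is also among j's top k. The function names and toy vectors are hypothetical.

```python
import numpy as np

def top_k_neighbors(vectors, k):
    """For each row, return the set of indices of its k most cosine-similar rows."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)               # never return the row itself
    order = np.argsort(-sim, axis=1)[:, :k]
    return [set(row.tolist()) for row in order]

def mutual_neighbors(vectors, query, k):
    """Neighbors j of `query` for which `query` is also among j's top k."""
    nn = top_k_neighbors(vectors, k)
    return sorted(j for j in nn[query] if query in nn[j])

vectors = np.array([
    [1.0, 0.0],   # 0
    [0.9, 0.1],   # 1: very close to 0
    [0.0, 1.0],   # 2
    [0.1, 0.9],   # 3: very close to 2
    [0.6, 0.4],   # 4: lies between the two groups
])

print(top_k_neighbors(vectors, 1)[4])   # plain nearest neighbor of entity 4
print(mutual_neighbors(vectors, 0, 1))  # mutual nearest neighbors of entity 0
print(mutual_neighbors(vectors, 4, 1))  # mutual nearest neighbors of entity 4
```

Entity 4 has a nearest neighbor (entity 1), but entity 1 is closer to entity 0, so the relation is not mutual. Filtering this way drops one-sided links.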
Possible applications
• Explorative interface
• Context-based search:
– brain
• Journal finder
– Arctic ice journals
– http://brain.oxfordjournals.org/
• Author name disambiguation
– pre kindergarten
Context matters!
• What does “young” mean in
- ArticleFirst
- WorldCat
- Astrophysics
- Art
Ariadne
(demo) http://thoth.pica.nl/relate
• An extremely fast way of navigating large-scale
heterogeneous entities
• Generalisable to different datasets
– Full WorldCat
– Small but highly curated astrophysics dataset
• Supports explorative information retrieval and
entity disambiguation
References
• Koopman, Rob, and Shenghui Wang. 2014. “Where Should I Publish? Detecting
Journal Similarity Based on What Has Been Published There.” In Proceedings of
Digital Libraries 2014, 483–484. London, United Kingdom. Association for
Computing Machinery.
• Koopman, Rob, Shenghui Wang, Andrea Scharnhorst, and Gwenn Englebienne.
2015. “Ariadne’s Thread — Interactive Navigation in a World of Networked
Information”. In CHI '15 Extended Abstracts on Human Factors in Computing
Systems. ACM, Seoul, South Korea.
• Koopman, Rob, Shenghui Wang and Andrea Scharnhorst. 2015. “Contextualization
of topics - browsing through terms, authors, journals and cluster allocations”. In
Proceedings of 15th International Conference on Scientometrics & Informetrics.
Istanbul, Turkey.
Explore. Share. Magnify.
Thank you
Shenghui Wang
Rob Koopman
OCLC Research EMEA
shenghui.wang@oclc.org
rob.koopman@oclc.org

More Related Content

What's hot

OA in the Library Collection: The Challenge of Identifying and Managing Open ...
OA in the Library Collection: The Challenge of Identifying and Managing Open ...OA in the Library Collection: The Challenge of Identifying and Managing Open ...
OA in the Library Collection: The Challenge of Identifying and Managing Open ...NASIG
 
OCLC and the Social Web: Building tools, providing platforms, engaging the co...
OCLC and the Social Web:Building tools, providing platforms, engaging the co...OCLC and the Social Web:Building tools, providing platforms, engaging the co...
OCLC and the Social Web: Building tools, providing platforms, engaging the co...Andy Havens
 
Working collaboratively: scaling infrastructure, services, learning and innov...
Working collaboratively: scaling infrastructure, services, learning and innov...Working collaboratively: scaling infrastructure, services, learning and innov...
Working collaboratively: scaling infrastructure, services, learning and innov...lisld
 
OCLC Research Update at ALA Chicago. June 26, 2017.
OCLC Research Update at ALA Chicago. June 26, 2017.OCLC Research Update at ALA Chicago. June 26, 2017.
OCLC Research Update at ALA Chicago. June 26, 2017.OCLC
 
Collection Directions - Research collections in the network environment
Collection Directions - Research collections in the network environmentCollection Directions - Research collections in the network environment
Collection Directions - Research collections in the network environmentConstance Malpas
 
The library in the life of the user
The library in the life of the userThe library in the life of the user
The library in the life of the userlisld
 
Virtual Research Networks : Towards Research 2.0
Virtual Research Networks : Towards Research 2.0Virtual Research Networks : Towards Research 2.0
Virtual Research Networks : Towards Research 2.0Guus van den Brekel
 
Thinking about technology .... differently
Thinking about technology .... differentlyThinking about technology .... differently
Thinking about technology .... differentlylisld
 
Library futures: converging and diverging directions for public and academic ...
Library futures: converging and diverging directions for public and academic ...Library futures: converging and diverging directions for public and academic ...
Library futures: converging and diverging directions for public and academic ...lisld
 
Collections unbound: collection directions and the RLUK collective collection
Collections unbound: collection directions and the RLUK collective collectionCollections unbound: collection directions and the RLUK collective collection
Collections unbound: collection directions and the RLUK collective collectionlisld
 
IASSIT Kansa Presentation
IASSIT Kansa PresentationIASSIT Kansa Presentation
IASSIT Kansa Presentationekansa
 
Collection Directions: Some Reflections on Libraries and Stewardship of the ...
 Collection Directions: Some Reflections on Libraries and Stewardship of the ... Collection Directions: Some Reflections on Libraries and Stewardship of the ...
Collection Directions: Some Reflections on Libraries and Stewardship of the ...OCLC
 
Linked Data Implementations—Who, What and Why?
Linked Data Implementations—Who, What and Why?Linked Data Implementations—Who, What and Why?
Linked Data Implementations—Who, What and Why?OCLC
 
The OCLC Research Library Partnership
The OCLC Research Library PartnershipThe OCLC Research Library Partnership
The OCLC Research Library PartnershipOCLC
 
The SHARES Partnership, Plus Tracking Trends in ILL Cost and Transaction Data
The SHARES Partnership, Plus Tracking Trends in ILL Cost and Transaction DataThe SHARES Partnership, Plus Tracking Trends in ILL Cost and Transaction Data
The SHARES Partnership, Plus Tracking Trends in ILL Cost and Transaction DataOCLC
 
The facilitated collection: collections and collecting in a network environment
The facilitated collection: collections and collecting in a network environmentThe facilitated collection: collections and collecting in a network environment
The facilitated collection: collections and collecting in a network environmentlisld
 
From local infrastructure to engagement - thinking about the library in the l...
From local infrastructure to engagement - thinking about the library in the l...From local infrastructure to engagement - thinking about the library in the l...
From local infrastructure to engagement - thinking about the library in the l...lisld
 
Libraries: technology as artifact and technology in practice
Libraries: technology as artifact and technology in practiceLibraries: technology as artifact and technology in practice
Libraries: technology as artifact and technology in practicelisld
 

What's hot (20)

OA in the Library Collection: The Challenge of Identifying and Managing Open ...
OA in the Library Collection: The Challenge of Identifying and Managing Open ...OA in the Library Collection: The Challenge of Identifying and Managing Open ...
OA in the Library Collection: The Challenge of Identifying and Managing Open ...
 
OCLC and the Social Web: Building tools, providing platforms, engaging the co...
OCLC and the Social Web:Building tools, providing platforms, engaging the co...OCLC and the Social Web:Building tools, providing platforms, engaging the co...
OCLC and the Social Web: Building tools, providing platforms, engaging the co...
 
Redefining the Academic Library
Redefining the Academic LibraryRedefining the Academic Library
Redefining the Academic Library
 
Working collaboratively: scaling infrastructure, services, learning and innov...
Working collaboratively: scaling infrastructure, services, learning and innov...Working collaboratively: scaling infrastructure, services, learning and innov...
Working collaboratively: scaling infrastructure, services, learning and innov...
 
OCLC Research Update at ALA Chicago. June 26, 2017.
OCLC Research Update at ALA Chicago. June 26, 2017.OCLC Research Update at ALA Chicago. June 26, 2017.
OCLC Research Update at ALA Chicago. June 26, 2017.
 
Collection Directions - Research collections in the network environment
Collection Directions - Research collections in the network environmentCollection Directions - Research collections in the network environment
Collection Directions - Research collections in the network environment
 
The library in the life of the user
The library in the life of the userThe library in the life of the user
The library in the life of the user
 
Virtual Research Networks : Towards Research 2.0
Virtual Research Networks : Towards Research 2.0Virtual Research Networks : Towards Research 2.0
Virtual Research Networks : Towards Research 2.0
 
Thinking about technology .... differently
Thinking about technology .... differentlyThinking about technology .... differently
Thinking about technology .... differently
 
Library futures: converging and diverging directions for public and academic ...
Library futures: converging and diverging directions for public and academic ...Library futures: converging and diverging directions for public and academic ...
Library futures: converging and diverging directions for public and academic ...
 
Collections unbound: collection directions and the RLUK collective collection
Collections unbound: collection directions and the RLUK collective collectionCollections unbound: collection directions and the RLUK collective collection
Collections unbound: collection directions and the RLUK collective collection
 
IASSIT Kansa Presentation
IASSIT Kansa PresentationIASSIT Kansa Presentation
IASSIT Kansa Presentation
 
Supporting Open Access Publishing via Open Journal Systems – One Library’s ex...
Supporting Open Access Publishing via Open Journal Systems – One Library’s ex...Supporting Open Access Publishing via Open Journal Systems – One Library’s ex...
Supporting Open Access Publishing via Open Journal Systems – One Library’s ex...
 
Collection Directions: Some Reflections on Libraries and Stewardship of the ...
 Collection Directions: Some Reflections on Libraries and Stewardship of the ... Collection Directions: Some Reflections on Libraries and Stewardship of the ...
Collection Directions: Some Reflections on Libraries and Stewardship of the ...
 
Linked Data Implementations—Who, What and Why?
Linked Data Implementations—Who, What and Why?Linked Data Implementations—Who, What and Why?
Linked Data Implementations—Who, What and Why?
 
The OCLC Research Library Partnership
The OCLC Research Library PartnershipThe OCLC Research Library Partnership
The OCLC Research Library Partnership
 
The SHARES Partnership, Plus Tracking Trends in ILL Cost and Transaction Data
The SHARES Partnership, Plus Tracking Trends in ILL Cost and Transaction DataThe SHARES Partnership, Plus Tracking Trends in ILL Cost and Transaction Data
The SHARES Partnership, Plus Tracking Trends in ILL Cost and Transaction Data
 
The facilitated collection: collections and collecting in a network environment
The facilitated collection: collections and collecting in a network environmentThe facilitated collection: collections and collecting in a network environment
The facilitated collection: collections and collecting in a network environment
 
From local infrastructure to engagement - thinking about the library in the l...
From local infrastructure to engagement - thinking about the library in the l...From local infrastructure to engagement - thinking about the library in the l...
From local infrastructure to engagement - thinking about the library in the l...
 
Libraries: technology as artifact and technology in practice
Libraries: technology as artifact and technology in practiceLibraries: technology as artifact and technology in practice
Libraries: technology as artifact and technology in practice
 

Similar to Exploring a world of networked information built from free-text metadata

Forty Years of the OTA
Forty Years of the OTAForty Years of the OTA
Forty Years of the OTAMartin Wynne
 
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & MuseumsALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & MuseumsJon Voss
 
Searching of Web and Electronic Resources
Searching of Web and Electronic Resources Searching of Web and Electronic Resources
Searching of Web and Electronic Resources Bramesha B
 
LSC Glasgow 061609
LSC Glasgow 061609LSC Glasgow 061609
LSC Glasgow 061609John MacColl
 
Gujranwala medical collge digital library access
Gujranwala medical collge digital library accessGujranwala medical collge digital library access
Gujranwala medical collge digital library accessAsif Iqbal
 
Ontologies for multimedia: the Semantic Culture Web
Ontologies for multimedia: the Semantic Culture WebOntologies for multimedia: the Semantic Culture Web
Ontologies for multimedia: the Semantic Culture WebGuus Schreiber
 
Webscale Discovery and Information Literacy
Webscale Discovery and Information LiteracyWebscale Discovery and Information Literacy
Webscale Discovery and Information LiteracyCharleston Conference
 
Webscale discovery and information literacy
Webscale discovery and information literacyWebscale discovery and information literacy
Webscale discovery and information literacyli1smc
 
FORCE11: Future of Research Communications and e-Scholarship
FORCE11:  Future of Research Communications and e-ScholarshipFORCE11:  Future of Research Communications and e-Scholarship
FORCE11: Future of Research Communications and e-ScholarshipMaryann Martone
 
Bridging the Gap: Encouraging Engagement with Library Services and Technologies
Bridging the Gap: Encouraging Engagement with Library Services and TechnologiesBridging the Gap: Encouraging Engagement with Library Services and Technologies
Bridging the Gap: Encouraging Engagement with Library Services and TechnologiesTed Lin (林泰宏)
 
Printing chocolate bars
Printing chocolate barsPrinting chocolate bars
Printing chocolate barshebertm3308
 
Who, What, Where,Why and How
Who, What, Where,Why and HowWho, What, Where,Why and How
Who, What, Where,Why and HowRachel Frick
 
Innovative Librarianship - Lib 3.0: The need, opportunity and trends
Innovative Librarianship - Lib 3.0: The need, opportunity and trendsInnovative Librarianship - Lib 3.0: The need, opportunity and trends
Innovative Librarianship - Lib 3.0: The need, opportunity and trendsAnil67
 
UVA MDST 3703 Thematic Research Collections 2012-09-18
UVA MDST 3703 Thematic Research Collections 2012-09-18UVA MDST 3703 Thematic Research Collections 2012-09-18
UVA MDST 3703 Thematic Research Collections 2012-09-18Rafael Alvarado
 

Similar to Exploring a world of networked information built from free-text metadata (20)

Forty Years of the OTA
Forty Years of the OTAForty Years of the OTA
Forty Years of the OTA
 
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & MuseumsALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
ALIAOnline Practical Linked (Open) Data for Libraries, Archives & Museums
 
Searching of Web and Electronic Resources
Searching of Web and Electronic Resources Searching of Web and Electronic Resources
Searching of Web and Electronic Resources
 
LSC Glasgow 061609
LSC Glasgow 061609LSC Glasgow 061609
LSC Glasgow 061609
 
Ecdl2004
Ecdl2004Ecdl2004
Ecdl2004
 
Gujranwala medical collge digital library access
Gujranwala medical collge digital library accessGujranwala medical collge digital library access
Gujranwala medical collge digital library access
 
Ontologies for multimedia: the Semantic Culture Web
Ontologies for multimedia: the Semantic Culture WebOntologies for multimedia: the Semantic Culture Web
Ontologies for multimedia: the Semantic Culture Web
 
Webscale Discovery and Information Literacy
Webscale Discovery and Information LiteracyWebscale Discovery and Information Literacy
Webscale Discovery and Information Literacy
 
Webscale discovery and information literacy
Webscale discovery and information literacyWebscale discovery and information literacy
Webscale discovery and information literacy
 
Alpsp final martone
Alpsp final martoneAlpsp final martone
Alpsp final martone
 
FORCE11: Future of Research Communications and e-Scholarship
FORCE11:  Future of Research Communications and e-ScholarshipFORCE11:  Future of Research Communications and e-Scholarship
FORCE11: Future of Research Communications and e-Scholarship
 
Bridging the Gap: Encouraging Engagement with Library Services and Technologies
Bridging the Gap: Encouraging Engagement with Library Services and TechnologiesBridging the Gap: Encouraging Engagement with Library Services and Technologies
Bridging the Gap: Encouraging Engagement with Library Services and Technologies
 
2014_WWW_BTOR
2014_WWW_BTOR2014_WWW_BTOR
2014_WWW_BTOR
 
Printing chocolate bars
Printing chocolate barsPrinting chocolate bars
Printing chocolate bars
 
Ngsp
NgspNgsp
Ngsp
 
Ir1
Ir1Ir1
Ir1
 
Who, What, Where,Why and How
Who, What, Where,Why and HowWho, What, Where,Why and How
Who, What, Where,Why and How
 
Open data and linked data
Open data and linked dataOpen data and linked data
Open data and linked data
 
Innovative Librarianship - Lib 3.0: The need, opportunity and trends
Innovative Librarianship - Lib 3.0: The need, opportunity and trendsInnovative Librarianship - Lib 3.0: The need, opportunity and trends
Innovative Librarianship - Lib 3.0: The need, opportunity and trends
 
UVA MDST 3703 Thematic Research Collections 2012-09-18
UVA MDST 3703 Thematic Research Collections 2012-09-18UVA MDST 3703 Thematic Research Collections 2012-09-18
UVA MDST 3703 Thematic Research Collections 2012-09-18
 

More from Shenghui Wang

Non-parametric Subject Prediction
Non-parametric Subject PredictionNon-parametric Subject Prediction
Non-parametric Subject PredictionShenghui Wang
 
Our journey with semantic embedding
Our journey with semantic embeddingOur journey with semantic embedding
Our journey with semantic embeddingShenghui Wang
 
Linking entities via semantic indexing
Linking entities via semantic indexingLinking entities via semantic indexing
Linking entities via semantic indexingShenghui Wang
 
Semantic indexing for KOS
Semantic indexing for KOSSemantic indexing for KOS
Semantic indexing for KOSShenghui Wang
 
Contextualization of topics - browsing through terms, authors, journals and c...
Contextualization of topics - browsing through terms, authors, journals and c...Contextualization of topics - browsing through terms, authors, journals and c...
Contextualization of topics - browsing through terms, authors, journals and c...Shenghui Wang
 
Ariadne's Thread -- Exploring a world of networked information built from fre...
Ariadne's Thread -- Exploring a world of networked information built from fre...Ariadne's Thread -- Exploring a world of networked information built from fre...
Ariadne's Thread -- Exploring a world of networked information built from fre...Shenghui Wang
 
Learning Concept Mappings from Instance Similarity
Learning Concept Mappings from Instance SimilarityLearning Concept Mappings from Instance Similarity
Learning Concept Mappings from Instance SimilarityShenghui Wang
 
Measuring the dynamic bi-directional influence between content and social ne...
Measuring the dynamic bi-directional influence between content and social ne...Measuring the dynamic bi-directional influence between content and social ne...
Measuring the dynamic bi-directional influence between content and social ne...Shenghui Wang
 
Similarity Features, and their Role in Concept Alignment Learning
Similarity Features, and their Role in Concept Alignment Learning Similarity Features, and their Role in Concept Alignment Learning
Similarity Features, and their Role in Concept Alignment Learning Shenghui Wang
 
What is concept dirft and how to measure it?
What is concept dirft and how to measure it?What is concept dirft and how to measure it?
What is concept dirft and how to measure it?Shenghui Wang
 
Study concept drift in political ontologies
Study concept drift in political ontologiesStudy concept drift in political ontologies
Study concept drift in political ontologiesShenghui Wang
 

More from Shenghui Wang (13)

Non-parametric Subject Prediction
Non-parametric Subject PredictionNon-parametric Subject Prediction
Non-parametric Subject Prediction
 
Our journey with semantic embedding
Our journey with semantic embeddingOur journey with semantic embedding
Our journey with semantic embedding
 
Linking entities via semantic indexing
Linking entities via semantic indexingLinking entities via semantic indexing
Linking entities via semantic indexing
 
Semantic indexing for KOS
Semantic indexing for KOSSemantic indexing for KOS
Semantic indexing for KOS
 
Contextualization of topics - browsing through terms, authors, journals and c...
Contextualization of topics - browsing through terms, authors, journals and c...Contextualization of topics - browsing through terms, authors, journals and c...
Contextualization of topics - browsing through terms, authors, journals and c...
 
Ariadne's Thread -- Exploring a world of networked information built from fre...
Ariadne's Thread -- Exploring a world of networked information built from fre...Ariadne's Thread -- Exploring a world of networked information built from fre...
Ariadne's Thread -- Exploring a world of networked information built from fre...
 
Learning Concept Mappings from Instance Similarity
Learning Concept Mappings from Instance SimilarityLearning Concept Mappings from Instance Similarity
Learning Concept Mappings from Instance Similarity
 
Measuring the dynamic bi-directional influence between content and social ne...
Measuring the dynamic bi-directional influence between content and social ne...Measuring the dynamic bi-directional influence between content and social ne...
Measuring the dynamic bi-directional influence between content and social ne...
 
Similarity Features, and their Role in Concept Alignment Learning
Similarity Features, and their Role in Concept Alignment Learning Similarity Features, and their Role in Concept Alignment Learning
Similarity Features, and their Role in Concept Alignment Learning
 
What is concept dirft and how to measure it?
What is concept dirft and how to measure it?What is concept dirft and how to measure it?
What is concept dirft and how to measure it?
 
ICA Slides
ICA SlidesICA Slides
ICA Slides
 
ECCS 2010
ECCS 2010ECCS 2010
ECCS 2010
 
Study concept drift in political ontologies
Study concept drift in political ontologiesStudy concept drift in political ontologies
Study concept drift in political ontologies
 

Recently uploaded

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 

Recently uploaded (20)

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 

Exploring a world of networked information built from free-text metadata

  • 1. Shenghui Wang Rob Koopman Exploring a world of networked information built from free-text metadata OCLC Research EMEA ELAG2015
  • 2. What would you do if you are interested in a topic?
  • 5. Difficult to answer these questions: • What are the different aspects of this topic? • Are there related aspects missing in my search terms? • Who are the most prominent authors about this topic? • Which journals publish most about this topic? • How have others — e.g. librarians — described and classified this topic?
  • 7. How do we do this? • OFFLINE: generate a semantic representation for each entity • ONLINE: find the most related entities and display them using multidimensional scaling
  • 8. Build semantic representation • Basic assumptions – An entity can be represented by its context – Entities that share more context are more likely to be related • Context is the textual environment where an entity occurs • Example record: The effects of state prekindergarten programs on young children’s school readiness in five states • [author:jung kwanghee] • [subject:readiness for school]
  • 9. Dataset ● ArticleFirst, 65 million articles ● Selected 4 million entities (topical terms, authors, ISSNs, Dewey decimal codes) ● Represented by 1 million topical terms But a matrix of 4M x 1M is too big to process
  • 10. Dimension reduction based on Random Projection C: a co-occurrence matrix R: a random matrix of +/-1 C’: approximation of C after random projection -- Semantic matrix
  • 11. Online interface • Find mutual nearest neighbors • Use multidimensional scaling to display
  • 15. Possible applications • Explorative interface • Context based search: – brain • Journal finder – Arctic ice journals – http://brain.oxfordjournals.org/ • Author name disambiguation – pre kindergarten
  • 16. Context matters! • What does “young” mean in - ArticleFirst - WorldCat - Astrophysics - Art
  • 17. Ariadne (demo) http://thoth.pica.nl/relate • An extremely fast way of navigating large-scale heterogeneous entities • Generalisable to different datasets – Full WorldCat – Small but highly curated astrophysics dataset • Supports explorative information retrieval and entity disambiguation
  • 18. References • Koopman, Rob, and Shenghui Wang. 2014. “Where Should I Publish? Detecting Journal Similarity Based on What Has Been Published There.” In Proceedings of Digital Libraries 2014, 483–484. London, United Kingdom. Association for Computing Machinery. • Koopman, Rob, Shenghui Wang, Andrea Scharnhorst, and Gwenn Englebienne. 2015. “Ariadne’s Thread - Interactive Navigation in a World of Networked Information.” In CHI '15 Extended Abstracts on Human Factors in Computing Systems. ACM, Seoul, South Korea. • Koopman, Rob, Shenghui Wang, and Andrea Scharnhorst. 2015. “Contextualization of topics - browsing through terms, authors, journals and cluster allocations.” In Proceedings of 15th International Conference on Scientometrics & Informetrics. Istanbul, Turkey.
  • 19. Explore. Share. Magnify. Thank you Shenghui Wang Rob Koopman OCLC Research EMEA shenghui.wang@oclc.org rob.koopman@oclc.org
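
The offline step on slides 8 to 10 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the second and third toy records, the stopword list, the function names and the target dimension k=4 are all assumptions made for the example. It counts how often each tagged entity co-occurs with each topical term (matrix C), then multiplies C by a random matrix R of +/-1 entries to obtain the smaller matrix C'.

```python
import numpy as np

# Assumed stopword list, only for this toy example.
STOPWORDS = {"the", "of", "on", "in", "and", "a", "for"}

# Toy records: a free-text title plus tagged entities. The first follows the
# example on slide 8; the other two are invented for illustration.
RECORDS = [
    ("The effects of state prekindergarten programs on young children's "
     "school readiness in five states",
     ["author:jung kwanghee", "subject:readiness for school"]),
    ("School readiness and prekindergarten attendance",
     ["subject:readiness for school"]),
    ("Young stars in five nearby galaxies",
     ["subject:astrophysics"]),
]


def tokenize(text):
    """Lowercase, split on whitespace and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def build_cooccurrence(records):
    """Return (C, entities, terms); C[i, j] counts the records in which
    entity i appears together with topical term j."""
    entities = sorted({e for _, ents in records for e in ents})
    terms = sorted({t for title, _ in records for t in tokenize(title)})
    e_idx = {e: i for i, e in enumerate(entities)}
    t_idx = {t: j for j, t in enumerate(terms)}
    C = np.zeros((len(entities), len(terms)))
    for title, ents in records:
        for t in set(tokenize(title)):
            for e in ents:
                C[e_idx[e], t_idx[t]] += 1
    return C, entities, terms


def random_projection(C, k, seed=0):
    """Project the rows of C to k dimensions with a random +/-1 matrix R."""
    rng = np.random.default_rng(seed)
    R = rng.choice([-1.0, 1.0], size=(C.shape[1], k))
    return C @ R, R


C, entities, terms = build_cooccurrence(RECORDS)
C_reduced, R = random_projection(C, k=4)
print(C.shape, "->", C_reduced.shape)
```

In the talk's setting C would be roughly 4M x 1M and far too large to store densely, so a real system would presumably accumulate the projected rows directly rather than materialise C first; the toy version keeps both matrices only to make the relationship between C, R and C' visible.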

Editor's Notes

  1. Opac -> journal -> author -> [author:medeiros norm] -> worldcat Ambiguous names: [author:balas janet l] [author:balas j l]
  2. Journal finder Name disam
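
The browsing path in the notes above depends on the online step from slide 11: finding mutual nearest neighbors and laying them out with multidimensional scaling. The sketch below shows one way this could look, under stated assumptions: the slides name neither the similarity measure nor the MDS variant, so cosine similarity and classical (eigendecomposition-based) MDS are choices made here, and the four toy vectors and all function names are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    U = X / norms
    return U @ U.T


def mutual_nearest_neighbors(X, k):
    """Return pairs (i, j), i < j, where each row is among the other's
    k most similar rows."""
    S = cosine_similarity_matrix(X)
    np.fill_diagonal(S, -np.inf)  # a row is not its own neighbor
    top_k = np.argsort(-S, axis=1)[:, :k]
    neighbor_sets = [set(row.tolist()) for row in top_k]
    return {
        (i, j)
        for i in range(len(X))
        for j in neighbor_sets[i]
        if i < j and i in neighbor_sets[j]
    }


def classical_mds(D, dims=2):
    """Embed points in `dims` dimensions from a distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:dims]  # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[order], 0, None))
    return eigvecs[:, order] * scale


# Hypothetical semantic vectors: rows 0 and 1 are close, rows 2 and 3 are close.
X = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
])

pairs = mutual_nearest_neighbors(X, k=1)

# Euclidean distance between unit-normalised rows, derived from cosine similarity.
S = cosine_similarity_matrix(X)
D = np.sqrt(np.clip(2.0 - 2.0 * S, 0.0, None))
coords = classical_mds(D, dims=2)
print(sorted(pairs))
print(coords.round(3))
```

Requiring the neighbor relation to hold in both directions filters out "hub" entities that appear in many neighbor lists without being specifically related to the query, which is one plausible reason for preferring mutual nearest neighbors over plain nearest neighbors.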