DiJeSt.net
LOD4JS
Linked Open Data for
Jewish Studies
A DiJeSt Project
Yael Netzer,
Kepa Rodriguez,
Sinai Rusinek
The DiJeSt project was supported by the Rothschild Foundation Hanadiv Europe.
Slides: https://bit.ly/ajl2020
DiJeSt Data: becoming more connected!
Catalogues as data
Traditional objectives of catalogues and authority files (among others):
- search for a book, discover a book
- search by author, place, subject
- index the inventory
Catalogues are bodies of knowledge that change over time.
Our objective and methodology:
- Transfer catalogues and authority files into tables
- Understand the content as a whole
- Inspect variations and unify values such as dates and place names
- View the information from additional perspectives (the Digital Humanities concept of “distant reading”)
- Derive new representations (e.g., Linked Open Data, maps)
Reading Authority Files - People
Starting point: the National Library of Israel (NLI) authority file heb100.xml
July 2018 (MARCXML file)
203,771 entities
Reading Catalogues:
Bibliography of the Hebrew Book
Small (107,977 records), comprehensive and authoritative
But:
messy, not machine-readable
(names, dates, places, etc.)
No mapping to the NLI authority files:
we created one using OpenRefine
Reading Authority File: Places
Connect place names in the catalogue to place authority files, following previous work on Kima (by Sinai Rusinek)
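Connecting catalogue place names to an authority file can be sketched as a lookup over normalized name variants. This is only an illustration under assumptions: the identifiers `place_0001` and `place_0002`, the variant lists and the helper names are hypothetical and are not taken from Kima or the NLI files. Real reconciliation would also need fuzzy matching and manual review.

```python
# Hypothetical authority entries: identifier -> known name variants.
# Neither the identifiers nor the lists come from Kima or the NLI.
AUTHORITY = {
    "place_0001": ["Amsterdam", "אמשטרדם"],
    "place_0002": ["Venice", "Venezia", "ויניציאה"],
}


def normalize(name):
    """Case-fold and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.casefold().split())


def build_index(authority):
    """Map every normalized variant to its authority identifier."""
    index = {}
    for place_id, variants in authority.items():
        for variant in variants:
            index[normalize(variant)] = place_id
    return index


def match_places(names, index):
    """Return catalogue name -> authority identifier (None when unmatched)."""
    return {name: index.get(normalize(name)) for name in names}


index = build_index(AUTHORITY)
result = match_places(["  venezia", "אמשטרדם", "Unknown Town"], index)
for name, place_id in result.items():
    print(repr(name), "->", place_id)
```

Names that stay unmatched (`None`) are the ones worth inspecting by hand.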
Process
1. Identify relevant fields, subfields and indicators in the MARC records (XML, JSON) from the NLI
2. Map them into columns using Python code
3. Inspect the result with OpenRefine
4. Go back to step 1 until satisfied with the results
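Steps 1 and 2 can be sketched with the Python standard library alone. This is a minimal sketch and not the project's actual code: the sample record is invented, and the choice of field 100 with subfields `a` and `d`, the column names and the function names are assumptions made for illustration. Only the MARCXML namespace is standard.

```python
import csv
import io
import xml.etree.ElementTree as ET

MARC_URI = "http://www.loc.gov/MARC21/slim"
MARC_NS = {"marc": MARC_URI}

# Invented record for illustration; not taken from the NLI file.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">000000001</controlfield>
    <datafield tag="100" ind1="1" ind2=" ">
      <subfield code="a">Example, Author</subfield>
      <subfield code="d">1900-1980</subfield>
    </datafield>
  </record>
</collection>"""

# Hypothetical mapping: column name -> (MARC tag, subfield code).
COLUMNS = {"name": ("100", "a"), "dates": ("100", "d")}


def record_to_row(record):
    """Flatten one MARC record into a dict of column values."""
    row = {
        "id": record.findtext(
            "marc:controlfield[@tag='001']", default="", namespaces=MARC_NS
        )
    }
    for column, (tag, code) in COLUMNS.items():
        path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
        values = [el.text or "" for el in record.findall(path, MARC_NS)]
        # Repeated subfields are kept, joined with a visible separator.
        row[column] = " | ".join(values)
    return row


def marcxml_to_rows(xml_text):
    """Parse a MARCXML collection and return one row per record."""
    root = ET.fromstring(xml_text)
    return [record_to_row(r) for r in root.iter(f"{{{MARC_URI}}}record")]


rows = marcxml_to_rows(SAMPLE)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", *COLUMNS])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

The resulting CSV can be loaded into OpenRefine for step 3. Changing `COLUMNS` and re-running corresponds to the loop in step 4.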
OpenRefine - A powerful tool
- Faceting and clustering
- Many knowledge representation formats
- Easy identification of errors and inconsistencies
- Built-in functions, plus optional scripting in Python (Jython) or Clojure
- Easy API calls to collect and add information
- Reconciliation with external LOD resources
- Various export options, including generation of linked data (RDF)
- Open source, resourceful community, FAIR principles
https://openrefine.org
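The clustering mentioned above can be illustrated with a simplified version of the key-collision ("fingerprint") idea: values that reduce to the same normalized key are grouped as likely variants. This is a rough approximation written for illustration, not OpenRefine's own implementation, and the slides do not say which clustering method the project used. The sample names are invented.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Reduce a string to a normalized key for grouping variants."""
    text = unicodedata.normalize("NFKD", value.strip().lower())
    # Drop combining marks so that accented and plain letters collide.
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Sort and de-duplicate tokens so that word order does not matter.
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values by fingerprint; keep only groups with several members."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(group) > 1]


names = ["Amsterdam", "amsterdam.", " AMSTERDAM ", "Fürth", "Furth", "Venice"]
clusters = cluster(names)
for group in clusters:
    print(group)
```

Each printed group is a candidate for merging into one value. A human still has to confirm that the variants really denote the same thing.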
Collection as Data
Data model: framework and ontologies (1)
When we designed our model, we decided:
● To make our data available for reuse: open data sharing policy.
● To make our data understandable for computers (machine-actionable).
● To make it possible to navigate from our data to other resources.
● To expose our data as Linked Open Data (LOD).
Data model: framework and ontologies (2)
LOD uses the Resource Description Framework (RDF) to describe data as subject-predicate-object triples.
● djr:book_152786 dcterms:creator djr:person_1403804
○ “The book with ID 152786 was created by the person with ID 1403804”
○ The author of “נערות במלכודת” is Mordechai Narkis
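The example triple can be written out in N-Triples form with a few lines of Python. This is a sketch under one loud assumption: the slides do not give the namespace behind the `djr:` prefix, so `http://example.org/dijest/resource/` is a placeholder. The `dcterms:` namespace is the standard Dublin Core Terms one. The function names are illustrative.

```python
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    # Placeholder: the real namespace for "djr:" is not stated in the slides.
    "djr": "http://example.org/dijest/resource/",
}


def expand(curie):
    """Turn a prefixed name such as 'dcterms:creator' into a full IRI."""
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local


def to_ntriple(subject, predicate, obj):
    """Serialize one triple of IRIs as a single N-Triples line."""
    return f"<{expand(subject)}> <{expand(predicate)}> <{expand(obj)}> ."


line = to_ntriple("djr:book_152786", "dcterms:creator", "djr:person_1403804")
print(line)
```

Because subject and object are both identifiers rather than strings, the same person IRI can be reused by every book that person created, which is what makes the data navigable.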
We use standard ontologies to model our entity types.
This makes the data easier for software agents to interpret.
● Authorities:
○ Person: SKOS, schema.org, DBpedia, RDA Registry, EAC-CPF
○ Place: SKOS, schema.org, WGS84 (wgs84_pos)
● Books:
○ Dublin Core Terms, FaBiO, GND, BIBFRAME
Data model: framework and ontologies (3)
What can be done with it? eLinda
● http://tdk-p6.cs.technion.ac.il:8083/
● With Oren Mishali, Technion Data & Knowledge Lab (TD&K)
Vision
Connect with other LOD resources:
Jewish Book Shelf
JudaicaLink
Wikidata
Epidat
Future projects:
Expand the model to also represent:
● Book copies: library holdings
● Book copies can be owned and censored by people (Footprints)
● Works: model Ben Yehuda Project
● Places: model KIMA
Include authorities for publishers and printers (NLI)
Using Kima (http://data.geo-kima.org/) and CartoDB to create an interactive map of the BHB
label: תודה!
alternativeLabel: תודה!
alternativeLabel: thank you!
http://dijest.net/
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
