SlideShare ist ein Scribd-Unternehmen logo
1 von 40
Downloaden Sie, um offline zu lesen
Change Tracking in Knowledge
Organization Systems with skos-history
Joachim Neubert & Osma Suominen
ZBW – Leibniz Information Centre for Economics, Kiel/Hamburg &
The National Library of Finland, Helsinki
DCMI/ASIST/AIMS Webinar Series:
Generic Tools and Methods for SKOS-based Concept Schemes
16.3.2016
Agenda
 User questions and requirements
 Getting a grip on changes:
 Overview
 Creating a version store
 Generic queries
 Dataset-specific adaption of queries
 skos-history in use
 Application at the National Library of Finland
 Application for STW Thesaurus for Economics
 Outlook: Future work and the skos-history project
Page 2
What users want to know …
Page 3
… when we publish a new KOS version:
 What‘s new?
 What has changed?
Use cases for extended change information
Page 4
 Human indexers wanting to learn about new and deprecated
concepts
 Human indexers (and supporting applications) re-indexing large sets
of documents
 People maintaining mappings to other vocabularies, and applications
supporting them
 People maintaining a derived subset of a KOS
 Vocabulary-based automatic or semi-automatic indexing applications
 Search applications utilizing the KOS
Agenda
 User questions and requirements
 Getting a grip on changes:
 Overview
 Creating a version store
 Generic queries
 Dataset-specific adaption of queries
 skos-history in use
 Application at the National Library of Finland
 Application for STW Thesaurus for Economics
 Outlook: Future work and the skos-history project
Page 5
Overview: getting a grip on changes
Provided that we have no access to the KOS maintenance system
where the changes take place originally, or can’t extend it to report this
changes comprehensively.
Dataset versioning + skos-history approach
=> should work on every SKOS vocabulary
Page 6
Scope of vocabulary versioning
 Versioning the concept scheme, not each individual concept
 URIs for the concepts remain stable over the different versions
 Distinct versions of a vocabulary, or at least timestamped dumps,
must be available
 Support for a continuous flow of changes, e.g., the LoC Subject
Headings, or the concepts of the GND, is currently not provided
Page 7
Three basic steps to an actionable skos-history
Start with one SKOS file per version.
1) Create the deltas - insertions and deletions - between every two
version files.
(Via a raw diff of sorted ntriples files, or via SPARQL MINUS in a
triple store.This gives you thousands and thousands of differences -
added or deleted triples -, even excluding bnodes.)
2) Load the version files and the insertions and deletions into a triple
store as named graphs.
3) Add metadata about the versions and the deltas in a separate
„version history graph“.
Page 8
https://github.com/jneubert/skos-history/blob/master/bin/load_versions.sh
Agenda
 User questions and requirements
 Getting a grip on changes:
 Overview
 Creating a version store
 Generic queries
 Dataset-specific adaption of queries
 skos-history in use
 Application at the National Library of Finland
 Application for STW Thesaurus for Economics
 Outlook: Future work and the skos-history project
Page 9
Hands on: Create a version store for skos-history
Requirements:
 SPARQL 1.1 compliant service or repository (‘triple store’),
accessible in read/write mode
https://github.com/NatLibFi/Skosmos/wiki/InstallTutorial#install-jena-fuseki
 An environment for executing bash scripts for the data load script
(any Linux should do, Cygwin may).
Tutorial: https://github.com/jneubert/skos-history/wiki/Tutorial
Code of scripts and queries: also on GitHub
Page 10
Load a version store: config file for JEL
Page 11
Configuration for Fuseki (https://github.com/jneubert/skos-history/blob/master/bin/jel.config);
see also configuration for Sesame (https://github.com/jneubert/skos-history/blob/master/bin/jel.sesame.config)
Load a version store: load_versions.sh script
Page 12
Load a version store: load_versions.sh script
Page 13
Page 14
Example endpoint:http://zbw.eu/beta/sparql/stwv/query
Version History Graph, discoverable via
fix URI, e.g.: http://zbw.eu/stw/version
Version History Graph, published as HTML/RDFa
Page 15
http://zbw.eu/stw/version
Vocabularies used for the plumbing
 dc:/dcterms:
Dublin Core, as usual the base for everything
 void: http://rdfs.org/ns/void#
Vocabulary of interlinked datasets
 sd: http://www.w3.org/ns/sparql-service-description#
SPARQL service description
 delta: http://www.w3.org/2004/delta#
Differences between RDF graphs
 dsv: http://purl.org/iso25964/DataSet/Versioning#
Version history records (providing version identifier and date) and a
pointer to the current version – outside the actual version data
 sh: http://purl.org/skos-history/
Scheme and concept version deltas
Page 16
What’s the benefit?
A database of all versions of a KOS and all deltas between versions
– which can be queried in parallel!
Page 17
Agenda
 User questions and requirements
 Getting a grip on changes:
 Overview
 Creating a version store
 Generic queries
 Dataset-specific adaption of queries
 skos-history in use
 Application at the National Library of Finland
 Application for STW Thesaurus for Economics
 Outlook: Future work and the skos-history project
Page 18
Page 19
http://zbw.eu/beta/sparql-lab/?queryRef=https://api.github.com/repos/jneubert/skos-history/contents/sparql/added_concepts.rq
Query for added concepts
Newly inserted concepts – results
Page 20
Reports operating on standard SKOS structures
Page 21
https://github.com/jneubert/skos-history/tree/master/sparql
Reports … (continued)
Page 22
Changed notations
Page 23
http://zbw.eu/beta/sparql-lab/?queryRef=https://api.github.com/repos/jneubert/skos-history/contents/sparql/changed_notations.rq
New concepts, split from old ones
Page 24
Labels moved to added concepts:
http://zbw.eu/beta/sparql-lab/?queryRef=https://api.github.com/repos/jneubert/skos-history/contents/sparql/labels_moved_to_added_concepts.rq
Change history of a concept: “Personnel selection”
Page 25
http://zbw.eu/beta/sparql-lab/?queryRef=https://api.github.com/repos/jneubert/skos-history/contents/sparql/concept_deltas.rq
Agenda
 User questions and requirements
 Getting a grip on changes:
 Overview
 Creating a version store
 Generic queries
 Dataset-specific adaption of queries
 skos-history in use
 Application at the National Library of Finland
 Application for STW Thesaurus for Economics
 Outlook: Future work and the skos-history project
Page 26
GND subjects by subject category – query
Page 27
https://github.com/jneubert/skos-history/blob/master/sparql/swdskos/added_concepts_by_category.rq
GND subjects by subject category – results
Page 28
STW deprecated concepts – query
Page 29
https://github.com/jneubert/skos-history/blob/master/sparql/stw/deprecated_concepts_by_category.rq
STW deprecated concepts – result
Page 30
Agenda
 User questions and requirements
 Getting a grip on changes:
 Overview
 Creating a version store
 Generic queries
 Dataset-specific adaption of queries
 skos-history in use
 Application at the National Library of Finland
 Application for STW Thesaurus for Economics
 Outlook: Future work and the skos-history project
Page 31
skos-history at the National Library of Finland
see separate slides at http://tinyurl.com/skos-history-nlf
Page 32
Agenda
 User questions and requirements
 Getting a grip on changes:
 Overview
 Creating a version store
 Generic queries
 Dataset-specific adaption of queries
 skos-history in use
 Application at the National Library of Finland
 Application for STW Thesaurus for Economics
 Outlook: Future work and the skos-history project
Page 33
STW Thesaurus for Economics
 created in the 1990s
 on the web and available as SKOS since 2009
 bilingual (German/English)
 about 6000 descriptors, 500 subject categories
 overhaul during the last five years (five consecutive versions)
Page 34
STW change reports (precompiled query results)
Page 35
Visualizing change with aggregated data
Page 36
Page 37
Drill down from chart to change report
Page 38
Future work and the skos-history project
 Apply to differing concept schemes
 Distill general properties useful for human-readable change
reports as well as machine-actionable data
 Get a grip on clusters of interrelated changes
Please consider joining – particularly if
 you are in charge of a KOS and want to publish its change history
 you are using one or several KOS in an application, or
intellectually, and want to trace and re-apply upstream changes
 just feel challenged by the task
Page 39
Page 40
Thanks for listening!
Joachim Neubert
ZBW – Leibniz Information Centre for Economics
j.neubert@zbw.eu
Osma Suominen
The National Library of Finland
osma.suominen@helsinki.fi
Project repository: https://github.com/jneubert/skos-history

Weitere ähnliche Inhalte

Was ist angesagt?

Maurer Presentation - WARCnet Spring Meeting 2021
Maurer Presentation - WARCnet Spring Meeting 2021Maurer Presentation - WARCnet Spring Meeting 2021
Maurer Presentation - WARCnet Spring Meeting 2021WARCnet
 
Europe PubMed Central and Linked Data
Europe PubMed Central and Linked DataEurope PubMed Central and Linked Data
Europe PubMed Central and Linked DataJee-Hyub Kim
 
Dirk Goldhahn: Introduction to the German Wortschatz Project
Dirk Goldhahn: Introduction to the German Wortschatz ProjectDirk Goldhahn: Introduction to the German Wortschatz Project
Dirk Goldhahn: Introduction to the German Wortschatz Projectmbruemmer
 
Elag integrating open source
Elag integrating open sourceElag integrating open source
Elag integrating open sourceTheodor Tolstoy
 
Tuesday 5 May: Definition and Representation of National Web Domains across W...
Tuesday 5 May: Definition and Representation of National Web Domains across W...Tuesday 5 May: Definition and Representation of National Web Domains across W...
Tuesday 5 May: Definition and Representation of National Web Domains across W...WARCnet
 
Bingham, De Wild & Aasman Presentation
Bingham, De Wild & Aasman PresentationBingham, De Wild & Aasman Presentation
Bingham, De Wild & Aasman PresentationWARCnet
 
Open Data Mashups: linking fragments into mosaics
Open Data Mashups: linking fragments into mosaicsOpen Data Mashups: linking fragments into mosaics
Open Data Mashups: linking fragments into mosaicsphduchesne
 
Versioned Triple Pattern Fragments
Versioned Triple Pattern FragmentsVersioned Triple Pattern Fragments
Versioned Triple Pattern FragmentsRuben Taelman
 
Documentation With Open Source Tools
Documentation With Open Source ToolsDocumentation With Open Source Tools
Documentation With Open Source ToolsRashad Aliyev
 
Documentation With Open Source Tools·(ასლი)
Documentation With Open Source Tools·(ასლი)Documentation With Open Source Tools·(ასლი)
Documentation With Open Source Tools·(ასლი)Rashad Aliyev
 

Was ist angesagt? (12)

Maurer Presentation - WARCnet Spring Meeting 2021
Maurer Presentation - WARCnet Spring Meeting 2021Maurer Presentation - WARCnet Spring Meeting 2021
Maurer Presentation - WARCnet Spring Meeting 2021
 
LOD2 Plenary Vienna 2012: WP2 - Storing and Querying Very Large Knowledge Bases
LOD2 Plenary Vienna 2012: WP2 - Storing and Querying Very Large Knowledge BasesLOD2 Plenary Vienna 2012: WP2 - Storing and Querying Very Large Knowledge Bases
LOD2 Plenary Vienna 2012: WP2 - Storing and Querying Very Large Knowledge Bases
 
Europe PubMed Central and Linked Data
Europe PubMed Central and Linked DataEurope PubMed Central and Linked Data
Europe PubMed Central and Linked Data
 
Dirk Goldhahn: Introduction to the German Wortschatz Project
Dirk Goldhahn: Introduction to the German Wortschatz ProjectDirk Goldhahn: Introduction to the German Wortschatz Project
Dirk Goldhahn: Introduction to the German Wortschatz Project
 
Elag integrating open source
Elag integrating open sourceElag integrating open source
Elag integrating open source
 
Tuesday 5 May: Definition and Representation of National Web Domains across W...
Tuesday 5 May: Definition and Representation of National Web Domains across W...Tuesday 5 May: Definition and Representation of National Web Domains across W...
Tuesday 5 May: Definition and Representation of National Web Domains across W...
 
Bingham, De Wild & Aasman Presentation
Bingham, De Wild & Aasman PresentationBingham, De Wild & Aasman Presentation
Bingham, De Wild & Aasman Presentation
 
Open Data Mashups: linking fragments into mosaics
Open Data Mashups: linking fragments into mosaicsOpen Data Mashups: linking fragments into mosaics
Open Data Mashups: linking fragments into mosaics
 
VeloxDFS
VeloxDFSVeloxDFS
VeloxDFS
 
Versioned Triple Pattern Fragments
Versioned Triple Pattern FragmentsVersioned Triple Pattern Fragments
Versioned Triple Pattern Fragments
 
Documentation With Open Source Tools
Documentation With Open Source ToolsDocumentation With Open Source Tools
Documentation With Open Source Tools
 
Documentation With Open Source Tools·(ასლი)
Documentation With Open Source Tools·(ასლი)Documentation With Open Source Tools·(ასლი)
Documentation With Open Source Tools·(ასლი)
 

Andere mochten auch

Exploiting the version history of SKOS files: skos-history (SWIB13 Lightning ...
Exploiting the version history of SKOS files: skos-history (SWIB13 Lightning ...Exploiting the version history of SKOS files: skos-history (SWIB13 Lightning ...
Exploiting the version history of SKOS files: skos-history (SWIB13 Lightning ...Joachim Neubert
 
Linked Data Publishing with Drupal (SWIB12 Lightning Talk)
Linked Data Publishing with Drupal (SWIB12 Lightning Talk)Linked Data Publishing with Drupal (SWIB12 Lightning Talk)
Linked Data Publishing with Drupal (SWIB12 Lightning Talk)Joachim Neubert
 
Linked data enhanced publishing for special collections (with Drupal)
Linked data enhanced publishing for special collections (with Drupal)Linked data enhanced publishing for special collections (with Drupal)
Linked data enhanced publishing for special collections (with Drupal)Joachim Neubert
 
Using Wikidata as an Authority for the SowiDataNet Research Data Repository
Using Wikidata as an Authority for the SowiDataNet Research Data RepositoryUsing Wikidata as an Authority for the SowiDataNet Research Data Repository
Using Wikidata as an Authority for the SowiDataNet Research Data RepositoryJoachim Neubert
 
Opportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocurationOpportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocurationBenjamin Good
 
Authority Control: Wikipedia + Wikidata
Authority Control: Wikipedia + WikidataAuthority Control: Wikipedia + Wikidata
Authority Control: Wikipedia + WikidataErika Herzog
 
Account-Based Marketing Automation
Account-Based Marketing AutomationAccount-Based Marketing Automation
Account-Based Marketing AutomationRon Corbisier
 
Account Based Marketing Overview
Account Based Marketing OverviewAccount Based Marketing Overview
Account Based Marketing OverviewRon Corbisier
 
Verifiable, linked open knowledge that anyone can edit
Verifiable, linked open knowledge that anyone can editVerifiable, linked open knowledge that anyone can edit
Verifiable, linked open knowledge that anyone can editDario Taraborelli
 
Anforderungen an Thesauri im Semantic Web
Anforderungen an Thesauri im Semantic WebAnforderungen an Thesauri im Semantic Web
Anforderungen an Thesauri im Semantic WebJoachim Neubert
 

Andere mochten auch (10)

Exploiting the version history of SKOS files: skos-history (SWIB13 Lightning ...
Exploiting the version history of SKOS files: skos-history (SWIB13 Lightning ...Exploiting the version history of SKOS files: skos-history (SWIB13 Lightning ...
Exploiting the version history of SKOS files: skos-history (SWIB13 Lightning ...
 
Linked Data Publishing with Drupal (SWIB12 Lightning Talk)
Linked Data Publishing with Drupal (SWIB12 Lightning Talk)Linked Data Publishing with Drupal (SWIB12 Lightning Talk)
Linked Data Publishing with Drupal (SWIB12 Lightning Talk)
 
Linked data enhanced publishing for special collections (with Drupal)
Linked data enhanced publishing for special collections (with Drupal)Linked data enhanced publishing for special collections (with Drupal)
Linked data enhanced publishing for special collections (with Drupal)
 
Using Wikidata as an Authority for the SowiDataNet Research Data Repository
Using Wikidata as an Authority for the SowiDataNet Research Data RepositoryUsing Wikidata as an Authority for the SowiDataNet Research Data Repository
Using Wikidata as an Authority for the SowiDataNet Research Data Repository
 
Opportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocurationOpportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocuration
 
Authority Control: Wikipedia + Wikidata
Authority Control: Wikipedia + WikidataAuthority Control: Wikipedia + Wikidata
Authority Control: Wikipedia + Wikidata
 
Account-Based Marketing Automation
Account-Based Marketing AutomationAccount-Based Marketing Automation
Account-Based Marketing Automation
 
Account Based Marketing Overview
Account Based Marketing OverviewAccount Based Marketing Overview
Account Based Marketing Overview
 
Verifiable, linked open knowledge that anyone can edit
Verifiable, linked open knowledge that anyone can editVerifiable, linked open knowledge that anyone can edit
Verifiable, linked open knowledge that anyone can edit
 
Anforderungen an Thesauri im Semantic Web
Anforderungen an Thesauri im Semantic WebAnforderungen an Thesauri im Semantic Web
Anforderungen an Thesauri im Semantic Web
 

Ähnlich wie Change Tracking in Knowledge Organization Systems with skos-history

Leveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
Leveraging SKOS to trace the overhaul of the STW Thesaurus for EconomicsLeveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
Leveraging SKOS to trace the overhaul of the STW Thesaurus for EconomicsJoachim Neubert
 
BDE SC3.3 Workshop - BDE Platform: Technical overview
 BDE SC3.3 Workshop -  BDE Platform: Technical overview BDE SC3.3 Workshop -  BDE Platform: Technical overview
BDE SC3.3 Workshop - BDE Platform: Technical overviewBigData_Europe
 
Ifla swsig meeting - Puerto Rico - 20110817
Ifla swsig meeting - Puerto Rico - 20110817Ifla swsig meeting - Puerto Rico - 20110817
Ifla swsig meeting - Puerto Rico - 20110817Figoblog
 
TDWG VoMaG Vocabulary management workflow, 2013-10-31
TDWG VoMaG Vocabulary management workflow, 2013-10-31TDWG VoMaG Vocabulary management workflow, 2013-10-31
TDWG VoMaG Vocabulary management workflow, 2013-10-31Dag Endresen
 
Apache Big_Data Europe event: "Demonstrating the Societal Value of Big & Smar...
Apache Big_Data Europe event: "Demonstrating the Societal Value of Big & Smar...Apache Big_Data Europe event: "Demonstrating the Societal Value of Big & Smar...
Apache Big_Data Europe event: "Demonstrating the Societal Value of Big & Smar...BigData_Europe
 
Bibliosight (UKCoRR presentation)
Bibliosight (UKCoRR presentation)Bibliosight (UKCoRR presentation)
Bibliosight (UKCoRR presentation)Nick Sheppard
 
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016Ivan Ermilov
 
Matthew Hale - Open Source at the Kings Fund
Matthew Hale - Open Source at the Kings FundMatthew Hale - Open Source at the Kings Fund
Matthew Hale - Open Source at the Kings FundTracy Kent
 
SKOS - 2007 Open Forum on Metadata Registries - NYC
SKOS - 2007 Open Forum on Metadata Registries - NYCSKOS - 2007 Open Forum on Metadata Registries - NYC
SKOS - 2007 Open Forum on Metadata Registries - NYCjonphipps
 
BigDataEurope @BDVA Summit2016 2: Societal Pilots
BigDataEurope @BDVA Summit2016 2: Societal PilotsBigDataEurope @BDVA Summit2016 2: Societal Pilots
BigDataEurope @BDVA Summit2016 2: Societal PilotsBigData_Europe
 
2005-01-04 Web Services Survey an Inventory Background, Goals and Status
2005-01-04 Web Services Survey an Inventory Background, Goals and Status2005-01-04 Web Services Survey an Inventory Background, Goals and Status
2005-01-04 Web Services Survey an Inventory Background, Goals and StatusRudolf Husar
 
Web Services Inventory
Web Services InventoryWeb Services Inventory
Web Services InventoryRudolf Husar
 
Publishing the British National Bibliography as Linked Open Data / Corine Del...
Publishing the British National Bibliography as Linked Open Data / Corine Del...Publishing the British National Bibliography as Linked Open Data / Corine Del...
Publishing the British National Bibliography as Linked Open Data / Corine Del...CIGScotland
 
Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...Nikolaos Konstantinou
 
20100401 정영임 da 전략 tft_0330
20100401 정영임 da 전략 tft_033020100401 정영임 da 전략 tft_0330
20100401 정영임 da 전략 tft_0330glorykim
 
20100401 정영임 da 전략 tft_0330
20100401 정영임 da 전략 tft_033020100401 정영임 da 전략 tft_0330
20100401 정영임 da 전략 tft_0330광영 김
 

Ähnlich wie Change Tracking in Knowledge Organization Systems with skos-history (20)

Leveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
Leveraging SKOS to trace the overhaul of the STW Thesaurus for EconomicsLeveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
Leveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
 
BDE SC3.3 Workshop - BDE Platform: Technical overview
 BDE SC3.3 Workshop -  BDE Platform: Technical overview BDE SC3.3 Workshop -  BDE Platform: Technical overview
BDE SC3.3 Workshop - BDE Platform: Technical overview
 
Ifla swsig meeting - Puerto Rico - 20110817
Ifla swsig meeting - Puerto Rico - 20110817Ifla swsig meeting - Puerto Rico - 20110817
Ifla swsig meeting - Puerto Rico - 20110817
 
TDWG VoMaG Vocabulary management workflow, 2013-10-31
TDWG VoMaG Vocabulary management workflow, 2013-10-31TDWG VoMaG Vocabulary management workflow, 2013-10-31
TDWG VoMaG Vocabulary management workflow, 2013-10-31
 
Apache Big_Data Europe event: "Demonstrating the Societal Value of Big & Smar...
Apache Big_Data Europe event: "Demonstrating the Societal Value of Big & Smar...Apache Big_Data Europe event: "Demonstrating the Societal Value of Big & Smar...
Apache Big_Data Europe event: "Demonstrating the Societal Value of Big & Smar...
 
Bologna
BolognaBologna
Bologna
 
Bibliosight (UKCoRR presentation)
Bibliosight (UKCoRR presentation)Bibliosight (UKCoRR presentation)
Bibliosight (UKCoRR presentation)
 
Shadle "Library Catalog Metadata Basics for Publishers"
Shadle "Library Catalog Metadata Basics for Publishers"Shadle "Library Catalog Metadata Basics for Publishers"
Shadle "Library Catalog Metadata Basics for Publishers"
 
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
 
Matthew Hale - Open Source at the Kings Fund
Matthew Hale - Open Source at the Kings FundMatthew Hale - Open Source at the Kings Fund
Matthew Hale - Open Source at the Kings Fund
 
SKOS - 2007 Open Forum on Metadata Registries - NYC
SKOS - 2007 Open Forum on Metadata Registries - NYCSKOS - 2007 Open Forum on Metadata Registries - NYC
SKOS - 2007 Open Forum on Metadata Registries - NYC
 
BigDataEurope @BDVA Summit2016 2: Societal Pilots
BigDataEurope @BDVA Summit2016 2: Societal PilotsBigDataEurope @BDVA Summit2016 2: Societal Pilots
BigDataEurope @BDVA Summit2016 2: Societal Pilots
 
2005-01-04 Web Services Survey an Inventory Background, Goals and Status
2005-01-04 Web Services Survey an Inventory Background, Goals and Status2005-01-04 Web Services Survey an Inventory Background, Goals and Status
2005-01-04 Web Services Survey an Inventory Background, Goals and Status
 
Web Services Inventory
Web Services InventoryWeb Services Inventory
Web Services Inventory
 
Publishing the British National Bibliography as Linked Open Data / Corine Del...
Publishing the British National Bibliography as Linked Open Data / Corine Del...Publishing the British National Bibliography as Linked Open Data / Corine Del...
Publishing the British National Bibliography as Linked Open Data / Corine Del...
 
Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...
 
20100401 정영임 da 전략 tft_0330
20100401 정영임 da 전략 tft_033020100401 정영임 da 전략 tft_0330
20100401 정영임 da 전략 tft_0330
 
20100401 정영임 da 전략 tft_0330
20100401 정영임 da 전략 tft_033020100401 정영임 da 전략 tft_0330
20100401 정영임 da 전략 tft_0330
 
58s
58s58s
58s
 
58s
58s58s
58s
 

Mehr von Joachim Neubert

Linking the 20th century paper history to the sum of all knowledge
Linking the 20th century paper history to the sum of all knowledgeLinking the 20th century paper history to the sum of all knowledge
Linking the 20th century paper history to the sum of all knowledgeJoachim Neubert
 
Exploring and mapping the category system of the world‘s largest public press...
Exploring and mapping the category system of the world‘s largest public press...Exploring and mapping the category system of the world‘s largest public press...
Exploring and mapping the category system of the world‘s largest public press...Joachim Neubert
 
Donating data to Wikidata: First experiences from the „20th Century Press Arc...
Donating data to Wikidata: First experiences from the „20th Century Press Arc...Donating data to Wikidata: First experiences from the „20th Century Press Arc...
Donating data to Wikidata: First experiences from the „20th Century Press Arc...Joachim Neubert
 
Wikidata as a hub for the linked data cloud
Wikidata as a hub for the linked data cloudWikidata as a hub for the linked data cloud
Wikidata as a hub for the linked data cloudJoachim Neubert
 
Wikidata as opportunity for special collections: the 20th Century Press Archi...
Wikidata as opportunity for special collections: the 20th Century Press Archi...Wikidata as opportunity for special collections: the 20th Century Press Archi...
Wikidata as opportunity for special collections: the 20th Century Press Archi...Joachim Neubert
 
20th Century Press Archives goes Wikidata
20th Century Press Archives goes Wikidata20th Century Press Archives goes Wikidata
20th Century Press Archives goes WikidataJoachim Neubert
 
Chancen und Herausforderungen einer komplementären Nutzung von GND und Wikidata
Chancen und Herausforderungen einer komplementären Nutzung von GND und WikidataChancen und Herausforderungen einer komplementären Nutzung von GND und Wikidata
Chancen und Herausforderungen einer komplementären Nutzung von GND und WikidataJoachim Neubert
 
Pressemappe 20. Jahrhundert: Personen- und Firmendossiers
Pressemappe 20. Jahrhundert: Personen- und FirmendossiersPressemappe 20. Jahrhundert: Personen- und Firmendossiers
Pressemappe 20. Jahrhundert: Personen- und FirmendossiersJoachim Neubert
 
20th Century Press Archives goes Wikidata
20th Century Press Archives goes Wikidata20th Century Press Archives goes Wikidata
20th Century Press Archives goes WikidataJoachim Neubert
 
Linking Knowledge Organization Systems via Wikidata (DCMI conference 2018)
Linking Knowledge Organization Systems via Wikidata (DCMI conference 2018)Linking Knowledge Organization Systems via Wikidata (DCMI conference 2018)
Linking Knowledge Organization Systems via Wikidata (DCMI conference 2018)Joachim Neubert
 
Making Wikidata fit as a Linking Hub for Knowledge Organization Systems
Making Wikidata fit as a Linking Hub for Knowledge Organization SystemsMaking Wikidata fit as a Linking Hub for Knowledge Organization Systems
Making Wikidata fit as a Linking Hub for Knowledge Organization SystemsJoachim Neubert
 
Linking authorities through Wikidata
Linking authorities through WikidataLinking authorities through Wikidata
Linking authorities through WikidataJoachim Neubert
 
Wikidata as a linking hub for knowledge organization systems? Integrating an ...
Wikidata as a linking hub for knowledge organization systems? Integrating an ...Wikidata as a linking hub for knowledge organization systems? Integrating an ...
Wikidata as a linking hub for knowledge organization systems? Integrating an ...Joachim Neubert
 
Wikidata as authority linking hub
Wikidata as authority linking hubWikidata as authority linking hub
Wikidata as authority linking hubJoachim Neubert
 
EconBiz Research Dataset (SWIB16 Lightning Talk)
EconBiz Research Dataset (SWIB16 Lightning Talk)EconBiz Research Dataset (SWIB16 Lightning Talk)
EconBiz Research Dataset (SWIB16 Lightning Talk)Joachim Neubert
 
Linked Data Publishing with Drupal (SWIB13 workshop)
Linked Data Publishing with Drupal (SWIB13 workshop)Linked Data Publishing with Drupal (SWIB13 workshop)
Linked Data Publishing with Drupal (SWIB13 workshop)Joachim Neubert
 
Using Web Taxonomies in Drupal
Using Web Taxonomies in DrupalUsing Web Taxonomies in Drupal
Using Web Taxonomies in DrupalJoachim Neubert
 

Mehr von Joachim Neubert (18)

Linking the 20th century paper history to the sum of all knowledge
Linking the 20th century paper history to the sum of all knowledgeLinking the 20th century paper history to the sum of all knowledge
Linking the 20th century paper history to the sum of all knowledge
 
Exploring and mapping the category system of the world‘s largest public press...
Exploring and mapping the category system of the world‘s largest public press...Exploring and mapping the category system of the world‘s largest public press...
Exploring and mapping the category system of the world‘s largest public press...
 
Donating data to Wikidata: First experiences from the „20th Century Press Arc...
Donating data to Wikidata: First experiences from the „20th Century Press Arc...Donating data to Wikidata: First experiences from the „20th Century Press Arc...
Donating data to Wikidata: First experiences from the „20th Century Press Arc...
 
Wikidata (für Archive)
Wikidata (für Archive)Wikidata (für Archive)
Wikidata (für Archive)
 
Wikidata as a hub for the linked data cloud
Wikidata as a hub for the linked data cloudWikidata as a hub for the linked data cloud
Wikidata as a hub for the linked data cloud
 
Wikidata as opportunity for special collections: the 20th Century Press Archi...
Wikidata as opportunity for special collections: the 20th Century Press Archi...Wikidata as opportunity for special collections: the 20th Century Press Archi...
Wikidata as opportunity for special collections: the 20th Century Press Archi...
 
20th Century Press Archives goes Wikidata
20th Century Press Archives goes Wikidata20th Century Press Archives goes Wikidata
20th Century Press Archives goes Wikidata
 
Chancen und Herausforderungen einer komplementären Nutzung von GND und Wikidata
Chancen und Herausforderungen einer komplementären Nutzung von GND und WikidataChancen und Herausforderungen einer komplementären Nutzung von GND und Wikidata
Chancen und Herausforderungen einer komplementären Nutzung von GND und Wikidata
 
Pressemappe 20. Jahrhundert: Personen- und Firmendossiers
Pressemappe 20. Jahrhundert: Personen- und FirmendossiersPressemappe 20. Jahrhundert: Personen- und Firmendossiers
Pressemappe 20. Jahrhundert: Personen- und Firmendossiers
 
20th Century Press Archives goes Wikidata
20th Century Press Archives goes Wikidata20th Century Press Archives goes Wikidata
20th Century Press Archives goes Wikidata
 
Linking Knowledge Organization Systems via Wikidata (DCMI conference 2018)
Linking Knowledge Organization Systems via Wikidata (DCMI conference 2018)Linking Knowledge Organization Systems via Wikidata (DCMI conference 2018)
Linking Knowledge Organization Systems via Wikidata (DCMI conference 2018)
 
Making Wikidata fit as a Linking Hub for Knowledge Organization Systems
Making Wikidata fit as a Linking Hub for Knowledge Organization SystemsMaking Wikidata fit as a Linking Hub for Knowledge Organization Systems
Making Wikidata fit as a Linking Hub for Knowledge Organization Systems
 
Linking authorities through Wikidata
Linking authorities through WikidataLinking authorities through Wikidata
Linking authorities through Wikidata
 
Wikidata as a linking hub for knowledge organization systems? Integrating an ...
Wikidata as a linking hub for knowledge organization systems? Integrating an ...Wikidata as a linking hub for knowledge organization systems? Integrating an ...
Wikidata as a linking hub for knowledge organization systems? Integrating an ...
 
Wikidata as authority linking hub
Wikidata as authority linking hubWikidata as authority linking hub
Wikidata as authority linking hub
 
EconBiz Research Dataset (SWIB16 Lightning Talk)
EconBiz Research Dataset (SWIB16 Lightning Talk)EconBiz Research Dataset (SWIB16 Lightning Talk)
EconBiz Research Dataset (SWIB16 Lightning Talk)
 
Linked Data Publishing with Drupal (SWIB13 workshop)
Linked Data Publishing with Drupal (SWIB13 workshop)Linked Data Publishing with Drupal (SWIB13 workshop)
Linked Data Publishing with Drupal (SWIB13 workshop)
 
Using Web Taxonomies in Drupal
Using Web Taxonomies in DrupalUsing Web Taxonomies in Drupal
Using Web Taxonomies in Drupal
 

Kürzlich hochgeladen

WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024Jan Löffler
 
Zero-day Vulnerabilities
Zero-day VulnerabilitiesZero-day Vulnerabilities
Zero-day Vulnerabilitiesalihassaah1994
 
Presentation2.pptx - JoyPress Wordpress
Presentation2.pptx -  JoyPress WordpressPresentation2.pptx -  JoyPress Wordpress
Presentation2.pptx - JoyPress Wordpressssuser166378
 
Check out the Free Landing Page Hosting in 2024
Check out the Free Landing Page Hosting in 2024Check out the Free Landing Page Hosting in 2024
Check out the Free Landing Page Hosting in 2024Shubham Pant
 
Computer 10 Lesson 8: Building a Website
Computer 10 Lesson 8: Building a WebsiteComputer 10 Lesson 8: Building a Website
Computer 10 Lesson 8: Building a WebsiteMavein
 
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...APNIC
 
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDSTYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDSedrianrheine
 
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASSLESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASSlesteraporado16
 
Bio Medical Waste Management Guideliness 2023 ppt.pptx
Bio Medical Waste Management Guideliness 2023 ppt.pptxBio Medical Waste Management Guideliness 2023 ppt.pptx
Bio Medical Waste Management Guideliness 2023 ppt.pptxnaveenithkrishnan
 
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced HorizonsVision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced HorizonsRoxana Stingu
 
Introduction to ICANN and Fellowship program by Shreedeep Rayamajhi.pdf
Introduction to ICANN and Fellowship program  by Shreedeep Rayamajhi.pdfIntroduction to ICANN and Fellowship program  by Shreedeep Rayamajhi.pdf
Introduction to ICANN and Fellowship program by Shreedeep Rayamajhi.pdfShreedeep Rayamajhi
 
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdfLESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdfmchristianalwyn
 

Kürzlich hochgeladen (12)

WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
 
Zero-day Vulnerabilities
Zero-day VulnerabilitiesZero-day Vulnerabilities
Zero-day Vulnerabilities
 
Presentation2.pptx - JoyPress Wordpress
Presentation2.pptx -  JoyPress WordpressPresentation2.pptx -  JoyPress Wordpress
Presentation2.pptx - JoyPress Wordpress
 
Check out the Free Landing Page Hosting in 2024
Check out the Free Landing Page Hosting in 2024Check out the Free Landing Page Hosting in 2024
Check out the Free Landing Page Hosting in 2024
 
Computer 10 Lesson 8: Building a Website
Computer 10 Lesson 8: Building a WebsiteComputer 10 Lesson 8: Building a Website
Computer 10 Lesson 8: Building a Website
 
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
 
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDSTYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
 
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASSLESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
 
Bio Medical Waste Management Guideliness 2023 ppt.pptx
Bio Medical Waste Management Guideliness 2023 ppt.pptxBio Medical Waste Management Guideliness 2023 ppt.pptx
Bio Medical Waste Management Guideliness 2023 ppt.pptx
 
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced HorizonsVision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
 
Introduction to ICANN and Fellowship program by Shreedeep Rayamajhi.pdf
Introduction to ICANN and Fellowship program  by Shreedeep Rayamajhi.pdfIntroduction to ICANN and Fellowship program  by Shreedeep Rayamajhi.pdf
Introduction to ICANN and Fellowship program by Shreedeep Rayamajhi.pdf
 
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdfLESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
 

Change Tracking in Knowledge Organization Systems with skos-history

  • 1. Change Tracking in Knowledge Organization Systems with skos-history Joachim Neubert & Osma Suominen ZBW – Leibniz Information Centre for Economics, Kiel/Hamburg & The National Library of Finland, Helsinki DCMI/ASIST/AIMS Webinar Series: Generic Tools and Methods for SKOS-based Concept Schemes 16.3.2016
  • 2. Agenda  User questions and requirements  Getting a grip on changes:  Overview  Creating a version store  Generic queries  Dataset-specific adaption of queries  skos-history in use  Application at the National Library of Finland  Application for STW Thesaurus for Economics  Outlook: Future work and the skos-history project Page 2
  • 3. What users want to know … Page 3 … when we publish a new KOS version:  What‘s new?  What has changed?
  • 4. Use cases for extended change information Page 4  Human indexers wanting to learn about new and deprecated concepts  Human indexers (and supporting applications) re-indexing large sets of documents  People maintaining mappings to other vocabularies, and applications supporting them  People maintaining a derived subset of a KOS  Vocabulary-based automatic or semi-automatic indexing applications  Search applications utilizing the KOS
  • 5. Agenda  User questions and requirements  Getting a grip on changes:  Overview  Creating a version store  Generic queries  Dataset-specific adaption of queries  skos-history in use  Application at the National Library of Finland  Application for STW Thesaurus for Economics  Outlook: Future work and the skos-history project Page 5
  • 6. Overview: getting a grip on changes Provided that we have no access to the KOS maintenance system where the changes take place originally, or can’t extend it to report this changes comprehensively. Dataset versioning + skos-history approach => should work on every SKOS vocabulary Page 6
  • 7. Scope of vocabulary versioning  Versioning the concept scheme, not each individual concept  URIs for the concepts remain stable over the different versions  Distinct versions of a vocabulary, or at least timestamped dumps, must be available  Support for a continuous flow of changes, e.g., the LoC Subject Headings, or the concepts of the GND, is currently not provided Page 7
  • 8. Three basic steps to an actionable skos-history Start with one SKOS file per version. 1) Create the deltas - insertions and deletions - between every two version files. (Via a raw diff of sorted ntriples files, or via SPARQL MINUS in a triple store.This gives you thousands and thousands of differences - added or deleted triples -, even excluding bnodes.) 2) Load the version files and the insertions and deletions into a triple store as named graphs. 3) Add metadata about the versions and the deltas in a separate „version history graph“. Page 8 https://github.com/jneubert/skos-history/blob/master/bin/load_versions.sh
  • 9. Agenda  User questions and requirements  Getting a grip on changes:  Overview  Creating a version store  Generic queries  Dataset-specific adaption of queries  skos-history in use  Application at the National Library of Finland  Application for STW Thesaurus for Economics  Outlook: Future work and the skos-history project Page 9
  • 10. Hands on: Create a version store for skos-history Requirements:  SPARQL 1.1 compliant service or repository (‘triple store’), accessible in read/write mode https://github.com/NatLibFi/Skosmos/wiki/InstallTutorial#install-jena-fuseki  An environment for executing bash scripts for the data load script (any Linux should do, Cygwin may). Tutorial: https://github.com/jneubert/skos-history/wiki/Tutorial Code of scripts and queries: also on GitHub Page 10
  • 11. Load a version store: config file for JEL Page 11 Configuration for Fuseki (https://github.com/jneubert/skos-history/blob/master/bin/jel.config); see also configuration for Sesame (https://github.com/jneubert/skos-history/blob/master/bin/jel.sesame.config)
  • 12. Load a version store: load_versions.sh script Page 12
  • 13. Load a version store: load_versions.sh script Page 13
  • 14. Page 14 Example endpoint:http://zbw.eu/beta/sparql/stwv/query Version History Graph, discoverable via fix URI, e.g.: http://zbw.eu/stw/version
  • 15. Version History Graph, published as HTML/RDFa Page 15 http://zbw.eu/stw/version
  • 16. Vocabularies used for the plumbing  dc:/dcterms: Dublin Core, as usual the base for everything  void: http://rdfs.org/ns/void# Vocabulary of interlinked datasets  sd: http://www.w3.org/ns/sparql-service-description# SPARQL service description  delta: http://www.w3.org/2004/delta# Differences between RDF graphs  dsv: http://purl.org/iso25964/DataSet/Versioning# Version history records (providing version identifier and date) and a pointer to the current version – outside the actual version data  sh: http://purl.org/skos-history/ Scheme and concept version deltas Page 16
  • 17. What’s the benefit? A database of all versions of a KOS and all deltas between versions – which can be queried in parallel! Page 17
  • 18. Agenda  User questions and requirements  Getting a grip on changes:  Overview  Creating a version store  Generic queries  Dataset-specific adaption of queries  skos-history in use  Application at the National Library of Finland  Application for STW Thesaurus for Economics  Outlook: Future work and the skos-history project Page 18
  • 20. Newly inserted concepts – results Page 20
  • 21. Reports operating on standard SKOS structures Page 21 https://github.com/jneubert/skos-history/tree/master/sparql
  • 24. New concepts, split from old ones Page 24 Labels moved to added concepts: http://zbw.eu/beta/sparql-lab/?queryRef=https://api.github.com/repos/jneubert/skos-history/contents/sparql/labels_moved_to_added_concepts.rq
  • 25. Change history of a concept: “Personnel selection” Page 25 http://zbw.eu/beta/sparql-lab/?queryRef=https://api.github.com/repos/jneubert/skos-history/contents/sparql/concept_deltas.rq
  • 26. Agenda  User questions and requirements  Getting a grip on changes:  Overview  Creating a version store  Generic queries  Dataset-specific adaption of queries  skos-history in use  Application at the National Library of Finland  Application for STW Thesaurus for Economics  Outlook: Future work and the skos-history project Page 26
  • 27. GND subjects by subject category – query Page 27 https://github.com/jneubert/skos-history/blob/master/sparql/swdskos/added_concepts_by_category.rq
  • 28. GND subjects by subject category – results Page 28
  • 29. STW deprecated concepts – query Page 29 https://github.com/jneubert/skos-history/blob/master/sparql/stw/deprecated_concepts_by_category.rq
  • 30. STW deprecated concepts – result Page 30
  • 31. Agenda  User questions and requirements  Getting a grip on changes:  Overview  Creating a version store  Generic queries  Dataset-specific adaption of queries  skos-history in use  Application at the National Library of Finland  Application for STW Thesaurus for Economics  Outlook: Future work and the skos-history project Page 31
  • 32. skos-history at the National Library of Finland see separate slides at http://tinyurl.com/skos-history-nlf Page 32
  • 33. Agenda  User questions and requirements  Getting a grip on changes:  Overview  Creating a version store  Generic queries  Dataset-specific adaption of queries  skos-history in use  Application at the National Library of Finland  Application for STW Thesaurus for Economics  Outlook: Future work and the skos-history project Page 33
  • 34. STW Thesaurus for Economics  created in the 1990s  on the web and available as SKOS since 2009  bilingual (German/English)  about 6000 descriptors, 500 subject categories  overhaul during the last five years (five consecutive versions) Page 34
  • 35. STW change reports (precompiled query results) Page 35
  • 36. Visualizing change with aggregated data Page 36
  • 38. Drill down from chart to change report Page 38
  • 39. Future work and the skos-history project  Apply to differing concept schemes  Distill general properties useful for human-readable change reports as well as machine-actionable data  Get a grip on clusters of interrelated changes Please consider joining – particularly if  you are in charge of a KOS and want to publish its change history  you are using one or several KOS in an application, or intellectually, and want to trace and re-apply upstream changes  just feel challenged by the task Page 39
  • 40. Page 40 Thanks for listening! Joachim Neubert ZBW – Leibniz Information Centre for Economics j.neubert@zbw.eu Osma Suominen The National Library of Finland osma.suominen@helsinki.fi Project repository: https://github.com/jneubert/skos-history