SlideShare a Scribd company logo
1 of 19
Download to read offline
Integrating applications & projects
                            =
    Dynamic & repeatable transformation
   of existing Thesauri and Authority lists
                 into SKOS
                            +
 Cross-tabulation of Concepts Linked Data
   Presentation to the Linked Data Meeting
     University College of London, September 14th 2010
by Christophe Dupriez, Destin SSEB, dupriez@destin.be
        working for Belgium Poison Centre
     rue Bruyn 1, B-1120 Brussels (Belgium)
The main request from Users:
Whenever a concept is mentioned,
concise visual clues about:
 • Where it comes from? (e.g. : substances)
 • Where is it also mentioned?
  (e.g.     : in MDs Wiki+paper files)
 • For which role?
  (e.g. This substance, is it a problem or a cure? )
 • About how many times is it mentioned?
  (e.g.            =893 bibliographic records, 203 products,
  2180 calls, 65 medical reports)
+ single click to access any of those when desired.
Use Case : Current Awareness
From a list of subjects and
document types (e.g. reviews, case reports…) :
 • filter remote sources (e.g. PubMed)
 • help index new records with our vocabularies.
         We must manage equivalences
between our thesauri and remote ones (e.g. MeSH)
         (browse, display, validate, update...)
     Our Concept                  MeSH Heading
                        EQ
       & NTs...        EQ~
                                       & NTs...
                       BTM
                       NTM
Use Case: Emergencies
     words              Concepts               Data
           Feedback



1)   Powerful word search
2)   Information to discriminate between Concepts
3)   Identified Concepts = Clues to gather Linked Data
4)   Managing sets of data for different Clues
5)   Browsing Data to discriminate between Hypothesis
      Support MDs to build their recommendations;
          Analysis of Events for Toxico-Vigilance
Benefits of integrating SKOS
  (terminology and concepts management)
in all applications of users' workbench
• Multilingual data and user interface.
• Exhaustivity: Searches retrieves specifics, synonyms,
  translations and equivalent concepts in other “aligned”
  thesauri.
• Precision: precise result for a given concept
• Strongly validated updates; Data entry helped by Auto-
  complete
• Better metadata model, easier to maintain
Benefits of integrating
             Concepts' Usages information
        in all applications of users' workbench
•                               : concept's references enriched
    with statistics and links to places where they are also used.
• Promotes direct linking from a concept to its usages within
  applications.
• Promotes homogenous display and functionalities to create,
  display, update, link, unlink concepts to applications
  elements.
• Usage statistics (and search link) near each mention of a
  concept (passage from one application to another)
• Better metadata model, easier to maintain
1. BIBL application: Articles about Human Toxicology
Internal Thesauri (Subject Vocabularies):
  1) Substances
  2) Living beings (plants, animals, mushrooms...)
  3) Symptoms, Treatments
  4) Places
External Thesauri and Vocabularies:
  1) MeSH
  2) NCBI Taxonomy, SP2000 Catalogue of Life
  3) CAS/EINECS (REACH, ChemID+)
              2. WIKI application (“SAQ”)
Advices from MDs to others about how to manage
situations linked to the different concepts of the internal
thesauri.
3. CASES application
Data about calls received and cases reviewed.
Internal Thesauri already mentioned.
 4. PROD application: Mixtures sold on BE market
Internal Thesaurus: Substances
External Thesauri and Vocabularies:
(development to be undertaken by a network of Poison Centres)
 1)   CAS/EINECS (REACH, ChemID+)
 2)   Product Usage Categories
                5. CONTACT application
Topic specialists and Products' Manufacturers/Distributors
Internal Thesauri:
  1) Subject thesauri already mentioned
  2) Places
6. ASKOSI: Thesauri based Applications' Manager
Integration under ASKOSI umbrella remains to be done for
applications 3. 4. and 5. above.
ASKOSI.org is an open project to create Java tools to
integrate the benefits of terminology / concept
usages management within applications. It is:
  1. A Java Archive (JAR) providing an API aligned
   on SKOS conceptual organisation to access:
   1. Local or remote vocabularies / thesauri
    (being SKOS or not)
   2. Application data linked to Concepts;
  2. A Web Application to browse gathered
   thesauri (and to manage their interrelations)
Integrating ASKOSI with applications
                                                   Users

                                                                                             External Applications


                                                              Internet Browser                          SKOS RDF or XML
Our Java developments                                                                                   + Usage Statistics
                                                                                      HTTP

Java Open Source software that we
adapt to our needs                                                           Apache Tomcat J2EE

                                                                   Shared ASKOSI.JAR (SKOS Schemes Accesses
         Open Source Software                                           + Usage statistics by applications)

                                                           ASKOSI         Previous           DSpace
                                                             Web           Queries                         Apache
                                                           Applicati      and Data                         JspWiki
                                                              on          Navigator




                                                                         JDBC +
                                                                          SQL                     Search Engine
                                                                                                  Apache SolR +
                                                                                                 Lucene adapted
                                    XML, CSV                        SQL                            to our needs
                                    or RDF files                  Database
The ASKOSI JAR
• API aligned on W3C SKOS data structure
   = JavaBeans in-memory data structure
   = XML Structure http://www.askosi.org/ConceptScheme.xsd
 ◦ISO 25964 will be also considered.
• SQL, CSV, RDF and XML data sources
• Accesses can be Dynamic or Static (periodic reload)
  to import the data sources with SKOS goggles
• Usage statistics: which applications are using which
  SKOS concepts, how (roles) and how many times?
• Designed for data sharing: all applications in the
  same Web Application Container (J2EE) access a
  single copy of the data.
Remote Sources  ASKOSI
• Big thesauri: periodic editions of UMLS,
  Agrovoc, Catalogue OfLife (CoL), etc.:
1.   Parameterize ASKOSI for a static SQL source
2.   Load a local MySQL database with
     the new edition of UMLS/Agrovoc/CoL/...
3.   Reload corresponding schemes
• SKOS/RDF/XML Remote Web Services:
1.   Parameterize ASKOSI for load from
     a remote URL + XSLT transformation
2.   (Auto)reload of corresponding schemes
     (“one concept at a time” must be developed)
Internal Sources  ASKOSI
• Local Authority lists:
   Parameterize ASKOSI for a dynamic SQL
     source: ASKOSI gets data up-to-date.
• Legacy applications:
1.    Parameterize ASKOSI for XML file/URL load
      + XSLT transformation if necessary
2.    Regularly generate the XML file
      with local usage data
3.    (Auto)reload of corresponding schemes
• Little lists or small thesauri:
1.    Parameterise ASKOSI for Excel CSV source.
2. (Auto)reload of corresponding schemes
Parameters to “SKOSify”
                      the SQL Data Source
                 for the WindMusic Thesaurus
type=SQL                                     url=jdbc:postgresql://dbserver:5432/dspace
pool=wind                                    driver=org.postgresql.Driver
title-en=Keywords                            username = dspace
title-fr=Mots-clés                           password = xxxxxxxxx
title-de=Stichwörtern                        validation=SELECT 1 #Oracle: SELECT 1 FROM DUAL
title-es=Palabras claves                     IDdc=select … as key, metadata_field_id as value
title-nl=Trefwoord                                  from metadatafieldregistry;
title.lorthes-en=Keywords                    IDhandle=select … as key, resource_id as value from handle;
…
display-en=http:/dspace/handle/68502/[about]
icon-en=/dspace/image/68502/27.gif
create-en=http:/dspace/submit?post=yes&collection={IDhandle@27}&step=0
…
notation.lorthes=SELECT h.handle AS about, i.text_value AS notation
             from item as m, handle as h, metadatavalue as I
             where i.metadata_field_id={IDdc@identifier.loris}
                    … and m.owning_collection={IDhandle@27}
labels=SELECT h.handle AS about, t.text_value AS label, t.text_lang AS lang
          from item as m, handle as h, metadatavalue as t
          where h.resource_type_id=2
                … and t.metadata_field_id={IDdc@title}
               and m.owning_collection={IDhandle@27}
…alternates…broaders…broadmatches…notes…
The ASKOSI Web Application
• Authority lists browsing:
  ◦ Thesauri trees
  ◦ Alphabetical lists
  ◦ Decreasing Usage Frequency
  ◦Powerful word and string search tool
• SKOS Concepts display in different formats / extents
  ◦ Generation of SKOS RDF
• Validations:
  ◦ Data errors
  ◦ Terminology validations (ambiguity, missing translation)
  ◦ Hierarchy validations (loops, siblings)
• Links to applications using the SKOS concepts
    In development: search history manager, changes approval
    workflow, cross thesauri equivalence relations management .
Open Questions to the Community
•   Users navigating the “Linked Data” Web need concise
    visual clues to decide what to do next, knowing what
    is behind each possible click:
    How could we standardize a visual symbols
    system, the road signs of the Linked Data Web
    and its SKOS roundabouts? (proposals next page)
•   What is behind each concept? what is linked?
    How could we standardize different harvesting
    mechanisms for Concepts Usage Data?
    ◦ “Push” vs “Pull” mode

    ◦ Local vs Remote

    ◦ Absolute vs Incremental

    ◦ Results varying with user authorisations or preferences

•   Linking with URIs allows user side or server side integration of
    applications.
    But between users and applications? Within an application?
 
       

      
          Standardising
          Symbols?

          Your opinion about
          ConceptScheme symbols
          below?




          Detailed proposals available at:
          http://www.destin.be/ASKOSI/Wiki.jsp?page=Icons%20for%20SKOS
Call to Collaborations!
1.   We want to integrate a “voting system” for reviewing SKOS / RDF
     statement contributions, including mappings between thesauri.
     Students welcome!
     Full proposed specs on: http://www.askosi.org/maintenance.pdf
2.   Where could we discuss “Roadsigns for the Linked Data Web”?
3.   Where could we discuss “Concepts Usage Data Harvesting”?
4.   Where could we discuss “Concept References encoding (indexing
     chains) within Applications” ?

                                         christophe.dupriez@destin.be
                                  christophe.dupriez@poisoncentre.be

More Related Content

What's hot

Munching & crunching - Lucene index post-processing
Munching & crunching - Lucene index post-processingMunching & crunching - Lucene index post-processing
Munching & crunching - Lucene index post-processingabial
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucenelucenerevolution
 
Cassandra Summit 2014: Apache Spark - The SDK for All Big Data Platforms
Cassandra Summit 2014: Apache Spark - The SDK for All Big Data PlatformsCassandra Summit 2014: Apache Spark - The SDK for All Big Data Platforms
Cassandra Summit 2014: Apache Spark - The SDK for All Big Data PlatformsDataStax Academy
 
Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -Aucfan
 
A Review of Data Access Optimization Techniques in a Distributed Database Man...
A Review of Data Access Optimization Techniques in a Distributed Database Man...A Review of Data Access Optimization Techniques in a Distributed Database Man...
A Review of Data Access Optimization Techniques in a Distributed Database Man...Editor IJCATR
 
Azure Data Factory usage at Aucfanlab
Azure Data Factory usage at AucfanlabAzure Data Factory usage at Aucfanlab
Azure Data Factory usage at AucfanlabAucfan
 
Introduction to Lucidworks Fusion - Alexander Kanarsky, Lucidworks
Introduction to Lucidworks Fusion - Alexander Kanarsky, LucidworksIntroduction to Lucidworks Fusion - Alexander Kanarsky, Lucidworks
Introduction to Lucidworks Fusion - Alexander Kanarsky, LucidworksLucidworks
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sqlRam kumar
 
Comparative study of no sql document, column store databases and evaluation o...
Comparative study of no sql document, column store databases and evaluation o...Comparative study of no sql document, column store databases and evaluation o...
Comparative study of no sql document, column store databases and evaluation o...ijdms
 
Boosting Documents in Solr by Recency, Popularity, and User Preferences
Boosting Documents in Solr by Recency, Popularity, and User PreferencesBoosting Documents in Solr by Recency, Popularity, and User Preferences
Boosting Documents in Solr by Recency, Popularity, and User PreferencesLucidworks (Archived)
 
Introduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesIntroduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesRahul Jain
 
Apache Lucene: Searching the Web and Everything Else (Jazoon07)
Apache Lucene: Searching the Web and Everything Else (Jazoon07)Apache Lucene: Searching the Web and Everything Else (Jazoon07)
Apache Lucene: Searching the Web and Everything Else (Jazoon07)dnaber
 

What's hot (20)

Munching & crunching - Lucene index post-processing
Munching & crunching - Lucene index post-processingMunching & crunching - Lucene index post-processing
Munching & crunching - Lucene index post-processing
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucene
 
Cassandra Summit 2014: Apache Spark - The SDK for All Big Data Platforms
Cassandra Summit 2014: Apache Spark - The SDK for All Big Data PlatformsCassandra Summit 2014: Apache Spark - The SDK for All Big Data Platforms
Cassandra Summit 2014: Apache Spark - The SDK for All Big Data Platforms
 
SamBO
SamBOSamBO
SamBO
 
Unit 3 MongDB
Unit 3 MongDBUnit 3 MongDB
Unit 3 MongDB
 
SamKK
SamKKSamKK
SamKK
 
elasticsearch
elasticsearchelasticsearch
elasticsearch
 
Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -
 
Artigo no sql x relational
Artigo no sql x relationalArtigo no sql x relational
Artigo no sql x relational
 
Lucene basics
Lucene basicsLucene basics
Lucene basics
 
Apache lucene
Apache luceneApache lucene
Apache lucene
 
A Review of Data Access Optimization Techniques in a Distributed Database Man...
A Review of Data Access Optimization Techniques in a Distributed Database Man...A Review of Data Access Optimization Techniques in a Distributed Database Man...
A Review of Data Access Optimization Techniques in a Distributed Database Man...
 
Mongo db
Mongo dbMongo db
Mongo db
 
Azure Data Factory usage at Aucfanlab
Azure Data Factory usage at AucfanlabAzure Data Factory usage at Aucfanlab
Azure Data Factory usage at Aucfanlab
 
Introduction to Lucidworks Fusion - Alexander Kanarsky, Lucidworks
Introduction to Lucidworks Fusion - Alexander Kanarsky, LucidworksIntroduction to Lucidworks Fusion - Alexander Kanarsky, Lucidworks
Introduction to Lucidworks Fusion - Alexander Kanarsky, Lucidworks
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sql
 
Comparative study of no sql document, column store databases and evaluation o...
Comparative study of no sql document, column store databases and evaluation o...Comparative study of no sql document, column store databases and evaluation o...
Comparative study of no sql document, column store databases and evaluation o...
 
Boosting Documents in Solr by Recency, Popularity, and User Preferences
Boosting Documents in Solr by Recency, Popularity, and User PreferencesBoosting Documents in Solr by Recency, Popularity, and User Preferences
Boosting Documents in Solr by Recency, Popularity, and User Preferences
 
Introduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesIntroduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and Usecases
 
Apache Lucene: Searching the Web and Everything Else (Jazoon07)
Apache Lucene: Searching the Web and Everything Else (Jazoon07)Apache Lucene: Searching the Web and Everything Else (Jazoon07)
Apache Lucene: Searching the Web and Everything Else (Jazoon07)
 

Viewers also liked

AKUINO Modular Electronic Systems for Remote Measure and Control
AKUINO Modular Electronic Systems for Remote Measure and ControlAKUINO Modular Electronic Systems for Remote Measure and Control
AKUINO Modular Electronic Systems for Remote Measure and ControlDESTIN-Informatique.com
 
Poesía em
Poesía emPoesía em
Poesía emEVT
 
AKUINO Modular Electronic Systems for Remote Measure and Control
AKUINO Modular Electronic Systems for Remote Measure and ControlAKUINO Modular Electronic Systems for Remote Measure and Control
AKUINO Modular Electronic Systems for Remote Measure and ControlDESTIN-Informatique.com
 
Poesía medieval
Poesía medievalPoesía medieval
Poesía medievalEVT
 
Reuters: Pictures of the Year 2016 (Part 2)
Reuters: Pictures of the Year 2016 (Part 2)Reuters: Pictures of the Year 2016 (Part 2)
Reuters: Pictures of the Year 2016 (Part 2)maditabalnco
 
The Outcome Economy
The Outcome EconomyThe Outcome Economy
The Outcome EconomyHelge Tennø
 
The Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post FormatsThe Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post FormatsBarry Feldman
 

Viewers also liked (8)

Publishing Linked Data using Schema.org
Publishing Linked Data using Schema.orgPublishing Linked Data using Schema.org
Publishing Linked Data using Schema.org
 
AKUINO Modular Electronic Systems for Remote Measure and Control
AKUINO Modular Electronic Systems for Remote Measure and ControlAKUINO Modular Electronic Systems for Remote Measure and Control
AKUINO Modular Electronic Systems for Remote Measure and Control
 
Poesía em
Poesía emPoesía em
Poesía em
 
AKUINO Modular Electronic Systems for Remote Measure and Control
AKUINO Modular Electronic Systems for Remote Measure and ControlAKUINO Modular Electronic Systems for Remote Measure and Control
AKUINO Modular Electronic Systems for Remote Measure and Control
 
Poesía medieval
Poesía medievalPoesía medieval
Poesía medieval
 
Reuters: Pictures of the Year 2016 (Part 2)
Reuters: Pictures of the Year 2016 (Part 2)Reuters: Pictures of the Year 2016 (Part 2)
Reuters: Pictures of the Year 2016 (Part 2)
 
The Outcome Economy
The Outcome EconomyThe Outcome Economy
The Outcome Economy
 
The Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post FormatsThe Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post Formats
 

Similar to Integrating Concepts and Usage Data

Elastic search overview
Elastic search overviewElastic search overview
Elastic search overviewABC Talks
 
State of the Semantic Web
State of the Semantic WebState of the Semantic Web
State of the Semantic WebIvan Herman
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFAmazon Web Services
 
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in DataverseClariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataversevty
 
Streaming Solutions for Real time problems
Streaming Solutions for Real time problemsStreaming Solutions for Real time problems
Streaming Solutions for Real time problemsAbhishek Gupta
 
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...Jennifer Bowen
 
WOTS2E: A Search Engine for a Semantic Web of Things
WOTS2E: A Search Engine for a Semantic Web of ThingsWOTS2E: A Search Engine for a Semantic Web of Things
WOTS2E: A Search Engine for a Semantic Web of ThingsAndreas Kamilaris
 
Overview AG AKSW
Overview AG AKSWOverview AG AKSW
Overview AG AKSWSören Auer
 
An architecture for federated data discovery and lineage over on-prem datasou...
An architecture for federated data discovery and lineage over on-prem datasou...An architecture for federated data discovery and lineage over on-prem datasou...
An architecture for federated data discovery and lineage over on-prem datasou...DataWorks Summit
 
Introduction to the Semantic Web
Introduction to the Semantic WebIntroduction to the Semantic Web
Introduction to the Semantic WebNuxeo
 
Sem tech 2011 v8
Sem tech 2011 v8Sem tech 2011 v8
Sem tech 2011 v8dallemang
 
Explore Elasticsearch and Why It’s Worth Using
Explore Elasticsearch and Why It’s Worth UsingExplore Elasticsearch and Why It’s Worth Using
Explore Elasticsearch and Why It’s Worth UsingInexture Solutions
 
Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Dr. Haxel Consult
 

Similar to Integrating Concepts and Usage Data (20)

Elastic search overview
Elastic search overviewElastic search overview
Elastic search overview
 
963
963963
963
 
Resume of Min Xu
Resume of Min XuResume of Min Xu
Resume of Min Xu
 
State of the Semantic Web
State of the Semantic WebState of the Semantic Web
State of the Semantic Web
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SF
 
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in DataverseClariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
 
Using Data Lakes
Using Data Lakes Using Data Lakes
Using Data Lakes
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
The Power of Elasticsearch
The Power of ElasticsearchThe Power of Elasticsearch
The Power of Elasticsearch
 
Streaming Solutions for Real time problems
Streaming Solutions for Real time problemsStreaming Solutions for Real time problems
Streaming Solutions for Real time problems
 
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...
 
WOTS2E: A Search Engine for a Semantic Web of Things
WOTS2E: A Search Engine for a Semantic Web of ThingsWOTS2E: A Search Engine for a Semantic Web of Things
WOTS2E: A Search Engine for a Semantic Web of Things
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
 
Overview AG AKSW
Overview AG AKSWOverview AG AKSW
Overview AG AKSW
 
An architecture for federated data discovery and lineage over on-prem datasou...
An architecture for federated data discovery and lineage over on-prem datasou...An architecture for federated data discovery and lineage over on-prem datasou...
An architecture for federated data discovery and lineage over on-prem datasou...
 
Big Data , Big Problem?
Big Data , Big Problem?Big Data , Big Problem?
Big Data , Big Problem?
 
Introduction to the Semantic Web
Introduction to the Semantic WebIntroduction to the Semantic Web
Introduction to the Semantic Web
 
Sem tech 2011 v8
Sem tech 2011 v8Sem tech 2011 v8
Sem tech 2011 v8
 
Explore Elasticsearch and Why It’s Worth Using
Explore Elasticsearch and Why It’s Worth UsingExplore Elasticsearch and Why It’s Worth Using
Explore Elasticsearch and Why It’s Worth Using
 
Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Linked Open Data in the World of Patents
Linked Open Data in the World of Patents
 

Recently uploaded

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 

Recently uploaded (20)

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 

Integrating Concepts and Usage Data

  • 1. Integrating applications & projects = Dynamic & repeatable transformation of existing Thesauri and Authority lists into SKOS + Cross-tabulation of Concepts Linked Data Presentation to the Linked Data Meeting University College of London, September 14th 2010 by Christophe Dupriez, Destin SSEB, dupriez@destin.be working for Belgium Poison Centre rue Bruyn 1, B-1120 Brussels (Belgium)
  • 2. The main request from Users: Whenever a concept is mentioned, concise visual clues about: • Where it comes from? (e.g. : substances) • Where is it also mentioned? (e.g. : in MDs Wiki+paper files) • For which role? (e.g. This substance, is it a problem or a cure? ) • About how many times is it mentioned? (e.g. =893 bibliographic records, 203 products, 2180 calls, 65 medical reports) + single click to access any of those when desired.
  • 3. Use Case : Current Awareness From a list of subjects and document types (e.g. reviews, case reports…) : • filter remote sources (e.g. PubMed) • help index new records with our vocabularies. We must manage equivalences between our thesauri and remote ones (e.g. MeSH) (browse, display, validate, update...) Our Concept MeSH Heading EQ & NTs... EQ~ & NTs... BTM NTM
  • 4. Use Case: Emergencies words Concepts Data Feedback 1) Powerful word search 2) Information to discriminate between Concepts 3) Identified Concepts = Clues to gather Linked Data 4) Managing sets of data for different Clues 5) Browsing Data to discriminate between Hypothesis Support MDs to build their recommendations; Analysis of Events for Toxico-Vigilance
  • 5. Benefits of integrating SKOS (terminology and concepts management) in all applications of users' workbench • Multilingual data and user interface. • Exhaustivity: Searches retrieves specifics, synonyms, translations and equivalent concepts in other “aligned” thesauri. • Precision: precise result for a given concept • Strongly validated updates; Data entry helped by Auto- complete • Better metadata model, easier to maintain
  • 6. Benefits of integrating Concepts' Usages information in all applications of users' workbench • : concept's references enriched with statistics and links to places where they are also used. • Promotes direct linking from a concept to its usages within applications. • Promotes homogenous display and functionalities to create, display, update, link, unlink concepts to applications elements. • Usage statistics (and search link) near each mention of a concept (passage from one application to another) • Better metadata model, easier to maintain
  • 7. 1. BIBL application: Articles about Human Toxicology Internal Thesauri (Subject Vocabularies): 1) Substances 2) Living beings (plants, animals, mushrooms...) 3) Symptoms, Treatments 4) Places External Thesauri and Vocabularies: 1) MeSH 2) NCBI Taxonomy, SP2000 Catalogue of Life 3) CAS/EINECS (REACH, ChemID+) 2. WIKI application (“SAQ”) Advices from MDs to others about how to manage situations linked to the different concepts of the internal thesauri.
  • 8. 3. CASES application Data about calls received and cases reviewed. Internal Thesauri already mentioned. 4. PROD application: Mixtures sold on BE market Internal Thesaurus: Substances External Thesauri and Vocabularies: (development to be undertaken by a network of Poison Centres) 1) CAS/EINECS (REACH, ChemID+) 2) Product Usage Categories 5. CONTACT application Topic specialists and Products' Manufacturers/Distributors Internal Thesauri: 1) Subject thesauri already mentioned 2) Places
  • 9. 6. ASKOSI: Thesauri based Applications' Manager Integration under ASKOSI umbrella remains to be done for applications 3. 4. and 5. above. ASKOSI.org is an open project to create Java tools to integrate the benefits of terminology / concept usages management within applications. It is: 1. A Java Archive (JAR) providing an API aligned on SKOS conceptual organisation to access: 1. Local or remote vocabularies / thesauri (being SKOS or not) 2. Application data linked to Concepts; 2. A Web Application to browse gathered thesauri (and to manage their interrelations)
  • 10. Integrating ASKOSI with applications Users External Applications Internet Browser SKOS RDF or XML Our Java developments + Usage Statistics HTTP Java Open Source software that we adapt to our needs Apache Tomcat J2EE Shared ASKOSI.JAR (SKOS Schemes Accesses Open Source Software + Usage statistics by applications) ASKOSI Previous DSpace Web Queries Apache Applicati and Data JspWiki on Navigator JDBC + SQL Search Engine Apache SolR + Lucene adapted XML, CSV SQL to our needs or RDF files Database
  • 11. The ASKOSI JAR • API aligned on W3C SKOS data structure = JavaBeans in-memory data structure = XML Structure http://www.askosi.org/ConceptScheme.xsd ◦ISO 25964 will be also considered. • SQL, CSV, RDF and XML data sources • Accesses can be Dynamic or Static (periodic reload) to import the data sources with SKOS goggles • Usage statistics: which applications are using which SKOS concepts, how (roles) and how many times? • Designed for data sharing: all applications in the same Web Application Container (J2EE) access a single copy of the data.
  • 12. Remote Sources  ASKOSI • Big thesauri: periodic editions of UMLS, Agrovoc, Catalogue OfLife (CoL), etc.: 1. Parameterize ASKOSI for a static SQL source 2. Load a local MySQL database with the new edition of UMLS/Agrovoc/CoL/... 3. Reload corresponding schemes • SKOS/RDF/XML Remote Web Services: 1. Parameterize ASKOSI for load from a remote URL + XSLT transformation 2. (Auto)reload of corresponding schemes (“one concept at a time” must be developed)
  • 13. Internal Sources  ASKOSI • Local Authority lists: Parameterize ASKOSI for a dynamic SQL source: ASKOSI gets data up-to-date. • Legacy applications: 1. Parameterize ASKOSI for XML file/URL load + XSLT transformation if necessary 2. Regularly generate the XML file with local usage data 3. (Auto)reload of corresponding schemes • Little lists or small thesauri: 1. Parameterise ASKOSI for Excel CSV source. 2. (Auto)reload of corresponding schemes
  • 14. Parameters to “SKOSify” the SQL Data Source for the WindMusic Thesaurus type=SQL url=jdbc:postgresql://dbserver:5432/dspace pool=wind driver=org.postgresql.Driver title-en=Keywords username = dspace title-fr=Mots-clés password = xxxxxxxxx title-de=Stichwörtern validation=SELECT 1 #Oracle: SELECT 1 FROM DUAL title-es=Palabras claves IDdc=select … as key, metadata_field_id as value title-nl=Trefwoord from metadatafieldregistry; title.lorthes-en=Keywords IDhandle=select … as key, resource_id as value from handle; … display-en=http:/dspace/handle/68502/[about] icon-en=/dspace/image/68502/27.gif create-en=http:/dspace/submit?post=yes&collection={IDhandle@27}&step=0 … notation.lorthes=SELECT h.handle AS about, i.text_value AS notation from item as m, handle as h, metadatavalue as I where i.metadata_field_id={IDdc@identifier.loris} … and m.owning_collection={IDhandle@27} labels=SELECT h.handle AS about, t.text_value AS label, t.text_lang AS lang from item as m, handle as h, metadatavalue as t where h.resource_type_id=2 … and t.metadata_field_id={IDdc@title} and m.owning_collection={IDhandle@27} …alternates…broaders…broadmatches…notes…
  • 15. The ASKOSI Web Application • Authority lists browsing: ◦ Thesauri trees ◦ Alphabetical lists ◦ Decreasing Usage Frequency ◦Powerful word and string search tool • SKOS Concepts display in different formats / extents ◦ Generation of SKOS RDF • Validations: ◦ Data errors ◦ Terminology validations (ambiguity, missing translation) ◦ Hierarchy validations (loops, siblings) • Links to applications using the SKOS concepts In development: search history manager, changes approval workflow, cross thesauri equivalence relations management .
  • 16.
  • 17. Open Questions to the Community • Users navigating the “Linked Data” Web need concise visual clues to decide what to do next, knowing what is behind each possible click: How could we standardize a visual symbols system, the road signs of the Linked Data Web and its SKOS roundabouts? (proposals next page) • What is behind each concept? what is linked? How could we standardize different harvesting mechanisms for Concepts Usage Data? ◦ “Push” vs “Pull” mode ◦ Local vs Remote ◦ Absolute vs Incremental ◦ Results varying with user authorisations or preferences • Linking with URIs allows user side or server side integration of applications. But between users and applications? Within an application?
  • 18.          Standardising Symbols? Your opinion about ConceptScheme symbols below? Detailed proposals available at: http://www.destin.be/ASKOSI/Wiki.jsp?page=Icons%20for%20SKOS
  • 19. Call to Collaborations! 1. We want to integrate a “voting system” for reviewing SKOS / RDF statement contributions, including mappings between thesauri. Students welcome! Full proposed specs on: http://www.askosi.org/maintenance.pdf 2. Where could we discuss “Roadsigns for the Linked Data Web”? 3. Where could we discuss “Concepts Usage Data Harvesting”? 4. Where could we discuss “Concept References encoding (indexing chains) within Applications” ? christophe.dupriez@destin.be christophe.dupriez@poisoncentre.be