OKFN Korea
Hackathon Day
2013. 06. 22.
Toward Open Data World
What is linked data, Open data?
Refine
Modelling
Access
Triple Storage
other topics
image: Leo Oosterloo @ flickr.com
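Seoul City Data Enrichment
▪ Goals
  • Design or map ontologies to describe Seoul city data in more detail
  • Structure, add semantics, and link: model Seoul city data (unstructured data) with ontologies and link it to external data
  • English localization: provide Seoul city data that non-Korean speakers can use
▪ Scope
  • About 40 Seoul city datasets
  • Cultural heritage: domestic cultural properties collected from the Cultural Heritage Administration (national treasures, treasures, designated cultural properties, intangible cultural properties, etc.)
▪ Methodology: model the data by reusing existing RDF vocabularies
▪ 1) Data selection: choose the datasets to model from the Seoul Open Data Plaza
▪ 2) Dataset field review: review how each dataset field maps to the DBpedia ontology (classes, properties)
  • DBpedia ontology: contains concepts about things and Wikipedia infobox fields
  • For example, to model 'museum':
    – Select the infobox template for museums on Wikipedia
    – Select the vocabulary that DBpedia maps to the museum infobox
    – Map the vocabulary to the dataset fields
    – Decide whether to model the fields that do not map (including classes and properties): a modelling tool still needs to be chosen
    – Apply the URI scheme (needs to be designed separately)
    – Complete the ontology schema design
▪ 3) Data cleaning
  • Clean the data with Google Refine
  • Work to do before loading into Refine
    – Location data: convert or add location values in the source data (Seoul city)
    – English names: convert and map the Korean names (manual work required)
  • Work to do in Refine
    – Add Korean and English Wikipedia URLs
    – Add DBpedia and Freebase URLs, using Refine reconciliation
    – Build the RDF conversion mapping skeleton
    – Export RDF and Excel
▪ 4) Data upload (RDF or Excel)
  • Choose a data store
  • Jena, 4Store, …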
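A minimal sketch of the RDF conversion step in the workflow above. It assumes a cleaned row with hypothetical column names (id, name_ko, name_en, dbpedia_url) and a placeholder URI base (http://example.org/seoul/museum/), since the deck says the URI scheme still has to be designed. Linking to DBpedia with owl:sameAs is also an assumption; the slides only say that the DBpedia URL is added. The sample row is illustrative, not taken from the Seoul datasets.

```python
# Hypothetical sketch: turn one cleaned spreadsheet row into N-Triples lines.
# BASE and the row's column names are assumptions made for illustration only.
BASE = "http://example.org/seoul/museum/"  # placeholder URI scheme
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DBO_MUSEUM = "http://dbpedia.org/ontology/Museum"  # DBpedia ontology class


def literal(text, lang):
    """Format a language-tagged literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def row_to_ntriples(row):
    """Return the N-Triples lines that describe one museum row."""
    subject = f"<{BASE}{row['id']}>"
    lines = [
        f"{subject} <{RDF_TYPE}> <{DBO_MUSEUM}> .",
        f"{subject} <{RDFS_LABEL}> {literal(row['name_ko'], 'ko')} .",
        f"{subject} <{RDFS_LABEL}> {literal(row['name_en'], 'en')} .",
    ]
    if row.get("dbpedia_url"):  # filled in by reconciliation, may be missing
        lines.append(f"{subject} <{OWL_SAME_AS}> <{row['dbpedia_url']}> .")
    return lines


# Illustrative row; the values are not taken from the source data.
row = {
    "id": "1",
    "name_ko": "국립중앙박물관",
    "name_en": "National Museum of Korea",
    "dbpedia_url": "http://dbpedia.org/resource/National_Museum_of_Korea",
}
ntriples = row_to_ntriples(row)
print("\n".join(ntriples))
```

The output can be saved as an .nt file and loaded into whichever data store is chosen in step 4.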
Contents
1. Modeling Issues
2. Management Issues
Modelling – RDF
Subject        Predicate           Object
some school    has a name/label    some literal
Modelling – RDF
Subject: http://education.data.gov.uk/id/school/401874
Predicate: has a name/label
Object: "Cardiff High School"
Modelling – RDF
Subject: http://education.data.gov.uk/id/school/401874
Predicate: http://www.w3.org/2000/01/rdf-schema#label
Object: "Cardiff High School"
Modelling – RDF
Subject          Predicate     Object
school:401874    rdfs:label    "Cardiff High School"
where
school: = http://education.data.gov.uk/id/school/
rdfs: = http://www.w3.org/2000/01/rdf-schema#
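The prefix shorthand is plain string substitution. A small sketch, using the two prefixes from this slide; the function name expand is made up for the example.

```python
# Prefix table copied from the slide.
PREFIXES = {
    "school": "http://education.data.gov.uk/id/school/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(curie, prefixes=PREFIXES):
    """Expand 'prefix:local' into a full URI; unknown prefixes raise KeyError."""
    prefix, local = curie.split(":", 1)
    return prefixes[prefix] + local


subject = expand("school:401874")
predicate = expand("rdfs:label")
print(subject)    # http://education.data.gov.uk/id/school/401874
print(predicate)  # http://www.w3.org/2000/01/rdf-schema#label
```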
Modelling – RDF
Subject          Predicate                     Object
school:401874    rdfs:label                    "Cardiff High School"
school:401874    ont:districtAdministrative    la:00PT
la:00PT          rdfs:label                    "Cardiff"
As a graph:
school:401874 --rdfs:label--> "Cardiff High School"
school:401874 --ont:districtAdministrative--> la:00PT
la:00PT --rdfs:label--> "Cardiff"
Modelling – RDF
Subject          Predicate                     Object
school:401874    rdfs:label                    "Cardiff High School"
school:401874    ont:districtAdministrative    la:00PT
la:00PT          rdfs:label                    "Cardiff"
la:00PT          rdfs:label                    "Caerdydd"@cy
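To make the table concrete, here is a sketch that holds the four triples as Python tuples and follows the link from the school to its district. Modelling a literal as a (text, language) pair is a choice made for this example, not something RDF prescribes.

```python
# A literal is modelled as a (text, language) pair; language is None when untagged.
triples = [
    ("school:401874", "rdfs:label", ("Cardiff High School", None)),
    ("school:401874", "ont:districtAdministrative", "la:00PT"),
    ("la:00PT", "rdfs:label", ("Cardiff", None)),
    ("la:00PT", "rdfs:label", ("Caerdydd", "cy")),
]


def labels(subject, lang=None):
    """Return the rdfs:label texts of a subject that carry the given language tag."""
    return [
        obj[0]
        for s, p, obj in triples
        if s == subject and p == "rdfs:label" and isinstance(obj, tuple) and obj[1] == lang
    ]


def district_label(school, lang=None):
    """Follow ont:districtAdministrative, then read the district's label."""
    for s, p, o in triples:
        if s == school and p == "ont:districtAdministrative":
            return labels(o, lang)
    return []


print(labels("la:00PT"))                      # ['Cardiff']
print(district_label("school:401874", "cy"))  # ['Caerdydd']
```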
Modelling – vocabularies
Logical modelling
modelling the domain, not a particular data structure
▪ what exists
▪ what is asserted? what can you deduce from that?
▪ not about constraints as such
▪ monotonic, open world
controlled vocabulary, taxonomy, thesaurus, ontology
Modelling – vocabularies
unfamiliar terminology but related to
▪ information architecture and conceptual modelling
▪ domain-driven design
▪ ... and yes knowledge representation
RDF Schema is…
Elements of:
▪ Vocabulary (defining terms)
  • I define a relationship called “prescribed dose.”
▪ Schema (defining types)
  • “prescribed dose” relates “treatments” to “dosages”
▪ Taxonomy (defining hierarchies)
  • Any “doctor” is a “medical professional”
Modelling – RDFS
RDF vocabulary description language
classes, types and type hierarchy
ont:School rdf:type rdfs:Class
ont:School rdfs:label "School"
ont:WelshEstablishment rdf:type rdfs:Class
ont:WelshEstablishment rdfs:subClassOf ont:School
school:401874 rdf:type ont:WelshEstablishment
→ school:401874 rdf:type ont:School (inferred)
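The inference on this slide (an instance of a subclass is also an instance of the superclass) can be sketched as a loop that runs until no new triple appears. This is a toy illustration over tuples, not a full RDFS reasoner.

```python
def infer_types(triples):
    """Add rdf:type triples implied by rdfs:subClassOf until nothing new appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != "rdf:type":
                continue
            for c, q, d in list(inferred):
                if q == "rdfs:subClassOf" and c == o and (s, "rdf:type", d) not in inferred:
                    inferred.add((s, "rdf:type", d))
                    changed = True
    return inferred


asserted = {
    ("ont:School", "rdf:type", "rdfs:Class"),
    ("ont:WelshEstablishment", "rdf:type", "rdfs:Class"),
    ("ont:WelshEstablishment", "rdfs:subClassOf", "ont:School"),
    ("school:401874", "rdf:type", "ont:WelshEstablishment"),
}
closure = infer_types(asserted)
print(closure - asserted)  # {('school:401874', 'rdf:type', 'ont:School')}
```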
Modelling – RDFS
RDF vocabulary description language
properties, property hierarchy
ont:headOf rdf:type rdf:Property
ont:headOf rdfs:subPropertyOf ont:staffAt
person:JoeBloggs ont:headOf school:401874
→ person:JoeBloggs ont:staffAt school:401874 (inferred)
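The property hierarchy works the same way: a statement made with a sub-property also holds for its super-property. A sketch over plain tuples, assuming the reading that ont:headOf is a sub-property of ont:staffAt (every head is also staff).

```python
def infer_superproperties(triples):
    """Apply '(p subPropertyOf q) and (s p o) imply (s q o)' until a fixpoint."""
    facts = set(triples)
    while True:
        hierarchy = {(s, o) for s, p, o in facts if p == "rdfs:subPropertyOf"}
        new = {
            (s, parent, o)
            for s, p, o in facts
            for child, parent in hierarchy
            if p == child
        } - facts
        if not new:
            return facts
        facts |= new


asserted = {
    ("ont:headOf", "rdf:type", "rdf:Property"),
    ("ont:headOf", "rdfs:subPropertyOf", "ont:staffAt"),
    ("person:JoeBloggs", "ont:headOf", "school:401874"),
}
result = infer_superproperties(asserted)
print(result - asserted)  # {('person:JoeBloggs', 'ont:staffAt', 'school:401874')}
```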
Modelling – RDFS
RDF vocabulary description language
class/property relations
▪ domain
▪ range
Already have power to do some vocabulary mapping
▪ declare classes or properties from different vocabularies to be equivalent:
A rdfs:subClassOf B
B rdfs:subClassOf A
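A sketch of both ideas on this slide. The names ex:School and ont:Person, and the domain and range given to ont:staffAt, are hypothetical and only serve the example.

```python
def equivalent_classes(triples):
    """Return pairs (A, B) where each is declared rdfs:subClassOf the other."""
    sub = {(s, o) for s, p, o in triples if p == "rdfs:subClassOf"}
    return {(a, b) for a, b in sub if (b, a) in sub and a < b}


def types_from_domain_range(triples):
    """rdfs:domain types the subject of a property; rdfs:range types its object."""
    domain = {(s, o) for s, p, o in triples if p == "rdfs:domain"}
    range_ = {(s, o) for s, p, o in triples if p == "rdfs:range"}
    out = set()
    for s, p, o in triples:
        out |= {(s, "rdf:type", c) for prop, c in domain if prop == p}
        out |= {(o, "rdf:type", c) for prop, c in range_ if prop == p}
    return out


# Hypothetical data: ex:School stands for a class from another vocabulary.
triples = {
    ("ont:School", "rdfs:subClassOf", "ex:School"),
    ("ex:School", "rdfs:subClassOf", "ont:School"),
    ("ont:staffAt", "rdfs:domain", "ont:Person"),
    ("ont:staffAt", "rdfs:range", "ont:School"),
    ("person:JoeBloggs", "ont:staffAt", "school:401874"),
}
equivalents = equivalent_classes(triples)
typed = types_from_domain_range(triples)
print(equivalents)  # {('ex:School', 'ont:School')}
print(sorted(typed))
```

Note that domain and range infer types rather than reject data, which matches the "not about constraints as such" point made earlier.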
WOL OWL is…
Web Ontology Language
OWL is…
Elements of ontology
▪ Same/different identity
  • “author” and “auteur” are the same relation
  • two resources with the same “ISBN” are the same “book”
▪ More expressive type definitions
  • A “cycle” is a “vehicle” with at least one “wheel”
  • A “bicycle” is a “cycle” with exactly two “wheels”
▪ More expressive relation definitions
  • “sibling” is a symmetric predicate
  • the value of the “favorite dwarf” relation must be one of “happy”, “sleepy”, “sneezy”, “grumpy”, “dopey”, “bashful”, “doc”
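Two of these examples are easy to imitate over plain tuples: the symmetric “sibling” relation, and identity derived from a shared “ISBN”. The ex: names and the ISBN value below are invented for illustration; a real OWL reasoner does far more than this.

```python
from itertools import combinations


def symmetric_closure(triples, symmetric_props):
    """For a symmetric predicate p, (a p b) also implies (b p a)."""
    return set(triples) | {(o, p, s) for s, p, o in triples if p in symmetric_props}


def same_by_key(triples, key_prop):
    """Resources sharing a value for a key property denote the same thing."""
    by_value = {}
    for s, p, o in triples:
        if p == key_prop:
            by_value.setdefault(o, set()).add(s)
    return {
        (a, "owl:sameAs", b)
        for subjects in by_value.values()
        for a, b in combinations(sorted(subjects), 2)
    }


# Hypothetical data for illustration.
data = {
    ("ex:alice", "ex:sibling", "ex:bob"),
    ("ex:book1", "ex:isbn", "9780000000002"),
    ("ex:book2", "ex:isbn", "9780000000002"),
}
with_symmetry = symmetric_closure(data, {"ex:sibling"})
identities = same_by_key(data, "ex:isbn")
print(("ex:bob", "ex:sibling", "ex:alice") in with_symmetry)  # True
print(identities)  # {('ex:book1', 'owl:sameAs', 'ex:book2')}
```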
What can we do with OWL?
Answer questions of
▪ Consistency
  • Are there any contradictions in this model?
▪ Classification
  • What are all the inferred types of this resource?
▪ Satisfiability
  • Are there any classes in this ontology that cannot possibly have any members?
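One narrow case of the satisfiability question can be sketched directly: a class that sits below two classes declared disjoint cannot have members. The class names here are hypothetical, and real reasoners detect many more causes than this single pattern.

```python
def superclasses(cls, sub_class_of):
    """All classes reachable from cls via rdfs:subClassOf, including cls itself."""
    seen, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def unsatisfiable(classes, sub_class_of, disjoint_pairs):
    """Classes that sit below two classes declared disjoint can have no members."""
    bad = []
    for cls in classes:
        ancestors = superclasses(cls, sub_class_of)
        if any(a in ancestors and b in ancestors for a, b in disjoint_pairs):
            bad.append(cls)
    return bad


# Hypothetical ontology: a class below both Bicycle and Aircraft.
sub_class_of = {
    "ex:Bicycle": ["ex:Vehicle"],
    "ex:FlyingBicycle": ["ex:Bicycle", "ex:Aircraft"],
}
disjoint_pairs = [("ex:Bicycle", "ex:Aircraft")]
classes = ["ex:Vehicle", "ex:Bicycle", "ex:Aircraft", "ex:FlyingBicycle"]
problems = unsatisfiable(classes, sub_class_of, disjoint_pairs)
print(problems)  # ['ex:FlyingBicycle']
```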
Building Useful Ontologies
▪ Developing and maintaining quality ontologies is very challenging
▪ Users need tools and services, e.g., to help check if ontology is:
  • Meaningful: all named classes can have instances
  • Correct: captures intuitions of domain experts
  • Minimally redundant: no unintended synonyms
    → Banana split, Banana sundae
http://www.aber.ac.uk/compsci/public/media/presentations/OUCL-seminar.ppt
Modelling – OWL
▪ richer modelling and semantics
▪ axioms on properties
  • transitive, symmetric, inverseOf, ...
  • functional, inverse functional
  • equivalent property
▪ axioms on classes
  • intersection, union, disjoint, equivalent
▪ restrictions on classes
  • some value from, all values from, cardinality, has value, one of, keys
▪ axioms on individuals
  • same as, different from, all different
▪ imports
Modelling – OWL
supports much richer modelling
consistency checking of model
consistency checking of data
▪ some surprises if used to schema languages
▪ open world, no unique name assumption
▪ can extend to closed world checking
inference
▪ classification
▪ inferred relationships
Modelling
Spectrum of goals and styles
Lightweight vocabularies
▪ simple modelling
▪ just enough agreement to get useful work done
▪ removing boundaries to enable information to be found and connected
▪ global consistency not possible
▪ a little semantics goes a long way
Rich ontological models
▪ rich domain models
▪ need expressivity
▪ consistency is critical
▪ make complex inferences you can rely on, across data you trust
▪ knowledge is power
Modelling
Ontology reuse
invest in complete ontology for a domain
▪ rich but general model, may be modular inside
▪ strong "ontological commitment"
▪ e.g. medical ontologies
reuse small, common, vocabularies
▪ FOAF, SIOC, Dublin Core, Org ...
▪ pick and choose classes and properties you need
▪ fill in a few missing links for your domain
generic reusable vocabularies
▪ Data cube vocabulary
Reusable, public ontologies
Measurement Units Ontology
The Event Ontology
FOAF
Schema.org
schema.org is one of a number of microdata vocabularies
it is a shared collection of microdata schemas for use by webmasters
includes a type hierarchy, like an RDFS schema
▪ starts with top-level Thing and DataType types
▪ properties are inherited by descendant types
microdata properties
annotate an item with text-valued properties using the “itemprop” attribute
<div itemscope>
  <p>My name is <span itemprop="name">Daniel</span>.</p>
</div>
<div itemscope>
  <p>Flavors in my favorite ice cream:</p>
  <ul>
    <li itemprop="flavor">Lemon sorbet</li>
    <li itemprop="flavor">Apricot sorbet</li>
  </ul>
</div>
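To show what a consumer of this markup might see, here is a rough sketch that pulls the itemprop name/value pairs out of the HTML above with Python's standard html.parser. It is deliberately simplified: it ignores itemscope boundaries and assumes an itemprop element does not contain a nested element with the same tag name. The class name ItempropParser is made up for the example.

```python
from html.parser import HTMLParser


class ItempropParser(HTMLParser):
    """Collect (property, text) pairs from elements carrying an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._open = []  # stack of [tag, property name, collected text]

    def handle_starttag(self, tag, attrs):
        name = dict(attrs).get("itemprop")
        if name is not None:
            self._open.append([tag, name, ""])

    def handle_data(self, data):
        if self._open:
            self._open[-1][2] += data

    def handle_endtag(self, tag):
        # Simplification: the first matching end tag closes the open property.
        if self._open and self._open[-1][0] == tag:
            _, name, text = self._open.pop()
            self.pairs.append((name, text.strip()))


HTML = """
<div itemscope>
  <p>My name is <span itemprop="name">Daniel</span>.</p>
</div>
<div itemscope>
  <p>Flavors in my favorite ice cream:</p>
  <ul>
    <li itemprop="flavor">Lemon sorbet</li>
    <li itemprop="flavor">Apricot sorbet</li>
  </ul>
</div>
"""

parser = ItempropParser()
parser.feed(HTML)
print(parser.pairs)
# [('name', 'Daniel'), ('flavor', 'Lemon sorbet'), ('flavor', 'Apricot sorbet')]
```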
Why should you use schema.org?
Google
Yahoo
Bing
Top types
Schema.rdfs.org
maintains schema.org ↔ RDF mappings
▪ there are mappings for BIBO, DBpedia, Dublin Core, FOAF, GoodRelations, SIOC, and WordNet
also provides examples, tutorials, and data dumps
Triple Store
Triple Store & RDB
http://blog.gniewoslaw.pl/2012/11/relational-databases-vs-triple-stores/
Storage Solutions for RDF Data
Triple Table (Basic Idea)
▪ Store all RDF triples in a single table
▪ Create indexes on combinations of S, P, and O
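The triple-table idea can be tried with Python's built-in sqlite3 module. This is a sketch of the basic layout only; the table and index names, and the choice of three index orders, are assumptions for the example rather than what any particular triple store does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT NOT NULL, p TEXT NOT NULL, o TEXT NOT NULL)")
# Indexes on different column orders serve different lookup patterns.
conn.execute("CREATE INDEX idx_spo ON triples (s, p, o)")
conn.execute("CREATE INDEX idx_pos ON triples (p, o, s)")
conn.execute("CREATE INDEX idx_osp ON triples (o, s, p)")

conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("school:401874", "rdfs:label", '"Cardiff High School"'),
        ("school:401874", "ont:districtAdministrative", "la:00PT"),
        ("la:00PT", "rdfs:label", '"Cardiff"'),
    ],
)

# "Which label does the school's district have?" becomes a self-join.
district_labels = conn.execute(
    """
    SELECT d.o
    FROM triples AS t
    JOIN triples AS d ON d.s = t.o
    WHERE t.s = ? AND t.p = 'ont:districtAdministrative' AND d.p = 'rdfs:label'
    """,
    ("school:401874",),
).fetchall()
print(district_labels)  # [('"Cardiff"',)]
```

Every extra hop in a graph pattern adds another self-join on the same table, which is why the choice of indexes matters for this layout.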
The Internet Map
http://internet-map.net/
Credits
These slides are partially based on “Linked data and its role in the semantic web” by Dave Reynolds, Epimorphics Ltd.


20130622 okfn hackathon t2

  • 1. OKFN Korea Hackathon Day 2013. 06. 22. Toward Open Data World
  • 2. OKFN Korea2 What is linked data, Open data? Refine Modelling Access Triple Storage other topics image: Leo Oosterloo @ flickr.com
  • 3. 서울시 데이터 Enrichment  목표  서울시 데이터 상세화를 위한 온톨로지 설계 또는 매핑  구조화, 의미화, 그리고 연결: 서울시 데이터 (비정형 데이터)를 온톨로지를 이용해 모델링하고, 외부 데이터와 연결  영문화: 비 한국어권 사용자가 사용할 수 있는 서울시 데이터 제공  범위  서울시 데이터셋 약 40종  문화재: 문화재청에서 수집한 국내 문화재 (국보, 보물, 지정문화재, 무형문화재 등)  방법론: 기존 RDF 어휘의 재사용을 통해 데이터 모델링  1) 데이터 선정: 서울시 열린데이터 광장에서 모델링 대상 데이터셋 선정  2) 데이터 셋 항목 검토: 데이터 셋의 개별 항목과 Dbpedia 온톨로지 (클래스, 속성) 의 매핑 관계 검토 • Dbpedia 온톨로지: 사물에 대한 개념 및 위키피디아 infobox 항목을 포함하고 있음 OKFN Korea3
  • 4. 서울시 데이터 Enrichment  예를 들어, '박물관'을 모델링 할 경우, • 박물관에 대한 infobox 템플릿을 위키피디아에서 선택 • Dbpedia에서 박물관 infobox와 매핑한 어휘 선택 • 어휘와 데이터셋 항목 매핑 • 매핑되지 않는 항목의 모델링 여부 결정 (클래스, 속성 포함): 모델링 도구 결정 필요 • URI 체계 (별도 설계 필요) 적용 • 온톨로지 스키마 설계 완료  3) 데이터 정제 • Google Refine을 통해 데이터 정제 • Refine에서 추가하기 전에 할 작업 • 위치 데이터: 원본 데이터 (서울시)에 위치값을 변환 또는 추가 • 영문명: 한글명의 변환, 매핑 (수작업 필요) • Refine에서 할 작업 – 한글, 영문 위키피디아 URL 추가 – Dbpedia, Freebase URL 추가: Refine reconciliation을 이용해서 추가 – RDF 변환 매핑 Skelton 작업 – RDF, Excel 추출  4) 데이터 업로드 (RDF 또는 Excel)  데이터 스토어 선택  Jena, 4Store, … OKFN Korea4
  • 6. Modelling – RDF Subject Predicate Object
  • 7. Modelling – RDF Subject Predicate Object some school has a name/label some literal
  • 8. Modelling – RDF Subject Predicate Object http://education.data.gov.uk /id/school/401874 has a name/label ―Cardiff High School‖
  • 9. Modelling – RDF Subject Predicate Object http://education.data.gov.uk /id/school/401874 http://www.w3.org/2000/01/ rdf-schema#label ―Cardiff High School‖
  • 10. Modelling – RDF Subject Predicate Object school:401874 rdfs:label ―Cardiff High School‖ where school: = http://education.data.gov.uk/id/school/ rdfs: = http://www.w3.org/2000/01/rdf-schema#
  • 11. Modelling – RDF Subject Predicate Object school:401874 rdfs:label ―Cardiff High School‖ school:401874 ont:districtAdministrative la:00PT la:00PT rdfs:label Cardiff
  • 12. Modelling – RDF Subject Predicate Object school:401874 rdfs:label ―Cardiff High School‖ school:401874 ont:districtAdministrative la:00PT la:00PT rdfs:label ―Cardiff‖ school:401874 ―Cardiff High School‖ ont:districtAdministrative la:00PT ―Cardiff‖ rdfs:label rdfs:label
  • 13. Modelling – RDF Subject Predicate Object school:401874 rdfs:label ―Cardiff High School‖ school:401874 ont:districtAdministrative la:00PT la:00PT rdfs:label ―Cardiff‖ la:00PT rdfs:label ―Caerdydd‖@cy
  • 14. Modelling – vocabularies Logical modelling modelling the domain, not a particular data structure  what exists  what is asserted? what can you deduce from that?  not about constraints as such  monotonic, open world controlled vocabulary taxonomy thesaurus ontology Ontology
  • 15. Modelling – vocabularies unfamiliar terminology but related to  information architecture and conceptual modelling  domain-driven design  ... and yes knowledge representation
  • 16. Elements of:  Vocabulary (defining terms) • I define a relationship called “prescribed dose.”  Schema (defining types) • “prescribed dose” relates “treatments” to “dosagee s”  Taxonomy (defining hierarchies) • Any “doctor” is a “medical professional” 16 RDF Schema is…
  • 17. Modelling – RDFS RDF vocabulary description language classes, types and type hierarchy ont:School rdfs:Class rdf:type ―School‖ rdfs:label
  • 18. Modelling – RDFS RDF vocabulary description language classes, types and type hierarchy ont:WelshEstablishment ont:School rdfs:Class rdf:type rdf:typerdfs:subClassOf ―School‖ rdfs:label
  • 19. Modelling – RDFS RDF vocabulary description language classes, types and type hierarchy school:401874 ont:WelshEstablishment ont:WelshEstablishment ont:School rdfs:Class rdf:typerdf:type rdf:typerdfs:subClassOf ―School‖ rdfs:label
  • 20. Modelling – RDFS RDF vocabulary description language classes, types and type hierarchy school:401874 ont:WelshEstablishment ont:WelshEstablishment ont:School rdfs:Class rdf:typerdf:type rdf:typerdfs:subClassOf school:401874 ont:WelshEstablishment ont:School rdf:type  ―School‖ rdfs:label ―School‖ rdfs:label
  • 21. Modelling – RDFS RDF vocabulary description language properties, property hierarchy school:401874 person:JoeBloggs ont:staffAt ont:headOf rdf:Property ont:headOf rdf:type rdfs:subPropertyOf  school:401874person:JoeBloggs ont:staffAt ont:headOf
  • 22. Modelling – RDFS RDF vocabulary description language class/property relations  domain  range Already have power to do some vocab ulary mapping  declare classes or properties from different vo cabularies to be equivalent: A rdfs:subClassOf B B rdfs:subClassOf A
  • 23. OWL is… Web Ontology Language
  • 24. OWL is… Elements of ontology. Same/different identity: “author” and “auteur” are the same relation; two resources with the same “ISBN” are the same “book”. More expressive type definitions: a “cycle” is a “vehicle” with at least one “wheel”; a “bicycle” is a “cycle” with exactly two “wheels”. More expressive relation definitions: “sibling” is a symmetric predicate; the value of the “favorite dwarf” relation must be one of “happy”, “sleepy”, “sneezy”, “grumpy”, “dopey”, “bashful”, “doc”.
  • 25. What can we do with OWL? Answer questions of: Consistency (are there any contradictions in this model?); Classification (what are all the inferred types of this resource?); Satisfiability (are there any classes in this ontology that cannot possibly have any members?).
  • 26. Building Useful Ontologies. Developing and maintaining quality ontologies is very challenging. Users need tools and services, e.g., to help check if an ontology is: Meaningful (all named classes can have instances). Source: http://www.aber.ac.uk/compsci/public/media/presentations/OUCL-seminar.ppt
  • 27. Building Useful Ontologies. Developing and maintaining quality ontologies is very challenging. Users need tools and services, e.g., to help check if an ontology is: Meaningful (all named classes can have instances); Correct (captures intuitions of domain experts).
  • 28. Building Useful Ontologies. Developing and maintaining quality ontologies is very challenging. Users need tools and services, e.g., to help check if an ontology is: Meaningful (all named classes can have instances); Correct (captures intuitions of domain experts); Minimally redundant (no unintended synonyms, e.g. “Banana split” and “Banana sundae”).
  • 29. Modelling – OWL. Richer modelling and semantics. Axioms on properties: transitive, symmetric, inverseOf, ...; functional, inverse functional; equivalent property. Axioms on classes: intersection, union, disjoint, equivalent. Restrictions on classes: some values from, all values from, cardinality, has value, one of, keys. Axioms on individuals: same as, different from, all different. Imports.
  • 30. Modelling – OWL. Supports much richer modelling: consistency checking of the model; consistency checking of data (some surprises if you are used to schema languages: open world, no unique name assumption; can extend to closed world checking). Inference: classification; inferred relationships.
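
To make the property axioms on slide 29 concrete, here is a small sketch of what symmetric, transitive and inverseOf axioms let a reasoner infer. It is an illustration under stated assumptions rather than a real OWL reasoner: the function `owl_property_closure` and every `ex:` name are hypothetical, and class axioms, restrictions and consistency checking are not covered.

```python
def owl_property_closure(triples, symmetric=(), transitive=(), inverse_of=None):
    """Expand triples using symmetric, transitive and inverseOf property axioms."""
    inverse_of = dict(inverse_of or {})
    # owl:inverseOf works in both directions, so mirror the mapping.
    inverse_of.update({v: k for k, v in list(inverse_of.items())})
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverse_of:
                new.add((o, inverse_of[p], s))
            if p in transitive:
                for s2, p2, o2 in inferred:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= inferred:  # nothing new: fixed point reached
            return inferred
        inferred |= new


# Hypothetical example data; only "sibling is symmetric" comes from the slides.
facts = {
    ("ex:Ann", "ex:sibling", "ex:Bob"),
    ("ex:Seoul", "ex:locatedIn", "ex:Korea"),
    ("ex:Korea", "ex:locatedIn", "ex:Asia"),
    ("ex:Ann", "ex:teaches", "ex:Bob"),
}
expanded = owl_property_closure(
    facts,
    symmetric={"ex:sibling"},
    transitive={"ex:locatedIn"},
    inverse_of={"ex:teaches": "ex:taughtBy"},
)
```

Note that the sketch only ever adds triples. In keeping with the open world assumption on slide 30, the absence of a triple is not treated as evidence that it is false.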
  • 31. Modelling: spectrum of goals and styles. Lightweight vocabularies: simple modelling; just enough agreement to get useful work done; removing boundaries to enable information to be found and connected; global consistency not possible; a little semantics goes a long way. Rich ontological models: rich domain models; need expressivity; consistency is critical; make complex inferences you can rely on, across data you trust; knowledge is power.
  • 32. Modelling: ontology reuse. Invest in a complete ontology for a domain: rich but general model, may be modular inside; strong "ontological commitment"; e.g. medical ontologies. Reuse small, common vocabularies: FOAF, SIOC, Dublin Core, Org ...; pick and choose the classes and properties you need; fill in a few missing links for your domain. Generic reusable vocabularies: Data Cube vocabulary.
  • 33. Reusable, public ontologies: Measurement Units Ontology; The Event Ontology; FOAF.
  • 34. Schema.org: schema.org is one of a number of microdata vocabularies; it is a shared collection of microdata schemas for use by webmasters; includes a type hierarchy, like an RDFS schema (starts with top-level Thing and DataType types; properties are inherited by descendant types).
  • 35. Microdata properties: annotate an item with text-valued properties using the “itemprop” attribute. Example 1: <div itemscope> <p>My name is <span itemprop="name">Daniel</span>.</p> </div> Example 2: <div itemscope> <p>Flavors in my favorite ice cream:</p> <ul> <li itemprop="flavor">Lemon sorbet</li> <li itemprop="flavor">Apricot sorbet</li> </ul> </div>
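
As a rough illustration of how a consumer might read these annotations, the sketch below pulls `itemprop` values out of the two snippets from slide 35 using only Python's standard `html.parser`. The class name `ItempropParser` is hypothetical, and the sketch makes simplifying assumptions: items are not nested, every element has a closing tag (no void elements such as `<img>`), and property values are plain text.

```python
from html.parser import HTMLParser


class ItempropParser(HTMLParser):
    """Collect text-valued itemprop properties, one dict per itemscope element."""

    def __init__(self):
        super().__init__()
        self.items = []      # one {property: [values]} dict per item
        self._depth = 0      # current element nesting depth
        self._prop = None    # (name, depth) of the itemprop being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        self._depth += 1
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self.items.append({})
        if "itemprop" in attrs and self.items and self._prop is None:
            self._prop = (attrs["itemprop"], self._depth)
            self._text = []

    def handle_data(self, data):
        if self._prop is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Close the property when the element that opened it ends.
        if self._prop is not None and self._prop[1] == self._depth:
            value = "".join(self._text).strip()
            self.items[-1].setdefault(self._prop[0], []).append(value)
            self._prop = None
        self._depth -= 1


html = """
<div itemscope>
  <p>My name is <span itemprop="name">Daniel</span>.</p>
</div>
<div itemscope>
  <p>Flavors in my favorite ice cream:</p>
  <ul>
    <li itemprop="flavor">Lemon sorbet</li>
    <li itemprop="flavor">Apricot sorbet</li>
  </ul>
</div>
"""
parser = ItempropParser()
parser.feed(html)
items = parser.items
```

Each item becomes a dictionary mapping a property name to its list of values, so a repeated property such as `flavor` keeps both entries.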
  • 38. Schema.rdfs.org: maintains schema.org ↔ RDF mappings (there are mappings for BIBO, DBpedia, Dublin Core, FOAF, GoodRelations, SIOC, and WordNet); also provides examples, tutorials, and data dumps.
  • 40. Triple Store & RDB. Source: http://blog.gniewoslaw.pl/2012/11/relational-databases-vs-triple-stores/
  • 41. Storage Solutions for RDF Data. Triple Table (basic idea): store all RDF triples in a single table; create indexes on combinations of S, P, and O.
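
The triple-table idea on slide 41 can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch under assumptions, not a description of how Jena, 4Store or any other triple store lays out its data: the table name, index names and the `match` helper are hypothetical, and terms are stored as plain text rather than as dictionary-encoded IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table holds every triple.
conn.execute(
    "CREATE TABLE triples (s TEXT NOT NULL, p TEXT NOT NULL, o TEXT NOT NULL)"
)

# Indexes on different orderings of S, P and O, so that a lookup with any
# combination of bound positions can use an index prefix.
conn.execute("CREATE INDEX idx_spo ON triples (s, p, o)")
conn.execute("CREATE INDEX idx_pos ON triples (p, o, s)")
conn.execute("CREATE INDEX idx_osp ON triples (o, s, p)")

# Example triples from the earlier RDF slides.
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("school:401874", "rdfs:label", "Cardiff High School"),
        ("school:401874", "ont:districtAdministrative", "la:00PT"),
    ],
)


def match(conn, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    clauses, params = [], []
    for column, value in (("s", s), ("p", p), ("o", o)):
        if value is not None:
            clauses.append(f"{column} = ?")
            params.append(value)
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    query = "SELECT s, p, o FROM triples" + where + " ORDER BY s, p, o"
    return conn.execute(query, params).fetchall()


labels = match(conn, p="rdfs:label")
about_school = match(conn, s="school:401874")
```

A single triple pattern maps to one indexed lookup here. Joining several patterns, as a SPARQL query would, means self-joining the same table, which is one reason real stores add further indexes and encodings on top of this basic layout.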
  • 42. The Internet Map. Source: http://internet-map.net/
  • 43. Credits: these slides are partially based on “Linked data and its role in the semantic web” by Dave Reynolds, Epimorphics Ltd.

Editor's Notes

  1. Definition.