Andrea Wei-Ching Huang
Institute of Information Science, Academia Sinica, Taipei, Taiwan
June 1, 2015 @ IIS R101
1. Why & What
2. Semantic Enrichment
3. What if : Digital Archives Taiwan
A Preliminary Study on Wikipedia : DBpedia : Wikidata
Wikipedia-centric KBs
WHY WE CARE ?
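(Diagram: the Wikipedia-centric knowledge bases and the Digital Archives Thesaurus (數位典藏索引典), with edges labeled is-a, sister-project, will-use, data-extraction, and supports; Google Knowledge Graph + Knowledge Vault also appear.)
Collaborative Knowledge Bases
(Diagram edge labels: data-extraction, data-extraction, data-transferring)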
Knowledge Bases: Timeline
(Figure: timeline with the years 2001, 2002, 2007, 2008, 2010, 2012, 2013, 2014, and 2015 marked; labels shown: DBpedia 1.0, Ontology, Infobox Extraction, Spotlight, DBpedia 3.9, Knowledge Graph, Knowledge Vault.)
Collaboratively-generated, semi-structured
information is made up of content which is
(a) semantified,
(b) wide-coverage,
(c) up-to-date,
(d) multilingual,
(e) free in nature.
Hovy et al., Collaboratively built semi-structured content and Artificial
Intelligence: The story so far, Artificial Intelligence (2012)
Source: Arnold, P., & Rahm, E. (2015). Automatic Extraction of Semantic Relations from Wikipedia.
International Journal on Artificial Intelligence Tools, 24(2), 1540010.
Basic Information Table
Columns: Wikipedia | DBpedia | Wikidata
Website: wikipedia.org | dbpedia.org | www.wikidata.org
Release Time: January 15, 2001 | 23 January 2007 | 30 October 2012
Description: "the free encyclopedia" | "the Semantic Web mirror of Wikipedia" | "Wikipedia for data"
Host: Wikimedia Foundation, Inc. | University of Leipzig; University of Mannheim; OpenLink Software | Wikimedia Foundation, Inc.
Creators: Jimmy Wales, Larry Sanger | - | Wikimedia community
Data mainly from: Wikipedians | Wikipedia | Wikipedia and sister projects
Generation method: manual/community-created | automatic/semi-automatic | semi-automatic; manual/community-created
Advantage: Free text; ease of access and contribution | LOD hub; semantic coverage & depth | Quality (accuracy): URI / provenance / contextual representation
Operation: MediaWiki | Virtuoso Universal Server | MediaWiki extension: Wikibase
URI/IRI Schemes: (language).wikipedia.org/wiki/Name | Wikipedia-like IRIs: (language).dbpedia.org/resource/Name | Language-independent number IDs: http://wikidata.org/wiki/Qxxx or Pxxx
Data structure: Mostly unstructured text; semi-structured: infobox, category… | RDF; Named Graphs; DBpedia ontology (dynamic structure) | Wikibase data model; Wikibase system ontology; Wikidata:WikiProject Ontology
Data Access: Wikipedia data dumps; free text search; MediaWiki API | DBpedia dumps; SPARQL endpoint; DBpedia Spotlight (annotating mentions of DBpedia resources in text) | RDF dumps; Wikidata Query (WDQ); Wikidata API
License: CC Attribution / Share-Alike 3.0, with text dual-licensed under GFDL; media licensing varies | GNU General Public License | CC0 1.0
Language(s) Support: 288 (as of May 2015) | 111 (extraction of Wikipedia); 119 (see DBpedia dumps); >125 (as of Mar. 2015); 27 (DBpedia Ontology) | 358 (as of Aug. 2014)
Mutual Relation: Wikipedia:Wikidata | DBpedia:Wikidata | Wikidata:DBpedia
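The URI/IRI schemes listed in the table can be illustrated with a minimal sketch. The function names are hypothetical. Two details are assumptions not stated in the table: the English DBpedia edition uses the bare dbpedia.org host, and Wikidata property pages sit under the "Property:" namespace.

```python
from urllib.parse import quote


def wikipedia_url(name: str, lang: str = "en") -> str:
    """Article page: (language).wikipedia.org/wiki/Name."""
    return f"http://{lang}.wikipedia.org/wiki/{quote(name.replace(' ', '_'))}"


def dbpedia_iri(name: str, lang: str = "en") -> str:
    """Wikipedia-like IRI: (language).dbpedia.org/resource/Name.

    Assumption: the English edition uses the bare dbpedia.org host.
    """
    host = "dbpedia.org" if lang == "en" else f"{lang}.dbpedia.org"
    return f"http://{host}/resource/{quote(name.replace(' ', '_'))}"


def wikidata_url(entity_id: str) -> str:
    """Language-independent ID: Q... for items, P... for properties."""
    if not (entity_id[:1] in ("Q", "P") and entity_id[1:].isdigit()):
        raise ValueError(f"not a Wikidata ID: {entity_id!r}")
    # Assumption: property pages live in the "Property:" namespace.
    prefix = "Property:" if entity_id.startswith("P") else ""
    return f"http://www.wikidata.org/wiki/{prefix}{entity_id}"


print(wikipedia_url("Academia Sinica"))
print(dbpedia_iri("Academia Sinica"))
print(wikidata_url("Q337266"))
```

The same article name yields parallel Wikipedia and DBpedia addresses, while the Wikidata address carries no language or label at all.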
1 Arts and culture
1.1 Award
1.2 Book
1.3 Comic book
1.4 Fictional character
1.5 Fictional element
1.6 Film
1.7 Game
1.8 Language
1.8.1 Styles
1.8.2 Other language
1.9 Music
1.10 Publishing
1.11 Radio
1.12 Television
1.13 Other arts and culture
2 Geography and place
2.1 Geography
2.2 Place
2.3 Buildings and structures
2.3.1 Entertainment venues and structures
2.3.2 Historic sites and structures
2.3.3 Other buildings and structures
3 Health and fitness
3.1 Medicine
3.2 Other health and fitness
4 History and events
4.1 Event
4.2 History
5 Mathematics and abstraction
6 Person
6.1 Religious person
6.2 Royalty and nobility
6.3 Sportsperson
6.3.1 American football person
6.3.2 Baseball person
6.3.3 Basketball person
6.3.4 Motorsports person
6.3.5 Other sportsperson
6.4 Other person
7 Religion and belief
7.1 Religious building
7.2 Other religion
8 Science and nature
8.1 Biology
8.1.1 Botany
8.1.2 Animal
8.1.3 Other biology
8.2 Astronomy
8.2.1 Spaceflight
8.2.2 Other astronomy
8.3 Geology
8.4 Weather
8.5 Other science and nature
9 Society and social science
9.1 Business and economics
9.2 Education
9.3 Food and drinks
9.4 Law
9.5 Military and war
9.6 Numismatics
9.7 Organization
9.8 Politics and government
9.8.1 Cabinet
9.8.2 Constituency
9.8.3 Legislature
9.8.4 Party
9.9 Other politics and government
9.10 Transport
9.10.1 Air transport
9.10.2 Automotive
9.10.3 Highway and street
9.10.4 Public transport
9.10.5 Rail transport
9.10.6 Water transport
9.10.7 Other transport
9.11 Sports
9.11.1 American football
9.11.2 Association football (soccer)
9.11.3 Athletics
9.11.4 Australian rules football
9.11.5 Canadian football
9.11.6 Badminton
9.11.7 Baseball
9.11.8 Basketball
9.11.9 Boxing
9.11.10 Cricket
9.11.11 Curling
9.11.12 Cycling
9.11.13 Field hockey
9.11.14 Figure skating
9.11.15 Floorball
9.11.16 Gaelic games
9.11.17 Golf
9.11.18 Handball
9.11.19 Horse racing
9.11.20 Ice hockey
9.11.21 Lacrosse
9.11.22 Martial arts
9.11.23 Motorsports
9.11.24 Multi-sport competition
9.11.25 Netball
9.11.26 Roller hockey
9.11.27 Rowing
9.11.28 Rugby league
9.11.29 Rugby union
9.11.30 Sailing
9.11.31 Skiing
9.11.32 Softball
9.11.33 Squash
9.11.34 Swimming
9.11.35 Tennis
9.11.36 Volleyball
9.11.37 Wrestling
9.11.38 Other sports
9.12 Other society and social sciences
10 Technology and applied science
10.1 Computing
10.1.1 Hardware
10.1.2 Software
10.1.3 Other computing
10.2 Photography
10.3 Other technology
11 Other
11.1 Shimming
11.2 Parent templates
11.3 Internal use
11.4 Not infoboxes
11.4.1 Subtemplates
11.4.2 Cleanup
11.4.3 Documentation
11.5 Pre-filled
11.6 Unsorted
12 See also
13 External links
http://en.wikipedia.org/wiki/Wikipedia:List_of_infoboxes
Wikipedia InfoBoxes
A
► Agriculture (43 C, 189 P)
► Architecture (41 C, 90 P)
▼ Arts (36 C, 72 P)
▼ Arts by culture (7 C)
► Artists by culture (11 C, 1 P)
► Celtic art (4 C, 62 P)
► Cinema by culture (16 C, 2 P)
▼ Painting by culture (12 C)
► Paintings by nationality (36 C)
► Ancient Greek pottery (7 C, 19 P)
► Bangladeshi painting (1 C, 2 P)
► Brazilian painting (1 C, 3 P)
▼ Chinese painting (10 C, 30 P)
► Art movements in Chinese painting (5 P)
► Banhua (1 P)
► Chinese ink brush (6 P)
► Ming dynasty painting (1 C, 4 P)
▼ Chinese painters (42 C, 3 P)
► Painters from Anhui (1 C, 13 P)
► Painters from Beijing (1 C, 10 P)
► Chinese landscape painters (8 C, 1 P)
► Painters from Chongqing (1 C)
► Five Dynasties and Ten Kingdoms painters (5 C)
► Painters from Fujian (1 C, 11 P)
► Painters from Gansu (1 C, 1 P)
► Painters from Guangdong (1 C, 12 P)
► Painters from Guangxi (1 C, 1 P)
► Painters from Guizhou (1 C)
► Painters from Hebei (1 C, 1 P)
► Painters from Heilongjiang (1 C)
► Painters from Henan (1 C, 14 P)
► Hong Kong painters (6 P)
► Painters from Hubei (1 C, 3 P)
► Painters from Hunan (1 C, 3 P)
► Painters from Jiangsu (1 C, 75 P)
► Painters from Jiangxi (1 C, 12 P)
► Painters from Jilin (1 C)
► Jin dynasty (1115–1234) painters (1 P)
► Jin dynasty (265–420) painters (2 P)
► Painters from Liaoning (1 C)
► Ming dynasty painters (1 C, 38 P)
► People's Republic of China painters (24 C, 6 P)
► Chinese portrait painters (14 P)
► Qing dynasty painters (1 C, 63 P)
▼ Republic of China painters (1 C, 52 P)
► Republic of China landscape painters (2 P)
► Painters from Shaanxi (1 C, 6 P)
► Painters from Shandong (1 C, 9 P)
► Painters from Shanghai (1 C, 14 P)
► Painters from Shanxi (2 P)
► Painters from Sichuan (1 C, 7 P)
► Song dynasty painters (1 C, 29 P)
► Southern and Northern Dynasties painters (6 C)
► Sui dynasty painters (2 P)
► Tang dynasty painters (1 C, 10 P)
► Three Kingdoms painters (1 C)
► Painters from Tianjin (3 P)
► Yuan dynasty painters (1 C, 23 P)
► Painters from Yunnan (1 C, 1 P)
► Painters from Zhejiang (1 C, 61 P)
► Chinese painter stubs (191 P)
► Chinese paintings (3 C, 14 P)
► Qing dynasty painting (1 P)
► Song dynasty painting (1 C, 1 P)
► Tang dynasty painting (2 C, 2 P)
► Tibetan painting (1 C, 10 P)
► Radio by culture (2 C)
► Television by culture (7 C)
► Theatre by culture (9 C)
► Arts by period (1 C, 1 P)
► Arts by place (5 C, 1 P)
► Aesthetics (19 C, 130 P)
► Artists (39 C, 67 P, 2 F)
► Audiovisual art (2 C, 1 P)
► Arts awards (13 C, 41 P)
► Art competitions (1 C, 3 P)
► Crafts (31 C, 97 P)
► Creative works (21 C, 2 P)
► Culinary arts (2 C, 19 P)
► Arts databases (1 C, 8 P)
► Disability in the arts (5 C, 19 P)
► Arts districts (2 C, 57 P)
► Arts events (9 C, 10 P)
► Funerary art (2 C, 14 P)
► Arts genres by country or nationality (20 C)
► Artistic incompetence (17 P)
► Art and culture law (2 C, 13 P)
► Arts-related lists (20 C, 42 P)
► Literature (52 C, 86 P)
► Arts occupations (9 C, 34 P)
► Arts organizations (24 C, 61 P)
► People associated with the arts (11 C, 5 P)
► Performing arts (44 C, 107 P)
► Perfumery (6 C, 44 P)
► Plastic arts (8 C, 3 P)
► The arts and politics (7 C, 7 P)
► Religion and the arts (9 C, 2 P)
▼ Topics in the arts (8 C, 2 P)
► Topics in popular culture (41 C, 96 P)
► Angels in art (2 C, 116 P)
► Animals in art (22 C, 102 P)
► Anti-fascist works (3 C, 4 P)
▼ Art by subject (22 C, 6 P)
► Statues by subject (11 C, 1 P)
▼ Paintings by subject (7 C, 2 P)
▼ Portraits by subject (6 C, 5 P)
► Portraits of monarchs (3 C, 13 P, 1 F)
► Portraits of popes (8 P, 1 F)
► Portraits of historical figures (2 P)
► Self-portraits (1 C, 54 P, 3 F)
► Portraits of William Shakespeare (15 P)
▼ Portraits of women (1 C, 2 P)
► Mona Lisa (13 P, 3 F)
► Paintings set in cabarets (4 P)
► Landscape paintings (56 P)
► Maritime paintings (1 C, 29 P)
► Paintings depicting myths (1 C, 5 P)
► Paintings of people (5 C, 19 P)
► War paintings (1 C, 68 P)
► Angels in art (2 C, 116 P)
► Animals in art (22 C, 102 P)
► Black people in art (13 P)
► Botanical art (1 C, 17 P)
► Dacia in art (1 C, 11 P)
► Death in art (4 C, 14 P)
► Depictions of kneeling (5 P)
► Environmental art (3 C, 26 P)
► Marine art (5 C, 11 P, 1 F)
► Mathematics and art (3 C, 5 P)
► Military art (7 C, 74 P, 1 F)
► Moon in art (3 P)
► Native Americans in art (19 P, 1 F)
► Depictions of people (22 C)
► Political art (10 C, 61 P)
► Religious art (7 C, 12 P)
► Science in art (1 C, 12 P)
► Sexuality in arts (4 C, 1 P)
► Slavery in art (11 P, 1 F)
► Vodou art (2 P)
► Censorship in the arts (3 C, 52 P)
► Military of the United States in art (5 C, 1 P)
► Virtual reality in fiction (6 C, 82 P)
► Arts venues (4 C, 4 P)
► Visual arts (45 C, 69 P)
► Women and the arts (17 C, 26 P)
► Works about the arts (20 C, 1 P)
► Wikipedia books on arts (7 C, 7 P)
► Art stubs (21 C, 379 P)
B
► Behavior (24 C, 50 P)
C
► Chronology (20 C, 52 P)
► Creativity (18 C, 63 P)
► Culture (46 C, 62 P)
D
► Disciplines (8 C, 1 P)
E
► Education (59 C, 197 P)
► Environment (47 C, 75 P)
G
► Geography (28 C, 79 P)
► Government (66 C, 113 P)
H
► Health (40 C, 4 P)
► History (34 C, 37 P)
► Humanities (33 C, 80 P)
► Humans (25 C, 43 P)
I
► Industry (37 C, 101 P)
► Information (25 C, 33 P)
K
► Knowledge (31 C, 96 P)
L
► Language (26 C, 69 P)
► Law (27 C, 75 P)
M
► Mathematics (19 C, 9 P)
► Medicine (23 C, 18 P)
► Mind (37 C, 18 P)
N
► Nature (23 C, 9 P)
O
► Objects (6 C, 2 P)
P
► People (13 C, 4 P)
► Politics (36 C, 50 P)
S
► Science (38 C, 32 P)
► Sports (36 C, 9 P)
► Structure (24 C, 13 P)
► Systems (7 C, 23 P)
T
► Technology (52 C, 134 P)
U
► Universe (10 C, 24 P)
W
► World (13 C, 12 P)
Wikipedia Category
http://en.wikipedia.org/wiki/Category:Main_topic_classifications
35 subcategories; 13 topics
Semi-structured
Advantages of Collaborative Knowledge Generation
"Wiki" is a Hawaiian word meaning…
http://www.wikidata.org/wiki/Q128736
The Wikipedia article and the Wikidata item on John Nash,
as they stood 17 hours after the news of his car accident
was released.
Language(s) Support: Wikipedia 288 (as of May 2015) | DBpedia 111 (extraction of Wikipedia); 119 (see DBpedia dumps); >125 (as of Mar. 2015); 27 (DBpedia Ontology) | Wikidata 358 (as of Aug. 2014)
Semi-structured data for
semantic enrichment
Infobox
Categories
Structured information is hidden in article
wikitext and templates, such as infoboxes and
categories.
Source: Broughton, J. (2008). Wikipedia: The Missing Manual. O'Reilly Media, Inc.
http://en.wikipedia.org/wiki/Template:Infobox_organization
https://zh.wikipedia.org/wiki/Template:Infobox_Organization
https://zh.wikipedia.org/w/index.php?title=中央研究院
“Infobox templates contain
important facts and statistics
of a type which are common
to related articles.” (source)
http://en.wikipedia.org/w/index.php?title=Special%3ACategoryTree&target=National+academies+of+sciences&mode=categories&namespaces=
WikiTaxonomy is generated by
traversing the category network and deciding,
for each pair of categories, whether
the sub-category is-a its super-category.
Hovy et al., Collaboratively built semi-structured content and Artificial
Intelligence: The story so far, Artificial Intelligence (2012)
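The pairwise is-a decision can be sketched with a toy lexical-head rule. This is not WikiTaxonomy's actual method, which the slide does not detail. The head-matching heuristic and the function names are assumptions for illustration only. The category pairs come from the Chinese painting category tree listed earlier.

```python
PREPOSITIONS = {"of", "in", "from", "by", "for", "on", "at"}


def lexical_head(category: str) -> str:
    """Return the last word before the first preposition, lowercased."""
    words = category.lower().split()
    for i, word in enumerate(words):
        if word in PREPOSITIONS and i > 0:
            return words[i - 1]
    return words[-1]


def is_a(sub_category: str, super_category: str) -> bool:
    """Toy rule: label the link is-a when both categories share a head noun."""
    return lexical_head(sub_category) == lexical_head(super_category)


# (sub-category, super-category) links from the Wikipedia category network
links = [
    ("Painters from Anhui", "Chinese painters"),
    ("Chinese painters", "Chinese painting"),
    ("Ming dynasty painters", "Chinese painters"),
]
taxonomy = [(sub, sup) for sub, sup in links if is_a(sub, sup)]
print(taxonomy)
```

Under this rule, "Chinese painters" under "Chinese painting" is kept out of the taxonomy: painters belong to the topic of painting but are not a kind of painting.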
Main Reference: Lehmann, Jens, et al. (2015) "DBpedia–A large-scale, multilingual
knowledge base extracted from Wikipedia." Semantic Web, Vol 6. No.2
Data Extraction and Mapping
Data Dumps
Extractors turn a specific type of wiki markup into triples.
http://dbpedia.org/resource/Academia_Sinica
http://en.wikipedia.org/wiki/Academia_Sinica
1. Labels
2. Abstracts
3. Interlanguage links
4. Images
5. Redirects
6. Disambiguation
7. External links
8. Page links
9. Homepages
10. Geo-coordinates
11. Person data
12. PND
13. SKOS categories
14. Page ID
15. Revision ID
16. Category label
17. Article categories
18. Mappings
19. Infobox
19 extractors
http://wiki.dbpedia.org/online-access/DBpediaLive
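To make "extractors turn a specific type of wiki markup into triples" concrete, here is a minimal sketch of an infobox-style extractor. It is not the DBpedia extraction framework. The sample wikitext, the regular expression, and the function name are assumptions. Real infoboxes contain nested templates and links that this toy pattern does not handle.

```python
import re

# Simplified infobox wikitext; the field values are illustrative only.
WIKITEXT = """{{Infobox organization
| name        = Academia Sinica
| formation   = 1928
| headquarters = Taipei
}}"""

# One "| key = value" pair per line.
FIELD = re.compile(r"^\|\s*([\w ]+?)\s*=\s*(.+?)\s*$", re.MULTILINE)


def extract_infobox_triples(subject: str, wikitext: str):
    """Map each `| key = value` line of an infobox to one triple."""
    return [
        (subject, f"dbp:{key.replace(' ', '_')}", value)
        for key, value in FIELD.findall(wikitext)
    ]


triples = extract_infobox_triples("dbr:Academia_Sinica", WIKITEXT)
for triple in triples:
    print(triple)
```

Each of the 19 extractors listed above plays a similar role for its own kind of markup, such as labels, redirects, or geo-coordinates.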
dcterms:subject
dbr:Academia_Sinica
• category:Members_of_Academia_Sinica
• category:National_academies_of_arts_and_humanities
• category:National_academies_of_sciences
• category:Organizations_established_in_1928
• category:1928_establishments_in_China
• category:Research_institutes_in_the_Republic_of_China
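The category links above could be retrieved from the DBpedia SPARQL endpoint with a query along these lines. This sketch only builds the query text and the request URL and sends nothing. The helper names are hypothetical, and the `format` parameter is assumed to be accepted by the Virtuoso server that hosts the endpoint.

```python
from urllib.parse import urlencode

ENDPOINT = "http://dbpedia.org/sparql"  # public DBpedia SPARQL endpoint


def subject_query(resource_iri: str) -> str:
    """SPARQL text listing the categories attached via dcterms:subject."""
    return (
        "PREFIX dcterms: <http://purl.org/dc/terms/>\n"
        "SELECT ?category WHERE {\n"
        f"  <{resource_iri}> dcterms:subject ?category .\n"
        "}"
    )


def request_url(query: str) -> str:
    """GET URL carrying the query in the `query` parameter."""
    return ENDPOINT + "?" + urlencode(
        {"query": query, "format": "application/sparql-results+json"}
    )


query = subject_query("http://dbpedia.org/resource/Academia_Sinica")
url = request_url(query)
print(query)
```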
Academia Sinica
DBpedia Thematic Overview
Revised Source from: Valsecchi, F., Abrate, M., Bacciu, C., Tesconi, M., & Marchetti, A. DBpedia Atlas: Mapping
the Uncharted Lands of Linked Data. Linked Data on the Web (LDOW2015)
DBpedia Atlas, online at http://wafi.iit.cnr.it/lod/dbpedia/atlas.
• The largest classes of the ontology: Agent, Place, Work, Species, and TimePeriod.
• The deepest levels of the ontology are under Place: the Diocese class has 5 superclasses, and OverseasDepartment, HistoricalDistrict, FormerMunicipality, and HistoricalProvince have 6.
• The highest average outdegree: Soccer Manager, Jockey, and Horse Trainer (bottom right).
• The lowest depth/average outdegree: CareerStation, PersonFunction, and TimePeriod.
http://wafi.iit.cnr.it/lod/dbpedia/atlas/#Academia_Sinica
http://en.lodlive.it/?http://dbpedia.org/resource/Academia_Sinica
Main Reference: Vrandečić, D., & Krötzsch, M. (2014). Wikidata: a free
collaborative knowledgebase. Communications of the ACM, 57(10), 78-85.
http://en.wikipedia.org/wiki/Academia_Sinica
Wikidata item
http://www.wikidata.org/wiki/Q337266
Wikidata: centralizes language links and infobox data in a single entry
https://tools.wmflabs.org/reasonator/?q=Q337266
https://tools.wmflabs.org/wikidata-todo/cloudy_concept.php?q=Q337266&lang=en
http://www.wikidata.org/wiki/Q337266
Academia Sinica (Q337266)
Statements
Academia Sinica
Freebase identifier: /m/0216tk
stated in Freebase Data Dumps, publication date 28 October 2013
Contextual information (ternary relations) is
represented by the "qualifier".
Wikibase database content can be summarized as follows:
An entity is one of the following three types of Wikibase pages, each with database content:
1. Item
   1. Item identifier (number prefixed with Q)
   2. Fingerprint, consisting of:
      1. Multilingual label*
      2. Multilingual description*
      3. Multilingual aliases
   3. Statements, each consisting of:
      1. Claim, consisting of:
         1. Property
         2. Value
         3. Qualifiers (additional property-value pairs)
      2. References (each consisting of one or more property-value pairs)
      3. Rank
   4. Site links
2. Property
   1. Property identifier (number prefixed with P)
   2. Fingerprint, consisting of:
      1. Multilingual label*
      2. Multilingual description*
      3. Multilingual aliases
   3. Statements, each consisting of:
      1. Claim, consisting of:
         1. Property
         2. Value
         3. Qualifiers (additional property-value pairs)
      2. References (each consisting of one or more property-value pairs)
      3. Rank
   4. Datatype
3. Query**
*) If the label and/or description of an entity are not empty, the entity's combination of label and description in a given language must be unique within the scope of its entity type.
**) Under development.
http://www.mediawiki.org/wiki/Wikibase/DataModel/Primer
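The Item part of this outline can be mirrored in a few Python dataclasses. This is a simplified sketch, not Wikibase's own serialization. The class and field names are assumptions. P646 is assumed to be the Freebase identifier property. The reference is written with plain labels rather than property IDs.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    property_id: str                                 # e.g. "P646"
    value: object
    qualifiers: list = field(default_factory=list)   # (property, value) pairs


@dataclass
class Statement:
    claim: Claim
    references: list = field(default_factory=list)   # each: list of (property, value)
    rank: str = "normal"                             # preferred / normal / deprecated


@dataclass
class Item:
    item_id: str                                     # number prefixed with Q
    labels: dict = field(default_factory=dict)       # language code -> label
    descriptions: dict = field(default_factory=dict)
    aliases: dict = field(default_factory=dict)      # language code -> list of aliases
    statements: list = field(default_factory=list)
    site_links: dict = field(default_factory=dict)   # site id -> page title


academia_sinica = Item(
    item_id="Q337266",
    labels={"en": "Academia Sinica"},
    site_links={"enwiki": "Academia Sinica"},
)
academia_sinica.statements.append(
    Statement(
        claim=Claim("P646", "/m/0216tk"),  # assumed: P646 = Freebase identifier
        references=[[("stated in", "Freebase Data Dumps"),
                     ("publication date", "2013-10-28")]],
    )
)
print(academia_sinica.statements[0])
```

A Property entity would follow the same shape, with a datatype in place of site links.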
What if ?
Digital Archives Taiwan
http://catalog.digitalarchives.tw/
http://digitalarchives.tw/
http://dat.digitalarchives.tw/ontology/
http://dat.digitalarchives.tw/
dat.digitalarchives.tw can answer questions like:
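Q1: What semantic concepts does the copper enamel square vase (銅琺瑯方瓶) have?
What concepts are represented in Artifact A?
Q2: Which artifacts are described by the concept 侈口 (flared mouth: the rim of the vessel flares outward)?
What artifacts have been described by concept X?
Q3: What characteristics do artifact 1 and artifact 2 have in common?
What relations hold between A and B (or more)?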
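The three questions map naturally onto SPARQL graph patterns. The sketch below only builds query text. The function names are hypothetical, and the queries are one plausible phrasing, not necessarily the ones behind the slides. The two IRIs are the artifact and concept addresses that appear on the Q1 and Q2 slides.

```python
ARTIFACT = "http://dat.digitalarchives.tw/resource/Artifact/3204593"
CONCEPT = "http://dat.digitalarchives.tw/Concept/800001099"


def q1_concepts_of(artifact: str) -> str:
    """Q1: every property/value pair attached to one artifact."""
    return f"SELECT ?p ?o WHERE {{ <{artifact}> ?p ?o . }}"


def q2_artifacts_for(concept: str) -> str:
    """Q2: every artifact that points at a given concept."""
    return f"SELECT ?artifact ?p WHERE {{ ?artifact ?p <{concept}> . }}"


def q3_shared(artifact_a: str, artifact_b: str) -> str:
    """Q3: property/value pairs that two artifacts have in common."""
    return (
        "SELECT ?p ?o WHERE { "
        f"<{artifact_a}> ?p ?o . <{artifact_b}> ?p ?o . "
        "}"
    )


print(q1_concepts_of(ARTIFACT))
print(q2_artifacts_for(CONCEPT))
```

Q1 fixes the subject, Q2 fixes the object, and Q3 joins two subjects on the same property and value.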
1. 25 artifacts: 374 triples
2. 6 classes (details)
3. Core properties: 10 of 11 are in use; dat:ceramicCharacteristics [陶瓷性狀描述, description of ceramic characteristics]
has not been used yet.
4. Concepts: 148 dat concepts + 39 AAT
5. 24 of 25 artifacts use AAT; the main properties relating to AAT are
dat:artifactType [器物類型, artifact type], dct:created [創作時代, period of creation], and
dct:medium
6. 181 instances (details): 148 concepts + 25 artifacts + 8 meta (4
datasets + 3 reusing + 1 article), using 40 properties (details)
7. Total triples: 641
Data Profiling : 25 artifacts in dat.digitalarchives.tw
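Profile figures of this kind can be derived by counting over the triples. The sketch below uses a handful of illustrative triples, not the real 641. The abbreviated identifiers, the `skos:Concept` class, and the `profile` function are assumptions.

```python
from collections import Counter

RDF_TYPE = "rdf:type"

# A few illustrative triples; identifiers are abbreviated and hypothetical.
triples = [
    ("artifact:3204593", RDF_TYPE, "dat:Artifact"),
    ("artifact:3204593", "dat:artifactType", "concept:800000632"),
    ("artifact:3204593", "dat:artifactType", "aat:300010898"),
    ("artifact:837055", RDF_TYPE, "dat:Artifact"),
    ("concept:800000632", RDF_TYPE, "skos:Concept"),
]


def profile(triples):
    """Count triples, distinct properties, and instances per class."""
    instances_per_class = Counter(o for _, p, o in triples if p == RDF_TYPE)
    return {
        "total_triples": len(triples),
        "properties": len({p for _, p, _ in triples}),
        "instances_per_class": dict(instances_per_class),
        "artifacts_using_aat": len(
            {s for s, _, o in triples if o.startswith("aat:")}
        ),
    }


stats = profile(triples)
print(stats)
```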
Q1: What semantic concepts does the artifact 銅琺瑯方瓶 (copper enamel square vase) have?
What concepts are represented in Artifact A?
Property Value
dat:artifactType <http://dat.digitalarchives.tw/Concept/800000632>
dat:artifactType <http://vocab.getty.edu/aat/300010898>
dat:componentForm <http://dat.digitalarchives.tw/Concept/800000886>
dat:componentForm <http://dat.digitalarchives.tw/Concept/800000913>
dat:componentForm <http://dat.digitalarchives.tw/Concept/800000915>
dat:componentForm <http://dat.digitalarchives.tw/Concept/800001103>
dat:componentForm <http://dat.digitalarchives.tw/Concept/800001205>
dct:created unavailable
dat:decorationSubject <http://dat.digitalarchives.tw/Concept/800000295>
r4r:hasProvenance prv:DataCreation
dct:instructionalMethod <http://vocab.getty.edu/aat/300053778>
r4r:isPartOf <http://dat.digitalarchives.tw/data/Dataset/10000001>
dct:title 銅琺瑯方瓶
rdf:type dat:Artifact
schema:url <http://catalog.digitalarchives.tw/item/00/30/e5/f1.html>
銅琺瑯方瓶
http://dat.digitalarchives.tw/resource/Artifact/3204593
http://en.lodlive.it/?http://dat.digitalarchives.tw/resource/Artifact/3204593
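(Graph view of the artifact. Node types: dcat:Dataset, dat:Artifact, Concept (dat), Provenance, Source. Linked concepts: aat:cloisonné, aat:glassware, 山水 (landscape; work type), 方瓶 (square vase), 圓口 (round mouth), 折肩/直長腹/短頸/外撇圈足 (angled shoulder / straight long belly / short neck / outward-flaring ring foot).)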
Q2: Which artifacts are described by the concept 侈口 (flared mouth)?
ASK
What artifacts have been described by concept X?
(Graph view: Concept (dat) 侈口 linked to two dat:Artifact nodes.)
http://en.lodlive.it/?http://dat.digitalarchives.tw/Concept/800001099
Relations between two or more objects: pick any two or more collection items and ask how they are related.
http://catalog.digitalarchives.tw/item/00/0c/c5/bf.html http://catalog.digitalarchives.tw/item/00/33/49/cf.html
http://dat.digitalarchives.tw/resource/Artifact/837055 http://dat.digitalarchives.tw/resource/Artifact/3361231
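Q3: What characteristics do artifact 1 and artifact 2 have in common?
ASK
p o
artifact type (器物類型) ruyi (如意)
period of creation (創作時代) Qianlong (乾隆)
r4r:isPartOf <http://dat.digitalarchives.tw/data/Dataset/10000001>
rdf:type dat:Artifact
r4r:hasProvenance prv:DataCreation
artifact style ((器物)風格) Qianlong (Chinese dynastic style)
artifact type (器物類型) bottles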
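Outside a SPARQL endpoint, the comparison behind Q3 amounts to a set intersection over property/value pairs. A minimal sketch follows. The triples are hypothetical and follow only the shape of the slide data. `shared_pairs` is an illustrative name.

```python
def shared_pairs(triples, artifact_a, artifact_b):
    """Return the (property, value) pairs both artifacts carry."""
    def pairs(subject):
        return {(p, o) for s, p, o in triples if s == subject}
    return pairs(artifact_a) & pairs(artifact_b)


# Hypothetical triples for two artifacts; only the shape follows the slides.
A = "http://dat.digitalarchives.tw/resource/Artifact/837055"
B = "http://dat.digitalarchives.tw/resource/Artifact/3361231"
triples = [
    (A, "rdf:type", "dat:Artifact"),
    (B, "rdf:type", "dat:Artifact"),
    (A, "dct:created", "Qianlong"),
    (B, "dct:created", "Qianlong"),
    (A, "dat:artifactType", "ruyi"),
    (B, "dat:artifactType", "bottles"),
]
common = shared_pairs(triples, A, B)
print(sorted(common))
```

Pairs that differ, such as the two artifact types here, drop out of the intersection. Relating them would require walking up the thesaurus hierarchy to a common broader concept.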
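• Digital Archives Thesaurus, Objects Facet (數位典藏索引典物件層面) 【800001814】
• <Antiques, curios, and assorted collectibles> (<古董珍玩與各式收藏>) 【800002058】
• Curios (珍玩) 【800002059】
• Ruyi (如意) 【800001497】
• Strange stones (奇石) 【800001501】
• Miniature rock mountains (山子) 【800001502】
• Lake stones (湖石) 【800001503】
• Table screens (插屏) 【800001504】
• Scholar's display objects (清供) 【800001505】
• Penjing, as display (盆景(擺設)) 【800001506】
Artifact type (器物類型)
http://catalog.digitalarchives.tw/item/00/0c/c5/bf.html http://catalog.digitalarchives.tw/item/00/33/49/cf.html
Period of creation (創作時代); artifact style ((器物)風格)
dat:artifactType
dat:style
• Digital Archives Thesaurus (數位典藏索引典) 【800001809】
• Styles and Periods Facet (風格與時代層面) 【800001811】
• <Chinese styles and periods> (<中國風格與時代>) 【800001854】
• <Chinese dynasties> (<中國朝代>) 【800001913】
• Qing (清) 【800001971】
• Qianlong (乾隆) 【800001975】
• Shunzhi (順治) 【800001972】
• Kangxi (康熙) 【800001973】
• Yongzheng (雍正) 【800001974】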
dct:created
http://dat.digitalarchives.tw/resource/Artifact/837055 http://dat.digitalarchives.tw/resource/Artifact/3361231
Impact: centralization of language links and infoboxes
Impact: creating and updating list articles
What if ?
External Resources for
Semantic Representation & Enrichment of
Languages, Time, Place, Multimedia …
https://tools.wmflabs.org/reasonator/?q=Q8733
What if ?
Through the Eyes of
Wikipedia + DBpedia + Wikidata
1. Wikidata URI for disambiguation?
2. Enrichment by embedding Wikidata
information into our interfaces? (no
extraction & maintenance tasks)
3. Logical reasoning through Wikidata,
DBpedia, or both (Wikidata + DBpedia) to infer
new knowledge?
Upload 8.4 million (84 hundred thousand) CC-licensed data records to
Wikimedia Commons?
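For question 1, disambiguation would start from a lookup of candidate Wikidata items for a term. The sketch below only constructs a request URL for the MediaWiki API's `wbsearchentities` action and makes no network call. The `search_url` helper and its defaults are assumptions.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def search_url(term: str, language: str = "en", limit: int = 5) -> str:
    """Build a wbsearchentities request that lists candidate items for a term."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


url = search_url("Academia Sinica")
print(url)
```

A catalogue interface could show the returned labels and descriptions to an editor and store only the chosen Q identifier. That keeps extraction and maintenance on the Wikidata side.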
DBpedia Wikidata
is on the way.
http://wikidata.dbpedia.org/
What if ?
Cultural Heritage meets
Wikipedia/Wikidata
(Europeana + AAT + Wikidata )
Source:
Vladimir Alexiev, Wikidata, a target
for Europeana’s semantic strategy
(Glam-Wiki 2015)
Vocabularies
linkage &
Coreferences
The new move towards the possible partnership of
Europeana and Wikidata
http://pro.europeana.eu/files/Europeana_Professional/Europeana_Network/europeana_wikimedia_taskforce_report_2015.pdf
http://dat.digitalarchives.tw/Concept/800002441
Wikidata as a
Solution?
https://tools.wmflabs.org/reasonator/?q=Q123314
Thank you
This document is made available under the Creative Commons Licence CC-BY-SA 4.0
Citation Information: Andrea Wei-Ching Huang (2015) A Preliminary Study on Wikipedia, DBpedia
and Wikidata. URL: http://andrea-index.blogspot.tw/2015/06/wikipedia-dbpedia-wikidata.html

Weitere ähnliche Inhalte

Andere mochten auch

Quick Start Tutorial of KH Coder 2: Quantitative Content Analysis or Text Min...
Quick Start Tutorial of KH Coder 2: Quantitative Content Analysis or Text Min...Quick Start Tutorial of KH Coder 2: Quantitative Content Analysis or Text Min...
Quick Start Tutorial of KH Coder 2: Quantitative Content Analysis or Text Min...khcoder
 
Eye Catching
Eye CatchingEye Catching
Eye Catchingsokoban
 
What They Dont Teach You At Sloan
What They Dont Teach You At SloanWhat They Dont Teach You At Sloan
What They Dont Teach You At SloanBrian Halligan
 
Power Point
Power PointPower Point
Power PointAster
 
Creando Un Blog
Creando Un BlogCreando Un Blog
Creando Un BlogAster
 
Excitans - Visievorming op zorg-ICT
Excitans - Visievorming op zorg-ICTExcitans - Visievorming op zorg-ICT
Excitans - Visievorming op zorg-ICTForugy
 
Diccionario visual
Diccionario visualDiccionario visual
Diccionario visualAster
 
Carnaval la Venetia ...2008
Carnaval la Venetia ...2008Carnaval la Venetia ...2008
Carnaval la Venetia ...2008sokoban
 
Noul Paraclis Al Sfantului Nectarie
Noul Paraclis Al Sfantului NectarieNoul Paraclis Al Sfantului Nectarie
Noul Paraclis Al Sfantului Nectariesokoban
 
arte chocolate.pps
arte chocolate.ppsarte chocolate.pps
arte chocolate.ppssokoban
 
Snow &amp; Ice festival
Snow &amp; Ice festivalSnow &amp; Ice festival
Snow &amp; Ice festivalsokoban
 
051102 Online Community Mapping
051102 Online Community Mapping051102 Online Community Mapping
051102 Online Community Mappingandrea huang
 
Bor In Anii Regimului Comunist Observatii Pe Marginea Raportului Tismaneanu
Bor In Anii Regimului Comunist   Observatii Pe Marginea Raportului TismaneanuBor In Anii Regimului Comunist   Observatii Pe Marginea Raportului Tismaneanu
Bor In Anii Regimului Comunist Observatii Pe Marginea Raportului Tismaneanusokoban
 
Presentation on channel, community, content for startupbisnis
Presentation on channel, community, content for startupbisnisPresentation on channel, community, content for startupbisnis
Presentation on channel, community, content for startupbisnisReinҲ Rein
 
Andrei Kuraev Despre Hristos
Andrei Kuraev   Despre HristosAndrei Kuraev   Despre Hristos
Andrei Kuraev Despre Hristossokoban
 
Fwd: [Fwd: [Fwd: [Fwd: Fw: Fw: foarte frumosi]]]]
Fwd: [Fwd: [Fwd: [Fwd: Fw: Fw: foarte frumosi]]]]Fwd: [Fwd: [Fwd: [Fwd: Fw: Fw: foarte frumosi]]]]
Fwd: [Fwd: [Fwd: [Fwd: Fw: Fw: foarte frumosi]]]]sokoban
 

Andere mochten auch (19)

Quick Start Tutorial of KH Coder 2: Quantitative Content Analysis or Text Min...
Quick Start Tutorial of KH Coder 2: Quantitative Content Analysis or Text Min...Quick Start Tutorial of KH Coder 2: Quantitative Content Analysis or Text Min...
Quick Start Tutorial of KH Coder 2: Quantitative Content Analysis or Text Min...
 
Eye Catching
Eye CatchingEye Catching
Eye Catching
 
What They Dont Teach You At Sloan
What They Dont Teach You At SloanWhat They Dont Teach You At Sloan
What They Dont Teach You At Sloan
 
Virtual Sicily Trip
Virtual Sicily TripVirtual Sicily Trip
Virtual Sicily Trip
 
Power Point
Power PointPower Point
Power Point
 
Creando Un Blog
Creando Un BlogCreando Un Blog
Creando Un Blog
 
Excitans - Visievorming op zorg-ICT
Excitans - Visievorming op zorg-ICTExcitans - Visievorming op zorg-ICT
Excitans - Visievorming op zorg-ICT
 
Diccionario visual
Diccionario visualDiccionario visual
Diccionario visual
 
Carnaval la Venetia ...2008
Carnaval la Venetia ...2008Carnaval la Venetia ...2008
Carnaval la Venetia ...2008
 
Noul Paraclis Al Sfantului Nectarie
Noul Paraclis Al Sfantului NectarieNoul Paraclis Al Sfantului Nectarie
Noul Paraclis Al Sfantului Nectarie
 
arte chocolate.pps
arte chocolate.ppsarte chocolate.pps
arte chocolate.pps
 
12
1212
12
 
Janis Joplin The Legend
Janis Joplin The LegendJanis Joplin The Legend
Janis Joplin The Legend
 
Snow &amp; Ice festival
Snow &amp; Ice festivalSnow &amp; Ice festival
Snow &amp; Ice festival
 
051102 Online Community Mapping
051102 Online Community Mapping051102 Online Community Mapping
051102 Online Community Mapping
 
Bor In Anii Regimului Comunist Observatii Pe Marginea Raportului Tismaneanu
Bor In Anii Regimului Comunist   Observatii Pe Marginea Raportului TismaneanuBor In Anii Regimului Comunist   Observatii Pe Marginea Raportului Tismaneanu
Bor In Anii Regimului Comunist Observatii Pe Marginea Raportului Tismaneanu
 
Presentation on channel, community, content for startupbisnis
Presentation on channel, community, content for startupbisnisPresentation on channel, community, content for startupbisnis
Presentation on channel, community, content for startupbisnis
 
Andrei Kuraev Despre Hristos
Andrei Kuraev   Despre HristosAndrei Kuraev   Despre Hristos
Andrei Kuraev Despre Hristos
 
Fwd: [Fwd: [Fwd: [Fwd: Fw: Fw: foarte frumosi]]]]
Fwd: [Fwd: [Fwd: [Fwd: Fw: Fw: foarte frumosi]]]]Fwd: [Fwd: [Fwd: [Fwd: Fw: Fw: foarte frumosi]]]]
Fwd: [Fwd: [Fwd: [Fwd: Fw: Fw: foarte frumosi]]]]
 

Mehr von andrea huang

Reuse of Structured Data: Semantics, Linkage, and Realization
Reuse of Structured Data: Semantics, Linkage, and RealizationReuse of Structured Data: Semantics, Linkage, and Realization
Reuse of Structured Data: Semantics, Linkage, and Realizationandrea huang
 
結構資料的再次使用:語意、連結與實作
結構資料的再次使用:語意、連結與實作結構資料的再次使用:語意、連結與實作
結構資料的再次使用:語意、連結與實作andrea huang
 
Metadata as Linked Data for Research Data Repositories
Metadata as Linked Data for Research Data RepositoriesMetadata as Linked Data for Research Data Repositories
Metadata as Linked Data for Research Data Repositoriesandrea huang
 
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...andrea huang
 
20130805 Activating Linked Open Data in Libraries Archives and Museums
20130805 Activating Linked Open Data in Libraries Archives and Museums20130805 Activating Linked Open Data in Libraries Archives and Museums
20130805 Activating Linked Open Data in Libraries Archives and Museumsandrea huang
 
101203 An event ontology for crisis-disaster information
101203 An event ontology for crisis-disaster information101203 An event ontology for crisis-disaster information
101203 An event ontology for crisis-disaster informationandrea huang
 
081016 Social Tagging, Online Communication, and Peircean Semiotics
081016 Social Tagging, Online Communication, and Peircean Semiotics081016 Social Tagging, Online Communication, and Peircean Semiotics
081016 Social Tagging, Online Communication, and Peircean Semioticsandrea huang
 
060817 Participation Collaboration Mapping
060817 Participation Collaboration Mapping060817 Participation Collaboration Mapping
060817 Participation Collaboration Mappingandrea huang
 
070928 Collaborative Geospatial Mapping And Data Authorization
070928 Collaborative Geospatial Mapping And Data Authorization070928 Collaborative Geospatial Mapping And Data Authorization
070928 Collaborative Geospatial Mapping And Data Authorizationandrea huang
 
041018 Community Gis
041018 Community Gis041018 Community Gis
041018 Community Gisandrea huang
 
051207 Commonsense Geography Meets Web Technology
051207 Commonsense Geography Meets Web Technology 051207 Commonsense Geography Meets Web Technology
051207 Commonsense Geography Meets Web Technology andrea huang
 

Mehr von andrea huang (11)

Reuse of Structured Data: Semantics, Linkage, and Realization
Reuse of Structured Data: Semantics, Linkage, and RealizationReuse of Structured Data: Semantics, Linkage, and Realization
Reuse of Structured Data: Semantics, Linkage, and Realization
 
結構資料的再次使用:語意、連結與實作
結構資料的再次使用:語意、連結與實作結構資料的再次使用:語意、連結與實作
結構資料的再次使用:語意、連結與實作
 
Metadata as Linked Data for Research Data Repositories
Metadata as Linked Data for Research Data RepositoriesMetadata as Linked Data for Research Data Repositories
Metadata as Linked Data for Research Data Repositories
 
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...
 
20130805 Activating Linked Open Data in Libraries Archives and Museums
20130805 Activating Linked Open Data in Libraries Archives and Museums20130805 Activating Linked Open Data in Libraries Archives and Museums
20130805 Activating Linked Open Data in Libraries Archives and Museums
 
101203 An event ontology for crisis-disaster information
101203 An event ontology for crisis-disaster information101203 An event ontology for crisis-disaster information
101203 An event ontology for crisis-disaster information
 
081016 Social Tagging, Online Communication, and Peircean Semiotics
081016 Social Tagging, Online Communication, and Peircean Semiotics081016 Social Tagging, Online Communication, and Peircean Semiotics
081016 Social Tagging, Online Communication, and Peircean Semiotics
 
060817 Participation Collaboration Mapping
060817 Participation Collaboration Mapping060817 Participation Collaboration Mapping
060817 Participation Collaboration Mapping
 
070928 Collaborative Geospatial Mapping And Data Authorization
070928 Collaborative Geospatial Mapping And Data Authorization070928 Collaborative Geospatial Mapping And Data Authorization
070928 Collaborative Geospatial Mapping And Data Authorization
 
041018 Community Gis
041018 Community Gis041018 Community Gis
041018 Community Gis
 
051207 Commonsense Geography Meets Web Technology
051207 Commonsense Geography Meets Web Technology 051207 Commonsense Geography Meets Web Technology
051207 Commonsense Geography Meets Web Technology
 


A Preliminary Study on Wikipedia, DBpedia and Wikidata

  • 1. A Preliminary Study on Wikipedia : DBpedia : Wikidata. Andrea Wei-Ching Huang, Institute of Information Science, Academia Sinica, Taipei, Taiwan. June 1, 2015 @ IIS R101. Outline: 1. Why & What; 2. Semantic Enrichment; 3. What if: Digital Archives Taiwan.
  • 4. Collaborative Knowledge Bases (diagram). Nodes include the Digital Archives Thesaurus (數位典藏索引典) and Google Knowledge Graph + Knowledge Vault. Edge labels: is-a; sister-project; will-use; supports; data extraction (three edges); data transferring.
  • 5. Knowledge Bases: Timeline (figure). Year marks: 2001, 2002, 2007, 2008, 2010, 2012, 2013, 2014, 2015. Event labels: DBpedia 1.0, Ontology, Infobox Extraction, Spotlight, DBpedia 3.9, Knowledge Graph, Knowledge Vault. This version of the figure ends with a question mark.
  • 6. Knowledge Bases: Timeline (the same figure, without the question mark).
  • 7. Collaboratively-generated, semi-structured information is made up of content which is (a) semantified, (b) wide-coverage, (c) up-to-date, (d) multilingual, (e) free in nature. Hovy et al., Collaboratively built semi-structured content and Artificial Intelligence: The story so far, Artificial Intelligence (2012)
  • 8. Source: Arnold, P., & Rahm, E. (2015). Automatic Extraction of Semantic Relations from Wikipedia. International Journal on Artificial Intelligence Tools, 24(2), 1540010.
  • 9. Basic Information Table
      Attribute | Wikipedia | DBpedia | Wikidata
      Website | wikipedia.org | dbpedia.org | www.wikidata.org
      Release time | January 15, 2001 | 23 January 2007 | 30 October 2012
      Description | "the free encyclopedia" | "the Semantic Web mirror of Wikipedia" | "Wikipedia for data"
      Host | Wikimedia Foundation, Inc. | University of Leipzig; University of Mannheim; OpenLink Software | Wikimedia Foundation, Inc.
      Creators | Jimmy Wales, Larry Sanger | | Wikimedia community
      Data mainly from | Wikipedians | Wikipedia | Wikipedia and sister projects
      Generation method | manual/community-created | automatic/semi-automatic | semi-automatic; manual/community-created
      Advantage | Free text; ease of access and contribution | LOD hub; semantic coverage and depth | Quality (accuracy): URI / provenance / contextual representation
      Operation | MediaWiki | Virtuoso Universal Server | MediaWiki extension: Wikibase
      URI/IRI schemes | (language).wikipedia.org/wiki/Name | Wikipedia-like IRIs: (language).dbpedia.org/resource/Name | Language-independent numeric IDs: http://wikidata.org/wiki/Qxxx or Pxxx
      Data structure | Mostly unstructured text; semi-structured: infobox, category… | RDF; named graphs; DBpedia ontology (dynamic structure) | Wikibase data model; Wikibase system ontology; Wikidata:WikiProject Ontology
      Data access | Wikipedia data dumps; free text search; MediaWiki API | DBpedia dumps; SPARQL endpoint; DBpedia Spotlight (annotating mentions of DBpedia resources in text) | RDF dumps; Wikidata Query (WDQ); Wikidata API
      License | CC Attribution / Share-Alike 3.0; text dual-licensed under GFDL; media licensing varies | GNU General Public License | CC0 1.0
      Language(s) support | 288 (as of May 2015) | 111 (extraction of Wikipedia); 119 (see DBpedia dumps); >125 (as of Mar. 2015); 27 (DBpedia Ontology) | 358 (as of Aug. 2014)
      Mutual relation | Wikipedia:Wikidata | DBpedia:Wikidata | Wikidata:DBpedia
  • 10. Semi-structured content in Wikipedia (two page listings).
      Wikipedia InfoBoxes, 13 topics (http://en.wikipedia.org/wiki/Wikipedia:List_of_infoboxes): 1 Arts and culture; 2 Geography and place; 3 Health and fitness; 4 History and events; 5 Mathematics and abstraction; 6 Person; 7 Religion and belief; 8 Science and nature; 9 Society and social science; 10 Technology and applied science; 11 Other; 12 See also; 13 External links. Each topic is subdivided further; for example, 9.11 Sports runs from 9.11.1 American football to 9.11.38 Other sports.
      Wikipedia Category, 35 subcategories (http://en.wikipedia.org/wiki/Category:Main_topic_classifications), with counts of subcategories (C) and pages (P): Agriculture (43 C, 189 P); Architecture (41 C, 90 P); Arts (36 C, 72 P); Behavior (24 C, 50 P); Chronology (20 C, 52 P); Creativity (18 C, 63 P); Culture (46 C, 62 P); Disciplines (8 C, 1 P); Education (59 C, 197 P); Environment (47 C, 75 P); Geography (28 C, 79 P); Government (66 C, 113 P); Health (40 C, 4 P); History (34 C, 37 P); Humanities (33 C, 80 P); Humans (25 C, 43 P); Industry (37 C, 101 P); Information (25 C, 33 P); Knowledge (31 C, 96 P); Language (26 C, 69 P); Law (27 C, 75 P); Mathematics (19 C, 9 P); Medicine (23 C, 18 P); Mind (37 C, 18 P); Nature (23 C, 9 P); Objects (6 C, 2 P); People (13 C, 4 P); Politics (36 C, 50 P); Science (38 C, 32 P); Sports (36 C, 9 P); Structure (24 C, 13 P); Systems (7 C, 23 P); Technology (52 C, 134 P); Universe (10 C, 24 P); World (13 C, 12 P).
      Expanded example paths in the category tree: Arts ▼ Arts by culture (7 C) ▼ Painting by culture (12 C) ▼ Chinese painting (10 C, 30 P) ▼ Chinese painters (42 C, 3 P) ▼ Republic of China painters (1 C, 52 P) ► Republic of China landscape painters (2 P); and Arts ▼ Topics in the arts (8 C, 2 P) ▼ Art by subject (22 C, 6 P) ▼ Paintings by subject (7 C, 2 P) ▼ Portraits by subject (6 C, 5 P) ▼ Portraits of women (1 C, 2 P) ► Mona Lisa (13 P, 3 F).
  • 11. Advantages of Collaborative Knowledge Generation
  • 12. "Wiki" is a Hawaiian word meaning… http://www.wikidata.org/wiki/Q128736 The Wikipedia article and the Wikidata item on John Nash, as they stood 17 hours after the news of his car accident was released.
  • 13. Language(s) support. Wikipedia: 288 (as of May 2015). DBpedia: 111 (extraction of Wikipedia); 119 (see DBpedia dumps); >125 (as of Mar. 2015); 27 (DBpedia Ontology). Wikidata: 358 (as of Aug. 2014).
  • 16. Infobox; Categories. Structured information is hidden in article wikitext and templates, such as infoboxes and categories. Source: Broughton, J. (2008). Wikipedia: The Missing Manual. O'Reilly Media, Inc.
  • 19. WikiTaxonomy is generated by traversing the category network and deciding, for each pair of categories, whether the sub-category is-a super-category. Hovy et al., Collaboratively built semi-structured content and Artificial Intelligence: The story so far, Artificial Intelligence (2012)
  • 20. Main Reference: Lehmann, Jens, et al. (2015) "DBpedia–A large-scale, multilingual knowledge base extracted from Wikipedia." Semantic Web, Vol 6. No.2
  • 21. Data Extraction and Mapping. Data dumps. Extractors turn a specific type of wiki markup into triples. Example: the article http://en.wikipedia.org/wiki/Academia_Sinica corresponds to the resource http://dbpedia.org/resource/Academia_Sinica
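A minimal sketch of the extractor idea on this slide, assuming a simplified infobox with one `| key = value` field per line. The real DBpedia extraction framework is far more elaborate; the function names, the sample wikitext, and the choice of `http://dbpedia.org/property/` as predicate prefix are illustrative assumptions, not taken from the slides.

```python
import re

WIKIPEDIA_PREFIX = "http://en.wikipedia.org/wiki/"
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"
DBPEDIA_PROPERTY = "http://dbpedia.org/property/"

# Matches a simplified infobox field such as "| established = 1928".
FIELD = re.compile(r"^\|\s*(\w+)\s*=\s*(.+?)\s*$")


def wikipedia_to_dbpedia(url):
    """Map an English Wikipedia article URL to its DBpedia resource IRI."""
    if not url.startswith(WIKIPEDIA_PREFIX):
        raise ValueError("not an English Wikipedia article URL: " + url)
    return DBPEDIA_RESOURCE + url[len(WIKIPEDIA_PREFIX):]


def infobox_to_triples(article_url, wikitext):
    """Turn each infobox field into a (subject, predicate, object) triple."""
    subject = wikipedia_to_dbpedia(article_url)
    triples = []
    for line in wikitext.splitlines():
        match = FIELD.match(line.strip())
        if match:
            key, value = match.groups()
            triples.append((subject, DBPEDIA_PROPERTY + key, value))
    return triples


# Hypothetical wikitext, written only for this example.
SAMPLE = """{{Infobox organization
| name = Academia Sinica
| established = 1928
}}"""

triples = infobox_to_triples(
    "http://en.wikipedia.org/wiki/Academia_Sinica", SAMPLE
)
for triple in triples:
    print(triple)
```

Each extractor in the list on the next slide can be thought of as one such function, specialized for a different kind of markup (labels, categories, redirects, and so on).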
  • 22. 19 extractors (http://wiki.dbpedia.org/online-access/DBpediaLive): 1. Labels; 2. Abstracts; 3. Interlanguage links; 4. Images; 5. Redirects; 6. Disambiguation; 7. External links; 8. Page links; 9. Homepages; 10. Geo-coordinates; 11. Person data; 12. PND; 13. SKOS categories; 14. Page ID; 15. Revision ID; 16. Category label; 17. Article categories; 18. Mappings; 19. Infobox.
      Example for Academia Sinica: dbr:Academia_Sinica dcterms:subject
      • category:Members_of_Academia_Sinica
      • category:National_academies_of_arts_and_humanities
      • category:National_academies_of_sciences
      • category:Organizations_established_in_1928
      • category:1928_establishments_in_China
      • category:Research_institutes_in_the_Republic_of_China
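The dcterms:subject triples for Academia Sinica can in principle be retrieved from a SPARQL endpoint. The sketch below only builds the query text and a GET request URL following the standard SPARQL protocol; it makes no network call. The endpoint address `http://dbpedia.org/sparql` and the helper names are assumptions for illustration.

```python
from urllib.parse import urlencode

# Assumed public endpoint; substitute whichever endpoint is actually used.
DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql"


def subject_query(resource_name):
    """Build a SPARQL query listing the categories of one DBpedia resource."""
    return (
        "PREFIX dcterms: <http://purl.org/dc/terms/>\n"
        "PREFIX dbr: <http://dbpedia.org/resource/>\n"
        "SELECT ?category WHERE { dbr:%s dcterms:subject ?category }"
        % resource_name
    )


def request_url(query):
    """Encode the query as a SPARQL-protocol GET request URL."""
    return DBPEDIA_ENDPOINT + "?" + urlencode({"query": query})


query = subject_query("Academia_Sinica")
print(query)
print(request_url(query))
```

The result format is usually negotiated with an HTTP Accept header such as `application/sparql-results+json`, which is left out here to keep the sketch short.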
  • 23. DBpedia Thematic Overview. Revised from: Valsecchi, F., Abrate, M., Bacciu, C., Tesconi, M., & Marchetti, A. DBpedia Atlas: Mapping the Uncharted Lands of Linked Data. Linked Data on the Web (LDOW2015). DBpedia Atlas is online at http://wafi.iit.cnr.it/lod/dbpedia/atlas.
      • The largest classes of the ontology: Agent, Place, Work, Species, and TimePeriod.
      • The deepest levels of the ontology are in Place: the Diocese class (5 superclasses) and OverseasDepartment, HistoricalDistrict, FormerMunicipality, HistoricalProvince (6 superclasses).
      • The highest average outdegree: Soccer Manager, Jockey, and Horse Trainer (bottom right).
      • The lowest depth/average outdegree: CareerStation, PersonFunction, and TimePeriod.
  • 26. Main Reference: Vrandečić, D., & Krötzsch, M. (2014). Wikidata: a free collaborative knowledgebase. Communications of the ACM, 57(10), 78-85.
  • 28. Wikidata: centralizes language links and infobox data in a single entry per entity
  • 31. Academia Sinica (Q337266). Statements: Freebase identifier /m/0216tk; reference: stated in Freebase Data Dumps, publication date 28 October 2013. Contextual information (ternary relations) is represented by the "qualifier".
  • 32. Wikibase database content can be summarized as follows. An entity is one of the following three types of Wikibase pages, each with database content:
      1. Item
         1. Item identifier (number prefixed with Q)
         2. Fingerprint, consisting of: multilingual label*; multilingual description*; multilingual aliases
         3. Statements, each consisting of:
            1. Claim, consisting of: property; value; qualifiers (additional property-value pairs)
            2. References (each consisting of one or more property-value pairs)
            3. Rank
         4. Site links
      2. Property
         1. Property identifier (number prefixed with P)
         2. Fingerprint, consisting of: multilingual label*; multilingual description*; multilingual aliases
         3. Statements, each consisting of:
            1. Claim, consisting of: property; value; qualifiers (additional property-value pairs)
            2. References (each consisting of one or more property-value pairs)
            3. Rank
         4. Datatype
      3. Query**
      *) Unless the label and/or description of an entity is empty, an entity's combination of label and description in a given language must be unique within the scope of its entity type.
      **) Under development.
      Source: http://www.mediawiki.org/wiki/Wikibase/DataModel/Primer
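The item structure outlined on this slide can be sketched as plain Python dataclasses. This is an illustrative model only, not the Wikibase implementation: the class and field names are assumptions, and properties are written as readable labels where a real Wikibase instance would use P-identifiers. The example values come from the Academia Sinica statement on slide 31.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A (property, value) pair, used for qualifiers and references.
PropertyValue = Tuple[str, str]


@dataclass
class Claim:
    property_id: str
    value: str
    qualifiers: List[PropertyValue] = field(default_factory=list)


@dataclass
class Statement:
    claim: Claim
    # Each reference is itself a list of one or more property-value pairs.
    references: List[List[PropertyValue]] = field(default_factory=list)
    rank: str = "normal"


@dataclass
class Item:
    item_id: str                                   # number prefixed with Q
    labels: Dict[str, str] = field(default_factory=dict)
    descriptions: Dict[str, str] = field(default_factory=dict)
    aliases: Dict[str, List[str]] = field(default_factory=dict)
    statements: List[Statement] = field(default_factory=list)
    site_links: Dict[str, str] = field(default_factory=dict)


academia_sinica = Item(
    item_id="Q337266",
    labels={"en": "Academia Sinica"},
    statements=[
        Statement(
            claim=Claim("Freebase identifier", "/m/0216tk"),
            references=[[
                ("stated in", "Freebase Data Dumps"),
                ("publication date", "28 October 2013"),
            ]],
        )
    ],
)

print(academia_sinica.statements[0].claim.value)
```

A Property entity would look the same except that its identifier is prefixed with P and it carries a datatype instead of site links.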
  • 33. What if ? Digital Archives Taiwan
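  • 35. dat.digitalarchives.tw can answer questions like: Q1: What semantic concepts does the copper enamel square vase (銅琺瑯方瓶) carry? (What concepts are represented in artifact A?) Q2: Which artifacts are described by the concept "flared mouth" (侈口, a vessel mouth that opens outward)? (What artifacts have been described by concept X?) Q3: What similar characteristics do artifact 1 and artifact 2 share? (What relations hold between A and B, or more?)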
  • 36. Data Profiling: 25 artifacts in dat.digitalarchives.tw
      1. 25 artifacts: 374 triples
      2. 6 classes (details)
      3. Core properties: 10 of 11 are used; dat:ceramicCharacteristics [ceramic characteristics description, 陶瓷性狀描述] has not been used yet.
      4. Concepts: 148 dat concepts + 39 AAT concepts
      5. 24 of 25 artifacts use AAT; the main properties relating artifacts to AAT are dat:artifactType [artifact type, 器物類型], dct:created [period of creation, 創作時代], and dct:medium
      6. 181 instances (details): 148 concepts + 25 artifacts + 8 meta (4 datasets + 3 reusing + 1 article), using 40 properties (details)
      7. Total triples: 641
  • 38. 銅琺瑯方瓶 (copper enamel square vase): http://dat.digitalarchives.tw/resource/Artifact/3204593
      Property | Value
      dat:artifactType | <http://dat.digitalarchives.tw/Concept/800000632>
      dat:artifactType | <http://vocab.getty.edu/aat/300010898>
      dat:componentForm | <http://dat.digitalarchives.tw/Concept/800000886>
      dat:componentForm | <http://dat.digitalarchives.tw/Concept/800000913>
      dat:componentForm | <http://dat.digitalarchives.tw/Concept/800000915>
      dat:componentForm | <http://dat.digitalarchives.tw/Concept/800001103>
      dat:componentForm | <http://dat.digitalarchives.tw/Concept/800001205>
      dct:created | unavailable
      dat:decorationSubject | <http://dat.digitalarchives.tw/Concept/800000295>
      r4r:hasProvenance | prv:DataCreation
      dct:instructionalMethod | <http://vocab.getty.edu/aat/300053778>
      r4r:isPartOf | <http://dat.digitalarchives.tw/data/Dataset/10000001>
      dct:title | 銅琺瑯方瓶
      rdf:type | dat:Artifact
      schema:url | <http://catalog.digitalarchives.tw/item/00/30/e5/f1.html>
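The three question types from slide 35 can be sketched over a small in-memory triple set. This is a toy illustration in Python rather than the SPARQL the site presumably uses. The triples for Artifact/3204593 are a subset of the listing above; the second artifact (`OTHER`) and its triples are hypothetical, added only so that Q3 has something to compare.

```python
DAT = "http://dat.digitalarchives.tw/"
VASE = DAT + "resource/Artifact/3204593"
OTHER = DAT + "resource/Artifact/EXAMPLE"  # hypothetical second artifact

TRIPLES = {
    (VASE, "dat:artifactType", DAT + "Concept/800000632"),
    (VASE, "dat:componentForm", DAT + "Concept/800000886"),
    (VASE, "dat:componentForm", DAT + "Concept/800000913"),
    (VASE, "dat:decorationSubject", DAT + "Concept/800000295"),
    # Hypothetical triples for the second artifact.
    (OTHER, "dat:artifactType", DAT + "Concept/800000632"),
    (OTHER, "dat:componentForm", DAT + "Concept/800000913"),
}


def concepts_of(artifact):
    """Q1: which (property, concept) pairs describe this artifact?"""
    return {(p, o) for s, p, o in TRIPLES if s == artifact}


def artifacts_with(concept):
    """Q2: which artifacts are described by this concept?"""
    return {s for s, p, o in TRIPLES if o == concept}


def shared(a, b):
    """Q3: which (property, concept) pairs do two artifacts have in common?"""
    return concepts_of(a) & concepts_of(b)


print(sorted(concepts_of(VASE)))
print(sorted(artifacts_with(DAT + "Concept/800000913")))
print(sorted(shared(VASE, OTHER)))
```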
  • 42. Q2: Which artifacts are described by the concept "flared mouth" (侈口)? ASK: What artifacts have been described by the concept X?
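  • 47. Query result, columns p (property) | o (object):
      artifact type (器物類型) | ruyi (如意)
      period of creation (創作時代) | Qianlong (乾隆)
      r4r:isPartOf | <http://dat.digitalarchives.tw/data/Dataset/10000001>
      rdf:type | dat:Artifact
      r4r:hasProvenance | prv:DataCreation
      (artifact) style ((器物)風格) | Qianlong (Chinese dynastic style)
      artifact type (器物類型) | bottles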
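  • 48. Thesaurus concepts and properties (figure labels).
      Digital Archives Thesaurus, Objects Facet (數位典藏索引典物件層面) 【800001814】: <Antiques, curios, and assorted collectibles> (<古董珍玩與各式收藏>) 【800002058】; curios (珍玩) 【800002059】; ruyi (如意) 【800001497】; scholar's rocks (奇石) 【800001501】; miniature mountains (山子) 【800001502】; lake rocks (湖石) 【800001503】; table screens (插屏) 【800001504】; display offerings (清供) 【800001505】; potted landscapes for display (盆景(擺設)) 【800001506】.
      Digital Archives Thesaurus (數位典藏索引典) 【800001809】: Styles and Periods Facet (風格與時代層面) 【800001811】; <Chinese styles and periods> (<中國風格與時代>) 【800001854】; <Chinese dynasties> (<中國朝代>) 【800001913】; Qing (清) 【800001971】; Qianlong (乾隆) 【800001975】; Shunzhi (順治) 【800001972】; Kangxi (康熙) 【800001973】; Yongzheng (雍正) 【800001974】.
      Property labels: artifact type (器物類型), dat:artifactType; period of creation (創作時代), dct:created; (artifact) style ((器物)風格), dat:style.
      Catalog pages: http://catalog.digitalarchives.tw/item/00/0c/c5/bf.html ; http://catalog.digitalarchives.tw/item/00/33/49/cf.html
      Artifact resources: http://dat.digitalarchives.tw/resource/Artifact/837055 ; http://dat.digitalarchives.tw/resource/Artifact/3361231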
  • 49. Impact: centralizing languages and infoboxes. Impact: creating and updating list articles.
  • 50. What if ? External Resources for Semantic Representation & Enrichment of Languages, Time, Place, Multimedia …
  • 54. What if? Through the Eyes of Wikipedia + DBpedia + Wikidata
  • 55. 1. Wikidata URIs for disambiguation? 2. Enrichment by embedding Wikidata information in our interfaces (no extraction or maintenance tasks)? 3. Logical reasoning through Wikidata, DBpedia, or Wikidata + DBpedia to infer new knowledge?
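One way to read the second question is that an interface fetches labels from Wikidata at display time instead of extracting and maintaining them locally. The sketch below builds a request to the MediaWiki `wbgetentities` API module and parses a response of that shape. No network call is made, and `SAMPLE_RESPONSE` is a hand-written stand-in rather than real output; the helper names are assumptions.

```python
import json
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def label_request_url(item_id, languages):
    """Build a wbgetentities request for the labels of one item."""
    params = {
        "action": "wbgetentities",
        "ids": item_id,
        "props": "labels",
        "languages": "|".join(languages),
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def extract_labels(response_text, item_id):
    """Return {language: label} from a wbgetentities JSON response."""
    entity = json.loads(response_text)["entities"][item_id]
    return {
        lang: entry["value"]
        for lang, entry in entity.get("labels", {}).items()
    }


# Hand-written stand-in for an API response, for illustration only.
SAMPLE_RESPONSE = json.dumps({
    "entities": {
        "Q337266": {
            "labels": {"en": {"language": "en", "value": "Academia Sinica"}}
        }
    }
})

print(label_request_url("Q337266", ["en", "zh"]))
print(extract_labels(SAMPLE_RESPONSE, "Q337266"))
```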
  • 56. Upload 84 hundred thousand CC-licensed data items to Wikimedia Commons?
  • 57. DBpedia Wikidata is on the way. http://wikidata.dbpedia.org/
  • 58. What if? Cultural Heritage meets Wikipedia/Wikidata (Europeana + AAT + Wikidata)
  • 59. Source: Vladimir Alexiev, Wikidata, a target for Europeana’s semantic strategy (Glam-Wiki 2015) Vocabularies linkage & Coreferences
  • 60. The new move towards a possible partnership between Europeana and Wikidata: http://pro.europeana.eu/files/Europeana_Professional/Europeana_Network/europeana_wikimedia_taskforce_report_2015.pdf
  • 64. Thank you This document is made available under the Creative Commons Licence CC-BY-SA 4.0 Citation Information: Andrea Wei-Ching Huang (2015) A Preliminary Study on Wikipedia, DBpedia and Wikidata. URL: http://andrea-index.blogspot.tw/2015/06/wikipedia-dbpedia-wikidata.html