Horizontal Decomposition of Freebase
based on data sampled in July


Roughly one third of the facts in Freebase contribute most of the value. Extracting those facts from the Freebase dump lets RDF tools such as triple stores, as well as Hadoop-based toolkits such as the Infovore framework, process the data more rapidly.
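
As a rough illustration of what a horizontal decomposition means in practice, the sketch below splits N-Triples lines into segments by predicate. It is not the Infovore implementation. The segment names ("a", "label", "links", "other") come from the slides, but the predicate-to-segment mapping, the rule that "links" are triples whose object is an IRI, and the example.org sample data are all assumptions made for this example.

```python
import re
from collections import defaultdict

# ASSUMPTION: hypothetical mapping from predicate IRI to segment name.
# Only the segment names appear in the slides; the IRIs are illustrative.
SEGMENT_BY_PREDICATE = {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": "a",
    "http://www.w3.org/2000/01/rdf-schema#label": "label",
}

# Simplified N-Triples pattern: IRI subject, IRI predicate, any object.
# Blank-node subjects and comment lines are deliberately not handled.
TRIPLE = re.compile(r'^(<[^>]*>)\s+<([^>]*)>\s+(.+?)\s*\.\s*$')


def segment_of(line):
    """Return the segment name for one N-Triples line, or None if unparsable."""
    match = TRIPLE.match(line)
    if match is None:
        return None
    _subject, predicate, obj = match.groups()
    if predicate in SEGMENT_BY_PREDICATE:
        return SEGMENT_BY_PREDICATE[predicate]
    # ASSUMPTION: an IRI object counts as a "link"; literals fall into "other".
    return "links" if obj.startswith("<") else "other"


def decompose(lines):
    """Group parsable lines by segment name, preserving input order."""
    segments = defaultdict(list)
    for line in lines:
        name = segment_of(line)
        if name is not None:
            segments[name].append(line)
    return dict(segments)


# Made-up sample data, one triple per segment.
SAMPLE = [
    '<http://example.org/m.1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.org/Person> .',
    '<http://example.org/m.1> <http://www.w3.org/2000/01/rdf-schema#label> "Alice"@en .',
    '<http://example.org/m.1> <http://example.org/knows> <http://example.org/m.2> .',
    '<http://example.org/m.1> <http://example.org/height> "1.7" .',
]

if __name__ == "__main__":
    for name, triples in sorted(decompose(SAMPLE).items()):
        print(name, len(triples))
```

Because each segment holds whole triples, a consumer that only needs the "links" and "other" segments can skip the rest of the dump entirely.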

Published in: Technology

  1. Horizontal Decomposition of Freebase, based on data sampled in July
  2. Summary
     • 29-41% of accepted triples (links + other) hold most of the value
     • “a” is numerous but doesn’t provide the evidence other predicates do
     • “descriptions” are bulky
     • “names” are not machine readable
     • “keys” are duplicated, nonstandard, optional
  3. Percentage of gz compressed size: a 5%, description 18%, key 11%, keyNs 13%, label 6%, name 6%, notability 0%, nfp 0%, text 8%, web 6%, links 20%, other 7%
  4. Percentage of facts: a 16%, description 1%, key 9%, keyNs 11%, label 6%, name 6%, notability 2%, nfp 2%, text 0%, web 5%, links 32%, other 10%
  5. Percentage of uncompressed size: a 15%, description 7%, key 8%, keyNs 9%, label 4%, name 4%, notability 2%, nfp 1%, text 3%, web 6%, links 30%, other 11%
  6. :BaseKB and Infovore
     Data processed with Infovore software: https://github.com/paulhoule/infovore/
     Get segmented Freebase data at http://basekb.com/
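
The summary's "links + other" figure can be checked against the three charts with a few lines of arithmetic. The sketch below copies the percentages from slides 3 to 5; the dictionary layout and the helper name `core_share` are choices made for this example, not something from the slides.

```python
# Percentages copied from slides 3-5 (rounded chart labels).
SHARES = {
    "gz compressed size": {
        "a": 5, "description": 18, "key": 11, "keyNs": 13, "label": 6,
        "name": 6, "notability": 0, "nfp": 0, "text": 8, "web": 6,
        "links": 20, "other": 7,
    },
    "facts": {
        "a": 16, "description": 1, "key": 9, "keyNs": 11, "label": 6,
        "name": 6, "notability": 2, "nfp": 2, "text": 0, "web": 5,
        "links": 32, "other": 10,
    },
    "uncompressed size": {
        "a": 15, "description": 7, "key": 8, "keyNs": 9, "label": 4,
        "name": 4, "notability": 2, "nfp": 1, "text": 3, "web": 6,
        "links": 30, "other": 11,
    },
}


def core_share(shares):
    """Combined percentage of the 'links' and 'other' segments."""
    return shares["links"] + shares["other"]


if __name__ == "__main__":
    for measure, shares in SHARES.items():
        # Each chart's rounded labels add up to exactly 100.
        assert sum(shares.values()) == 100
        print(f"{measure}: links + other = {core_share(shares)}%")
```

With the rounded labels this gives 27% of compressed size, 42% of facts, and 41% of uncompressed size, which is close to, though not identical with, the 29-41% range in the summary; the gap may simply come from rounding in the chart labels.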
