ParlBench: a SPARQL-benchmark for electronic
publishing applications
Tatiana Tarasova, Maarten Marx
University of Amsterdam
Information and Language Processing Systems
May 26, 2013
Workshop on Benchmarking RDF Systems, ESWC 2013
[Figure: data domains shown: Media, Publications, Life Sciences, Cross-Domain, Geographic, Government, and an open "?"]
The ParlBench Benchmark
Goal:
→ test the performance of RDF stores in the setting of e-publishing
applications
Components:
→ real-world data: Dutch parliamentary proceedings, members and
political parties
→ vocabulary: Parliamentary Proceedings [2] (ParliPro) + mix of
existing vocabularies
→ 19 analytical SPARQL queries grouped into 4 micro-benchmarks:
Average, Count, Factual and Top 10
Performance metrics:
→ loading time
→ query response time
Outline
1 The ParlBench Benchmark
Data Sets
Queries
2 ParlBench experimental run on Virtuoso
The ParlBench Data Sets I
PoliticalMashup: characteristics
→ Dutch parliamentary proceedings (1814-2013),
political parties and politicians
→ richly structured XML documents (∼54,000)
→ URIs of concepts
→ metadata: who said what and when
→ links to Wikipedia
Linked PoliticalMashup: design choices
→ keep the URIs and linking structure
→ re-use existing vocabularies
→ link to the Linked Open Data cloud
→ separate the structure from the text
The ParlBench Data Sets II
parties: Dutch political parties
members: members of the Dutch parliament
proceedings: structure of the Dutch parliamentary proceedings
paragraphs: content of speeches of the parliamentary meetings
tagged entities: links from the paragraphs to DBpedia
# of triples per data set:
parties: 510
members: 33,885
proceedings: ∼36.5M
paragraphs: ∼11.25M
tagged entities: ∼34.4M
total: ∼82.2M
RDF Data Model
Parliamentary Proceedings: ParliPro [2], DC and DC Terms [8]
[Diagram: classes Parliamentary Proceedings, Topic, Stage Direction, Scene, Speech, Paragraph, Parliament Member and Political Party, connected by "has part", "references member" and "references party" relations]
RDF Data Model
Parliament Member: FOAF [4], Bio [3] and DBpedia Ontology [5]
[Diagram: Parliament Member, linked to a DBpedia resource by "same as" and to a Biography by "biography"]
RDF Data Model
Parties: ParliPro [2]
[Diagram: Political Party, linked to a DBpedia resource by "same as"]
RDF Data Model
Paragraphs: ParliPro [2]
[Diagram: Paragraph, linked to the content of the paragraph by "has text"]
RDF Data Model
Tagged Entities: MUTO [6], FOAF [4], Basic WGS84 [7]
[Diagram: Paragraph, Tag and DBpedia resource; the DBpedia resource is attached by "has auto meaning" and "is a" Person, Organization or Spatial Thing]
19 ParlBench queries: 4 micro-benchmarks
→ 3 Average, e.g.
A0: Retrieve the average number of people who spoke per topic.
→ 5 Count, e.g.
C4: Count the speeches of the female speaker in topics where only one
female spoke.
→ 6 Factual, e.g.
F3: What is the percentage of female speakers?
→ 5 Top 10, e.g.
T4: Retrieve the top 10 longest topics (length = number of paragraphs).
ParlBench experimental run
Test Machine
→ MacBook Pro + Mac OS X Lion 10.7.6 x64
→ CPUs: 2.8 GHz Intel Core i7 (2x2 cores)
→ Memory: 8GB
ParlBench experimental run
System Under Test
→ Virtuoso Open Source Edition v.06.01.3, native RDF store
→ default Virtuoso index scheme
→ configuration for large data sets loading
ParlBench experimental run
Experimental set-up
→ 8 test collections: Parties, Members, scaled Proceedings (from 1 to
100%)
→ single user mode
→ 1 run = 10 permutations of 19 queries (190 queries)
→ warm-up period: 5 runs (950 queries)
→ measuring period: 3 runs (570 queries)
→ query response time: mean over all permutations of all measured runs
(10 × 3 = 30 executions per query)
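The counts on this slide can be reproduced with a short sketch. It assumes the 10 permutations of a run are random orderings of the 19 queries; the slides do not say how the permutations were chosen, and the fixed seed and helper names are illustrative only.

```python
import random

QUERY_IDS = (
    ["A0", "A1", "A2"]
    + ["C0", "C1", "C2", "C3", "C4"]
    + ["F0", "F1", "F2", "F3", "F4", "F5"]
    + ["T0", "T1", "T2", "T3", "T4"]
)
PERMUTATIONS_PER_RUN = 10
WARMUP_RUNS = 5
MEASURED_RUNS = 3


def make_run(rng):
    """Return one run: 10 orderings of the 19 queries (random order is an assumption)."""
    return [rng.sample(QUERY_IDS, len(QUERY_IDS)) for _ in range(PERMUTATIONS_PER_RUN)]


def executions_per_query(runs):
    """Count how often each query is executed across the given runs."""
    counts = {query: 0 for query in QUERY_IDS}
    for run in runs:
        for permutation in run:
            for query in permutation:
                counts[query] += 1
    return counts


rng = random.Random(0)  # fixed seed, chosen here only for reproducibility
warmup = [make_run(rng) for _ in range(WARMUP_RUNS)]
measured = [make_run(rng) for _ in range(MEASURED_RUNS)]

queries_per_run = sum(len(p) for p in measured[0])
warmup_queries = sum(len(p) for run in warmup for p in run)
measured_queries = sum(len(p) for run in measured for p in run)
per_query = executions_per_query(measured)

print(queries_per_run, warmup_queries, measured_queries, per_query["T4"])
```

Each query is therefore timed 30 times in the measuring period, and the reported response time is the mean of those 30 timings.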
Scaling of proceedings
Scaling factor   # of triples
1%               ∼0.5M
2%               ∼1M
4%               ∼1.9M
8%               ∼3.9M
16%              ∼7.6M
32%              ∼15M
64%              ∼23M
100%             ∼36.5M
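As a small worked example, the approximate triple counts above can be expressed as a share of the full proceedings collection. The figures are the rounded values from the slide, so the shares are rough as well.

```python
# Approximate triple counts (in millions) per scaling factor, copied from the slide.
SCALING = {1: 0.5, 2: 1.0, 4: 1.9, 8: 3.9, 16: 7.6, 32: 15.0, 64: 23.0, 100: 36.5}

full = SCALING[100]
# Share of the full collection's triples held by each scaled collection, in percent.
shares = {factor: round(100 * millions / full, 1) for factor, millions in SCALING.items()}

for factor, share in shares.items():
    print(f"{factor:>3}% collection: ~{SCALING[factor]}M triples, {share}% of the full set")
```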
Loading Time
[Plot: loading time in seconds (log2 scale, 1 to 4096) against size of proceedings in % (1 to 100)]
Query Response Time by Micro-Benchmarks
[Plot: sum of execution time in seconds (log2 scale, 0.25 to 256) against size of proceedings in % (1 to 100); one series per micro-benchmark: average, count, factual, top10]
Query Response Time on the Largest Collection (∼36M)
[Bar chart: response time in seconds (axis 0 to 170) per query; bars grouped by micro-benchmark: average, count, factual, top10]
Queries: A0 A1 A2 C0 C1 C2 C3 C4 F0 F1 F2 F3 F4 F5 T0 T1 T2 T3 T4
Bar labels, sec: 45.9422 39.5885 47.1268 2.4212 10.6883 1.4383 0.8649 30.0118 7.9996 78.1858 22.3778 22.4192 0.1053 48.8887 0.8357 10.2813 41.6915 0.9241 168.1313
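The per-query times can be rolled up by micro-benchmark. This sketch assumes the 19 bar labels pair with the queries A0 to T4 from left to right; the slide text does not state that pairing explicitly.

```python
QUERY_IDS = "A0 A1 A2 C0 C1 C2 C3 C4 F0 F1 F2 F3 F4 F5 T0 T1 T2 T3 T4".split()
# Bar labels in the order they appear on the slide; pairing them with the
# queries left to right is an assumption.
TIMES_SEC = [
    45.9422, 39.5885, 47.1268,
    2.4212, 10.6883, 1.4383, 0.8649, 30.0118,
    7.9996, 78.1858, 22.3778, 22.4192, 0.1053, 48.8887,
    0.8357, 10.2813, 41.6915, 0.9241, 168.1313,
]
GROUPS = {"A": "average", "C": "count", "F": "factual", "T": "top10"}

times = dict(zip(QUERY_IDS, TIMES_SEC))

# Sum the response times of all queries in each micro-benchmark.
totals = {}
for query, seconds in times.items():
    group = GROUPS[query[0]]
    totals[group] = totals.get(group, 0.0) + seconds

slowest = max(times, key=times.get)

for group, total in totals.items():
    print(f"{group}: {total:.4f} s")
print("slowest query:", slowest)
```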
T4: Retrieve the top 10 longest topics (length = number of paragraphs).
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX parlipro: <http://purl.org/vocab/parlipro#>
SELECT ?topic (COUNT(?par) AS ?numOfPars)
WHERE {
  ?topic rdf:type parlipro:Topic .
  ?speech rdf:type parlipro:Speech .
  ?speech dcterms:hasPart ?par .
  ?par rdf:type parlipro:Paragraph .
  # a speech belongs to a topic directly, via a stage direction, or via a scene
  { ?topic dcterms:hasPart ?speech . }
  UNION {
    ?topic dcterms:hasPart ?sd .
    ?sd rdf:type parlipro:StageDirection .
    ?sd dcterms:hasPart ?speech .
  }
  UNION {
    ?topic dcterms:hasPart ?scene .
    ?scene rdf:type parlipro:Scene .
    ?scene dcterms:hasPart ?speech .
  }
}
GROUP BY ?topic
ORDER BY DESC(?numOfPars)
LIMIT 10
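What T4 computes can be illustrated without an RDF store. The sketch below mimics the query on a hypothetical toy hierarchy held in plain dictionaries; the node names and the data are invented for illustration and are not taken from the ParlBench data sets.

```python
from collections import Counter

# Hypothetical toy hierarchy: parent -> children, standing in for dcterms:hasPart.
HAS_PART = {
    "topic1": ["speech1", "scene1"],
    "topic2": ["sd1"],
    "scene1": ["speech2"],
    "sd1": ["speech3"],
    "speech1": ["par1", "par2"],
    "speech2": ["par3"],
    "speech3": ["par4", "par5", "par6", "par7"],
}
# Stand-in for rdf:type.
TYPES = {
    "topic1": "Topic", "topic2": "Topic",
    "scene1": "Scene", "sd1": "StageDirection",
    "speech1": "Speech", "speech2": "Speech", "speech3": "Speech",
    **{f"par{i}": "Paragraph" for i in range(1, 8)},
}


def speeches_of(topic):
    """Yield speeches reached directly or through one Scene or StageDirection."""
    for child in HAS_PART.get(topic, []):
        if TYPES[child] == "Speech":
            yield child
        elif TYPES[child] in ("Scene", "StageDirection"):
            for grandchild in HAS_PART.get(child, []):
                if TYPES[grandchild] == "Speech":
                    yield grandchild


def top_topics(limit=10):
    """Count paragraphs per topic and return the `limit` largest, like T4."""
    counts = Counter()
    for node, node_type in TYPES.items():
        if node_type != "Topic":
            continue
        for speech in speeches_of(node):
            for par in HAS_PART.get(speech, []):
                if TYPES[par] == "Paragraph":
                    counts[node] += 1
    return counts.most_common(limit)


print(top_topics())
```

The three branches in `speeches_of` play the role of the three UNION alternatives in the query.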
Characteristics of ParlBench queries
Queries by micro-benchmark:
Average: A0 A1 A2
Count: C0 C1 C2 C3 C4
Factual: F0 F1 F2 F3 F4 F5
Top 10: T0 T1 T2 T3 T4
Number of queries using each SPARQL feature:
FILTER: 8
UNION: 9
LIMIT: 7
ORDER BY: 7
GROUP BY: 12
COUNT: 17
DISTINCT: 4
AVG: 3
negation: 1
OPTIONAL: 2
subquery: 7
blank node scoping: 9
# of triple patterns per query:
A0 10, A1 9, A2 12, C0 5, C1 5, C2 5, C3 6, C4 13, F0 8, F1 16, F2 6, F3 6, F4 2, F5 4, T0 2, T1 4, T2 9, T3 3, T4 11
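The triple-pattern row can be summarised per micro-benchmark. The sketch assumes the 19 counts line up with the queries A0 to T4 in order, which agrees with the T2 and T4 queries shown in the slides (9 and 11 triple patterns).

```python
QUERY_IDS = "A0 A1 A2 C0 C1 C2 C3 C4 F0 F1 F2 F3 F4 F5 T0 T1 T2 T3 T4".split()
# Counts from the "# of triple patterns" row, in slide order.
TRIPLE_PATTERNS = [10, 9, 12, 5, 5, 5, 6, 13, 8, 16, 6, 6, 2, 4, 2, 4, 9, 3, 11]
GROUPS = {"A": "Average", "C": "Count", "F": "Factual", "T": "Top 10"}

patterns = dict(zip(QUERY_IDS, TRIPLE_PATTERNS))

# Total number of triple patterns per micro-benchmark.
per_group = {}
for query, count in patterns.items():
    group = GROUPS[query[0]]
    per_group[group] = per_group.get(group, 0) + count

largest = max(patterns, key=patterns.get)

print(per_group)
print("query with most triple patterns:", largest, patterns[largest])
```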
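T2: Retrieve the top 10 topics with the most speeches
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX parlipro: <http://purl.org/vocab/parlipro#>
SELECT ?topic (COUNT(?speech) AS ?numOfSpeeches)
WHERE {
  ?topic rdf:type parlipro:Topic .
  ?speech rdf:type parlipro:Speech .
  # a speech belongs to a topic directly, via a stage direction, or via a scene
  { ?topic dcterms:hasPart ?speech . }
  UNION {
    ?topic dcterms:hasPart ?sd .
    ?sd rdf:type parlipro:StageDirection .
    ?sd dcterms:hasPart ?speech .
  }
  UNION {
    ?topic dcterms:hasPart ?scene .
    ?scene rdf:type parlipro:Scene .
    ?scene dcterms:hasPart ?speech .
  }
}
GROUP BY ?topic
ORDER BY DESC(?numOfSpeeches)
LIMIT 10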
Conclusion
→ SPARQL-benchmark for e-publishing applications
→ large collections of real data
→ intuitive analytical queries
→ micro-benchmarks for SPARQL features analysis
Future work
→ enlarge the data sets
- votes in proceedings
- interlink proceedings with the Dutch legislation data set [1] (>280M
triples)
- tagged entities: more tags
→ extend the queries
- SPARQL 1.1: path expressions
- Linked Open Data integration scenario
→ run the benchmark on other RDF stores
Thank you!
ParlBench resources
→ data access:
→ resolvable URIs
→ RDF data dumps at http://data.politicalmashup.nl/RDF/data/
→ experimental run:
website describing an experimental run
http://data.politicalmashup.nl/RDF/
public SPARQL-endpoint to a test collection
http://data.politicalmashup.nl/sparql/
→ scripts are available at
http://data.politicalmashup.nl/RDF/scripts/
→ ParliPro vocabulary:
RDF representation http://purl.org/vocab/parlipro#
HTML representation
http://data.politicalmashup.nl/RDF/vocabularies/parlipro
Questions?
References
[1] Dutch national regulations in CEN MetaLex. http://doc.metalex.eu/
[2] The Parliamentary Proceedings (ParliPro) Vocabulary. http://purl.org/vocab/parlipro#
[3] BIO: A vocabulary for biographical information. http://vocab.org/bio
[4] The Friend of a Friend Vocabulary (FOAF). http://xmlns.com/foaf/0.1/
[5] The DBpedia Ontology. http://dbpedia.org/ontology/
[6] The Modular Unified Tagging Ontology (MUTO). http://muto.socialtagging.org/
[7] Basic Geo (WGS84 lat/long) Vocabulary. http://www.w3.org/2003/01/geo/wgs84_pos#
[8] Dublin Core Metadata Element Set, http://purl.org/dc/elements/1.1/, and Dublin Core collection description Terms, http://purl.org/dc/terms/
Statistics of the benchmark data sets
dataset # of triples size # of files
members 33,885 14M 3,583
parties 510 612K 151
proceedings 36,503,688 4.15G 51,233
paragraphs 11,250,295 5.77G 51,233
tagged entities 34,449,033 2.57G 34,755
TOTAL: 82,237,411 ∼13G 140,955
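The TOTAL row can be checked against the per-data-set rows with a few lines of arithmetic; the dictionary layout is only an illustrative choice.

```python
# (number of triples, number of files) per data set, copied from the table above.
STATS = {
    "members": (33_885, 3_583),
    "parties": (510, 151),
    "proceedings": (36_503_688, 51_233),
    "paragraphs": (11_250_295, 51_233),
    "tagged entities": (34_449_033, 34_755),
}

total_triples = sum(triples for triples, _ in STATS.values())
total_files = sum(files for _, files in STATS.values())

print(total_triples, total_files)
```

Both sums agree with the TOTAL row (82,237,411 triples in 140,955 files).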
Statistics of the ParlBench data sets
Number of classes: 9
Number of properties: 25
Number of instances per class:
Member: 3,583
Party: 151
Proceedings: 51,233
Topic: 102,289
Stage Direction: 1,776,598
Scene: 189,226
Speech: 2,495,969
Paragraph: 11,211,520
Tagged Entity: 11,383,787
Parliamentary Proceedings: example of encoding
[Diagram: typed resources connected by dcterms:hasPart]
pm:nl.proc.ob.d.h-tk-19992000-2432-2483 (rdf:type parlipro:ParliamentaryProceedings)
pm:nl.proc.ob.d.h-tk-19992000-2432-2483.1 (rdf:type parlipro:Topic)
pm:nl.proc.ob.d.h-tk-19992000-2432-2483.1.7 (rdf:type parlipro:Scene)
pm:nl.proc.ob.d.h-tk-19992000-2432-2483.1.7.30 (rdf:type parlipro:Speech)
pm:nl.proc.ob.d.h-tk-19992000-2432-2483.1.7.30.1 (rdf:type parlipro:Paragraph)
Other statements shown: dc:date 1999-12-08, parlipro:refMember pm:nl.m.02547, parlipro:refParty pm:nl.p.gl
…
Members: example of encoding
[Diagram: pm:nl.m.02547 (rdf:type Parliament Member) with owl:sameAs links to nl-dbpedia:Marijke_Vos and en-dbpedia:Marijke_Vos, foaf:gender dbpedia-ont:Female, and bio:biography _:bio (rdf:type bio:Biography); further values shown: foaf:birthday 1957-05-04, dbpedia-ont:birthPlace Leidschendam]
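Paragraphs and Tagged Entities: example of encoding
Paragraph
pm:nl.proc.ob.d.h-tk-19992000-2432-2483.1.7.30.1 rdf:type parlipro:Paragraph
pm:nl.proc.ob.d.h-tk-19992000-2432-2483.1.7.30.1 has text "Blijkbaar is er nu het een en ander mis in de relatie tussen de Europese Unie en de Russische Federatie. ..." (English: "Apparently a few things are now wrong in the relationship between the European Union and the Russian Federation. ...")
Tagged Entity
pm:nl.proc.ob.d.h-tk-19992000-2432-2483.1.7.30.1 rdf:type parlipro:Paragraph
pm:nl.proc.ob.d.h-tk-19992000-2432-2483.1.7.30.1 muto:hasTag _:tag
_:tag muto:hasAutoMeaning nl-dbpedia:Rusland
nl-dbpedia:Rusland rdf:type geo:SpatialThing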

Weitere ähnliche Inhalte

Was ist angesagt?

RDF Stream Processing: Let's React
RDF Stream Processing: Let's ReactRDF Stream Processing: Let's React
RDF Stream Processing: Let's ReactJean-Paul Calbimonte
 
Interactive Knowledge Discovery over Web of Data.
Interactive Knowledge Discovery over Web of Data.Interactive Knowledge Discovery over Web of Data.
Interactive Knowledge Discovery over Web of Data.Mehwish Alam
 
Accessing R from Python using RPy2
Accessing R from Python using RPy2Accessing R from Python using RPy2
Accessing R from Python using RPy2Ryan Rosario
 
RDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-rRDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-rYanchang Zhao
 
Connecting Stream Reasoners on the Web
Connecting Stream Reasoners on the WebConnecting Stream Reasoners on the Web
Connecting Stream Reasoners on the WebJean-Paul Calbimonte
 
Rethinking Online SPARQL Querying to Support Incremental Result Visualization
Rethinking Online SPARQL Querying to Support Incremental Result VisualizationRethinking Online SPARQL Querying to Support Incremental Result Visualization
Rethinking Online SPARQL Querying to Support Incremental Result VisualizationOlaf Hartig
 
Triplewave: a step towards RDF Stream Processing on the Web
Triplewave: a step towards RDF Stream Processing on the WebTriplewave: a step towards RDF Stream Processing on the Web
Triplewave: a step towards RDF Stream Processing on the WebDaniele Dell'Aglio
 
Mapping Lo Dto Proton Revised [Compatibility Mode]
Mapping Lo Dto Proton Revised [Compatibility Mode]Mapping Lo Dto Proton Revised [Compatibility Mode]
Mapping Lo Dto Proton Revised [Compatibility Mode]Mariana Damova, Ph.D
 
SAFE: Policy Aware SPARQL Query Federation Over RDF Data Cubes
SAFE: Policy Aware SPARQL Query Federation Over RDF Data CubesSAFE: Policy Aware SPARQL Query Federation Over RDF Data Cubes
SAFE: Policy Aware SPARQL Query Federation Over RDF Data CubesMuhammad Saleem
 
RDF Stream Processing Tutorial: RSP implementations
RDF Stream Processing Tutorial: RSP implementationsRDF Stream Processing Tutorial: RSP implementations
RDF Stream Processing Tutorial: RSP implementationsJean-Paul Calbimonte
 
Learning Commonalities in RDF
Learning Commonalities in RDFLearning Commonalities in RDF
Learning Commonalities in RDFSara EL HASSAD
 
OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...
OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...
OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...Yann Pauly
 
Navigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept AnalysisNavigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept AnalysisMehwish Alam
 
TripleWave: Spreading RDF Streams on the Web
TripleWave: Spreading RDF Streams on the WebTripleWave: Spreading RDF Streams on the Web
TripleWave: Spreading RDF Streams on the WebAndrea Mauri
 

Was ist angesagt? (20)

inteSearch: An Intelligent Linked Data Information Access Framework
inteSearch: An Intelligent Linked Data Information Access FrameworkinteSearch: An Intelligent Linked Data Information Access Framework
inteSearch: An Intelligent Linked Data Information Access Framework
 
RDF Stream Processing: Let's React
RDF Stream Processing: Let's ReactRDF Stream Processing: Let's React
RDF Stream Processing: Let's React
 
Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-
 
Interactive Knowledge Discovery over Web of Data.
Interactive Knowledge Discovery over Web of Data.Interactive Knowledge Discovery over Web of Data.
Interactive Knowledge Discovery over Web of Data.
 
Accessing R from Python using RPy2
Accessing R from Python using RPy2Accessing R from Python using RPy2
Accessing R from Python using RPy2
 
RDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-rRDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-r
 
Text Mining with R
Text Mining with RText Mining with R
Text Mining with R
 
Connecting Stream Reasoners on the Web
Connecting Stream Reasoners on the WebConnecting Stream Reasoners on the Web
Connecting Stream Reasoners on the Web
 
Rethinking Online SPARQL Querying to Support Incremental Result Visualization
Rethinking Online SPARQL Querying to Support Incremental Result VisualizationRethinking Online SPARQL Querying to Support Incremental Result Visualization
Rethinking Online SPARQL Querying to Support Incremental Result Visualization
 
Triplewave: a step towards RDF Stream Processing on the Web
Triplewave: a step towards RDF Stream Processing on the WebTriplewave: a step towards RDF Stream Processing on the Web
Triplewave: a step towards RDF Stream Processing on the Web
 
Mapping Lo Dto Proton Revised [Compatibility Mode]
Mapping Lo Dto Proton Revised [Compatibility Mode]Mapping Lo Dto Proton Revised [Compatibility Mode]
Mapping Lo Dto Proton Revised [Compatibility Mode]
 
SAFE: Policy Aware SPARQL Query Federation Over RDF Data Cubes
SAFE: Policy Aware SPARQL Query Federation Over RDF Data CubesSAFE: Policy Aware SPARQL Query Federation Over RDF Data Cubes
SAFE: Policy Aware SPARQL Query Federation Over RDF Data Cubes
 
RDF Stream Processing Tutorial: RSP implementations
RDF Stream Processing Tutorial: RSP implementationsRDF Stream Processing Tutorial: RSP implementations
RDF Stream Processing Tutorial: RSP implementations
 
Triple Stores
Triple StoresTriple Stores
Triple Stores
 
Learning Commonalities in RDF
Learning Commonalities in RDFLearning Commonalities in RDF
Learning Commonalities in RDF
 
OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...
OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...
OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...
 
Navigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept AnalysisNavigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept Analysis
 
LD4KD 2015 - Demos and tools
LD4KD 2015 - Demos and toolsLD4KD 2015 - Demos and tools
LD4KD 2015 - Demos and tools
 
2017 biological databases_part1_vupload
2017 biological databases_part1_vupload2017 biological databases_part1_vupload
2017 biological databases_part1_vupload
 
TripleWave: Spreading RDF Streams on the Web
TripleWave: Spreading RDF Streams on the WebTripleWave: Spreading RDF Streams on the Web
TripleWave: Spreading RDF Streams on the Web
 

Ähnlich wie ParlBench: a SPARQL-benchmark for electronic publishing applications.

2009 0807 Lod Gmod
2009 0807 Lod Gmod2009 0807 Lod Gmod
2009 0807 Lod GmodJun Zhao
 
Emerging technologies /frameworks in Big Data
Emerging technologies /frameworks in Big DataEmerging technologies /frameworks in Big Data
Emerging technologies /frameworks in Big DataRahul Jain
 
Spark Community Update - Spark Summit San Francisco 2015
Spark Community Update - Spark Summit San Francisco 2015Spark Community Update - Spark Summit San Francisco 2015
Spark Community Update - Spark Summit San Francisco 2015Databricks
 
A Comparison Between Python APIs For RDF Processing
A Comparison Between Python APIs For RDF ProcessingA Comparison Between Python APIs For RDF Processing
A Comparison Between Python APIs For RDF Processinglucianb
 
Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013François Belleau
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingPlanetData Network of Excellence
 
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...Oscar Corcho
 
2009 Dils Flyweb
2009 Dils Flyweb2009 Dils Flyweb
2009 Dils FlywebJun Zhao
 
Consuming Linked Data 4/5 Semtech2011
Consuming Linked Data 4/5 Semtech2011Consuming Linked Data 4/5 Semtech2011
Consuming Linked Data 4/5 Semtech2011Juan Sequeda
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataGiorgos Santipantakis
 
2010 03 Lodoxf Openflydata
2010 03 Lodoxf Openflydata2010 03 Lodoxf Openflydata
2010 03 Lodoxf OpenflydataJun Zhao
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Jim Dowling
 
Sustainable queryable access to Linked Data
Sustainable queryable access to Linked DataSustainable queryable access to Linked Data
Sustainable queryable access to Linked DataRuben Verborgh
 
Querying data on the Web – client or server?
Querying data on the Web – client or server?Querying data on the Web – client or server?
Querying data on the Web – client or server?Ruben Verborgh
 
Re-using Media on the Web: Media fragment re-mixing and playout
Re-using Media on the Web: Media fragment re-mixing and playoutRe-using Media on the Web: Media fragment re-mixing and playout
Re-using Media on the Web: Media fragment re-mixing and playoutMediaMixerCommunity
 
Bio2RDF presentation at Combine 2012
Bio2RDF presentation at Combine 2012Bio2RDF presentation at Combine 2012
Bio2RDF presentation at Combine 2012François Belleau
 
State of the Semantic Web
State of the Semantic WebState of the Semantic Web
State of the Semantic WebIvan Herman
 
SPARQL in the Semantic Web
SPARQL in the Semantic WebSPARQL in the Semantic Web
SPARQL in the Semantic WebJan Beeck
 
SampLD, Structural Properties as Proxy for Semantic Relevance
SampLD, Structural Properties as Proxy for Semantic RelevanceSampLD, Structural Properties as Proxy for Semantic Relevance
SampLD, Structural Properties as Proxy for Semantic Relevancelaurensrietveld
 

Ähnlich wie ParlBench: a SPARQL-benchmark for electronic publishing applications. (20)

2009 0807 Lod Gmod
2009 0807 Lod Gmod2009 0807 Lod Gmod
2009 0807 Lod Gmod
 
Emerging technologies /frameworks in Big Data
Emerging technologies /frameworks in Big DataEmerging technologies /frameworks in Big Data
Emerging technologies /frameworks in Big Data
 
Spark Community Update - Spark Summit San Francisco 2015
Spark Community Update - Spark Summit San Francisco 2015Spark Community Update - Spark Summit San Francisco 2015
Spark Community Update - Spark Summit San Francisco 2015
 
A Comparison Between Python APIs For RDF Processing
A Comparison Between Python APIs For RDF ProcessingA Comparison Between Python APIs For RDF Processing
A Comparison Between Python APIs For RDF Processing
 
Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream Processing
 
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
 
2009 Dils Flyweb
2009 Dils Flyweb2009 Dils Flyweb
2009 Dils Flyweb
 
Consuming Linked Data 4/5 Semtech2011
Consuming Linked Data 4/5 Semtech2011Consuming Linked Data 4/5 Semtech2011
Consuming Linked Data 4/5 Semtech2011
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
 
2010 03 Lodoxf Openflydata
2010 03 Lodoxf Openflydata2010 03 Lodoxf Openflydata
2010 03 Lodoxf Openflydata
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks
 
The CIARD RINGValeri
The CIARD RINGValeriThe CIARD RINGValeri
The CIARD RINGValeri
 
Sustainable queryable access to Linked Data
Sustainable queryable access to Linked DataSustainable queryable access to Linked Data
Sustainable queryable access to Linked Data
 
Querying data on the Web – client or server?
Querying data on the Web – client or server?Querying data on the Web – client or server?
Querying data on the Web – client or server?
 
Re-using Media on the Web: Media fragment re-mixing and playout
Re-using Media on the Web: Media fragment re-mixing and playoutRe-using Media on the Web: Media fragment re-mixing and playout
Re-using Media on the Web: Media fragment re-mixing and playout
 
Bio2RDF presentation at Combine 2012
Bio2RDF presentation at Combine 2012Bio2RDF presentation at Combine 2012
Bio2RDF presentation at Combine 2012
 
State of the Semantic Web
State of the Semantic WebState of the Semantic Web
State of the Semantic Web
 
SPARQL in the Semantic Web
SPARQL in the Semantic WebSPARQL in the Semantic Web
SPARQL in the Semantic Web
 
SampLD, Structural Properties as Proxy for Semantic Relevance
SampLD, Structural Properties as Proxy for Semantic RelevanceSampLD, Structural Properties as Proxy for Semantic Relevance
SampLD, Structural Properties as Proxy for Semantic Relevance
 

Kürzlich hochgeladen

Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...Sapna Thakur
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 

Kürzlich hochgeladen (20)

Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
Software Engineering Methodologies (overview)

ParlBench: a SPARQL-benchmark for electronic publishing applications.

  • 1. ParlBench: a SPARQL-benchmark for electronic publishing applications. Tatiana Tarasova, Maarten Marx. University of Amsterdam, Information and Language Processing Systems. May 26, 2013. Workshop on Benchmarking RDF Systems, ESWC 2013.
  • 2. MEDIA, PUBLICATIONS, LIFE-SCIENCES, CROSS-DOMAIN, GEOGRAPHIC, GOVERNMENT
  • 3. MEDIA, PUBLICATIONS, LIFE-SCIENCES, CROSS-DOMAIN, GEOGRAPHIC, GOVERNMENT, ?
  • 6. The ParlBench Benchmark
     Goal:
     → test the performance of RDF stores in the setting of e-publishing applications
     Components:
     → real-world data: Dutch parliamentary proceedings, members and political parties
     → vocabulary: Parliamentary Proceedings [2] (ParliPro) + mix of existing vocabularies
     → 19 analytical SPARQL queries grouped into 4 micro-benchmarks: Average, Count, Factual and Top 10
     Performance metrics:
     → loading time
     → query response time
  • 7. Outline
     1 The ParlBench Benchmark
       Data Sets
       Queries
     2 ParlBench experimental run on Virtuoso
  • 11. The ParlBench Data Sets I
     PoliticalMashup: characteristics
     → Dutch parliamentary proceedings (1814-2013), political parties and politicians
     → richly structured XML documents (∼ 54.000)
     → URIs of concepts
     → metadata: who said what and when
     → links to Wikipedia
     Linked PoliticalMashup: design choices
     → keep the URIs and linking structure
     → re-use existing vocabularies
     → link to the Linked Open Data cloud
     → separate the structure from the text
  • 12. The ParlBench Data Sets II
     parties: Dutch political parties
     members: members of the Dutch parliament
     proceedings: structure of the Dutch parliamentary proceedings
     paragraphs: content of speeches of the parliamentary meetings
     tagged entities: links from the paragraphs to DBpedia

     data set          # of triples
     parties           510
     members           33,885
     proceedings       ∼36.5M
     paragraphs        ∼11.25M
     tagged entities   ∼34.4M
     total             ∼82.2M
  • 13. RDF Data Model. Parliamentary Proceedings: ParliPro [2], DC and DC Terms [8]
     [Diagram] Classes: Parliamentary Proceedings, Topic, Stage Direction, Scene, Speech, Paragraph, Parliament Member, Political Party. Relations: has part, references member, references party.
  • 14. RDF Data Model. Parliament Member: FOAF [4], Bio [3] and DBpedia Ontology [5]
     [Diagram] Parliament Member: same as DBpedia resource; biography Biography.
  • 15. RDF Data Model. Parties: ParliPro [2]
     [Diagram] Political Party: same as DBpedia resource.
  • 16. RDF Data Model. Paragraphs: ParliPro [2]
     [Diagram] Paragraph: has text Content of the paragraph.
  • 17. RDF Data Model. Tagged Entities: MUTO [6], FOAF [4], Basic WGS84 [7]
     [Diagram] Nodes: Paragraph, Tag, DBpedia resource. Tag: has auto meaning DBpedia resource. DBpedia resource: is a Person, is a Organization, or is a Spatial Thing.
  • 19. 19 ParlBench queries: 4 micro-benchmarks
     → 3 Average, e.g. A0: Retrieve the average number of people who spoke per topic.
     → 5 Count, e.g. C4: Count the speeches of a female speaker in topics where only one female spoke.
     → 6 Factual, e.g. F3: What is the percentage of female speakers?
     → 5 Top 10, e.g. T4: Retrieve the top 10 longest topics (length measured in number of paragraphs).
  • 21. ParlBench experimental run: Test Machine
     → MacBook Pro + Mac OS X Lion 10.7.6 x64
     → CPUs: 2.8 GHz Intel Core i7 (2x2 cores)
     → Memory: 8GB
  • 22. ParlBench experimental run: System Under Test
     → Virtuoso Open Source Edition v.06.01.3, native RDF store
     → default Virtuoso index scheme
     → configuration for large data sets loading
  • 23. ParlBench experimental run: Experimental set-up
     → 8 test collections: Parties, Members, scaled Proceedings (from 1 to 100%)
     → single user mode
     → 1 run = 10 permutations of 19 queries (190 queries)
     → warm-up period: 5 runs (950 queries)
     → measuring period: 3 runs (570 queries)
     → query response time: mean of all the permutations of all the runs (10*3 = 30 runs)

     Scaling of proceedings
     Scaling factor   # of triples
     1%               ∼0.5M
     2%               ∼1M
     4%               ∼1.9M
     8%               ∼3.9M
     16%              ∼7.6M
     32%              ∼15M
     64%              ∼23M
     100%             ∼36.5M
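
The run structure on slide 23 can be sketched in Python. This is a minimal sketch, not the authors' scripts: the query labels come from the slides, while `run_query`, the fixed seed, and the use of random shuffles to produce the permutations are assumptions made for illustration.

```python
import random
import time
from statistics import mean

QUERIES = ["A0", "A1", "A2", "C0", "C1", "C2", "C3", "C4",
           "F0", "F1", "F2", "F3", "F4", "F5",
           "T0", "T1", "T2", "T3", "T4"]
PERMUTATIONS_PER_RUN = 10
WARMUP_RUNS = 5
MEASURED_RUNS = 3


def run_query(name):
    """Hypothetical stand-in for one query execution; returns elapsed seconds."""
    start = time.perf_counter()
    # A real harness would send the SPARQL text for `name` to the store here.
    return time.perf_counter() - start


def one_run(rng):
    """One run = 10 shuffled orderings of the 19 queries (190 executions)."""
    timings = []
    for _ in range(PERMUTATIONS_PER_RUN):
        order = QUERIES[:]
        rng.shuffle(order)
        timings.extend((q, run_query(q)) for q in order)
    return timings


def benchmark(seed=0):
    rng = random.Random(seed)
    # Warm-up executions are performed but their timings are not reported.
    warmup = [t for _ in range(WARMUP_RUNS) for t in one_run(rng)]
    measured = [t for _ in range(MEASURED_RUNS) for t in one_run(rng)]
    per_query = {q: [] for q in QUERIES}
    for q, secs in measured:
        per_query[q].append(secs)
    # Each query ends up with 10 permutations * 3 runs = 30 timings.
    means = {q: mean(v) for q, v in per_query.items()}
    return len(warmup), len(measured), per_query, means
```

Calling `benchmark()` yields 950 warm-up executions and 570 measured ones, with 30 timings per query feeding each mean.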
  • 24. Loading Time, log2 (time, sec)
     [Figure] x-axis: size of proceedings, % (1 to 100); y-axis: time, sec (log2 scale, 1 to 4096).
  • 25. Query Response Time by Micro-Benchmarks, log2 (SUM(time), sec)
     [Figure] x-axis: size of proceedings, % (1 to 100); y-axis: sum of execution time, sec (log2 scale, 0.25 to 256); series: average, count, factual, top10.
  • 26. Query Response Time on the Largest Collection (∼36M)
     [Figure] Bar chart; x-axis: queries; y-axis: time, sec (0 to 170); groups: average, count, factual, top10. Bar labels, in query order:

     query   time, sec
     A0      45.9422
     A1      39.5885
     A2      47.1268
     C0      2.4212
     C1      10.6883
     C2      1.4383
     C3      0.8649
     C4      30.0118
     F0      7.9996
     F1      78.1858
     F2      22.3778
     F3      22.4192
     F4      0.1053
     F5      48.8887
     T0      0.8357
     T1      10.2813
     T2      41.6915
     T3      0.9241
     T4      168.1313
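
To relate the per-query times on slide 26 to the per-micro-benchmark sums plotted on slide 25, the values can be grouped by the first letter of the query name. This is a rough sketch; it assumes that the bar labels in the transcript appear in the same order as the query names, which the slide text does not state explicitly.

```python
# Query names and bar labels as they appear on slide 26.
# Assumption: the i-th label belongs to the i-th query.
QUERIES = "A0 A1 A2 C0 C1 C2 C3 C4 F0 F1 F2 F3 F4 F5 T0 T1 T2 T3 T4".split()
TIMES = [45.9422, 39.5885, 47.1268,
         2.4212, 10.6883, 1.4383, 0.8649, 30.0118,
         7.9996, 78.1858, 22.3778, 22.4192, 0.1053, 48.8887,
         0.8357, 10.2813, 41.6915, 0.9241, 168.1313]
GROUPS = {"A": "average", "C": "count", "F": "factual", "T": "top10"}


def sum_by_micro_benchmark(queries, times):
    """Add up response times per micro-benchmark (first letter of the query)."""
    totals = {}
    for query, seconds in zip(queries, times):
        group = GROUPS[query[0]]
        totals[group] = totals.get(group, 0.0) + seconds
    return totals


totals = sum_by_micro_benchmark(QUERIES, TIMES)
slowest = max(zip(TIMES, QUERIES))  # (seconds, query name) of the slowest query

for group, seconds in totals.items():
    print(f"{group:8s} {seconds:9.4f} sec")
print("slowest:", slowest[1])
```

Under that ordering assumption, the Top 10 group dominates the total, driven largely by T4.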
  • 27. T4: Retrieve the top 10 longest topics (length measured in number of paragraphs).
     SELECT ?topic (COUNT(?par) AS ?numOfPars)
     WHERE {
       ?topic rdf:type parlipro:Topic .
       ?speech rdf:type parlipro:Speech .
       ?speech dcterms:hasPart ?par .
       ?par rdf:type parlipro:Paragraph .
       { ?topic dcterms:hasPart ?speech . }
       UNION
       { ?topic dcterms:hasPart ?sd .
         ?sd rdf:type parlipro:StageDirection .
         ?sd dcterms:hasPart ?speech . }
       UNION
       { ?topic dcterms:hasPart ?scene .
         ?scene rdf:type parlipro:Scene .
         ?scene dcterms:hasPart ?speech . }
     }
     GROUP BY ?topic
     ORDER BY DESC(?numOfPars)
     LIMIT 10
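
What T4 computes can be illustrated without an RDF store. The sketch below evaluates the same pattern in plain Python over a handful of made-up triples: a speech belongs to a topic directly, via a stage direction, or via a scene, and every paragraph of such a speech counts towards the topic. The resource names (`t1`, `s1`, `p1`, ...) are invented for illustration and are not taken from the ParlBench data.

```python
from collections import Counter

TYPE = "rdf:type"
HAS_PART = "dcterms:hasPart"

# Toy triples, invented for illustration only.
TRIPLES = {
    ("t1", TYPE, "parlipro:Topic"),
    ("t2", TYPE, "parlipro:Topic"),
    ("sc1", TYPE, "parlipro:Scene"),
    ("sd1", TYPE, "parlipro:StageDirection"),
    ("s1", TYPE, "parlipro:Speech"),
    ("s2", TYPE, "parlipro:Speech"),
    ("s3", TYPE, "parlipro:Speech"),
    ("p1", TYPE, "parlipro:Paragraph"),
    ("p2", TYPE, "parlipro:Paragraph"),
    ("p3", TYPE, "parlipro:Paragraph"),
    ("p4", TYPE, "parlipro:Paragraph"),
    ("t1", HAS_PART, "s1"),    # topic -> speech
    ("t1", HAS_PART, "sc1"),   # topic -> scene -> speech
    ("sc1", HAS_PART, "s2"),
    ("t2", HAS_PART, "sd1"),   # topic -> stage direction -> speech
    ("sd1", HAS_PART, "s3"),
    ("s1", HAS_PART, "p1"),
    ("s1", HAS_PART, "p2"),
    ("s2", HAS_PART, "p3"),
    ("s3", HAS_PART, "p4"),
}


def instances(cls):
    """All subjects typed with the given class."""
    return {s for s, p, o in TRIPLES if p == TYPE and o == cls}


def parts(subject):
    """All objects reachable from subject via dcterms:hasPart."""
    return {o for s, p, o in TRIPLES if s == subject and p == HAS_PART}


def top_topics(limit=10):
    speeches = instances("parlipro:Speech")
    paragraphs = instances("parlipro:Paragraph")
    counts = Counter()

    def count_paragraphs(topic, speech):
        for _ in parts(speech) & paragraphs:
            counts[topic] += 1

    for topic in instances("parlipro:Topic"):
        for child in parts(topic):
            # First UNION branch: the topic contains the speech directly.
            if child in speeches:
                count_paragraphs(topic, child)
            # Second and third branches: via a stage direction or a scene.
            for cls in ("parlipro:StageDirection", "parlipro:Scene"):
                if child in instances(cls):
                    for speech in parts(child) & speeches:
                        count_paragraphs(topic, speech)
    # GROUP BY ?topic, ORDER BY DESC(count), LIMIT
    return counts.most_common(limit)


print(top_topics())
```

On the toy data, topic `t1` collects three paragraphs (two from a direct speech, one via a scene) and `t2` collects one via a stage direction.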
  • 28. Characteristics of ParlBench queries
     Micro-benchmarks: Average (A0 A1 A2), Count (C0 C1 C2 C3 C4), Factual (F0 F1 F2 F3 F4 F5), Top 10 (T0 T1 T2 T3 T4)

     Number of queries using each feature:
     feature               # of queries
     FILTER                8
     UNION                 9
     LIMIT                 7
     ORDER BY              7
     GROUP BY              12
     COUNT                 17
     DISTINCT              4
     AVG                   3
     negation              1
     OPTIONAL              2
     subquery              7
     blank node scoping    9

     # of triple patterns per query:
     A0 10, A1 9, A2 12
     C0 5, C1 5, C2 5, C3 6, C4 13
     F0 8, F1 16, F2 6, F3 6, F4 2, F5 4
     T0 2, T1 4, T2 9, T3 3, T4 11
  • 29. T2: Retrieve the top 10 topics with the most speeches
     SELECT ?topic (COUNT(?speech) AS ?numOfSpeeches)
     WHERE {
       ?topic rdf:type parlipro:Topic .
       ?speech rdf:type parlipro:Speech .
       { ?topic dcterms:hasPart ?speech . }
       UNION
       { ?topic dcterms:hasPart ?sd .
         ?sd rdf:type parlipro:StageDirection .
         ?sd dcterms:hasPart ?speech . }
       UNION
       { ?topic dcterms:hasPart ?scene .
         ?scene rdf:type parlipro:Scene .
         ?scene dcterms:hasPart ?speech . }
     }
     GROUP BY ?topic
     ORDER BY DESC(?numOfSpeeches)
     LIMIT 10
  • 30. Conclusion
     → SPARQL-benchmark for e-publishing applications
     → large collections of real data
     → intuitive analytical queries
     → micro-benchmarks for SPARQL features analysis
     Future work
     → enlarge the data sets
       - votes in proceedings
       - interlink proceedings with the Dutch legislation data set [1] (>280M triples)
       - tagged entities: more tags
     → extend the queries
       - SPARQL 1.1: path expressions
       - Linked Open Data integration scenario
     → run the benchmark on other RDF stores
  • 31. Thank you! ParlBench resources
     → data access:
       - resolvable URIs
       - RDF data dumps at http://data.politicalmashup.nl/RDF/data/
     → experimental run:
       - website describing an experimental run: http://data.politicalmashup.nl/RDF/
       - public SPARQL-endpoint to a test collection: http://data.politicalmashup.nl/sparql/
     → scripts are available at http://data.politicalmashup.nl/RDF/scripts/
     → ParliPro vocabulary:
       - RDF representation: http://purl.org/vocab/parlipro#
       - HTML representation: http://data.politicalmashup.nl/RDF/vocabularies/parlipro
  • 33. References I
     [1] Dutch national regulations in CEN MetaLex. http://doc.metalex.eu/
     [2] The Parliamentary Proceedings (ParliPro) Vocabulary. http://purl.org/vocab/parlipro#
     [3] BIO: A vocabulary for biographical information. http://vocab.org/bio
     [4] The Friend of a Friend Vocabulary (FOAF). http://xmlns.com/foaf/0.1/
     [5] The DBpedia Ontology. http://dbpedia.org/ontology/
     [6] The Modular Unified Tagging Ontology (MUTO). http://muto.socialtagging.org/
     [7] Basic Geo (WGS84 lat/long) Vocabulary. http://www.w3.org/2003/01/geo/wgs84_pos#
  • 34. References II
     [8] Dublin Core Metadata Element Set. http://purl.org/dc/elements/1.1/ and Dublin Core collection description Terms. http://purl.org/dc/terms/
  • 35. Statistics of the benchmark data sets

     data set          # of triples   size    # of files
     members           33,885         14M     3,583
     parties           510            612K    151
     proceedings       36,503,688     4.15G   51,233
     paragraphs        11,250,295     5.77G   51,233
     tagged entities   34,449,033     2.57G   34,755
     TOTAL             82,237,411     ∼13G    140,955
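
The TOTAL row on slide 35 can be checked against the per-data-set figures. The short Python sketch below only re-adds the numbers given on the slide; the percentage shares it prints are derived values, not figures from the slides.

```python
# (triples, files) per data set, copied from slide 35.
DATASETS = {
    "members": (33_885, 3_583),
    "parties": (510, 151),
    "proceedings": (36_503_688, 51_233),
    "paragraphs": (11_250_295, 51_233),
    "tagged entities": (34_449_033, 34_755),
}

total_triples = sum(triples for triples, _ in DATASETS.values())
total_files = sum(files for _, files in DATASETS.values())

print(f"total triples: {total_triples:,}")
print(f"total files:   {total_files:,}")
for name, (triples, _) in DATASETS.items():
    # Share of each data set in the total number of triples.
    print(f"{name:16s} {100 * triples / total_triples:5.1f}%")
```

Both sums agree with the TOTAL row: 82,237,411 triples in 140,955 files.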
  • 36. Statistics of the ParlBench data sets
     Number of classes: 9
     Number of properties: 25
     Number of instances per class:
       Member: 3,583
       Party: 151
       Proceedings: 51,233
       Topic: 102,289
       Stage Direction: 1,776,598
       Scene: 189,226
       Speech: 2,495,969
       Paragraph: 11,211,520
       Tagged Entity: 11,383,787
  • 37. Parliamentary Proceedings: example of encoding
     [Diagram] Resources, each linked to the next by dcterms:hasPart:
       pm:nl.proc.ob.d.h-tk-19992000-2432-2483 (rdf:type parlipro:ParliamentaryProceedings)
       pm:nl.proc.ob.d.h-tk-19992000-2432-2483.1 (rdf:type parlipro:Topic)
       pm:nl.proc.ob.d.h-tk-19992000-2432-2483.1.7 (rdf:type parlipro:Scene)
       pm:nl.proc.ob.d.h-tk-19992000-2432-2483.1.7.30 (rdf:type parlipro:Speech)
       pm:nl.proc.ob.d.h-tk-19992000-2432-2483.1.7.30.1 (rdf:type parlipro:Paragraph)
       …
     Other properties shown: dc:date 1999-12-08; parlipro:refMember pm:nl.m.02547; parlipro:refParty pm:nl.p.gl
  • 38. Members: example of encoding
     [Diagram] pm:nl.m.02547 (rdf:type Parliament Member): owl:sameAs nl-dbpedia:Marijke_Vos; owl:sameAs en-dbpedia:Marijke_Vos; foaf:gender dbpedia-ont:Female; bio:biography _:bio (rdf:type bio:Biography).
     Other properties shown: foaf:birthday 1957-05-04; dbpedia-ont:birthPlace Leidschendam
  • 40. Paragraphs and Tagged Entities: example of encoding
     Paragraph
       pm:nl.proc.ob.d.h-tk-19992000-2432-2483.1.7.30.1 rdf:type parlipro:Paragraph
       has text: "Blijkbaar is er nu het een en ander mis in de relatie tussen de Europese Unie en de Russische Federatie. ..." (English: "Apparently a few things are now amiss in the relationship between the European Union and the Russian Federation. ...")
     Tagged Entity
       pm:nl.proc.ob.d.h-tk-19992000-2432-2483.1.7.30.1 muto:hasTag _:tag
       _:tag muto:hasAutoMeaning nl-dbpedia:Rusland
       nl-dbpedia:Rusland rdf:type geo:SpatialThing