SlideShare ist ein Scribd-Unternehmen logo
1 von 14
Mining the Web of Linked Data 
with RapidMiner 
Introducing the RapidMiner 
Linked Open Data Extension 
Petar Ristoski, Christian Bizer, Heiko Paulheim
Motivation 
Which factors lead to a high corruption rate? 
How to improve the quality of living? 
How to find good books to read? 
How to publish more scientific articles? 
How to prevent inflation? 
What makes cars to consume less fuel? 
How to decrease the electricity consumption? 
10/27/14 Ristoski, Bizer, Paulheim 2
Motivation 
?? 
10/27/14 Ristoski, Bizer, Paulheim 3
Motivation 
Local LOD 
Data 
link combine cleanse transform analyze 
10/27/14 Ristoski, Bizer, Paulheim 4
RapidMiner Linked Open Data Extension 
Introducing RapidMiner: 
● An open source platform for data mining and predictive analytics 
● Processes are designed by wiring operators in a GUI 
(no programming) 
● Operators for data loading, transformation, modeling, visualization, … 
● Scalable, distributed, parallel processing in a cloud environment 
● 200,000 active users 
● Developers can write their own extensions 
10/27/14 Ristoski, Bizer, Paulheim 5
RapidMiner Linked Open Data Extension 
• The extension adds operators for 
– accessing local and remote semantic web data (RDF, SPARQL, …) 
– linking local to remote data (e.g., DBpedia Lookup) 
– enriching local data (e.g., with data properties from LOD sources) 
– automatically following links to other datasets 
– exploiting semantic schemata for optimizing attribute subset selection 
(DiscoveryScience'14) 
– matching and fusing data from different sources 
• Data analysts can use it without knowing SPARQL etc. 
10/27/14 Ristoski, Bizer, Paulheim 6
Example Use Case 
• Which factors correlate with the increase of published scientific and 
technical journal articles? 
• RapidMiner workflow: 
– Import data from WorldBank RDF data cube 
– Link countries to DBpedia 
– Explore additional datasets 
– Generate attributes 
– Analyze the results 
• now live! 
10/27/14 Ristoski, Bizer, Paulheim 7
Example Use Case 
• Starting from links to DBpedia, we follow links and collect data from 
– DBpedia 
– Linked GeoData 
– Eurostat 
– GeoNames 
– WHO’s Global Health 
Observatory 
– Linked Energy Data 
– OpenCyc 
– World Factbook 
– YAGO 
• Related data is fused 
– e.g., population figures from different sources 
10/27/14 Ristoski, Bizer, Paulheim 8
Example Use Case 
• Factors that correlate with large number of publications 
– The fragile state index – FSI (positive) 
– Human development index – HDI (positive) 
– GDP (positive) 
• wealthier countries being able to invest 
more federal money into science funding? 
– For EU countries, the number of EU seats (positive) 
• an increasing fraction of EU funding for science 
being attributed to those countries? 
– Many climate indicators (precipitation, hours of sun, temperature) 
• unequal distribution of wealth across different climate zones? 
10/27/14 Ristoski, Bizer, Paulheim 9
Other Use Cases 
• Improving performance of predictive models (RMWorld'14) 
– UCI car dataset: predicting fuel consumption 
• Reducing the prediction error of M5' by half 
– on average, we are wrong by 1.6 instead of 2.9 MPG 
10/27/14 Ristoski, Bizer, Paulheim 10
Other Use Cases 
• Building Semantic Recommeder Systems (ESWC'14) 
• Combines two extensions: 
– Linked Open Data extension 
– Recommender system extension 
• Use data about books 
for content-based recommender 
– best system (out of 24) 
on two out of three tasks 
10/27/14 Ristoski, Bizer, Paulheim 12
Other Use Cases 
• Debugging Linked Open Data 
– loading a subset of statements 
– augment with additional features 
– run outlier detection 
• again: a special 
extension 
• Example: identify wrong 
dataset interlinks 
(WoDOOM'14) 
– AUC up to 85% 
10/27/14 Ristoski, Bizer, Paulheim 13
Summary 
• This challenge entry 
– brings data analysis to the web of data 
– can be used by data analysts without learning SPARQL 
• Availability 
– on the RapidMiner 
marketplace 
– installable from 
inside RapidMiner 
– >4,000 installations 
and counting 
10/27/14 Ristoski, Bizer, Paulheim 14
Mining the Web of Linked Data 
with RapidMiner 
Introducing the RapidMiner 
Linked Open Data Extension 
Petar Ristoski, Christian Bizer, Heiko Paulheim

Weitere ähnliche Inhalte

Was ist angesagt?

The Modern Palimpsest
The Modern PalimpsestThe Modern Palimpsest
The Modern PalimpsestLeigh Dodds
 
Extending Tables with Data from over a Million Websites
 Extending Tables with Data from over a Million Websites Extending Tables with Data from over a Million Websites
Extending Tables with Data from over a Million WebsitesChris Bizer
 
How links can make your open data even greater
How links can make your open data even greaterHow links can make your open data even greater
How links can make your open data even greaterCristina Sarasua
 
Cenitpede: Analyzing Webcrawl
Cenitpede: Analyzing WebcrawlCenitpede: Analyzing Webcrawl
Cenitpede: Analyzing WebcrawlPrimal Pappachan
 
The Graph Structure of the Web - Aggregated by Pay-Level Domain
The Graph Structure of the Web - Aggregated by Pay-Level DomainThe Graph Structure of the Web - Aggregated by Pay-Level Domain
The Graph Structure of the Web - Aggregated by Pay-Level Domainoli-unima
 
Semantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologistsSemantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologistsdgarijo
 
[Webinar] FactForge Debuts: Trump World Data and Instant Ranking of Industry ...
[Webinar] FactForge Debuts: Trump World Data and Instant Ranking of Industry ...[Webinar] FactForge Debuts: Trump World Data and Instant Ranking of Industry ...
[Webinar] FactForge Debuts: Trump World Data and Instant Ranking of Industry ...Ontotext
 
DBpedia Tutorial - Feb 2015, Dublin
DBpedia Tutorial - Feb 2015, DublinDBpedia Tutorial - Feb 2015, Dublin
DBpedia Tutorial - Feb 2015, Dublinm_ackermann
 
Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Anja Jentzsch
 
Aallbibframe em-20130714
Aallbibframe em-20130714Aallbibframe em-20130714
Aallbibframe em-20130714zepheiraorg
 
Web Data Extraction: A Crash Course
Web Data Extraction: A Crash CourseWeb Data Extraction: A Crash Course
Web Data Extraction: A Crash CourseGiorgio Orsi
 
A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...
A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...
A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...Robert Meusel
 
[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...
[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...
[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...Data Beers
 
Maintaining scholarly standards in the digital age: Publishing historical gaz...
Maintaining scholarly standards in the digital age: Publishing historical gaz...Maintaining scholarly standards in the digital age: Publishing historical gaz...
Maintaining scholarly standards in the digital age: Publishing historical gaz...Humphrey Southall
 
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataIntroduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataSören Auer
 
Towards a Unified PageRank for DBpedia and Wikidata
Towards a Unified PageRank for DBpedia and WikidataTowards a Unified PageRank for DBpedia and Wikidata
Towards a Unified PageRank for DBpedia and WikidataAndreas Thalhammer
 
Make Embeddings Semantic Again!
Make Embeddings Semantic Again!Make Embeddings Semantic Again!
Make Embeddings Semantic Again!Heiko Paulheim
 
Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural HeritageBuild Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural HeritageOntotext
 

Was ist angesagt? (20)

The Modern Palimpsest
The Modern PalimpsestThe Modern Palimpsest
The Modern Palimpsest
 
Extending Tables with Data from over a Million Websites
 Extending Tables with Data from over a Million Websites Extending Tables with Data from over a Million Websites
Extending Tables with Data from over a Million Websites
 
How links can make your open data even greater
How links can make your open data even greaterHow links can make your open data even greater
How links can make your open data even greater
 
Linking Open Data
Linking Open DataLinking Open Data
Linking Open Data
 
Cenitpede: Analyzing Webcrawl
Cenitpede: Analyzing WebcrawlCenitpede: Analyzing Webcrawl
Cenitpede: Analyzing Webcrawl
 
The Graph Structure of the Web - Aggregated by Pay-Level Domain
The Graph Structure of the Web - Aggregated by Pay-Level DomainThe Graph Structure of the Web - Aggregated by Pay-Level Domain
The Graph Structure of the Web - Aggregated by Pay-Level Domain
 
Semantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologistsSemantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologists
 
[Webinar] FactForge Debuts: Trump World Data and Instant Ranking of Industry ...
[Webinar] FactForge Debuts: Trump World Data and Instant Ranking of Industry ...[Webinar] FactForge Debuts: Trump World Data and Instant Ranking of Industry ...
[Webinar] FactForge Debuts: Trump World Data and Instant Ranking of Industry ...
 
DBpedia Tutorial - Feb 2015, Dublin
DBpedia Tutorial - Feb 2015, DublinDBpedia Tutorial - Feb 2015, Dublin
DBpedia Tutorial - Feb 2015, Dublin
 
Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)
 
WCIT2010
WCIT2010WCIT2010
WCIT2010
 
Aallbibframe em-20130714
Aallbibframe em-20130714Aallbibframe em-20130714
Aallbibframe em-20130714
 
Web Data Extraction: A Crash Course
Web Data Extraction: A Crash CourseWeb Data Extraction: A Crash Course
Web Data Extraction: A Crash Course
 
A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...
A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...
A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...
 
[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...
[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...
[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...
 
Maintaining scholarly standards in the digital age: Publishing historical gaz...
Maintaining scholarly standards in the digital age: Publishing historical gaz...Maintaining scholarly standards in the digital age: Publishing historical gaz...
Maintaining scholarly standards in the digital age: Publishing historical gaz...
 
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataIntroduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
 
Towards a Unified PageRank for DBpedia and Wikidata
Towards a Unified PageRank for DBpedia and WikidataTowards a Unified PageRank for DBpedia and Wikidata
Towards a Unified PageRank for DBpedia and Wikidata
 
Make Embeddings Semantic Again!
Make Embeddings Semantic Again!Make Embeddings Semantic Again!
Make Embeddings Semantic Again!
 
Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural HeritageBuild Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
 

Andere mochten auch

Slides PAPIs.io'14 RapidMiner
Slides PAPIs.io'14 RapidMinerSlides PAPIs.io'14 RapidMiner
Slides PAPIs.io'14 RapidMinerSabrina Kirstein
 
RapidMiner: Introduction To Rapid Miner
RapidMiner: Introduction To Rapid MinerRapidMiner: Introduction To Rapid Miner
RapidMiner: Introduction To Rapid MinerRapidmining Content
 
RapidMiner: Setting Up A Process
RapidMiner:  Setting Up A ProcessRapidMiner:  Setting Up A Process
RapidMiner: Setting Up A ProcessRapidmining Content
 
Exploiting Linked Open Data as Background Knowledge in Data Mining
Exploiting Linked Open Data as Background Knowledge in Data MiningExploiting Linked Open Data as Background Knowledge in Data Mining
Exploiting Linked Open Data as Background Knowledge in Data MiningHeiko Paulheim
 
RapidMiner: Rapid Miner Products
RapidMiner: Rapid Miner ProductsRapidMiner: Rapid Miner Products
RapidMiner: Rapid Miner ProductsDataminingTools Inc
 
Are you good at doing sports?
Are you good at doing sports?Are you good at doing sports?
Are you good at doing sports?onlychild21
 
Pengumuman sipencatar 2015
Pengumuman sipencatar 2015Pengumuman sipencatar 2015
Pengumuman sipencatar 2015raizin
 
CMG Recommendation Letter PSB JPG
CMG Recommendation Letter PSB JPGCMG Recommendation Letter PSB JPG
CMG Recommendation Letter PSB JPGPia Sanchez
 
Bai tap trac_nghiem_16_units
Bai tap trac_nghiem_16_unitsBai tap trac_nghiem_16_units
Bai tap trac_nghiem_16_unitsGodfrey Tran
 
Best Practice Guideline to Managing On-site Vermiculture Technologies
Best Practice Guideline to Managing On-site Vermiculture TechnologiesBest Practice Guideline to Managing On-site Vermiculture Technologies
Best Practice Guideline to Managing On-site Vermiculture Technologiesx3G9
 
Early College Academy Greeley Colorado
Early College Academy Greeley ColoradoEarly College Academy Greeley Colorado
Early College Academy Greeley ColoradoPaul Francisco
 
Universal Image Loader: Story, Architecture, FAQ
Universal Image Loader: Story, Architecture, FAQUniversal Image Loader: Story, Architecture, FAQ
Universal Image Loader: Story, Architecture, FAQSergey Tarasevich
 
E sky2u.com marketing plan v3.0
E sky2u.com marketing plan v3.0E sky2u.com marketing plan v3.0
E sky2u.com marketing plan v3.0e-sky, Inc
 
Diskripsi diri
Diskripsi diriDiskripsi diri
Diskripsi diridanielpuji
 
Clinical analysis report 14
Clinical analysis report 14Clinical analysis report 14
Clinical analysis report 14Nikee McEyes
 
Lean manufacturin
Lean manufacturinLean manufacturin
Lean manufacturinDana Feo
 
MS Foods Plan A Quarterly Newsletter February 2016
MS Foods Plan A Quarterly Newsletter February 2016MS Foods Plan A Quarterly Newsletter February 2016
MS Foods Plan A Quarterly Newsletter February 2016Ray McCreadie
 

Andere mochten auch (20)

Slides PAPIs.io'14 RapidMiner
Slides PAPIs.io'14 RapidMinerSlides PAPIs.io'14 RapidMiner
Slides PAPIs.io'14 RapidMiner
 
RapidMiner: Introduction To Rapid Miner
RapidMiner: Introduction To Rapid MinerRapidMiner: Introduction To Rapid Miner
RapidMiner: Introduction To Rapid Miner
 
RapidMiner: Setting Up A Process
RapidMiner:  Setting Up A ProcessRapidMiner:  Setting Up A Process
RapidMiner: Setting Up A Process
 
Exploiting Linked Open Data as Background Knowledge in Data Mining
Exploiting Linked Open Data as Background Knowledge in Data MiningExploiting Linked Open Data as Background Knowledge in Data Mining
Exploiting Linked Open Data as Background Knowledge in Data Mining
 
RapidMiner: Rapid Miner Products
RapidMiner: Rapid Miner ProductsRapidMiner: Rapid Miner Products
RapidMiner: Rapid Miner Products
 
Child domestic labor handbook
Child domestic labor handbookChild domestic labor handbook
Child domestic labor handbook
 
Are you good at doing sports?
Are you good at doing sports?Are you good at doing sports?
Are you good at doing sports?
 
Opjo
OpjoOpjo
Opjo
 
Pengumuman sipencatar 2015
Pengumuman sipencatar 2015Pengumuman sipencatar 2015
Pengumuman sipencatar 2015
 
CMG Recommendation Letter PSB JPG
CMG Recommendation Letter PSB JPGCMG Recommendation Letter PSB JPG
CMG Recommendation Letter PSB JPG
 
Bai tap trac_nghiem_16_units
Bai tap trac_nghiem_16_unitsBai tap trac_nghiem_16_units
Bai tap trac_nghiem_16_units
 
Best Practice Guideline to Managing On-site Vermiculture Technologies
Best Practice Guideline to Managing On-site Vermiculture TechnologiesBest Practice Guideline to Managing On-site Vermiculture Technologies
Best Practice Guideline to Managing On-site Vermiculture Technologies
 
Early College Academy Greeley Colorado
Early College Academy Greeley ColoradoEarly College Academy Greeley Colorado
Early College Academy Greeley Colorado
 
Universal Image Loader: Story, Architecture, FAQ
Universal Image Loader: Story, Architecture, FAQUniversal Image Loader: Story, Architecture, FAQ
Universal Image Loader: Story, Architecture, FAQ
 
E sky2u.com marketing plan v3.0
E sky2u.com marketing plan v3.0E sky2u.com marketing plan v3.0
E sky2u.com marketing plan v3.0
 
Diskripsi diri
Diskripsi diriDiskripsi diri
Diskripsi diri
 
Clinical analysis report 14
Clinical analysis report 14Clinical analysis report 14
Clinical analysis report 14
 
Mini-Training: Let's have a rest
Mini-Training: Let's have a restMini-Training: Let's have a rest
Mini-Training: Let's have a rest
 
Lean manufacturin
Lean manufacturinLean manufacturin
Lean manufacturin
 
MS Foods Plan A Quarterly Newsletter February 2016
MS Foods Plan A Quarterly Newsletter February 2016MS Foods Plan A Quarterly Newsletter February 2016
MS Foods Plan A Quarterly Newsletter February 2016
 

Ähnlich wie Mining the Web of Linked Data with RapidMiner

ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)Piet J.H. Daas
 
Inspire hack 2017-linked-data
Inspire hack 2017-linked-dataInspire hack 2017-linked-data
Inspire hack 2017-linked-dataRaul Palma
 
Team 05 linked data generation
Team 05 linked data generationTeam 05 linked data generation
Team 05 linked data generationplan4all
 
Mapping presentation THAG big data from space
Mapping presentation THAG big data from spaceMapping presentation THAG big data from space
Mapping presentation THAG big data from spaceBartosz Szkudlarek
 
RD shared services and research data spring
RD shared services and research data springRD shared services and research data spring
RD shared services and research data springJisc RDM
 
SSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow TutorialSSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow TutorialSSSW
 
Easy SPARQLing for the Building Performance Professional
Easy SPARQLing for the Building Performance ProfessionalEasy SPARQLing for the Building Performance Professional
Easy SPARQLing for the Building Performance ProfessionalMartin Kaltenböck
 
Wed roman tut_open_datapub
Wed roman tut_open_datapubWed roman tut_open_datapub
Wed roman tut_open_datapubeswcsummerschool
 
COMSODE networking session at ICT Lisbon 2015
COMSODE networking session at ICT Lisbon 2015COMSODE networking session at ICT Lisbon 2015
COMSODE networking session at ICT Lisbon 2015Comsode - FP7 project
 
Business cases and costs RDN
Business cases and costs RDNBusiness cases and costs RDN
Business cases and costs RDNJisc RDM
 
Linked Energy Data Generation
Linked Energy Data GenerationLinked Energy Data Generation
Linked Energy Data GenerationFilip Radulovic
 
TFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen Technologien
TFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen TechnologienTFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen Technologien
TFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen TechnologienTourismFastForward
 
Recognising data sharing
Recognising data sharingRecognising data sharing
Recognising data sharingJisc RDM
 
Linked Open Data Principles, benefits of LOD for sustainable development
Linked Open Data Principles, benefits of LOD for sustainable developmentLinked Open Data Principles, benefits of LOD for sustainable development
Linked Open Data Principles, benefits of LOD for sustainable developmentMartin Kaltenböck
 
Big Data HPC Convergence and a bunch of other things
Big Data HPC Convergence and a bunch of other thingsBig Data HPC Convergence and a bunch of other things
Big Data HPC Convergence and a bunch of other thingsGeoffrey Fox
 
Tutorial Data Management and workflows
Tutorial Data Management and workflowsTutorial Data Management and workflows
Tutorial Data Management and workflowsSSSW
 
Repository and preservation systems
Repository and preservation systemsRepository and preservation systems
Repository and preservation systemsJisc
 
A Recipe for Sustainable Software
A Recipe for Sustainable SoftwareA Recipe for Sustainable Software
A Recipe for Sustainable SoftwarePhilip Bourne
 
Solving the data problem for research beyond
Solving the data problem for research beyondSolving the data problem for research beyond
Solving the data problem for research beyondJisc
 

Ähnlich wie Mining the Web of Linked Data with RapidMiner (20)

ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)
 
Inspire hack 2017-linked-data
Inspire hack 2017-linked-dataInspire hack 2017-linked-data
Inspire hack 2017-linked-data
 
Team 05 linked data generation
Team 05 linked data generationTeam 05 linked data generation
Team 05 linked data generation
 
Mapping presentation THAG big data from space
Mapping presentation THAG big data from spaceMapping presentation THAG big data from space
Mapping presentation THAG big data from space
 
RD shared services and research data spring
RD shared services and research data springRD shared services and research data spring
RD shared services and research data spring
 
SSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow TutorialSSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow Tutorial
 
Easy SPARQLing for the Building Performance Professional
Easy SPARQLing for the Building Performance ProfessionalEasy SPARQLing for the Building Performance Professional
Easy SPARQLing for the Building Performance Professional
 
Wed roman tut_open_datapub
Wed roman tut_open_datapubWed roman tut_open_datapub
Wed roman tut_open_datapub
 
COMSODE networking session at ICT Lisbon 2015
COMSODE networking session at ICT Lisbon 2015COMSODE networking session at ICT Lisbon 2015
COMSODE networking session at ICT Lisbon 2015
 
Business cases and costs RDN
Business cases and costs RDNBusiness cases and costs RDN
Business cases and costs RDN
 
Open Data is not Enough
Open Data is not EnoughOpen Data is not Enough
Open Data is not Enough
 
Linked Energy Data Generation
Linked Energy Data GenerationLinked Energy Data Generation
Linked Energy Data Generation
 
TFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen Technologien
TFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen TechnologienTFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen Technologien
TFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen Technologien
 
Recognising data sharing
Recognising data sharingRecognising data sharing
Recognising data sharing
 
Linked Open Data Principles, benefits of LOD for sustainable development
Linked Open Data Principles, benefits of LOD for sustainable developmentLinked Open Data Principles, benefits of LOD for sustainable development
Linked Open Data Principles, benefits of LOD for sustainable development
 
Big Data HPC Convergence and a bunch of other things
Big Data HPC Convergence and a bunch of other thingsBig Data HPC Convergence and a bunch of other things
Big Data HPC Convergence and a bunch of other things
 
Tutorial Data Management and workflows
Tutorial Data Management and workflowsTutorial Data Management and workflows
Tutorial Data Management and workflows
 
Repository and preservation systems
Repository and preservation systemsRepository and preservation systems
Repository and preservation systems
 
A Recipe for Sustainable Software
A Recipe for Sustainable SoftwareA Recipe for Sustainable Software
A Recipe for Sustainable Software
 
Solving the data problem for research beyond
Solving the data problem for research beyondSolving the data problem for research beyond
Solving the data problem for research beyond
 

Mehr von Heiko Paulheim

Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...
Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...
Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...Heiko Paulheim
 
What_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfWhat_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfHeiko Paulheim
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vecHeiko Paulheim
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vecHeiko Paulheim
 
Knowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI SystemsKnowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI SystemsHeiko Paulheim
 
From Wikis to Knowledge Graphs
From Wikis to Knowledge GraphsFrom Wikis to Knowledge Graphs
From Wikis to Knowledge GraphsHeiko Paulheim
 
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...Heiko Paulheim
 
Beyond DBpedia and YAGO – The New Kids on the Knowledge Graph Block
Beyond DBpedia and YAGO – The New Kids  on the Knowledge Graph BlockBeyond DBpedia and YAGO – The New Kids  on the Knowledge Graph Block
Beyond DBpedia and YAGO – The New Kids on the Knowledge Graph BlockHeiko Paulheim
 
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...Heiko Paulheim
 
Machine Learning & Embeddings for Large Knowledge Graphs
Machine Learning & Embeddings  for Large Knowledge GraphsMachine Learning & Embeddings  for Large Knowledge Graphs
Machine Learning & Embeddings for Large Knowledge GraphsHeiko Paulheim
 
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge GraphFrom Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge GraphHeiko Paulheim
 
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...Heiko Paulheim
 
Machine Learning with and for Semantic Web Knowledge Graphs
Machine Learning with and for Semantic Web Knowledge GraphsMachine Learning with and for Semantic Web Knowledge Graphs
Machine Learning with and for Semantic Web Knowledge GraphsHeiko Paulheim
 
Weakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on TwitterWeakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on TwitterHeiko Paulheim
 
Towards Knowledge Graph Profiling
Towards Knowledge Graph ProfilingTowards Knowledge Graph Profiling
Towards Knowledge Graph ProfilingHeiko Paulheim
 
Knowledge Graphs on the Web
Knowledge Graphs on the WebKnowledge Graphs on the Web
Knowledge Graphs on the WebHeiko Paulheim
 
Data-driven Joint Debugging of the DBpedia Mappings and Ontology
Data-driven Joint Debugging of the DBpedia Mappings and OntologyData-driven Joint Debugging of the DBpedia Mappings and Ontology
Data-driven Joint Debugging of the DBpedia Mappings and OntologyHeiko Paulheim
 
Fast Approximate A-box Consistency Checking using Machine Learning
Fast Approximate  A-box Consistency Checking using Machine LearningFast Approximate  A-box Consistency Checking using Machine Learning
Fast Approximate A-box Consistency Checking using Machine LearningHeiko Paulheim
 
Serving DBpedia with DOLCE - More Than Just Adding a Cherry on Top
Serving DBpedia with DOLCE - More Than Just Adding a Cherry on TopServing DBpedia with DOLCE - More Than Just Adding a Cherry on Top
Serving DBpedia with DOLCE - More Than Just Adding a Cherry on TopHeiko Paulheim
 

Mehr von Heiko Paulheim (20)

Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...
Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...
Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...
 
What_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfWhat_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdf
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vec
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vec
 
Knowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI SystemsKnowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
 
From Wikis to Knowledge Graphs
From Wikis to Knowledge GraphsFrom Wikis to Knowledge Graphs
From Wikis to Knowledge Graphs
 
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
 
Beyond DBpedia and YAGO – The New Kids on the Knowledge Graph Block
Beyond DBpedia and YAGO – The New Kids  on the Knowledge Graph BlockBeyond DBpedia and YAGO – The New Kids  on the Knowledge Graph Block
Beyond DBpedia and YAGO – The New Kids on the Knowledge Graph Block
 
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspec...
 
Machine Learning & Embeddings for Large Knowledge Graphs
Machine Learning & Embeddings  for Large Knowledge GraphsMachine Learning & Embeddings  for Large Knowledge Graphs
Machine Learning & Embeddings for Large Knowledge Graphs
 
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge GraphFrom Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
 
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspec...
 
How much is a Triple?
How much is a Triple?How much is a Triple?
How much is a Triple?
 
Machine Learning with and for Semantic Web Knowledge Graphs
Machine Learning with and for Semantic Web Knowledge GraphsMachine Learning with and for Semantic Web Knowledge Graphs
Machine Learning with and for Semantic Web Knowledge Graphs
 
Weakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on TwitterWeakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on Twitter
 
Towards Knowledge Graph Profiling
Towards Knowledge Graph ProfilingTowards Knowledge Graph Profiling
Towards Knowledge Graph Profiling
 
Knowledge Graphs on the Web
Knowledge Graphs on the WebKnowledge Graphs on the Web
Knowledge Graphs on the Web
 
Data-driven Joint Debugging of the DBpedia Mappings and Ontology
Data-driven Joint Debugging of the DBpedia Mappings and OntologyData-driven Joint Debugging of the DBpedia Mappings and Ontology
Data-driven Joint Debugging of the DBpedia Mappings and Ontology
 
Fast Approximate A-box Consistency Checking using Machine Learning
Fast Approximate  A-box Consistency Checking using Machine LearningFast Approximate  A-box Consistency Checking using Machine Learning
Fast Approximate A-box Consistency Checking using Machine Learning
 
Serving DBpedia with DOLCE - More Than Just Adding a Cherry on Top
Serving DBpedia with DOLCE - More Than Just Adding a Cherry on TopServing DBpedia with DOLCE - More Than Just Adding a Cherry on Top
Serving DBpedia with DOLCE - More Than Just Adding a Cherry on Top
 

Kürzlich hochgeladen

Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 

Kürzlich hochgeladen (20)

Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 

Mining the Web of Linked Data with RapidMiner

  • 1. Mining the Web of Linked Data with RapidMiner Introducing the RapidMiner Linked Open Data Extension Petar Ristoski, Christian Bizer, Heiko Paulheim
  • 2. Motivation Which factors lead to a high corruption rate? How to improve the quality of living? How to find good books to read? How to publish more scientific articles? How to prevent inflation? What makes cars to consume less fuel? How to decrease the electricity consumption? 10/27/14 Ristoski, Bizer, Paulheim 2
  • 3. Motivation ?? 10/27/14 Ristoski, Bizer, Paulheim 3
  • 4. Motivation Local LOD Data link combine cleanse transform analyze 10/27/14 Ristoski, Bizer, Paulheim 4
  • 5. RapidMiner Linked Open Data Extension Introducing RapidMiner: ● An open source platform for data mining and predictive analytics ● Processes are designed by wiring operators in a GUI (no programming) ● Operators for data loading, transformation, modeling, visualization, … ● Scalable, distributed, parallel processing in a cloud environment ● 200,000 active users ● Developers can write their own extensions 10/27/14 Ristoski, Bizer, Paulheim 5
  • 6. RapidMiner Linked Open Data Extension • The extension adds operators for – accessing local and remote semantic web data (RDF, SPARQL, …) – linking local to remote data (e.g., DBpedia Lookup) – enriching local data (e.g., with data properties from LOD sources) – automatically following links to other datasets – exploiting semantic schemata for optimizing attribute subset selection (DiscoveryScience'14) – matching and fusing data from different sources • Data analysts can use it without knowing SPARQL etc. 10/27/14 Ristoski, Bizer, Paulheim 6
  • 7. Example Use Case • Which factors correlate with the increase of published scientific and technical journal articles? • RapidMiner workflow: – Import data from WorldBank RDF data cube – Link countries to DBpedia – Explore additional datasets – Generate attributes – Analyze the results • now live! 10/27/14 Ristoski, Bizer, Paulheim 7
  • 8. Example Use Case • Starting from links to DBpedia, we follow links and collect data from – DBpedia – Linked GeoData – Eurostat – GeoNames – WHO’s Global Health Observatory – Linked Energy Data – OpenCyc – World Factbook – YAGO • Related data is fused – e.g., population figures from different sources 10/27/14 Ristoski, Bizer, Paulheim 8
  • 9. Example Use Case • Factors that correlate with large number of publications – The fragile state index – FSI (positive) – Human development index – HDI (positive) – GDP (positive) • wealthier countries being able to invest more federal money into science funding? – For EU countries, the number of EU seats (positive) • an increasing fraction of EU funding for science being attributed to those countries? – Many climate indicators (precipitation, hours of sun, temperature) • unequal distribution of wealth across different climate zones? 10/27/14 Ristoski, Bizer, Paulheim 9
  • 10. Other Use Cases • Improving performance of predictive models (RMWorld'14) – UCI car dataset: predicting fuel consumption • Reducing the prediction error of M5' by half – on average, we are wrong by 1.6 instead of 2.9 MPG 10/27/14 Ristoski, Bizer, Paulheim 10
  • 11. Other Use Cases • Building Semantic Recommeder Systems (ESWC'14) • Combines two extensions: – Linked Open Data extension – Recommender system extension • Use data about books for content-based recommender – best system (out of 24) on two out of three tasks 10/27/14 Ristoski, Bizer, Paulheim 12
  • 12. Other Use Cases • Debugging Linked Open Data – loading a subset of statements – augment with additional features – run outlier detection • again: a special extension • Example: identify wrong dataset interlinks (WoDOOM'14) – AUC up to 85% 10/27/14 Ristoski, Bizer, Paulheim 13
  • 13. Summary • This challenge entry – brings data analysis to the web of data – can be used by data analysts without learning SPARQL • Availability – on the RapidMiner marketplace – installable from inside RapidMiner – >4,000 installations and counting 10/27/14 Ristoski, Bizer, Paulheim 14
  • 14. Mining the Web of Linked Data with RapidMiner Introducing the RapidMiner Linked Open Data Extension Petar Ristoski, Christian Bizer, Heiko Paulheim