Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspective

Heiko Paulheim
Heiko Paulheim, Professor for Data Science at the University of Mannheim
9/19/2019 Heiko Paulheim 1
Big Data, Smart Algorithms, and Market Power
A Computer Scientist’s Perspective
Heiko Paulheim
Chair for Data Science
University of Mannheim
Introductory Example: GPS vs. Smart Phones
• Tests show: smart phones do the job better
– with smart phones on the rise, GPS sales decline
[Chart: GPS sales vs. smart phone sales; vertical axis from 0 to 30,000]
Source: Statista
Data for Germany;
US looks similar
Computer Science Interlude: Navigation
• Problem: find the shortest path through a network
• Solution: known since the 1950s
– can be written down in fewer than 20 lines
[Figure: road network from Start to End; edge lengths between 1 km and 3 km]
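The slide does not name the algorithm. A solution "known since the 1950s" is presumably Dijkstra's algorithm, so the sketch below assumes that. The function name `shortest_path` and the road network `roads` are invented for illustration; the figure's actual topology is not recoverable from the text.

```python
import heapq

def shortest_path(graph, start, end):
    """Dijkstra's algorithm: returns (total_cost, path), or (inf, []) if unreachable."""
    queue = [(0, start, [start])]          # (cost so far, node, path taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)   # cheapest open node first
        if node == end:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []

# Hypothetical road network (not the one on the slide); edge lengths in km.
roads = {
    "Start": {"A": 2, "B": 1},
    "A": {"C": 2, "End": 3},
    "B": {"A": 1, "C": 1},
    "C": {"End": 2},
    "End": {},
}

print(shortest_path(roads, "Start", "End"))  # (4, ['Start', 'B', 'C', 'End'])
```

The function body stays well under 20 lines, and nothing in it is specific to distances: any non-negative edge weight works.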
Computer Science Interlude: Navigation
• Usually, we do not want the shortest way
– but the fastest
• We need to estimate times
[Figure: the same network with an estimated travel time per edge, between 0:05 and 0:15]
Estimating Times for Edges
• Static: path length and speed limit
• Dynamic: live car movements
• Google Maps: owned by Google
– So is Android (market share US: 48%, Germany: 73%, China: 79%)
– i.e., about one Android phone in every other car
Source: https://gs.statcounter.com/os-market-share/mobile/
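The static and dynamic estimates could be sketched as follows. This is a minimal illustration, not how any real navigation service computes travel times. The function names, the use of the median, the `min_samples` threshold, and the 1 km/h floor are all assumptions.

```python
from statistics import median

def static_minutes(length_km, speed_limit_kmh):
    """Static estimate: travel time from path length and speed limit only."""
    return 60.0 * length_km / speed_limit_kmh

def dynamic_minutes(length_km, observed_speeds_kmh, speed_limit_kmh, min_samples=3):
    """Dynamic estimate: use live speeds of cars on the edge when enough are seen.

    Falls back to the static estimate if fewer than min_samples cars report
    (an assumed threshold). The 1 km/h floor avoids division by zero in a
    standstill.
    """
    if len(observed_speeds_kmh) < min_samples:
        return static_minutes(length_km, speed_limit_kmh)
    typical_speed = max(median(observed_speeds_kmh), 1.0)
    return 60.0 * length_km / typical_speed

# A 5 km edge with a 50 km/h limit: 6 minutes on paper ...
print(static_minutes(5, 50))                  # 6.0
# ... but 12 minutes if phones on that edge report about 25 km/h.
print(dynamic_minutes(5, [20, 25, 30], 50))   # 12.0
# With too few reports, the static estimate is used.
print(dynamic_minutes(5, [20], 50))           # 6.0
```

The more phones report from an edge, the more often the dynamic branch applies, which is where the market share figures above come in.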
Visual Depiction
• One Android phone in every other car
Image: Bing Maps
Improving Navigation
• Ingredients:
– A simple standard textbook algorithm from the 1950s
– A lot of data
• Better navigation
– Usually: not by smarter algorithms
– But by better (=bigger) data!
[Figure: the same network with updated travel times per edge, between 0:05 and 0:25]
Image: https://neo4j.com/blog/top-13-resources-graph-theory-algorithms/
A.I. Winters and A Paradigm Shift
• AI has seen a massive uptake since the 2010s
– But using very different paradigms
[Chart annotations: 1st AI Winter, 2nd AI Winter]
Fast & Horvitz (2016): Long-Term Trends in the Public Perception of Artificial Intelligence
An Example for AI: Go
• 1990s
– Using handcrafted rules
• i.e., smart algorithms
– Often defeated by children
• 2010s
– Using data from millions of games
• i.e., big data
– AlphaGo: Beat some of the world’s best players in 2016
AI in the Big Data Age (1)
• Algorithms are fairly simple and well known
• Data matters
Banko & Brill (2001): Scaling to Very Very Large Corpora for Natural Language Disambiguation
[Chart annotations: smarter algorithm vs. more data]
AI in the Big Data Age (2)
• Algorithms are fairly simple and well known
• Data matters
Banko & Brill (2001): Scaling to Very Very Large Corpora for Natural Language Disambiguation
[Chart annotation: more data: trivial baseline beats smart algorithms]
Big Data: Long vs. Wide Data
• Long data = more records of the same kind
– e.g., GPS data from more users
• Wide data = more information about the same records
– e.g., additional information about users
Lehmberg & Hassanzadeh (2018): Ontology Augmentation Through Matching with Web Tables
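The distinction can be made concrete with a small sketch. All records and field names below (`trips`, `profiles`, `avg_speed_kmh`, `home_city`) are hypothetical and invented for illustration.

```python
# Hypothetical GPS-style records: one row per user.
trips = [
    {"user": "u1", "avg_speed_kmh": 42},
    {"user": "u2", "avg_speed_kmh": 37},
]

# Longer data: more records of the same kind (same columns, more rows).
more_trips = [{"user": "u3", "avg_speed_kmh": 51}]
long_data = trips + more_trips

# Wider data: more information about the same records (same rows, more columns).
profiles = {
    "u1": {"home_city": "Mannheim"},
    "u2": {"home_city": "Hamburg"},
}
wide_data = [{**trip, **profiles.get(trip["user"], {})} for trip in trips]

print(len(long_data), sorted(long_data[0]))   # 3 ['avg_speed_kmh', 'user']
print(len(wide_data), sorted(wide_data[0]))   # 2 ['avg_speed_kmh', 'home_city', 'user']
```

Making data longer only requires more of the same source. Making it wider requires a second source that shares a key (here `user`) with the first.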
It’s All about Patterns in Data
• Examples
– Traffic movements
– Online user behavior
– Cliques in social networks
– …
• Methods:
– Data Mining
– Machine Learning
– …
→ Intensively researched since the 1980s
Image: https://factordaily.com/balaraman-ravindran-reinforcement-learning/
Patterns in Long Data
Patterns in Wide Data
Big Data: Long vs. Wide Data
• Example: YouTube (owned by Google)
– Display videos to the user that are as interesting as possible
• Long data: users’ interaction histories
• Wide data: users’ interaction histories + Google Web searches + visited places + Google Play music preferences + ...
Big Data: Long vs. Wide Data
• Example: Facebook
– Display as much content of interest as possible
• Long data: user profile and interactions
• Wide data: user profile and interactions + WhatsApp chats
– In Germany, OVG Hamburg prohibits this combination!
Image: https://www.instagram.com/p/Bt3OG4DFOsK/
Big Data: Long vs. Wide Data
• Example: WeChat
• Started as a chat application
– showing advertisements based on chats
– later added: apps-in-app (shopping, payment, …)
– CS perspective: more an OS than an app
• Long data
– Many people’s chats
• Wide data
– Chats
– Shopping history (also includes: products viewed)
– Payment history
Image: Wikipedia
Takeaways
• Modern AI Systems
– Rely on massive amounts of data
– Processed with fairly simple algorithms
• Algorithms are often well known
– e.g., textbooks, research papers
– It is hard to own an algorithm
• Data is crucial
– Longer data (e.g., acquiring more customers)
– Wider data (e.g., merging businesses)
– It is easy to own data