3/28/19 Heiko Paulheim 1
Big Data, Smart Algorithms, and Market Power
A Computer Scientist’s Perspective
Heiko Paulheim
Chair for Data Science
University of Mannheim
Introductory Example: GPS vs. Smart Phones
• Tests show: smart phones do the job better
– with smart phones on the rise, GPS sales decline
Chart: GPS sales vs. smartphone sales (axis from 0 to 30,000)
Source: Statista
Computer Science Interlude: Navigation
• Problem: find the shortest path through a network
• Solution: known since the 1950s
– can be written down in less than 20 lines
Figure: road network from Start to End; edges labeled with distances of 1 km, 2 km, and 3 km
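The "less than 20 lines" claim can be illustrated with a minimal sketch of Dijkstra's algorithm, a standard textbook shortest-path method from the 1950s. The slide does not name the algorithm, so that choice is an assumption. The road network below is hypothetical and does not reproduce the figure from the slide; the function name `shortest_path` is likewise made up for illustration.

```python
import heapq

def shortest_path(graph, start, end):
    """Dijkstra's algorithm: return (total_cost, path), or (inf, []) if unreachable."""
    queue = [(0, start, [start])]  # (cost so far, current node, path taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == end:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []

# Hypothetical road network (distances in km), NOT the network from the slide.
roads = {
    "Start": {"A": 2, "B": 1},
    "A": {"Start": 2, "End": 3},
    "B": {"Start": 1, "C": 1},
    "C": {"B": 1, "End": 2},
    "End": {"A": 3, "C": 2},
}

distance, route = shortest_path(roads, "Start", "End")
print(distance, route)
```

Swapping the kilometre weights for estimated travel times turns the same function into a fastest-route search without any change to the algorithm.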
Computer Science Interlude: Navigation
• Usually, we do not want the shortest way
– but the fastest
• We need to estimate times
Figure: road network from Start to End; edges labeled with travel times of 0:05, 0:10, and 0:15
Estimating Times for Edges
• Static: path length and speed limit
• Dynamic: live car movements
• Google Maps: owned by Google
– So is Android
– 57M smart phones in Germany, market share of Android: 80%
● i.e., one Android phone in every other car
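The difference between static and dynamic estimates can be sketched as follows. This is an illustration under stated assumptions, not how Google Maps actually works: the function names are hypothetical, and averaging the observed speeds of phones on an edge is just one simple way to use live movement data.

```python
def static_minutes(length_km, speed_limit_kmh):
    """Static estimate: time to drive the edge at the speed limit."""
    return 60.0 * length_km / speed_limit_kmh

def dynamic_minutes(length_km, speed_limit_kmh, observed_speeds_kmh):
    """Dynamic estimate from live observations (assumed: mean of observed speeds).

    Falls back to the static estimate when no vehicles were observed.
    """
    if not observed_speeds_kmh:
        return static_minutes(length_km, speed_limit_kmh)
    mean_speed = sum(observed_speeds_kmh) / len(observed_speeds_kmh)
    if mean_speed <= 0:
        return float("inf")  # traffic is at a standstill
    return 60.0 * length_km / mean_speed

# Hypothetical 5 km edge with a 50 km/h speed limit.
free_road = static_minutes(5, 50)
jammed = dynamic_minutes(5, 50, [10, 15, 20])
print(free_road, jammed)
```

The more phones report from an edge, the more reliable the dynamic estimate becomes, which is why the share of cars carrying a reporting phone matters.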
Visual Depiction
• One Android phone in every other car
Image: Bing Maps
Improving Navigation
• Ingredients:
– A simple standard textbook algorithm from the 1950s
– A lot of data
• Better navigation
– Usually: not by smarter algorithms
– But by better (=bigger) data!
Figure: road network from Start to End; edges labeled with travel times between 0:05 and 0:25
Image: https://neo4j.com/blog/top-13-resources-graph-theory-algorithms/
A.I. Winters and A Paradigm Shift
• AI has seen a massive uptake since the 2010s
– But using very different paradigms
Chart annotations: 1st AI Winter, 2nd AI Winter
Fast & Horvitz (2016): Long-Term Trends in the Public Perception of Artificial Intelligence
An Example for AI: Go
• 1990s
– Using handcrafted rules
● i.e., smart algorithms
– Often defeated by children
• 2010s
– Using data from millions of games
● i.e., big data
– AlphaGo: beat some of the world’s best players in 2016
AI in the Big Data Age (1)
• Algorithms are fairly simple and well known
• Data matters
Banko & Brill (2001): Scaling to Very Very Large Corpora for Natural Language Disambiguation
Chart annotations: smarter algorithm, more data
AI in the Big Data Age (2)
• Algorithms are fairly simple and well known
• Data matters
Banko & Brill (2001): Scaling to Very Very Large Corpora for Natural Language Disambiguation
Chart annotation: more data: trivial baseline beats smart algorithms
Big Data: Long vs. Wide Data
• Long data = more records of the same kind
– e.g., GPS data from more users
• Wide data = more information about the same records
– e.g., additional information about users
Lehmberg & Hassanzadeh (2018): Ontology Augmentation Through Matching with Web Tables
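The two directions of growth can be sketched with plain Python records. All user IDs, field names, and values below are hypothetical and chosen only to show the shape of the data: long data adds rows, wide data adds columns.

```python
# Hypothetical GPS records: one row per user, same columns for everyone.
gps_users = [
    {"user": "u1", "avg_speed_kmh": 42},
    {"user": "u2", "avg_speed_kmh": 55},
]

# Long data: more records of the same kind (more rows, same columns).
more_gps_users = [{"user": "u3", "avg_speed_kmh": 38}]
long_data = gps_users + more_gps_users

# Wide data: more information about the same records (same rows, more columns).
user_info = {
    "u1": {"home_city": "Mannheim"},
    "u2": {"home_city": "Hamburg"},
}
wide_data = [{**row, **user_info.get(row["user"], {})} for row in gps_users]

print(len(long_data), sorted(wide_data[0]))
```

Making data wider requires a shared key (here `user`) that links the two sources, which is why combining services under one account matters.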
Big Data: Long vs. Wide Data
• Example: YouTube (owned by Google)
– Display videos to the user that are as interesting as possible
• Long data: users’ interaction histories
• Wide data:
users’ interaction histories + Google Web searches + visited places + Google Play music preferences + ...
Big Data: Long vs. Wide Data
• Example: Facebook
– Display as much content of interest as possible
• Long data: user profile and interactions
• Wide data:
user profile and interactions + WhatsApp chats
In Germany, OVG Hamburg prohibits this combination!
Image: https://www.instagram.com/p/Bt3OG4DFOsK/
It’s All about Patterns in Data
• Examples
– Traffic movements
– Online user behavior
– Cliques in social networks
– …
• Methods:
– Data Mining
– Machine Learning
– …
→ Intensively researched since the 1980s
Image: https://factordaily.com/balaraman-ravindran-reinforcement-learning/
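As one small illustration of a pattern in data, the sketch below finds the smallest kind of clique in a social network: groups of three people who are all connected to each other. The friendship graph and the function name `triangles` are hypothetical; real data mining systems use more scalable methods.

```python
from itertools import combinations

def triangles(friends):
    """Return all 3-cliques: groups of three people who are all mutual friends."""
    found = set()
    for person, contacts in friends.items():
        # Check every pair of this person's contacts for a direct link.
        for a, b in combinations(sorted(contacts), 2):
            if b in friends.get(a, set()):
                found.add(tuple(sorted((person, a, b))))
    return sorted(found)

# Hypothetical undirected friendship graph.
friends = {
    "Ann": {"Bob", "Cem", "Dia"},
    "Bob": {"Ann", "Cem"},
    "Cem": {"Ann", "Bob"},
    "Dia": {"Ann"},
}

cliques = triangles(friends)
print(cliques)
```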
Patterns in Long Data
Patterns in Wide Data
Take Aways
• Modern AI Systems
– Rely on massive amounts of data
– Processed with fairly simple algorithms
• Algorithms are often well known
– e.g., textbooks, research papers
– It is hard to own an algorithm
• Data is crucial
– Longer data (e.g., acquiring more customers)
– Wider data (e.g., merging businesses)
– It is easy to own data
3/28/19 Heiko Paulheim 20
Big Data, Smart Algorithms, and Market Power
A Computer Scientist’s Perspective
Heiko Paulheim
Chair for Data Science
University of Mannheim
Heiko Paulheim

Weitere ähnliche Inhalte

Was ist angesagt?

Knowledge Graphs on the Web
Knowledge Graphs on the WebKnowledge Graphs on the Web
Knowledge Graphs on the WebHeiko Paulheim
 
Type Inference on Noisy RDF Data
Type Inference on Noisy RDF DataType Inference on Noisy RDF Data
Type Inference on Noisy RDF DataHeiko Paulheim
 
Towards Knowledge Graph Profiling
Towards Knowledge Graph ProfilingTowards Knowledge Graph Profiling
Towards Knowledge Graph ProfilingHeiko Paulheim
 
Knowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI SystemsKnowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI SystemsHeiko Paulheim
 
From Wikis to Knowledge Graphs
From Wikis to Knowledge GraphsFrom Wikis to Knowledge Graphs
From Wikis to Knowledge GraphsHeiko Paulheim
 
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...Heiko Paulheim
 
data science: past, present, and future
data science: past, present, and futuredata science: past, present, and future
data science: past, present, and futurechris wiggins
 
Searching for reliable business information: free versus fee
Searching for reliable business information: free versus feeSearching for reliable business information: free versus fee
Searching for reliable business information: free versus feevoginip
 
Linked data in the German National Library at the OCLC IFLA round table 2013
Linked data in the German National Library at the OCLC IFLA round table 2013Linked data in the German National Library at the OCLC IFLA round table 2013
Linked data in the German National Library at the OCLC IFLA round table 2013Lars G. Svensson
 
Linked Open Data enhanced Knowledge Discovery
Linked Open Data enhanced  Knowledge DiscoveryLinked Open Data enhanced  Knowledge Discovery
Linked Open Data enhanced Knowledge DiscoveryHeiko Paulheim
 
The changing landscape of search for business information
The changing landscape of search for business informationThe changing landscape of search for business information
The changing landscape of search for business informationvoginip
 
data science history / data science @ NYT
data science history / data science @ NYTdata science history / data science @ NYT
data science history / data science @ NYTchris wiggins
 
LOAD–Linked Open Antipodal Data
LOAD–Linked Open Antipodal DataLOAD–Linked Open Antipodal Data
LOAD–Linked Open Antipodal DataLars G. Svensson
 
EDF2012 Rufus Pollock - Open Data. Where we are where we are going
EDF2012  Rufus Pollock - Open Data. Where we are where we are goingEDF2012  Rufus Pollock - Open Data. Where we are where we are going
EDF2012 Rufus Pollock - Open Data. Where we are where we are goingEuropean Data Forum
 
Data Journalism at HSE conference
Data Journalism at HSE conferenceData Journalism at HSE conference
Data Journalism at HSE conferenceIrina Radchenko
 
data science: past present & future [American Statistical Association (ASA) C...
data science: past present & future [American Statistical Association (ASA) C...data science: past present & future [American Statistical Association (ASA) C...
data science: past present & future [American Statistical Association (ASA) C...chris wiggins
 
Ejrcicio Presentación mapas conceptuales L Liberal
Ejrcicio Presentación mapas conceptuales   L LiberalEjrcicio Presentación mapas conceptuales   L Liberal
Ejrcicio Presentación mapas conceptuales L Liberalliberall
 

Was ist angesagt? (20)

Knowledge Graphs on the Web
Knowledge Graphs on the WebKnowledge Graphs on the Web
Knowledge Graphs on the Web
 
Type Inference on Noisy RDF Data
Type Inference on Noisy RDF DataType Inference on Noisy RDF Data
Type Inference on Noisy RDF Data
 
Towards Knowledge Graph Profiling
Towards Knowledge Graph ProfilingTowards Knowledge Graph Profiling
Towards Knowledge Graph Profiling
 
Knowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI SystemsKnowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
 
From Wikis to Knowledge Graphs
From Wikis to Knowledge GraphsFrom Wikis to Knowledge Graphs
From Wikis to Knowledge Graphs
 
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
 
data science: past, present, and future
data science: past, present, and futuredata science: past, present, and future
data science: past, present, and future
 
Searching for reliable business information: free versus fee
Searching for reliable business information: free versus feeSearching for reliable business information: free versus fee
Searching for reliable business information: free versus fee
 
Linked data in the German National Library at the OCLC IFLA round table 2013
Linked data in the German National Library at the OCLC IFLA round table 2013Linked data in the German National Library at the OCLC IFLA round table 2013
Linked data in the German National Library at the OCLC IFLA round table 2013
 
Linked Open Data enhanced Knowledge Discovery
Linked Open Data enhanced  Knowledge DiscoveryLinked Open Data enhanced  Knowledge Discovery
Linked Open Data enhanced Knowledge Discovery
 
The changing landscape of search for business information
The changing landscape of search for business informationThe changing landscape of search for business information
The changing landscape of search for business information
 
[EN] Breaking the Barriers of Traditional Records Management | Ulrich Kampffm...
[EN] Breaking the Barriers of Traditional Records Management | Ulrich Kampffm...[EN] Breaking the Barriers of Traditional Records Management | Ulrich Kampffm...
[EN] Breaking the Barriers of Traditional Records Management | Ulrich Kampffm...
 
Data on the web
Data on the webData on the web
Data on the web
 
data science history / data science @ NYT
data science history / data science @ NYTdata science history / data science @ NYT
data science history / data science @ NYT
 
LOAD–Linked Open Antipodal Data
LOAD–Linked Open Antipodal DataLOAD–Linked Open Antipodal Data
LOAD–Linked Open Antipodal Data
 
EDF2012 Rufus Pollock - Open Data. Where we are where we are going
EDF2012  Rufus Pollock - Open Data. Where we are where we are goingEDF2012  Rufus Pollock - Open Data. Where we are where we are going
EDF2012 Rufus Pollock - Open Data. Where we are where we are going
 
Data Journalism at HSE conference
Data Journalism at HSE conferenceData Journalism at HSE conference
Data Journalism at HSE conference
 
data science: past present & future [American Statistical Association (ASA) C...
data science: past present & future [American Statistical Association (ASA) C...data science: past present & future [American Statistical Association (ASA) C...
data science: past present & future [American Statistical Association (ASA) C...
 
Ejrcicio Presentación mapas conceptuales L Liberal
Ejrcicio Presentación mapas conceptuales   L LiberalEjrcicio Presentación mapas conceptuales   L Liberal
Ejrcicio Presentación mapas conceptuales L Liberal
 
Digital Economy, Digital Tourism based on Open Data and Open Access Approach
Digital Economy, Digital Tourism based on Open Data and Open Access ApproachDigital Economy, Digital Tourism based on Open Data and Open Access Approach
Digital Economy, Digital Tourism based on Open Data and Open Access Approach
 

Ähnlich wie Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspective

Exploiting Linked Open Data as Background Knowledge in Data Mining
Exploiting Linked Open Data as Background Knowledge in Data MiningExploiting Linked Open Data as Background Knowledge in Data Mining
Exploiting Linked Open Data as Background Knowledge in Data MiningHeiko Paulheim
 
Eduserv Symposium 2013 - New technologies & paradigms, old laws
Eduserv Symposium 2013 - New technologies & paradigms, old lawsEduserv Symposium 2013 - New technologies & paradigms, old laws
Eduserv Symposium 2013 - New technologies & paradigms, old lawsEduserv
 
Alice Andreuzzi, Catchy Srl - "Catchy: Social Data Intelligence"
Alice Andreuzzi, Catchy Srl - "Catchy: Social Data Intelligence"Alice Andreuzzi, Catchy Srl - "Catchy: Social Data Intelligence"
Alice Andreuzzi, Catchy Srl - "Catchy: Social Data Intelligence"Data Driven Innovation
 
Strata Big data presentation
Strata Big data presentationStrata Big data presentation
Strata Big data presentationPiet J.H. Daas
 
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...Geoffrey Fox
 
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Center...
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Center...Big Data Applications & Analytics Motivation: Big Data and the Cloud; Center...
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Center...Geoffrey Fox
 
Big data as a source for official statistics
Big data as a source for official statisticsBig data as a source for official statistics
Big data as a source for official statisticsEdwin de Jonge
 
EU Open Data Strategy
EU Open Data StrategyEU Open Data Strategy
EU Open Data StrategyePSI Platform
 
An overview of Twitter analytics
An overview of Twitter analyticsAn overview of Twitter analytics
An overview of Twitter analyticsDr Wasim Ahmed
 
Wwsss intro2016-final
Wwsss intro2016-finalWwsss intro2016-final
Wwsss intro2016-finalSteffen Staab
 
Big Data and Research Ethics
Big Data and Research EthicsBig Data and Research Ethics
Big Data and Research EthicsJan Schmidt
 
A Year in Open Data - OGD 2012 Key-note
A Year in Open Data - OGD 2012 Key-noteA Year in Open Data - OGD 2012 Key-note
A Year in Open Data - OGD 2012 Key-noteePSI Platform
 
AI, Machine Learning, and Data Science Concepts
AI, Machine Learning, and Data Science ConceptsAI, Machine Learning, and Data Science Concepts
AI, Machine Learning, and Data Science ConceptsDan O'Leary
 
Big Data — Your new best friend
Big Data — Your new best friendBig Data — Your new best friend
Big Data — Your new best friendReuven Lerner
 
Big Data, the Future of Statistics: Experiences at Statistics Netherlands
Big Data, the Future of Statistics: Experiences at Statistics NetherlandsBig Data, the Future of Statistics: Experiences at Statistics Netherlands
Big Data, the Future of Statistics: Experiences at Statistics NetherlandsPiet J.H. Daas
 
#ICO: Best Practices, done by Maksim Balashevich
#ICO: Best Practices, done by Maksim Balashevich#ICO: Best Practices, done by Maksim Balashevich
#ICO: Best Practices, done by Maksim BalashevichElfriede Sixt
 

Ähnlich wie Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspective (20)

Exploiting Linked Open Data as Background Knowledge in Data Mining
Exploiting Linked Open Data as Background Knowledge in Data MiningExploiting Linked Open Data as Background Knowledge in Data Mining
Exploiting Linked Open Data as Background Knowledge in Data Mining
 
Eduserv Symposium 2013 - New technologies & paradigms, old laws
Eduserv Symposium 2013 - New technologies & paradigms, old lawsEduserv Symposium 2013 - New technologies & paradigms, old laws
Eduserv Symposium 2013 - New technologies & paradigms, old laws
 
A Year in Open Data
A Year in Open DataA Year in Open Data
A Year in Open Data
 
Alice Andreuzzi, Catchy Srl - "Catchy: Social Data Intelligence"
Alice Andreuzzi, Catchy Srl - "Catchy: Social Data Intelligence"Alice Andreuzzi, Catchy Srl - "Catchy: Social Data Intelligence"
Alice Andreuzzi, Catchy Srl - "Catchy: Social Data Intelligence"
 
Strata Big data presentation
Strata Big data presentationStrata Big data presentation
Strata Big data presentation
 
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...
 
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Center...
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Center...Big Data Applications & Analytics Motivation: Big Data and the Cloud; Center...
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Center...
 
Big data as a source for official statistics
Big data as a source for official statisticsBig data as a source for official statistics
Big data as a source for official statistics
 
EU Open Data Strategy
EU Open Data StrategyEU Open Data Strategy
EU Open Data Strategy
 
An overview of Twitter analytics
An overview of Twitter analyticsAn overview of Twitter analytics
An overview of Twitter analytics
 
Analyzing social media with Python and other tools (1/4)
Analyzing social media with Python and other tools (1/4)Analyzing social media with Python and other tools (1/4)
Analyzing social media with Python and other tools (1/4)
 
Wwsss intro2016-final
Wwsss intro2016-finalWwsss intro2016-final
Wwsss intro2016-final
 
PSI Reuse: Policy and Opportunities
PSI Reuse: Policy and OpportunitiesPSI Reuse: Policy and Opportunities
PSI Reuse: Policy and Opportunities
 
Big Data and Research Ethics
Big Data and Research EthicsBig Data and Research Ethics
Big Data and Research Ethics
 
A Year in Open Data - OGD 2012 Key-note
A Year in Open Data - OGD 2012 Key-noteA Year in Open Data - OGD 2012 Key-note
A Year in Open Data - OGD 2012 Key-note
 
AI, Machine Learning, and Data Science Concepts
AI, Machine Learning, and Data Science ConceptsAI, Machine Learning, and Data Science Concepts
AI, Machine Learning, and Data Science Concepts
 
Semantic Puzzle
Semantic PuzzleSemantic Puzzle
Semantic Puzzle
 
Big Data — Your new best friend
Big Data — Your new best friendBig Data — Your new best friend
Big Data — Your new best friend
 
Big Data, the Future of Statistics: Experiences at Statistics Netherlands
Big Data, the Future of Statistics: Experiences at Statistics NetherlandsBig Data, the Future of Statistics: Experiences at Statistics Netherlands
Big Data, the Future of Statistics: Experiences at Statistics Netherlands
 
#ICO: Best Practices, done by Maksim Balashevich
#ICO: Best Practices, done by Maksim Balashevich#ICO: Best Practices, done by Maksim Balashevich
#ICO: Best Practices, done by Maksim Balashevich
 

Mehr von Heiko Paulheim

Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...
Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...
Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...Heiko Paulheim
 
What_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfWhat_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfHeiko Paulheim
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vecHeiko Paulheim
 
Weakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on TwitterWeakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on TwitterHeiko Paulheim
 
Data-driven Joint Debugging of the DBpedia Mappings and Ontology
Data-driven Joint Debugging of the DBpedia Mappings and OntologyData-driven Joint Debugging of the DBpedia Mappings and Ontology
Data-driven Joint Debugging of the DBpedia Mappings and OntologyHeiko Paulheim
 
Fast Approximate A-box Consistency Checking using Machine Learning
Fast Approximate  A-box Consistency Checking using Machine LearningFast Approximate  A-box Consistency Checking using Machine Learning
Fast Approximate A-box Consistency Checking using Machine LearningHeiko Paulheim
 
Serving DBpedia with DOLCE - More Than Just Adding a Cherry on Top
Serving DBpedia with DOLCE - More Than Just Adding a Cherry on TopServing DBpedia with DOLCE - More Than Just Adding a Cherry on Top
Serving DBpedia with DOLCE - More Than Just Adding a Cherry on TopHeiko Paulheim
 
Combining Ontology Matchers via Anomaly Detection
Combining Ontology Matchers via Anomaly DetectionCombining Ontology Matchers via Anomaly Detection
Combining Ontology Matchers via Anomaly DetectionHeiko Paulheim
 
Gathering Alternative Surface Forms for DBpedia Entities
Gathering Alternative Surface Forms for DBpedia EntitiesGathering Alternative Surface Forms for DBpedia Entities
Gathering Alternative Surface Forms for DBpedia EntitiesHeiko Paulheim
 
What the Adoption of schema.org Tells about Linked Open Data
What the Adoption of schema.org Tells about Linked Open DataWhat the Adoption of schema.org Tells about Linked Open Data
What the Adoption of schema.org Tells about Linked Open DataHeiko Paulheim
 
Mining the Web of Linked Data with RapidMiner
Mining the Web of Linked Data with RapidMinerMining the Web of Linked Data with RapidMiner
Mining the Web of Linked Data with RapidMinerHeiko Paulheim
 
Data Mining with Background Knowledge from the Web - Introducing the RapidMin...
Data Mining with Background Knowledge from the Web - Introducing the RapidMin...Data Mining with Background Knowledge from the Web - Introducing the RapidMin...
Data Mining with Background Knowledge from the Web - Introducing the RapidMin...Heiko Paulheim
 
Detecting Incorrect Numerical Data in DBpedia
Detecting Incorrect Numerical Data in DBpediaDetecting Incorrect Numerical Data in DBpedia
Detecting Incorrect Numerical Data in DBpediaHeiko Paulheim
 
Identifying Wrong Links between Datasets by Multi-dimensional Outlier Detection
Identifying Wrong Links between Datasets by Multi-dimensional Outlier DetectionIdentifying Wrong Links between Datasets by Multi-dimensional Outlier Detection
Identifying Wrong Links between Datasets by Multi-dimensional Outlier DetectionHeiko Paulheim
 
Extending DBpedia with Wikipedia List Pages
Extending DBpedia with Wikipedia List PagesExtending DBpedia with Wikipedia List Pages
Extending DBpedia with Wikipedia List PagesHeiko Paulheim
 

Mehr von Heiko Paulheim (15)

Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...
Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...Knowledge Graph Generation  from Wikipedia in the Age of ChatGPT:  Knowledge ...
Knowledge Graph Generation from Wikipedia in the Age of ChatGPT: Knowledge ...
 
What_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdfWhat_do_Knowledge_Graph_Embeddings_Learn.pdf
What_do_Knowledge_Graph_Embeddings_Learn.pdf
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vec
 
Weakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on TwitterWeakly Supervised Learning for Fake News Detection on Twitter
Weakly Supervised Learning for Fake News Detection on Twitter
 
Data-driven Joint Debugging of the DBpedia Mappings and Ontology
Data-driven Joint Debugging of the DBpedia Mappings and OntologyData-driven Joint Debugging of the DBpedia Mappings and Ontology
Data-driven Joint Debugging of the DBpedia Mappings and Ontology
 
Fast Approximate A-box Consistency Checking using Machine Learning
Fast Approximate  A-box Consistency Checking using Machine LearningFast Approximate  A-box Consistency Checking using Machine Learning
Fast Approximate A-box Consistency Checking using Machine Learning
 
Serving DBpedia with DOLCE - More Than Just Adding a Cherry on Top
Serving DBpedia with DOLCE - More Than Just Adding a Cherry on TopServing DBpedia with DOLCE - More Than Just Adding a Cherry on Top
Serving DBpedia with DOLCE - More Than Just Adding a Cherry on Top
 
Combining Ontology Matchers via Anomaly Detection
Combining Ontology Matchers via Anomaly DetectionCombining Ontology Matchers via Anomaly Detection
Combining Ontology Matchers via Anomaly Detection
 
Gathering Alternative Surface Forms for DBpedia Entities
Gathering Alternative Surface Forms for DBpedia EntitiesGathering Alternative Surface Forms for DBpedia Entities
Gathering Alternative Surface Forms for DBpedia Entities
 
What the Adoption of schema.org Tells about Linked Open Data
What the Adoption of schema.org Tells about Linked Open DataWhat the Adoption of schema.org Tells about Linked Open Data
What the Adoption of schema.org Tells about Linked Open Data
 
Mining the Web of Linked Data with RapidMiner
Mining the Web of Linked Data with RapidMinerMining the Web of Linked Data with RapidMiner
Mining the Web of Linked Data with RapidMiner
 
Data Mining with Background Knowledge from the Web - Introducing the RapidMin...
Data Mining with Background Knowledge from the Web - Introducing the RapidMin...Data Mining with Background Knowledge from the Web - Introducing the RapidMin...
Data Mining with Background Knowledge from the Web - Introducing the RapidMin...
 
Detecting Incorrect Numerical Data in DBpedia
Detecting Incorrect Numerical Data in DBpediaDetecting Incorrect Numerical Data in DBpedia
Detecting Incorrect Numerical Data in DBpedia
 
Identifying Wrong Links between Datasets by Multi-dimensional Outlier Detection
Identifying Wrong Links between Datasets by Multi-dimensional Outlier DetectionIdentifying Wrong Links between Datasets by Multi-dimensional Outlier Detection
Identifying Wrong Links between Datasets by Multi-dimensional Outlier Detection
 
Extending DBpedia with Wikipedia List Pages
Extending DBpedia with Wikipedia List PagesExtending DBpedia with Wikipedia List Pages
Extending DBpedia with Wikipedia List Pages
 

Kürzlich hochgeladen

Call Us 📲8800102216📞 Call Girls In DLF City Gurgaon
Call Us 📲8800102216📞 Call Girls In DLF City GurgaonCall Us 📲8800102216📞 Call Girls In DLF City Gurgaon
Call Us 📲8800102216📞 Call Girls In DLF City Gurgaoncallgirls2057
 
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...ssuserf63bd7
 
Traction part 2 - EOS Model JAX Bridges.
Traction part 2 - EOS Model JAX Bridges.Traction part 2 - EOS Model JAX Bridges.
Traction part 2 - EOS Model JAX Bridges.Anamaria Contreras
 
Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024Kirill Klimov
 
WSMM Technology February.March Newsletter_vF.pdf
WSMM Technology February.March Newsletter_vF.pdfWSMM Technology February.March Newsletter_vF.pdf
WSMM Technology February.March Newsletter_vF.pdfJamesConcepcion7
 
EUDR Info Meeting Ethiopian coffee exporters
EUDR Info Meeting Ethiopian coffee exportersEUDR Info Meeting Ethiopian coffee exporters
EUDR Info Meeting Ethiopian coffee exportersPeter Horsten
 
Cyber Security Training in Office Environment
Cyber Security Training in Office EnvironmentCyber Security Training in Office Environment
Cyber Security Training in Office Environmentelijahj01012
 
Fordham -How effective decision-making is within the IT department - Analysis...
Fordham -How effective decision-making is within the IT department - Analysis...Fordham -How effective decision-making is within the IT department - Analysis...
Fordham -How effective decision-making is within the IT department - Analysis...Peter Ward
 
Onemonitar Android Spy App Features: Explore Advanced Monitoring Capabilities
Onemonitar Android Spy App Features: Explore Advanced Monitoring CapabilitiesOnemonitar Android Spy App Features: Explore Advanced Monitoring Capabilities
Onemonitar Android Spy App Features: Explore Advanced Monitoring CapabilitiesOne Monitar
 
Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...Americas Got Grants
 
1911 Gold Corporate Presentation Apr 2024.pdf
1911 Gold Corporate Presentation Apr 2024.pdf1911 Gold Corporate Presentation Apr 2024.pdf
1911 Gold Corporate Presentation Apr 2024.pdfShaun Heinrichs
 
Memorándum de Entendimiento (MoU) entre Codelco y SQM
Memorándum de Entendimiento (MoU) entre Codelco y SQMMemorándum de Entendimiento (MoU) entre Codelco y SQM
Memorándum de Entendimiento (MoU) entre Codelco y SQMVoces Mineras
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCRashishs7044
 
Darshan Hiranandani [News About Next CEO].pdf
Darshan Hiranandani [News About Next CEO].pdfDarshan Hiranandani [News About Next CEO].pdf
Darshan Hiranandani [News About Next CEO].pdfShashank Mehta
 
MAHA Global and IPR: Do Actions Speak Louder Than Words?
MAHA Global and IPR: Do Actions Speak Louder Than Words?MAHA Global and IPR: Do Actions Speak Louder Than Words?
MAHA Global and IPR: Do Actions Speak Louder Than Words?Olivia Kresic
 
Innovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfInnovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfrichard876048
 
20220816-EthicsGrade_Scorecard-JP_Morgan_Chase-Q2-63_57.pdf
20220816-EthicsGrade_Scorecard-JP_Morgan_Chase-Q2-63_57.pdf20220816-EthicsGrade_Scorecard-JP_Morgan_Chase-Q2-63_57.pdf
20220816-EthicsGrade_Scorecard-JP_Morgan_Chase-Q2-63_57.pdfChris Skinner
 
Guide Complete Set of Residential Architectural Drawings PDF
Guide Complete Set of Residential Architectural Drawings PDFGuide Complete Set of Residential Architectural Drawings PDF
Guide Complete Set of Residential Architectural Drawings PDFChandresh Chudasama
 


Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspective

  • 1. Big Data, Smart Algorithms, and Market Power: A Computer Scientist’s Perspective. Heiko Paulheim, Chair for Data Science, University of Mannheim. 3/28/19
  • 2. Introductory Example: GPS vs. Smart Phones
      • Tests show: smart phones do the job better
        – with smart phones on the rise, GPS sales decline
      [Chart: GPS sales vs. smart phone sales, y-axis from 0 to 30,000. Source: Statista]
  • 3. Computer Science Interlude: Navigation
      • Problem: find the shortest path through a network
      • Solution: known since the 1950s
        – can be written down in fewer than 20 lines
      [Figure: example graph from Start to End, edges labeled with lengths of 1 to 3 km]
  • 4. Computer Science Interlude: Navigation
      • Usually, we do not want the shortest way
        – but the fastest
      • We need to estimate times
      [Figure: the same kind of graph from Start to End, edges labeled with travel times of 0:05 to 0:15]
  • 5. Estimating Times for Edges
      • Static: path length and speed limit
      • Dynamic: live car movements
      • Google Maps: owned by Google
        – So is Android
        – 57M smart phones in Germany, market share of Android: 80%
          ● i.e., one Android phone in every other car
  • 6. Visual Depiction
      • One Android phone in every other car
      Image: Bing Maps
  • 7. Improving Navigation
      • Ingredients:
        – A simple standard textbook algorithm from the 1950s
        – A lot of data
      • Better navigation
        – Usually: not by smarter algorithms
        – But by better (=bigger) data!
      [Figure: graph from Start to End, edges labeled with travel times of 0:05 to 0:25]
      Image: https://neo4j.com/blog/top-13-resources-graph-theory-algorithms/
  • 8. A.I. Winters and A Paradigm Shift
      • AI has seen a massive uptake since the 2010s
        – But using very different paradigms
      [Figure annotated with the 1st AI Winter and the 2nd AI Winter. Source: Fast & Horvitz (2016): Long-Term Trends in the Public Perception of Artificial Intelligence]
  • 9. An Example of AI: Go
      • 1990s
        – Using handcrafted rules
          ● i.e., smart algorithms
        – Often defeated by children
      • 2010s
        – Using data from millions of games
          ● i.e., big data
        – AlphaGo: beat some of the world’s best players in 2016
  • 10. AI in the Big Data Age (1)
      • Algorithms are fairly simple and well known
      • Data matters
      [Figure annotated with "smarter algorithm" and "more data". Source: Banko & Brill (2001): Scaling to Very Very Large Corpora for Natural Language Disambiguation]
  • 11. AI in the Big Data Age (2)
      • Algorithms are fairly simple and well known
      • Data matters
      [Figure annotated with "more data: trivial baseline beats smart algorithms". Source: Banko & Brill (2001): Scaling to Very Very Large Corpora for Natural Language Disambiguation]
  • 12. Big Data: Long vs. Wide Data
      • Long data = more records of the same kind
        – e.g., GPS data from more users
      • Wide data = more information about the same records
        – e.g., additional information about users
      Lehmberg & Hassanzadeh (2018): Ontology Augmentation Through Matching with Web Tables
  • 13. Big Data: Long vs. Wide Data
      • Example: YouTube (owned by Google)
        – Display videos to the user that are as interesting as possible
      • Long data: users’ interaction histories
      • Wide data: users’ interaction histories + Google Web searches + visited places + Google Play music preferences + ...
  • 14. Big Data: Long vs. Wide Data
      • Example: Facebook
        – Display as much content of interest as possible
      • Long data: user profile and interactions
      • Wide data: user profile and interactions + WhatsApp chats
      In Germany, OVG Hamburg prohibits this combination!
      Image: https://www.instagram.com/p/Bt3OG4DFOsK/
  • 15. It’s All about Patterns in Data
      • Examples
        – Traffic movements
        – Online user behavior
        – Cliques in social networks
        – …
      • Methods:
        – Data Mining
        – Machine Learning
        – …
      → Intensively researched since the 1980s
      Image: https://factordaily.com/balaraman-ravindran-reinforcement-learning/
  • 16. Patterns in Long Data
  • 17. Patterns in Long Data
  • 18. Patterns in Wide Data
  • 19. Takeaways
      • Modern AI Systems
        – Rely on massive amounts of data
        – Processed with fairly simple algorithms
      • Algorithms are often well known
        – e.g., textbooks, research papers
        – It is hard to own an algorithm
      • Data is crucial
        – Longer data (e.g., acquiring more customers)
        – Wider data (e.g., merging businesses)
        – It is easy to own data
  • 20. Big Data, Smart Algorithms, and Market Power: A Computer Scientist’s Perspective. Heiko Paulheim, Chair for Data Science, University of Mannheim
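
Slides 3 to 7 argue that navigation improves through better edge data, not through a smarter algorithm. A minimal sketch of that point, assuming Dijkstra's algorithm is the 1950s textbook method the slides allude to (the slides do not name it). The road network `roads`, its node names and its travel times are hypothetical and are not taken from the slide figures.

```python
import heapq

def fastest_path(graph, start, end):
    """Dijkstra's algorithm. Returns (total_minutes, path), or (inf, []) if unreachable."""
    queue = [(0, start, [start])]  # priority queue ordered by elapsed minutes
    settled = set()                # nodes whose fastest time is already final
    while queue:
        minutes, node, path = heapq.heappop(queue)
        if node == end:
            return minutes, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, edge_minutes in graph.get(node, {}).items():
            if neighbor not in settled:
                heapq.heappush(queue, (minutes + edge_minutes, neighbor, path + [neighbor]))
    return float("inf"), []

# Hypothetical network: edge weights are static travel-time estimates in minutes.
roads = {
    "Start": {"A": 5, "B": 10},
    "A": {"C": 15},
    "B": {"C": 5, "End": 25},
    "C": {"End": 10},
    "End": {},
}

# Same network, but live data reports congestion on the edge B -> C.
congested = {node: dict(edges) for node, edges in roads.items()}
congested["B"]["C"] = 25

print(fastest_path(roads, "Start", "End"))      # route based on static estimates
print(fastest_path(congested, "Start", "End"))  # route based on updated estimates
```

The function is identical in both calls. Only the edge weights differ, and that alone changes the recommended route from Start, B, C, End (25 minutes) to Start, A, C, End (30 minutes).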