SlideShare ist ein Scribd-Unternehmen logo
1 von 30
MapReduce and Parallel Database Systems: Complementary or Competitive Technology? Daniel Abadi Yale University October 12 th , 2010
Brief History of MapReduce ,[object Object],[object Object],[object Object],[object Object]
 
 
Controversy ,[object Object],[object Object],[object Object],[object Object]
 
SIGMOD 2009 Paper ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Hardware Setup ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Join Task
UDF Task DBMS clearly doesn’t scale ,[object Object],[object Object]
Benchmark Conclusions ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Overall Conclusion ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
 
 
 
 
 
 
We’re all in agreement, right?
But Wait! ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
People are using Hadoop as a DW ,[object Object],[object Object],[object Object],[object Object]
Fault Tolerance and Cluster Heterogeneity Results Database systems restart entire query upon a single node failure, and do not adapt if a node is running slowly
So … ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
HadoopDB ,[object Object],[object Object],[object Object]
HadoopDB Architecture
TPC-H Benchmark Results
Fault Tolerance and Cluster Heterogeneity Results
HadoopDB: Current Status ,[object Object],[object Object],[object Object],[object Object],[object Object]
Invisible Loading ,[object Object],[object Object],[object Object],[object Object],[object Object]
Conclusions ,[object Object],[object Object],[object Object]

Weitere ähnliche Inhalte

Was ist angesagt?

Nov 2010 HUG: Business Intelligence for Big Data
Nov 2010 HUG: Business Intelligence for Big DataNov 2010 HUG: Business Intelligence for Big Data
Nov 2010 HUG: Business Intelligence for Big Data
Yahoo Developer Network
 
CS 542 Parallel DBs, NoSQL, MapReduce
CS 542 Parallel DBs, NoSQL, MapReduceCS 542 Parallel DBs, NoSQL, MapReduce
CS 542 Parallel DBs, NoSQL, MapReduce
J Singh
 

Was ist angesagt? (20)

Why Talend for Big Data?
Why Talend for Big Data?Why Talend for Big Data?
Why Talend for Big Data?
 
Introduction to Bigdata and HADOOP
Introduction to Bigdata and HADOOP Introduction to Bigdata and HADOOP
Introduction to Bigdata and HADOOP
 
Hadoop bigdata overview
Hadoop bigdata overviewHadoop bigdata overview
Hadoop bigdata overview
 
Comparison - RDBMS vs Hadoop vs Apache
Comparison - RDBMS vs Hadoop vs ApacheComparison - RDBMS vs Hadoop vs Apache
Comparison - RDBMS vs Hadoop vs Apache
 
Hybrid Data Warehouse Hadoop Implementations
Hybrid Data Warehouse Hadoop ImplementationsHybrid Data Warehouse Hadoop Implementations
Hybrid Data Warehouse Hadoop Implementations
 
Intro to Hybrid Data Warehouse
Intro to Hybrid Data WarehouseIntro to Hybrid Data Warehouse
Intro to Hybrid Data Warehouse
 
Schema-on-Read vs Schema-on-Write
Schema-on-Read vs Schema-on-WriteSchema-on-Read vs Schema-on-Write
Schema-on-Read vs Schema-on-Write
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Jethro data meetup index base sql on hadoop - oct-2014
Jethro data meetup    index base sql on hadoop - oct-2014Jethro data meetup    index base sql on hadoop - oct-2014
Jethro data meetup index base sql on hadoop - oct-2014
 
Designing modern dw and data lake
Designing modern dw and data lakeDesigning modern dw and data lake
Designing modern dw and data lake
 
Big Data Architecture Workshop - Vahid Amiri
Big Data Architecture Workshop -  Vahid AmiriBig Data Architecture Workshop -  Vahid Amiri
Big Data Architecture Workshop - Vahid Amiri
 
Hadoop and Hive in Enterprises
Hadoop and Hive in EnterprisesHadoop and Hive in Enterprises
Hadoop and Hive in Enterprises
 
Data Engineering Basics
Data Engineering BasicsData Engineering Basics
Data Engineering Basics
 
Nov 2010 HUG: Business Intelligence for Big Data
Nov 2010 HUG: Business Intelligence for Big DataNov 2010 HUG: Business Intelligence for Big Data
Nov 2010 HUG: Business Intelligence for Big Data
 
CS 542 Parallel DBs, NoSQL, MapReduce
CS 542 Parallel DBs, NoSQL, MapReduceCS 542 Parallel DBs, NoSQL, MapReduce
CS 542 Parallel DBs, NoSQL, MapReduce
 
How One Company Offloaded Data Warehouse ETL To Hadoop and Saved $30 Million
How One Company Offloaded Data Warehouse ETL To Hadoop and Saved $30 MillionHow One Company Offloaded Data Warehouse ETL To Hadoop and Saved $30 Million
How One Company Offloaded Data Warehouse ETL To Hadoop and Saved $30 Million
 
Design of Hadoop Distributed File System
Design of Hadoop Distributed File SystemDesign of Hadoop Distributed File System
Design of Hadoop Distributed File System
 
Big Data Analytics 2014
Big Data Analytics 2014Big Data Analytics 2014
Big Data Analytics 2014
 
Spark with Azure HDInsight - Tampa Bay Data Science - Adnan Masood, PhD
Spark with Azure HDInsight  - Tampa Bay Data Science - Adnan Masood, PhDSpark with Azure HDInsight  - Tampa Bay Data Science - Adnan Masood, PhD
Spark with Azure HDInsight - Tampa Bay Data Science - Adnan Masood, PhD
 
Data lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiryData lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiry
 

Andere mochten auch

Predicting Consumer Behaviour via Hadoop
Predicting Consumer Behaviour via HadoopPredicting Consumer Behaviour via Hadoop
Predicting Consumer Behaviour via Hadoop
Skillspeed
 
CAP, PACELC, and Determinism
CAP, PACELC, and DeterminismCAP, PACELC, and Determinism
CAP, PACELC, and Determinism
Daniel Abadi
 
Real-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to ProductionReal-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to Production
Revolution Analytics
 

Andere mochten auch (20)

SQL-on-Hadoop Tutorial
SQL-on-Hadoop TutorialSQL-on-Hadoop Tutorial
SQL-on-Hadoop Tutorial
 
Invisible loading
Invisible loadingInvisible loading
Invisible loading
 
Boston Hadoop Meetup, April 26 2012
Boston Hadoop Meetup, April 26 2012Boston Hadoop Meetup, April 26 2012
Boston Hadoop Meetup, April 26 2012
 
Leopard: Lightweight Partitioning and Replication for Dynamic Graphs
Leopard: Lightweight Partitioning and Replication  for Dynamic Graphs Leopard: Lightweight Partitioning and Replication  for Dynamic Graphs
Leopard: Lightweight Partitioning and Replication for Dynamic Graphs
 
Shared slides-edbt-keynote-03-19-13
Shared slides-edbt-keynote-03-19-13Shared slides-edbt-keynote-03-19-13
Shared slides-edbt-keynote-03-19-13
 
Consistency Tradeoffs in Modern Distributed Database System Design
Consistency Tradeoffs in Modern Distributed Database System DesignConsistency Tradeoffs in Modern Distributed Database System Design
Consistency Tradeoffs in Modern Distributed Database System Design
 
VLDB 2009 Tutorial on Column-Stores
VLDB 2009 Tutorial on Column-StoresVLDB 2009 Tutorial on Column-Stores
VLDB 2009 Tutorial on Column-Stores
 
Hadoop-DS: Which SQL-on-Hadoop Rules the Herd
Hadoop-DS: Which SQL-on-Hadoop Rules the HerdHadoop-DS: Which SQL-on-Hadoop Rules the Herd
Hadoop-DS: Which SQL-on-Hadoop Rules the Herd
 
DAT101 Understanding AWS Database Options - AWS re: Invent 2012
DAT101 Understanding AWS Database Options - AWS re: Invent 2012DAT101 Understanding AWS Database Options - AWS re: Invent 2012
DAT101 Understanding AWS Database Options - AWS re: Invent 2012
 
The Power of Determinism in Database Systems
The Power of Determinism in Database SystemsThe Power of Determinism in Database Systems
The Power of Determinism in Database Systems
 
Predicting Consumer Behaviour via Hadoop
Predicting Consumer Behaviour via HadoopPredicting Consumer Behaviour via Hadoop
Predicting Consumer Behaviour via Hadoop
 
Big Data & Analytics MapReduce/Hadoop – A programmer’s perspective
Big Data & Analytics MapReduce/Hadoop – A programmer’s perspectiveBig Data & Analytics MapReduce/Hadoop – A programmer’s perspective
Big Data & Analytics MapReduce/Hadoop – A programmer’s perspective
 
CAP, PACELC, and Determinism
CAP, PACELC, and DeterminismCAP, PACELC, and Determinism
CAP, PACELC, and Determinism
 
Cloudera Showcase: SQL-on-Hadoop
Cloudera Showcase: SQL-on-HadoopCloudera Showcase: SQL-on-Hadoop
Cloudera Showcase: SQL-on-Hadoop
 
Column-Stores vs. Row-Stores: How Different are they Really?
Column-Stores vs. Row-Stores: How Different are they Really?Column-Stores vs. Row-Stores: How Different are they Really?
Column-Stores vs. Row-Stores: How Different are they Really?
 
mapReduce for machine learning
mapReduce for machine learning mapReduce for machine learning
mapReduce for machine learning
 
Bi on Big Data - Strata 2016 in London
Bi on Big Data - Strata 2016 in LondonBi on Big Data - Strata 2016 in London
Bi on Big Data - Strata 2016 in London
 
Real-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to ProductionReal-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to Production
 
Design patterns in MapReduce
Design patterns in MapReduceDesign patterns in MapReduce
Design patterns in MapReduce
 
Predictive Analytics on Big Data. DIY or BUY?
Predictive Analytics on Big Data. DIY or BUY?Predictive Analytics on Big Data. DIY or BUY?
Predictive Analytics on Big Data. DIY or BUY?
 

Ähnlich wie Daniel Abadi HadoopWorld 2010

Introduction to Apache Hadoop
Introduction to Apache HadoopIntroduction to Apache Hadoop
Introduction to Apache Hadoop
Christopher Pezza
 
Taylor bosc2010
Taylor bosc2010Taylor bosc2010
Taylor bosc2010
BOSC 2010
 
Hadoop technology
Hadoop technologyHadoop technology
Hadoop technology
tipanagiriharika
 
Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...
Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...
Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...
Dipayan Dev
 

Ähnlich wie Daniel Abadi HadoopWorld 2010 (20)

How Hadoop Revolutionized Data Warehousing at Yahoo and Facebook
How Hadoop Revolutionized Data Warehousing at Yahoo and FacebookHow Hadoop Revolutionized Data Warehousing at Yahoo and Facebook
How Hadoop Revolutionized Data Warehousing at Yahoo and Facebook
 
Hadoop introduction , Why and What is Hadoop ?
Hadoop introduction , Why and What is  Hadoop ?Hadoop introduction , Why and What is  Hadoop ?
Hadoop introduction , Why and What is Hadoop ?
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log Processing
 
Introduction to Apache Hadoop
Introduction to Apache HadoopIntroduction to Apache Hadoop
Introduction to Apache Hadoop
 
Taylor bosc2010
Taylor bosc2010Taylor bosc2010
Taylor bosc2010
 
Managing Big data with Hadoop
Managing Big data with HadoopManaging Big data with Hadoop
Managing Big data with Hadoop
 
Hadoop by kamran khan
Hadoop by kamran khanHadoop by kamran khan
Hadoop by kamran khan
 
Hadoop
HadoopHadoop
Hadoop
 
HDFS
HDFSHDFS
HDFS
 
OPTIMIZATION OF MULTIPLE CORRELATED QUERIES BY DETECTING SIMILAR DATA SOURCE ...
OPTIMIZATION OF MULTIPLE CORRELATED QUERIES BY DETECTING SIMILAR DATA SOURCE ...OPTIMIZATION OF MULTIPLE CORRELATED QUERIES BY DETECTING SIMILAR DATA SOURCE ...
OPTIMIZATION OF MULTIPLE CORRELATED QUERIES BY DETECTING SIMILAR DATA SOURCE ...
 
Hadoop An Introduction
Hadoop An IntroductionHadoop An Introduction
Hadoop An Introduction
 
Unit-3_BDA.ppt
Unit-3_BDA.pptUnit-3_BDA.ppt
Unit-3_BDA.ppt
 
Hadoop technology
Hadoop technologyHadoop technology
Hadoop technology
 
Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...
Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...
Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...
 
Distributed Systems Hadoop.pptx
Distributed Systems Hadoop.pptxDistributed Systems Hadoop.pptx
Distributed Systems Hadoop.pptx
 
Cppt Hadoop
Cppt HadoopCppt Hadoop
Cppt Hadoop
 
Cppt
CpptCppt
Cppt
 
Cppt
CpptCppt
Cppt
 
Hadoop presentation
Hadoop presentationHadoop presentation
Hadoop presentation
 
M. Florence Dayana - Hadoop Foundation for Analytics.pptx
M. Florence Dayana - Hadoop Foundation for Analytics.pptxM. Florence Dayana - Hadoop Foundation for Analytics.pptx
M. Florence Dayana - Hadoop Foundation for Analytics.pptx
 

Kürzlich hochgeladen

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 

Kürzlich hochgeladen (20)

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

Daniel Abadi HadoopWorld 2010