HadoopDB
Miguel Pastor
A brief introduction to a new approach to handling large amounts of data
HadoopDB
1.
HadoopDB
Miguel Angel Pastor Olivar
miguelinlas3 at gmail dot com
http://miguelinlas3.blogspot.com
http://twitter.com/miguelinlas3
3.
HadoopDB Architecture
4.
Results
5.
Conclusions
6.
Introduction
8.
The amount of data is exploding
9.
Previous problem -> shared-nothing architectures
11.
Map/Reduce systems
14.
Analytics environments: avoid restarting queries
15.
Scaling problems
18.
UDF mechanism
19.
Desirable: SQL and non-SQL interfaces
23.
Assumption: failures are rare
24.
Assumption: clusters have dozens of nodes
25.
Engineering decisions
26.
Background: Map/Reduce
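As a rough illustration of the Map/Reduce model, the sketch below runs a word count inside a single Python process. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this example and are not Hadoop APIs; a real job would spread the phases across many nodes.

```python
from collections import defaultdict

def map_phase(line):
    # Emit one (word, 1) pair per word in the input line.
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # Group all values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Combine all values seen for one key into a single result.
    return key, sum(values)

def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Calling `word_count(["a b a", "B c"])` counts each word across both lines, ignoring case.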
28.
Works in heterogeneous environments
31.
SQL is not supported directly (Hive)
32.
HadoopDB
39.
Job and Task trackers
40.
Architecture
43.
Execute the SQL query
47.
Plan to deploy as a separate service
49.
Breaking a single node's data into chunks
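One common way to split a node's data into chunks is hash partitioning on a key. The sketch below assumes that approach; the slide itself does not say how the chunks are formed, and the names `chunk_id` and `split_into_chunks` are hypothetical.

```python
import zlib

def chunk_id(key, num_chunks):
    # CRC32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(str(key).encode("utf-8")) % num_chunks

def split_into_chunks(rows, key_index, num_chunks):
    # Place every row in the chunk selected by the hash of its key column.
    chunks = [[] for _ in range(num_chunks)]
    for row in rows:
        chunks[chunk_id(row[key_index], num_chunks)].append(row)
    return chunks
```

Because the hash is deterministic, rows with the same key always land in the same chunk, which keeps lookups and joins on that key local to one chunk.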
53.
Semantic analyzer connects to the catalog
54.
DAG of relational operators
55.
Optimizer restructuring
56.
Convert the plan to M/R jobs
57.
DAG of M/R jobs serialized as an XML plan
60.
Traverse the DAG bottom-up; rule-based SQL generator
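To make the idea of a bottom-up, rule-based SQL generator concrete, here is a minimal sketch over a linear chain of operators (the simplest possible DAG). The `Op` class, the three operator kinds, and the rules are assumptions made for this example and are not taken from HadoopDB's planner.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Op:
    kind: str                      # "scan", "filter", or "project"
    arg: str                       # table name, predicate, or column list
    child: Optional["Op"] = None   # input operator (None for a scan)

def to_sql(op):
    # Post-order walk: build SQL for the input first, then apply this operator's rule.
    if op.kind == "scan":
        return f"SELECT * FROM {op.arg}"
    inner = to_sql(op.child)
    if op.kind == "filter":
        return f"SELECT * FROM ({inner}) AS t WHERE {op.arg}"
    if op.kind == "project":
        return f"SELECT {op.arg} FROM ({inner}) AS t"
    raise ValueError(f"no rule for operator {op.kind!r}")
```

Each operator contributes one rewrite rule, so supporting a new operator only means adding one more branch; a production generator would also flatten the nested subqueries.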
61.
Benchmarking
64.
2 virtual cores
65.
850 GB storage
66.
64-bit Linux (Fedora 8)
68.
1024 MB heap size
70.
PostgreSQL 8.2.5
71.
No data compression
73.
Used a cloud edition
75.
Run on EC2 (no cloud edition available)
78.
18 million rankings (~1 GB)
79.
Stored as plain text in HDFS
80.
Loading data
81.
Grep Task
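A grep-style task can be pictured as a map-only job: each record is checked against a pattern and emitted if it matches, with no reduce step. The sketch below assumes plain substring matching on string records; the function name `grep_map` is hypothetical.

```python
def grep_map(records, pattern):
    # Map-only job: keep every record that contains the pattern.
    return [record for record in records if pattern in record]
```

In SQL terms this corresponds roughly to a `WHERE field LIKE '%pattern%'` filter over the table.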
85.
UDF Aggregation Task
87.
DBMS-X: 15% overly optimistic
89.
Fault tolerance and heterogeneous environments
90.
Benchmarks
92.
Reduce the number of nodes to achieve the same order of magnitude
93.
Fault tolerance is important
94.
Conclusions
96.
PostgreSQL is not a column store
97.
Hadoop and Hive are relatively new open source projects
98.
HadoopDB is flexible and extensible
99.
References
101.
HadoopDB article
102.
HadoopDB project
103.
Vertica
104.
Apache Hive
105.
That's all!