Hadoop - Simple. Scalable.
elliando dias
1.
Hadoop Simple. Scalable.
2.
@markgunnels mark@catamorphiclabs.com
3.
Java. Clojure. Ruby.
Cloudera Certified
4.
posscon.org, April 15, 16, and 17
5.
Agenda
Overview. Massively Large Data Sets and the problems therein. Distributed File System. MapReduce. Pig.
6.
Overview
7.
Doug Cutting
Genius
8.
Favorite Hadoop Story
New York Times
9.
4 Terabytes of Source Articles.
10.
24 Hours.
11.
5.5 Terabytes of PDFs.
12.
Did it again.
13.
$240.
14.
Infoporn from Yahoo
73 hours. 490 TB Shuffling. 280 TB Output. 4000 Nodes. 16 PB Disk Space. 32K Cores. 64 TB RAM.
15.
Hadoop solves...
16.
Analyzing Massively Large Datasets
17.
Two Problems
You have to distribute.
18.
Data Storage
Disk capacity has grown far faster than read speeds. Datasets won't fit on one disk. Tolerate node failure.
19.
Data Analysis
Combine data from many machines. Tolerate node failure.
20.
How Hadoop solves these problems.
21.
Send Code to Data. Not Data to Code.
22.
Data Storage
HDFS
23.
Name Node. Data Nodes. Master - Slave Relationship.
24.
Shard massive files across multiple machines. MB, GB, and TB.
25.
Tolerant of Node Failure
Files replicated across 3 nodes by default.
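The sharding and replication ideas on the last two slides can be sketched in a few lines of Python. This is only an illustration: the function names, the block size, and the round-robin placement are assumptions made for the example, and real HDFS uses its own placement policy.

```python
def split_into_blocks(file_size, block_size):
    """Return (offset, length) pairs that cover a file of file_size bytes."""
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


def place_replicas(blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes (hypothetical round-robin)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, block in enumerate(blocks):
        # Consecutive node indices (mod len(nodes)) are distinct because
        # replication <= len(nodes).
        placement[block] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement


# Sizes are in arbitrary units (think megabytes).
blocks = split_into_blocks(300, 128)
placement = place_replicas(blocks, ["a", "b", "c", "d"])
```

Because every block lives on three different nodes in this sketch, losing any single node still leaves two copies of each block.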
26.
HDFS behaves like a normal file system. No true appends yet.
27.
Demonstration.
28.
Data Analysis
MapReduce
29.
Job Tracker. Task Nodes. Master - Slave Relationship.
30.
map
31.
Demonstration
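The live demonstration is not captured in the transcript. As a stand-in, here is a minimal Python sketch of what `map` does: apply one function to every element independently. The helper name `word_lengths` is made up for the example.

```python
def word_lengths(words):
    """Apply len to each word; no element depends on any other."""
    return list(map(len, words))


lengths = word_lengths(["hadoop", "pig", "hive"])
```

Since each call is independent, the work can be spread over many workers without changing the result.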
32.
pmap
33.
Demonstration
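`pmap` is the name of Clojure's parallel map. The demonstration itself is not in the transcript, so the following is a rough Python analogue, assuming a thread pool is an acceptable substitute; the `max_workers` parameter is an illustrative choice.

```python
from concurrent.futures import ThreadPoolExecutor


def pmap(func, items, max_workers=4):
    """Map func over items using a pool of worker threads.

    Results come back in input order, just like the built-in map.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))


upper = pmap(str.upper, ["map", "reduce"])
```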
34.
reduce
35.
Demonstration
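In place of the missing demonstration, a small Python sketch of `reduce`: fold a sequence into a single value by repeatedly combining an accumulator with the next element. The helper name `total` is hypothetical.

```python
from functools import reduce
from operator import add


def total(numbers):
    """Combine all numbers into one sum, starting from 0."""
    return reduce(add, numbers, 0)


result = total([1, 2, 3, 4])
```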
36.
(reduce (pmap))
37.
Demonstration.
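The composition `(reduce (pmap))` is the core MapReduce shape: compute partial results in parallel, then merge them. A word-count sketch in Python, assuming the input arrives as text chunks and using a thread pool as a stand-in for cluster nodes:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_words(chunk):
    """Map step: count the words in one chunk of text."""
    return Counter(chunk.split())


def merge(left, right):
    """Reduce step: add two partial counts together."""
    return left + right


def word_count(chunks, max_workers=4):
    """Count words per chunk in parallel, then fold the partial counts."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    return reduce(merge, partials, Counter())


counts = word_count(["a b a", "b c"])
```

The merge step is associative, which is what allows the partial counts to be combined in any grouping.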
38.
MapReduce
Java
39.
Nobody likes it.
:-)
40.
MapReduce
Ruby. Python. Unix Utilities.
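Scripting languages and Unix utilities can take part in MapReduce through Hadoop Streaming, where the mapper and reducer are ordinary programs that read lines on standard input and write tab-separated key/value lines on standard output. A hedged word-count sketch in Python follows; the function names are made up, and the logic is written over iterables of lines so it can be tried without a cluster.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' line per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(lines):
    """Sum counts per word; assumes the input is already sorted by key,
    as it is when Hadoop hands mapper output to a reducer."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Local simulation of map -> sort -> reduce.
mapped = list(mapper(["b a", "a"]))
reduced = list(reducer(sorted(mapped)))
```

In an actual streaming job, each function would sit in its own script that iterates over `sys.stdin` and prints its results, and the scripts would be supplied through the streaming jar's `-mapper` and `-reducer` options.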
41.
MapReduce
Clojure
42.
Hadoop Ecosystem
Pigkeeper. Hive. Cascading.
43.
Pig
44.
HBase