SlideShare ist ein Scribd-Unternehmen logo
1 von 10
Downloaden Sie, um offline zu lesen
SARA Hadoop Hackathon
   Evert.Lammerts@sara.nl
   December 7, 2010
DJOERD HIEMSTRA
                                             (UTwente)




EDGAR MEIJ
     (UvA)



             SARA Hadoop Hackathon, December 7, 2010
2002             2004                   2006

Nutch*           MR/GFS**               Hadoop


*  http://nutch.apache.org/
** http://labs.google.com/papers/mapreduce.html
   http://labs.google.com/papers/gfs.html

                     SARA Hadoop Hackathon, December 7, 2010
2010: A Hype in Production




http://wiki.apache.org/hadoop/PoweredBy
                SARA Hadoop Hackathon, December 7, 2010
Super computing




Cloud computing                               Grid computing




     Cluster computing              GPU computing


                    http://www.sara.nl/

           SARA Hadoop Hackathon, December 7, 2010
:-(
                       Data         Expensive!
                                                         Computation




                                         :-)
                       Data          Cheaper!
                                                         Computation




Ref: Luiz André Barroso and Urs Hölzle, Google Inc.
   The Datacenter as a Computer: An Introduction to the Design of Warehouse­Scale Machines



                             SARA Hadoop Hackathon, December 7, 2010
NameNode              JobTracker




DN   TT   DN      TT             DN        TT        DN     TT


DN   TT   DN      TT             DN        TT        DN     TT



                                                    DN    DataNode

                                                    TT   TaskTracker

          SARA Hadoop Hackathon, December 7, 2010
File   Map                              Shuffle         Reduce           Output
       $ echo “${email#*@}, ${name}”     $ sort          $ wc ­l




                                                                      ewi.utwente.nl, 1
                                                                      gmail.com,      2
                                                                      nbic.nl,        1
                                                                      nikhef.nl,      3
                                                                      sara.nl,        1




                            SARA Hadoop Hackathon, December 7, 2010
From: Hadoop, The Definitive Guide (2nd Edition), Tom White




           SARA Hadoop Hackathon, December 7, 2010
Today

09.30 - 09.50   Welcome & Introduction
09.50 - 10.15   Map/Reduce @ University of Twente
10.15 - 10.30   Kick-off hackathon
14.00 - 15.00   Optional: SARA tour
10.30 - 17.00   Hackathon
17.00 - 17.30   Results and closing




                SARA Hadoop Hackathon, December 7, 2010

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Another Intro To Hadoop
Another Intro To HadoopAnother Intro To Hadoop
Another Intro To Hadoop
 
Hadoop: Distributed Data Processing
Hadoop: Distributed Data ProcessingHadoop: Distributed Data Processing
Hadoop: Distributed Data Processing
 
Sf NoSQL MeetUp: Apache Hadoop and HBase
Sf NoSQL MeetUp: Apache Hadoop and HBaseSf NoSQL MeetUp: Apache Hadoop and HBase
Sf NoSQL MeetUp: Apache Hadoop and HBase
 
Big Data Analytics for Non-Programmers
Big Data Analytics for Non-ProgrammersBig Data Analytics for Non-Programmers
Big Data Analytics for Non-Programmers
 
Hadoop Presentation - PPT
Hadoop Presentation - PPTHadoop Presentation - PPT
Hadoop Presentation - PPT
 
EclipseCon Keynote: Apache Hadoop - An Introduction
EclipseCon Keynote: Apache Hadoop - An IntroductionEclipseCon Keynote: Apache Hadoop - An Introduction
EclipseCon Keynote: Apache Hadoop - An Introduction
 
Introduction to Hadoop Technology
Introduction to Hadoop TechnologyIntroduction to Hadoop Technology
Introduction to Hadoop Technology
 
Hadoop seminar
Hadoop seminarHadoop seminar
Hadoop seminar
 
Hadoop/Spark Non-Technical Basics
Hadoop/Spark Non-Technical BasicsHadoop/Spark Non-Technical Basics
Hadoop/Spark Non-Technical Basics
 
Geek camp
Geek campGeek camp
Geek camp
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY ppt
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoop
 
Apache hadoop introduction and architecture
Apache hadoop  introduction and architectureApache hadoop  introduction and architecture
Apache hadoop introduction and architecture
 
CityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tablesCityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tables
 
Hadoop Technologies
Hadoop TechnologiesHadoop Technologies
Hadoop Technologies
 
Intro to hadoop ecosystem
Intro to hadoop ecosystemIntro to hadoop ecosystem
Intro to hadoop ecosystem
 
Hadoop overview
Hadoop overviewHadoop overview
Hadoop overview
 
Intro to Big Data Hadoop
Intro to Big Data HadoopIntro to Big Data Hadoop
Intro to Big Data Hadoop
 
Big data Analytics Hadoop
Big data Analytics HadoopBig data Analytics Hadoop
Big data Analytics Hadoop
 
Introduction to Hadoop part1
Introduction to Hadoop part1Introduction to Hadoop part1
Introduction to Hadoop part1
 

Ähnlich wie Introduction to SARA's Hadoop Hackathon - dec 7th 2010

Seattle hug 2010
Seattle hug 2010Seattle hug 2010
Seattle hug 2010
Abe Taha
 
Large Scale Data With Hadoop
Large Scale Data With HadoopLarge Scale Data With Hadoop
Large Scale Data With Hadoop
guest27e6764
 
Distributed Social Networking
Distributed Social NetworkingDistributed Social Networking
Distributed Social Networking
Bastian Hofmann
 

Ähnlich wie Introduction to SARA's Hadoop Hackathon - dec 7th 2010 (20)

20100128ebay
20100128ebay20100128ebay
20100128ebay
 
Analyzing Larger RasterData in a Jupyter Notebook with GeoPySpark on AWS - FO...
Analyzing Larger RasterData in a Jupyter Notebook with GeoPySpark on AWS - FO...Analyzing Larger RasterData in a Jupyter Notebook with GeoPySpark on AWS - FO...
Analyzing Larger RasterData in a Jupyter Notebook with GeoPySpark on AWS - FO...
 
Getting Started with Hadoop
Getting Started with HadoopGetting Started with Hadoop
Getting Started with Hadoop
 
Seattle hug 2010
Seattle hug 2010Seattle hug 2010
Seattle hug 2010
 
Big Data Step-by-Step: Using R & Hadoop (with RHadoop's rmr package)
Big Data Step-by-Step: Using R & Hadoop (with RHadoop's rmr package)Big Data Step-by-Step: Using R & Hadoop (with RHadoop's rmr package)
Big Data Step-by-Step: Using R & Hadoop (with RHadoop's rmr package)
 
Riak Intro
Riak IntroRiak Intro
Riak Intro
 
Hadoop
HadoopHadoop
Hadoop
 
Hadoop
HadoopHadoop
Hadoop
 
Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)
Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)
Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)
 
Large Scale Data With Hadoop
Large Scale Data With HadoopLarge Scale Data With Hadoop
Large Scale Data With Hadoop
 
20091203gemini
20091203gemini20091203gemini
20091203gemini
 
A brief history of "big data"
A brief history of "big data"A brief history of "big data"
A brief history of "big data"
 
BreizhJUG - Janvier 2014 - Big Data - Dataiku - Pages Jaunes
BreizhJUG - Janvier 2014 - Big Data -  Dataiku - Pages JaunesBreizhJUG - Janvier 2014 - Big Data -  Dataiku - Pages Jaunes
BreizhJUG - Janvier 2014 - Big Data - Dataiku - Pages Jaunes
 
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
 
TheEdge10 : Big Data is Here - Hadoop to the Rescue
TheEdge10 : Big Data is Here - Hadoop to the RescueTheEdge10 : Big Data is Here - Hadoop to the Rescue
TheEdge10 : Big Data is Here - Hadoop to the Rescue
 
May 2013 HUG: HCatalog/Hive Data Out
May 2013 HUG: HCatalog/Hive Data OutMay 2013 HUG: HCatalog/Hive Data Out
May 2013 HUG: HCatalog/Hive Data Out
 
HUG Meetup 2013: HCatalog / Hive Data Out
HUG Meetup 2013: HCatalog / Hive Data Out HUG Meetup 2013: HCatalog / Hive Data Out
HUG Meetup 2013: HCatalog / Hive Data Out
 
Hadoop demo ppt
Hadoop demo pptHadoop demo ppt
Hadoop demo ppt
 
Distributed Social Networking
Distributed Social NetworkingDistributed Social Networking
Distributed Social Networking
 
20100201hplabs
20100201hplabs20100201hplabs
20100201hplabs
 

Kürzlich hochgeladen

Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for Success
UXDXConf
 

Kürzlich hochgeladen (20)

What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform Engineering
 
Enterprise Knowledge Graphs - Data Summit 2024
Enterprise Knowledge Graphs - Data Summit 2024Enterprise Knowledge Graphs - Data Summit 2024
Enterprise Knowledge Graphs - Data Summit 2024
 
Using IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & IrelandUsing IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & Ireland
 
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxWSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
 
Syngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdf
 
Intro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераIntro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджера
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 Warsaw
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage Intacct
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentation
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and Planning
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024
 
Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024
 
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for Success
 

Introduction to SARA's Hadoop Hackathon - dec 7th 2010

  • 1. SARA Hadoop Hackathon Evert.Lammerts@sara.nl December 7, 2010
  • 2. DJOERD HIEMSTRA (UTwente) EDGAR MEIJ (UvA) SARA Hadoop Hackathon, December 7, 2010
  • 3. 2002 2004 2006 Nutch* MR/GFS** Hadoop *  http://nutch.apache.org/ ** http://labs.google.com/papers/mapreduce.html    http://labs.google.com/papers/gfs.html SARA Hadoop Hackathon, December 7, 2010
  • 4. 2010: A Hype in Production http://wiki.apache.org/hadoop/PoweredBy SARA Hadoop Hackathon, December 7, 2010
  • 5. Super computing Cloud computing Grid computing Cluster computing GPU computing http://www.sara.nl/ SARA Hadoop Hackathon, December 7, 2010
  • 6. :-( Data Expensive! Computation :-) Data Cheaper! Computation Ref: Luiz André Barroso and Urs Hölzle, Google Inc.    The Datacenter as a Computer: An Introduction to the Design of Warehouse­Scale Machines SARA Hadoop Hackathon, December 7, 2010
  • 7. NameNode JobTracker DN TT DN TT DN TT DN TT DN TT DN TT DN TT DN TT DN DataNode TT TaskTracker SARA Hadoop Hackathon, December 7, 2010
  • 8. File Map Shuffle Reduce Output $ echo “${email#*@}, ${name}” $ sort $ wc ­l ewi.utwente.nl, 1 gmail.com,      2 nbic.nl,        1 nikhef.nl,      3 sara.nl,        1 SARA Hadoop Hackathon, December 7, 2010
  • 9. From: Hadoop, The Definitive Guide (2nd Edition), Tom White SARA Hadoop Hackathon, December 7, 2010
  • 10. Today 09.30 - 09.50 Welcome & Introduction 09.50 - 10.15 Map/Reduce @ University of Twente 10.15 - 10.30 Kick-off hackathon 14.00 - 15.00 Optional: SARA tour 10.30 - 17.00 Hackathon 17.00 - 17.30 Results and closing SARA Hadoop Hackathon, December 7, 2010