The Blind Men and the Elephant
Matthew Aslett, Senior Analyst, The 451 Group
     Hadoop World, 8 November, 2011


             © 2011 by The 451 Group. All rights reserved
Agenda
   Introduction and family history
   The Blind Men and the Elephant
   What is the point of Hadoop?
   Adoption trends
   Big data, total data
   Exploratory analytics
   Hadoop-related business strategies
   Contributors and their contributions
   A cautionary tale




The 451 Group
                        451 Research is focused on the business of enterprise IT
                        innovation. The company’s analysts provide critical and timely
                        insight into the competitive dynamics of innovation in
                        emerging technology segments.


                        Tier1 Research is a single-source research and advisory firm covering
                        the multi-tenant datacenter, hosting, IT and cloud-computing
                        sectors, blending the best of industry and financial research.


                        The Uptime Institute is ‘The Global Data Center Authority’ and a
                        pioneer in the creation and facilitation of end-user knowledge
                        communities to improve reliability and uninterruptible availability
                        in datacenter facilities.

                        TheInfoPro is a leading IT advisory and research firm that provides
                        real-world perspectives on the customer and market dynamics of the
                        enterprise information technology landscape, harnessing the
                        collective knowledge and insight of leading IT organizations
                        worldwide.

                        ChangeWave Research is a research firm that identifies and quantifies
                        ‘change’ in consumer spending behavior, corporate purchasing, and
                        industry, company and technology trends.




451 Research
 Matthew Aslett
  • Senior analyst, enterprise software
  • With The 451 Group since 2007
  • www.twitter.com/maslett


Information Management
   Operational databases
   Data warehousing
   Data caching
   Event processing
 Hadoop first properly covered in March 2009 report covering the formation of
 Apache Hadoop distributor Cloudera

Commercial Adoption of Open Source (CAOS)
   Open source projects
   Adoption of open source software
   Vendor strategies
 Hadoop first covered February 2008 as part of coverage of emerging open source
 data management projects




A family history?




The Blind Men and the Elephant

“It was six men of Indostan
To learning much inclined,
Who went to see the Elephant
(Though all of them were blind),
That each by observation
Might satisfy his mind.”

John Godfrey Saxe (1872)




The Blind Men and the Elephant

“After Hadoop finishes
filtering the data, the place
you want to put that data
is in Oracle Database.”

Larry Ellison (2011)




Oracle Big Data Appliance
 Apache Hadoop
 NoSQL Database
 Oracle Tools
 Data Integrator for Oracle Database
 Data Loader
 R distribution
 Oracle Database

 Big data integration




What is the point of Hadoop?




 Big data storage          Big data integration          Big data analytics

 Yes, depending on who you ask (and when)



Example deployment

 Orbitz
   Processes millions of searches and transactions a day, resulting in
   hundreds of GBs of log data

 Big data storage
   Early Hadoop adopter for long-term storage and processing of
   un/semi-structured data
   Too much data to store and process in data warehouse due to cost and
   space considerations

 Big data analytics
   Adopted Hive for SQL-like query capabilities
   Also, machine learning to automate hotel ranking based on user behavior

 Big data integration
   Hadoop provided repository to store and query search logs and MapReduce
   a more efficient data extraction process
   Creating data exports to R, and aggregating data to data warehouse
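
To make the "MapReduce as a data extraction process" idea concrete, here is a minimal in-memory sketch of the map, shuffle and reduce steps applied to search logs. It is plain Python, not Hadoop code, and it is not Orbitz's pipeline: the tab-separated log layout, the field names and the function names are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical log layout (an assumption, not from the slides):
# timestamp <TAB> user_id <TAB> hotel_id <TAB> action
SAMPLE_LOG = [
    "2011-11-08T10:00:00\tu1\th42\tsearch",
    "2011-11-08T10:00:05\tu1\th42\tclick",
    "2011-11-08T10:01:00\tu2\th42\tsearch",
    "2011-11-08T10:02:00\tu2\th7\tclick",
    "malformed line",
]


def map_record(line):
    """Map step: emit (hotel_id, 1) for each click; skip unparseable lines."""
    fields = line.split("\t")
    if len(fields) != 4:
        return []
    _, _, hotel_id, action = fields
    return [(hotel_id, 1)] if action == "click" else []


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and reduce in sequence over an iterable of log lines."""
    mapped = [pair for line in lines for pair in map_record(line)]
    return dict(reduce_counts(k, vs) for k, vs in shuffle(mapped).items())


clicks_per_hotel = run_job(SAMPLE_LOG)
```

In a real cluster the map and reduce functions run in parallel across many nodes and the shuffle moves data between them; the aggregated output is the kind of result that could then be exported to R or loaded into a data warehouse.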


Vendor timeline – 451 Research coverage
 Timeline figure: time axis runs from MAR 09 to OCT 11
 Vendors shown: MicroStrategy, Quest Software, Opera, EMC, NetApp, Dell,
 Pervasive, Platfora, Jaspersoft, Revolution, Hadapt, SAS, Appistry,
 Informatica, MapR, Amazon, SnapLogic, Cloudera, Tableau, Oracle, Pentaho,
 IBM, Hortonworks, Karmasphere, Kitenga, Talend, Microsoft, Datameer,
 DataStax, RainStor, Platform, ZettaSet, Gluster, Teradata, Composite



The Apache Hadoop ecosystem
Big data analytics
   Microsoft, IBM, Revolution, Platfora, Karmasphere, ZettaSet,
   MicroStrategy, Tableau, Pentaho, Kitenga, Datameer, Jaspersoft, Opera, SAS

Big data integration
   RainStor, Platform, Pervasive, Informatica, Composite, Talend, IBM, Quest,
   Hadapt, SnapLogic, Oracle, Teradata, Microsoft, Cloudera

Hadoop distributors
   Cloudera, Hortonworks, Microsoft, DataStax, IBM, MapR, EMC, Amazon

Big data storage
   Appistry, EMC, Dell, IBM, Gluster, NetApp



Current data management trends

 The amount of data to be stored, managed and analyzed is growing rapidly
 Data processing capabilities have never been better
 The value of data has never been better understood

 Preliminary survey results – for illustration purposes
                                     % Change: 2013 vs. 2011
 Enterprise Data Warehouse           198%
 Regional/Departmental Data Marts    169%
 Exploratory Analytics Platform      183%
 Hadoop Cluster                      115%
 Data Archive                        394%
 Operational Databases               703%
 Searchable Data Platform            259%
 Total Data Growth 2011-2013         180%

RISK / OPPORTUNITY

 The data deluge problem is also a big data opportunity




What is Big Data?
 More than just rising data volumes

                         Big Data ≠ Volume




What is Big Data?
 Also variety of data types/sources and velocity of data updates

                      Big Data = Volume                          Variety             Velocity




 Preliminary survey results – for illustrative purposes:

 "My organization’s existing data management architecture is suitable to
 meet its future demands for business intelligence"
   Strongly Agree/Agree          29%
   Neutral                       34%
   Disagree/Strongly Disagree    37%




Current data management trends

 The volume, variety and velocity of data are growing rapidly
 ‘Big Data’ covers a diverse set of products that can be applied to
 different problems
 Data processing capabilities have never been better
 The value of data has never been better understood

RISK / OPPORTUNITY

 ‘Big Data’ highlights the problem – volume/variety/velocity,
 and promises a solution – value,
 but doesn’t provide a path in between



What is Total Data?
 Not just another name
  for Big Data

 Inspired by ‘Total Football’ –
  a new approach to soccer that
  emerged in the late 1960s

 If your data is big, the way
  you manage it should be total

 Total Data is making the most
  efficient use of existing and new data management resources to
  deliver value from data



What is Total Data?
 Also the desire of the user to store and process all their data

             Value = (Volume           Variety                Velocity) x Totality




 Big data
 storage




What is Total Data?
 Within tolerable time frames

            Value = ((Volume           Variety Velocity) x Totality) / Time




 Stream processing
 S4                                                                   Hadoop
 Storm
 Percolator




What is Total Data?
 And the desire to explore data for new value

   Value = ((Volume    Variety    Velocity) x (Totality + Exploration)) / Time




                                                                       Big data
                                                                       analytics




Data exploration
 Schema on write (stack, top to bottom)
    Application
    Schema
    RDBMS
    SQL

 Schema on read (stack, top to bottom)
    Application
    Hadoop
    Schema
    MapReduce
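
The contrast between the two stacks can be sketched in a few lines of plain Python. This is an illustration only, not RDBMS or Hadoop code: the `SCHEMA` mapping, the JSON-lines raw format and both function names are assumptions introduced here.

```python
import json

# Hypothetical schema (an assumption for illustration): field name -> type
SCHEMA = {"user": str, "amount": float}


def write_with_schema(table, record):
    """Schema on write: validate and coerce before storing; reject bad records."""
    row = {}
    for name, typ in SCHEMA.items():
        if name not in record:
            raise ValueError(f"missing field: {name}")
        row[name] = typ(record[name])
    table.append(row)


def read_with_schema(raw_lines, schema):
    """Schema on read: keep raw text as stored; apply structure at query time."""
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # unparseable lines stay in storage but are skipped here
        if not isinstance(record, dict):
            continue
        try:
            yield {name: typ(record[name]) for name, typ in schema.items()}
        except (KeyError, ValueError, TypeError):
            continue  # record does not fit this particular schema


# Raw store accepts anything; structure is decided by whoever reads it.
raw_store = [
    '{"user": "a", "amount": "9.5", "extra": 1}',
    '{"user": "b"}',
    "not json",
]
strict_rows = list(read_with_schema(raw_store, SCHEMA))
loose_rows = list(read_with_schema(raw_store, {"user": str}))
```

The same raw lines yield different result sets depending on the schema supplied at read time, which is what makes this style suited to exploration; the write-time path trades that flexibility for guaranteed structure in every stored row.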



Data exploration

 Exploratory Analytics Platform (stack, top to bottom)
    Application
    Loose schema
    RDBMS
    Analytics

 Hadoop (stack, top to bottom)
    Application
    Hadoop
    Schema
    MapReduce

 Related technologies listed alongside
    RDBMS + UDFs
    SQL-MapReduce
    Splunk
    HPCC Systems
    Dryad
    Tenzing
    Dremel
    Piccolo



Data platforms for different data types
 Preliminary survey results – for illustrative purposes:
                                     Enterprise Data   Exploratory Analytics
 Data type                           Warehouse         Platform                Hadoop
 Customer Data                       59%               5%                      11%
 Transactional Data                  51%               8%                      11%
 Domain-specific Application Data    46%               14%                     14%
 Online Transaction Data             46%               11%                     11%
 Application Log Data                41%               16%                     14%
 Other Documents/Content             35%               16%                     16%
 Audio/Video/Graphics                30%               14%                     24%
 Network Log Data                    30%               16%                     22%
 Search Log                          27%               19%                     22%
 Other Log Files                     27%               16%                     24%
 Web Log Data                        27%               19%                     22%
 Social Media/Online Data            24%               22%                     24%




Data platforms for different application workloads
 Preliminary survey results – for illustrative purposes:
                                     Enterprise Data   Exploratory Analytics
 Application workload                Warehouse         Platform                Hadoop
 Data Consolidation                  49%               11%                     14%
 Data Storage for Compliance         49%               11%                     16%
 Financial Forecasting               49%               16%                     8%
 Decision Support                    49%               22%                     8%
 Data Sandboxing                     43%               16%                     11%
 Trend Analysis                      43%               19%                     19%
 Data Indexing/Search                41%               16%                     19%
 Ad Hoc, Iterative Analysis          41%               22%                     16%
 Customer Analysis                   38%               22%                     14%
 IT Data Analysis                    35%               22%                     16%
 Clickstream Analysis                30%               22%                     19%



eBay’s Singularity platform
 Analyze & Report                                        Discover & Explore

 Data warehouse
    6+PB Teradata EDW
    Structured SQL analysis
    500+ concurrent users

 Singularity
    40+PB Teradata appliance
    Semi-structured SQL
    150+ concurrent users

 Hadoop
    20+PB Hadoop cluster
    Unstructured analysis
    5-10 concurrent users

 ‘soft data projection’ – apply structural patterns as the data is analyzed

 support for user-defined functions that go beyond standard SQL

 a SQL interface familiar to existing analysts



What is Total Data?
 While maximizing the investment in existing skills and resources

   Value = ((Volume     Variety Velocity) x (Totality + Exploration)) / (Time x Skills and Resources)




                                  Big data
                                  integration




What is Total Data?
 While maximizing the investment in existing skills and resources

   Value = ((Volume      Variety Velocity) x (Totality + Exploration)) / (Time x Skills and Resources)

 Total Data is making the most efficient use of existing and new data
  management resources to deliver value from data

 Inspired by ‘Total Football’




The old way


 Diagram components, left to right:
    App (x6)
    Relational database (x3)
    Data cleansing/MDM
    EDW
    Data mart (x2)
    Data archive
    Reporting/BI (x6)




The old way
 Data archive     Operational database     Analytic database     Business intelligence




The new way


 Diagram components:
    App
    Cache
    Relational database
    NoSQL database
    NewSQL database
    Non-relational database
    Stream processing
    Datastructure
    Data mart
    “Data Hub”
    Hadoop
    EDW
    Queryable archive
    Exploratory analytics platform
    Reporting/BI

 Overlay labels:
    Big data integration
    Big data storage
    Big data analytics




The new way

 Diagram components:
    ‘Data Hub’
    Data archive
    Exploratory analytics
    Data cache/grid
    Non-relational database
    Hadoop
    Data warehouse
    Datastructure
    NoSQL database
    Event stream processing
    Relational database




Relevant reports
 Total Data
  • Explaining the total data management approach to
    dealing with the impact of big data on the data
    management landscape
  • Coming late 2011
  • sales@the451group.com

 Free copy for completing our
  Total Data survey:
 www.bit.ly/451data


The Blind Men and the Elephant




Hadoop-related business strategies




 Support subscription
    Hortonworks
    Cloudera CDH
    IBM BigInsights Community

 Components
    Chukwa, Sqoop, ZooKeeper, Pig
    HBase, Avro, Mahout, Flume
    MapReduce, Whirr, Hama, Hive
    HDFS
    Hadoop Common



Hadoop-related business strategies



 Support subscription
    Cloudera Enterprise
    IBM BigInsights Enterprise
    Hortonworks Data Platform

 Management

 Components
    Chukwa, Sqoop, ZooKeeper, Pig
    HBase, Avro, Mahout, Flume
    MapReduce, Whirr, Hama, Hive
    HDFS
    Hadoop Common



Apache Hadoop contributors




             Source: Datameer blog. http://datameer.com/blog/uncategorized/whose-hadoop-is-bigger-really-2.html



Key contributors




              Source: Hortonworks blog. http://www.hortonworks.com/reality-check-contributions-to-apache-hadoop/



Key contributors




                       Source: Cloudera blog. http://www.cloudera.com/blog/2011/10/the-community-effect/



Hadoop-related business strategies



 Support subscription

 Management

 Components
    Chukwa, Sqoop, ZooKeeper, Pig
    HBase, Avro, Mahout, Flume
    MapReduce, Whirr, Hama, Hive
    HDFS
    Hadoop Common

 Default alternatives:
    MapR/EMC – Direct Access NFS
    DataStax – CassandraFS

 Optional alternatives:
    IBM – GPFS
    Appistry – CloudIQ
    Gluster – GlusterFS


Hadoop-related business strategies



 Support subscription

 Management

 Components
    Chukwa, Sqoop, ZooKeeper, Pig
    HBase, Avro, Mahout, Flume
    MapReduce, Whirr, Hama, Hive
    HDFS
    Hadoop Common

 Default alternatives:
    MapR/EMC – JobTracker HA
    Platform – Platform MapReduce



Hadoop component alternatives

 Concerns about JobTracker and NameNode as single points of failure (SPOF)

MapReduce
 JobTracker
 TaskTracker (x6)

HDFS
 NameNode
 DataNode (x6)




Apache Hadoop 0.23 and beyond
 NextGen MapReduce splits JobTracker into resource management
  and application lifecycle management
NextGen MapReduce
  Resource Manager
  Node Manager (x5), each shown with an App Master



 NameNode HA adds a standby NameNode to enable warm and hot
  standby for both planned and unplanned downtime
 NameNode HA
   Active NameNode   Standby NameNode   DataNode   DataNode   DataNode   DataNode



 Does not preclude the use of alternatives, but does raise the bar for
  ‘enterprise-level’ capabilities in Apache Hadoop
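
The active/standby idea behind NameNode HA can be illustrated with a toy model. This is a sketch only and is not how Apache Hadoop implements HA: the classes `ToyNameNode` and `ToyHAPair`, the in-memory shared edit log and the manual `failover()` call are all hypothetical names and simplifications chosen for this example.

```python
class ToyNameNode:
    """Toy stand-in for a NameNode: holds a path -> block-list namespace."""

    def __init__(self, name):
        self.name = name
        self.namespace = {}
        self.applied = 0  # number of shared-log entries applied so far

    def apply(self, edit):
        path, blocks = edit
        self.namespace[path] = blocks
        self.applied += 1


class ToyHAPair:
    """One active and one standby node sharing an append-only edit log."""

    def __init__(self):
        self.shared_log = []
        self.active = ToyNameNode("nn1")
        self.standby = ToyNameNode("nn2")

    def write(self, path, blocks):
        """Record a namespace change in the shared log, then apply it on the active."""
        edit = (path, list(blocks))
        self.shared_log.append(edit)
        self.active.apply(edit)

    def sync_standby(self):
        """Replay any log entries the standby has not yet applied."""
        for edit in self.shared_log[self.standby.applied:]:
            self.standby.apply(edit)

    def failover(self):
        """Bring the standby up to date, then swap roles; returns the new active's name."""
        self.sync_standby()
        self.active, self.standby = self.standby, self.active
        return self.active.name


pair = ToyHAPair()
pair.write("/logs/a", ["b1", "b2"])
pair.write("/logs/b", ["b3"])
new_active = pair.failover()
```

In this toy, a standby that calls `sync_standby()` continuously would take over almost immediately, while one that only replays the log at failover time takes longer. That is roughly the hot versus warm distinction the slide mentions.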

A cautionary tale?




Survey details:
   http://bit.ly/451data




matthew.aslett@the451group.com                                       www.twitter.com/maslett




 
Spark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark Summit
 
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHow to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHortonworks
 
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012Gigaom
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data LakeVMware Tanzu
 

Was ist angesagt? (20)

Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelMoving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
 
Why hadoop for data science?
Why hadoop for data science?Why hadoop for data science?
Why hadoop for data science?
 
Oncrawl elasticsearch meetup france #12
Oncrawl elasticsearch meetup france #12Oncrawl elasticsearch meetup france #12
Oncrawl elasticsearch meetup france #12
 
Big Data
Big DataBig Data
Big Data
 
Enterprise Apache Hadoop: State of the Union
Enterprise Apache Hadoop: State of the UnionEnterprise Apache Hadoop: State of the Union
Enterprise Apache Hadoop: State of the Union
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 
Transform You Business with Big Data and Hortonworks
Transform You Business with Big Data and HortonworksTransform You Business with Big Data and Hortonworks
Transform You Business with Big Data and Hortonworks
 
Building intelligent applications, experimental ML with Uber’s Data Science W...
Building intelligent applications, experimental ML with Uber’s Data Science W...Building intelligent applications, experimental ML with Uber’s Data Science W...
Building intelligent applications, experimental ML with Uber’s Data Science W...
 
Big Data & Oracle Technologies
Big Data & Oracle TechnologiesBig Data & Oracle Technologies
Big Data & Oracle Technologies
 
Hadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data Processing
 
Big Data, Hadoop, Hortonworks and Microsoft HDInsight
Big Data, Hadoop, Hortonworks and Microsoft HDInsightBig Data, Hadoop, Hortonworks and Microsoft HDInsight
Big Data, Hadoop, Hortonworks and Microsoft HDInsight
 
Introduction to Big Data and Hadoop
Introduction to Big Data and HadoopIntroduction to Big Data and Hadoop
Introduction to Big Data and Hadoop
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_final
 
IT @ Intel: Preparing the Future Enterprise with the Internet of Things
IT @ Intel: Preparing the Future Enterprise with the Internet of ThingsIT @ Intel: Preparing the Future Enterprise with the Internet of Things
IT @ Intel: Preparing the Future Enterprise with the Internet of Things
 
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
 
HPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare TransformationHPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare Transformation
 
Spark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun Murthy
 
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHow to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
 
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake
 

Ähnlich wie Hadoop World 2011: The Blind Men and the Elephant - Matthew Aslett - The 451 Group

The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...
The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...
The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...Cloudera, Inc.
 
Apache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingApache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingm_hepburn
 
Hadoop World 2011: How Hadoop Revolutionized Business Intelligence and Advanc...
Hadoop World 2011: How Hadoop Revolutionized Business Intelligence and Advanc...Hadoop World 2011: How Hadoop Revolutionized Business Intelligence and Advanc...
Hadoop World 2011: How Hadoop Revolutionized Business Intelligence and Advanc...Cloudera, Inc.
 
Business Intelligence and Data Analytics Revolutionized with Apache Hadoop
Business Intelligence and Data Analytics Revolutionized with Apache HadoopBusiness Intelligence and Data Analytics Revolutionized with Apache Hadoop
Business Intelligence and Data Analytics Revolutionized with Apache HadoopCloudera, Inc.
 
How Apache Hadoop is Revolutionizing Business Intelligence and Data Analytics...
How Apache Hadoop is Revolutionizing Business Intelligence and Data Analytics...How Apache Hadoop is Revolutionizing Business Intelligence and Data Analytics...
How Apache Hadoop is Revolutionizing Business Intelligence and Data Analytics...Amr Awadallah
 
Big Data = Big Decisions
Big Data = Big DecisionsBig Data = Big Decisions
Big Data = Big DecisionsInnoTech
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataHortonworks
 
Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)Jeffrey T. Pollock
 
Introducing the Big Data Ecosystem with Caserta Concepts & Talend
Introducing the Big Data Ecosystem with Caserta Concepts & TalendIntroducing the Big Data Ecosystem with Caserta Concepts & Talend
Introducing the Big Data Ecosystem with Caserta Concepts & TalendCaserta
 
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016StampedeCon
 
Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Jonathan Seidman
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopHortonworks
 
Analytic Platforms in the Real World with 451Research and Calpont_July 2012
Analytic Platforms in the Real World with 451Research and Calpont_July 2012Analytic Platforms in the Real World with 451Research and Calpont_July 2012
Analytic Platforms in the Real World with 451Research and Calpont_July 2012Calpont Corporation
 
Hadoop World 2011: Big Data Analytics – Data Professionals: The New Enterpris...
Hadoop World 2011: Big Data Analytics – Data Professionals: The New Enterpris...Hadoop World 2011: Big Data Analytics – Data Professionals: The New Enterpris...
Hadoop World 2011: Big Data Analytics – Data Professionals: The New Enterpris...Cloudera, Inc.
 
Left Brain, Right Brain: How to Unify Enterprise Analytics
Left Brain, Right Brain: How to Unify Enterprise AnalyticsLeft Brain, Right Brain: How to Unify Enterprise Analytics
Left Brain, Right Brain: How to Unify Enterprise AnalyticsInside Analysis
 
Cloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarCloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarHortonworks
 
Putting Business Intelligence to Work on Hadoop Data Stores
Putting Business Intelligence to Work on Hadoop Data StoresPutting Business Intelligence to Work on Hadoop Data Stores
Putting Business Intelligence to Work on Hadoop Data StoresDATAVERSITY
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Innovative Management Services
 

Ähnlich wie Hadoop World 2011: The Blind Men and the Elephant - Matthew Aslett - The 451 Group (20)

The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...
The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...
The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...
 
Hadoop Trends
Hadoop TrendsHadoop Trends
Hadoop Trends
 
Apache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingApache hadoop bigdata-in-banking
Apache hadoop bigdata-in-banking
 
Hadoop World 2011: How Hadoop Revolutionized Business Intelligence and Advanc...
Hadoop World 2011: How Hadoop Revolutionized Business Intelligence and Advanc...Hadoop World 2011: How Hadoop Revolutionized Business Intelligence and Advanc...
Hadoop World 2011: How Hadoop Revolutionized Business Intelligence and Advanc...
 
Business Intelligence and Data Analytics Revolutionized with Apache Hadoop
Business Intelligence and Data Analytics Revolutionized with Apache HadoopBusiness Intelligence and Data Analytics Revolutionized with Apache Hadoop
Business Intelligence and Data Analytics Revolutionized with Apache Hadoop
 
How Apache Hadoop is Revolutionizing Business Intelligence and Data Analytics...
How Apache Hadoop is Revolutionizing Business Intelligence and Data Analytics...How Apache Hadoop is Revolutionizing Business Intelligence and Data Analytics...
How Apache Hadoop is Revolutionizing Business Intelligence and Data Analytics...
 
Big Data = Big Decisions
Big Data = Big DecisionsBig Data = Big Decisions
Big Data = Big Decisions
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
 
Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)
 
Introducing the Big Data Ecosystem with Caserta Concepts & Talend
Introducing the Big Data Ecosystem with Caserta Concepts & TalendIntroducing the Big Data Ecosystem with Caserta Concepts & Talend
Introducing the Big Data Ecosystem with Caserta Concepts & Talend
 
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
 
Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache Hadoop
 
Analytic Platforms in the Real World with 451Research and Calpont_July 2012
Analytic Platforms in the Real World with 451Research and Calpont_July 2012Analytic Platforms in the Real World with 451Research and Calpont_July 2012
Analytic Platforms in the Real World with 451Research and Calpont_July 2012
 
Hadoop World 2011: Big Data Analytics – Data Professionals: The New Enterpris...
Hadoop World 2011: Big Data Analytics – Data Professionals: The New Enterpris...Hadoop World 2011: Big Data Analytics – Data Professionals: The New Enterpris...
Hadoop World 2011: Big Data Analytics – Data Professionals: The New Enterpris...
 
Left Brain, Right Brain: How to Unify Enterprise Analytics
Left Brain, Right Brain: How to Unify Enterprise AnalyticsLeft Brain, Right Brain: How to Unify Enterprise Analytics
Left Brain, Right Brain: How to Unify Enterprise Analytics
 
Cloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarCloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinar
 
Putting Business Intelligence to Work on Hadoop Data Stores
Putting Business Intelligence to Work on Hadoop Data StoresPutting Business Intelligence to Work on Hadoop Data Stores
Putting Business Intelligence to Work on Hadoop Data Stores
 
Future of-hadoop-analytics
Future of-hadoop-analyticsFuture of-hadoop-analytics
Future of-hadoop-analytics
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
 

Mehr von Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 

Mehr von Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Kürzlich hochgeladen

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Hadoop World 2011: The Blind Men and the Elephant - Matthew Aslett - The 451 Group

  • 1. The Blind Men and the Elephant. Matthew Aslett, Senior Analyst, The 451 Group. Hadoop World, 8 November 2011. © 2011 by The 451 Group. All rights reserved
  • 2. Agenda: Introduction and family history; The Blind Men and the Elephant; What is the point of Hadoop?; Adoption trends; Big data, total data; Exploratory analytics; Hadoop-related business strategies; Contributors and their contributions; A cautionary tale
  • 3. The 451 Group. 451 Research is focused on the business of enterprise IT innovation. The company’s analysts provide critical and timely insight into the competitive dynamics of innovation in emerging technology segments. Tier1 Research is a single-source research and advisory firm covering the multi-tenant datacenter, hosting, IT and cloud-computing sectors, blending the best of industry and financial research. The Uptime Institute is ‘The Global Data Center Authority’ and a pioneer in the creation and facilitation of end-user knowledge communities to improve reliability and uninterruptible availability in datacenter facilities. TheInfoPro is a leading IT advisory and research firm that provides real-world perspectives on the customer and market dynamics of the enterprise information technology landscape, harnessing the collective knowledge and insight of leading IT organizations worldwide. ChangeWave Research is a research firm that identifies and quantifies ‘change’ in consumer spending behavior, corporate purchasing, and industry, company and technology trends.
  • 4. 451 Research. Matthew Aslett: senior analyst, enterprise software; with The 451 Group since 2007; www.twitter.com/maslett. Information Management: operational databases, data warehousing, data caching, event processing. Commercial Adoption of Open Source (CAOS): open source projects, adoption of open source software, vendor strategies. Hadoop first covered February 2008 as part of coverage of emerging open source data management projects. Hadoop first properly covered in March 2009 report covering the formation of Apache Hadoop distributor Cloudera.
  • 5. A family history?
  • 6. The Blind Men and the Elephant. “It was six men of Indostan / To learning much inclined, / Who went to see the Elephant / (Though all of them were blind), / That each by observation / Might satisfy his mind.” John Godfrey Saxe (1872)
  • 7. The Blind Men and the Elephant. “After Hadoop finishes filtering the data, the place you want to put that data is in Oracle Database.” Larry Ellison (2011)
  • 8. Oracle Big Data Appliance: Apache Hadoop NoSQL Database Oracle Tools Oracle Database Data Integrator for Oracle Database Data Loader Big data R distribution integration
  • 9. What is the point of Hadoop? Big data storage; big data integration; big data analytics. Yes, depending on who you ask (and when)
  • 10. Example deployment: Orbitz. Processes millions of searches and transactions a day, resulting in hundreds of GBs of log data. Big data storage: early Hadoop adopter for long-term storage and processing of un/semi-structured data; too much data to store and process in data warehouse due to cost and space considerations. Big data analytics: adopted Hive for SQL-like query capabilities; also, machine learning to automate hotel ranking based on user behavior. Big data integration: Hadoop provided repository to store and query search logs and MapReduce a more efficient data extraction process; creating data exports to R, and aggregating data to data warehouse.
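The Orbitz slide describes MapReduce as an extraction step over search logs. The idea can be sketched in plain Python as a map phase that emits key/count pairs and a reduce phase that sums them. This is only an illustration: the tab-separated log layout, the field names and the function names are all assumptions, and the code uses neither the Hadoop API nor anything from Orbitz's actual pipeline.

```python
from collections import defaultdict

# Hypothetical log layout (an assumption): timestamp<TAB>user<TAB>hotel_id<TAB>action
SAMPLE_LOG = [
    "2011-11-08T10:00:00\tu1\th42\tview",
    "2011-11-08T10:00:05\tu2\th42\tbook",
    "2011-11-08T10:01:00\tu1\th7\tview",
    "malformed line",
    "2011-11-08T10:02:00\tu3\th42\tview",
]


def map_phase(lines):
    """Emit ((hotel_id, action), 1) for every well-formed log line."""
    for line in lines:
        parts = line.split("\t")
        if len(parts) != 4:
            continue  # skip semi-structured lines that do not fit the layout
        _, _, hotel_id, action = parts
        yield (hotel_id, action), 1


def reduce_phase(pairs):
    """Sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def run_job(lines):
    """Run the map phase, then the reduce phase, on an iterable of lines."""
    return reduce_phase(map_phase(lines))


if __name__ == "__main__":
    for key, total in sorted(run_job(SAMPLE_LOG).items()):
        print(key, total)
```

In a real cluster the two phases would run in parallel across nodes, with a shuffle step grouping keys between them; here both run in one process to keep the sketch short.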
  • 11. Vendor timeline – 451 Research coverage (chart spanning MAR 09 to OCT 11). Vendors shown: MicroStrategy, Quest Software, Opera, EMC, NetApp, Dell, Pervasive, Platfora, Jaspersoft, Revolution, Hadapt, SAS, Appistry, Informatica, MapR, Amazon, SnapLogic, Cloudera, Tableau, Oracle, Pentaho, IBM, Hortonworks, Karmasphere, Kitenga, Talend, Microsoft, Datameer, DataStax, RainStor, Platform, ZettaSet, Gluster, Teradata, Composite
  • 12. The Apache Hadoop ecosystem. Big data analytics: Microsoft, IBM, Revolution, Platfora, Karmasphere, ZettaSet, MicroStrategy, Tableau, Pentaho, Kitenga, Datameer, Jaspersoft, Opera, SAS. Big data integration: RainStor, Platform, Pervasive, Informatica, Composite, Talend, IBM, Quest, Hadapt, SnapLogic, Oracle, Teradata, Microsoft, Cloudera. Hadoop distributors: Cloudera, Hortonworks, Microsoft, DataStax, IBM, MapR, EMC, Amazon. Big data storage: Appistry, EMC, Dell, IBM, Gluster, NetApp
  • 13. Current data management trends. The amount of data to be stored, managed and analyzed is growing rapidly. Data processing capabilities have never been better. The value of data has never been better understood. RISK / OPPORTUNITY: the data deluge problem is also a big data opportunity. Preliminary survey results, for illustration purposes (% change, 2013 vs. 2011):
      Enterprise Data Warehouse | 198%
      Regional/Departmental Data Marts | 169%
      Exploratory Analytics Platform | 183%
      Hadoop Cluster | 115%
      Data Archive | 394%
      Operational Databases | 703%
      Searchable Data Platform | 259%
      Total Data Growth 2011-2013 | 180%
  • 14. What is Big Data? More than just rising data volumes. Big Data ≠ Volume
  • 15. What is Big Data? Also variety of data types/sources and velocity of data updates. Big Data = Volume Variety Velocity. Preliminary survey results, for illustrative purposes: “My organization’s existing data management architecture is suitable to meet its future demands for business intelligence”: 29% Strongly Agree/Agree, 34% Neutral, 37% Disagree/Strongly Disagree
  • 16. Current data management trends. The volume, variety and velocity of data is growing rapidly. Data processing capabilities have never been better. The value of data has never been better understood. ‘Big Data’ covers a diverse set of products that can be applied to different problems. RISK / OPPORTUNITY: ‘Big Data’ highlights the problem – volume/variety/velocity, and promises a solution – value, but doesn’t provide a path in between
  • 17. What is Total Data? Not just another name for Big Data. Inspired by ‘Total Football’ – a new approach to soccer that emerged in the late 1960s. If your data is big, the way you manage it should be total. Total Data is making the most efficient use of existing and new data management resources to deliver value from data
  • 18. What is Total Data? Also the desire of the user to store and process all their data. Value = (Volume Variety Velocity) x Totality. Big data storage
  • 19. What is Total Data? Within tolerable time frames. Value = ((Volume Variety Velocity) x Totality) / Time. Stream processing: S4, Storm, Percolator; Hadoop
  • 20. What is Total Data? And the desire to explore data for new value. Value = ((Volume Variety Velocity) x (Totality + Exploration)) / Time. Big data analytics
  • 21. Data exploration. Schema on write: Application, Schema, RDBMS, SQL. Schema on read: Application, Schema, Hadoop, MapReduce
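The contrast on the data exploration slide can be illustrated with a small sketch: a schema-on-write store validates records against a fixed schema before accepting them, while a schema-on-read store accepts raw lines and applies structure only when a query runs. The class names, the `SCHEMA` mapping and the JSON line format are assumptions made for this example; they do not describe how an RDBMS or Hadoop is implemented.

```python
import json

# Hypothetical fixed schema for the schema-on-write case (an assumption).
SCHEMA = {"user": str, "amount": float}


class SchemaOnWriteStore:
    """Rejects any record that does not match the schema at write time."""

    def __init__(self, schema):
        self.schema = schema
        self.rows = []

    def write(self, record):
        if set(record) != set(self.schema):
            raise ValueError("record fields do not match the schema")
        for field, field_type in self.schema.items():
            if not isinstance(record[field], field_type):
                raise ValueError(f"field {field!r} has the wrong type")
        self.rows.append(record)


class SchemaOnReadStore:
    """Stores raw lines untouched; structure is applied per query."""

    def __init__(self):
        self.raw = []

    def write(self, line):
        self.raw.append(line)  # no validation on the way in

    def read(self, projection):
        """Parse each raw line at query time and project the requested fields."""
        for line in self.raw:
            try:
                doc = json.loads(line)
            except json.JSONDecodeError:
                continue  # unparseable lines are skipped, not rejected at load
            if not isinstance(doc, dict):
                continue
            yield {field: doc.get(field) for field in projection}
```

The trade-off shown is that the first store guarantees clean rows but cannot hold data nobody planned for, while the second keeps everything and lets each query decide which fields matter.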
  • 22. Data exploration. Exploratory Analytics Platform; RDBMS + UDFs; SQL-MapReduce; Splunk; HPCC Systems; Dryad; Tenzing; Dremel; Piccolo. Diagram labels: Application, Loose schema, Schema, Hadoop, RDBMS, Analytics, MapReduce
  • 23. Data platforms for different data types. Preliminary survey results, for illustrative purposes:
      Data type | Enterprise Data Warehouse | Exploratory Analytics Platform | Hadoop
      Customer Data | 59% | 5% | 11%
      Transactional Data | 51% | 8% | 11%
      Domain-specific Application Data | 46% | 14% | 14%
      Online Transaction Data | 46% | 11% | 11%
      Application Log Data | 41% | 16% | 14%
      Other Documents/Content | 35% | 16% | 16%
      Audio/Video/Graphics | 30% | 14% | 24%
      Network Log Data | 30% | 16% | 22%
      Search Log | 27% | 19% | 22%
      Other Log Files | 27% | 16% | 24%
      Web Log Data | 27% | 19% | 22%
      Social Media/Online Data | 24% | 22% | 24%
  • 24. Data platforms for different application workloads. Preliminary survey results, for illustrative purposes:
      Workload | Enterprise Data Warehouse | Exploratory Analytics Platform | Hadoop
      Data Consolidation | 49% | 11% | 14%
      Data Storage for Compliance | 49% | 11% | 16%
      Financial Forecasting | 49% | 16% | 8%
      Decision Support | 49% | 22% | 8%
      Data Sandboxing | 43% | 16% | 11%
      Trend Analysis | 43% | 19% | 19%
      Data Indexing/Search | 41% | 16% | 19%
      Ad Hoc, Iterative Analysis | 41% | 22% | 16%
      Customer Analysis | 38% | 22% | 14%
      IT Data Analysis | 35% | 22% | 16%
      Clickstream Analysis | 30% | 22% | 19%
  • 25. eBay’s Singularity platform. Analyze & Report; Discover & Explore. Data warehouse: 6+PB Teradata EDW, structured SQL analysis, 500+ concurrent users. Singularity: 40+PB Teradata appliance, semi-structured SQL, 150+ concurrent users. Hadoop: 20+PB Hadoop cluster, unstructured analysis, 5-10 concurrent users. ‘Soft data projection’ – apply structural patterns as the data is analyzed; support for user-defined functions that go beyond standard SQL; a SQL interface familiar to existing analysts
  • 26. What is Total Data? While maximizing the investment in existing skills and resources. Value = ((Volume Variety Velocity) x (Totality + Exploration)) / (Time x Skills and Resources). Big data integration
  • 27. What is Total Data? While maximizing the investment in existing skills and resources. Value = ((Volume Variety Velocity) x (Totality + Exploration)) / (Time x Skills and Resources). Total Data is making the most efficient use of existing and new data management resources to deliver value from data. Inspired by ‘Total Football’
  • 28. The old way (diagram). Labels: App; Relational database; Data cleansing/MDM; EDW; Data mart; Data archive; Reporting/BI
  • 29. The old way: Data archive; Operational database; Analytic database; Business intelligence
  • 30. The new way (diagram). Labels: App; Stream processing; Cache; Relational database; NoSQL database; NewSQL database; Non-relational database; Datastructure; Big data integration; Hadoop; “Data Hub”; EDW; Data mart; Reporting/BI; Exploratory analytics platform; Big data storage; Big data analytics; Queryable archive
  • 31. The new way: Data archive; Exploratory analytics; Data cache/grid; ‘Data Hub’; Non-relational database; Hadoop; Data warehouse; Datastructure; NoSQL database; Event stream processing; Relational database
  • 32. Relevant reports. Total Data: explaining the total data management approach to dealing with the impact of big data on the data management landscape. Coming late 2011. sales@the451group.com. Free copy for completing our Total Data survey: www.bit.ly/451data
  • 33. The Blind Men and the Elephant
  • 34. The Apache Hadoop ecosystem. Big data analytics: Microsoft, IBM, Revolution, Platfora, Karmasphere, ZettaSet, MicroStrategy, Tableau, Pentaho, Kitenga, Datameer, Jaspersoft, Opera, SAS. Big data integration: RainStor, Platform, Pervasive, Informatica, Composite, Talend, IBM, Quest, Hadapt, SnapLogic, Oracle, Teradata, Microsoft, Cloudera. Hadoop distributors: Cloudera, Hortonworks, Microsoft, DataStax, IBM, MapR, EMC, Amazon. Big data storage: Appistry, EMC, Dell, IBM, Gluster, NetApp
  • 35. Hadoop-related business strategies. Components: Chukwa, Sqoop, ZooKeeper, Pig, HBase, Avro, Mahout, Flume, MapReduce, Whirr, Hama, HDFS, Hive, Hadoop Common. Offerings: Hortonworks, Cloudera CDH, IBM BigInsights. Layer labels: Support subscription; Community
  • 36. Hadoop-related business strategies. Components: Chukwa, Sqoop, ZooKeeper, Pig, HBase, Avro, Mahout, Flume, MapReduce, Whirr, Hama, HDFS, Hive, Hadoop Common. Offerings: Cloudera Enterprise, IBM BigInsights Enterprise, Hortonworks Data Platform. Layer labels: Management; Support subscription
  • 37. Apache Hadoop contributors. Source: Datameer blog. http://datameer.com/blog/uncategorized/whose-hadoop-is-bigger-really-2.html
  • 38. Key contributors. Source: Hortonworks blog. http://www.hortonworks.com/reality-check-contributions-to-apache-hadoop/
  • 39. Key contributors. Source: Cloudera blog. http://www.cloudera.com/blog/2011/10/the-community-effect/
  • 40. Key contributors. Source: Cloudera blog. http://www.cloudera.com/blog/2011/10/the-community-effect/
  • 41. Hadoop-related business strategies. Components: Chukwa, Sqoop, ZooKeeper, Pig, HBase, Avro, Mahout, Flume, MapReduce, Whirr, Hama, HDFS, Hive, Hadoop Common. Layer labels: Management; Support subscription. Default alternatives: MapR/EMC – Direct Access NFS; DataStax – CassandraFS. Optional alternatives: IBM – GPFS; Appistry – CloudIQ; Gluster – GlusterFS
  • 42. Hadoop-related business strategies. Components: Chukwa, Sqoop, ZooKeeper, Pig, HBase, Avro, Mahout, Flume, MapReduce, Whirr, Hama, HDFS, Hive, Hadoop Common. Layer labels: Management; Support subscription. Default alternatives: MapR/EMC – JobTracker HA; Platform – Platform MapReduce
  • 43. Hadoop component alternatives. Concerns about JobTracker and NameNode as single points of failure (SPOF). MapReduce: one JobTracker, multiple TaskTrackers. HDFS: one NameNode, multiple DataNodes
  • 44. Apache Hadoop 0.23 and beyond. NextGen MapReduce splits JobTracker into resource management and application lifecycle management. NextGen MapReduce: one Resource Manager, multiple Node Managers, multiple App Masters
  • 45. Apache Hadoop 0.23 and beyond. NextGen MapReduce splits JobTracker into resource management and application lifecycle management. NextGen MapReduce: one Resource Manager, multiple Node Managers, multiple App Masters. NameNode HA adds a standby NameNode to enable warm and hot standby for both planned and unplanned downtime. NameNode HA: Active NameNode, Standby NameNode, multiple DataNodes. Does not preclude the use of alternatives, but does raise the bar for ‘enterprise-level’ capabilities in Apache Hadoop
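The split described for NextGen MapReduce can be pictured with a toy model: one cluster-wide object that only hands out capacity, and one object per application that tracks that application's tasks. The class names echo the slide, but every method, the slot-counting model and the sequential execution are assumptions for illustration; this is not the Apache Hadoop API and omits scheduling policy, failure handling and Node Managers.

```python
class ResourceManager:
    """Cluster-wide role: tracks free container slots per node, nothing else."""

    def __init__(self, slots_per_node):
        self.free = dict(slots_per_node)

    def allocate(self, wanted):
        """Grant up to `wanted` slots; return the node name for each granted slot."""
        granted = []
        for node in sorted(self.free):
            while self.free[node] > 0 and len(granted) < wanted:
                self.free[node] -= 1
                granted.append(node)
        return granted

    def release(self, node):
        """Return one slot on `node` to the free pool."""
        self.free[node] += 1


class AppMaster:
    """Per-application role: owns the task lifecycle for a single job."""

    def __init__(self, name, tasks, resource_manager):
        self.name = name
        self.pending = list(tasks)
        self.done = []
        self.rm = resource_manager

    def run(self):
        """Request slots, run pending tasks on them, release slots, repeat."""
        while self.pending:
            nodes = self.rm.allocate(len(self.pending))
            if not nodes:
                raise RuntimeError("no capacity available")
            for node in nodes:
                task = self.pending.pop(0)
                self.done.append((task, node))  # stand-in for real task execution
                self.rm.release(node)
        return self.done
```

The point of the separation is that the resource side knows nothing about tasks and the application side knows nothing about other applications, so neither object concentrates all the state that the single JobTracker held.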
  • 46. A cautionary tale?
  • 47. Survey details: http://bit.ly/451data. matthew.aslett@the451group.com. www.twitter.com/maslett