Using Hadoop Analytics to
Gain a Big Data Advantage

Jonathan Seidman, Solution Architect, Cloudera
Ian Fyfe, VP Product Marketing, Pentaho
Jeff Stacey, Director of GTM Strategy, Channel & Sales Development, Dell
Why big data matters to your business
    Jonathan Seidman, Cloudera




2   Confidential        Big Data Solutions

Explosive Data Growth

1.8 trillion gigabytes of data were created in 2011…

    •   More than 90% is unstructured data
    •   Approx. 500 quadrillion files
    •   Quantity doubles every 2 years

[Chart: gigabytes of data created (in billions), 2005 to 2015, on a scale of 0 to 10,000, split into structured and unstructured data]

Source: IDC 2011
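
As a rough illustration of what "doubles every 2 years" implies, the sketch below extrapolates from the 2011 figure on this slide. It assumes decimal units (1.8 trillion gigabytes taken as 1.8 zettabytes) and a perfectly steady doubling rate; the function name and the projected values are illustrative, not IDC numbers.

```python
def projected_zettabytes(year, base_year=2011, base_zb=1.8, doubling_period_years=2.0):
    """Extrapolate data volume assuming it doubles every `doubling_period_years`.

    base_zb is the slide's 2011 figure (1.8 trillion GB, taken as 1.8 ZB).
    The result is a simple extrapolation, not a published forecast.
    """
    periods = (year - base_year) / doubling_period_years
    return base_zb * 2 ** periods


# Print the extrapolated volume for a few years.
for year in (2011, 2013, 2015):
    print(year, round(projected_zettabytes(year), 1), "ZB")
```

Under these assumptions the volume quadruples between 2011 and 2015, which is the kind of growth the chart is meant to convey.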




The ‘Big Data’ Phenomenon

Big Data Drivers
• The proliferation of data capture and creation technologies
• Increased “interconnectedness” drives consumption (creating more data)
• Inexpensive storage makes it possible to keep more, longer
• Innovative software and analysis tools turn data into information

[Diagram: More Content, More Devices, More Consumption, New & Better Information]

Big Data encompasses not only the content itself, but how it’s consumed
• Every gigabyte of stored content can generate a petabyte or more of transient data*
• The information about you is much greater than the information you create
*Source: IDC 2011




The Opportunity: Quickly gain a competitive advantage

• Big opportunity to drive revenue, e.g.
   – Predict customer behavior across all channels (Web site, social media, email, etc.)
   – Understand and monetize customer behavior
   – Predict customer churn
• Big opportunity to reduce costs, e.g.
   – Networks – predict failure, neutralize attacks
   – Machines/sensors – predict failures
   – Financial risk management – reduce fraud, increase security

Use Cases
• Ecommerce – predict customer behavior across all channels to drive revenue
• E-gaming – understand and better monetize customer behavior
• Networks – predict failure, neutralize attacks to reduce costs
• Customers – predict churn, optimize revenue
• Machines/sensors – predict failures, reduce costs
• Financial risk management – reduce fraud, increase security


Big data challenges
    Ian Fyfe, Pentaho




Big Data Challenges


              Cost-effectively managing the volume, velocity and variety of data

              Deriving value across structured and unstructured data

              Adapting to context changes and integrating new data sources and types




The Current Solutions

Current database solutions are designed for structured data.

    •   Optimized to answer known questions quickly
    •   Schemas dictate form/context
    •   Difficult to adapt to new data types and new questions
    •   Expensive at petabyte scale

[Chart: gigabytes of data created (in billions), 2005 to 2015, on a scale of 0 to 10,000, split into structured and unstructured data; a "10%" label appears at the 2015 end]
Common Data Analytics Architecture

[Diagram components: DATA SOURCES; DATA COLLECTION; STORAGE ONLY GRID (ORIGINAL RAW DATA); TAPE ARCHIVE; ETL COMPUTE GRID; RDBMS (AGGREGATED DATA); BI REPORTS & INTERACTIVE APPS]

Callouts on the diagram:
• Offline data can’t be analyzed easily
• Can’t explore original high fidelity data
• Moving data to compute doesn’t scale
Leveraging big data for competitive advantage
     All




Success With Hadoop




Big Data Analytics at TravelTainment
    Multi-channel distribution platform for the travel industry

“Pentaho Business Analytics fits perfectly into our open source Big Data environment.”
-- Ibrahim Husseini, Director of Data Warehouse, TravelTainment

• Business challenge: Inefficient and time-consuming reporting capabilities on big data sets with legacy system.

Benefits
• Ability to visualize its very large data volumes for reporting and analysis in such a way that non-technical users can also easily understand them
• Can now run complex reports three times faster and with more flexibility than before
• For the first time can offer clients user-friendly, self-service and ad-hoc reporting services, helping IT focus on their main business and not serve as support desk for reporting

Why Pentaho
• Capability to analyze data from Hadoop and Hive
• Professional support for in-depth analysis
• Self-service analysis and reporting for business customers
• Cost-effective solution




Dell uncovers new insights and reduces IT costs by US$35 million with a
business intelligence solution designed for big data




Dell Business Intelligence Practice
• Accelerated customer shipment time by 33 percent
• Saved US$2 million by improving product quality
• Integrated data silos
• Reduced IT costs by US$35 million
• Increased agility




SecureWorks slashes the cost of storage with Dell | Cloudera Solution

Organization
SecureWorks is a true security partner to help protect your IT assets, comply with regulations and reduce costs, without having to build your internal security expertise from scratch.

Challenge
SecureWorks needed a highly scalable solution for collecting, processing, and analyzing massive amounts of data collected from customer environments.

Solution
The organization deployed the Dell™ | Cloudera® Solution with Cloudera’s distribution of Apache® Hadoop® software, Dell-developed Crowbar software framework, PowerEdge™ C2100 servers, Force10 switches, Dell and Cloudera services in a solution based on a Dell reference architecture.

Benefits
• Reduced the cost of data storage to 23 cents per gigabyte
• Gained easy scalability for future growth
• Leveraged open source software and commodity hardware to reduce time to market
• Maintain high availability for critical services and flexibility to analyze structured and unstructured data

“Our storage cost per gigabyte is 23 cents. We thought we had great economics previously when we were spending about seventeen dollars per gigabyte.”
Robert Scudiere, Director of Engineering, Dell SecureWorks
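
To put the two per-gigabyte figures quoted on this slide side by side (23 cents now, about seventeen dollars previously), here is a small arithmetic sketch. The 1,000 GB volume is a made-up example, and the helper name is hypothetical; only the two prices come from the slide.

```python
OLD_PRICE_PER_GB = 17.00   # "about seventeen dollars per gigabyte" (from the slide)
NEW_PRICE_PER_GB = 0.23    # "23 cents" per gigabyte (from the slide)


def storage_cost(gigabytes, price_per_gb):
    """Total storage cost in dollars for a given volume and unit price."""
    return gigabytes * price_per_gb


# How many times cheaper the new price is, and the fractional saving.
reduction_factor = OLD_PRICE_PER_GB / NEW_PRICE_PER_GB
savings_fraction = 1 - NEW_PRICE_PER_GB / OLD_PRICE_PER_GB

# Hypothetical volume, chosen only to make the difference concrete.
example_gb = 1_000
print(f"Old cost for {example_gb} GB: ${storage_cost(example_gb, OLD_PRICE_PER_GB):,.2f}")
print(f"New cost for {example_gb} GB: ${storage_cost(example_gb, NEW_PRICE_PER_GB):,.2f}")
print(f"Roughly {reduction_factor:.0f}x cheaper ({savings_fraction:.1%} saved per gigabyte)")
```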

Must-haves of an effective big data solution
     Jeff Stacey, Dell & Jonathan Seidman, Cloudera




Big Data Solution Requirements


                    Cost-effectively manage the volume, variety and velocity of data

                    Process and analyze large, complex data sets…quickly

                    Flexibly adapt to context changes and new data types




Why was Hadoop created?

Exploding data volumes & types LEADS TO dramatic changes in enterprise data management

[Diagram: data types shown include DIGITAL CONTENT, OPERATIONAL DATA, WEB LOGS, SOCIAL MEDIA, FILES, SMART GRIDS, TRANSACTIONAL DATA, AD IMPRESSIONS and R&D DATA]

It’s difficult to handle data this diverse at this scale. Traditional platforms can’t keep pace.

NEW OPPORTUNITY: With Hadoop, you can…
•   Extract more value
•   From more data
•   More cost effectively
•   With greater flexibility

HARD PROBLEMS
•   Deep analysis
•   Exhaustive and detailed
•   Sophisticated algorithms
•   Quick results

BIG DATA
•   Any kind
•   From any source
•   Structured and unstructured
•   At scale
What is Apache Hadoop?

Hadoop is a platform for data storage and processing that is…
  Scalable
  Fault tolerant
  Open source

CORE HADOOP COMPONENTS
  Hadoop Distributed File System (HDFS): File sharing and data protection across physical servers
  MapReduce: Distributed computing across physical servers

Consolidates everything
 • A single repository for storing and mining any type of data
 • Not bound by a single schema

Excels at complex analysis
 • Scale-out architecture divides workloads across multiple nodes
 • Flexible file system eliminates ETL bottlenecks

Scales economically
 • Can be deployed on commodity hardware
 • Open source platform guards against vendor lock-in




Core Hadoop: HDFS
Self-healing, high bandwidth CLUSTERED STORAGE

[Diagram: an incoming file split into blocks 1 to 5 passes through HDFS; each block appears three times across five cluster nodes]

     HDFS breaks incoming files into blocks and stores them redundantly across the cluster
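
The idea of splitting a file into blocks and storing each block redundantly can be sketched in a few lines of plain Python. This is a toy model, not HDFS code: the round-robin placement, the tiny block size, the node names and all function names are assumptions made for illustration, and real HDFS placement follows its own policy.

```python
from typing import Dict, List, Set


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut the data into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(num_blocks: int, nodes: List[str], replication: int = 3) -> Dict[int, List[str]]:
    """Assign each block to `replication` distinct nodes, round-robin (toy policy)."""
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def readable_after_failure(placement: Dict[int, List[str]], failed: Set[str]) -> bool:
    """True if every block still has at least one replica on a surviving node."""
    return all(any(node not in failed for node in replicas)
               for replicas in placement.values())


nodes = ["node1", "node2", "node3", "node4", "node5"]   # hypothetical cluster
blocks = split_into_blocks(b"x" * 50, block_size=10)     # 5 blocks, as in the diagram
placement = place_blocks(len(blocks), nodes)
print(placement)
print(readable_after_failure(placement, failed={"node1", "node2"}))
```

With three copies of every block on distinct nodes, losing any two nodes still leaves at least one copy of each block, which is the "self-healing" property the slide refers to.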



Core Hadoop: MapReduce
Framework for DISTRIBUTED COMPUTING

[Diagram: the same layout of blocks 1 to 5 across five cluster nodes as on the HDFS slide, labeled MR]

          Processes many jobs in parallel across many nodes and combines the results
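
The map, shuffle and reduce steps behind "processes in parallel and combines the results" can be illustrated with a word count in plain Python. This is a single-machine sketch using a thread pool to stand in for cluster nodes; it does not use the Hadoop API, and the function names and sample input are invented for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_outputs):
    """Shuffle: group all emitted values by key across every map output."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(chunks, workers=4):
    """Run the map step over chunks in parallel, then shuffle and reduce."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Each string stands in for a block of a larger file.
print(word_count(["big data", "Big insights", "data data"]))
```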



Major Hadoop Utilities

• Apache Hive: SQL-like language and metadata repository
• Apache Pig: High-level language for expressing data analysis programs
• Apache HBase: The Hadoop database. Random, real-time read/write access
• Hue: Browser-based desktop interface for interacting with Hadoop
• Apache ZooKeeper: Highly reliable distributed coordination service
• Oozie: Server-based workflow engine for Hadoop activities
• Flume: Distributed service for collecting and aggregating log and event data
• Sqoop: Integrating Hadoop with RDBMS
• Apache Whirr: Library for running Hadoop in the cloud




Hadoop in Production




The unrivaled leader in Hadoop
• Worldwide #1 distribution of Apache
  Hadoop
• 100% Open-Source Hadoop
  Distribution
• Largest contributor to the open source
  Hadoop ecosystem
     – Project founders from 8 of the 13
       leading Apache Projects
• Cloudera has more Apache committers
  on staff than any other company
• More than 100 enterprise & public
  sector customers across a wide variety
  of industries



Dell | Cloudera Solution with Pentaho


                         Dell Value
                          Business intelligence practice
                          Open & scalable infrastructure
                          Certified and tested platforms
                          Active community participation
                          Crowbar deployment tool
                          Reference Architecture
                          Deployment Guide & Services
                          Joint support with Cloudera
                          Actual customers




Industry first: PowerEdge C8000
         Mix and match for the ultimate performance in a dense 4U package

 •     Speed up your most resource-intensive
       workloads by mixing and matching
       compute, storage and/or GPU nodes in the
       same 4U shared infrastructure chassis
 •     Get the cores, memory and I/O expansion you
       need for peak workload performance

 Great for: Big Data, Web 2.0/Hosting, HPC

 Mix & Match
     • Mix compute, storage and GPUs in the same 4U chassis
     • More workload flexibility, HD & I/O options than the HP SL6500 or Super Micro 6047R

 Get faster results with more compute power
     • Intel Xeon E5-2600 processors boost performance by 80%
     • Up to 135W support
     • 2x the I/O bandwidth with PCI Express Gen3

 Do more with less
     • Shared infrastructure reduces power & cooling costs by ~20%
     • Refresh with the latest components without having to replace the entire chassis



• Visual design for Hadoop
• Reduces skills requirements
• Deep integration with Hadoop
      – HDFS, MapReduce, Sqoop, Oozie
      – Runs as MapReduce in-Hadoop
• Easily connects Hadoop to other enterprise data sources
• Broadens Hadoop use to data analysts, business users and IT

[Diagram: Reporting & Dashboards; Data Discovery Visualization; Predictive Analytics; Data Ingestion, Manipulation, Integration, Workflow]
Fast Visual Development for Hadoop
[Screenshots: Ingestion / Manipulation / Integration; Scheduling; Modeling]




Discovery > Proof of Value > Deployment




Summary
             Dell | Cloudera Solution with Pentaho



                   Cost-effectively manage the volume, velocity and variety of data

                   Derive value across structured and unstructured data

                   Rapidly adapt to context changes and integrate new data sources and types




Q&A


     Ian Fyfe, Pentaho




Start getting big insights

     Jonathan Seidman, Cloudera
     jseidman@cloudera.com
     www.cloudera.com

     Ian Fyfe, Pentaho
     ifyfe@pentaho.com
     www.pentaho.com

     Jeff Stacey, Dell
     Hadoop@dell.com
     www.dell.com/hadoop




Weitere ähnliche Inhalte

Was ist angesagt?

Hadoop World 2011: The Blind Men and the Elephant - Matthew Aslett - The 451 ...
Hadoop World 2011: The Blind Men and the Elephant - Matthew Aslett - The 451 ...Hadoop World 2011: The Blind Men and the Elephant - Matthew Aslett - The 451 ...
Hadoop World 2011: The Blind Men and the Elephant - Matthew Aslett - The 451 ...Cloudera, Inc.
 
Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelMoving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelDataWorks Summit
 
Pentaho Data Integration Introduction
Pentaho Data Integration IntroductionPentaho Data Integration Introduction
Pentaho Data Integration Introductionmattcasters
 
Designing Data Pipelines for Automous and Trusted Analytics
Designing Data Pipelines for Automous and Trusted AnalyticsDesigning Data Pipelines for Automous and Trusted Analytics
Designing Data Pipelines for Automous and Trusted AnalyticsDataWorks Summit
 
Why Your Product Needs an Analytic Strategy
Why Your Product Needs an Analytic Strategy Why Your Product Needs an Analytic Strategy
Why Your Product Needs an Analytic Strategy Pentaho
 
Big Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseBig Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseJeffrey T. Pollock
 
4. Big data & analytics HP
4. Big data & analytics HP4. Big data & analytics HP
4. Big data & analytics HPMITEF México
 
IT @ Intel: Preparing the Future Enterprise with the Internet of Things
IT @ Intel: Preparing the Future Enterprise with the Internet of ThingsIT @ Intel: Preparing the Future Enterprise with the Internet of Things
IT @ Intel: Preparing the Future Enterprise with the Internet of ThingsIntel IT Center
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopHortonworks
 
Building intelligent applications, experimental ML with Uber’s Data Science W...
Building intelligent applications, experimental ML with Uber’s Data Science W...Building intelligent applications, experimental ML with Uber’s Data Science W...
Building intelligent applications, experimental ML with Uber’s Data Science W...DataWorks Summit
 
Optimizing your Hadoop Infastructure: An Industry Panel Presentation
Optimizing your Hadoop Infastructure: An Industry Panel PresentationOptimizing your Hadoop Infastructure: An Industry Panel Presentation
Optimizing your Hadoop Infastructure: An Industry Panel PresentationDataWorks Summit
 
Expand a Data warehouse with Hadoop and Big Data
Expand a Data warehouse with Hadoop and Big DataExpand a Data warehouse with Hadoop and Big Data
Expand a Data warehouse with Hadoop and Big Datajdijcks
 
Innovate Analytics with Oracle Data Mining & Oracle R
Innovate Analytics with Oracle Data Mining & Oracle RInnovate Analytics with Oracle Data Mining & Oracle R
Innovate Analytics with Oracle Data Mining & Oracle RCapgemini
 
What is the Point of Hadoop
What is the Point of HadoopWhat is the Point of Hadoop
What is the Point of HadoopDataWorks Summit
 
The rise of big data governance: insight on this emerging trend from active o...
The rise of big data governance: insight on this emerging trend from active o...The rise of big data governance: insight on this emerging trend from active o...
The rise of big data governance: insight on this emerging trend from active o...DataWorks Summit
 

The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 

Webinar | Using Hadoop Analytics to Gain a Big Data Advantage

  • 1. Using Hadoop Analytics to Gain a Big Data Advantage Jonathan Seidman, Solution Architect, Cloudera Ian Fyfe, VP Product Marketing, Pentaho Jeff Stacey, Director of GTM Strategy, Channel & Sales Development, Dell
  • 2. Why big data matters to your business. Jonathan Seidman, Cloudera. (Slide footer: Confidential, Big Data Solutions)
  • 3. Explosive Data Growth. [Chart: gigabytes of data created (in billions), 2005 to 2015, y-axis 0 to 10,000, structured vs. unstructured data.] 1.8 trillion gigabytes of data was created in 2011: • More than 90% is unstructured data • Approx. 500 quadrillion files • Quantity doubles every 2 years. Source: IDC 2011
  • 4. The "Big Data" Phenomenon. Big Data Drivers: • The proliferation of data capture and creation technologies • Increased "interconnectedness" drives consumption (creating more data) • Inexpensive storage makes it possible to keep more, longer • Innovative software and analysis tools turn data into information. [Diagram labels: More Content, More Devices, More Consumption, New & Better Information.] Big Data encompasses not only the content itself, but how it's consumed: • Every gigabyte of stored content can generate a petabyte or more of transient data* • The information about you is much greater than the information you create. *Source: IDC 2011
  • 5. The Opportunity: Quickly gain a competitive advantage. • Big opportunity to drive revenue, e.g.: predict customer behavior across all channels (Web site, social media, email, etc.); understand and monetize customer behavior; predict customer churn • Big opportunity to reduce costs, e.g.: networks (predict failure, neutralize attacks); machines/sensors (predict failures); financial risk management (reduce fraud, increase security). Use Cases: • Ecommerce: predict customer behavior across all channels to drive revenue • E-gaming: understand and better monetize customer behavior • Networks: predict failure, neutralize attacks to reduce costs • Customers: predict churn, optimize revenue • Machines/sensors: predict failures, reduce costs • Financial risk management: reduce fraud, increase security
  • 6. Big data challenges. Ian Fyfe, Pentaho
  • 7. Big Data Challenges: • Cost-effectively managing the volume, velocity and variety of data • Deriving value across structured and unstructured data • Adapting to context changes and integrating new data sources and types
  • 8. The Current Solutions. [Chart: gigabytes of data created (in billions), 2005 to 2015, structured vs. unstructured data, with a 10% label.] Current database solutions are designed for structured data: • Optimized to answer known questions quickly • Schemas dictate form/context • Difficult to adapt to new data types and new questions • Expensive at petabyte scale
  • 9. Common Data Analytics Architecture. [Diagram components: Data Sources; Data Collection; Storage-Only Grid (original raw data); ETL Compute Grid; RDBMS (aggregated data); BI Reports & Interactive Apps; Tape Archive.] Callouts: • Offline data can't be analyzed easily • Can't explore original high fidelity data • Moving data to compute doesn't scale
  • 10. Leveraging big data for competitive advantage. All
  • 11. Success With Hadoop
  • 12. Big Data Analytics at TravelTainment. Multi-channel distribution platform for the travel industry. "Pentaho Business Analytics fits perfectly into our open source Big Data environment." -- Ibrahim Husseini, Director of Data Warehouse, TravelTainment. Business challenge: Inefficient and time-consuming reporting capabilities on big data sets with legacy system. Benefits: • Ability to visualize its very large data volumes for reporting and analysis in such a way that non-technical users can also easily understand them • Can now run complex reports three times faster and with more flexibility than before • For the first time can offer clients user-friendly, self-service and ad-hoc reporting services, helping IT focus on their main business and not serve as support desk for reporting. Why Pentaho: • Capability to analyze data from Hadoop and Hive • Professional support for in-depth analysis • Self-service analysis and reporting for business customers • Cost effective solution
  • 13. Dell uncovers new insights and reduces IT costs by US$35 million with a business intelligence solution designed for big data. Dell Business Intelligence Practice: • Accelerated customer shipment time by 33 percent • Saved US$2 million by improving product quality • Integrated data silos • Reduced IT costs by US$35 million • Increased agility
  • 14. SecureWorks slashes the cost of storage with Dell | Cloudera Solution. Organization: SecureWorks is a true security partner to help protect your IT assets, comply with regulations and reduce costs, without having to build your internal security expertise from scratch. Challenge: SecureWorks needed a highly scalable solution for collecting, processing, and analyzing massive amounts of data collected from customer environments. Solution: The organization deployed the Dell™ | Cloudera® Solution with Cloudera's distribution of Apache® Hadoop® software, Dell-developed Crowbar software framework, PowerEdge™ C2100 servers, Force10 switches, Dell and Cloudera services in a solution based on a Dell reference architecture. Benefits: • Reduced the cost of data storage to 23 cents per gigabyte • Gained easy scalability for future growth • Leveraged open source software and commodity hardware to reduce time to market • Maintain high availability for critical services and flexibility to analyze structured and unstructured data. "Our storage cost per gigabyte is 23 cents. We thought we had great economics previously when we were spending about seventeen dollars per gigabyte." Robert Scudiere, Director of Engineering, Dell SecureWorks
  • 15. Must-haves of an effective big data solution. Jeff Stacey, Dell (16, then 24 to close) & Jonathan Seidman, Cloudera
  • 16. Big Data Solution Requirements: • Cost-effectively manage the volume, variety and velocity of data • Process and analyze large, complex data sets…quickly • Flexibly adapt to context changes and new data types
  • 17. Why was Hadoop created? Exploding data volumes & types LEADS TO dramatic changes in enterprise data management. [Diagram: digital content, operational data, web logs, social media, files, smart grids, transactional data, ad impressions and R&D data feed into BIG DATA; labels: NEW OPPORTUNITY, HARD PROBLEMS.] With Hadoop, you can… • Extract more value: from more data; more cost effectively; with greater flexibility • Deep analysis: exhaustive and detailed; sophisticated algorithms; quick results • Any kind: from any source; structured and unstructured; at scale. It's difficult to handle data this diverse at this scale. Traditional platforms can't keep pace.
  • 18. What is Apache Hadoop? Hadoop is a platform for data storage and processing that is scalable, fault tolerant and open source. CORE HADOOP COMPONENTS: • Hadoop Distributed File System (HDFS): file sharing and data protection across physical servers • MapReduce: distributed computing across physical servers. Consolidates everything: • A single repository for storing and mining any type of data • Not bound by a single schema. Excels at complex analysis: • Scale-out architecture divides workloads across multiple nodes • Flexible file system eliminates ETL bottlenecks. Scales economically: • Can be deployed on commodity hardware • Open source platform guards against vendor lock
  • 19. Core Hadoop: HDFS. Self-healing, high bandwidth CLUSTERED STORAGE. [Diagram: a file split into blocks 1 to 5, with copies of each block spread over several nodes.] HDFS breaks incoming files into blocks and stores them redundantly across the cluster
  • 20. Core Hadoop: MapReduce. Framework for DISTRIBUTED COMPUTING. [Diagram: blocks 1 to 5 spread over several nodes, processed in place.] Processes many jobs in parallel across many nodes and combines the results
  • 21. Major Hadoop Utilities: • Apache Pig: high-level language for expressing data analysis programs • Apache Hive: SQL-like language and metadata repository • Apache HBase: the Hadoop database; random, real-time read/write access • Hue: browser-based desktop interface for interacting with Hadoop • Apache Zookeeper: highly reliable distributed coordination service • Oozie: server-based workflow engine for Hadoop activities • Flume: distributed service for collecting and aggregating log and event data • Sqoop: integrating Hadoop with RDBMS • Apache Whirr: library for running Hadoop in the cloud
  • 22. Hadoop in Production
  • 23. The unrivaled leader in Hadoop: • Worldwide #1 distribution of Apache Hadoop • 100% open-source Hadoop distribution • Largest contributor to the open source Hadoop ecosystem (project founders from 8 of the 13 leading Apache projects) • Cloudera has more Apache committers on staff than any other company • More than 100 enterprise & public sector customers across a wide variety of industries
  • 24. Dell | Cloudera Solution with Pentaho. Dell Value: • Business intelligence practice • Open & scalable infrastructure • Certified and tested platforms • Active community participation • Crowbar deployment tool • Reference Architecture • Deployment Guide & Services • Joint support with Cloudera • Actual customers
  • 25. Industry first: PowerEdge C8000. Mix and match for the ultimate performance in a dense 4U package: • Speed up your most resource-intensive workloads by mixing and matching compute, storage and/or GPU nodes in the same 4U shared infrastructure chassis • Get the cores, memory and I/O expansion you need for peak workload performance. Great for: Big Data, Web 2.0/Hosting, HPC. Get faster results with more compute power: • Intel Xeon E5-2600 processors boost performance by 80% • Up to 135W support • 2x the I/O bandwidth with PCI Express Gen3. Mix & Match: • Mix compute, storage and GPUs in the same 4U chassis • More workload flexibility, HD & I/O options than the HP SL6500 or Super Micro 6047R. Do more with less: • Shared infrastructure reduces power & cooling costs by ~20% • Refresh with the latest components without having to replace the entire chassis
  • 26. • Visual design for Hadoop • Reduces skills requirements • Deep integration with Hadoop (HDFS, MapReduce, Sqoop, Oozie; runs as MapReduce in-Hadoop) • Easily connects Hadoop to other enterprise data sources • Broadens Hadoop use to data analysts, business users and IT. [Diagram: Reporting & Dashboards; Data Discovery & Visualization; Predictive Analytics; Data Ingestion, Manipulation, Integration, Workflow]
  • 27. Fast Visual Development for Hadoop: Ingestion / Manipulation / Integration; Scheduling; Modeling
  • 28. Discovery > Proof of Value > Deployment
  • 29. Summary: Dell | Cloudera Solution with Pentaho • Cost-effectively manage the volume, velocity and variety of data • Derive value across structured and unstructured data • Rapidly adapt to context changes and integrate new data sources and types
  • 30. Q&A. Ian Fyfe, Pentaho
  • 31. Start getting big insights. Jonathan Seidman, Cloudera: jseidman@cloudera.com, www.cloudera.com • Ian Fyfe, Pentaho: ifyfe@pentaho.com, www.pentaho.com • Jeff Stacey, Dell: Hadoop@dell.com, www.dell.com/hadoop
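
Slides 19 and 20 describe HDFS splitting files into redundantly stored blocks and MapReduce processing data in parallel and combining the results. The following is a minimal single-process sketch of those two ideas in Python, not Hadoop's actual implementation or API. The block size, the node names, the round-robin placement and the word-count job are all illustrative assumptions; real HDFS uses much larger blocks and its own placement policy.

```python
from collections import defaultdict


def split_into_blocks(data, block_size):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes.

    Round-robin placement is an assumption made for this sketch only.
    """
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single result."""
    return key, sum(values)


def run_word_count(lines):
    """Run map, shuffle and reduce in sequence; a cluster would run them in parallel."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    blocks = split_into_blocks(b"abcdefghij", block_size=4)
    print(blocks)
    print(place_replicas(len(blocks), ["n1", "n2", "n3", "n4"], replication=3))
    print(run_word_count(["big data", "Big insights"]))
```

Because every block lives on several nodes, losing one node still leaves a copy elsewhere, and each map step can run on whichever node already holds the block it needs.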

Editor's Notes

  1. Come up with something new, to the point of what they are looking for. Start with some stories: how real firms used our products, had a problem, and solved it. Show how they can't solve the problem with the tools they have.
  2. Business users are asking more sophisticated questions: explore data in more detail; combine a variety of data; extract actionable information and insight from it quickly. Traditional "big data" solutions are extremely expensive and/or do not provide enough detail.
  3. TravelTainment, a provider of multi-channel distribution platforms for the travel industry, is using Pentaho Business Analytics for self-service analytics and reporting in a Big Data environment. With the continually booming online travel market, TravelTainment’s different clients required more insight into its data to help them plan promotions and other services. Before Pentaho, the company had acquired a set of legacy systems that had grown around individual products with limited reporting capabilities. As a result, reporting was inefficient and time consuming for IT. When TravelTainment decided to standardize on a single customer-focused reporting application, it chose Pentaho Business Analytics for the solution’s self-service reporting and ability to manage Big Data sets. Pentaho Reporting enables TravelTainment to run reports three times faster and with more flexibility than before. TravelTainment can now, for the first time, offer its clients user-friendly, self-service and ad-hoc reporting services. This also means that TravelTainment’s developer team can now fully concentrate on its main business, rather than having to serve as a support desk for reporting. With the success of this implementation, TravelTainment now plans to evaluate using Pentaho Data Integration (PDI) to move its data in and out of Hadoop.
  4. http://content.dell.com/us/en/enterprise/d/corporate~case-studies~en/Documents~2011-dell-bi-11003262.pdf.aspx Business need: With explosive data growth and the proliferation of data silos, Dell spent millions on data management without monetizing information. It needed to integrate enterprise data to improve information accuracy, cut costs, and uncover actionable insights. Solution: Dell Enterprise Business Intelligence (EBI) consultants helped design and deploy an integrated, global enterprise data warehouse solution, combining Teradata, Informatica, and other BI software with new and existing Dell infrastructure components. Benefits: • Accelerated customer shipment time by 33 percent and decreased the shipment backlog • Saved US$2 million by improving product quality and avoiding component replacements • Integrated data silos, offering an enterprise-wide view of information while reducing IT costs by US$35 million • Increased agility by providing information workers with self-service capabilities for accessing certified global data
  5. Introducing the four products that make up the PowerEdge C8000 series: • The PowerEdge C8000 4U shared infrastructure chassis • The PowerEdge C8220 single-wide compute sled • The PowerEdge C8220x double-wide GPU sled • The PowerEdge C8000x double-wide storage sled. The PowerEdge C8000 chassis holds up to 8 single-wide compute sleds or 4 double-wide compute sleds. Each compute sled is equivalent to a standard server built with one or more processors, memory, network interface, baseboard management controller, and local hard drive storage. The C8000 will be the only 4U shared infrastructure on the market that gives customers compute, GPU, and storage options in one chassis with the ability for internal or external power. Zeus delivers the greatest amount of configuration flexibility and front-side serviceability. Zeus' flexibility allows customers to standardize on a single architecture. By using the same common chassis design for a variety of configurations, the PowerEdge C8000 series can be scaled out, just like a versatile Lego block. The advantages of Zeus: By using the same basic building block over and over again, our customers can get the performance they need, with less deployment and maintenance time needed. This efficient use of IT resources plus the shared infrastructure savings help lower the total cost of ownership. Technology refresh cycles can be staggered to further reduce the total cost of ownership over several years.
  6. Emphasize results they can achieve! Go back to customer case studies.