Hortonworks
“State of the Union”
Shaun Connolly, VP Strategy
@shaunconnolly, @hortonworks

January 22, 2013




© Hortonworks Inc. 2013        Page 1
Quick Housekeeping Rules

• A Q&A panel is available if you have any questions during
 the webinar
• There will be time for Q&A at the end
• We will record the webinar for future viewing
• All attendees will receive a copy of the slides and recording




Hortonworks
• History of Apache Hadoop & Hortonworks’ Role
  – Genesis of Apache Hadoop
  – Role of Apache Software Foundation
  – Hortonworks Process for “Enterprise Hadoop”


• Key Areas of Focus in 2012

• The Road Ahead for Enterprise Hadoop




A Brief History of Apache Hadoop

Timeline: 2004, 2006, 2008, 2010, 2012, 2013 (Enterprise Hadoop)
Milestones, in order: Apache Project Established; Yahoo! Operate at scale; Hortonworks Data Platform

• 2005: Yahoo! creates team under E14 to work on Hadoop
  – Focus on INNOVATION
• 2007: Yahoo team extends focus to operations to support multiple projects & growing clusters
  – Focus on OPERATIONS
• 2011: Hortonworks created to focus on “Enterprise Hadoop”. Starts with 24 key Hadoop engineers from Yahoo
  – Focus on STABILITY



Hortonworks Snapshot

We develop, distribute and support the ONLY 100% open source
Enterprise Hadoop distribution

 Headquarters: Palo Alto, CA
 Employees: 180+ and growing
 Investors: Benchmark, Index, Yahoo

Develop
• We employ the core architects, builders and operators of Apache Hadoop
• We drive innovation within Apache Software Foundation projects

Distribute
• We distribute the only 100% Open Source Enterprise Hadoop Distribution:
  Hortonworks Data Platform
• We engineer, test & certify HDP for enterprise usage

Support
• We are uniquely positioned to deliver the highest quality of Hadoop support
• We enable the ecosystem to work better with Hadoop

Endorsed by Strategic Partners




Apache Community Leadership

Diagram: community cycle around Apache Hadoop (Design & Develop, Test & Patch, Release)
Projects shown: Apache Hadoop, Apache Pig, Apache Hive, Apache HBase, Apache HCatalog,
Apache Ambari, Other Apache Projects

Apache Software Foundation Guiding Principles
• Release early & often
• Transparency, respect, meritocracy

Key Roles held by Hortonworkers
• PMC Members
   – Managing community projects
   – Mentoring new incubator projects
   – About 20 Hortonworkers managing community
• Committers
   – Authoring, reviewing & editing code
   – About 50 Hortonworkers across projects
• Release Managers
   – Testing & releasing projects
   – Hortonworkers across key projects like Hadoop, Hive, Pig, HCatalog, Ambari, HBase

“We have noticed more activity over the last year
 from Hortonworks’ engineers on building out
 Apache Hadoop’s more innovative features. These
 include YARN, Ambari and HCatalog..”
                                             - Jeff Kelly: Wikibon

Hortonworks Process for Enterprise Hadoop

Virtuous Cycle: development & fixed issues done upstream & stable project releases flow downstream

Upstream Community Projects
• Cycle around Apache Hadoop: Design & Develop, Test & Patch, Release
• Projects shown: Apache Hadoop, Apache Pig, Apache Hive, Apache HBase, Apache HCatalog,
  Apache Ambari, Other Apache Projects

Between upstream and downstream
• Stable Project Releases
• Fixed Issues

Downstream Enterprise Product
• Hortonworks Data Platform
• Cycle: Design & Develop, Integrate & Test, Package & Certify, Distribute

No Lock-in: Integrated, tested & certified distribution
lowers risk by ensuring close alignment with Apache projects


HDP Certifies Latest Stable Components

  Apache Project     HDP 1.2     CDH 3u5             CDH 4.1.2
  Hadoop             1.1.2       0.20.2 +923.418     2.0.0alpha +541
  Pig                0.10.1      0.8.1 +51.39        0.10.0 +48
  Hive               0.10.0      0.7.1 +42.56        0.9.0 +148
  HCatalog           0.5.0       n/a                 n/a
  HBase              0.94.2      0.90.6 +84.73       0.92.1 +154
  Sqoop              1.4.2       1.3.0 +5.88         1.4.1 +51
  Oozie              3.2.0       3.2.0               3.2.0
  Zookeeper          3.4.5       3.3.5 +19.5         3.4.3 +25
  Ambari             1.2.0       n/a                 n/a
  Flume              1.3.0       0.9.4 +25.46        1.2.0 +119
  Mahout             0.7.0       0.5 +9.7            0.7 +4


                                        Source: http://files.cloudera.com/pdf/datasheet/cdh4.1_spec_sheet.pdf

True Enterprise Class Open Source
• 100% Open Source. No Holdbacks.
  – Only true implementation of OSS Apache Hadoop
  – Preferred by the software vendors that you rely on


• Flexible Deployment
  – No License Fee for usage


• Community Open Source Mitigates Lock-In
  – Proprietary Open Source = Lock-In
  – Open communities always trump “open source”




Hortonworks
• History of Apache Hadoop & Hortonworks’ Role

• Key Areas of Focus in 2012
  – Addressing “Enterprise Hadoop” Requirements
  – Enabling Interoperability of the Ecosystem


• The Road Ahead for Enterprise Hadoop




HDP: Enterprise Hadoop Distribution

Hortonworks Data Platform (HDP)
Enterprise Hadoop
• The ONLY 100% open source and complete distribution
• Enterprise grade, proven and tested at scale
• Ecosystem endorsed to ensure interoperability

HORTONWORKS DATA PLATFORM (HDP)
• OPERATIONAL SERVICES (Manage & Operate at Scale): AMBARI, OOZIE
• DATA SERVICES (Store, Process and Access Data): FLUME, SQOOP, PIG, HIVE, HBASE, HCATALOG
• HADOOP CORE (Distributed Storage & Processing): HDFS, WEBHDFS, MAP REDUCE, YARN (in 2.0)
• PLATFORM SERVICES (Enterprise Readiness): HA, DR, Snapshots, Security, …

OS, Cloud, VM, Appliance




Latest Hortonworks Announcements
Two releases in January 2013


  JANUARY 15              Hortonworks Data Platform 1.2
                          Hortonworks Brings Enterprise Manageability to 100%
                          Open Source Apache Hadoop Distribution

  JANUARY 22              Hortonworks Sandbox
                          Hortonworks accelerates Hadoop skills development
                          with an easy-to-use, flexible and extensible platform to
                          learn, evaluate and use Apache Hadoop


HDP 1.2 Summary
Hortonworks Data Platform 1.2
HDP outpaces the competition to extend leadership through 100%
open source Enterprise Apache Hadoop

Focus areas:
 • Ambari: continued innovation with a complete, free and open
   cluster management tool
   – Provision, Manage and Monitor your Hadoop infrastructure
   – Job diagnostics, usage heat maps
   – Ecosystem integration
 • Enhanced security model for Hive and HCatalog
 • Performance and operational enhancements for HBase
 • Extended Full Stack HA to Hive & HCatalog Metastore


HDP 1.2: Ambari Key Features
• Job Diagnostics
  Visualize and troubleshoot Hadoop job execution and performance

• Cluster History
  View historical job execution & performance

• Instant Insight
  View health of Core Hadoop (HDFS, MapReduce) and related projects

• Cluster Navigation
  “Quick link” buttons jump into namenode web UI for a server

• REST interface
  Provides external access to Ambari for existing tools. Facilitates
  integration with Microsoft System Center and Teradata Viewpoint

Screenshot: Apache Ambari Dashboard
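
The REST interface bullet above can be illustrated with a small sketch of how an external tool might prepare a call to Ambari. This is not taken from the slides: the host name, port, cluster name and credentials are hypothetical, and the `/api/v1/clusters/<name>/services` path is an assumption that should be checked against the Ambari release in use. The code only builds the request; it does not contact a server.

```python
import base64
import urllib.request

# Assumptions for illustration only: host, port, cluster name and credentials
# are hypothetical, and the /api/v1 base path should be verified against the
# Ambari release in use.
AMBARI_BASE = "http://ambari.example.com:8080/api/v1"


def services_url(cluster: str, base: str = AMBARI_BASE) -> str:
    """Return the URL assumed to list the services of one cluster."""
    return f"{base}/clusters/{cluster}/services"


def build_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request with HTTP Basic credentials."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


# Build the request object; nothing is sent over the network here.
request = build_request(services_url("demo"), "admin", "admin")
print(request.full_url)
```

A monitoring tool would pass such a request to `urllib.request.urlopen` and parse the JSON response. Keeping URL construction separate from the network call makes the integration easy to test without a running cluster.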




0 to Big Data in 15 Minutes




Hands on tutorials integrated into Sandbox
HDP environment for evaluation


Hortonworks
• History of Apache Hadoop & Hortonworks’ Role

• Key Areas of Focus in 2012
  – Addressing “Enterprise Hadoop” Requirements
  – Enabling Interoperability of the Ecosystem


• The Road Ahead for Enterprise Hadoop




Traditional Data Architecture
APPLICATIONS
• Business Analytics
• Custom Applications
• Enterprise Applications

DATA SYSTEMS
• TRADITIONAL REPOS: RDBMS, EDW, MPP

DATA SOURCES
• Traditional Sources (RDBMS, OLTP, OLAP)
• OLTP, POS SYSTEMS

DEV & DATA TOOLS
• BUILD & TEST

OPERATIONAL TOOLS
• MANAGE & MONITOR




Next-Generation Data Architecture
APPLICATIONS
• Business Analytics
• Custom Applications
• Enterprise Applications

DATA SYSTEMS
• TRADITIONAL REPOS: RDBMS, EDW, MPP
• HORTONWORKS DATA PLATFORM

DATA SOURCES
• Traditional Sources (RDBMS, OLTP, OLAP)
• OLTP, POS SYSTEMS
• New Sources (web logs, email, sensor data, social media)
• MOBILE DATA

DEV & DATA TOOLS
• BUILD & TEST

OPERATIONAL TOOLS
• MANAGE & MONITOR




Interoperating With Your Tools
APPLICATIONS
• Microsoft Applications

DATA SYSTEMS
• TRADITIONAL REPOS
• HORTONWORKS DATA PLATFORM

DATA SOURCES
• Traditional Sources (RDBMS, OLTP, OLAP)
• OLTP, POS SYSTEMS
• New Sources (web logs, email, sensor data, social media)
• MOBILE DATA

DEV & DATA TOOLS

OPERATIONAL TOOLS
• Viewpoint




Hortonworks & Teradata
• Unified Data Architecture
   – The right technology on the right analytical problems using best of breed technologies
• Viewpoint Integration
   – Common management console for Aster, Teradata and Apache Hadoop
• TVI: Teradata Vital Infrastructure
   – Proactive reliability, availability, and manageability support service
• Aster Connector for Hadoop
   – SQL-H integration
• Teradata Connector for Hadoop
   – Sqoop integration
• Pre-tuned HDFS and MapReduce parameters for Big Data workloads

Diagram: HORTONWORKS DISTRIBUTION PLATFORM, Big Data Management




Hortonworks & Microsoft
Microsoft Brings Big Data to the Masses

HDInsight + Excel, PowerPivot (BI), PowerView (visualization)

• Simplifies Hadoop, Enterprise Ready
• Hortonworks Data Platform used for Hadoop on Windows Server and Azure
• An engineered, open source solution
   – Hadoop engineered for Windows
   – Hadoop powered Microsoft business tools
   – Ops integration with MS System Center
   – Bidirectional connectors for SQL Server
   – Support for Hyper-V, deploy Hadoop on VMs
   – Opens the .NET developer community to Hadoop
   – Deploy on Azure in 10 minutes


Hortonworks
• History of Apache Hadoop & Hortonworks’ Role

• Key Areas of Focus in 2012

• The Road Ahead for Enterprise Hadoop
  – Patterns of Use
  – Key Areas of Investment




Market Transitioning into Early Majority
Enterprise adoption accelerates via:
• Repeatable horizontal patterns of use
• Ecosystem-driven pull market
• Vertical applications (aka bowling pins)

Chart: relative % customers over time
   Innovators, technology enthusiasts
   Early adopters, visionaries
   The CHASM
   Early majority, pragmatists
   Late majority, conservatives
   Laggards, skeptics
Left side: Customers want technology & performance
Right side: Customers want solutions & convenience

Source: Geoffrey Moore - Crossing the Chasm



Patterns of Use: “Right-time Access”
                                   Business Case


                               Batch          Interactive       Online



                              Refine        Explore           Enrich


                                         HORTONWORKS
                                         DATA PLATFORM



                                            Big Data
                               Transactions, Interactions, Observations




Being Big Data Driven at Neustar
Create new business opportunities and save money with information analytics

• Provides real-time information and analysis to Internet, telecommunications, entertainment and marketing industries throughout the world.
• Started off focused on number porting for carriers
• 2500+ Employees

• Traditional business heavy in data capture and data movement.
   – Aggregate data for industries as information exchange
   – For instance, they used to store 1% of DNS data for 60 days to bill customers and identify DDoS attacks. With Hadoop they now store 100% for over a year
   – Not economically feasible to use existing DW for new data
• Eliminated politics with creation of “catch basin”
   – Year 1: Use Hadoop to capture everything they used to throw away while leaving existing systems intact
   – Year 2: Make this data available for new business opportunities, but require the business to justify
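
The retention figures on this slide imply a rough scale-up in stored data that can be worked out directly. The sketch below is illustrative only: it assumes "over a year" means at least 365 days and treats daily DNS volume as a constant unit. The function and constant names are hypothetical and do not come from the presentation.

```python
# Rough scale-up implied by the slide's retention figures.
# Assumption: "over a year" is taken as 365 days, a lower bound.
OLD_SAMPLE_FRACTION = 0.01   # 1% of DNS data kept
OLD_RETENTION_DAYS = 60
NEW_SAMPLE_FRACTION = 1.0    # 100% of DNS data kept
NEW_RETENTION_DAYS = 365     # assumed lower bound for "over a year"


def retained_volume(fraction, days, daily_volume=1.0):
    """Data held at steady state, in units of one day's full DNS volume."""
    return fraction * days * daily_volume


old = retained_volume(OLD_SAMPLE_FRACTION, OLD_RETENTION_DAYS)
new = retained_volume(NEW_SAMPLE_FRACTION, NEW_RETENTION_DAYS)
scale_up = new / old

print(f"old: {old:.1f} day-equivalents, new: {new:.1f}, ratio: {scale_up:.0f}x")
```

Under these assumptions the retained volume grows by roughly 600 times, which is consistent with the slide's point that the existing DW was not economically feasible for the new data.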




Customers Don’t Want More Data Silos

AVOID: Systems separated by workload type due to contention
   Refine | Explore | Enrich (each on its own separate Big Data)

GOAL: Platform that natively supports mixed workloads
   Batch | Interactive | Online
   Refine | Explore | Enrich
   Big Data: Transactions, Interactions, Observations




2013 “Enterprise Hadoop” Initiatives

Invest In:

HORTONWORKS DATA PLATFORM (HDP)
   OPERATIONAL SERVICES
   DATA SERVICES
   HADOOP CORE
   PLATFORM SERVICES




2013 “Enterprise Hadoop” Initiatives

Invest In:
   – Platform Services

HORTONWORKS DATA PLATFORM (HDP)
   OPERATIONAL SERVICES
   DATA SERVICES
   HADOOP CORE
   PLATFORM SERVICES
      “Continuum”: Biz Continuity




2013 “Enterprise Hadoop” Initiatives

Invest In:
   – Platform Services
   – Data Services

HORTONWORKS DATA PLATFORM (HDP)
   OPERATIONAL SERVICES
   DATA SERVICES
      Hive / “Stinger”: Interactive Query
      HBase: Online Data
      “Herd”: Data Integration
   HADOOP CORE
   PLATFORM SERVICES
      “Continuum”: Biz Continuity




2013 “Enterprise Hadoop” Initiatives

Invest In:
   – Platform Services
   – Data Services
   – Operational Services

HORTONWORKS DATA PLATFORM (HDP)
   OPERATIONAL SERVICES
      Ambari: Manage & Operate
      “Knox”: Secure Access
   DATA SERVICES
      Hive / “Stinger”: Interactive Query
      HBase: Online Data
      “Herd”: Data Integration
   HADOOP CORE
   PLATFORM SERVICES
      “Continuum”: Biz Continuity




Top BI Vendors Support Hive Today




Stinger: Enhance Hive for BI Use Cases
 Enterprise Reports                                                       Parameterized Reports



                                          Dashboard / Scorecard




    Data Mining                                                               Visualization




                                              More SQL
                                                  &
                                         Better Performance

                                 Batch                      Interactive

Our Focus Remains Unchanged
• Innovate Core Hadoop
  – Lead innovation within the Apache Hadoop community


• Enhance Hadoop for Enterprise Class Usage
  – Add platform, data, and operational services that enterprises need
  – Apply enterprise software rigor to test & release process


• Enable the Data Ecosystem
  – Leverage Hadoop to enable Partners to be successful


• All Open Source, All the Time
  – Avoid proprietary open source which locks you in


Next Steps
                             Download Hortonworks Sandbox
                             www.hortonworks.com/sandbox



                             Download Hortonworks Data Platform
                             www.hortonworks.com/download



                             Register for Enterprise Hadoop Series
                             www.hortonworks.com/webinars



                             Follow…
                             @shaunconnolly, @hortonworks


Power of Community is Key




    Amsterdam                                        San Jose, CA
 March 20 - 21, 2013                               June 26 - 27, 2013

    REGISTER NOW                                 CALL FOR PAPERS
http://hadoopsummit.org/amsterdam/register/   http://hadoopsummit.org/san-jose/call-for-papers/




Questions?
@shaunconnolly, @hortonworks





More Related Content

What's hot

Hortonworks Presentation at Big Data London
Hortonworks Presentation at Big Data LondonHortonworks Presentation at Big Data London
Hortonworks Presentation at Big Data LondonHortonworks
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramHortonworks
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez Hortonworks
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Hortonworks
 
Combine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARNCombine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARNHortonworks
 
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache HiveDiscover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache HiveHortonworks
 
Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...
Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...
Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...Hortonworks
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopHortonworks
 
Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Hortonworks
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalHortonworks
 
Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks
 
Enabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseEnabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseHortonworks
 
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopEnrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopHortonworks
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...Hortonworks
 
Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Hortonworks
 
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Hortonworks
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceHortonworks
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopHortonworks
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Hortonworks
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHortonworks
 

What's hot (20)

Hortonworks Presentation at Big Data London
Hortonworks Presentation at Big Data LondonHortonworks Presentation at Big Data London
Hortonworks Presentation at Big Data London
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready Program
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
 
Combine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARNCombine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARN
 
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache HiveDiscover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
 
Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...
Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...
Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
 
Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.final
 
Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014
 
Enabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseEnabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical Enterprise
 
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopEnrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
 
Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]
 
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar Slides
 

Viewers also liked

Ambari Meetup: Ambari Futures
Ambari Meetup: Ambari FuturesAmbari Meetup: Ambari Futures
Ambari Meetup: Ambari FuturesHortonworks
 
VCCP Kin Production Director Chris Chaundler - Sustaining brand conversation
VCCP Kin Production Director Chris Chaundler - Sustaining brand conversationVCCP Kin Production Director Chris Chaundler - Sustaining brand conversation
VCCP Kin Production Director Chris Chaundler - Sustaining brand conversationThe_IPA
 
Demystify Big Data Breakfast Briefing - Juergen Urbanski, T-Systems
Demystify Big Data Breakfast Briefing - Juergen Urbanski, T-SystemsDemystify Big Data Breakfast Briefing - Juergen Urbanski, T-Systems
Demystify Big Data Breakfast Briefing - Juergen Urbanski, T-SystemsHortonworks
 
Apache Ambari - What's New in 1.4.2
Apache Ambari - What's New in 1.4.2Apache Ambari - What's New in 1.4.2
Apache Ambari - What's New in 1.4.2Hortonworks
 
Bi isn't big data and big data isn't BI (updated)
Bi isn't big data and big data isn't BI (updated)Bi isn't big data and big data isn't BI (updated)
Bi isn't big data and big data isn't BI (updated)mark madsen
 
Go Zero to Big Data in 15 Minutes with the Hortonworks Sandbox
Go Zero to Big Data in 15 Minutes with the Hortonworks SandboxGo Zero to Big Data in 15 Minutes with the Hortonworks Sandbox
Go Zero to Big Data in 15 Minutes with the Hortonworks SandboxHortonworks
 

Viewers also liked (6)

Ambari Meetup: Ambari Futures
Ambari Meetup: Ambari FuturesAmbari Meetup: Ambari Futures
Ambari Meetup: Ambari Futures
 
VCCP Kin Production Director Chris Chaundler - Sustaining brand conversation
VCCP Kin Production Director Chris Chaundler - Sustaining brand conversationVCCP Kin Production Director Chris Chaundler - Sustaining brand conversation
VCCP Kin Production Director Chris Chaundler - Sustaining brand conversation
 
Demystify Big Data Breakfast Briefing - Juergen Urbanski, T-Systems
Demystify Big Data Breakfast Briefing - Juergen Urbanski, T-SystemsDemystify Big Data Breakfast Briefing - Juergen Urbanski, T-Systems
Demystify Big Data Breakfast Briefing - Juergen Urbanski, T-Systems
 
Apache Ambari - What's New in 1.4.2
Apache Ambari - What's New in 1.4.2Apache Ambari - What's New in 1.4.2
Apache Ambari - What's New in 1.4.2
 
Bi isn't big data and big data isn't BI (updated)
Bi isn't big data and big data isn't BI (updated)Bi isn't big data and big data isn't BI (updated)
Bi isn't big data and big data isn't BI (updated)
 
Go Zero to Big Data in 15 Minutes with the Hortonworks Sandbox
Go Zero to Big Data in 15 Minutes with the Hortonworks SandboxGo Zero to Big Data in 15 Minutes with the Hortonworks Sandbox
Go Zero to Big Data in 15 Minutes with the Hortonworks Sandbox
 

Similar to State of the Union with Shaun Connolly

Big Data Analytics - Is Your Elephant Enterprise Ready?
Big Data Analytics - Is Your Elephant Enterprise Ready?Big Data Analytics - Is Your Elephant Enterprise Ready?
Big Data Analytics - Is Your Elephant Enterprise Ready?Hortonworks
 
Mrinal devadas, Hortonworks Making Sense Of Big Data
Mrinal devadas, Hortonworks Making Sense Of Big DataMrinal devadas, Hortonworks Making Sense Of Big Data
Mrinal devadas, Hortonworks Making Sense Of Big DataPatrickCrompton
 
Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14Hortonworks
 
OSDC 2013 | Introduction into Hadoop by Olivier Renault
OSDC 2013 | Introduction into Hadoop by Olivier RenaultOSDC 2013 | Introduction into Hadoop by Olivier Renault
OSDC 2013 | Introduction into Hadoop by Olivier RenaultNETWAYS
 
Apache Hadoop Now Next and Beyond
Apache Hadoop Now Next and BeyondApache Hadoop Now Next and Beyond
Apache Hadoop Now Next and BeyondDataWorks Summit
 
Why hadoop for data science?
Why hadoop for data science?Why hadoop for data science?
Why hadoop for data science?Hortonworks
 
Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters
Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters
Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters Hortonworks
 
Discover.hdp2.2.ambari.final[1]
Discover.hdp2.2.ambari.final[1]Discover.hdp2.2.ambari.final[1]
Discover.hdp2.2.ambari.final[1]Hortonworks
 
Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015Mac Moore
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks
 
Hortonworks for Financial Analysts Presentation
Hortonworks for Financial Analysts PresentationHortonworks for Financial Analysts Presentation
Hortonworks for Financial Analysts PresentationHortonworks
 
Introduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemIntroduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemShivaji Dutta
 
Introduction to Hortonworks Data Platform
Introduction to Hortonworks Data PlatformIntroduction to Hortonworks Data Platform
Introduction to Hortonworks Data PlatformHortonworks
 
UK - Agile Data Applications on Hadoop
UK - Agile Data Applications on HadoopUK - Agile Data Applications on Hadoop
UK - Agile Data Applications on HadoopHortonworks
 
Introduction to Microsoft HDInsight and BI Tools
Introduction to Microsoft HDInsight and BI ToolsIntroduction to Microsoft HDInsight and BI Tools
Introduction to Microsoft HDInsight and BI ToolsDataWorks Summit
 

Similar to State of the Union with Shaun Connolly (20)

Big Data Analytics - Is Your Elephant Enterprise Ready?
Big Data Analytics - Is Your Elephant Enterprise Ready?Big Data Analytics - Is Your Elephant Enterprise Ready?
Big Data Analytics - Is Your Elephant Enterprise Ready?
 
Mrinal devadas, Hortonworks Making Sense Of Big Data
Mrinal devadas, Hortonworks Making Sense Of Big DataMrinal devadas, Hortonworks Making Sense Of Big Data
Mrinal devadas, Hortonworks Making Sense Of Big Data
 
Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14
 
OSDC 2013 | Introduction into Hadoop by Olivier Renault
OSDC 2013 | Introduction into Hadoop by Olivier RenaultOSDC 2013 | Introduction into Hadoop by Olivier Renault
OSDC 2013 | Introduction into Hadoop by Olivier Renault
 
Apache Hadoop Now Next and Beyond
Apache Hadoop Now Next and BeyondApache Hadoop Now Next and Beyond
Apache Hadoop Now Next and Beyond
 
Why hadoop for data science?
Why hadoop for data science?Why hadoop for data science?
Why hadoop for data science?
 
Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters
Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters
Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters
 
Inside hadoop-dev
Inside hadoop-devInside hadoop-dev
Inside hadoop-dev
 
Discover.hdp2.2.ambari.final[1]
Discover.hdp2.2.ambari.final[1]Discover.hdp2.2.ambari.final[1]
Discover.hdp2.2.ambari.final[1]
 
Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?
 
Hadoop Trends
Hadoop TrendsHadoop Trends
Hadoop Trends
 
Hortonworks for Financial Analysts Presentation
Hortonworks for Financial Analysts PresentationHortonworks for Financial Analysts Presentation
Hortonworks for Financial Analysts Presentation
 
Introduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemIntroduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystem
 
Hadoop In Action
Hadoop In ActionHadoop In Action
Hadoop In Action
 
Munich HUG 21.11.2013
Munich HUG 21.11.2013Munich HUG 21.11.2013
Munich HUG 21.11.2013
 
Introduction to Hortonworks Data Platform
Introduction to Hortonworks Data PlatformIntroduction to Hortonworks Data Platform
Introduction to Hortonworks Data Platform
 
UK - Agile Data Applications on Hadoop
UK - Agile Data Applications on HadoopUK - Agile Data Applications on Hadoop
UK - Agile Data Applications on Hadoop
 
201305 hadoop jpl-v3
201305 hadoop jpl-v3201305 hadoop jpl-v3
201305 hadoop jpl-v3
 
Introduction to Microsoft HDInsight and BI Tools
Introduction to Microsoft HDInsight and BI ToolsIntroduction to Microsoft HDInsight and BI Tools
Introduction to Microsoft HDInsight and BI Tools
 

State of the Union with Shaun Connolly

  • 1. Hortonworks “State of the Union” Shaun Connolly, VP Strategy @shaunconnolly, @hortonworks January 22, 2013 © Hortonworks Inc. 2013 Page 1
  • 2. Quick Housekeeping Rules • Q&A panel is available if you have any questions during the webinar • There will be time for Q&A at the end • We will record the webinar for future viewing • All attendees will receive a copy of the slides and recording
  • 3. Hortonworks • History of Apache Hadoop & Hortonworks’ Role – Genesis of Apache Hadoop – Role of Apache Software Foundation – Hortonworks Process for “Enterprise Hadoop” • Key Areas of Focus in 2012 • The Road Ahead for Enterprise Hadoop
  • 4. A Brief History of Apache Hadoop • Timeline (2004, 2006, 2008, 2010, 2012, 2013): Apache Project Established; Yahoo! Operate at Scale; Hortonworks Data Platform; 2013 Enterprise Hadoop • 2005: Yahoo! creates team under E14 to work on Hadoop (Focus on INNOVATION)
  • 5. A Brief History of Apache Hadoop • Timeline (2004, 2006, 2008, 2010, 2012, 2013): Apache Project Established; Yahoo! Operate at Scale; Hortonworks Data Platform; 2013 Enterprise Hadoop • 2005: Yahoo! creates team under E14 to work on Hadoop (Focus on INNOVATION) • 2007: Yahoo team extends focus to operations to support multiple projects & growing clusters (Focus on OPERATIONS)
  • 6. A Brief History of Apache Hadoop • Timeline (2004, 2006, 2008, 2010, 2012, 2013): Apache Project Established; Yahoo! Operate at Scale; Hortonworks Data Platform; 2013 Enterprise Hadoop • 2005: Yahoo! creates team under E14 to work on Hadoop (Focus on INNOVATION) • 2007: Yahoo team extends focus to operations to support multiple projects & growing clusters (Focus on OPERATIONS) • 2011: Hortonworks created to focus on “Enterprise Hadoop”. Starts with 24 key Hadoop engineers from Yahoo (STABILITY)
  • 7. Hortonworks Snapshot • We develop, distribute and support the ONLY 100% open source Enterprise Hadoop distribution • Headquarters: Palo Alto, CA • Employees: 180+ and growing • Investors: Benchmark, Index, Yahoo • Develop – We employ the core architects, builders and operators of Apache Hadoop – We drive innovation within Apache Software Foundation projects • Distribute – We distribute the only 100% Open Source Enterprise Hadoop Distribution: Hortonworks Data Platform – We engineer, test & certify HDP for enterprise usage • Support – We are uniquely positioned to deliver the highest quality of Hadoop support – We enable the ecosystem to work better with Hadoop • Endorsed by Strategic Partners
  • 8. Apache Community Leadership • Apache Software Foundation Guiding Principles: release early & often; transparency, respect, meritocracy • Apache Hadoop project cycle: Design & Develop, Test & Patch, Release • “We have noticed more activity over the last year from Hortonworks’ engineers on building out Apache Hadoop’s more innovative features. These include YARN, Ambari and HCatalog..” - Jeff Kelly: Wikibon
  • 9. Apache Community Leadership • Apache Software Foundation Guiding Principles: release early & often; transparency, respect, meritocracy • Project cycle: Design & Develop, Test & Patch, Release • Projects: Apache Hadoop, Apache Pig, Apache Hive, Apache HCatalog, Apache HBase, Apache Ambari, Other Apache Projects • “We have noticed more activity over the last year from Hortonworks’ engineers on building out Apache Hadoop’s more innovative features. These include YARN, Ambari and HCatalog..” - Jeff Kelly: Wikibon
  • 10. Apache Community Leadership • Apache Software Foundation Guiding Principles: release early & often; transparency, respect, meritocracy • Project cycle: Design & Develop, Test & Patch, Release • Projects: Apache Hadoop, Apache Pig, Apache Hive, Apache HCatalog, Apache HBase, Apache Ambari, Other Apache Projects • Key Roles held by Hortonworkers • PMC Members – Managing community projects – Mentoring new incubator projects – About 20 Hortonworkers managing community • Committers – Authoring, reviewing & editing code – About 50 Hortonworkers across projects • Release Managers – Testing & releasing projects – Hortonworkers across key projects like Hadoop, Hive, Pig, HCatalog, Ambari, HBase • “We have noticed more activity over the last year from Hortonworks’ engineers on building out Apache Hadoop’s more innovative features. These include YARN, Ambari and HCatalog..” - Jeff Kelly: Wikibon
  • 11. Hortonworks Process for Enterprise Hadoop • Upstream Community Projects (Design & Develop, Test & Patch, Release): Apache Hadoop, Apache Pig, Apache Hive, Apache HCatalog, Apache HBase, Apache Ambari, Other Apache Projects • Downstream Enterprise Product: Hortonworks Data Platform
  • 12. Hortonworks Process for Enterprise Hadoop • Upstream Community Projects (Design & Develop, Test & Patch, Release): Apache Hadoop, Apache Pig, Apache Hive, Apache HCatalog, Apache HBase, Apache Ambari, Other Apache Projects • Downstream Enterprise Product (Design & Develop, Integrate & Test, Package & Certify, Distribute): Hortonworks Data Platform
  • 13. Hortonworks Process for Enterprise Hadoop • Upstream Community Projects (Design & Develop, Test & Patch, Release): Apache Hadoop, Apache Pig, Apache Hive, Apache HCatalog, Apache HBase, Apache Ambari, Other Apache Projects • Downstream Enterprise Product (Design & Develop, Integrate & Test, Package & Certify, Distribute): Hortonworks Data Platform • Virtuous Cycle: development & fixed issues done upstream & stable project releases flow downstream • No Lock-in: Integrated, tested & certified distribution lowers risk by ensuring close alignment with Apache projects
  • 14. HDP Certifies Latest Stable Components
      Apache Project   HDP 1.2   CDH 3u5           CDH 4.1.2
      Hadoop           1.1.2     0.20.2+923.418    2.0.0alpha+541
      Pig              0.10.1    0.8.1+51.39       0.10.0+48
      Hive             0.10.0    0.7.1+42.56       0.9.0+148
      HCatalog         0.5.0     n/a               n/a
      HBase            0.94.2    0.90.6+84.73      0.92.1+154
      Sqoop            1.4.2     1.3.0+5.88        1.4.1+51
      Oozie            3.2.0     3.2.0             3.2.0
      Zookeeper        3.4.5     3.3.5+19.5        3.4.3+25
      Ambari           1.2.0     n/a               n/a
      Flume            1.3.0     0.9.4+25.46       1.2.0+119
      Mahout           0.7.0     0.5+9.7           0.7+4
      Source: http://files.cloudera.com/pdf/datasheet/cdh4.1_spec_sheet.pdf
  • 15. True Enterprise Class Open Source • 100% Open Source. No Holdbacks. – Only true implementation of OSS Apache Hadoop – Preferred by the software vendors that you rely on • Flexible Deployment – No License Fee for usage • Community Open Source Mitigates Lock-In – Proprietary Open Source = Lock-In – Open communities always trump “open source”
  • 16. Hortonworks • History of Apache Hadoop & Hortonworks’ Role • Key Areas of Focus in 2012 – Addressing “Enterprise Hadoop” Requirements – Enabling Interoperability of the Ecosystem • The Road Ahead for Enterprise Hadoop
  • 17. HDP: Enterprise Hadoop Distribution • Hortonworks Data Platform (HDP): Enterprise Hadoop • The ONLY 100% open source and complete distribution • Enterprise grade, proven and tested at scale • Ecosystem endorsed to ensure interoperability • HADOOP CORE (Distributed Storage & Processing): HDFS, WEBHDFS, MAP REDUCE, YARN (in 2.0) • PLATFORM SERVICES (Enterprise Readiness): HA, DR, Snapshots, Security, …
  • 18. HDP: Enterprise Hadoop Distribution • Hortonworks Data Platform (HDP): Enterprise Hadoop • The ONLY 100% open source and complete distribution • Enterprise grade, proven and tested at scale • Ecosystem endorsed to ensure interoperability • DATA SERVICES (Store, Process and Access Data): FLUME, SQOOP, PIG, HIVE, HBASE, HCATALOG • HADOOP CORE (Distributed Storage & Processing): HDFS, WEBHDFS, MAP REDUCE, YARN (in 2.0) • PLATFORM SERVICES (Enterprise Readiness): HA, DR, Snapshots, Security, …
  • 19. HDP: Enterprise Hadoop Distribution • Hortonworks Data Platform (HDP): Enterprise Hadoop • The ONLY 100% open source and complete distribution • Enterprise grade, proven and tested at scale • Ecosystem endorsed to ensure interoperability • OPERATIONAL SERVICES (Manage & Operate at Scale): AMBARI, OOZIE • DATA SERVICES (Store, Process and Access Data): FLUME, SQOOP, PIG, HIVE, HBASE, HCATALOG • HADOOP CORE (Distributed Storage & Processing): HDFS, WEBHDFS, MAP REDUCE, YARN (in 2.0) • PLATFORM SERVICES (Enterprise Readiness): HA, DR, Snapshots, Security, …
  • 20. HDP: Enterprise Hadoop Distribution • Hortonworks Data Platform (HDP): Enterprise Hadoop • The ONLY 100% open source and complete distribution • Enterprise grade, proven and tested at scale • Ecosystem endorsed to ensure interoperability • OPERATIONAL SERVICES (Manage & Operate at Scale): AMBARI, OOZIE • DATA SERVICES (Store, Process and Access Data): FLUME, SQOOP, PIG, HIVE, HBASE, HCATALOG • HADOOP CORE (Distributed Storage & Processing): HDFS, WEBHDFS, MAP REDUCE, YARN (in 2.0) • PLATFORM SERVICES (Enterprise Readiness): HA, DR, Snapshots, Security, … • HORTONWORKS DATA PLATFORM (HDP)
  • 21. HDP: Enterprise Hadoop Distribution • Hortonworks Data Platform (HDP): Enterprise Hadoop • The ONLY 100% open source and complete distribution • Enterprise grade, proven and tested at scale • Ecosystem endorsed to ensure interoperability • OPERATIONAL SERVICES (Manage & Operate at Scale): AMBARI, OOZIE • DATA SERVICES (Store, Process and Access Data): FLUME, SQOOP, PIG, HIVE, HBASE, HCATALOG • HADOOP CORE (Distributed Storage & Processing): HDFS, WEBHDFS, MAP REDUCE, YARN (in 2.0) • PLATFORM SERVICES (Enterprise Readiness): HA, DR, Snapshots, Security, … • HORTONWORKS DATA PLATFORM (HDP) • OS, Cloud, VM, Appliance
  • 22. Latest Hortonworks Announcements: Two releases in January 2013 • January 15: Hortonworks Data Platform 1.2. Hortonworks Brings Enterprise Manageability to 100% Open Source Apache Hadoop Distribution • January 22: Hortonworks Sandbox. Hortonworks accelerates Hadoop skills development with an easy-to-use, flexible and extensible platform to learn, evaluate and use Apache Hadoop
  • 23. HDP 1.2 Summary • Hortonworks Data Platform 1.2: HDP outpaces the competition to extend leadership through 100% open source Enterprise Apache Hadoop • Focus areas: • Ambari: continued innovation with a complete, free and open cluster management tool – Provision, Manage and Monitor your Hadoop infrastructure – Job diagnostics, usage heat maps – Ecosystem integration • Enhanced security model for Hive and HCatalog • Performance and operational enhancements for HBase • Extended Full Stack HA to Hive & HCatalog Metastore
  • 24. HDP 1.2: Ambari Key Features • Job Diagnostics: Visualize and troubleshoot Hadoop job execution and performance • Cluster History: View historical job execution & performance • Instant Insight: View health of Core Hadoop (HDFS, MapReduce) and related projects • Cluster Navigation: “Quick link” buttons jump into namenode web UI for a server • REST interface provides external access to Ambari for existing tools. Facilitates integration with Microsoft System Center and Teradata Viewpoint • Figure: Apache Ambari Dashboard
  • 25. 0 to Big Data in 15 Minutes • Hands-on tutorials integrated into HDP environment for Sandbox evaluation
  • 26. Hortonworks • History of Apache Hadoop & Hortonworks’ Role • Key Areas of Focus in 2012 – Addressing “Enterprise Hadoop” Requirements – Enabling Interoperability of the Ecosystem • The Road Ahead for Enterprise Hadoop
  • 27. Traditional Data Architecture • APPLICATIONS: Business Analytics, Custom Applications, Enterprise Applications • DEV & DATA TOOLS: Build & Test • OPERATIONAL TOOLS: Manage & Monitor • DATA SYSTEMS: Traditional Repos (RDBMS, EDW, MPP) • DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); OLTP, POS Systems
  • 28. Next-Generation Data Architecture • APPLICATIONS: Business Analytics, Custom Applications, Enterprise Applications • DEV & DATA TOOLS: Build & Test • OPERATIONAL TOOLS: Manage & Monitor • DATA SYSTEMS: Hortonworks Data Platform; Traditional Repos (RDBMS, EDW, MPP) • DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensor data, social media); OLTP, POS Systems; Mobile Data
  • 29. Interoperating With Your Tools • APPLICATIONS: Microsoft Applications • DEV & DATA TOOLS • OPERATIONAL TOOLS: Viewpoint • DATA SYSTEMS: Hortonworks Data Platform; Traditional Repos • DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensor data, social media); OLTP, POS Systems; Mobile Data
  • 30. Hortonworks & Teradata • Unified Data Architecture – The right technology on the right analytical problems using best of breed technologies • Viewpoint Integration – Common management console for Aster, Teradata and Apache Hadoop • TVI: Teradata Vital Infrastructure – Proactive reliability, availability, and manageability support service • Aster Connector for Hadoop – SQL-H integration • Teradata Connector for Hadoop – Sqoop integration • Pre-tuned HDFS and MapReduce parameters for Big Data workloads • Diagram labels: HORTONWORKS DISTRIBUTION PLATFORM; Big Data Management
  • 31. Hortonworks & Microsoft • Microsoft Brings Big Data to the Masses • HDInsight: Simplifies Hadoop, Enterprise Ready • Hortonworks Data Platform used for Hadoop on Windows Server and Azure • An engineered, open source solution – Hadoop engineered for Windows – Hadoop powered Microsoft business tools – Ops integration with MS System Center – Bidirectional connectors for SQL Server – Support for Hyper-V, deploy Hadoop on VMs – Opens the .NET developer community to Hadoop – Deploy on Azure in 10 minutes • HDInsight + Excel, PowerPivot (BI), PowerView (visualization)
  • 32. Hortonworks • History of Apache Hadoop & Hortonworks’ Role • Key Areas of Focus in 2012 • The Road Ahead for Enterprise Hadoop – Patterns of Use – Key Areas of Investment
  • 33. Market Transitioning into Early Majority Enterprise adoption accelerates via: • Repeatable horizontal patterns of use • Ecosystem-driven pull market • Vertical applications (aka bowling pins) [Chart: technology adoption curve, relative % customers over time, with The CHASM between early adopters and early majority. Segments: Innovators, technology enthusiasts; Early adopters, visionaries; Early majority, pragmatists; Late majority, conservatives; Laggards, skeptics. Before the chasm, customers want technology & performance; after it, customers want solutions & convenience.] Source: Geoffrey Moore - Crossing the Chasm
  • 34. Patterns of Use: “Right-time Access” [Diagram: Business Case (Refine, Explore, Enrich) mapped to access mode (Batch, Interactive, Online) on the HORTONWORKS DATA PLATFORM; Big Data: Transactions, Interactions, Observations]
  • 35. Being Big Data Driven at Neustar Create new business opportunities and save money with information analytics • Provides real-time information and analysis to Internet, telecommunications, entertainment and marketing industries throughout the world • Started off focused on number porting for carriers • 2500+ Employees • Traditional business heavy in data capture and data movement – Aggregate data for industries as information exchange – For instance, they used to store 1% of DNS data for 60 days to bill customers and identify DDoS attacks – With Hadoop they now store 100% for over a year – Not economically feasible to use existing DW for new data • Eliminated politics with creation of “catch basin” – Year 1: Use Hadoop to capture everything they used to throw away while leaving existing systems intact – Year 2: Make this data available for new business opportunities, but require the business to justify
  • 36. Customers Don’t Want More Data Silos AVOID: Systems separated by workload type due to contention (separate Big Data systems for Batch / Refine, Interactive / Explore, Online / Enrich) GOAL: Platform that natively supports mixed workloads (Refine, Explore, Enrich on one Big Data platform) Transactions, Interactions, Observations
  • 37. 2013 “Enterprise Hadoop” Initiatives Invest In: [Diagram: HORTONWORKS DATA PLATFORM (HDP): OPERATIONAL SERVICES, DATA SERVICES, HADOOP CORE, PLATFORM SERVICES]
  • 38. 2013 “Enterprise Hadoop” Initiatives Invest In: – Platform Services (“Continuum”: Biz Continuity) [Diagram: HORTONWORKS DATA PLATFORM (HDP): OPERATIONAL SERVICES, DATA SERVICES, HADOOP CORE, PLATFORM SERVICES]
  • 39. 2013 “Enterprise Hadoop” Initiatives Invest In: – Platform Services (“Continuum”: Biz Continuity) – Data Services (Hive / “Stinger”: Interactive Query; HBase: Online Data; “Herd”: Data Integration) [Diagram: HORTONWORKS DATA PLATFORM (HDP): OPERATIONAL SERVICES, DATA SERVICES, HADOOP CORE, PLATFORM SERVICES]
  • 40. 2013 “Enterprise Hadoop” Initiatives Invest In: – Platform Services (“Continuum”: Biz Continuity) – Data Services (Hive / “Stinger”: Interactive Query; HBase: Online Data; “Herd”: Data Integration) – Operational Services (Ambari: Manage & Operate; “Knox”: Secure Access) [Diagram: HORTONWORKS DATA PLATFORM (HDP): OPERATIONAL SERVICES, DATA SERVICES, HADOOP CORE, PLATFORM SERVICES]
  • 41. Top BI Vendors Support Hive Today
  • 42. Stinger: Enhance Hive for BI Use Cases • Enterprise Reports • Parameterized Reports • Dashboard / Scorecard • Data Mining • Visualization More SQL & Better Performance [Axis: Batch to Interactive]
  • 43. Our Focus Remains Unchanged • Innovate Core Hadoop – Lead innovation within the Apache Hadoop community • Enhance Hadoop for Enterprise Class Usage – Add platform, data, and operational services that enterprises need – Apply enterprise software rigor to test & release process • Enable the Data Ecosystem – Leverage Hadoop to enable Partners to be successful • All Open Source, All the Time – Avoid proprietary open source which locks you in
  • 44. Next Steps Download Hortonworks Sandbox www.hortonworks.com/sandbox Download Hortonworks Data Platform www.hortonworks.com/download Register for Enterprise Hadoop Series www.hortonworks.com/webinars Follow… @shaunconnolly, @hortonworks
  • 45. Power of Community is Key • Amsterdam, March 20 - 21, 2013: REGISTER NOW http://hadoopsummit.org/amsterdam/register/ • San Jose, CA, June 26 - 27, 2013: CALL FOR PAPERS http://hadoopsummit.org/san-jose/call-for-papers/
  • 47. Questions? @shaunconnolly, @hortonworks
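
As a rough illustration of the Neustar figures on slide 35 (1% of DNS data kept for 60 days before Hadoop, 100% kept for over a year after), the sketch below estimates how much more data is held at steady state. It assumes a constant daily volume and treats "over a year" as exactly 365 days, so the result is a lower bound; the function name `retained_volume` is hypothetical and not taken from the slides.

```python
def retained_volume(fraction_kept: float, retention_days: int,
                    daily_volume: float = 1.0) -> float:
    """Steady-state volume held = daily volume x fraction kept x days retained.

    daily_volume is an arbitrary unit (assumed constant over time).
    """
    return daily_volume * fraction_kept * retention_days


# Before Hadoop: 1% of DNS data kept for 60 days (from the slide).
before = retained_volume(fraction_kept=0.01, retention_days=60)

# With Hadoop: 100% kept for "over a year"; 365 days is an assumed lower bound.
after = retained_volume(fraction_kept=1.00, retention_days=365)

growth = after / before
print(f"Retained data grows by at least {growth:.0f}x")
```

Under these assumptions the retained volume grows by roughly 600 times, which is consistent with the slide's point that the existing data warehouse was not an economical home for the new data.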

Editor's Notes

  1. “State of the Union” Webinar Features Hortonworks Executive Delivering 2012 Year-in-Review, Mapping Out Strategic Direction for 2013 and Highlighting Key Product Offerings. PALO ALTO, Calif. – January 16, 2013 – Hortonworks, a leading commercial vendor promoting the innovation, development and support of Apache Hadoop, today announced its “State of the Union and Vision for Apache Hadoop in 2013” webinar, taking place on Tuesday, January 22, 2013 at 1:00 p.m. ET. During the webinar, Vice President of Corporate Strategy Shaun Connolly will provide an overview of company highlights from 2012 as well as a strategic roadmap for Apache Hadoop in 2013. What: “Hortonworks State of the Union and Vision for Apache Hadoop in 2013” webinar. Who: Shaun Connolly, vice president of corporate strategy, Hortonworks. When: Tuesday, January 22, 2013 at 1:00 p.m. ET/10:00 a.m. PT.
  2. I can’t really talk about Hortonworks without first taking a moment to talk about the history of Hadoop. What we now know as Hadoop really started back in 2005, when Eric Baldeschwieler – known as “E14” – started work at Yahoo on a project to build a large scale data storage and processing technology that would allow the company to store and process massive amounts of data to underpin its most critical application, Search. The initial focus was on building out the technology – the key components being HDFS and MapReduce – that would become the Core of what we think of as Hadoop today, and continuing to innovate it to meet the needs of this specific application. By 2008, Hadoop usage had greatly expanded inside of Yahoo, to the point that many applications were now using this data management platform, and as a result the team’s focus extended to include Operations: now that applications were beginning to propagate around the organization, sophisticated capabilities for operating Hadoop at scale were necessary. It was also at this time that usage began to expand well beyond Yahoo, with many notable organizations (including Facebook and others) adopting Hadoop as the basis of their large scale data processing and storage applications and necessitating a focus on operations to support what was by now a large variety of critical business applications. In 2011, recognizing that more mainstream adoption of Hadoop was beginning to take off and with an objective of facilitating it, the core team left – with the blessing of Yahoo – to form Hortonworks. The goal of the group was to facilitate broader adoption by addressing the Enterprise capabilities that would enable a larger number of organizations to adopt and expand their usage of Hadoop. [Note: if useful as a talk track, Cloudera was formed in 2008, well BEFORE the operational expertise of running Hadoop at scale was established inside of Yahoo.]
  5. At Hortonworks today, our focus is very clear: we Develop, Distribute and Support a 100% open source distribution of Enterprise Apache Hadoop. We employ the core architects, builders and operators of Apache Hadoop and drive the innovation in the open source community. We distribute the only 100% open source Enterprise Hadoop distribution: the Hortonworks Data Platform. Given our operational expertise of running some of the largest Hadoop infrastructure in the world at Yahoo, our team is uniquely positioned to support you. Our approach is also uniquely endorsed by some of the biggest vendors in the IT market. Yahoo is both an investor and a customer, and most importantly, a development partner. We partner to develop Hadoop, and no distribution of HDP is released without first being tested on Yahoo’s infrastructure, using the same regression suite that they have used for years as they grew to have the largest production cluster in the world. Microsoft has partnered with Hortonworks to include HDP in both its off-premise offering on Azure and its on-premise offering, under the product name HDInsight. This also includes integration with Visual Studio for application development and with System Center for operational management of the infrastructure. Teradata includes HDP in its products in order to provide the broadest possible range of options for its customers.
  6. Eric and team created the Hadoop project as open source, and that is and always will be central to our approach. We believe strongly that the technology needs to be community driven and open source. In terms of open source mechanics, Apache Hadoop is governed by the Apache Software Foundation, which provides structure to what inside a commercial software company would be a tightly governed process around development, test and release. When we think of Core Hadoop, the ASF has helped to manage this process for several years now. However, as Hadoop has become more widely used, it has spawned a set of ancillary open source projects that introduce capabilities required for more mainstream use. These projects are generally classified as either “Data Services” – those that enable the storage, processing, and accessing of data – or “Operational Services” – those that enable the management and operations of the infrastructure. The projects within these categories are run as independent projects with their own teams, and include some of the technologies you likely know of: Data Services include projects such as Hive, Pig, HBase and HCatalog, while Operational Services include Apache Ambari and more. Hortonworkers have always played a critical role in the development, test and release process for Core Apache Hadoop, but also play leading roles in these ancillary projects that are required for enterprise usage. This includes every role from committer to release manager and, in many cases, project lead. For example, Arun Murthy is the project lead for Core Hadoop. Current Hortonworks PMC members by project: Hadoop: Arun Murthy, Devaraj Das, Enis Soztutar, Giridharan Kesavan, Jitendra Nath Pandey, Mahadev Konar, Matt Foley, Owen O'Malley, Sanjay Radia, Suresh Srinivas, Nicholas Sze, Vinod Kumar Vavilapalli; Pig: Daniel Dai, Alan Gates, Giridharan Kesavan, Ashutosh Chauhan, Thejas Nair; Hive: Ashutosh Chauhan; HBase: None; Oozie: Devaraj Das, Alan Gates; Sqoop: None; Flume: None; Bigtop: Alan Gates, Steve Loughran, Owen O'Malley; Incubator (not a Hadoop project but shows who's helping grow new projects in Apache): Arun Murthy, Devaraj Das, Alan Gates, Mahadev Konar, Steve Loughran, Owen O'Malley, Enis Soztutar.
  9. So how does this get brought together into our distribution? It is really pretty straightforward, but also unique. We start with the group of open source projects that I described and that we are continually driving in the OSS community. [CLICK] We then package the appropriate versions of those open source projects, integrate and test them using a full suite, including all the IP for regression testing contributed by Yahoo, and [CLICK] contribute all of the bug fixes back to the open source tree. From there, we package and certify a distribution in the form of the Hortonworks Data Platform (HDP), which includes both Hadoop Core and the related projects required by the Enterprise user, and provide it to our customers. By applying an Enterprise Software development process to the open source projects, we arrive at a 100% open source distribution that has been packaged, tested and certified by Hortonworks. It is also 100% in sync with the open source trees.
  12. HDP tracks closely to Apache project releases. CDH forks early and patches its distributions off to the side of the Apache community projects, resulting in unnecessary drift and risk of lock-in. The “+923.423” and the “+541” parts of the version numbers represent how many patches these components have drifted away from the corresponding Apache projects. While some drift can be expected, patches and changes on the order of hundreds result in lock-in and eliminate the virtuous cycle that the upstream community should help drive.
  13. We are believers in open source: we believe it is the most efficient way to develop enterprise software. But more importantly, we believe that 100% open source is the best approach for our customers. In the data management market in particular, our customers are acutely aware of the implications of growing their database usage with a proprietary vendor who can then exert pricing pressure (Oracle). Particularly when it comes to data storage, which we can all anticipate will continue to grow exponentially, you don’t want to be penalized for scale. By choosing an open source approach, organizations can build their operational processes on open technologies without concern that they will be locked in to a particular vendor. And they can be confident that as their usage grows, they can choose from flexible pricing alternatives – by node or by storage – that align best with their needs. It is ultimately about mitigating risk, and in this regard open source has been proven as the safest approach. I would also caution you to look beyond the open source label used by some vendors: are they harvesting open source work, forking the code and then working independently (“fork early / patch often”)? Or, like Hortonworks, have they embraced and committed to the community open source approach, which allows them to stay in sync with the innovation of the community? In the Hadoop community, Hortonworks is unquestioned in taking the community-driven approach.
  14. In summary, by addressing these elements, we can provide an Enterprise Hadoop distribution which includes the Core Services, Platform Services, Data Services and Operational Services required by the Enterprise user. And all of this is done in 100% open source, and tested at scale by our team (together with our partner Yahoo) to bring Enterprise process to an open source approach. And finally, this is the distribution that is endorsed by the ecosystem to ensure interoperability in your environment.
  19. While overly simplistic, this graphic represents what we commonly see as a general data architecture: a set of data sources producing data; a set of data systems to capture and store that data, most typically a mix of RDBMS and data warehouses; and a set of applications that leverage the data stored in those data systems. These could be packaged BI applications (Business Objects, Tableau, etc.), Enterprise Applications (e.g. SAP) or Custom Applications (e.g. custom web applications), ranging from ad-hoc reporting tools to mission-critical enterprise operations applications. Your environment is undoubtedly more complicated, but conceptually it is likely similar.
  20. As the volume of data has exploded, we increasingly see organizations acknowledge that not all data belongs in a traditional database. The drivers are both cost (as volumes grow, database licensing costs can become prohibitive) and technology (databases are not optimized for very large datasets). Instead, we increasingly see Hadoop – and HDP in particular – being introduced as a complement to the traditional approaches. It is not replacing the database, and as such it must integrate easily with existing tools and approaches. This means it must interoperate with: existing applications, such as Tableau, SAS, Business Objects, etc.; existing databases and data warehouses, for loading data to and from the data warehouse; development tools used for building custom applications; and operational tools for managing and monitoring.
  21. It is for that reason that we focus on HDP interoperability across all of these categories. Data systems: HDP is endorsed and embedded with SQL Server, Teradata and more. BI tools: HDP is certified for use with the packaged applications you already use, from Microsoft to Tableau, MicroStrategy, Business Objects and more. Development tools: for .NET developers, Visual Studio, used to build more than half the custom applications in the world, certifies with HDP to enable Microsoft app developers to build custom apps with Hadoop; for Java developers, Spring for Apache Hadoop enables them to quickly and easily build Hadoop-based applications with HDP. Operational tools: integration with System Center and with Teradata Viewpoint.
  22. SQL-H integration between Aster and Hadoop nodes: analytics and discovery capabilities provided by Aster nodes while retaining archival data on Hadoop. High performance: parallel BYNET connections among the Aster nodes and Hadoop nodes. Management interface: seamless integration with Horton Management Console; Teradata Viewpoint integration by March 2013. Troubleshooting and supportability: seamless integration with Teradata Vital Infrastructure (TVI) for log collection and support case resolution. Out-of-the-box experience: pre-tuned parameters for HDFS and MapReduce infrastructure.
  23. Enterprise Reports – Your cell phone bill is an example. Dashboard – KPI tracking. Parameterized Reports – What are the hot prospects in my region? Visualization – Visual exploration of data. Data Mining – Large scale data processing and extraction, usually fed to other tools. How? Improve latency & throughput: query engine improvements; new “Optimized RCFile” column store; next-gen runtime (eliminates M/R latency). Extend deep analytical ability: analytics functions; improved SQL coverage; continued focus on core Hive use cases.
  24. We are believers in open source: we believe it is the most efficient way to develop enterprise software. But more importantly, we believe that 100% open source is the best approach for our customers. In the data management market in particular, our customers are acutely aware of the implications of growing their database usage with a proprietary vendor who can then exert pricing pressure (Oracle). Particularly when it comes to data storage, which we can all anticipate will continue to grow exponentially, you don’t want to be penalized for scale. By choosing an open source approach, organizations can build their operational processes on open technologies without concern that they will be locked in to a particular vendor. And they can be confident that as their usage grows, they won’t get penalized for success. It is ultimately about mitigating risk, and in this regard open source has been proven as the safest approach. I would also caution you to look beyond the open source label used by some vendors: are they harvesting open source work, forking the code and then working independently (“fork early / patch often”)? Or, like Hortonworks, have they embraced and committed to the community open source approach, which allows them to stay in sync with the innovation of the community? In the Hadoop community, Hortonworks is unquestioned in taking the community-driven approach.
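
Note 12 reads the “+923.423” and “+541” suffixes in vendor version strings as a count of patches applied on top of the Apache release. A minimal sketch of extracting that count is below. It assumes the format is `<apache-version>+<patch-count>[.<extra>]`, takes the note's interpretation of the suffix at face value, and uses made-up base versions (`0.20.2`, `0.9.0`) purely for illustration; the names `VersionDrift` and `parse_drift` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class VersionDrift(NamedTuple):
    base: str             # upstream Apache version the build started from
    patch_count: int      # patches applied on top (0 if no "+" suffix)
    extra: Optional[str]  # anything after the patch count, kept as opaque text


# Assumed layout: "<base>+<count>" optionally followed by ".<extra>".
_PATTERN = re.compile(
    r"^(?P<base>[0-9.]+)\+(?P<count>\d+)(?:\.(?P<extra>[\d.]+))?$"
)


def parse_drift(version: str) -> VersionDrift:
    """Split a version string into its base version and patch-count suffix."""
    match = _PATTERN.match(version)
    if match is None:
        # No "+N" suffix: treat as an unpatched upstream version.
        return VersionDrift(version, 0, None)
    return VersionDrift(match["base"], int(match["count"]), match["extra"])


# Base versions here are illustrative placeholders, not taken from the slides.
for v in ("0.20.2+923.423", "0.9.0+541", "1.1.2"):
    drift = parse_drift(v)
    print(f"{v}: {drift.patch_count} patches on top of {drift.base}")
```

Comparing `patch_count` across components gives a quick, if crude, view of how far a packaged build sits from its upstream release; it says nothing about what the patches contain.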