Big Data Integration
Talend Open Studio & Hortonworks Data Platform
Ciaran Dynes: Senior Director, Product Marketing - Talend
Jim Walker: Director, Product Marketing - Hortonworks

August 8, 2012




© Hortonworks Inc. 2012                                     Page 1
Your Presenters


                               Ciaran Dynes
                               Senior Director, Product Marketing




                                         Jim Walker
                             Director, Product Marketing




Talend – The Market Leading Unified Integration Platform

Talend Enterprise: Data Quality, Data Integration, MDM, ESB, BPM
•  Commercial license
•  Subscription model

Studio, Repository, Deployment, Execution, Monitoring

Talend Open Studio for Data Quality, Data Integration, MDM, ESB
•  Open source license
•  Free of charge
•  Optional support

Recognized as the open source leader in each of its market
            categories by all industry analysts
© Talend 2011                                                                                       3
Hortonworks Snapshot

The industry leading and only 100% open source Apache Hadoop distribution

•  Headquarters: Sunnyvale, CA
•  90+ Employees
•  Formed with core Apache Hadoop engineering team from Yahoo!
•  35 engineers and architects including 25+ Hadoop committers

Most experienced open source leadership team
 –  Rob Bearden – CEO (JBoss, SpringSource, i2, Oracle)
 –  Shaun Connolly – VP Strategy (VMW, SpringSource, Red Hat, JBoss)
 –  John Kreisa – VP Marketing (Red Hat, Cloudera, MarkLogic, Bus Obj)
 –  Ari Zilka – CPO (Terracotta, Accenture, Walmart.com)
 –  Greg Pavlik – VP Eng. (Oracle SOA & Integration platform)

Business model focused on customer success: Hadoop support, services & training
 –  Subscription support for Hortonworks Data Platform
 –  Training business: Private and public classes available for developers & administrators


Next-gen data architecture drivers


Business Drivers
•  Enable new business models & drive faster growth (20%+)
•  Find insights for competitive advantage & optimal returns

Technical Drivers
•  Data continues to grow exponentially
•  Data is increasingly everywhere and in many formats
•  Legacy solutions unfit for new requirements growth

Financial Drivers
•  Cost of data systems, as % of IT spend, continues to grow
•  Cost advantages of commodity hardware & open source




Big data changes the game

[Figure: data volume (Megabytes, Gigabytes, Terabytes, Petabytes) plotted against increasing data variety and complexity, with nested tiers ERP, CRM, WEB and BIG DATA.]

Transactions + Interactions + Observations = BIG DATA

Data types shown in the figure: purchase detail, purchase record, payment record, offer details, offer history, support contacts, customer touches, segmentation, web logs, A/B testing, behavioral targeting, dynamic pricing, search marketing, affiliate networks, dynamic funnels, user click stream, mobile web, sentiment, SMS/MMS, speech to text, social interactions & feeds, spatial & GPS coordinates, sensors / RFID / devices, business data feeds, external demographics, user generated content, HD video, audio, images, product/service logs.


Use cases: optimize outcomes at scale
Media                 optimize   Content
Intelligence          optimize   Detection
Investment            optimize   Algorithms
Advertising           optimize   Performance
Fraud                 optimize   Prevention
Regulation            optimize   Compliance
Retail / Wholesale    optimize   Inventory turns
Manufacturing         optimize   Supply chains
Healthcare            optimize   Patient outcomes
Education             optimize   Learning outcomes
Government            optimize   Citizen services
Source: Geoffrey Moore. Hadoop Summit 2012 keynote presentation.

Hortonworks Data Platform

Hortonworks Data Platform
Delivers enterprise grade functionality on a proven Apache Hadoop distribution to ease
management, simplify use and ease integration into the enterprise

•  Simplify deployment to get started quickly and easily
•  Monitor, manage any size cluster with familiar console and tools
•  Only platform to include data integration services to interact with any data source
•  Metadata services open the platform for integration with existing applications
•  Dependable high availability architecture



The only 100% open source data platform for Apache Hadoop

Data Integration Services

•  Intuitive graphical data
   integration tools for HDFS,
   Hive, HBase, HCatalog and Pig

•  Oozie scheduling allows you to
   manage and stage jobs

•  Connectors for any database,
   business application or system

•  Integrated HCatalog storage

 Bridge the gap between
 legacy data & Hadoop

 Simplify and speed development

What is Big Data integration?
Trying to get from this…




 © Talend 2011 – Strictly Private & Confidential
to this…




 Why Talend…

 Only Talend generates code that is executed within MapReduce. This
 open approach removes the limitation of a proprietary “engine” to
 provide a truly unique and powerful set of tools for big data.
Key Takeaway #2

                 Forces us to think differently
But for Talend…. Big data is…




                …everything that is old, is new again!

Data driven business


  [Diagram: data, governance, information, decisions and "Your business", linked by arrows labeled "enables", "supports" and "drives".]

  Information provides value to the business
  If you can't rely on your information then the result can be missed opportunities, or higher costs.
     Matthew West and Julian Fowler (1999). Developing High Quality Data Models.
     The European Process Industries STEP Technical Liaison Executive (EPISTLE).
BIG data driven business

  [Diagram: BIG data, governance, BIG information, BIG decisions and "BIG business", linked by arrows labeled "enables", "supports" and "drives".]

  Information provides value to the business
  If you can't rely on your information then the result can be missed opportunities, or higher costs.
     Matthew West and Julian Fowler (1999). Developing High Quality Data Models.
     The European Process Industries STEP Technical Liaison Executive (EPISTLE).
Let us show you…




© Talend 2012
Putting Web Logs to use

    Scenario:
     •  ACME Web Inc. has thousands of customers and millions of daily page hits on its
         ecommerce website
     •  ACME believes it could sell more if it could simply figure out buying trends
     •  ACME turns to Big Data to help get a handle on the volume of data it needs to manage
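
The web log scenario can be illustrated with a minimal sketch that counts successful page hits per URL. It assumes the logs are in Apache Common Log Format, which the slides do not state, and the sample lines and the `count_page_hits` helper are invented for illustration. On a cluster the same parse-then-count shape would run as a distributed Hadoop job; this single-process version only shows the logic.

```python
import re
from collections import Counter

# Assumption: Apache Common Log Format. The slides do not name a log format.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)$'
)


def count_page_hits(lines):
    """Return a Counter of successful GET requests per requested path."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that do not parse as a log entry
        if match.group("method") == "GET" and match.group("status") == "200":
            hits[match.group("path")] += 1
    return hits


# Hypothetical sample lines, invented for this sketch.
SAMPLE_LOG = [
    '10.0.0.1 - - [08/Aug/2012:10:00:00 +0000] "GET /products/42 HTTP/1.1" 200 512',
    '10.0.0.2 - - [08/Aug/2012:10:00:01 +0000] "GET /products/42 HTTP/1.1" 200 512',
    '10.0.0.3 - - [08/Aug/2012:10:00:02 +0000] "GET /cart HTTP/1.1" 200 128',
    '10.0.0.4 - - [08/Aug/2012:10:00:03 +0000] "GET /missing HTTP/1.1" 404 0',
    'not a log line',
]

if __name__ == "__main__":
    for path, count in count_page_hits(SAMPLE_LOG).most_common():
        print(path, count)
```

Ranking paths by hit count is only a first step toward buying trends; joining hits to purchase records would be the natural next stage.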




Poor Data Quality + Big Data = Big Problems
Poor Data Quality * Big Data = Big Problems^2




           Key Takeaway #3
           In big data…
           poor data quality can be magnified at huge scale
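
A small sketch of why scale magnifies quality problems: at a fixed error rate, the absolute number of bad records grows with volume, so rejecting them before aggregation matters more as data grows. The field names, the sample records and both helper functions are hypothetical, and the 1% error rate is an arbitrary illustration, not a figure from the slides.

```python
def split_valid(records, required_fields):
    """Split records into (valid, rejected) by requiring non-empty fields."""
    valid, rejected = [], []
    for record in records:
        if all(record.get(field) not in (None, "") for field in required_fields):
            valid.append(record)
        else:
            rejected.append(record)
    return valid, rejected


def expected_bad_records(total_records, error_rate):
    """Absolute number of bad records implied by a fixed error rate."""
    return round(total_records * error_rate)


# Hypothetical records, invented for this sketch.
SAMPLE_RECORDS = [
    {"customer_id": "c1", "page": "/products/42"},
    {"customer_id": "", "page": "/cart"},          # missing customer id
    {"customer_id": "c3", "page": "/checkout"},
    {"customer_id": "c4"},                          # missing page
]

if __name__ == "__main__":
    valid, rejected = split_valid(SAMPLE_RECORDS, ["customer_id", "page"])
    print(len(valid), "valid,", len(rejected), "rejected")
    # The same assumed 1% error rate at two different volumes.
    for volume in (1_000_000, 1_000_000_000):
        print(volume, "records ->", expected_bad_records(volume, 0.01), "bad")
```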

Metadata Services
Apache HCatalog provides flexible metadata
services across tools and external access
 •  Consistency of metadata and data models across tools
    (MapReduce, Pig, HBase and Hive)
 •  Accessibility: share data as tables in and out of HDFS
 •  Availability: enables flexible, thin-client access via REST API
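
As a rough illustration of thin-client access, the sketch below only builds the URL a client might request to list the tables in a database. It assumes the WebHCat (formerly Templeton) endpoint layout and its usual default port 50111; the slides do not show the API, so check the path against your installed version. The host name, the user name and the `table_list_url` helper are placeholders.

```python
from urllib.parse import quote, urlencode


def table_list_url(host, database, user, port=50111):
    """Build the URL that lists the tables of one database.

    Assumption: WebHCat/Templeton path layout and default port.
    """
    path = "/templeton/v1/ddl/database/{}/table".format(quote(database, safe=""))
    query = urlencode({"user.name": user})
    return "http://{}:{}{}?{}".format(host, port, path, query)


if __name__ == "__main__":
    # Placeholder host and user; no request is sent by this sketch.
    print(table_list_url("hcat.example.com", "default", "analyst"))
```

Because the interface is plain HTTP, any client that can issue a GET request can read table metadata without Hadoop libraries installed.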




   •  Raw Hadoop data
   •  Inconsistent, unknown
   •  Tool specific access

   HCatalog
   •  Table access
   •  Aligned metadata
   •  REST API

   Shared table and schema management opens the platform



Talend Open Studio for Big Data


                          Democratize Big Data

                          Talend Open Studio for Big Data
                          •    Improves efficiency of big data job design with graphic interface
                          •    Generates Hadoop code
                          •    Runs transforms inside Hadoop
                          •    Native support for HDFS, Pig, HBase, Sqoop and Hive

                          •    Apache License
                          •    Available at talend.com
                          •    Distribution with Hadoop vendors coming

                          …an open source ecosystem
Talend Platform for Big Data


                          Make Faster and More Informed Decisions

                          Talend Platform for Big Data
                          •    Builds on Talend Open Studio for Big Data
                          •    Adds data quality, advanced scalability and management functions
                                •    MapReduce massively parallel data processing
                                •    Shared Repository and remote deployment
                                •    Data quality and profiling
                                •    Data cleansing
                                •    Reporting and dashboards
                          •    Commercial support, warranty/IP indemnity under a subscription license

                          …an open source ecosystem
Why HDP?

Only Hortonworks Data Platform provides…
•  Tightly aligned to core Apache Hadoop development line
   - Reduces risk for customers who may add custom coding or projects

•  Enterprise Integration
  - HCatalog provides scalable, extensible integration point to Hadoop data

•  Most reliable Hadoop distribution
  - Full stack high availability on v1 delivers the strongest SLA guarantees

•  Multi-tenant scheduling and resource management
  - Capacity and fair scheduling optimizes cluster resources

•  Integration with operations, eases cluster management
  - Ambari is the most open/complete operations platform for Hadoop clusters




What next?
1   Download Hortonworks Data Platform & Talend Open Studio
    hortonworks.com/download or talend.com/download

2   Use the getting started guide
    hortonworks.com/get-started

3   Learn more… get support

       •  Expert role based training
       •  Course for admins, developers and operators
       •  Certification program
       •  Custom onsite options
        hortonworks.com/training

       Hortonworks Support
       •  Full lifecycle technical support across four service levels
       •  Delivered by Apache Hadoop Experts/Committers
       •  Forward-compatible
        hortonworks.com/support


Questions & Answers

                                           TRY
                                           download at hortonworks.com
                                           download at talend.com

                                           LEARN
                                           Hortonworks University

                                           FOLLOW
                                           twitter: @hortonworks
                                           Facebook: facebook.com/hortonworks

                                           MORE EVENTS
                                           hortonworks.com/events



                             Further questions & comments: events@hortonworks.com


Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 

Talend Open Studio and Hortonworks Data Platform

  • 1. Big Data Integration Talend Open Studio & Hortonworks Data Platform Ciaran Dynes: Senior Director, Product Marketing - Talend Jim Walker: Director, Product Marketing - Hortonworks August 8, 2012 © Hortonworks Inc. 2012 Page 1
  • 2. Your Presenters: Ciaran Dynes, Senior Director, Product Marketing; Jim Walker, Director, Product Marketing © Hortonworks Inc. 2012
  • 3. Talend – The Market Leading Unified Integration Platform. Talend Enterprise (Data Quality, Data Integration, MDM, ESB, BPM): •  Commercial license •  Subscription model. Studio, Repository, Deployment, Execution, Monitoring. Talend Open Studio for Data Quality, Data Integration, MDM, ESB: •  Open source license •  Free of charge •  Optional support. Recognized as the open source leader in each of its market categories by all industry analysts © Talend 2011
  • 4. Hortonworks Snapshot: The industry leading and only 100% open source Apache Hadoop distribution. •  Headquarters Sunnyvale, CA •  90+ Employees •  Formed with core Apache Hadoop engineering team from Yahoo! •  35 engineers and architects including 25+ Hadoop committers. Most experienced open source leadership team: –  Rob Bearden – CEO (JBoss, SpringSource, i2, Oracle) –  Shaun Connolly – VP Strategy (VMW, SpringSource, Red Hat, JBoss) –  John Kreisa – VP Marketing (Red Hat, Cloudera, MarkLogic, Bus Obj) –  Ari Zilka – CPO (Terracotta, Accenture, Walmart.com) –  Greg Pavlik – VP Eng. (Oracle SOA & Integration platform). Business model focused on customer success: Hadoop support, services & training – Subscription support for Hortonworks Data Platform – Training business: Private and public classes available for developers & administrators © Hortonworks Inc. 2012
  • 5. Next-gen data architecture drivers. Business Drivers: •  Enable new business models & drive faster growth (20%+) •  Find insights for competitive advantage & optimal returns. Technical Drivers: •  Data continues to grow exponentially •  Data is increasingly everywhere and in many formats •  Legacy solutions unfit for new requirements growth. Financial Drivers: •  Cost of data systems, as % of IT spend, continues to grow •  Cost advantages of commodity hardware & open source © Hortonworks Inc. 2012
  • 6. Big data changes the game. Transactions + Interactions + Observations = BIG DATA. [Chart: data volume rises from Megabytes (ERP) through Gigabytes (CRM) and Terabytes (WEB) to Petabytes (BIG DATA) along an axis of Increasing Data Variety and Complexity. Data types shown: Purchase detail, Purchase record, Payment record, Segmentation, Offer details, Offer history, Customer Touches, Support Contacts, Web logs, A/B testing, Behavioral Targeting, Dynamic Pricing, Dynamic Funnels, Affiliate Networks, Search Marketing, Business Data Feeds, External Demographics, User Generated Content, Mobile Web, Sentiment, User Click Stream, SMS/MMS, Speech to Text, Social Interactions & Feeds, Spatial & GPS Coordinates, Sensors / RFID / Devices, HD Video, Audio, Images, Product/Service Logs] © Hortonworks Inc. 2012
  • 7. Use cases: optimize outcomes at scale. Media: optimize content; Intelligence: optimize detection; Investment: optimize algorithms; Advertising: optimize performance; Fraud: optimize prevention; Regulation: optimize compliance; Retail / Wholesale: optimize inventory turns; Manufacturing: optimize supply chains; Healthcare: optimize patient outcomes; Education: optimize learning outcomes; Government: optimize citizen services. Source: Geoffrey Moore. Hadoop Summit 2012 keynote presentation. © Hortonworks Inc. 2012
  • 8. Hortonworks Data Platform: Delivers enterprise grade functionality on a proven Apache Hadoop distribution to ease management, simplify use and ease integration into the enterprise. •  Simplify deployment to get started quickly and easily •  Monitor, manage any size cluster with familiar console and tools •  Only platform to include data integration services to interact with any data source •  Metadata services open the platform for integration with existing applications •  Dependable high availability architecture. The only 100% open source data platform for Apache Hadoop © Hortonworks Inc. 2012
  • 9. Data Integration Services: Bridge the gap between legacy data & Hadoop. Simplify and speed development. •  Intuitive graphical data integration tools for HDFS, Hive, HBase, HCatalog and Pig •  Oozie scheduling allows you to manage and stage jobs •  Connectors for any database, business application or system •  Integrated HCatalog storage © Hortonworks Inc. 2012
  • 10. What is Big Data integration?
  • 11. Trying to get from this… © Talend 2011 – Strictly Private & Confidential
  • 12. …to this. Why Talend… ONLY Talend generates code that is executed within MapReduce. This open approach removes the limitation of a proprietary “engine” to provide a truly unique and powerful set of tools for big data.
  • 13. Key Takeaway #2: Forces us to think differently © Talend 2011 – Strictly Private & Confidential
  • 14. But for Talend… Big data is… everything that is old is new again! © Talend 2011 – Strictly Private & Confidential
  • 15. Data driven business. [Diagram linking your business, data governance, information and decisions with the labels "enables", "supports" and "drives"] Information provides value to the business. If you can't rely on your information then the result can be missed opportunities, or higher costs. Matthew West and Julian Fowler (1999). Developing High Quality Data Models. The European Process Industries STEP Technical Liaison Executive (EPISTLE). © Talend 2011 – Strictly Private & Confidential
  • 16. BIG data driven business. [Diagram linking BIG business, BIG data governance, BIG information and BIG decisions with the labels "enables", "supports" and "drives"] Information provides value to the business. If you can't rely on your information then the result can be missed opportunities, or higher costs. Matthew West and Julian Fowler (1999). Developing High Quality Data Models. The European Process Industries STEP Technical Liaison Executive (EPISTLE). © Talend 2011 – Strictly Private & Confidential
  • 17. Let us show you… © Talend 2012
  • 18. Putting Web Logs to use. Scenario: •  ACME Web Inc. has thousands of customers and millions of daily page hits on its ecommerce website •  ACME believes it could sell more if it could simply figure out buying trends •  ACME turns to Big Data to help get a handle on the volume of data it needs to manage © Talend 2011 / © 2012 Talend
  • 19. Poor Data Quality + Big Data = Big Problems. Poor Data Quality × Big Data = Big Problems². Key Takeaway #3: In big data… poor data quality can be magnified at huge scale © Talend 2011
  • 20. Metadata Services: Apache HCatalog provides flexible metadata services across tools and external access. •  Consistency of metadata and data models across tools (MapReduce, Pig, HBase and Hive) •  Accessibility: share data as tables in and out of HDFS •  Availability: enables flexible, thin-client access via REST API. [Diagram: HCatalog shared table and schema management opens the platform. One side lists raw Hadoop data, inconsistent or unknown metadata, and tool specific access; the other lists table access, aligned metadata, and REST API] © Hortonworks Inc. 2012
  • 21. Talend Open Studio for Big Data: Democratize Big Data. •  Improves efficiency of big data job design with graphic interface •  Generates Hadoop code •  Run transforms inside Hadoop •  Native support for HDFS, Pig, HBase, Sqoop and Hive •  Apache License •  Available at talend.com •  Distribution with Hadoop vendors coming. …an open source ecosystem © Talend 2011
  • 22. Talend Platform for Big Data: Make Faster and More Informed Decisions. •  Builds on Talend Open Studio for Big Data •  Adds data quality, advanced scalability and management functions •  MapReduce massively parallel data processing •  Shared Repository and remote deployment •  Data quality and profiling •  Data cleansing •  Reporting and dashboards •  Commercial support, warranty/IP indemnity under a subscription license. …an open source ecosystem © Talend 2011
  • 23. Why HDP? Only Hortonworks Data Platform provides… •  Tightly aligned to core Apache Hadoop development line: reduces risk for customers who may add custom coding or projects •  Enterprise integration: HCatalog provides a scalable, extensible integration point to Hadoop data •  Most reliable Hadoop distribution: full stack high availability on v1 delivers the strongest SLA guarantees •  Multi-tenant scheduling and resource management: capacity and fair scheduling optimize cluster resources •  Integration with operations eases cluster management: Ambari is the most open/complete operations platform for Hadoop clusters © Hortonworks Inc. 2012
  • 24. What next? 1. Download Hortonworks Data Platform & Talend Open Studio: hortonworks.com/download or talend.com/download. 2. Use the getting started guide: hortonworks.com/get-started. 3. Learn more… get support. Training (hortonworks.com/training): •  Expert role based training •  Course for admins, developers and operators •  Certification program •  Custom onsite options. Hortonworks Support (hortonworks.com/support): •  Full lifecycle technical support across four service levels •  Delivered by Apache Hadoop Experts/Committers •  Forward-compatible © Hortonworks Inc. 2012
  • 25. Questions & Answers. TRY: download at hortonworks.com, download at talend.com. LEARN: Hortonworks University. FOLLOW: twitter: @hortonworks, Facebook: facebook.com/hortonworks. MORE EVENTS: hortonworks.com/events. Further questions & comments: events@hortonworks.com © Hortonworks Inc. 2012