1. Big Data Integration
Talend Open Studio & Hortonworks Data Platform
Ciaran Dynes: Senior Director, Product Marketing - Talend
Jim Walker: Director, Product Marketing - Hortonworks
August 8, 2012
© Hortonworks Inc. 2012 Page 1
2. Your Presenters
Ciaran Dynes
Senior Director, Product Marketing
Jim Walker
Director, Product Marketing
3. Talend – The Market Leading Unified Integration Platform
Talend Enterprise: Data Quality, Data Integration, MDM, ESB, BPM
• Commercial license
• Subscription model
Studio, Repository, Deployment, Execution, Monitoring
Talend Open Studio for Data Quality, Data Integration, MDM, ESB
• Open source license
• Free of charge
• Optional support
Recognized as the open source leader in each of its market categories by all industry analysts
© Talend 2011 3
4. Hortonworks Snapshot
The industry leading and only 100% open source Apache Hadoop distribution
• Headquarters: Sunnyvale, CA
• 90+ Employees
• Formed with core Apache Hadoop engineering team from Yahoo!
• 35 engineers and architects including 25+ Hadoop committers
Most experienced open source leadership team
– Rob Bearden – CEO (JBoss, SpringSource, i2, Oracle)
– Shaun Connolly – VP Strategy (VMW, SpringSource, Red Hat, JBoss)
– John Kreisa – VP Marketing (Red Hat, Cloudera, MarkLogic, Bus Obj)
– Ari Zilka – CPO (Terracotta, Accenture, Walmart.com)
– Greg Pavlik – VP Eng. (Oracle SOA & Integration platform)
Business model focused on customer success: Hadoop support, services & training
– Subscription support for Hortonworks Data Platform
– Training business: Private and public classes available for developers & administrators
5. Next-gen data architecture drivers
Business Drivers
• Enable new business models & drive faster growth (20%+)
• Find insights for competitive advantage & optimal returns
Technical Drivers
• Data continues to grow exponentially
• Data is increasingly everywhere and in many formats
• Legacy solutions unfit for new requirements growth
Financial Drivers
• Cost of data systems, as % of IT spend, continues to grow
• Cost advantages of commodity hardware & open source
6. Big data changes the game
Transactions + Interactions + Observations = BIG DATA
[Figure: data volume (Megabytes, Gigabytes, Terabytes, Petabytes) plotted against Increasing Data Variety and Complexity, with tiers ERP (Megabytes), CRM (Gigabytes), WEB (Terabytes) and BIG DATA (Petabytes).]
Data types shown: Purchase detail, Purchase record, Payment record, Offer details, Offer history, Segmentation, Customer Touches, Support Contacts, Web logs, A/B testing, Dynamic Pricing, Dynamic Funnels, Search Marketing, Behavioral Targeting, Affiliate Networks, Mobile Web, Sentiment, User Click Stream, SMS/MMS, Speech to Text, Social Interactions & Feeds, Spatial & GPS Coordinates, Sensors / RFID / Devices, Business Data Feeds, External Demographics, User Generated Content, HD Video, Audio, Images, Product/Service Logs.
7. Use cases: optimize outcomes at scale
Media optimize Content
Intelligence optimize Detection
Investment optimize Algorithms
Advertising optimize Performance
Fraud optimize Prevention
Regulation optimize Compliance
Retail / Wholesale optimize Inventory turns
Manufacturing optimize Supply chains
Healthcare optimize Patient outcomes
Education optimize Learning outcomes
Government optimize Citizen services
Source: Geoffrey Moore. Hadoop Summit 2012 keynote presentation.
8. Hortonworks Data Platform
Delivers enterprise grade functionality on a proven Apache Hadoop distribution to ease management, simplify use and ease integration into the enterprise
• Simplify deployment to get started quickly and easily
• Monitor, manage any size cluster with familiar console and tools
• Only platform to include data integration services to interact with any data source
• Metadata services open the platform for integration with existing applications
• Dependable high availability architecture
The only 100% open source data platform for Apache Hadoop
9. Data Integration Services
• Intuitive graphical data
integration tools for HDFS,
Hive, HBase, HCatalog and Pig
• Oozie scheduling allows you to
manage and stage jobs
• Connectors for any database,
business application or system
• Integrated HCatalog storage
Bridge the gap between
legacy data & Hadoop
Simplify and speed development
11. Trying to get from this…
12. to this…
Why Talend…
ONLY Talend generates code that is executed within MapReduce. This
open approach removes the limitation of a proprietary “engine” to
provide a truly unique and powerful set of tools for big data.
13. Key Takeaway #2
Forces us to think differently
14. But for Talend…. Big data is…
…everything that is old, is new again!
15. Data driven business
[Diagram: data governance, information, decisions and your business, linked by the labels "enables", "supports" and "drives".]
Information provides value to the business. If you can't rely on your information then the result can be missed opportunities, or higher costs.
Matthew West and Julian Fowler (1999). Developing High Quality Data Models. The European Process Industries STEP Technical Liaison Executive (EPISTLE).
16. BIG data driven business
[Diagram: BIG data governance, BIG information, BIG decisions and BIG business, linked by the labels "enables", "supports" and "drives".]
Information provides value to the business. If you can't rely on your information then the result can be missed opportunities, or higher costs.
Matthew West and Julian Fowler (1999). Developing High Quality Data Models. The European Process Industries STEP Technical Liaison Executive (EPISTLE).
18. Putting Web Logs to use
Scenario:
• ACME Web Inc. has thousands of customers and millions of daily page hits on its ecommerce website
• ACME believes it could sell more if it could simply figure out buying trends
• ACME turns to Big Data to help get a handle on the volume of data it needs to manage
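The scenario above can be sketched as a map step and a reduce step over web log lines. This is only an illustration, not code generated by Talend or shipped with Hortonworks Data Platform: the tab-separated log format, the sample lines and the function names are all assumptions made for the example, and it runs in plain Python rather than on a Hadoop cluster.

```python
from collections import Counter
from itertools import chain

# Hypothetical log lines in an assumed "customer_id<TAB>page" format.
SAMPLE_LOG = [
    "c1\t/product/42",
    "c2\t/product/42",
    "c1\t/checkout",
    "malformed line",
    "c3\t/product/7",
]


def map_line(line):
    """Map step: emit (page, 1) for a well-formed line, nothing otherwise."""
    parts = line.split("\t")
    if len(parts) != 2 or not all(parts):
        return []  # skip lines that do not match the assumed format
    _customer, page = parts
    return [(page, 1)]


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each page."""
    totals = Counter()
    for page, count in pairs:
        totals[page] += count
    return totals


def page_hits(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    return reduce_counts(chain.from_iterable(map_line(line) for line in lines))


if __name__ == "__main__":
    for page, hits in page_hits(SAMPLE_LOG).most_common():
        print(page, hits)
```

On a real cluster the same two steps would be distributed across many nodes; the point here is only the shape of the computation behind "figuring out buying trends" from page hits.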
19. Poor Data Quality + Big Data = Big Problems
Poor Data Quality * Big Data = Big Problems^2
Key Takeaway #3
In big data…
poor data quality can be magnified at huge scale
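As a rough worked example of this takeaway, the sketch below holds a per-record error rate fixed and grows the data volume. The 1% rate and the record counts are assumptions chosen for illustration, not figures from the presentation, and the sketch only shows the simplest (linear) effect of scale.

```python
def expected_bad_records(total_records, error_rate):
    """Expected number of bad records for a fixed per-record error rate."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    return round(total_records * error_rate)


if __name__ == "__main__":
    # Assumed 1% error rate at three assumed data volumes.
    for total in (10_000, 10_000_000, 10_000_000_000):
        print(total, expected_bad_records(total, 0.01))
```

An error rate that looks tolerable on a small data set still yields a very large absolute number of bad records once the volume reaches big data scale.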
20. Metadata Services
Apache HCatalog provides flexible metadata
services across tools and external access
• Consistency of metadata and data models across tools
(MapReduce, Pig, HBase and Hive)
• Accessibility: share data as tables in and out of HDFS
• Availability: enables flexible, thin-client access via REST API
[Diagram: HCatalog opens the platform]
From:
• Raw Hadoop data
• Inconsistent, unknown
• Tool specific access
To, through HCatalog:
• Shared table and schema management
• Table access
• Aligned metadata
• REST API
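To make the thin-client REST access concrete, here is a minimal sketch that only builds the URL a client might request to describe a table. The path layout follows the WebHCat (Templeton) convention as I understand it and should be checked against the documentation for the installed version; the host name, the port default, the database, table and user names, and the function name are all assumptions for illustration. No network call is made.

```python
from urllib.parse import quote, urlencode


def describe_table_url(host, database, table, user, port=50111):
    """Build a URL for describing a table over the HCatalog REST interface.

    The path layout and the port default are assumptions based on the
    WebHCat (Templeton) convention; verify them for your installation.
    """
    path = "/templeton/v1/ddl/database/{}/table/{}".format(
        quote(database, safe=""), quote(table, safe="")
    )
    query = urlencode({"user.name": user})
    return "http://{}:{}{}?{}".format(host, port, path, query)


if __name__ == "__main__":
    # Hypothetical host, database, table and user.
    print(describe_table_url("hdp.example.com", "default", "weblogs", "analyst"))
```

Because the interface is plain HTTP, any tool that can issue a GET request could read the same table metadata without Hadoop client libraries.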
21. Talend Open Studio for Big Data
Democratize Big Data
Talend Open Studio for Big Data
• Improves efficiency of big data job design with graphic interface
• Generates Hadoop code
• Run transforms inside Hadoop
• Native support for HDFS, Pig, HBase, Sqoop and Hive
• Apache License
• Available at talend.com
• Distribution with Hadoop vendors coming
…an open source ecosystem
22. Talend Platform for Big Data
Make Faster and More Informed Decisions
Talend Platform for Big Data
• Builds on Talend Open Studio for Big Data
• Adds data quality, advanced scalability and management functions
• MapReduce massively parallel data processing
• Shared Repository and remote deployment
• Data quality and profiling
• Data cleansing
• Reporting and dashboards
• Commercial support, warranty/IP indemnity under a subscription license
…an open source ecosystem
23. Why HDP?
Only Hortonworks Data Platform provides…
• Tightly aligned to core Apache Hadoop development line
- Reduces risk for customers who may add custom coding or projects
• Enterprise Integration
- HCatalog provides scalable, extensible integration point to Hadoop data
• Most reliable Hadoop distribution
- Full stack high availability on v1 delivers the strongest SLA guarantees
• Multi-tenant scheduling and resource management
- Capacity and fair scheduling optimizes cluster resources
• Integration with operations, eases cluster management
- Ambari is the most open/complete operations platform for Hadoop clusters
24. What next?
1 Download Hortonworks Data Platform & Talend Open Studio
hortonworks.com/download or talend.com/download
2 Use the getting started guide
hortonworks.com/get-started
3 Learn more… get support
• Expert role based training
• Course for admins, developers and operators
• Certification program
• Custom onsite options
hortonworks.com/training
Hortonworks Support
• Full lifecycle technical support across four service levels
• Delivered by Apache Hadoop Experts/Committers
• Forward-compatible
hortonworks.com/support
25. Questions & Answers
TRY
download at hortonworks.com
download at talend.com
LEARN
Hortonworks University
FOLLOW
twitter: @hortonworks
Facebook: facebook.com/hortonworks
MORE EVENTS
hortonworks.com/events
Further questions & comments: events@hortonworks.com