© Hortonworks Inc. 2011 – 2014. All Rights Reserved
Q&A box is available for your questions
Webinar will be recorded for future viewing
Thank you for joining!
We’ll get started soon…
Create a Smarter Data Lake with HP Haven
and Apache Hadoop
We do Hadoop.
Your speakers…
Ajay Singh, Director of Technical Channels
Hortonworks
Will Gardella, Director of Product
Management, Big Data
HP
Traditional systems under pressure
Challenges
• Constrains data to app
• Can’t manage new data
• Costly to Scale
Business Value
Clickstream
Geolocation
Web Data
Internet of Things
Docs, emails
Server logs
2012: 2.8 Zettabytes
2020: 40 Zettabytes
LAGGARDS
INDUSTRY
LEADERS
1
2 New Data
ERP CRM SCM
New
Traditional
Hadoop emerged as foundation of new data architecture
Apache Hadoop is an open source data platform for
managing large volumes of high-velocity, high-variety data
• Built by Yahoo! to be the heartbeat of its ad & search business
• Donated to Apache Software Foundation in 2005 with rapid adoption by
large web properties & early adopter enterprises
Hadoop Advantages
 Manages new data paradigm
 Handles data at scale
 Cost effective
 Open source
Application
Storage
HDFS
Batch Processing
MapReduce
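The batch model named on this slide can be pictured with a small word count. This is a minimal sketch in plain Python, not Hadoop API code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions, and everything runs in one process, whereas Hadoop distributes the same three phases across a cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, the step that sits between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on in-memory input."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical input lines standing in for files stored on HDFS.
    print(word_count(["server logs", "web data", "server data"]))
```

Because each mapper call looks at one line only and each reducer call looks at one key only, both phases can in principle be split across many machines, which is the property the batch layer relies on.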
Hadoop for the Enterprise:
Implement a Modern Data Architecture with HDP
Customer Momentum
• 230+ customers (as of Q3 2014)
Hortonworks Data Platform
• Completely open multi-tenant platform for any app & any
data.
• A centralized architecture of consistent enterprise services
for resource management, security, operations, and
governance.
Partner for Customer Success
• Open source community leadership focused on enterprise
needs
• Unrivaled world class support
• Founded in 2011
• Original 24 architects, developers,
operators of Hadoop from Yahoo!
• 600+ Employees
• 800+ Ecosystem Partners
HDP delivers a completely open data platform
Hortonworks Data Platform 2.2
Hortonworks Data Platform provides Hadoop for the Enterprise: a centralized architecture
of core enterprise services, for any application and any data.
Completely Open
• HDP incorporates every element
required of an enterprise data
platform: data storage, data
access, governance, security,
operations
• All components are developed in
open source and then rigorously
tested, certified, and delivered as
an integrated open source platform
that’s easy to consume and use by
the enterprise and ecosystem.
YARN: Data Operating System
(Cluster Resource Management)
Apache Pig
HDFS
(Hadoop Distributed File System)
GOVERNANCE BATCH, INTERACTIVE & REAL-TIME DATA ACCESS
Apache Falcon
Apache Hive
Cascading
Apache HBase
Apache Accumulo
Apache Solr
Apache Spark
Apache Storm
Apache Sqoop
Apache Flume
Apache Kafka
SECURITY
Apache Ranger
Apache Knox
Apache Falcon
OPERATIONS
Apache Ambari
Apache ZooKeeper
Apache Oozie
HP & Hortonworks
HP & Hortonworks: An Integrated Part of a Modern Data Architecture
Smart Content Hub
Solution Architecture
© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
The Opportunity
Accelerating business outcomes
Data lakes – the new enterprise data hub
Social media, IT/OT, Images, Audio, Video, Transactional data, Mobile, Search engine, Email, Texts, Documents
Hadoop Distributed File System (HDFS)
Self-healing, high bandwidth, cheap clustered storage
Map/Reduce
Distributed Computing Framework
Business
outcomes
Analyzing data is a multi-step process
Data type
• Structured tables
• Semi-structured
• Unstructured
• Documents
• Images
• Audio
• Video
Speed
• Batch
• Interactive
• Real-time
Process
• Acquisition
• Preparation
• Visualization
• Analysis
• Presentation
• Collaboration
Skill set
• Business users
• Programmer
• Database expert
• Statistician
• Mathematician
• Subject matter expert
Types
• Descriptive
• Diagnostic
• Predictive
• Prescriptive
Requires easy-to-use tools that serve a wide range of skills
Challenge: Barriers between business users and
actionable information
Business users Data Scientists
Programmers Batch
Data Cleansing
Programming
Statistics
Reports
Information Requests
Hadoop
Contextual search
Data exploration
Image/video analytics
Geospatial analytics
SQL on Hadoop
Accelerated analytics
Sentiment analysis
Predictive analytics
HP Haven Big Data platform
Access Explore Enrich Analyze Predict Serve Act
And more...
Core big data business capabilities
On-premise In the Cloud
Industry-leading breadth & depth of capabilities
A Smarter Data Lake
• Any Source – Build, enrich, and clean up your data lake
• Data Clarity & Mapped Security – Data dictionary and information security within your data lake
• Advanced Analytics – Provide contextual search and machine learning for text, image, video, and speech
Fast Analytics with HP Vertica on Hadoop
• The fastest and most advanced SQL analytics on Hadoop
• Operationalize, democratize and monetize all your data
• Data tiering – pick the best location and format for your data
HP Haven for Hadoop
2D/3D clustering, Acoustic signature, Active matching, Agents, Alerting, Auto language detection, Auto query guidance, Boolean & legacy, Operations, Breaking news clustering, Categorization, Collaboration, Community, Concept highlighting,
Concept-query, Summarization, Conceptual retrieval, Context summarization, Cross-modal suggest, Dynamic n-dimensional, Taxonomy generation, Dynamic XML, Consumption, Exact phrase matching, Expertise location, Explicit profiling,
Face recognition, Field modulation, Frame analysis, Fuzzy matching, Hot clustering, Hyperlinking, Image analysis, Image association, Implicit profiling, Keyword search, Mail object identification, Melody classification, Melody identification,
Metadata recognition, Natural language retrieval, Object identification, Object recognition, Ontology generation, Parametric refinement, Phrase spotting, Proper name identification, Query by example, Real-time aggregation, Routing,
Scene detection, Script alignment, Sentiment analysis, Sound matching, Speaker identification, Speaker recognition, Spectrographic analysis, Spell checking, Tag reconciliation, Transcription, Video analysis, Voice printing, Word spotting,
Work groups, XML tagging….
Analyze Enrich Find Act
HP IDOL: Act on 100% of your information
Transactional data, IoT, Search engine, Images, Social media, Video, Audio, Mobile, Documents, Texts, Email, Customer communications
Language independent
News
Forums
Blogs
…and more
Enterprise External and Cloud
HP Archiving
HP Data Protection
HP Marketing
Optimization
…and more
HP IDOL
500+ powerful HP IDOL functions
© Copyright 2014 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Using HP Haven with Hadoop
A Smarter Data Lake Needs…
Automatically analyse rich media
Connectors & Policies
HP IDOL Features
Integration points with Hadoop
Understand myriad file formats and types
Break down information silos across enterprise
Improved, intuitive visibility to contents
KeyView
IDOL Server (incl HDFS Sync)
Image Server & Video Server
Hadoop HDFS Synchronizer
Deep Hadoop integration with MPP M/R architecture and enterprise-class security
• Automate the complete picture
- Extracts the entire content of a given file
residing on HDFS
- Processing on HDFS
• Configuration -> Map Reduce
- Synchronized crawlers that translate
configurations into Map/Reduce processes
- No advanced programming necessary
• Leverage M/R
- Distributed MPP processing, data locality,
minimized network traffic
• Advanced analytics built in
- OCR, entity extraction, logo detection,
IDOL HDFS Sync: prepare data for analysis
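The "Configuration -> Map Reduce" idea above can be pictured as a declarative configuration that is turned into mapper and reducer functions, so that no hand-written job code is needed. The sketch below is plain Python and is not HP IDOL or Hadoop code: the `CONFIG` format, the regex patterns, and the names `build_mapper`, `reducer`, and `run` are all hypothetical, and simple regex matching stands in for the entity extraction mentioned on the slide.

```python
import re
from collections import Counter
from typing import Callable, Dict, Iterable, List, Tuple

Key = Tuple[str, str]  # (entity type, matched text)

# Hypothetical configuration: entity type -> regex that finds it.
CONFIG: Dict[str, str] = {
    "email": r"[\w.+-]+@[\w-]+\.[\w.]+",
    "year": r"\b(?:19|20)\d{2}\b",
}


def build_mapper(config: Dict[str, str]) -> Callable[[str], List[Tuple[Key, int]]]:
    """Translate the configuration into a mapper function."""
    compiled = [(name, re.compile(pattern)) for name, pattern in config.items()]

    def mapper(text: str) -> List[Tuple[Key, int]]:
        # Emit ((entity type, match), 1) for every match in one file's text.
        emitted: List[Tuple[Key, int]] = []
        for name, pattern in compiled:
            for match in pattern.findall(text):
                emitted.append(((name, match), 1))
        return emitted

    return mapper


def reducer(pairs: Iterable[Tuple[Key, int]]) -> Counter:
    """Sum the counts emitted for each (entity type, match) key."""
    totals: Counter = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def run(config: Dict[str, str], files: Dict[str, str]) -> Counter:
    """Apply the generated mapper to each file, then reduce all emitted pairs."""
    mapper = build_mapper(config)
    emitted: List[Tuple[Key, int]] = []
    for text in files.values():
        emitted.extend(mapper(text))
    return reducer(emitted)


if __name__ == "__main__":
    # Hypothetical file contents keyed by made-up HDFS-style paths.
    sample = {
        "/data/a.txt": "Contact ops@example.com in 2014",
        "/data/b.txt": "ops@example.com replied in 2014 and 2015",
    }
    print(run(CONFIG, sample))
```

Adding a new entity type in this sketch means adding one entry to `CONFIG`; the mapper is regenerated from it, which mirrors the "no advanced programming necessary" point in spirit only.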
Demo: Smart Content Hub
Hadoop Cluster
HDFS
HDFS
Connector
IDOL
Enterprise
Connectors
IDOL
Apps
Enterprise
Repositories
Cloud &
Web
Business
Users
HDFS
Sync
Hadoop Services
Edge
Node
Resource Slots
Compute Nodes
Hadoop and IDOL in practice
Case study: Hadoop and Big Data in healthcare
New use-cases enabled
• Population and Community Health
• Prediction capabilities (symptoms, ailments, outbreaks,
etc.)
• Clear picture of Community Health (attitudinal trends,
demographics, geospatial)
• International impact
• Benefit/Reference-based plan design
• Care Management/Care Coordination
• Combine with Claims to fill in gaps (symptomatic,
attitudes, education)
• Outcome Success
• Surveillance, Analysis, Product Development
Innovation
• Competitive intelligence
• Trends (attitudinal/behavioral, caregiving, device
usage, etc.)
• Monetized data insight opportunities
• Consumer Activation/Engagement/Education
• Consumer conversations, trends, blogs
• Interactive/participative approach
• Expand “Circles of Influence”
• Sets Quality Standards for Care/Providers
• Reputation Management/Outreach
• Sentiment management (competitor & brand)
• Outreach to support members, clients, providers
• Voice of the Customer
Claims Data
Treatment/
Service Data
Call Center Data
Innovation
Payment
integrity
Product
dev.
Care
delivery
Brand
Consumer activation
Providers
FWA
Recovery
Data
Provider
Information
Lines of business
Social Media
Challenges:
• Started with Payment Integrity Use-Case
• Dealing with evolving patterns of FWA
• Multiple payment systems, no single view
• No self-service
• Long turnaround time for BI analysis reports
HP solution:
• IDOL based solution
• Self-service analysis for business analysts
• Single point of access to multiple systems
• Dynamic rule-engine tests against new
and historical claims to identify potential
recoveries
• Scale out on Hadoop Architecture
• New data, use-cases being added
continually
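A dynamic rule engine of the kind listed above can be sketched as a set of named predicates that are re-run over both historical and new claims whenever a rule is added. This is an illustrative sketch only, not the HP solution: the `Claim` fields, the `RuleEngine` class, the rule names, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record; real claim schemas will differ."""
    claim_id: str
    provider_id: str
    amount: float
    procedure_code: str


Rule = Callable[[Claim], bool]


class RuleEngine:
    """Holds named rules that can be added at any time and re-applied."""

    def __init__(self) -> None:
        self._rules: Dict[str, Rule] = {}

    def add_rule(self, name: str, rule: Rule) -> None:
        self._rules[name] = rule

    def evaluate(self, claims: Iterable[Claim]) -> Dict[str, List[str]]:
        """Return claim_id -> names of the rules that flagged that claim."""
        flagged: Dict[str, List[str]] = {}
        for claim in claims:
            hits = [name for name, rule in self._rules.items() if rule(claim)]
            if hits:
                flagged[claim.claim_id] = hits
        return flagged


if __name__ == "__main__":
    historical = [
        Claim("H1", "P1", 12000.0, "A100"),
        Claim("H2", "P2", 300.0, "B200"),
    ]
    new = [Claim("N1", "P2", 50.0, "Z999")]

    engine = RuleEngine()
    engine.add_rule("high_amount", lambda c: c.amount > 10000)
    print(engine.evaluate(historical + new))

    # A rule added later is tested against historical claims as well as new ones.
    engine.add_rule("suspect_code", lambda c: c.procedure_code == "Z999")
    print(engine.evaluate(historical + new))
```

Keeping rules as data rather than hard-coded report logic is what lets an analyst respond to evolving fraud, waste, and abuse patterns without waiting on a new report.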
ROI:
• 24x improvement in analysis turnaround
• Millions of dollars saved in first few weeks
Using HP IDOL with Hadoop
• Reduce cost, time, and expertise
required to gain actionable insight
• Empower business users to
interact with Hadoop data
• Real-time and interactive
• IDOL’s advanced analysis of data
• Connects all data-types
• Standardized data model
RETURN ON
INFORMATION
Securely perform enterprise-class analysis of Hadoop data
www.hp.com/go/haven
hortonworks.com/partner/hp/
Solution brochure
Technical white paper
HP Vertica SQL on Hadoop
FAQ
Customer analytics use case
Learn more about HP Haven
Next Steps…
Download the Hortonworks Sandbox
Learn Hadoop
Build Your Analytic App
Try Hadoop 2
More about HP & Hortonworks
http://hortonworks.com/partner/HP/
Contact us: events@hortonworks.com

Weitere ähnliche Inhalte

Was ist angesagt?

Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationHortonworks
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataHortonworks
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalHortonworks
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks
 
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextHortonworks
 
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...Hortonworks
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...Hortonworks
 
Hortonworks and HP Vertica Webinar
Hortonworks and HP Vertica WebinarHortonworks and HP Vertica Webinar
Hortonworks and HP Vertica WebinarHortonworks
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks
 
Apache Hadoop on the Open Cloud
Apache Hadoop on the Open CloudApache Hadoop on the Open Cloud
Apache Hadoop on the Open CloudHortonworks
 
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Hortonworks
 
Enterprise Apache Hadoop: State of the Union
Enterprise Apache Hadoop: State of the UnionEnterprise Apache Hadoop: State of the Union
Enterprise Apache Hadoop: State of the UnionHortonworks
 
Cloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarCloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarHortonworks
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Hortonworks
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsHortonworks
 
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3Hortonworks
 
Spark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark Summit
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks
 

Was ist angesagt? (20)

Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop Implementation
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptx
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_final
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - Webinar
 
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
 
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
 
Hortonworks and HP Vertica Webinar
Hortonworks and HP Vertica WebinarHortonworks and HP Vertica Webinar
Hortonworks and HP Vertica Webinar
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?
 
Apache Hadoop on the Open Cloud
Apache Hadoop on the Open CloudApache Hadoop on the Open Cloud
Apache Hadoop on the Open Cloud
 
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
 
Enterprise Apache Hadoop: State of the Union
Enterprise Apache Hadoop: State of the UnionEnterprise Apache Hadoop: State of the Union
Enterprise Apache Hadoop: State of the Union
 
Cloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarCloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinar
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data Analytics
 
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
 
Spark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun Murthy
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration
 

Andere mochten auch

Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataHortonworks
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceHortonworks
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Hortonworks
 
Three Steps to Modern Media Asset Management with Active Archive
Three Steps to Modern Media Asset Management with Active ArchiveThree Steps to Modern Media Asset Management with Active Archive
Three Steps to Modern Media Asset Management with Active ArchiveAvere Systems
 
Digital Media Ingest and Storage Options on AWS
Digital Media Ingest and Storage Options on AWSDigital Media Ingest and Storage Options on AWS
Digital Media Ingest and Storage Options on AWSAmazon Web Services
 
SUN TV NETWORK LIMITED
SUN TV NETWORK LIMITEDSUN TV NETWORK LIMITED
SUN TV NETWORK LIMITEDARVIND D
 
ximena araneda - The Next Generation MAM Systems
ximena araneda - The Next Generation MAM Systemsximena araneda - The Next Generation MAM Systems
ximena araneda - The Next Generation MAM SystemsFIAT/IFTA
 
Hortonworks Technical Workshop - HDP Search
Hortonworks Technical Workshop - HDP Search Hortonworks Technical Workshop - HDP Search
Hortonworks Technical Workshop - HDP Search Hortonworks
 
MED201 Media Ingest and Storage Solutions with AWS - AWS re: Invent 2012
MED201 Media Ingest and Storage Solutions with AWS - AWS re: Invent 2012MED201 Media Ingest and Storage Solutions with AWS - AWS re: Invent 2012
MED201 Media Ingest and Storage Solutions with AWS - AWS re: Invent 2012Amazon Web Services
 
Deploying Docker applications on YARN via Slider
Deploying Docker applications on YARN via SliderDeploying Docker applications on YARN via Slider
Deploying Docker applications on YARN via SliderHortonworks
 
Hadoop crashcourse v3
Hadoop crashcourse v3Hadoop crashcourse v3
Hadoop crashcourse v3Hortonworks
 
Hortonworks tech workshop in-memory processing with spark
Hortonworks tech workshop   in-memory processing with sparkHortonworks tech workshop   in-memory processing with spark
Hortonworks tech workshop in-memory processing with sparkHortonworks
 
HPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare TransformationHPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare TransformationHortonworks
 
Data-Ed: Best Practices with the Data Management Maturity Model
Data-Ed: Best Practices with the Data Management Maturity ModelData-Ed: Best Practices with the Data Management Maturity Model
Data-Ed: Best Practices with the Data Management Maturity ModelData Blueprint
 
Технологии blockchain в здравоохранении
Технологии blockchain в здравоохраненииТехнологии blockchain в здравоохранении
Технологии blockchain в здравоохраненииSerge Dobridnjuk
 
Hortonworks Technical Workshop: HBase For Mission Critical Applications
Hortonworks Technical Workshop: HBase For Mission Critical ApplicationsHortonworks Technical Workshop: HBase For Mission Critical Applications
Hortonworks Technical Workshop: HBase For Mission Critical ApplicationsHortonworks
 
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...Zaloni
 
Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Hortonworks
 

Andere mochten auch (20)

Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big Data
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data Governance
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
 
Three Steps to Modern Media Asset Management with Active Archive
Three Steps to Modern Media Asset Management with Active ArchiveThree Steps to Modern Media Asset Management with Active Archive
Three Steps to Modern Media Asset Management with Active Archive
 
Digital Media Ingest and Storage Options on AWS
Digital Media Ingest and Storage Options on AWSDigital Media Ingest and Storage Options on AWS
Digital Media Ingest and Storage Options on AWS
 
SUN TV NETWORK LIMITED
SUN TV NETWORK LIMITEDSUN TV NETWORK LIMITED
SUN TV NETWORK LIMITED
 
ximena araneda - The Next Generation MAM Systems
ximena araneda - The Next Generation MAM Systemsximena araneda - The Next Generation MAM Systems
ximena araneda - The Next Generation MAM Systems
 
Sun Tv
Sun TvSun Tv
Sun Tv
 
Hortonworks Technical Workshop - HDP Search
Hortonworks Technical Workshop - HDP Search Hortonworks Technical Workshop - HDP Search
Hortonworks Technical Workshop - HDP Search
 
MED201 Media Ingest and Storage Solutions with AWS - AWS re: Invent 2012
MED201 Media Ingest and Storage Solutions with AWS - AWS re: Invent 2012MED201 Media Ingest and Storage Solutions with AWS - AWS re: Invent 2012
MED201 Media Ingest and Storage Solutions with AWS - AWS re: Invent 2012
 
Deploying Docker applications on YARN via Slider
Deploying Docker applications on YARN via SliderDeploying Docker applications on YARN via Slider
Deploying Docker applications on YARN via Slider
 
Vidispine
VidispineVidispine
Vidispine
 
Hadoop crashcourse v3
Hadoop crashcourse v3Hadoop crashcourse v3
Hadoop crashcourse v3
 
Hortonworks tech workshop in-memory processing with spark
Hortonworks tech workshop   in-memory processing with sparkHortonworks tech workshop   in-memory processing with spark
Hortonworks tech workshop in-memory processing with spark
 
HPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare TransformationHPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare Transformation
 
Data-Ed: Best Practices with the Data Management Maturity Model
Data-Ed: Best Practices with the Data Management Maturity ModelData-Ed: Best Practices with the Data Management Maturity Model
Data-Ed: Best Practices with the Data Management Maturity Model
 
Технологии blockchain в здравоохранении
Технологии blockchain в здравоохраненииТехнологии blockchain в здравоохранении
Технологии blockchain в здравоохранении
 
Hortonworks Technical Workshop: HBase For Mission Critical Applications
Hortonworks Technical Workshop: HBase For Mission Critical ApplicationsHortonworks Technical Workshop: HBase For Mission Critical Applications
Hortonworks Technical Workshop: HBase For Mission Critical Applications
 
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
 
Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014
 

Create a Smarter Data Lake with HP Haven and Apache Hadoop

  • 1. © Hortonworks Inc. 2011 – 2014. All Rights Reserved. Q&A box is available for your questions. Webinar will be recorded for future viewing. Thank you for joining! We’ll get started soon…
  • 2. Create a Smarter Data Lake with HP Haven and Apache Hadoop. We do Hadoop.
  • 3. Your speakers: Ajay Singh, Director of Technical Channels, Hortonworks; Will Gardella, Director of Product Management, Big Data, HP.
  • 4. Traditional systems under pressure
      • Challenges: constrains data to app; can’t manage new data; costly to scale
      • Figure labels: Business Value; Traditional: ERP, CRM, SCM; New Data: Clickstream, Geolocation, Web Data, Internet of Things, Docs and emails, Server logs; 2012: 2.8 zettabytes; 2020: 40 zettabytes; Laggards; Industry Leaders
  • 5. Hadoop emerged as foundation of new data architecture
      • Apache Hadoop is an open source data platform for managing large volumes of high velocity and variety of data.
      • Built by Yahoo! to be the heartbeat of its ad & search business
      • Donated to Apache Software Foundation in 2005 with rapid adoption by large web properties & early adopter enterprises
      • Hadoop advantages: manages new data paradigm; handles data at scale; cost effective; open source
      • Diagram labels: Application; Storage: HDFS; Batch Processing: MapReduce
  • 6. Hadoop for the Enterprise: Implement a Modern Data Architecture with HDP
      • Customer Momentum: 230+ customers (as of Q3 2014)
      • Hortonworks Data Platform: completely open multi-tenant platform for any app & any data; a centralized architecture of consistent enterprise services for resource management, security, operations, and governance
      • Partner for Customer Success: open source community leadership focused on enterprise needs; unrivaled world-class support
      • Founded in 2011; original 24 architects, developers, operators of Hadoop from Yahoo!; 600+ employees; 800+ ecosystem partners
  • 7. HDP delivers a completely open data platform: Hortonworks Data Platform 2.2
      • Hortonworks Data Platform provides Hadoop for the Enterprise: a centralized architecture of core enterprise services, for any application and any data.
      • Completely open: HDP incorporates every element required of an enterprise data platform: data storage, data access, governance, security, operations.
      • All components are developed in open source and then rigorously tested, certified, and delivered as an integrated open source platform that’s easy to consume and use by the enterprise and ecosystem.
      • Diagram labels: YARN: Data Operating System (Cluster Resource Management); HDFS (Hadoop Distributed File System); Governance: Apache Falcon; Batch, interactive & real-time data access: Apache Pig, Apache Hive, Cascading, Apache HBase, Apache Accumulo, Apache Solr, Apache Spark, Apache Storm; also shown: Apache Sqoop, Apache Flume, Apache Kafka; Security: Apache Ranger, Apache Knox, Apache Falcon; Operations: Apache Ambari, Apache ZooKeeper, Apache Oozie
  • 8. HP & Hortonworks
  • 9. HP & Hortonworks: An Integrated Part of a Modern Data Architecture. Smart Content Hub Solution Architecture.
  • 10. © Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The Opportunity
  • 11. Accelerating business outcomes. Data lakes: the new enterprise data hub
      • Diagram labels: Social media, IT/OT, Images, Audio, Video, Transactional data, Mobile, Search engine, Email, Texts, Documents; Business outcomes
      • Hadoop Distributed File System (HDFS): self-healing, high bandwidth, cheap clustered storage
      • Map/Reduce: distributed computing framework
  • 12. Analyzing data is a multi-step process
      • Data type: Structured tables, Semi-structured, Unstructured, Documents, Images, Audio, Video
      • Speed: Batch, Interactive, Real-time
      • Process: Acquisition, Preparation, Visualization, Analysis, Presentation, Collaboration
      • Skill set: Business users, Programmer, Database expert, Statistician, Mathematician, Subject Matter Expert
      • Types: Descriptive, Diagnostic, Predictive, Prescriptive
      • Requires easy-to-use tools to meet a wide range of skills
  • 13. Challenge: Barriers between business users and actionable information
      • Diagram labels: Business users, Data Scientists, Programmers; Batch, Data Cleansing, Programming, Statistics; Reports, Information Requests; Hadoop
  • 14. Contextual search, Data exploration, Image/video analytics, Geospatial analytics, SQL on Hadoop, Accelerated analytics, Sentiment analysis, Predictive analytics. HP Haven Big Data platform: Access, Explore, Enrich, Analyze, Predict, Serve, Act, and more... Core big data business capabilities. On-premise. In the Cloud. Industry-leading breadth & depth of capabilities
  • 15. A Smarter Data Lake • Any Source − Build, enrich, and clean up your data lake • Data Clarity & Mapped Security – Data dictionary and information security within your data lake • Advanced Analytics - Provide contextual search and text, image, video, speech machine learning Fast Analytics with HP Vertica on Hadoop • The fastest and most advanced SQL analytics on Hadoop • Operationalize, democratize and monetize all your data • Data tiering – pick the best location and format for your data HP Haven for Hadoop
  • 16. 2D/3D clustering, Acoustic signature, Active matching, Agents, Alerting, Auto language detection, Auto query guidance, Boolean & legacy, Operations, Breaking news clustering, Categorization, Collaboration, Community, Concept highlighting, Concept-query, Summarization, Conceptual retrieval, Context summarization, Cross-modal suggest, Dynamic n-dimensional, Taxonomy generation, Dynamic XML, Consumption, Exact phrase matching, Expertise location, Explicit profiling, Face recognition, Field modulation, Frame analysis, Fuzzy matching, Hot clustering, Hyperlinking, Image analysis, Image association, Implicit profiling, Keyword search, Mail object identification, Melody classification, Melody identification, Metadata recognition, Natural language retrieval, Object identification, Object recognition, Ontology generation, Parametric refinement, Phrase spotting, Proper name identification, Query by example, Real-time aggregation, Routing, Scene detection, Script alignment, Sentiment analysis, Sound matching, Speaker identification, Speaker recognition, Spectrographic analysis, Spell checking, Tag reconciliation, Transcription, Video analysis, Voice printing, Word spotting, Work groups, XML tagging… Analyze, Enrich, Find, Act. HP IDOL: Act on 100% of your information. Transactional data, IOT, Search engine, Images, Social media, Video, Audio, Mobile, Documents, Texts, Email, Customer communications. Language independent. News, Forums, Blogs …and more. Enterprise, External and Cloud. HP Archiving, HP Data Protection, HP Marketing Optimization …and more. HP IDOL: 500+ powerful HP IDOL functions
  • 17. © Copyright 2014 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. Using HP Haven with Hadoop
  • 18. A Smarter Data Lake Needs… Automatically analyse rich media. Connectors & Policies. HP IDOL Features. Integration points with Hadoop. Understand myriad file formats and types. Break down information silos across the enterprise. Improved, intuitive visibility to contents. KeyView. IDOL Server (incl. HDFS Sync). Image Server & Video Server
  • 19. Hadoop HDFS Synchronizer Deep Hadoop integration with MPP M/R architecture and enterprise-class security • Automate the complete picture - Extracts the entire content of a given file residing on HDFS - Processing on HDFS • Configuration -> Map Reduce - Synchronized crawlers that translate configurations into Map/Reduce processes - No advanced programming necessary • Leverage M/R - Distributed MPP processing, data locality, minimized network traffic • Advanced analytics built in - OCR, entity extraction, logo detection, IDOL HDFS Sync: prepare data for analysis
  • 20. Demo: Smart Content Hub. Hadoop Cluster, HDFS, HDFS Connector, IDOL, Enterprise Connectors, IDOL Apps, Enterprise Repositories, Cloud & Web, Business Users, HDFS Sync, Hadoop Services, Edge Node, Resource Slots, Compute Nodes
  • 21. Hadoop and IDOL in practice
  • 22. Case study: Hadoop and Big Data in healthcare 2 New use-cases enabled • Population and Community Health • Prediction capabilities (symptoms, ailments, outbreaks, etc.) • Clear picture of Community Health (attitudinal trends, demographics, geospatial) • International impact • Benefit/Reference-based plan design • Care Management/Care Coordination • Combine with Claims to fill in gaps (symptomatic, attitudes, education) • Outcome Success • Surveillance, Analysis, Product Development Innovation • Competitive intelligence • Trends (attitudinal/behavioral, caregiving, device usage, etc.) • Monetized data insight opportunities • Consumer Activation/Engagement/Education • Consumer conversations, trends, blogs • Interactive/participative approach • Expand “Circles of Influence” • Sets Quality Standards for Care/Providers • Reputation Management/Outreach • Sentiment management (competitor & brand) • Outreach to support members, clients, providers • Voice of the Customer. Diagram labels: Claims Data, Treatment/Service Data, Call Center Data, Innovation, Payment integrity, Product dev., Care delivery, Brand, Consumer activation, Providers, FWA Recovery Data, Provider Information, Lines of business, Social Media. Challenges: • Started with Payment Integrity use case • Dealing with evolving patterns of FWA • Multiple payment systems, no single view • No self-service • Long turnaround time for BI analysis reports HP solution: • IDOL-based solution • Self-service analysis for business analysts • Single point of access to multiple systems • Dynamic rule engine tests against new and historical claims to identify potential recoveries • Scale out on Hadoop architecture • New data and use cases being added continually ROI: • 24x improvement in analysis turnaround • Millions of dollars saved in first few weeks
  • 23. Using HP IDOL with Hadoop • Reduce cost, time, and expertise required to gain actionable insight • Empower business users to interact with Hadoop data • Real-time and interactive • IDOL’s advanced analysis of data • Connects all data-types • Standardized data model RETURN ON INFORMATION Securely perform enterprise-class analysis of Hadoop data
  • 24. www.hp.com/go/haven hortonworks.com/partner/hp/ Solution brochure Technical white paper HP Vertica SQL on Hadoop FAQ Customer analytics use case Learn more about HP Haven
  • 25. Next Steps… Download the Hortonworks Sandbox Learn Hadoop Build Your Analytic App Try Hadoop 2 More about HP & Hortonworks http://hortonworks.com/partner/HP/ Contact us: events@hortonworks.com

Editor's notes

  1. Before we dive into Hadoop and its role within the modern data architecture, let’s set the context for why Hadoop has become important. Existing approaches to data management have become both technically and commercially impractical. Technically, these systems were never designed to store or process vast quantities of data. Commercially, the licensing structures of the traditional approach are no longer feasible. These two challenges, combined with the rate at which data is being produced, created a need for a new approach to data systems. If we fast-forward another 3 to 5 years, more than half of the data under management within the enterprise will be from these new data sources.
  2. Enter Hadoop. Faced with this challenge, the team at Yahoo conceived and created Apache Hadoop to address it. They were then convinced that contributing this platform to an open community would speed innovation. They open sourced the technology and did so within the governance of the Apache Software Foundation (ASF). This introduced two distinct, significant advantages: not only could they manage new data types at scale, but they now had a commercially feasible approach. However, there were still significant challenges. The first generation of Hadoop was designed and optimized for batch-only workloads, it required dedicated clusters for each application, and it didn’t integrate easily with many of the existing technologies present in the data center. Also, like any emerging technology, Hadoop had to meet the level of readiness required by the enterprise. After running Hadoop at scale at Yahoo, the team spun out to form Hortonworks with the intent to address these challenges and make Hadoop enterprise ready.
  3. Hortonworks has a singular focus: enabling Apache Hadoop as an enterprise data platform for any app and any data type. We were founded in 2011 by 24 developers from Yahoo, where Hadoop was conceived to address data challenges at internet scale. What we now know as Hadoop really started in 2005, when a team at Yahoo was directed to build out a large-scale data storage and processing technology that would allow them to improve their most critical application, Search. Their challenge was essentially two-fold. First they needed to capture and archive the contents of the internet, and then process the data so that users could search through it effectively and efficiently. Clearly, traditional approaches were both technically (due to the size of the data) and commercially (due to the cost) impractical. The result was the Apache Hadoop project, which delivered large-scale storage (HDFS) and processing (MapReduce). Today we have over 600 employees and have partnered with over 900 companies who are the leaders in the data center. We have also been very fortunate to achieve significant customer adoption, with over 230 customers as of Q3 2014, spanning nearly every vertical. Hortonworks was founded with the sole intent to make Hadoop an enterprise data platform. With YARN as its foundation, HDP delivers a centralized architecture with true multi-tenancy for data processing and shared services for Security, Governance and Operations to satisfy enterprise requirements, all deeply integrated and certified with leading datacenter technologies. We are uniquely focused on this transformation of Hadoop and on doing our work completely in open source. This is all predicated on our leadership in the community, which enables us not only to best support users but also to represent customer requirements within this open, thriving community.
  4. Our product, the Hortonworks Data Platform (or HDP for short), is a completely open source, enterprise-grade data platform composed of dozens of Apache open source projects, with Apache Hadoop and YARN at its center. We have a comprehensive engineering, testing, and certification process that integrates and packages all of these components into a cohesive platform that the enterprise can consume and deploy at scale. And our model enables us to proactively manage new innovations and new open source projects into HDP as they emerge. To ensure the highest quality, we have a test suite, unique to Hortonworks, made up of tens of thousands of system and integration tests that we run at scale on a regular basis, including on the world’s largest Hadoop clusters at Yahoo! as part of our co-development relationship. While our pure-play competitors focus on proprietary components for security, operations, and governance, we invest in new open source projects that address these areas. For example, earlier in 2014, we acquired a small company called XA Secure that provided a comprehensive security and administration product. We flipped the technology wholesale into open source as Apache Ranger. Since our security, operations and governance technologies are open source projects, our partners are able to work with us on those projects to ensure deep integration within our joint solution architectures.
  5. As the information era continues to generate massive volumes of data in different formats, organizations are looking for more efficient means of storing and analyzing that data in a standardized way across the different lines of business. Many are turning to Hadoop because it lends itself nicely to this problem, offering efficiencies that other platforms don’t. There are still problems, though.
  6. There are multiple dimensions of complexity when trying to get insights from data stored in a Hadoop system that’s leveraged at scale. 1) There are of course the types of analysis that need to be done, each with its own set of requirements and subtle complexities. Does this department or business need a predictive engine? Prescriptive? Do the data and data model support the kinds of questions I need to ask of the data? There also aren’t many analytics that can be used or enabled without significant effort. For the most part, Hadoop allows you to store the data as is. There are some open source engines on top of Hadoop that help you ask the hard questions, but they all use a different set of tools and APIs. 2) Then there’s the processing and delivery of results. What is the delivery / consumption model that works best with the problems I’m looking to solve? Does it align with the types of analysis I want to perform? 3) Most importantly, there are data considerations. The many data types being used in the wild require fundamentally different methods to access and manage the information inside. Machine data, human information and structured data typically require fundamentally different approaches to analysis, which in turn require separate analytics engines. So data is really everything. All analytics decisions hinge on whether we can access what’s inside and what we can do with it. 4) Coupled with the skill set of the business users and the batch-oriented processing of Hadoop, that leaves most organizations with a model that forces them to innovate slowly and by use case, rather than dealing with the real root of the issue, which is finding a way to access and collectively analyze all the data efficiently, through standardized, real-time procedures that are self-service and uniform across all the data. So the issues are now on the table.
Hadoop is a powerful toolset that gives enterprises a means to an end when it comes to understanding and acting on their data. The question is, how do I give it that extra edge? The short answer is… IDOL. Now let’s talk about how IDOL helps to fill that void.
  7. Key Messages: One of the largest challenges in getting value from Hadoop investments is the disconnect between business users and the data scientists. They speak different languages. It is hard to collaborate on the same data due to a lack of tools. Business users often have subject matter expertise, but don’t know technical data-science concepts. Data scientists know how to manipulate the data and extract value, but don’t know the nuances of the business. IDOL enables both business users and data scientists with: interactive exploration of the data; non-SQL graphical navigation of data; collaboration features to share insights.
  8. Powerful customer examples to lead off. Market/industry landscape/trends (what is today’s reality?). What problems does this cause for the customer? What do you need to do to fix the problem? (Here are 3-5 requirements.) What are some issues with traditional solutions? (Talk about challenges of human information, keyword search, etc.) What is the answer? Our solution, powered by IDOL (what is HP enterprise search, what is IDOL). Why do people like you choose our solution? (Powered by IDOL, Gartner MQ, KPIs, features, IDOL is key to HP’s strategy.) Illustrate today vs tomorrow with our technology. Summary slide.
  9. So in the end, using MapReduce, we’re able to translate the fetch / indexing activities into a discrete set of concurrent tasks within the racks running Hadoop. This accomplishes three things: 1) It translates the HP connectivity processing into a Hadoop best practice by harnessing MapReduce. Instead of the point-and-shoot architecture of most connectors, we’re building a native plugin that can be used to process and analyze your data within the Hadoop ecosystem. IDOL becomes a native plugin for Hadoop. It also translates what can be exhausting and complex code to write into configuration-driven analytics processing. 2) By conforming to Hadoop best practices, we’re able to create a faster and more efficient means of processing the data so that it can be sent off to IDOL for analysis in the same way it has been in the past. We’re not just using IDOL distribution anymore; we’re also leveraging the MPP capabilities of Hadoop to do the heavy lifting for us. 3) IDOL is able to incorporate its industry-leading analytics capabilities into OOTB functions that can be turned on via configuration rather than through complex programmatic integration. IDOL’s ingestion pipeline can do many things, we all know that, but being able to leverage those functions in a streamlined, configuration-driven manner has huge advantages over the more brute-force programming methodologies employed by other vendors. Many people don’t want to have to code their way through these issues. Just enable the features and that’s it. There’s a lot of value there, and that’s just the connectors. But the connector is just part of the story, i.e. the data processing (ETL) and preparation before the data finally gets loaded into IDOL. It’s an important job, but just one part of the architecture. Once the data is in IDOL, that’s when the really interesting things happen, because that’s when we start to expose the powerful functions and capabilities of the platform.
Stateful functions like retrieval, classification, clustering and many more become available to both explore and analyze your data in real time. Let’s look at the big picture now…
  10. Key Messages: Unlike other technologies that simply read HDFS as a file system, IDOL is integrated deeply into the Hadoop architecture. It takes advantage of the MPP compute power of Hadoop. It deals with multi-tenancy and data with different security rights and privileges. It provides advanced analytics for all data types.
  11. So now we’ll take a look at a use case that is becoming more and more common as different organizations adopt Hadoop and look to streamline data storage and analysis across the different lines of business that IT needs to support.
  12. Key Messages: Large diversified healthcare company, acts as a payer & provider. Claims are the life-blood of their operations; they used traditional data warehouse, BI, and statistical tools. Challenges: Business SMEs with knowledge of payments processes are not data scientists. Report generation took a long time: 30-45 days. The two groups did not speak the same language. Constant pressure to reduce Fraud, Waste, and Abuse (FWA). Payment Integrity was an early user of analytics and was identified as a high-ROI target for Hadoop and analytics. This is challenging because patterns of providers and fraud are constantly changing. Changes in regulations & contracts, plus errors in data entry and process, can result in incorrect payments. Government estimates that $50B of $500B on Medicare is lost to FWA; private health insurers are also affected. IDOL solved this problem by providing self-service analytics to business users and data scientists. Hadoop is being used to scale out to all payment systems. New data sources and use cases are being added constantly, enabling a wide variety of lines of business. Has potential for very big impact on the organization.
  13. So the issues are now on the table. Hadoop on its own isn’t enough. How do I create a real-time, efficient, all-encompassing, and multitenant environment to glean all the valuable insights contained within Hadoop? By pairing IDOL with Hadoop, you can leverage IDOL to: Supercharge your analytics: instead of writing complicated and time-consuming MapReduce or YARN scripts that are mostly batch oriented, use the real-time advanced analytics techniques built directly into IDOL. Democratize data and analysis: IDOL also offers something very unique for Hadoop. By reducing the complexities of data processing to configuration and offering a common analytics API, analysis and data management become a self-service function through a common, standardized RESTful API that is simple and easy to use. Business intelligence is enabled across a wider set of content. Leverage 100% of the data for analysis: by ingesting data into IDOL, you’re not just able to execute the analytics faster, you’re able to expand the scope of your analytics to cover more data types beyond the most common. Later I’ll show you how we can take the standard keyword counter example using Hadoop and turn it on its head by simply asking IDOL or leveraging some of its core libraries. Reduce costs and complexity: think about even the easiest problems to solve with Hadoop. Give me your best Hadoop technician and I’ll show you someone who needs a few hours, if not a couple of days, to write scripts that work. Nothing ever works the first time with batch-oriented processing. IDOL enables you to ask complex questions and get real-time answers. Getting answers faster saves time and money, and frees your resources to make decisions against data they now fully understand.
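For readers who have not seen it, the "standard keyword counter example" mentioned in note 13 can be sketched in a few lines. This is a minimal, single-machine illustration of the map and reduce steps in plain Python, not Hadoop's or IDOL's API; the function names `map_count`, `reduce_counts`, and `keyword_count` are hypothetical and chosen only for this sketch, and a thread pool stands in for the cluster's concurrent map tasks.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_count(chunk: str) -> Counter:
    """Map step: count the (lowercased) words in one chunk of text."""
    return Counter(word.lower() for word in chunk.split())


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial counts into one."""
    return left + right


def keyword_count(chunks: list[str]) -> Counter:
    """Run the map step on each chunk concurrently, then fold the results.

    Assumption: a thread pool stands in for a cluster's map tasks here;
    a real Hadoop job would distribute chunks across nodes instead.
    """
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_count, chunks))
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    counts = keyword_count(["claims data claims", "Data lake"])
    print(counts.most_common())
```

Even this toy version shows the shape of the work the notes describe: the input is split into independent chunks, each chunk is processed on its own, and the partial results are merged at the end.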