Page 1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Optimizing the Modern Data Architecture
with Attunity, Hortonworks and RCG Global Services
We do Hadoop.
Speakers
•  Hortonworks
◦  Adis Cesir, Big Data Solution Engineer
•  RCG Global Services
◦  Ramu Kalvakuntla, Principal, Big Data Practice
•  Attunity
◦  Santosh Chitakki, Director of Product Management
Partnership
•  Strategy and Solution Delivery
•  Hadoop Distribution, Support and Training
•  Any Data, Anywhere, Anytime
RCG Global Services, Hortonworks and Attunity are partnering to provide an EDW optimization solution that delivers real financial benefits by effectively implementing Apache Hadoop to augment current EDW platforms.
Traditional systems under pressure
Challenges
•  Can’t manage new data
•  Constrains data to app
•  Costly to scale
[Chart: business value of traditional data (ERP, CRM, SCM) and new data (clickstream, geolocation, web data, Internet of Things, docs and emails, server logs), contrasting industry leaders with laggards; data volume grows from 2.8 zettabytes in 2012 to 40 zettabytes in 2020.]
A Typical EDW Faces Three Challenges
1.  Data Storage: storing cold data or throwing data away
2.  Processing Capacity: wasting processing cycles on low value workloads
3.  New Data Sources: unable to capture and use new data
[Diagram: data systems (systems of record: RDBMS, ERP, CRM, other) feed analytics (data marts, business analytics, visualization & dashboards); new sources are clickstream, web & social, geolocation, sensor & machine, server logs, and unstructured data. Callouts 1, 2 and 3 mark the three challenges.]
Most EDWs Are Used Inefficiently
In a typical EDW*:
1.  Data Storage:
–  More than 50% of data is unused
2.  Processing Capacity:
–  55% of CPU capacity is ETL
–  35% of CPU consumed by ETL is to load unused data
–  30-40% of CPU is consumed by only 5% of ETL workloads
[Diagram: data systems (systems of record: RDBMS, ERP, CRM, other) feed analytics (data marts, business analytics, visualization & dashboards); data is classed as hot, warm, or cold.]
Why pay first class price for economy data?
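The processing figures on this slide can be combined into a rough estimate. The sketch below assumes the 35% figure is a share of ETL CPU rather than of total CPU (one reading of the slide's wording); the variable names are illustrative.

```python
# Shares taken from the slide; variable names are illustrative.
etl_share_of_cpu = 0.55         # 55% of CPU capacity is ETL
unused_share_of_etl_cpu = 0.35  # assumed reading: 35% of ETL CPU loads unused data

# Share of total warehouse CPU spent loading data that is never used.
wasted_share_of_cpu = etl_share_of_cpu * unused_share_of_etl_cpu

print(f"{wasted_share_of_cpu:.1%} of total CPU loads unused data")
```

Under that reading, a little under one fifth of total warehouse CPU goes to loading data nobody queries.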
Optimization: Realize Cost Savings with HDP
Archive data away from the EDW
•  Move cold or rarely used data to Hadoop
as active archive
•  Store more data longer
Offload costly ETL processes
•  Free your EDW to perform high-value functions like
analytics & operations, not ETL
•  Use Hadoop for advanced ELT
Enrich the value of your EDW
•  Use Hadoop to refine new data sources, such as
web and machine data, for new analytical context
HDP helps you reduce costs and optimize the value associated with your EDW
[Diagram: new sources (clickstream, web & social, geolocation, sensor & machine, server logs, unstructured) and existing systems (ERP, CRM, SCM) feed HDP, drawn as HDFS (Hadoop Distributed File System) beneath YARN: Data Operating System, running batch, interactive, real-time and partner ISV workloads alongside the MPP and EDW platforms; both serve analytics (data marts, applications, business analytics, visualization & dashboards).]
Challenge with traditional Architecture
•  Time spent understanding source data and defining destination structure
•  High latency between data generation and availability
•  Incapable/high complexity when dealing with loosely structured data
•  No linear scale
•  High license cost
•  Large code footprint
•  Data discarded due to cost or performance
•  Low or no visibility into transactional data
•  EDW used as an ETL tool with 100s of staging tables
[Diagram: source layer (DB, structured data) feeds ETL / ELT (data collection & processing), then the EDW (integration, storage & business view), then data marts (business / department specific).]
Offload/Archive/Process – Hadoop based Platform
•  Store transactional data
•  Retain 7+ years of data (Hot archive)
•  Data Lineage – ability to store intermediate data sets
•  Becomes an analytics platform for data scientists
•  Linearly scalable commodity hardware
•  Massively parallel compute and storage
•  Support for any type of data: structured or unstructured with any volume and velocity
•  Data Warehouse can now focus less on storage and transformation and more on presentation
[Diagram: the source layer (DB, structured data, clickstream, social, geo, sensor, server logs, unstructured) feeds a Hadoop cluster of N nodes for data collection, integration, storage and processing (integrate, transform, archive, enrich), which in turn feeds the EDW and data marts.]
Optimization Customer Stories
Archive
TrueCar stores data on
millions of car purchases at
$0.12 per GB with HDP, well
below the $19 per GB
possible with other solutions.
Offload
Luminar cut its ETL
processing times from 3 days
to 3 hours with HDP, quickly
refreshing its models with new
customer transaction data.
Enrich
ZirMed enriches its EDW with
new data, including pharmacy
receipts, text messages, and
patient web searches.
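The two prices quoted in the Archive story above imply a large per-gigabyte gap. As a rough sketch of the arithmetic, the snippet below applies both prices to a hypothetical 1 TB archive; the volume and the function name are assumptions for illustration, not figures from the deck.

```python
HDP_COST_PER_GB = 0.12     # quoted for TrueCar on HDP
OTHER_COST_PER_GB = 19.00  # quoted for other solutions

def storage_cost(gigabytes, cost_per_gb):
    """Storage cost in dollars for a given volume."""
    return gigabytes * cost_per_gb

archive_gb = 1_000  # hypothetical volume: 1 TB expressed as 1,000 GB
hdp_cost = storage_cost(archive_gb, HDP_COST_PER_GB)
other_cost = storage_cost(archive_gb, OTHER_COST_PER_GB)
cost_ratio = OTHER_COST_PER_GB / HDP_COST_PER_GB

print(f"HDP: ${hdp_cost:,.0f}  other: ${other_cost:,.0f}  ratio: {cost_ratio:.0f}x")
```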
Hadoop Driver: Enabling the Data Lake
Data Lake Definition
•  Centralized Architecture
Multiple applications on a shared data set with consistent levels of service
•  Any App, Any Data
Multiple applications accessing all data affording new insights and opportunities.
•  Unlocks ‘Systems of Insight’
Advanced algorithms and applications used to derive new value and optimize existing value.
Drivers:
1.  Cost Optimization
2.  Advanced Analytic Apps
Goal:
•  Centralized architecture
•  Data-driven business
[Chart: Journey to the Data Lake with Hadoop, plotting scale against scope, from the two drivers toward the data lake and systems of insight.]
Modern Data Architecture
•  Reduce cost and improve performance by
off-loading EDW data and processing to the
Hortonworks Data Platform (HDP)
•  Implement a platform that scales
incrementally using low cost hardware and
software
•  Support unstructured, semi-structured and
structured data in a single analytics
platform
•  Enable superior analytic capabilities
providing insight that is not possible to
achieve from their current environments
•  Provide seamless access to data for
analysis and business applications
Solution Model - Modern Data Architecture
•  EDW Optimization Roadmap: Identify offload candidates, create architectural blueprint, implementation roadmap, business case and ROI
•  EDW Optimization Implementation: Execute data and ETL/ELT off-load, active archive, implement data ingestion and data service
•  Data Value Realization: Provide insight, data in motion, advanced analytics, information value creation, and visualization
•  Enterprise Enablement: Enterprise access, enriched data sources, service orchestration and data virtualization
EDW Optimization – Roadmap and Analysis
Current State Analysis
•  Assess current reporting, ELT/ETL, and analytical processes
•  Review logical and physical data models
•  Assess current technical architecture
Data Usage Analysis
•  Analyze data usage
•  Identify under-utilized schemas, tables / columns, and data
•  Identify off-load opportunities
Workload Analysis
•  Analyze EDW workload
•  Reads vs. writes
•  ETL vs. ELT
•  Analytical vs. batch SQL
•  CPU consumption
•  CPU utilization
Blueprint & Roadmap
•  Prioritize opportunities
•  Define future Hadoop architecture and capacity needs
•  Develop implementation plan
•  Create business case / ROI
•  Create and review Executive Summary with clients
[Timeline: the four activities are scheduled across Week 1 to Week 4.]
EDW Optimization – Implementation
Data Off-load
•  POC / Reference Implementation (if needed)
•  Install / expand HDP cluster
•  Analyze off-load data sets
•  Automate data ingestion
•  Implement active archiving
Process Off-load
•  Migrate EDW ETL/ELT workload to Hadoop
•  De-normalize data to optimize performance
•  Load Hadoop ETL/ELT output data back into EDW
Data Services
•  Provide data virtualization for data transparency across Hadoop and MPP databases
•  Build business services for reporting and enterprise applications
Analysis & Reporting
•  Provide schema-on-read for direct business analysis
•  Migrate resource intensive analysis to Hadoop
•  Connect analysis and visualization tools to Hadoop
[Timeline: the four activities are scheduled across Month 1, Month 2, Month 3 and beyond.]
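Schema-on-read, listed among the implementation activities above, means raw records are stored as they arrive and a schema is applied only when the data is read. The following is a minimal sketch of the idea in plain Python, not of any HDP component; the pipe-delimited sample records, the field names and the `read_with_schema` helper are all hypothetical.

```python
import csv
import io

# Raw data is stored untouched; no structure is enforced at write time.
RAW_CLICKSTREAM = "2015-03-01|u42|/home\n2015-03-01|u17|/pricing\n"

# Hypothetical schema, supplied by the reader at query time.
SCHEMA = [("event_date", str), ("user_id", str), ("page", str)]

def read_with_schema(raw_text, schema, delimiter="|"):
    """Apply a schema to raw delimited text while reading it."""
    reader = csv.reader(io.StringIO(raw_text), delimiter=delimiter)
    for row in reader:
        # zip stops at the shorter input, so a narrower schema reads fewer fields.
        yield {name: cast(value) for (name, cast), value in zip(schema, row)}

records = list(read_with_schema(RAW_CLICKSTREAM, SCHEMA))
narrow_records = list(read_with_schema(RAW_CLICKSTREAM, SCHEMA[:2]))
print(records[0]["page"], sorted(narrow_records[0]))
```

The same raw text serves two readers with different schemas, which is the property that lets analysts query offloaded data before a warehouse model exists for it.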
Data Warehouse Optimization - An Iterative Process
•  Identify low-hanging fruit
•  Get buy-in from stakeholders
•  Plan and implement in increments
•  Continuously assess and iterate
Attunity Visibility Data Usage Analysis (Sample)
•  Unused Data (e.g. Tables
with no ‘SELECT’
statements)
70 Terabytes in
Unused Databases
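The "tables with no SELECT statements" criterion above can be illustrated with a small sketch. This is not how Attunity Visibility works internally, which the deck does not describe; it is a naive Python approximation that assumes a hypothetical table catalog and query log and uses a simple regular expression in place of a SQL parser.

```python
import re

# Hypothetical catalog and query log; a real analysis would read these
# from the warehouse's system tables and query history.
ALL_TABLES = {"sales_fact", "customer_dim", "promo_2009", "web_stage"}
QUERY_LOG = [
    "SELECT c.name, SUM(s.amount) FROM sales_fact s "
    "JOIN customer_dim c ON s.cust_id = c.id GROUP BY c.name",
    "INSERT INTO web_stage SELECT * FROM sales_fact",
    "INSERT INTO promo_2009 VALUES (1, 'spring')",
]

# Naive pattern: an identifier directly after FROM or JOIN.
# Limitation: DELETE FROM would also match, so a real tool needs a SQL parser.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)

def tables_read(statements):
    """Return the set of tables that appear after FROM or JOIN in any statement."""
    read = set()
    for sql in statements:
        read.update(name.lower() for name in TABLE_REF.findall(sql))
    return read

unused_tables = sorted(ALL_TABLES - tables_read(QUERY_LOG))
print(unused_tables)
```

In this sample, `web_stage` and `promo_2009` are loaded but never read, which makes them the kind of candidates the slide counts as unused data.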
Attunity Visibility Data Usage Analysis (Sample)
•  History of data used in
large “Fact” table
•  Queries go back only 2
years
•  Maintains 8 years of data
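The gap between 8 years retained and 2 years queried suggests which partitions are archive candidates. A minimal sketch, assuming hypothetical yearly partitions of roughly equal size (the specific years are illustrative and not from the deck):

```python
# Hypothetical yearly partitions of a fact table holding 8 years of data.
partition_years = list(range(2007, 2015))  # 2007 through 2014
years_queried = 2                          # queries reach back only 2 years

# Partitions older than the query horizon are archive candidates.
newest_year = max(partition_years)
oldest_queried_year = newest_year - years_queried + 1
archive_candidates = [y for y in partition_years if y < oldest_queried_year]

# Fraction of the table's history that no query touches.
cold_fraction = len(archive_candidates) / len(partition_years)
print(archive_candidates, cold_fraction)
```

Under these assumptions, six of the eight yearly partitions are never queried and could move to Hadoop as an active archive.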
Attunity Visibility Workload Analysis (Sample)
•  Intensive ETL workloads
•  Almost 60% of CPU is used to load and ingest data
Attunity Visibility Workload Analysis (Sample)
The top 100 repetitive SQL statements, out of 101,000 ETL SQL statements, account for more than 30% of the CPU consumed by ETL.
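The concentration described above can be measured by ranking statements by CPU and summing the top few. The sketch below assumes a hypothetical mapping from statement name to CPU seconds; the names and numbers are made up for illustration and are not the sample's data.

```python
def top_n_cpu_share(cpu_seconds_by_statement, n):
    """Fraction of total CPU consumed by the n most expensive statements."""
    total = sum(cpu_seconds_by_statement.values())
    top = sorted(cpu_seconds_by_statement.values(), reverse=True)[:n]
    return sum(top) / total

# Hypothetical CPU seconds per distinct ETL statement.
cpu_seconds = {
    "load_sales": 500,
    "load_web": 300,
    "dedupe": 100,
    "audit": 60,
    "misc": 40,
}

share = top_n_cpu_share(cpu_seconds, n=2)
print(f"top 2 statements use {share:.0%} of ETL CPU")
```

Statements that rank high on such a list are natural first candidates for offloading, since moving a few of them frees a disproportionate share of CPU.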
Attunity Visibility – The Data Dashboard
Completely Analyze Workloads And Data Usage
Reduce Cost | Optimize Performance | Justify Investments
User Activity Data Usage Workload Performance
RCG Success Stories
Top Retailers
•  Completed EDW optimization projects for two large retailers
•  Offloading cold data and ELT to Hadoop
•  Cost savings projected between $6M and $10M
Top Financial Services
•  Currently working with two large Fortune 100 financial companies
•  Offloading 40TB to 60TB of raw data from EDW platforms to Hadoop
•  Re-architecting their batch decision processing with savings between $10M and $15M
Next Steps…
Download the Hortonworks Sandbox
Learn Hadoop
Build Your Analytic App
Try Hadoop
Learn more about our partnerships
http://hortonworks.com/partner/rcg-global-services/
http://hortonworks.com/partner/attunity/
SAN JOSE
June 9-11
BRUSSELS
April 15-16
•  Deep-dive technical content
•  65+ sessions and 5 tracks
•  1,000 attendees
•  Sponsorships Available
•  Including Pre and Post event community meetups
and BOFs
•  Hadoop training available
•  100+ sessions and 7 tracks
•  Deep-dive technical content
•  5,000 attendees
•  Sponsorships Available
•  Including Pre and Post event community meetups
and BOFs
•  Hadoop training available
www.hadoopsummit.org
The Largest Hadoop Community Events in Europe and North America

Weitere ähnliche Inhalte

Was ist angesagt?

Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextHortonworks
 
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Hortonworks
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Hortonworks
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalHortonworks
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Innovative Management Services
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks
 
Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchHortonworks
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataHortonworks
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceHortonworks
 
Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Hortonworks
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopHortonworks
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramHortonworks
 
Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationHortonworks
 
Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks
 
State of the Union with Shaun Connolly
State of the Union with Shaun ConnollyState of the Union with Shaun Connolly
State of the Union with Shaun ConnollyHortonworks
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataHortonworks
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHortonworks
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalHortonworks
 

Was ist angesagt? (20)

Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
 
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - Webinar
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_final
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptx
 
Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop Search
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big Data
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
 
Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache Hadoop
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready Program
 
Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop Implementation
 
Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014
 
State of the Union with Shaun Connolly
State of the Union with Shaun ConnollyState of the Union with Shaun Connolly
State of the Union with Shaun Connolly
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar Slides
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.final
 

Ähnlich wie Optimizing your Modern Data Architecture - with Attunity, RCG Global Services and Hortonworks

Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 
Big Data Expo 2015 - Hortonworks Common Hadoop Use Cases
Big Data Expo 2015 - Hortonworks Common Hadoop Use CasesBig Data Expo 2015 - Hortonworks Common Hadoop Use Cases
Big Data Expo 2015 - Hortonworks Common Hadoop Use CasesBigDataExpo
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoptionHortonworks
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks
 
Building a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopBuilding a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopSlim Baltagi
 
Pivotal deep dive_on_pivotal_hd_world_class_hdfs_platform
Pivotal deep dive_on_pivotal_hd_world_class_hdfs_platformPivotal deep dive_on_pivotal_hd_world_class_hdfs_platform
Pivotal deep dive_on_pivotal_hd_world_class_hdfs_platformEMC
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to HadoopPOSSCON
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalHortonworks
 
Hadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data WarehouseHadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data WarehouseEdgar Alejandro Villegas
 
Oracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analyticsOracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analyticsjdijcks
 
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Precisely
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Cécile Poyet
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Hortonworks
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Cécile Poyet
 
Edw Optimization Solution
Edw Optimization Solution Edw Optimization Solution
Edw Optimization Solution Hortonworks
 
Hadoop is not an Island in the Enterprise
Hadoop is not an Island in the EnterpriseHadoop is not an Island in the Enterprise
Hadoop is not an Island in the EnterpriseDataWorks Summit
 
Using Hadoop to Offload Data Warehouse Processing and More - Brad Anserson
Using Hadoop to Offload Data Warehouse Processing and More - Brad AnsersonUsing Hadoop to Offload Data Warehouse Processing and More - Brad Anserson
Using Hadoop to Offload Data Warehouse Processing and More - Brad AnsersonMapR Technologies
 
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...Platfora
 
A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...DataWorks Summit
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the OrganizationSeeling Cheung
 

Ähnlich wie Optimizing your Modern Data Architecture - with Attunity, RCG Global Services and Hortonworks (20)

Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
 
Big Data Expo 2015 - Hortonworks Common Hadoop Use Cases
Big Data Expo 2015 - Hortonworks Common Hadoop Use CasesBig Data Expo 2015 - Hortonworks Common Hadoop Use Cases
Big Data Expo 2015 - Hortonworks Common Hadoop Use Cases
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration
 
Building a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopBuilding a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise Hadoop
 
Pivotal deep dive_on_pivotal_hd_world_class_hdfs_platform
Pivotal deep dive_on_pivotal_hd_world_class_hdfs_platformPivotal deep dive_on_pivotal_hd_world_class_hdfs_platform
Pivotal deep dive_on_pivotal_hd_world_class_hdfs_platform
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_final
 
Hadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data WarehouseHadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data Warehouse
 
Oracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analyticsOracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analytics
 
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It!
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It!
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It!
 
Edw Optimization Solution
Edw Optimization Solution Edw Optimization Solution
Edw Optimization Solution
 
Hadoop is not an Island in the Enterprise
Hadoop is not an Island in the EnterpriseHadoop is not an Island in the Enterprise
Hadoop is not an Island in the Enterprise
 
Using Hadoop to Offload Data Warehouse Processing and More - Brad Anserson
Using Hadoop to Offload Data Warehouse Processing and More - Brad AnsersonUsing Hadoop to Offload Data Warehouse Processing and More - Brad Anserson
Using Hadoop to Offload Data Warehouse Processing and More - Brad Anserson
 
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
 
A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the Organization
 

Optimizing your Modern Data Architecture - with Attunity, RCG Global Services and Hortonworks

  • 1. © Hortonworks Inc. 2011 – 2015. All Rights Reserved. Optimizing the Modern Data Architecture with Attunity, Hortonworks and RCG Global Services. We do Hadoop.
  • 2. Speakers
      ◦ Hortonworks: Adis Cesir, Big Data Solution Engineer
      ◦ RCG Global Services: Ramu Kalvakuntla, Principal, Big Data Practice
      ◦ Attunity: Santosh Chitakki, Director of Product Management
  • 3. Partnership
      ◦ Strategy and Solution Delivery
      ◦ Hadoop Distribution, Support and Training
      ◦ Any Data, Anywhere, Anytime
      ◦ RCG Global Services, Hortonworks and Attunity are partnering to provide an EDW optimization solution that delivers real financial benefits by effectively implementing Apache Hadoop to augment current EDW platforms.
  • 4. Traditional systems under pressure
      ◦ Challenges: can't manage new data; constrains data to app; costly to scale
      ◦ New data: clickstream, geolocation, web data, Internet of Things, docs and emails, server logs
      ◦ Traditional data: ERP, CRM, SCM
      ◦ Data volume: 2.8 zettabytes in 2012, 40 zettabytes in 2020
      ◦ Chart labels: business value; laggards versus industry leaders
  • 5. A Typical EDW Faces Three Challenges
      ◦ 1. Data Storage: storing cold data or throwing data away
      ◦ 2. Processing Capacity: wasting processing cycles on low-value workloads
      ◦ 3. New Data Sources: unable to capture and use new data
      ◦ Diagram labels: analytics (data marts, business analytics, visualization & dashboards); data systems (systems of record: RDBMS, ERP, CRM, other); new sources (clickstream, web & social, geolocation, sensor & machine, server logs, unstructured)
  • 6. Most EDWs Are Used Inefficiently. In a typical EDW*:
      ◦ 1. Data Storage: more than 50% of data is unused
      ◦ 2. Processing Capacity: 55% of CPU capacity is ETL; 35% of CPU consumed by ETL is to load unused data; 30-40% of CPU is consumed by only 5% of ETL workloads
      ◦ Data tiers: hot, warm, cold. Why pay first class price for economy data?
      ◦ Diagram labels: analytics (data marts, business analytics, visualization & dashboards); data systems (systems of record: RDBMS, ERP, CRM, other)
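The slide 6 percentages can be combined into a share of total EDW CPU. The following is a minimal sketch, assuming the 35% figure is a fraction of ETL CPU (which is how the slide words it); the function and constant names are illustrative and do not come from the deck.

```python
def cpu_share_loading_unused_data(etl_share: float, unused_share_of_etl: float) -> float:
    """Fraction of total EDW CPU spent loading data that is never used."""
    return etl_share * unused_share_of_etl


# Figures quoted on slide 6 for a typical EDW.
ETL_SHARE = 0.55            # 55% of CPU capacity is ETL
UNUSED_SHARE_OF_ETL = 0.35  # 35% of ETL CPU loads unused data (assumed reading)

share = cpu_share_loading_unused_data(ETL_SHARE, UNUSED_SHARE_OF_ETL)
print(f"{share:.1%} of total CPU goes to loading unused data")
```

Under that reading, a little under one fifth of all EDW CPU is spent loading data nobody queries.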
  • 7. Optimization: Realize Cost Savings with HDP. HDP helps you reduce costs and optimize the value associated with your EDW.
      ◦ Archive data away from the EDW: move cold or rarely used data to Hadoop as an active archive; store more data longer
      ◦ Offload costly ETL processes: free your EDW to perform high-value functions like analytics and operations, not ETL; use Hadoop for advanced ELT
      ◦ Enrich the value of your EDW: use Hadoop to refine new data sources, such as web and machine data, for new analytical context
      ◦ Diagram labels: sources (clickstream, web & social, geolocation, sensor & machine, server logs, unstructured); existing systems (ERP, CRM, SCM); HDFS (Hadoop Distributed File System); YARN: Data Operating System (batch, interactive, real-time, partner ISV); MPP; EDW; analytics (applications, data marts, business analytics, visualization & dashboards)
  • 8. Challenge with Traditional Architecture
      ◦ Time spent understanding source data and defining destination structure
      ◦ High latency between data generation and availability
      ◦ Incapable of, or highly complex when, dealing with loosely structured data
      ◦ No linear scale; high license cost; large code footprint
      ◦ Data discarded due to cost or performance
      ◦ Low or no visibility into transactional data
      ◦ EDW used as an ETL tool with 100s of staging tables
      ◦ Diagram labels: source layer (DB, structured data); data collection & processing (ETL / ELT); integration, storage & business view (EDW); business / department specific (data marts)
  • 9. Offload/Archive/Process: Hadoop-based Platform
      ◦ Store transactional data
      ◦ Retain 7+ years of data (hot archive)
      ◦ Data lineage: ability to store intermediate data sets
      ◦ Becomes an analytics platform for data scientists
      ◦ Linearly scalable commodity hardware
      ◦ Massively parallel compute and storage
      ◦ Support for any type of data, structured or unstructured, with any volume and velocity
      ◦ The data warehouse can now focus less on storage and transformation and more on presentation
      ◦ Diagram labels: source layer (DB, structured data, clickstream, social, geo, sensor, server logs, unstructured); data collection, integration, storage and processing (integrate, transform, archive, enrich); EDW; data marts
  • 10. Optimization Customer Stories
      ◦ Archive: TrueCar stores data on millions of car purchases at $0.12 per GB with HDP, well below the $19 per GB possible with other solutions.
      ◦ Offload: Luminar cut its ETL processing times from 3 days to 3 hours with HDP, quickly refreshing its models with new customer transaction data.
      ◦ Enrich: ZirMed enriches its EDW with new data, including pharmacy receipts, text messages, and patient web searches.
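To put the two per-GB prices from the TrueCar story side by side, here is a small worked calculation. It is only a sketch: the 10,000 GB archive size is a hypothetical volume chosen for illustration, and the helper name is not from the deck.

```python
def storage_cost(volume_gb: float, price_per_gb: float) -> float:
    """Total storage cost in dollars for a given volume and unit price."""
    return volume_gb * price_per_gb


HDP_PRICE_PER_GB = 0.12    # quoted for HDP in the TrueCar story
OTHER_PRICE_PER_GB = 19.0  # quoted for other solutions

volume_gb = 10_000  # hypothetical archive size, not a figure from the slides

hdp_cost = storage_cost(volume_gb, HDP_PRICE_PER_GB)
other_cost = storage_cost(volume_gb, OTHER_PRICE_PER_GB)
price_ratio = OTHER_PRICE_PER_GB / HDP_PRICE_PER_GB

print(f"HDP: ${hdp_cost:,.0f}  other: ${other_cost:,.0f}  ratio: {price_ratio:.0f}x")
```

The ratio between the two quoted prices is independent of the volume chosen, so only the absolute dollar amounts depend on the hypothetical archive size.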
  • 11. Hadoop Driver: Enabling the Data Lake
      ◦ Data Lake definition:
          – Centralized architecture: multiple applications on a shared data set with consistent levels of service
          – Any app, any data: multiple applications accessing all data, affording new insights and opportunities
          – Unlocks 'Systems of Insight': advanced algorithms and applications used to derive new value and optimize existing value
      ◦ Drivers: 1. Cost optimization; 2. Advanced analytic apps
      ◦ Goal: centralized architecture; data-driven business
      ◦ Chart labels: scale, scope, Data Lake, Systems of Insight; "Journey to the Data Lake with Hadoop"
  • 12. Modern Data Architecture
      ◦ Reduce cost and improve performance by offloading EDW data and processing to the Hortonworks Data Platform (HDP)
      ◦ Implement a platform that scales incrementally using low-cost hardware and software
      ◦ Support unstructured, semi-structured and structured data in a single analytics platform
      ◦ Enable superior analytic capabilities, providing insight that cannot be achieved in current environments
      ◦ Provide seamless access to data for analysis and business applications
  • 13. Solution Model: Modern Data Architecture
      ◦ EDW Optimization Roadmap: identify offload candidates; create architectural blueprint, implementation roadmap, business case and ROI
      ◦ EDW Optimization Implementation: execute data and ETL/ELT offload and active archive; implement data ingestion and data services
      ◦ Data Value Realization: provide insight, data in motion, advanced analytics, information value creation, and visualization
      ◦ Enterprise Enablement: enterprise access, enriched data sources, service orchestration and data virtualization
  • 14. EDW Optimization: Roadmap and Analysis
      ◦ Assess current reporting, ELT/ETL, and analytical processes
      ◦ Review logical and physical data models
      ◦ Assess current technical architecture
      ◦ Prioritize opportunities
      ◦ Define future Hadoop architecture and capacity needs
      ◦ Develop implementation plan
      ◦ Create business case / ROI
      ◦ Create and review executive summary with clients
      ◦ Analyze data usage: identify under-utilized schemas, tables / columns, and data; identify offload opportunities
      ◦ Analyze EDW workload: reads vs. writes; ETL vs. ELT; analytical vs. batch SQL; CPU consumption; CPU utilization
      ◦ Activities scheduled across weeks 1 to 4: Current State Analysis, Data Usage Analysis, Workload Analysis, Blueprint & Roadmap
  • 15. EDW Optimization: Implementation
      ◦ Activities scheduled from month 1 onward: Data Offload, Process Offload, Data Services, Analysis & Reporting
      ◦ POC / reference implementation (if needed)
      ◦ Install / expand HDP cluster
      ◦ Analyze offload data sets
      ◦ Automate data ingestion
      ◦ Implement active archiving
      ◦ Provide schema-on-read for direct business analysis
      ◦ Migrate resource-intensive analysis to Hadoop
      ◦ Connect analysis and visualization tools to Hadoop
      ◦ Migrate EDW ETL/ELT workload to Hadoop
      ◦ De-normalize data to optimize performance
      ◦ Load Hadoop ETL/ELT output data back into the EDW
      ◦ Provide data virtualization for data transparency across Hadoop and MPP databases
      ◦ Build business services for reporting and enterprise applications
  • 16. Data Warehouse Optimization: An Iterative Process
      ◦ Identify low-hanging fruit
      ◦ Get buy-in from stakeholders
      ◦ Plan and implement in increments
      ◦ Continuously assess and iterate
  • 17. Attunity Visibility: Data Usage Analysis (Sample)
      ◦ Unused data (e.g. tables with no 'SELECT' statements)
      ◦ 70 terabytes in unused databases
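The "tables with no SELECT statements" criterion on slide 17 can be illustrated with a small sketch. This is not how Attunity Visibility works internally (the deck does not say); it assumes a hypothetical table catalog with sizes in terabytes and a hypothetical query log of (statement type, table) pairs.

```python
from typing import Dict, Iterable, Tuple


def unused_tables(
    catalog: Dict[str, float],
    query_log: Iterable[Tuple[str, str]],
) -> Dict[str, float]:
    """Return tables (with size in TB) that never appear in a SELECT."""
    selected = {table for stmt, table in query_log if stmt.upper() == "SELECT"}
    return {table: size for table, size in catalog.items() if table not in selected}


# Hypothetical sample data, for illustration only.
catalog_tb = {"sales_fact": 40.0, "web_staging": 25.0, "legacy_orders": 5.0}
log = [
    ("SELECT", "sales_fact"),
    ("INSERT", "web_staging"),    # loaded but never read
    ("INSERT", "legacy_orders"),  # loaded but never read
    ("select", "sales_fact"),
]

unused = unused_tables(catalog_tb, log)
print(sorted(unused), sum(unused.values()), "TB unused")
```

Tables that are written to but never read are the natural candidates for archiving or for dropping their load jobs.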
  • 18. Attunity Visibility: Data Usage Analysis (Sample)
      ◦ History of data used in a large "fact" table
      ◦ Queries go back only 2 years
      ◦ The table maintains 8 years of data
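The slide 18 sample implies that most of the fact table's history is never queried. A rough estimate, assuming data volume is spread evenly across years (an assumption the slide does not state), with an illustrative function name:

```python
def cold_fraction(retained_years: float, queried_years: float) -> float:
    """Fraction of retained history that no query reaches back to."""
    if retained_years <= 0:
        raise ValueError("retained_years must be positive")
    queried = min(queried_years, retained_years)
    return (retained_years - queried) / retained_years


# Slide 18: 8 years retained, queries reach back only 2 years.
fraction = cold_fraction(retained_years=8, queried_years=2)
print(f"{fraction:.0%} of the retained history is an archive candidate")
```

If older years hold less data than recent ones, the cold share by volume would be lower than this estimate by year count.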
  • 19. Attunity Visibility: Workload Analysis (Sample)
      ◦ Intensive ETL workloads
      ◦ Almost 60% of CPU is used to load and ingest data
  • 20. Attunity Visibility: Workload Analysis (Sample)
      ◦ The top 100 repetitive SQL statements, out of 101,000 ETL SQL statements, account for 30+% of the CPU consumed by ETL.
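The slide 20 finding is a "top N share" measurement: rank statements by CPU and see how much of the total the heaviest few consume. A minimal sketch, assuming CPU seconds have already been summed per repetitive statement; the statement names and numbers below are hypothetical.

```python
from typing import Dict


def top_n_cpu_share(cpu_by_statement: Dict[str, float], n: int) -> float:
    """Share of total CPU consumed by the n most expensive statements."""
    total = sum(cpu_by_statement.values())
    if total == 0:
        return 0.0
    heaviest = sorted(cpu_by_statement.values(), reverse=True)[:n]
    return sum(heaviest) / total


# Hypothetical CPU seconds per repetitive ETL statement.
cpu_seconds = {
    "load_sales": 300.0,
    "load_web": 100.0,
    "dedupe_orders": 50.0,
    "audit_rows": 50.0,
}

share_top1 = top_n_cpu_share(cpu_seconds, n=1)
print(f"Top 1 of {len(cpu_seconds)} statements uses {share_top1:.0%} of ETL CPU")
```

A heavily skewed share like the one on the slide suggests that offloading a small number of statements first gives most of the CPU relief.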
  • 21. Attunity Visibility: The Data Dashboard
      ◦ Completely analyze workloads and data usage
      ◦ Reduce cost | Optimize performance | Justify investments
      ◦ Dashboard areas: user activity, data usage, workload performance
  • 22. RCG Success Stories
      ◦ Top retailers: completed EDW optimization projects for two large retailers; offloading cold data and ELT to Hadoop; cost savings projected at $6M to $10M
      ◦ Top financial services: currently working with two large Fortune 100 financial companies; offloading 40 TB to 60 TB of raw data from EDW platforms to Hadoop; re-architecting their batch decision processing, with savings of $10M to $15M
  • 23. Next Steps…
      ◦ Try Hadoop: download the Hortonworks Sandbox
      ◦ Learn Hadoop
      ◦ Build your analytic app
      ◦ Learn more about our partnerships: http://hortonworks.com/partner/rcg-global-services/ and http://hortonworks.com/partner/attunity/
  • 24. The Largest Hadoop Community Events in Europe and North America (www.hadoopsummit.org)
      ◦ Locations and dates: San Jose, June 9-11; Brussels, April 15-16
      ◦ Event details as listed on the slide: 65+ sessions and 5 tracks; 1,000 attendees; deep-dive technical content; sponsorships available; pre- and post-event community meetups and BOFs; Hadoop training available
      ◦ Event details as listed on the slide: 100+ sessions and 7 tracks; 5,000 attendees; deep-dive technical content; sponsorships available; pre- and post-event community meetups and BOFs; Hadoop training available