Data Warehouse Optimization
3
Finding Business Pains
• Frequent or near-term EDW expansion/spend
• Short time windows for data
• SLA challenges with ELT
• Reports/analytics that are “Too big”
• Compliance issues requiring long-term storage AND query
• Resource restrictions/contention or disenfranchised/frustrated users
4
Common Challenges with the Data Warehouse
Diagram elements: OLTP, Enterprise Applications, Extract, Transform, Load, Data Warehouse (Transform), Query, Business Intelligence
1 Slow data transformations, missed SLAs.
2 Slow queries, poor QoS and missed opportunities.
3 Wrong or incomplete, modified copies are made.
4 Must archive. Archived data can’t provide value.
5 Constant pressure to buy additional warehouse capacity, just to maintain current quality of service.
NO room to expand use cases.
NO room to innovate.
5
An EDH Complements the Data Warehouse
Diagram elements: OLTP, Enterprise Applications, Extract, Load, Cloudera (Transform, Query), Data Warehouse, Query, Business Intelligence
1 Data loaded when & where it’s needed.
2 Empowered business analysts.
3 Avoid “spreadmarts” across departments.
4 Complete view of all your products, customers, etc.
5 Cost effective, infinitely scalable, production ready enterprise data hub for all your data.
All data.
All users.
6
Hadoop as a Data Warehouse???
7
2014 Gartner MQ for Data Warehouse DBMS
“A data warehouse DBMS is now expected to
coordinate data virtualization strategies, and
distributed file and/or processing
approaches, to address changes in data
management and access requirements.”
8
Thinking About Optimization
9
Understanding Benefits for Your Organization
• Help You Assess Your Enterprise Data Warehouse Ecosystem
• Identify Viable Migration Candidates and Target Reference Architecture
• Develop a Project Plan to Deliver the Full Scope of Benefits
• Understand the Business Case for Making the Investment
10
Working With You Through the EDW Assessment Process
Information
• Collect information about your EDW environment
Analysis
• Identify migration candidates
• Determine feasibility
Recommendations
• Develop a migration plan
• Establish a business case
11
Identifying Sources and Workloads
12
Key Hadoop Platform Requirements
• High availability
• Disaster recovery
• Downtime-less upgrades
• Auditability
• Low-latency SQL & BI support
• Deep SAS & R support
13
Customers Agree: Cloudera Delivers
Customer | Workload | Results
Leading Payments Company | Analytics, ETL Processing, DR | Largest fraud discovery in firm history; time to report collapsed from 2 days to 2 hours; save $30M on DR
Global Money Center Bank | Data Processing (ELT) | Avoided tens of millions in expansion purchases; 42% faster processing
Mobile Device Manufacturer | Data Processing (ELT) | Offloaded 90% of data volume; keep all data
Fortune 500 Retailer | Analytics | More insights by supporting more exploration of more extensive & granular data
Leading Financial Regulator | Data Processing (ELT) and DR | Shrank EDW footprint by 4PB; 20X perf. boost
14
Assessing Workloads and Data
Diagram: DATA WAREHOUSE
Workloads: Operational Business Intelligence, Analytics, Self-Service BI, Data Processing (ELT)
Data: Staged Data, Operational Data, Archival Data
• Data Processing (ELT)
  • Staged data, to be processed
  • Temp tables, BLOB/CLOB types, etc.
• Analytics / Machine Learning
  • Deep and broad data sets, within and beyond the warehouse
• Self-Service BI (Ad-Hoc Query)
  • Operational data, actively used for BI
  • Archival data, inactively used for BI
15
Offload Data Processing (ELT)
What to Migrate
• High-scale batch data processing
• Implemented as SQL + scripting or ETL running on expensive HW infrastructure
• Staging data stored across diverse, temp tables
Influencing Factors
• High fraction of overall EDW utilization (25 – 80%)
• Difficult to store, manage staging data in relational form
• Limited user adoption risk to migrate
• ETL tools to simplify migration
Better in Cloudera
• Over 2X the performance
• 1/10th the cost
Reliability for mission-critical workloads: high availability, disaster recovery, downtime-less upgrades
Low-latency SQL processing, ability to absorb short-cycle ELT
Broad support of leading data integration tools
Only Available with Cloudera Partners
16
Offload Self-Service Business Intelligence
Workload
• Self-Service BI, Exploratory BI, Data Discovery
• Uncertain business questions and uncertain data
Migration Priority
• Fastest growing workload for many warehouses
• Comparable support for end user tools between Cloudera and DBMS products
Better In Cloudera
• Schema flexibility
• End user self-service on full fidelity data
• 1/10th the cost
Open source parallel interactive SQL engine: Cloudera Impala
Integration and certification of every leading SSBI vendor
Only Available with Cloudera Partners
17
Offload Analytics / Machine Learning
Workload and Data
• Training & scoring predictive models
• Deep and broad data sets, within and beyond the warehouse
Influencing Factors
• Statisticians want unconstrained analysis; limited DW compute resources
• Paying top dollar for warehouse data storage only to load into ML tools
• Inability to analyze data beyond the warehouse
Better in Cloudera
• Greater user productivity (pre-packaged ML libraries, no more down-sampling)
• Support for 3rd party ML tools
• Greater flexibility (SQL + MR + SAS procs)
• 1/10th the cost
Ability to run SAS, R natively on the same cluster
Interactive search and SQL experience for data exploration
Built-in analytics libraries (Mahout, DataFu, ClouderaML)
Support from Cloudera’s Data Science team
Only Available with Cloudera Partners
18
Sample Cloudera Tools for Assisting Migration
• High-speed connector – Moves data between the two systems
• Data definition – Tool for mapping EDW tables & datatypes to Hive tables & datatypes
• Mainframe input / output format – Supports direct feed of mainframe data into Cloudera
• Result validation – Verifies that SQL applications in Cloudera produce the same results as the original applications
• Support for SQL-H (planned) – Remote queries from EDW to Cloudera
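The result validation idea in the list above can be illustrated with a small Python sketch. This is not the Cloudera tool itself; the function names (normalize, compare_results), the normalization rules, and the sample rows are all assumptions made for illustration. It compares two result sets without regard to row order, keeping duplicate rows significant.

```python
from collections import Counter
from decimal import Decimal


def normalize(row):
    """Make values comparable across engines (hypothetical, coarse rules)."""
    out = []
    for value in row:
        if isinstance(value, (float, Decimal)):
            # Round so DECIMAL vs. DOUBLE representations compare equal.
            out.append(round(float(value), 6))
        elif isinstance(value, str):
            # Ignore trailing padding that fixed-width CHAR columns may add.
            out.append(value.rstrip())
        else:
            out.append(value)
    return tuple(out)


def compare_results(edw_rows, hadoop_rows):
    """Order-insensitive comparison of two result sets, duplicates included."""
    edw = Counter(normalize(r) for r in edw_rows)
    hadoop = Counter(normalize(r) for r in hadoop_rows)
    return {
        "match": edw == hadoop,
        "only_in_edw": sorted((edw - hadoop).elements(), key=repr),
        "only_in_hadoop": sorted((hadoop - edw).elements(), key=repr),
    }


# Hypothetical rows, as if fetched from the same query on both systems.
edw_rows = [(1, "alice ", Decimal("10.50")), (2, "bob", Decimal("7.25"))]
hadoop_rows = [(2, "bob", 7.25), (1, "alice", 10.5)]
report = compare_results(edw_rows, hadoop_rows)
print(report["match"])
```

In practice the rows would come from each system's database driver, and the normalization rules would need to reflect the actual datatype mapping chosen for the migration.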
19
Groundwork for Optimization
20
Is Your Data Architecture Aligned to Your Use Case?
Lay the Foundation for Data Migration and Ensure Success
• Install and configure CDH and Cloudera Manager
• Run standard and specialized performance tests
• Recommend tuning, compression and decompression, and scheduler configurations
• Document recommended cluster configuration
• Train and certify Hadoop administrators
21
How Quickly and Securely Can You Transition Your Data?
Migrate Disparate Data Sources to Boost Performance
• Collect low-efficiency data from various silos
• Redeploy latent data from EDWs, RDBMSs, and Hadoop environment
• Develop, test, and implement data processing jobs
• Integrate Hadoop with relevant external systems
• Document workload migration
22
Is Your Operational Environment Ready for Handover?
Maximize ROI by Rationalizing All Systems, Teams, and Workloads
• Review current and future requirements
• Review full ecosystem, all jobs, and regular processes
• Review application architecture, ingestion pipeline, data schema, and data partitioning system
• Review key management and monitoring processes and relevant production procedures
• Recommend additional training to assure Hadoop expertise on management and operations teams
• Document cluster configuration, solutions implementation, and production recommendations
23
How Much Additional Value Can You Capture Long-Term?
Ongoing Optimization Is Key to Deferring Additional Cost
• Expand framework without expanding footprint
• Rationalize beyond initial burn-in period
• Evolve cluster to support additional use cases
• Annually benchmark performance to diagnostic
• Balance business opportunity against technical risk
24
Building the Optimization Plan
25
Prioritizing Workloads and Data
1 Current EDW Constraints
• Focus on computation constraints
• Focus on disk space constraints
2 Workload Transferability
• Similar or same SQL functionality
• Similar or same tools support
• Opportunity for performance gains
3 User Communities
• Group related workloads by user community
• Migrate one community at a time
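The three prioritization criteria above (EDW constraints, workload transferability, user communities) could be combined into a simple ranking. The Python sketch below is only an illustration: the Workload fields, the weights, and the sample workloads are hypothetical and are not part of any Cloudera methodology described in the deck.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Workload:
    name: str
    community: str          # user community that owns the workload
    cpu_share: float        # fraction of EDW compute consumed (0..1)
    disk_share: float       # fraction of EDW disk consumed (0..1)
    transferability: float  # 0..1: SQL/tool parity and expected perf gain


def score(w, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of the three criteria; the weights are assumptions."""
    w_cpu, w_disk, w_transfer = weights
    return w_cpu * w.cpu_share + w_disk * w.disk_share + w_transfer * w.transferability


def migration_waves(workloads):
    """Group workloads by community, then rank communities by total score."""
    by_community = lambda w: w.community
    groups = {
        community: list(members)
        for community, members in groupby(sorted(workloads, key=by_community), key=by_community)
    }
    ranked = sorted(
        groups.items(),
        key=lambda item: sum(score(w) for w in item[1]),
        reverse=True,
    )
    # One wave per community, highest-scoring workloads first within a wave.
    return [
        (community, [w.name for w in sorted(members, key=score, reverse=True)])
        for community, members in ranked
    ]


# Hypothetical workloads for illustration only.
workloads = [
    Workload("nightly_etl", "data_eng", 0.40, 0.25, 0.9),
    Workload("staging_cleanup", "data_eng", 0.10, 0.20, 0.8),
    Workload("adhoc_reports", "finance", 0.15, 0.05, 0.6),
]
waves = migration_waves(workloads)
print(waves)
```

Grouping before ranking keeps each community's workloads in a single wave, which matches the "migrate one community at a time" guidance.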
26
The Optimization Process
Profile
• Analyze all of the workload in your data warehouse
  • Queries
  • Objects
  • User communities
Prioritize
• Framework driven methodology for ordering workloads
• Balance financial opportunity with business risk
Migrate
• Set up data ingest paths to Cloudera
• Map EDW workload to Cloudera
Validate
• Verify results
• Evaluate performance differences & tune
• Side-by-side “burn in” period
• Cut-over
Repeat annually to defer additional expansion
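As a rough illustration of the Profile step above, the following Python sketch computes each workload class's share of warehouse CPU time from a query log. The log format, the field names (workload_class, cpu_seconds), and the sample entries are assumptions; a real warehouse exposes this information through its own query-history tables.

```python
from collections import defaultdict


def utilization_by_class(query_log):
    """Return each workload class's share of total CPU seconds."""
    totals = defaultdict(float)
    for entry in query_log:
        totals[entry["workload_class"]] += entry["cpu_seconds"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {cls: secs / grand_total for cls, secs in totals.items()}


# Hypothetical query-log entries for illustration only.
query_log = [
    {"user": "etl_svc", "workload_class": "ELT", "cpu_seconds": 600.0},
    {"user": "analyst1", "workload_class": "Self-Service BI", "cpu_seconds": 150.0},
    {"user": "etl_svc", "workload_class": "ELT", "cpu_seconds": 200.0},
    {"user": "ds_team", "workload_class": "Analytics", "cpu_seconds": 50.0},
]
shares = utilization_by_class(query_log)
for cls, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cls}: {share:.0%}")
```

A profile like this makes it easier to see which workload class dominates utilization and is therefore a candidate for the Prioritize step.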
27
Sample EDW Rationalization Process
Timeline: Initial Quarter, Second Quarter, Third Quarter, Fourth Quarter (months M1 to M12)
Tracks: People, Process, Technology
Teams
• Program Management: Responsible for overall program success, resource assignment, project management, and risk mitigation
• Cloudera Migration Teams: Expert resources delivering initial project framework and advanced implementation releases
• ${Customer} Migration Teams: Customer staff resources, taking on increasing responsibility for release implementation over time
Activities
• Management & Risk Mitigation
• Initial EDW Assessment
• Architecture Oversight
• Assessment and Stratification Process
• Detailed Workload Analysis
• Implement Reference Architecture
• Establish Repeatable Migration Approach
• Enhance SDLC, Release, and Configuration Management Processes
Releases: Release 1, Release 2, Release 3, Release 4, Release 5, ... Release N
Migration SDLC: Assignment/Kick-off, Execution, Testing, User Acceptance, Documentation, Sign-off
28
Workload Classification
• Cloudera Architecture: Implementing Cloudera’s reference architecture(s) and building environment to fit unique customer requirements
• Data Ecosystem Integration: BI, ETL, and other applications that require integration with the big data platform, including existing EDW
• Data Processing: High-scale batch data processing, implemented as SQL + scripting or via ETL tools, staging data stored across diverse, temp tables
• Self-service BI: Exploratory BI, Data Discovery, uncertain business questions and uncertain data
• Analytics: Training & scoring, predictive models, deep and broad data sets (within and beyond the warehouse)
• Archival Processes: Traditional archive storage and processes
29
Workload Complexity
Basic
• Leverages pre-existing architecture and integrations
• Utilizes all off-the-shelf components
• Repeatable solutions from existing training/documentation
Moderate
• Requires minimal modifications to existing architecture, integrations, or other dependencies
• Some expertise required for new design decisions
Advanced
• Establishing new reference architectures
• Several new design decisions involved
• Unique skillsets required (e.g., machine learning)
30
Sample Complexity vs. Time for Various Project Types
Chart: Complexity of Task (Low, Moderate, High) vs. Estimated Phase (1 to 4)
Project types plotted:
• Machine Learning Modeling
• Graph Analytics Modeling
• Hadoop cluster install/config
• One-off ingest/ETL processes
• Predictive Analytics Modeling
• Production Certification
• Hadoop storage schemas
• Decision tree/forest/ensemble
• Data Pipelining
• Generic ingest/ETL processes
31
Mapping Resources to Project Task Type
Chart: Complexity of Task (Low, Moderate, High) vs. Estimated Phase (1 to 4)
Resources plotted:
• Data Scientist
• Senior Architect
• Consultant
• Architect
• Principal Architect
32
Typical Big Data COE Program Roles
Staff Centrally and Train to Scale
Management & Leadership
• Big Data Visionary
• Executive Sponsor
• Program Manager
Technology & Ops
• Developers
• Admin
• Data Warehouse Specialist
• Architects
Business & Data
• Lead Data Scientist
• Lead Business Analyst
• LOB Rep (three shown)
• Data Wranglers
33
Benefits Summary
1. Lower costs of data management, growth
2. Improve quality of service
• Meet critical data processing SLAs
• Faster BI queries
3. Extend existing warehouse capacity
• Increase ROI from current investments
• More operational data – volume and schemas
• More business intelligence and analytics workloads
4. Retain all data for analysis
5. Deliver a foundation for innovation
• Bring more applications to Hadoop data for low incremental cost
34
The Experts Agree
35
Questions?

 
Cloud and Analytics - From Platforms to an Ecosystem
Cloud and Analytics - From Platforms to an EcosystemCloud and Analytics - From Platforms to an Ecosystem
Cloud and Analytics - From Platforms to an Ecosystem
 

Data Warehouse Optimization

  • 3. Finding Business Pains
    • Frequent or near-term EDW expansion/spend
    • Short time windows for data
    • SLA challenges with ELT
    • Reports/analytics that are “too big”
    • Compliance issues requiring long-term storage AND query
    • Resource restrictions/contention, or disenfranchised/frustrated users
  • 4. Common Challenges with the Data Warehouse
    • Diagram labels: OLTP, Enterprise Applications, Extract, Transform, Load, Data Warehouse, Transform, Query, Business Intelligence
    • 1: Slow data transformations, missed SLAs.
    • 2: Slow queries, poor QoS, and missed opportunities.
    • 3: Wrong or incomplete, modified copies are made.
    • 4: Must archive. Archived data can’t provide value.
    • 5: Constant pressure to buy additional warehouse capacity, just to maintain current quality of service. NO room to expand use cases. NO room to innovate.
  • 5. An EDH Complements the Data Warehouse
    • Diagram labels: OLTP, Enterprise Applications, Extract, Load, Cloudera, Transform, Query, Data Warehouse, Query, Business Intelligence
    • 1: Data loaded when & where it’s needed.
    • 2: Empowered business analysts.
    • 3: Avoid “spreadmarts” across departments.
    • 4: Complete view of all your products, customers, etc.
    • 5: Cost-effective, infinitely scalable, production-ready enterprise data hub for all your data. All data. All users.
  • 6. Hadoop as a Data Warehouse???
  • 7. 2014 Gartner MQ for Data Warehouse DBMS
    • “A data warehouse DBMS is now expected to coordinate data virtualization strategies, and distributed file and/or processing approaches, to address changes in data management and access requirements.”
  • 9. Understanding Benefits for Your Organization
    • Help You Assess Your Enterprise Data Warehouse Ecosystem
    • Identify Viable Migration Candidates and Target Reference Architecture
    • Develop a Project Plan to Deliver the Full Scope of Benefits
    • Understand the Business Case for Making the Investment
  • 10. Working With You Through the EDW Assessment Process
    • Information: Collect information about your EDW environment
    • Analysis: Identify migration candidates; determine feasibility
    • Recommendations: Develop a migration plan; establish a business case
  • 12. Key Hadoop Platform Requirements
    • High availability
    • Disaster recovery
    • Downtime-less upgrades
    • Auditability
    • Low-latency SQL & BI support
    • Deep SAS & R support
  • 13. Customers Agree: Cloudera Delivers (Customer, Workload, Results)
    • Leading Payments Company (Analytics, ETL Processing, DR): Largest fraud discovery in firm history; time to report collapsed from 2 days to 2 hours; saved $30M on DR
    • Global Money Center Bank (Data Processing, ELT): Avoided tens of millions in expansion purchases; 42% faster processing
    • Mobile Device Manufacturer (Data Processing, ELT): Offloaded 90% of data volume; keep all data
    • Fortune 500 Retailer (Analytics): More insights by supporting more exploration of more extensive & granular data
    • Leading Financial Regulator (Data Processing, ELT, and DR): Shrank EDW footprint by 4PB; 20X performance boost
  • 14. Assessing Workloads and Data
    • Diagram labels: DATA WAREHOUSE; WORKLOADS (Operational Business Intelligence, Analytics, Self-Service BI, Data Processing (ELT)); DATA (Staged Data, Operational Data, Archival Data)
    • Data Processing (ELT)
      • Staged data, to be processed
      • Temp tables, BLOB/CLOB types, etc.
    • Analytics / Machine Learning
      • Deep and broad data sets, within and beyond the warehouse
    • Self-Service BI (Ad-Hoc Query)
      • Operational data, actively used for BI
      • Archival data, inactively used for BI
  • 15. Offload Data Processing (ELT)
    • Slide headings: What to Migrate, Influencing Factors, Better in Cloudera, Only Available with Cloudera Partners
    • High-scale batch data processing
    • Implemented as SQL + scripting or ETL running on expensive HW infrastructure
    • Staging data stored across diverse, temp tables
    • High fraction of overall EDW utilization (25–80%)
    • Difficult to store, manage staging data in relational form
    • Limited user adoption risk to migrate
    • ETL tools to simplify migration
    • Over 2X the performance
    • 1/10th the cost
    • Reliability for mission-critical workloads: high availability, disaster recovery, downtime-less upgrades
    • Low-latency SQL processing, ability to absorb short-cycle ELT
    • Broad support of leading data integration tools
  • 16. Offload Self-Service Business Intelligence
    • Slide headings: Workload, Migration Priority, Better in Cloudera, Only Available with Cloudera Partners
    • Self-Service BI, Exploratory BI, Data Discovery
    • Uncertain business questions and uncertain data
    • Fastest growing workload for many warehouses
    • Comparable support for end user tools between Cloudera and DBMS products
    • Schema flexibility
    • End user self-service on full fidelity data
    • 1/10th the cost
    • Open source parallel interactive SQL engine: Cloudera Impala
    • Integration and certification of every leading SSBI vendor
  • 17. Offload Analytics / Machine Learning
    • Slide headings: Workload and Data, Influencing Factors, Better in Cloudera, Only Available with Cloudera Partners
    • Training & scoring predictive models
    • Deep and broad data sets, within and beyond the warehouse
    • Statisticians want unconstrained analysis; limited DW compute resources
    • Paying top dollar for warehouse data storage only to load into ML tools
    • Inability to analyze data beyond the warehouse
    • Greater user productivity (pre-packaged ML libraries, no more down-sampling)
    • Support for 3rd party ML tools
    • Greater flexibility (SQL + MR + SAS procs)
    • 1/10th the cost
    • Ability to run SAS, R natively on the same cluster
    • Interactive search and SQL experience for data exploration
    • Built-in analytics libraries (Mahout, DataFu, ClouderaML)
    • Support from Cloudera’s Data Science team
  • 18. Sample Cloudera Tools for Assisting Migration
    • High-speed connector – Moves data between the two systems
    • Data definition – Tool for mapping EDW tables & datatypes to Hive tables & datatypes
    • Mainframe input / output format – Supports direct feed of mainframe data into Cloudera
    • Result validation – Verifies SQL applications in Cloudera produce the same results as the original applications
    • Support for SQL-H (planned) – Remote queries from EDW to Cloudera
  • 20. Is Your Data Architecture Aligned to Your Use Case? Lay the Foundation for Data Migration and Ensure Success
    • Install and configure CDH and Cloudera Manager
    • Run standard and specialized performance tests
    • Recommend tuning, compression and decompression, and scheduler configurations
    • Document recommended cluster configuration
    • Train and certify Hadoop administrators
  • 21. How Quickly and Securely Can You Transition Your Data? Migrate Disparate Data Sources to Boost Performance
    • Collect low-efficiency data from various silos
    • Redeploy latent data from EDWs, RDBMSs, and Hadoop environment
    • Develop, test, and implement data processing jobs
    • Integrate Hadoop with relevant external systems
    • Document workload migration
  • 22. Is Your Operational Environment Ready for Handover? Maximize ROI by Rationalizing All Systems, Teams, and Workloads
    • Review current and future requirements
    • Review full ecosystem, all jobs, and regular processes
    • Review application architecture, ingestion pipeline, data schema, and data partitioning system
    • Review key management and monitoring processes and relevant production procedures
    • Recommend additional training to assure Hadoop expertise on management and operations teams
    • Document cluster configuration, solutions implementation, and production recommendations
  • 23. How Much Additional Value Can You Capture Long-Term? Ongoing Optimization Is Key to Deferring Additional Cost
    • Expand framework without expanding footprint
    • Rationalize beyond initial burn-in period
    • Evolve cluster to support additional use cases
    • Annually benchmark performance to diagnostic
    • Balance business opportunity against technical risk
  • 25. Prioritizing Workloads and Data
    • 1: Current EDW Constraints
      • Focus on computation constraints
      • Focus on disk space constraints
    • 2: Workload Transferability
      • Similar or same SQL functionality
      • Similar or same tools support
      • Opportunity for performance gains
    • 3: User Communities
      • Group related workloads by user community
      • Migrate one community at a time
  • 26. The Optimization Process (repeat annually to defer additional expansion)
    • Profile
      • Analyze all of the workload in your data warehouse: queries, objects, user communities
    • Prioritize
      • Framework-driven methodology for ordering workloads
      • Balance financial opportunity with business risk
    • Migrate
      • Set up data ingest paths to Cloudera
      • Map EDW workload to Cloudera
    • Validate
      • Verify results
      • Evaluate performance differences & tune
      • Side-by-side “burn in” period
      • Cut-over
  • 27. Sample EDW Rationalization Process
    • Timeline: Initial Quarter, Second Quarter, Third Quarter, Fourth Quarter (months M1 to M12)
    • Program Management: Responsible for overall program success, resource assignment, project management, and risk mitigation
    • Cloudera Migration Teams: Expert resources delivering initial project framework and advanced implementation releases
    • ${Customer} Migration Teams: Customer staff resources, taking on increasing responsibility for release implementation over time
    • Chart rows: Process, People, Technology
    • Activities: Management & Risk Mitigation; Initial EDW Assessment; Architecture Oversight; Assessment and Stratification Process; Detailed Workload Analysis; Implement Reference Architecture; Establish Repeatable Migration Approach; Enhance SDLC, Release, and Configuration Management Processes
    • Release tracks: Release 1, Release 2, Release 3, Release N; Release 2, Release 3, Release N, Release 4, Release 5
    • Migration SDLC: Assignment/Kick-off, Execution, Testing, User Acceptance, Documentation, Sign-off
  • 28. Workload Classification
    • Cloudera Architecture: Implementing Cloudera’s reference architecture(s) and building environment to fit unique customer requirements
    • Data Ecosystem Integration: BI, ETL, and other applications that require integration with the big data platform, including existing EDW
    • Data Processing: High-scale batch data processing, implemented as SQL + scripting or via ETL tools; staging data stored across diverse, temp tables
    • Self-service BI: Exploratory BI, data discovery; uncertain business questions and uncertain data
    • Analytics: Training & scoring, predictive models, deep and broad data sets (within and beyond the warehouse)
    • Archival Processes: Traditional archive storage and processes
  • 29. Workload Complexity
    • Basic
      • Leverages pre-existing architecture and integrations
      • Utilizes all off-the-shelf components
      • Repeatable solutions from existing training/documentation
    • Moderate
      • Requires minimal modifications to existing architecture, integrations, or other dependencies
      • Some expertise required for new design decisions
    • Advanced
      • Establishing new reference architectures
      • Several new design decisions involved
      • Unique skillsets required (e.g., machine learning)
  • 30. Sample Complexity vs. Time for Various Project Types
    • Chart axes: Complexity of Task (Low, Moderate, High) against Estimated Phase (1 to 4)
    • Project types plotted: Machine Learning Modeling; Graph Analytics Modeling; Hadoop cluster install/config; One-off ingest/ETL processes; Predictive Analytics Modeling; Production Certification; Hadoop storage schemas; Decision tree/forest/ensemble; Data Pipelining; Generic ingest/ETL processes
  • 31. Mapping Resources to Project Task Type
    • Chart axes: Complexity of Task (Low, Moderate, High) against Estimated Phase (1 to 4)
    • Roles plotted: Data Scientist; Senior Architect; Consultant Architect; Principal Architect
  • 32. Typical Big Data COE Program Roles: Staff Centrally and Train to Scale
    • Org chart labels: Developers; Admin; Data Warehouse Specialist; Architects; Technology & Ops; Management & Leadership; Big Data Visionary; Executive Sponsor; Program Manager; Business & Data; Lead Data Scientist; Lead Business Analyst; LOB Rep (three); Data Wranglers
  • 33. Benefits Summary
    • 1. Lower costs of data management, growth
    • 2. Improve quality of service
      • Meet critical data processing SLAs
      • Faster BI queries
    • 3. Extend existing warehouse capacity
      • Increase ROI from current investments
      • More operational data – volume and schemas
      • More business intelligence and analytics workloads
    • 4. Retain all data for analysis
    • 5. Deliver a foundation for innovation
      • Bring more applications to Hadoop data for low incremental cost
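
The prioritization step described on slides 25 and 26 (EDW constraints, workload transferability, user communities, and balancing financial opportunity against business risk) could be sketched roughly as follows. This is only an illustrative sketch, not Cloudera's methodology: the `Workload` fields, the 0-to-1 scales, the multiplicative score, and the sample workloads are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # All fields are hypothetical inputs an assessment might produce.
    name: str
    community: str          # user community that owns the workload
    edw_utilization: float  # share of EDW compute/disk consumed, 0..1
    transferability: float  # how closely SQL/tool support matches, 0..1
    business_risk: float    # risk of disrupting users if migrated, 0..1


def priority_score(w: Workload) -> float:
    """Assumed scoring rule: relieve the biggest constraint that is
    easiest to move, discounted by business risk."""
    return w.edw_utilization * w.transferability * (1.0 - w.business_risk)


def migration_order(workloads):
    """Group workloads by user community, then rank communities by their
    combined score so one community can be migrated at a time."""
    by_community = {}
    for w in workloads:
        by_community.setdefault(w.community, []).append(w)
    ranked = sorted(
        by_community.items(),
        key=lambda item: sum(priority_score(w) for w in item[1]),
        reverse=True,
    )
    # Within a community, migrate the highest-scoring workloads first.
    return [
        (community, sorted(ws, key=priority_score, reverse=True))
        for community, ws in ranked
    ]


# Made-up sample data for illustration only.
workloads = [
    Workload("nightly_elt", "finance", 0.40, 0.9, 0.2),
    Workload("adhoc_bi", "marketing", 0.15, 0.8, 0.1),
    Workload("fraud_model", "finance", 0.10, 0.5, 0.5),
]
order = migration_order(workloads)
for community, ws in order:
    print(community, [w.name for w in ws])
```

A multiplicative score is one simple choice: a workload that is barely transferable or very risky scores low regardless of how much EDW capacity it consumes. A weighted sum or a manual review would fit the slides equally well.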

Editor's Notes

  1. In this session, we will explore using Hadoop to address questions and issues surrounding: cost of storage; value of accessibility; getting maximum return on your IT investments and all of your data
  2. Tie workloads to data types