SlideShare ist ein Scribd-Unternehmen logo
1 von 46
© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Zach Christopherson, Database Engineer, AWS
June 28, 2016
Amazon Redshift for Big Data Analytics
Agenda
• Service Overview
• Migration Considerations
• Optimization:
• Schema / Table Design
• Data Ingestion
• Maintenance and Database Tuning
• Demonstration
Service Overview
Relational data warehouse
Massively parallel; petabyte scale
Fully managed
HDD and SSD platforms
$1,000/TB/year; starts at $0.25/hour
Amazon
Redshift
a lot faster
a lot simpler
a lot cheaper
Selected Amazon Redshift customers
Amazon Redshift system architecture
Leader node
• SQL endpoint
• Stores metadata
• Coordinates query execution
Compute nodes
• Local, columnar storage
• Execute queries in parallel
• Load, backup, restore via
Amazon S3; load from
Amazon DynamoDB, Amazon EMR, or SSH
Two hardware platforms
• Optimized for data processing
• DS2: HDD; scale from 2TB to 2PB
• DC1: SSD; scale from 160GB to 326TB
10 GigE
(HPC)
Ingestion
Backup
Restore
JDBC/ODBC
A deeper look at compute node architecture
Each node contains multiple slices
• DS2 – 2 slices on XL, 16 on 8XL
• DC1 – 2 slices on L, 32 on 8XL
Each slice is allocated CPU, and
table data
Each slice processes a piece of
the workload in parallel
Leader Node
Amazon Redshift dramatically reduces I/O
Data compression
Zone maps
ID Age State Amount
123 20 CA 500
345 25 WA 250
678 40 FL 125
957 37 WA 375
• Calculating SUM(Amount) with
row storage:
– Need to read everything
– Unnecessary I/O
ID Age State Amount
Amazon Redshift dramatically reduces I/O
Data compression
Zone maps
ID Age State Amount
123 20 CA 500
345 25 WA 250
678 40 FL 125
957 37 WA 375
• Calculating SUM(Amount) with
column storage:
– Only scan the necessary
blocks
ID Age State Amount
Amazon Redshift dramatically reduces I/O
Column storage
Data compression
Zone maps
• Columnar compression
– Effective due to like data
– Reduces storage
requirements
– Reduces I/O
ID Age State Amount
analyze compression orders;
Table | Column | Encoding
--------+-------------+----------
orders | id | mostly32
orders | age | mostly32
orders | state | lzo
orders | amount | mostly32
Amazon Redshift dramatically reduces I/O
Column storage
Data compression
Zone maps
• In-memory block metadata
• Contains per-block MIN and MAX value
• Effectively prunes blocks which don’t
contain data for a given query
• Minimize unnecessary I/O
ID Age State Amount
Migration Considerations
Forklift = BAD
On Redshift
• Optimal throughput at 8-12 concurrency, 50 possible
• Upfront table design is critical, not able to add indexes
• DISTKEY and SORTKEY significantly influence performance
• Primary Key, Foreign Key, Unique constraints will help
• CAUTION: These are not enforced
• Compression is for speed as well
On Redshift
• Commits are expensive, batch loads
• Blocks are immutable, a small write (~1-10 rows) and a
relatively larger write (~100k rows) have a similar cost
• Partition temporal data with time series tables and
UNION ALL VIEWs
Optimization: Schema Design
Goals of Distribution
• Distribute data evenly for parallel processing
• Minimize data movement
• Co-located Joins
• Localized Aggregations
Distribution Key All
Node 1
Slice 1 Slice 2
Node 2
Slice 3 Slice 4
Node 1
Slice 1 Slice 2
Node 2
Slice 3 Slice 4
Full table data on first
slice of every node
Same key to same location
Node 1
Slice 1 Slice 2
Node 2
Slice 3 Slice 4
Even
Round robin
distribution
Choosing a Distribution Style
KEY
• Large FACT tables
• Large or rapidly changing
tables used in joins
• Localize columns used within
aggregations
ALL
• Have slowly changing data
• Reasonable size (i.e., few
millions but not 100’s of
millions of rows)
• No common distribution key for
frequent joins
• Typical use case – joined
dimension table without a
common distribution key
EVEN
• Tables not frequently joined or
aggregated
• Large tables without acceptable
candidate keys
Goals of Sorting
• Physically sort data within blocks and throughout table
• Enable rrscans to prune blocks by leveraging zone maps
• Optimal SORTKEY is dependent on:
• Query patterns
• Data profile
• Business requirements
COMPOUND
• Most Common
• Well defined filter criteria
• Time-series data
Choosing a SORTKEY
INTERLEAVED
• Edge Cases
• Large tables (>Billion Rows)
• No common filter criteria
• Non time-series data
• Primarily as a query predicate (date, identifier, …)
• Optionally choose a column frequently used for aggregates
• Optionally choose same as distribution key column for most efficient
joins (merge join)
Compressing Data
• COPY automatically analyzes and compresses data
when loading into empty tables
• ANALYZE COMPRESSION checks existing tables and
proposes optimal compression algorithms for each
column
• Changing column encoding requires a table rebuild
Automatic compression is a good thing (mostly)
“The definition of insanity is doing the same thing over and
over again, but expecting different results”.
- Albert Einstein
If you have a regular ETL process and you use temp tables
or staging tables, turn off automatic compression
• Use analyze compression to determine the right encodings
• Bake those encodings into your DML
• Use CREATE TABLE … LIKE
Automatic compression is a good thing (mostly)
• From the zone maps we know:
• Which block(s) contain the
range
• Which row offsets to scan
• Highly compressed sort keys:
• Many rows per block
• Large row offset
Skip compression on just the
leading column of the compound
sortkey
Optimization: Ingest Performance
Amazon Redshift Loading Data Overview
AWS CloudCorporate Data center
Amazon
DynamoDB
Amazon S3
Data
Volume
Amazon Elastic
MapReduce
Amazon
RDS
Amazon
Redshift
Amazon
Glacier
logs / files
Source DBs
VPN
Connection
AWS Direct
Connect
S3 Multipart
Upload
AWS Import/
Export
EC2 or On-
Prem (using
SSH)
Parallelism is a function of load files
Each slice’s query processors are able to load one file at a time
• Streaming Decompression
• Parse
• Distribute
• Write
A single input file means
only one slice is ingesting data
Realizing only partial cluster usage as 6.25% of slices are active
2 4 6 8 10 12 141 3 5 7 9 11 13 15
Maximize Throughput with Multiple Files
Use at least as many input files as
there are slices in cluster
With 16 input files, all slices are
working so you maximize
throughput
COPY continues to scale linearly
as you add additional nodes
2 4 6 8 10 12 141 3 5 7 9 11 13 15
Optimization: Performance Tuning
Optimizing a database for querying
• Periodically check your table status
• Vacuum and Analyze regularly
• SVV_TABLE_INFO
• Missing statistics
• Table skew
• Uncompressed Columns
• Unsorted Data
• Check your cluster status
• WLM queuing
• Commit queuing
• Database Locks
Missing Statistics
• Amazon Redshift’s query
optimizer relies on up-to-date
statistics
• Statistics are only necessary for
data which you are accessing
• Updated stats important on:
• SORTKEY
• DISTKEY
• Columns in query predicates
Table Skew
• Unbalanced workload
• Query completes as fast as the
slowest slice completes
• Can cause skew inflight:
• Temp data fills a single
node resulting in query
failure
Table Maintenance and Status
Unsorted Table
• Sortkey is just a guide, but data
needs to actually be sorted
• VACUUM or DEEP COPY to
sort
• Scans against unsorted tables
continue to benefit from zone
maps:
• Load sequential blocks
WLM Queue
Identify short/long-running queries
and prioritize them
Define multiple queues to route
queries appropriately.
Default concurrency of 5
Leverage wlm_apex_hourly to tune
WLM based on peak concurrency
requirements
Cluster Status: Commits and WLM
Commit Queue
How long is your commit queue?
• Identify needless transactions
• Group dependent statements
within a single transaction
• Offload operational workloads
• STL_COMMIT_STATS
Cluster Status: Database Locks
• Database Locks
• Read locks, Write locks, Exclusive locks
• Reads block exclusive
• Writes block writes and exclusive
• Exclusives block everything
• Ungranted locks block subsequent lock requests
• Exposed through SVV_TRANSACTIONS
Demonstration
https://s3.amazonaws.com/chriz-webinar/webinar.zip
Use SORTKEYs to effectively prune blocks
Use SORTKEYs to effectively prune blocks
Use SORTKEYs to effectively prune blocks
Don’t compress initial SORTKEY column
Use compression encoding to reduce I/O
Choose a DISTKEY which avoids data skew
Ingest: Disable predictable compression analysis
Ingest: Load multiple files to match cluster slices
VACUUM to physically removed deleted rows
VACUUM to keep your tables sorted
Gather statistics to assist the query planner
Thank you!
Questions?
https://s3.amazonaws.com/chriz-webinar/webinar.zip

Weitere ähnliche Inhalte

Was ist angesagt?

AWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAmazon Web Services
 
Scalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedScalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedFlyData Inc.
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftAmazon Web Services
 
AWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing PerformanceAWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing PerformanceAmazon Web Services
 
Migration to Redshift from SQL Server
Migration to Redshift from SQL ServerMigration to Redshift from SQL Server
Migration to Redshift from SQL Serverjoeharris76
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftBest Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftAmazon Web Services
 
Deep Dive Redshift, with a focus on performance
Deep Dive Redshift, with a focus on performanceDeep Dive Redshift, with a focus on performance
Deep Dive Redshift, with a focus on performanceAmazon Web Services
 
Amazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and OptimizationAmazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and OptimizationAmazon Web Services
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon RedshiftUses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon RedshiftAmazon Web Services
 
AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features Amazon Web Services
 
Amazon Redshift Deep Dive - February Online Tech Talks
Amazon Redshift Deep Dive - February Online Tech TalksAmazon Redshift Deep Dive - February Online Tech Talks
Amazon Redshift Deep Dive - February Online Tech TalksAmazon Web Services
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon RedshiftAmazon Web Services
 
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Amazon Web Services
 
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon RedshiftAmazon Web Services
 
Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS r...
Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS r...Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS r...
Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS r...Amazon Web Services
 
Best Practices for Migrating Your Data Warehouse to Amazon Redshift
Best Practices for Migrating Your Data Warehouse to Amazon RedshiftBest Practices for Migrating Your Data Warehouse to Amazon Redshift
Best Practices for Migrating Your Data Warehouse to Amazon RedshiftAmazon Web Services
 

Was ist angesagt? (20)

AWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon Redshift
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Scalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedScalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query Speed
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon Redshift
 
AWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing PerformanceAWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing Performance
 
Migration to Redshift from SQL Server
Migration to Redshift from SQL ServerMigration to Redshift from SQL Server
Migration to Redshift from SQL Server
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftBest Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift
 
Deep Dive Redshift, with a focus on performance
Deep Dive Redshift, with a focus on performanceDeep Dive Redshift, with a focus on performance
Deep Dive Redshift, with a focus on performance
 
Amazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and OptimizationAmazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and Optimization
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon RedshiftUses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift
 
AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features
 
Amazon Redshift Deep Dive - February Online Tech Talks
Amazon Redshift Deep Dive - February Online Tech TalksAmazon Redshift Deep Dive - February Online Tech Talks
Amazon Redshift Deep Dive - February Online Tech Talks
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift
 
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
 
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
 
Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS r...
Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS r...Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS r...
Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS r...
 
Amazon Redshift Masterclass
Amazon Redshift MasterclassAmazon Redshift Masterclass
Amazon Redshift Masterclass
 
Redshift overview
Redshift overviewRedshift overview
Redshift overview
 
Best Practices for Migrating Your Data Warehouse to Amazon Redshift
Best Practices for Migrating Your Data Warehouse to Amazon RedshiftBest Practices for Migrating Your Data Warehouse to Amazon Redshift
Best Practices for Migrating Your Data Warehouse to Amazon Redshift
 
What's New in Amazon Aurora
What's New in Amazon AuroraWhat's New in Amazon Aurora
What's New in Amazon Aurora
 

Andere mochten auch

Data Warehousing in the Era of Big Data: Intro to Amazon Redshift
Data Warehousing in the Era of Big Data: Intro to Amazon RedshiftData Warehousing in the Era of Big Data: Intro to Amazon Redshift
Data Warehousing in the Era of Big Data: Intro to Amazon RedshiftAmazon Web Services
 
Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesMigrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesAmazon Web Services
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift Amazon Web Services
 
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...Amazon Web Services
 
Accelerate Go-To-Market Speed in a CI/CD Environment
Accelerate Go-To-Market Speed in a CI/CD EnvironmentAccelerate Go-To-Market Speed in a CI/CD Environment
Accelerate Go-To-Market Speed in a CI/CD EnvironmentAmazon Web Services
 
AWS Customer Presentation : PBS - AWS Summit 2012 - NYC
AWS Customer Presentation : PBS - AWS Summit 2012 - NYCAWS Customer Presentation : PBS - AWS Summit 2012 - NYC
AWS Customer Presentation : PBS - AWS Summit 2012 - NYCAmazon Web Services
 
CPN203 Saving with EC2 Spot Instances - AWS re: Invent 2012
CPN203 Saving with EC2 Spot Instances - AWS re: Invent 2012CPN203 Saving with EC2 Spot Instances - AWS re: Invent 2012
CPN203 Saving with EC2 Spot Instances - AWS re: Invent 2012Amazon Web Services
 
DAT201 Migrating Databases to AWS - AWS re: Invent 2012
DAT201 Migrating Databases to AWS - AWS re: Invent 2012DAT201 Migrating Databases to AWS - AWS re: Invent 2012
DAT201 Migrating Databases to AWS - AWS re: Invent 2012Amazon Web Services
 
Security in the AWS Cloud - Steve Riley
Security in the AWS Cloud - Steve RileySecurity in the AWS Cloud - Steve Riley
Security in the AWS Cloud - Steve RileyAmazon Web Services
 
AWS Compute Services State of the Union (CPN202) | AWS re:Invent 2013
AWS Compute Services State of the Union (CPN202) | AWS re:Invent 2013AWS Compute Services State of the Union (CPN202) | AWS re:Invent 2013
AWS Compute Services State of the Union (CPN202) | AWS re:Invent 2013Amazon Web Services
 
AWS Paris Summit 2014 - T2 - Amazon Workspaces, postes de travail sur le cloud
AWS Paris Summit 2014 - T2 - Amazon Workspaces, postes de travail sur le cloudAWS Paris Summit 2014 - T2 - Amazon Workspaces, postes de travail sur le cloud
AWS Paris Summit 2014 - T2 - Amazon Workspaces, postes de travail sur le cloudAmazon Web Services
 
Relational Databases Redefined with AWS
Relational Databases Redefined with AWSRelational Databases Redefined with AWS
Relational Databases Redefined with AWSAmazon Web Services
 
Modern Security and Compliance Through Automation
Modern Security and Compliance Through AutomationModern Security and Compliance Through Automation
Modern Security and Compliance Through AutomationAmazon Web Services
 
AWS Summit 2013 | India - 0 to Production in 40 minutes, Pieter Kemps
AWS Summit 2013 | India - 0 to Production in 40 minutes, Pieter KempsAWS Summit 2013 | India - 0 to Production in 40 minutes, Pieter Kemps
AWS Summit 2013 | India - 0 to Production in 40 minutes, Pieter KempsAmazon Web Services
 
The 2014 AWS Enterprise Summit Keynote
The 2014 AWS Enterprise Summit Keynote The 2014 AWS Enterprise Summit Keynote
The 2014 AWS Enterprise Summit Keynote Amazon Web Services
 
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
(DVO205) Monitoring Evolution: Flying Blind to Flying by InstrumentAmazon Web Services
 
RMG204 Optimizing Costs with AWS - AWS re: Invent 2012
RMG204 Optimizing Costs with AWS - AWS re: Invent 2012RMG204 Optimizing Costs with AWS - AWS re: Invent 2012
RMG204 Optimizing Costs with AWS - AWS re: Invent 2012Amazon Web Services
 

Andere mochten auch (20)

Data Warehousing in the Era of Big Data: Intro to Amazon Redshift
Data Warehousing in the Era of Big Data: Intro to Amazon RedshiftData Warehousing in the Era of Big Data: Intro to Amazon Redshift
Data Warehousing in the Era of Big Data: Intro to Amazon Redshift
 
Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesMigrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
 
Amazon Redshift Deep Dive
Amazon Redshift Deep Dive Amazon Redshift Deep Dive
Amazon Redshift Deep Dive
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift
 
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
 
Accelerate Go-To-Market Speed in a CI/CD Environment
Accelerate Go-To-Market Speed in a CI/CD EnvironmentAccelerate Go-To-Market Speed in a CI/CD Environment
Accelerate Go-To-Market Speed in a CI/CD Environment
 
Stg205 amazon s3
Stg205 amazon s3Stg205 amazon s3
Stg205 amazon s3
 
AWS Customer Presentation : PBS - AWS Summit 2012 - NYC
AWS Customer Presentation : PBS - AWS Summit 2012 - NYCAWS Customer Presentation : PBS - AWS Summit 2012 - NYC
AWS Customer Presentation : PBS - AWS Summit 2012 - NYC
 
CPN203 Saving with EC2 Spot Instances - AWS re: Invent 2012
CPN203 Saving with EC2 Spot Instances - AWS re: Invent 2012CPN203 Saving with EC2 Spot Instances - AWS re: Invent 2012
CPN203 Saving with EC2 Spot Instances - AWS re: Invent 2012
 
DAT201 Migrating Databases to AWS - AWS re: Invent 2012
DAT201 Migrating Databases to AWS - AWS re: Invent 2012DAT201 Migrating Databases to AWS - AWS re: Invent 2012
DAT201 Migrating Databases to AWS - AWS re: Invent 2012
 
Security in the AWS Cloud - Steve Riley
Security in the AWS Cloud - Steve RileySecurity in the AWS Cloud - Steve Riley
Security in the AWS Cloud - Steve Riley
 
AWS Compute Services State of the Union (CPN202) | AWS re:Invent 2013
AWS Compute Services State of the Union (CPN202) | AWS re:Invent 2013AWS Compute Services State of the Union (CPN202) | AWS re:Invent 2013
AWS Compute Services State of the Union (CPN202) | AWS re:Invent 2013
 
AWS Paris Summit 2014 - T2 - Amazon Workspaces, postes de travail sur le cloud
AWS Paris Summit 2014 - T2 - Amazon Workspaces, postes de travail sur le cloudAWS Paris Summit 2014 - T2 - Amazon Workspaces, postes de travail sur le cloud
AWS Paris Summit 2014 - T2 - Amazon Workspaces, postes de travail sur le cloud
 
6 rules for innovation
6 rules for innovation6 rules for innovation
6 rules for innovation
 
Relational Databases Redefined with AWS
Relational Databases Redefined with AWSRelational Databases Redefined with AWS
Relational Databases Redefined with AWS
 
Modern Security and Compliance Through Automation
Modern Security and Compliance Through AutomationModern Security and Compliance Through Automation
Modern Security and Compliance Through Automation
 
AWS Summit 2013 | India - 0 to Production in 40 minutes, Pieter Kemps
AWS Summit 2013 | India - 0 to Production in 40 minutes, Pieter KempsAWS Summit 2013 | India - 0 to Production in 40 minutes, Pieter Kemps
AWS Summit 2013 | India - 0 to Production in 40 minutes, Pieter Kemps
 
The 2014 AWS Enterprise Summit Keynote
The 2014 AWS Enterprise Summit Keynote The 2014 AWS Enterprise Summit Keynote
The 2014 AWS Enterprise Summit Keynote
 
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
 
RMG204 Optimizing Costs with AWS - AWS re: Invent 2012
RMG204 Optimizing Costs with AWS - AWS re: Invent 2012RMG204 Optimizing Costs with AWS - AWS re: Invent 2012
RMG204 Optimizing Costs with AWS - AWS re: Invent 2012
 

Ähnlich wie AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics

Deep Dive: Amazon Redshift (March 2017)
Deep Dive: Amazon Redshift (March 2017)Deep Dive: Amazon Redshift (March 2017)
Deep Dive: Amazon Redshift (March 2017)Julien SIMON
 
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...Amazon Web Services
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftBest Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftAmazon Web Services
 
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)Amazon Web Services
 
Data & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon RedshiftData & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon RedshiftAmazon Web Services
 
Best Practices – Extreme Performance with Data Warehousing on Oracle Databa...
Best Practices –  Extreme Performance with Data Warehousing  on Oracle Databa...Best Practices –  Extreme Performance with Data Warehousing  on Oracle Databa...
Best Practices – Extreme Performance with Data Warehousing on Oracle Databa...Edgar Alejandro Villegas
 
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...Amazon Web Services
 
Deep Dive into DynamoDB
Deep Dive into DynamoDBDeep Dive into DynamoDB
Deep Dive into DynamoDBAWS Germany
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift Amazon Web Services
 
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013Michael Bohlig
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseAmazon Web Services
 
Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...
Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...
Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...Maaz Anjum
 
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...Amazon Web Services
 
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftBest Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftSnapLogic
 

Ähnlich wie AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics (20)

Deep Dive: Amazon Redshift (March 2017)
Deep Dive: Amazon Redshift (March 2017)Deep Dive: Amazon Redshift (March 2017)
Deep Dive: Amazon Redshift (March 2017)
 
Processing and Analytics
Processing and AnalyticsProcessing and Analytics
Processing and Analytics
 
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftBest Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift
 
Redshift deep dive
Redshift deep diveRedshift deep dive
Redshift deep dive
 
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
 
Data & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon RedshiftData & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon Redshift
 
Cassandra training
Cassandra trainingCassandra training
Cassandra training
 
Best Practices – Extreme Performance with Data Warehousing on Oracle Databa...
Best Practices –  Extreme Performance with Data Warehousing  on Oracle Databa...Best Practices –  Extreme Performance with Data Warehousing  on Oracle Databa...
Best Practices – Extreme Performance with Data Warehousing on Oracle Databa...
 
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
 
Deep Dive into DynamoDB
Deep Dive into DynamoDBDeep Dive into DynamoDB
Deep Dive into DynamoDB
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift
 
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data Warehouse
 
Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...
Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...
Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...
 
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
 
L6.sp17.pptx
L6.sp17.pptxL6.sp17.pptx
L6.sp17.pptx
 
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftBest Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
 

Mehr von Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mehr von Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Kürzlich hochgeladen

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 

Kürzlich hochgeladen (20)

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics

  • 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Zach Christopherson, Database Engineer, AWS June 28, 2016 Amazon Redshift for Big Data Analytics
  • 2. Agenda • Service Overview • Migration Considerations • Optimization: • Schema / Table Design • Data Ingestion • Maintenance and Database Tuning • Demonstration
  • 4. Relational data warehouse Massively parallel; petabyte scale Fully managed HDD and SSD platforms $1,000/TB/year; starts at $0.25/hour Amazon Redshift a lot faster a lot simpler a lot cheaper
  • 6. Amazon Redshift system architecture Leader node • SQL endpoint • Stores metadata • Coordinates query execution Compute nodes • Local, columnar storage • Execute queries in parallel • Load, backup, restore via Amazon S3; load from Amazon DynamoDB, Amazon EMR, or SSH Two hardware platforms • Optimized for data processing • DS2: HDD; scale from 2TB to 2PB • DC1: SSD; scale from 160GB to 326TB 10 GigE (HPC) Ingestion Backup Restore JDBC/ODBC
  • 7. A deeper look at compute node architecture Each node contains multiple slices • DS2 – 2 slices on XL, 16 on 8XL • DC1 – 2 slices on L, 32 on 8XL Each slice is allocated CPU, and table data Each slice processes a piece of the workload in parallel Leader Node
  • 8. Amazon Redshift dramatically reduces I/O Data compression Zone maps ID Age State Amount 123 20 CA 500 345 25 WA 250 678 40 FL 125 957 37 WA 375 • Calculating SUM(Amount) with row storage: – Need to read everything – Unnecessary I/O ID Age State Amount
  • 9. Amazon Redshift dramatically reduces I/O Data compression Zone maps ID Age State Amount 123 20 CA 500 345 25 WA 250 678 40 FL 125 957 37 WA 375 • Calculating SUM(Amount) with column storage: – Only scan the necessary blocks ID Age State Amount
  • 10. Amazon Redshift dramatically reduces I/O Column storage Data compression Zone maps • Columnar compression – Effective due to like data – Reduces storage requirements – Reduces I/O ID Age State Amount analyze compression orders; Table | Column | Encoding --------+-------------+---------- orders | id | mostly32 orders | age | mostly32 orders | state | lzo orders | amount | mostly32
  • 11. Amazon Redshift dramatically reduces I/O Column storage Data compression Zone maps • In-memory block metadata • Contains per-block MIN and MAX value • Effectively prunes blocks which don’t contain data for a given query • Minimize unnecessary I/O ID Age State Amount
  • 14. On Redshift • Optimal throughput at 8-12 concurrency, 50 possible • Upfront table design is critical, not able to add indexes • DISTKEY and SORTKEY significantly influence performance • Primary Key, Foreign Key, Unique constraints will help • CAUTION: These are not enforced • Compression is for speed as well
  • 15. On Redshift • Commits are expensive, batch loads • Blocks are immutable, a small write (~1-10 rows) and a relatively larger write (~100k rows) have a similar cost • Partition temporal data with time series tables and UNION ALL VIEWs
  • 17. Goals of Distribution • Distribute data evenly for parallel processing • Minimize data movement • Co-located Joins • Localized Aggregations Distribution Key All Node 1 Slice 1 Slice 2 Node 2 Slice 3 Slice 4 Node 1 Slice 1 Slice 2 Node 2 Slice 3 Slice 4 Full table data on first slice of every node Same key to same location Node 1 Slice 1 Slice 2 Node 2 Slice 3 Slice 4 Even Round robin distribution
  • 18. Choosing a Distribution Style KEY • Large FACT tables • Large or rapidly changing tables used in joins • Localize columns used within aggregations ALL • Have slowly changing data • Reasonable size (i.e., few millions but not 100’s of millions of rows) • No common distribution key for frequent joins • Typical use case – joined dimension table without a common distribution key EVEN • Tables not frequently joined or aggregated • Large tables without acceptable candidate keys
  • 19. Goals of Sorting • Physically sort data within blocks and throughout table • Enable rrscans to prune blocks by leveraging zone maps • Optimal SORTKEY is dependent on: • Query patterns • Data profile • Business requirements
  • 20. COMPOUND • Most Common • Well defined filter criteria • Time-series data Choosing a SORTKEY INTERLEAVED • Edge Cases • Large tables (>Billion Rows) • No common filter criteria • Non time-series data • Primarily as a query predicate (date, identifier, …) • Optionally choose a column frequently used for aggregates • Optionally choose same as distribution key column for most efficient joins (merge join)
  • 21. Compressing Data • COPY automatically analyzes and compresses data when loading into empty tables • ANALYZE COMPRESSION checks existing tables and proposes optimal compression algorithms for each column • Changing column encoding requires a table rebuild
  • 22. Automatic compression is a good thing (mostly) “The definition of insanity is doing the same thing over and over again, but expecting different results”. - Albert Einstein If you have a regular ETL process and you use temp tables or staging tables, turn off automatic compression • Use analyze compression to determine the right encodings • Bake those encodings into your DML • Use CREATE TABLE … LIKE
  • 23. Automatic compression is a good thing (mostly) • From the zone maps we know: • Which block(s) contain the range • Which row offsets to scan • Highly compressed sort keys: • Many rows per block • Large row offset Skip compression on just the leading column of the compound sortkey
  • 25. Amazon Redshift Loading Data Overview AWS CloudCorporate Data center Amazon DynamoDB Amazon S3 Data Volume Amazon Elastic MapReduce Amazon RDS Amazon Redshift Amazon Glacier logs / files Source DBs VPN Connection AWS Direct Connect S3 Multipart Upload AWS Import/ Export EC2 or On- Prem (using SSH)
  • 26. Parallelism is a function of load files Each slice’s query processors are able to load one file at a time • Streaming Decompression • Parse • Distribute • Write A single input file means only one slice is ingesting data Realizing only partial cluster usage as 6.25% of slices are active 2 4 6 8 10 12 141 3 5 7 9 11 13 15
  • 27. Maximize Throughput with Multiple Files Use at least as many input files as there are slices in cluster With 16 input files, all slices are working so you maximize throughput COPY continues to scale linearly as you add additional nodes 2 4 6 8 10 12 141 3 5 7 9 11 13 15
  • 29. Optimizing a database for querying • Periodically check your table status • Vacuum and Analyze regularly • SVV_TABLE_INFO • Missing statistics • Table skew • Uncompressed Columns • Unsorted Data • Check your cluster status • WLM queuing • Commit queuing • Database Locks
  • 30. Missing Statistics • Amazon Redshift’s query optimizer relies on up-to-date statistics • Statistics are only necessary for data which you are accessing • Updated stats important on: • SORTKEY • DISTKEY • Columns in query predicates
  • 31. Table Skew • Unbalanced workload • Query completes as fast as the slowest slice completes • Can cause skew inflight: • Temp data fills a single node resulting in query failure Table Maintenance and Status Unsorted Table • Sortkey is just a guide, but data needs to actually be sorted • VACUUM or DEEP COPY to sort • Scans against unsorted tables continue to benefit from zone maps: • Load sequential blocks
  • 32. WLM Queue Identify short/long-running queries and prioritize them Define multiple queues to route queries appropriately. Default concurrency of 5 Leverage wlm_apex_hourly to tune WLM based on peak concurrency requirements Cluster Status: Commits and WLM Commit Queue How long is your commit queue? • Identify needless transactions • Group dependent statements within a single transaction • Offload operational workloads • STL_COMMIT_STATS
  • 33. Cluster Status: Database Locks • Database Locks • Read locks, Write locks, Exclusive locks • Reads block exclusive • Writes block writes and exclusive • Exclusives block everything • Ungranted locks block subsequent lock requests • Exposed through SVV_TRANSACTIONS
  • 35. Use SORTKEYs to effectively prune blocks
  • 36. Use SORTKEYs to effectively prune blocks
  • 37. Use SORTKEYs to effectively prune blocks
  • 38. Don’t compress initial SORTKEY column
  • 39. Use compression encoding to reduce I/O
  • 40. Choose a DISTKEY which avoids data skew
  • 41. Ingest: Disable predictable compression analysis
  • 42. Ingest: Load multiple files to match cluster slices
  • 43. VACUUM to physically removed deleted rows
  • 44. VACUUM to keep your tables sorted
  • 45. Gather statistics to assist the query planner