SlideShare ist ein Scribd-Unternehmen logo
1 von 65
Downloaden Sie, um offline zu lesen
Š 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Pavan Pothukuchi, Principal Product Manager, AWS
Chris Liu, Data Infrastructure Engineer, Coursera
July 13, 2016
Getting Started with
Amazon Redshift
Agenda
• Introduction
• Benefits
• Use cases
• Redshift @ Coursera
• Q&A
AnalyzeStore
Amazon
Glacier
Amazon S3
Amazon
DynamoDB
Amazon RDS,
Amazon Aurora
AWS big data portfolio
AWS Data Pipeline
Amazon
CloudSearch
Amazon EMR Amazon EC2
Amazon
Redshift
Amazon
Machine
Learning
Amazon
Elasticsearch
Service
AWS Database
Migration Service
Amazon
QuickSight
Amazon
Kinesis
Firehose
AWS Import/Export
Snowball
AWS Direct
Connect
Collect
Amazon Kinesis
Streams
Relational data warehouse
Massively parallel; petabyte scale
Fully managed
HDD and SSD platforms
$1,000/TB/year; starts at $0.25/hour
Amazon
Redshift
a lot faster
a lot simpler
a lot cheaper
The Amazon Redshift view of data warehousing
10x cheaper
Easy to provision
Higher DBA productivity
10x faster
No programming
Easily leverage BI tools,
Hadoop, machine learning,
streaming
Analysis inline with process
flows
Pay as you go, grow as you
need
Managed availability and
disaster recovery
Enterprise Big data SaaS
The Forrester Wave™ is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave™ are trademarks of Forrester Research, Inc. The Forrester Wave™ is a graphical
representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any
vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.
Forrester Wave™ enterprise data warehouse Q4 ’15
Selected Amazon Redshift customers
Amazon Redshift architecture
Leader node
Simple SQL endpoint
Stores metadata
Optimizes query plan
Coordinates query execution
Compute nodes
Local columnar storage
Parallel/distributed execution of all queries, loads,
backups, restores, resizes
Start at just $0.25/hour, grow to 2 PB (compressed)
DC1: SSD; scale from 160 GB to 326 TB
DS2: HDD; scale from 2 TB to 2 PB
Ingestion/backup
Backup
Restore
JDBC/ODBC
10 GigE
(HPC)
Benefit #1: Amazon Redshift is fast
Dramatically less I/O
Column storage
Data compression
Zone maps
Direct-attached storage
Large data block sizes
analyze compression listing;
Table | Column | Encoding
---------+----------------+----------
listing | listid | delta
listing | sellerid | delta32k
listing | eventid | delta32k
listing | dateid | bytedict
listing | numtickets | bytedict
listing | priceperticket | delta32k
listing | totalprice | mostly32
listing | listtime | raw
10 | 13 | 14 | 26 |…
… | 100 | 245 | 324
375 | 393 | 417…
… 512 | 549 | 623
637 | 712 | 809 …
… | 834 | 921 | 959
10
324
375
623
637
959
Benefit #1: Amazon Redshift is fast
Parallel and distributed
Query
Load
Export
Backup
Restore
Resize
Benefit #1: Amazon Redshift is fast
Hardware optimized for I/O intensive workloads, 4 GB/sec/node
Enhanced networking, over 1 million packets/sec/node
Choice of storage type, instance size
Regular cadence of autopatched improvements
Benefit #1: Amazon Redshift is fast
New dense storage (HDD) instance type (Jun 2015)
Improved memory 2x, compute 2x, disk throughput 1.5x
Cost: Same as our prior generation!
Performance improvement: 50%
Enhanced I/O and commit improvements (Jan 2016)
Performance improvement: 35%
Memory allocation improvements (May 2016)
Performance improvement: 60%
Benefit #2: Amazon Redshift is inexpensive
Ds2 (HDD)
Price per hour for
DW1.XL single node
Effective annual
price per TB compressed
On demand $ 0.850 $ 3,725
1-year reservation $ 0.500 $ 2,190
3-year reservation $ 0.228 $ 999
Dc1 (SSD)
Price per hour for
DW2.L single node
Effective annual
price per TB compressed
On demand $ 0.250 $ 13,690
1-year reservation $ 0.161 $ 8,795
3-year reservation $ 0.100 $ 5,500
Pricing is simple
Number of nodes x price/hour
No charge for leader node
No upfront costs
Pay as you go
Benefit #3: Amazon Redshift is fully managed
Continuous/incremental backups
Multiple copies within cluster
Continuous and incremental backups
to Amazon S3
Continuous and incremental backups
across regions
Streaming restore
Amazon S3
Amazon S3
Region 1
Region 2
Benefit #3: Amazon Redshift is fully managed
Amazon S3
Amazon S3
Region 1
Region 2
Fault tolerance
Disk failures
Node failures
Network failures
Availability Zone/region level disasters
Benefit #4: Security is built in
• Load encrypted from Amazon S3
• SSL to secure data in transit
• ECDHE perfect forward security
• Amazon VPC for network isolation
• Encryption to secure data at rest
• All blocks on disks and in Amazon S3 encrypted
• Block key, cluster key, master key (AES-256)
• On-premises HSM and AWS CloudHSM support
• Audit logging and AWS CloudTrail integration
• SOC 1/2/3, PCI-DSS, FedRAMP, BAA
10 GigE
(HPC)
Ingestion
Backup
Restore
Customer VPC
Internal
VPC
JDBC/ODBC
Benefit #5: We innovate quickly
Well over 125 new features added since launch
Release every two weeks
Automatic patching
Service Launch (2/14)
PDX (4/2)
Temp Credentials (4/11)
DUB (4/25)
SOC1/2/3 (5/8)
Unload Encrypted Files
NRT (6/5)
JDBC Fetch Size (6/27)
Unload logs (7/5)
SHA1 Builtin (7/15)
4 byte UTF-8 (7/18)
Sharing snapshots (7/18)
Statement Timeout (7/22)
Timezone, Epoch, Autoformat (7/25)
WLM Timeout/Wildcards (8/1)
CRC32 Builtin, CSV, Restore Progress
(8/9)
Resource Level IAM (8/9)
PCI (8/22)
UTF-8 Substitution (8/29)
JSON, Regex, Cursors (9/10)
Split_part, Audit tables (10/3)
SIN/SYD (10/8)
HSM Support (11/11)
Kinesis EMR/HDFS/SSH copy,
Distributed Tables, Audit
Logging/CloudTrail, Concurrency, Resize
Perf., Approximate Count Distinct, SNS
Alerts, Cross Region Backup (11/13)
Distributed Tables, Single Node Cursor
Support, Maximum Connections to 500
(12/13)
EIP Support for VPC Clusters (12/28)
New query monitoring system tables and
diststyle all (1/13)
Redshift on DW2 (SSD) Nodes (1/23)
Compression for COPY from SSH, Fetch
size support for single node clusters, new
system tables with commit stats,
row_number(), strotol() and query
termination (2/13)
Resize progress indicator & Cluster
Version (3/21)
Regex_Substr, COPY from JSON (3/25)
50 slots, COPY from EMR, ECDHE
ciphers (4/22)
3 new regex features, Unload to single
file, FedRAMP(5/6)
Rename Cluster (6/2)
Copy from multiple regions,
percentile_cont, percentile_disc (6/30)
Free Trial (7/1)
pg_last_unload_count (9/15)
AES-128 S3 encryption (9/29)
UTF-16 support (9/29)
Benefit #6: Amazon Redshift is powerful
• Approximate functions
• User-defined functions
• Machine learning
• Data science
Benefit #7: Amazon Redshift has a large ecosystem
Data integration Systems integratorsBusiness intelligence
Benefit #8: Service-oriented architecture
Amazon
DynamoDB
Amazon EMR
Amazon S3
Amazon EC2/SSH
Amazon RDS/
Amazon Aurora
Amazon
Redshift
Amazon Kinesis
Amazon ML
AWS Data Pipeline
Amazon
CloudSearch
Amazon
Mobile
Analytics
Performance
Ease of use
Security
Analytics and
functionality
SOA
Recent launches Dynamic WLM parameters
Queue hopping for timed-out queries
Merge rows from staging to prod. table
2x improvement in query throughput
10x latency improvement for UNION ALL queries
Bzip2 format for ingestion
Table level restore
10x improvement in vacuum perf.
Default access privileges
Tag-based AWS IAM access
IAM roles for COPY/UNLOAD
SAS connector enhancements,
Implicit conversion of SAS
queries to Amazon Redshift
DMS support from OLTP sources
Enhanced data ingestion from
Kinesis Firehose
Improved data schema conversion
to Amazon ML
Use cases
NTT Docomo: Japan’s largest mobile service
provider
68 million customers
Tens of TBs per day of data across a
mobile network
6 PB of total data (uncompressed)
Data science for marketing
operations, logistics, and so on
Greenplum on premises
Scaling challenges
Performance issues
Need same level of security
Need for a hybrid environment
NTT Docomo: Japan’s largest mobile service
provider
125 node DS2.8XL cluster
4,500 vCPUs, 30 TB RAM
2 PB compressed
10x faster analytic queries
50% reduction in time for new
BI application deployment
Significantly less operations
overhead
Data
Source
ET
AWS
Direct
Connect
Client
Forwarder
LoaderState
management
SandboxAmazon Redshift
S3
Nasdaq: powering 100 marketplaces in 50
countries
Orders, quotes, trade executions,
market “tick” data from 7 exchanges
7 billion rows/day
Analyze market share, client activity,
surveillance, billing, and so on
Microsoft SQL Server on premises
Expensive legacy DW
($1.16 M/yr.)
Limited capacity (1 yr. of data
online)
Needed lower TCO
Must satisfy multiple security
and regulatory requirements
Similar performance
23 node DS2.8XL cluster
828 vCPUs, 5 TB RAM
368 TB compressed
2.7 T rows, 900 B derived
8 tables with 100 B rows
7 man-month migration
Âź the cost, 2x storage, room to
grow
Faster performance, very
secure
Nasdaq: powering 100 marketplaces in 50
countries
Amazon Redshift @
Chris Liu – Data Infrastructure Engineer
Education at scale
140
partners
3.7 million
course completions
20 million
learners worldwide
1900
courses
Takeaways
• Simplicity
• Scalability
• Flexibility
• Extensibility
Outline
• Moving from no data warehouse to the Amazon
Redshift ecosystem
• No warehouse: m2.2xlarge read replica – 4 CPUs, 32 GB RAM on Amazon RDS
• First Amazon Redshift cluster: 1 ds1.xl node – 2 CPU, 16 GB RAM
Outline
• Moving from no data warehouse to the Amazon
Redshift ecosystem
• No warehouse: m2.2xlarge read replica – 4 CPUs, 32 GB RAM on Amazon RDS
• First Amazon Redshift cluster: 1 ds1.xl node – 2 CPU, 16 GB RAM
• The Amazon Redshift ecosystem at Coursera
• Current day: 9 dc1.8xl nodes – 288 CPUs, 2.4 TB RAM
Outline
• Moving from no data warehouse to the Amazon
Redshift ecosystem
• No warehouse: m2.2xlarge read replica – 4 CPUs, 32 GB RAM on Amazon RDS
• First Amazon Redshift cluster: 1 ds1.xl node – 2 CPU, 16 GB RAM
• The Amazon Redshift ecosystem at Coursera
• Current day: 9 dc1.8xl nodes – 288 CPUs, 2.4 TB RAM
• Learnings from 3 years on Amazon Redshift
• Lessons in communication, surprises, reflections
Starting point
• Querying production read replica
• Makeshift libraries providing thin abstraction layer
• 45 minutes to provide aggregate metrics over all classes running on Coursera =(
Starting point
• Querying production read replica
• Makeshift libraries providing thin abstraction layer
• 45 minutes to provide aggregate metrics over all classes running on Coursera =(
Move in progress
• Risk-free deployment
• "Let's try it out"
• Few clicks to deploy cluster, connect to cluster, resize
Move in progress
• Risk-free deployment
• "Let's try it out"
• Few clicks to deploy cluster, connect to cluster, resize
• AWS ecosystem integration
• COPY from S3/EMR/SSH
• Unload to S3
• UNLOAD(COPY(data)) == COPY(UNLOAD(data)) == data
Move in progress
• Risk-free deployment
• "Let's try it out"
• Few clicks to deploy cluster, connect to cluster, resize
• AWS ecosystem integration
• COPY from S3/EMR/SSH
• Unload to S3
• UNLOAD(COPY(data)) == COPY(UNLOAD(data)) == data
• Minimal administration
• In aggregate, less than 1 full-time employee for administration
• Automation and tooling for monitoring usage and performance
Minimal administration
Amazon Redshift ecosystem at Coursera
• Data flow in and out of Amazon Redshift
• Business insights and reporting
• Deriving value from data
• Democratizing data access
Amazon Redshift ecosystem at Coursera
• Data flow in and out of Amazon Redshift
• Business insights and reporting
• Deriving value from data
• Democratizing data access
Amazon
Redshift
Amazon
RDS
Amazon EMR Amazon S3
Event feeds
Amazon EC2
Amazon
RDS
Amazon S3
BI applications
Internal applications
Third-party tools
Cassandra
Cassandra
AWS Data Pipeline
Data flow
Amazon S3
Amazon Redshift ecosystem at Coursera
• Data flow in and out of Amazon Redshift
• Business insights and reporting
• Data products
• Democratizing data access
• Provide directional insights and aggregate metrics
• Aggregate metrics over all classes on Coursera: < 5 seconds
• Goal: Insight at the speed of thought
• Results1: 0.8s median, 28s p95, 120s p99
• Companywide goal tracking
• Scheduled reports to internal and external stakeholders
• Crucial part of data informed culture
1 Results for ad hoc queries ran in the last 90 days
Business insights and reporting
Amazon Redshift ecosystem at Coursera
• Data flow in and out of Amazon Redshift
• Business insights and reporting
• Data products
• Democratizing data access
• AB experimentation
• 300 M impression table joined with 1.8 B events table in 12 minutes.
• Recommendations model
• Amazon Redshift for relational transformation
• Unload to S3 for model training
• Providing university partners with analytical dashboard
and research exports
Data Products
Amazon Redshift ecosystem at Coursera
• Data flow in and out of Amazon Redshift
• Business insights and reporting
• Data products
• Democratizing data access
Democratizing data access
Democratizing data access
Learnings from 3 years on Amazon Redshift
• Thinking in Amazon Redshift
• Communicate to users
• Surprises
• Reflections
Thinking in Amazon Redshift
• Columnar
• SELECT * considered harmful in most cases
Thinking in Amazon Redshift
• Columnar
• SELECT * considered harmful in most cases
• Nodes, slices, blocks
• 1 MB blocks per slice, n slices per node depending on node type
Thinking in Amazon Redshift
• Columnar
• SELECT * considered harmful in most cases
• Nodes, slices, blocks
• 1 MB blocks per slice, n slices per node depending on node type
• Sorting and distribution
• Share nothing massively parallel processing => data is sorted per slice
• Up to 2 orders of magnitude increase in JOIN/GROUP BY for merge join vs hash join
Communicating to users
• Prefer the scientific method over gut feel
• Investigate how many rows were materialized with svl_query_report
• Understand EXPLAIN plan for data distribution, join strategy, predicate order
Communicating to users
• Prefer the scientific method over gut feel
• Investigate how many rows were materialized with svl_query_report
• Understand EXPLAIN plan for data distribution, join strategy, predicate order
• SQL style guide for readability
• Leading commas, capitalized SQL keywords, conventions for handling dates/timestamps,
conventions for table names, mapping tables
Communicating to users
• Prefer the scientific method over gut feel
• Investigate how many rows were materialized with svl_query_report
• Understand EXPLAIN plan for data distribution, join strategy, predicate order
• SQL style guide for readability
• Leading commas, capitalized SQL keywords, conventions for handling dates/timestamps,
conventions for table names, mapping tables
• Use the right tool for the right task
• Amazon Redshift is not for online traffic serving
• Amazon Redshift is not for stream processing
Surprises
• "Fundamental theorem of Redshift at Coursera"
• Most queries involve full table scans
• 9 nodes x 32 slices/node x 1 block/slice x 1 MB/block => at least 288 MB allocated per column
• Store ~75 M integer values and maintain 1 block per slice
Surprises
• "Fundamental theorem of Redshift at Coursera"
• Most queries involve full table scans
• 9 nodes x 32 slices/node x 1 block/slice x 1 MB/block => at least 288 MB allocated per column
• Store ~75 M integer values and maintain 1 block per slice
• Features may behave in unexpected fashions
• Sort key compression
• Primary and foreign keys
Surprises
• "Fundamental theorem of Redshift at Coursera"
• Most queries involve full table scans
• 9 nodes x 32 slices/node x 1 block/slice x 1 MB/block => at least 288 MB allocated per column
• Store ~75 M integer values and maintain 1 block per slice
• Features may behave in unexpected fashions
• Sort key compression
• Primary and foreign keys
• Features may be unexpectedly expensive
• COMMIT – Batch work, monitor with stl_commit_stats
• VACUUM – Prefer TRUNCATE, monitor with stl_vacuum, stl_query
• Your mileage may vary
Reflections
• Simplicity – Relational model, Postgres 8.0 compliant SQL, things just
work. Minimal administration. Minimal tuning.
Reflections
• Simplicity – Relational model, Postgres 8.0 compliant SQL, things just
work. Minimal administration. Minimal tuning.
• Scalability – Scaled cluster up 5 times in the last 3 years as data volume
and usage increased.
Reflections
• Simplicity – Relational model, Postgres 8.0 compliant SQL, things just
work. Minimal administration. Minimal tuning.
• Scalability – Scaled cluster up 5 times in the last 3 years as data volume
and usage increased.
• Flexibility – No strict requirement on data modeling; dusty knobs for
tuning in majority of cases. Handles both heavily normalized data model
and denormalized clickstream data.
Reflections
• Simplicity – Relational model, Postgres 8.0 compliant SQL, things just
work. Minimal administration. Minimal tuning.
• Scalability – Scaled cluster up 5 times in the last 3 years as data volume
and usage increased.
• Flexibility – No strict requirement on data modeling; dusty knobs for
tuning in majority of cases. Handles both heavily normalized data model
and denormalized clickstream data.
• Extensibility – Standard API (JDBC/ODBC/libpq) and integration points.
Resources
Pavan Pothukuchi | pavanpo@amazon.com |
Chris Liu | cliu@coursera.org |
Detail pages
• http://aws.amazon.com/redshift
• https://aws.amazon.com/marketplace/redshift/
Best practices
• http://docs.aws.amazon.com/redshift/latest/dg/c_loading-data-best-practices.html
• http://docs.aws.amazon.com/redshift/latest/dg/c_designing-tables-best-practices.html
• http://docs.aws.amazon.com/redshift/latest/dg/c-optimizing-query-performance.html
Related breakout sessions
• Deep Dive on Amazon QuickSight (2:15–3:15 pm)
• Getting Started with Amazon QuickSight (2:15–3:15 pm)
• Database Migration: Simple, Cross-Engine and Cross-Platform Migrations with Minimal Downtime
(4:45–5:45 pm)
Getting Started with Amazon Redshift

Weitere ähnliche Inhalte

Was ist angesagt?

AWS re:Invent 2016: Case Study: How Monsanto Uses Amazon EFS with Their Large...
AWS re:Invent 2016: Case Study: How Monsanto Uses Amazon EFS with Their Large...AWS re:Invent 2016: Case Study: How Monsanto Uses Amazon EFS with Their Large...
AWS re:Invent 2016: Case Study: How Monsanto Uses Amazon EFS with Their Large...Amazon Web Services
 
Amazon Redshift Deep Dive
Amazon Redshift Deep Dive Amazon Redshift Deep Dive
Amazon Redshift Deep Dive Amazon Web Services
 
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)Amazon Web Services
 
Intro to AWS: Database Services
Intro to AWS: Database ServicesIntro to AWS: Database Services
Intro to AWS: Database ServicesAmazon Web Services
 
Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...
Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...
Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...Amazon Web Services
 
SRV407 Deep Dive on Amazon Aurora
SRV407 Deep Dive on Amazon AuroraSRV407 Deep Dive on Amazon Aurora
SRV407 Deep Dive on Amazon AuroraAmazon Web Services
 
AWS re:Invent 2016: How to Launch a 100K-User Corporate Back Office with Micr...
AWS re:Invent 2016: How to Launch a 100K-User Corporate Back Office with Micr...AWS re:Invent 2016: How to Launch a 100K-User Corporate Back Office with Micr...
AWS re:Invent 2016: How to Launch a 100K-User Corporate Back Office with Micr...Amazon Web Services
 
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...Amazon Web Services
 
Deep Dive on Amazon Aurora
Deep Dive on Amazon AuroraDeep Dive on Amazon Aurora
Deep Dive on Amazon AuroraAmazon Web Services
 
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...Amazon Web Services
 
Getting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWSGetting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWSAmazon Web Services
 
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMRBDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMRAmazon Web Services
 
Cost Optimization at Scale
Cost Optimization at ScaleCost Optimization at Scale
Cost Optimization at ScaleAmazon Web Services
 
ENT306 Migrating large Scale Data Sets to the Cloud
ENT306 Migrating large Scale Data Sets to the CloudENT306 Migrating large Scale Data Sets to the Cloud
ENT306 Migrating large Scale Data Sets to the CloudAmazon Web Services
 
HSBC and AWS Day - Big Data and HPC on AWS
HSBC and AWS Day - Big Data and HPC on AWSHSBC and AWS Day - Big Data and HPC on AWS
HSBC and AWS Day - Big Data and HPC on AWSAmazon Web Services
 
Deep Dive on MySQL Databases on AWS - AWS Online Tech Talks
Deep Dive on MySQL Databases on AWS - AWS Online Tech TalksDeep Dive on MySQL Databases on AWS - AWS Online Tech Talks
Deep Dive on MySQL Databases on AWS - AWS Online Tech TalksAmazon Web Services
 
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMRBDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMRAmazon Web Services
 
(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and Scalable
(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and Scalable(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and Scalable
(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and ScalableAmazon Web Services
 
Consolidate MySQL Shards Into Amazon Aurora Using AWS Database Migration Serv...
Consolidate MySQL Shards Into Amazon Aurora Using AWS Database Migration Serv...Consolidate MySQL Shards Into Amazon Aurora Using AWS Database Migration Serv...
Consolidate MySQL Shards Into Amazon Aurora Using AWS Database Migration Serv...Amazon Web Services
 

Was ist angesagt? (20)

AWS re:Invent 2016: Case Study: How Monsanto Uses Amazon EFS with Their Large...
AWS re:Invent 2016: Case Study: How Monsanto Uses Amazon EFS with Their Large...AWS re:Invent 2016: Case Study: How Monsanto Uses Amazon EFS with Their Large...
AWS re:Invent 2016: Case Study: How Monsanto Uses Amazon EFS with Their Large...
 
Amazon Redshift Deep Dive
Amazon Redshift Deep Dive Amazon Redshift Deep Dive
Amazon Redshift Deep Dive
 
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
 
Intro to AWS: Database Services
Intro to AWS: Database ServicesIntro to AWS: Database Services
Intro to AWS: Database Services
 
Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...
Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...
Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...
 
SRV407 Deep Dive on Amazon Aurora
SRV407 Deep Dive on Amazon AuroraSRV407 Deep Dive on Amazon Aurora
SRV407 Deep Dive on Amazon Aurora
 
AWS re:Invent 2016: How to Launch a 100K-User Corporate Back Office with Micr...
AWS re:Invent 2016: How to Launch a 100K-User Corporate Back Office with Micr...AWS re:Invent 2016: How to Launch a 100K-User Corporate Back Office with Micr...
AWS re:Invent 2016: How to Launch a 100K-User Corporate Back Office with Micr...
 
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
 
Deep Dive on Amazon Aurora
Deep Dive on Amazon AuroraDeep Dive on Amazon Aurora
Deep Dive on Amazon Aurora
 
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
 
Getting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWSGetting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWS
 
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMRBDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
 
Cost Optimization at Scale
Cost Optimization at ScaleCost Optimization at Scale
Cost Optimization at Scale
 
ENT306 Migrating large Scale Data Sets to the Cloud
ENT306 Migrating large Scale Data Sets to the CloudENT306 Migrating large Scale Data Sets to the Cloud
ENT306 Migrating large Scale Data Sets to the Cloud
 
HSBC and AWS Day - Big Data and HPC on AWS
HSBC and AWS Day - Big Data and HPC on AWSHSBC and AWS Day - Big Data and HPC on AWS
HSBC and AWS Day - Big Data and HPC on AWS
 
Deep Dive on MySQL Databases on AWS - AWS Online Tech Talks
Deep Dive on MySQL Databases on AWS - AWS Online Tech TalksDeep Dive on MySQL Databases on AWS - AWS Online Tech Talks
Deep Dive on MySQL Databases on AWS - AWS Online Tech Talks
 
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMRBDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
 
(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and Scalable
(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and Scalable(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and Scalable
(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and Scalable
 
Amazon Redshift
Amazon Redshift Amazon Redshift
Amazon Redshift
 
Consolidate MySQL Shards Into Amazon Aurora Using AWS Database Migration Serv...
Consolidate MySQL Shards Into Amazon Aurora Using AWS Database Migration Serv...Consolidate MySQL Shards Into Amazon Aurora Using AWS Database Migration Serv...
Consolidate MySQL Shards Into Amazon Aurora Using AWS Database Migration Serv...
 

Ähnlich wie Getting Started with Amazon Redshift

Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Getting started with amazon redshift - Toronto
Getting started with amazon redshift - TorontoGetting started with amazon redshift - Toronto
Getting started with amazon redshift - TorontoAmazon Web Services
 
Getting Started with Amazon Redshift
 Getting Started with Amazon Redshift Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Introdução ao data warehouse Amazon Redshift
Introdução ao data warehouse Amazon RedshiftIntrodução ao data warehouse Amazon Redshift
Introdução ao data warehouse Amazon RedshiftAmazon Web Services LATAM
 
Getting Started with Amazon Redshift - AWS July 2016 Webinar Series
Getting Started with Amazon Redshift - AWS July 2016 Webinar SeriesGetting Started with Amazon Redshift - AWS July 2016 Webinar Series
Getting Started with Amazon Redshift - AWS July 2016 Webinar SeriesAmazon Web Services
 
Getting started with Amazon Redshift
Getting started with Amazon RedshiftGetting started with Amazon Redshift
Getting started with Amazon RedshiftAmazon Web Services
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon RedshiftAmazon Web Services
 
Data Warehousing in the Era of Big Data: Intro to Amazon Redshift
Data Warehousing in the Era of Big Data: Intro to Amazon RedshiftData Warehousing in the Era of Big Data: Intro to Amazon Redshift
Data Warehousing in the Era of Big Data: Intro to Amazon RedshiftAmazon Web Services
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
AWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAmazon Web Services
 
Building Analytic Apps for SaaS: “Analytics as a Service”
Building Analytic Apps for SaaS: “Analytics as a Service”Building Analytic Apps for SaaS: “Analytics as a Service”
Building Analytic Apps for SaaS: “Analytics as a Service”Amazon Web Services
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon RedshiftUses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon RedshiftAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
2017 AWS DB Day | Amazon Redshift 자세히 살펴보기
2017 AWS DB Day | Amazon Redshift 자세히 살펴보기2017 AWS DB Day | Amazon Redshift 자세히 살펴보기
2017 AWS DB Day | Amazon Redshift 자세히 살펴보기Amazon Web Services Korea
 
AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features Amazon Web Services
 
BenefĂ­cios e melhores prĂĄticas no uso do Amazon Redshift
BenefĂ­cios e melhores prĂĄticas no uso do Amazon RedshiftBenefĂ­cios e melhores prĂĄticas no uso do Amazon Redshift
BenefĂ­cios e melhores prĂĄticas no uso do Amazon RedshiftAmazon Web Services LATAM
 

Ähnlich wie Getting Started with Amazon Redshift (20)

Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Getting started with amazon redshift - Toronto
Getting started with amazon redshift - TorontoGetting started with amazon redshift - Toronto
Getting started with amazon redshift - Toronto
 
Getting Started with Amazon Redshift
 Getting Started with Amazon Redshift Getting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Introdução ao data warehouse Amazon Redshift
Introdução ao data warehouse Amazon RedshiftIntrodução ao data warehouse Amazon Redshift
Introdução ao data warehouse Amazon Redshift
 
Getting Started with Amazon Redshift - AWS July 2016 Webinar Series
Getting Started with Amazon Redshift - AWS July 2016 Webinar SeriesGetting Started with Amazon Redshift - AWS July 2016 Webinar Series
Getting Started with Amazon Redshift - AWS July 2016 Webinar Series
 
Getting started with Amazon Redshift
Getting started with Amazon RedshiftGetting started with Amazon Redshift
Getting started with Amazon Redshift
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift
 
Data Warehousing in the Era of Big Data: Intro to Amazon Redshift
Data Warehousing in the Era of Big Data: Intro to Amazon RedshiftData Warehousing in the Era of Big Data: Intro to Amazon Redshift
Data Warehousing in the Era of Big Data: Intro to Amazon Redshift
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data Warehouse
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
AWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon Redshift
 
Building Analytic Apps for SaaS: “Analytics as a Service”
Building Analytic Apps for SaaS: “Analytics as a Service”Building Analytic Apps for SaaS: “Analytics as a Service”
Building Analytic Apps for SaaS: “Analytics as a Service”
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon RedshiftUses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
2017 AWS DB Day | Amazon Redshift 자세히 살펴보기
2017 AWS DB Day | Amazon Redshift 자세히 살펴보기2017 AWS DB Day | Amazon Redshift 자세히 살펴보기
2017 AWS DB Day | Amazon Redshift 자세히 살펴보기
 
AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features
 
BenefĂ­cios e melhores prĂĄticas no uso do Amazon Redshift
BenefĂ­cios e melhores prĂĄticas no uso do Amazon RedshiftBenefĂ­cios e melhores prĂĄticas no uso do Amazon Redshift
BenefĂ­cios e melhores prĂĄticas no uso do Amazon Redshift
 

Mehr von Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalitĂ  Server...
Big Data per le Startup: come creare applicazioni Big Data in modalitĂ  Server...Big Data per le Startup: come creare applicazioni Big Data in modalitĂ  Server...
Big Data per le Startup: come creare applicazioni Big Data in modalitĂ  Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mehr von Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalitĂ  Server...
Big Data per le Startup: come creare applicazioni Big Data in modalitĂ  Server...Big Data per le Startup: come creare applicazioni Big Data in modalitĂ  Server...
Big Data per le Startup: come creare applicazioni Big Data in modalitĂ  Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

KĂźrzlich hochgeladen

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 

KĂźrzlich hochgeladen (20)

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

Getting Started with Amazon Redshift

  • 1. Š 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Pavan Pothukuchi, Principal Product Manager, AWS Chris Liu, Data Infrastructure Engineer, Coursera July 13, 2016 Getting Started with Amazon Redshift
  • 2. Agenda • Introduction • Benefits • Use cases • Redshift @ Coursera • Q&A
  • 3. AnalyzeStore Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora AWS big data portfolio AWS Data Pipeline Amazon CloudSearch Amazon EMR Amazon EC2 Amazon Redshift Amazon Machine Learning Amazon Elasticsearch Service AWS Database Migration Service Amazon QuickSight Amazon Kinesis Firehose AWS Import/Export Snowball AWS Direct Connect Collect Amazon Kinesis Streams
  • 4. Relational data warehouse Massively parallel; petabyte scale Fully managed HDD and SSD platforms $1,000/TB/year; starts at $0.25/hour Amazon Redshift a lot faster a lot simpler a lot cheaper
  • 5. The Amazon Redshift view of data warehousing 10x cheaper Easy to provision Higher DBA productivity 10x faster No programming Easily leverage BI tools, Hadoop, machine learning, streaming Analysis inline with process flows Pay as you go, grow as you need Managed availability and disaster recovery Enterprise Big data SaaS
  • 6. The Forrester Wave™ is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave™ are trademarks of Forrester Research, Inc. The Forrester Wave™ is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change. Forrester Wave™ enterprise data warehouse Q4 ’15
  • 8. Amazon Redshift architecture Leader node Simple SQL endpoint Stores metadata Optimizes query plan Coordinates query execution Compute nodes Local columnar storage Parallel/distributed execution of all queries, loads, backups, restores, resizes Start at just $0.25/hour, grow to 2 PB (compressed) DC1: SSD; scale from 160 GB to 326 TB DS2: HDD; scale from 2 TB to 2 PB Ingestion/backup Backup Restore JDBC/ODBC 10 GigE (HPC)
  • 9. Benefit #1: Amazon Redshift is fast Dramatically less I/O Column storage Data compression Zone maps Direct-attached storage Large data block sizes analyze compression listing; Table | Column | Encoding ---------+----------------+---------- listing | listid | delta listing | sellerid | delta32k listing | eventid | delta32k listing | dateid | bytedict listing | numtickets | bytedict listing | priceperticket | delta32k listing | totalprice | mostly32 listing | listtime | raw 10 | 13 | 14 | 26 |… … | 100 | 245 | 324 375 | 393 | 417… … 512 | 549 | 623 637 | 712 | 809 … … | 834 | 921 | 959 10 324 375 623 637 959
  • 10. Benefit #1: Amazon Redshift is fast Parallel and distributed Query Load Export Backup Restore Resize
  • 11. Benefit #1: Amazon Redshift is fast Hardware optimized for I/O intensive workloads, 4 GB/sec/node Enhanced networking, over 1 million packets/sec/node Choice of storage type, instance size Regular cadence of autopatched improvements
  • 12. Benefit #1: Amazon Redshift is fast New dense storage (HDD) instance type (Jun 2015) Improved memory 2x, compute 2x, disk throughput 1.5x Cost: Same as our prior generation! Performance improvement: 50% Enhanced I/O and commit improvements (Jan 2016) Performance improvement: 35% Memory allocation improvements (May 2016) Performance improvement: 60%
  • 13. Benefit #2: Amazon Redshift is inexpensive Ds2 (HDD) Price per hour for DW1.XL single node Effective annual price per TB compressed On demand $ 0.850 $ 3,725 1-year reservation $ 0.500 $ 2,190 3-year reservation $ 0.228 $ 999 Dc1 (SSD) Price per hour for DW2.L single node Effective annual price per TB compressed On demand $ 0.250 $ 13,690 1-year reservation $ 0.161 $ 8,795 3-year reservation $ 0.100 $ 5,500 Pricing is simple Number of nodes x price/hour No charge for leader node No upfront costs Pay as you go
  • 14. Benefit #3: Amazon Redshift is fully managed Continuous/incremental backups Multiple copies within cluster Continuous and incremental backups to Amazon S3 Continuous and incremental backups across regions Streaming restore Amazon S3 Amazon S3 Region 1 Region 2
  • 15. Benefit #3: Amazon Redshift is fully managed Amazon S3 Amazon S3 Region 1 Region 2 Fault tolerance Disk failures Node failures Network failures Availability Zone/region level disasters
  • 16. Benefit #4: Security is built in • Load encrypted from Amazon S3 • SSL to secure data in transit • ECDHE perfect forward security • Amazon VPC for network isolation • Encryption to secure data at rest • All blocks on disks and in Amazon S3 encrypted • Block key, cluster key, master key (AES-256) • On-premises HSM and AWS CloudHSM support • Audit logging and AWS CloudTrail integration • SOC 1/2/3, PCI-DSS, FedRAMP, BAA 10 GigE (HPC) Ingestion Backup Restore Customer VPC Internal VPC JDBC/ODBC
  • 17. Benefit #5: We innovate quickly Well over 125 new features added since launch Release every two weeks Automatic patching Service Launch (2/14) PDX (4/2) Temp Credentials (4/11) DUB (4/25) SOC1/2/3 (5/8) Unload Encrypted Files NRT (6/5) JDBC Fetch Size (6/27) Unload logs (7/5) SHA1 Builtin (7/15) 4 byte UTF-8 (7/18) Sharing snapshots (7/18) Statement Timeout (7/22) Timezone, Epoch, Autoformat (7/25) WLM Timeout/Wildcards (8/1) CRC32 Builtin, CSV, Restore Progress (8/9) Resource Level IAM (8/9) PCI (8/22) UTF-8 Substitution (8/29) JSON, Regex, Cursors (9/10) Split_part, Audit tables (10/3) SIN/SYD (10/8) HSM Support (11/11) Kinesis EMR/HDFS/SSH copy, Distributed Tables, Audit Logging/CloudTrail, Concurrency, Resize Perf., Approximate Count Distinct, SNS Alerts, Cross Region Backup (11/13) Distributed Tables, Single Node Cursor Support, Maximum Connections to 500 (12/13) EIP Support for VPC Clusters (12/28) New query monitoring system tables and diststyle all (1/13) Redshift on DW2 (SSD) Nodes (1/23) Compression for COPY from SSH, Fetch size support for single node clusters, new system tables with commit stats, row_number(), strotol() and query termination (2/13) Resize progress indicator & Cluster Version (3/21) Regex_Substr, COPY from JSON (3/25) 50 slots, COPY from EMR, ECDHE ciphers (4/22) 3 new regex features, Unload to single file, FedRAMP(5/6) Rename Cluster (6/2) Copy from multiple regions, percentile_cont, percentile_disc (6/30) Free Trial (7/1) pg_last_unload_count (9/15) AES-128 S3 encryption (9/29) UTF-16 support (9/29)
  • 18. Benefit #6: Amazon Redshift is powerful • Approximate functions • User-defined functions • Machine learning • Data science
  • 19. Benefit #7: Amazon Redshift has a large ecosystem Data integration Systems integratorsBusiness intelligence
  • 20. Benefit #8: Service-oriented architecture Amazon DynamoDB Amazon EMR Amazon S3 Amazon EC2/SSH Amazon RDS/ Amazon Aurora Amazon Redshift Amazon Kinesis Amazon ML AWS Data Pipeline Amazon CloudSearch Amazon Mobile Analytics
  • 21. Performance Ease of use Security Analytics and functionality SOA Recent launches Dynamic WLM parameters Queue hopping for timed-out queries Merge rows from staging to prod. table 2x improvement in query throughput 10x latency improvement for UNION ALL queries Bzip2 format for ingestion Table level restore 10x improvement in vacuum perf. Default access privileges Tag-based AWS IAM access IAM roles for COPY/UNLOAD SAS connector enhancements, Implicit conversion of SAS queries to Amazon Redshift DMS support from OLTP sources Enhanced data ingestion from Kinesis Firehose Improved data schema conversion to Amazon ML
  • 23. NTT Docomo: Japan’s largest mobile service provider 68 million customers Tens of TBs per day of data across a mobile network 6 PB of total data (uncompressed) Data science for marketing operations, logistics, and so on Greenplum on premises Scaling challenges Performance issues Need same level of security Need for a hybrid environment
  • 24. NTT Docomo: Japan’s largest mobile service provider 125 node DS2.8XL cluster 4,500 vCPUs, 30 TB RAM 2 PB compressed 10x faster analytic queries 50% reduction in time for new BI application deployment Significantly less operations overhead Data Source ET AWS Direct Connect Client Forwarder LoaderState management SandboxAmazon Redshift S3
  • 25. Nasdaq: powering 100 marketplaces in 50 countries Orders, quotes, trade executions, market “tick” data from 7 exchanges 7 billion rows/day Analyze market share, client activity, surveillance, billing, and so on Microsoft SQL Server on premises Expensive legacy DW ($1.16 M/yr.) Limited capacity (1 yr. of data online) Needed lower TCO Must satisfy multiple security and regulatory requirements Similar performance
  • 26. 23 node DS2.8XL cluster 828 vCPUs, 5 TB RAM 368 TB compressed 2.7 T rows, 900 B derived 8 tables with 100 B rows 7 man-month migration Âź the cost, 2x storage, room to grow Faster performance, very secure Nasdaq: powering 100 marketplaces in 50 countries
  • 27. Amazon Redshift @ Chris Liu – Data Infrastructure Engineer
  • 28.
  • 29. Education at scale 140 partners 3.7 million course completions 20 million learners worldwide 1900 courses
  • 30. Takeaways • Simplicity • Scalability • Flexibility • Extensibility
  • 31. Outline • Moving from no data warehouse to the Amazon Redshift ecosystem • No warehouse: m2.2xlarge read replica – 4 CPUs, 32 GB RAM on Amazon RDS • First Amazon Redshift cluster: 1 ds1.xl node – 2 CPU, 16 GB RAM
  • 32. Outline • Moving from no data warehouse to the Amazon Redshift ecosystem • No warehouse: m2.2xlarge read replica – 4 CPUs, 32 GB RAM on Amazon RDS • First Amazon Redshift cluster: 1 ds1.xl node – 2 CPU, 16 GB RAM • The Amazon Redshift ecosystem at Coursera • Current day: 9 dc1.8xl nodes – 288 CPUs, 2.4 TB RAM
  • 33. Outline • Moving from no data warehouse to the Amazon Redshift ecosystem • No warehouse: m2.2xlarge read replica – 4 CPUs, 32 GB RAM on Amazon RDS • First Amazon Redshift cluster: 1 ds1.xl node – 2 CPU, 16 GB RAM • The Amazon Redshift ecosystem at Coursera • Current day: 9 dc1.8xl nodes – 288 CPUs, 2.4 TB RAM • Learnings from 3 years on Amazon Redshift • Lessons in communication, surprises, reflections
  • 34. Starting point • Querying production read replica • Makeshift libraries providing thin abstraction layer • 45 minutes to provide aggregate metrics over all classes running on Coursera =(
  • 35. Starting point • Querying production read replica • Makeshift libraries providing thin abstraction layer • 45 minutes to provide aggregate metrics over all classes running on Coursera =(
  • 36. Move in progress • Risk-free deployment • "Let's try it out" • Few clicks to deploy cluster, connect to cluster, resize
  • 37. Move in progress • Risk-free deployment • "Let's try it out" • Few clicks to deploy cluster, connect to cluster, resize • AWS ecosystem integration • COPY from S3/EMR/SSH • Unload to S3 • UNLOAD(COPY(data)) == COPY(UNLOAD(data)) == data
  • 38. Move in progress • Risk-free deployment • "Let's try it out" • Few clicks to deploy cluster, connect to cluster, resize • AWS ecosystem integration • COPY from S3/EMR/SSH • Unload to S3 • UNLOAD(COPY(data)) == COPY(UNLOAD(data)) == data • Minimal administration • In aggregate, less than 1 full-time employee for administration • Automation and tooling for monitoring usage and performance
  • 40. Amazon Redshift ecosystem at Coursera • Data flow in and out of Amazon Redshift • Business insights and reporting • Deriving value from data • Democratizing data access
  • 41. Amazon Redshift ecosystem at Coursera • Data flow in and out of Amazon Redshift • Business insights and reporting • Deriving value from data • Democratizing data access
  • 42. Amazon Redshift Amazon RDS Amazon EMR Amazon S3 Event feeds Amazon EC2 Amazon RDS Amazon S3 BI applications Internal applications Third-party tools Cassandra Cassandra AWS Data Pipeline Data flow Amazon S3
  • 43. Amazon Redshift ecosystem at Coursera • Data flow in and out of Amazon Redshift • Business insights and reporting • Data products • Democratizing data access
  • 44. • Provide directional insights and aggregate metrics • Aggregate metrics over all classes on Coursera: < 5 seconds • Goal: Insight at the speed of thought • Results1: 0.8s median, 28s p95, 120s p99 • Companywide goal tracking • Scheduled reports to internal and external stakeholders • Crucial part of data informed culture 1 Results for ad hoc queries ran in the last 90 days Business insights and reporting
  • 45. Amazon Redshift ecosystem at Coursera • Data flow in and out of Amazon Redshift • Business insights and reporting • Data products • Democratizing data access
  • 46. • AB experimentation • 300 M impression table joined with 1.8 B events table in 12 minutes. • Recommendations model • Amazon Redshift for relational transformation • Unload to S3 for model training • Providing university partners with analytical dashboard and research exports Data Products
  • 47. Amazon Redshift ecosystem at Coursera • Data flow in and out of Amazon Redshift • Business insights and reporting • Data products • Democratizing data access
  • 50. Learnings from 3 years on Amazon Redshift • Thinking in Amazon Redshift • Communicate to users • Surprises • Reflections
  • 51. Thinking in Amazon Redshift • Columnar • SELECT * considered harmful in most cases
  • 52. Thinking in Amazon Redshift • Columnar • SELECT * considered harmful in most cases • Nodes, slices, blocks • 1 MB blocks per slice, n slices per node depending on node type
  • 53. Thinking in Amazon Redshift • Columnar • SELECT * considered harmful in most cases • Nodes, slices, blocks • 1 MB blocks per slice, n slices per node depending on node type • Sorting and distribution • Share nothing massively parallel processing => data is sorted per slice • Up to 2 orders of magnitude increase in JOIN/GROUP BY for merge join vs hash join
  • 54. Communicating to users • Prefer the scientific method over gut feel • Investigate how many rows were materialized with svl_query_report • Understand EXPLAIN plan for data distribution, join strategy, predicate order
  • 55. Communicating to users • Prefer the scientific method over gut feel • Investigate how many rows were materialized with svl_query_report • Understand EXPLAIN plan for data distribution, join strategy, predicate order • SQL style guide for readability • Leading commas, capitalized SQL keywords, conventions for handling dates/timestamps, conventions for table names, mapping tables
  • 56. Communicating to users • Prefer the scientific method over gut feel • Investigate how many rows were materialized with svl_query_report • Understand EXPLAIN plan for data distribution, join strategy, predicate order • SQL style guide for readability • Leading commas, capitalized SQL keywords, conventions for handling dates/timestamps, conventions for table names, mapping tables • Use the right tool for the right task • Amazon Redshift is not for online traffic serving • Amazon Redshift is not for stream processing
  • 57. Surprises • "Fundamental theorem of Redshift at Coursera" • Most queries involve full table scans • 9 nodes x 32 slices/node x 1 block/slice x 1 MB/block => at least 288 MB allocated per column • Store ~75 M integer values and maintain 1 block per slice
  • 58. Surprises • "Fundamental theorem of Redshift at Coursera" • Most queries involve full table scans • 9 nodes x 32 slices/node x 1 block/slice x 1 MB/block => at least 288 MB allocated per column • Store ~75 M integer values and maintain 1 block per slice • Features may behave in unexpected fashions • Sort key compression • Primary and foreign keys
  • 59. Surprises • "Fundamental theorem of Redshift at Coursera" • Most queries involve full table scans • 9 nodes x 32 slices/node x 1 block/slice x 1 MB/block => at least 288 MB allocated per column • Store ~75 M integer values and maintain 1 block per slice • Features may behave in unexpected fashions • Sort key compression • Primary and foreign keys • Features may be unexpectedly expensive • COMMIT – Batch work, monitor with stl_commit_stats • VACUUM – Prefer TRUNCATE, monitor with stl_vacuum, stl_query • Your mileage may vary
  • 60. Reflections • Simplicity – Relational model, Postgres 8.0 compliant SQL, things just work. Minimal administration. Minimal tuning.
  • 61. Reflections • Simplicity – Relational model, Postgres 8.0 compliant SQL, things just work. Minimal administration. Minimal tuning. • Scalability – Scaled cluster up 5 times in the last 3 years as data volume and usage increased.
  • 62. Reflections • Simplicity – Relational model, Postgres 8.0 compliant SQL, things just work. Minimal administration. Minimal tuning. • Scalability – Scaled cluster up 5 times in the last 3 years as data volume and usage increased. • Flexibility – No strict requirement on data modeling; dusty knobs for tuning in majority of cases. Handles both heavily normalized data model and denormalized clickstream data.
  • 63. Reflections • Simplicity – Relational model, Postgres 8.0 compliant SQL, things just work. Minimal administration. Minimal tuning. • Scalability – Scaled cluster up 5 times in the last 3 years as data volume and usage increased. • Flexibility – No strict requirement on data modeling; dusty knobs for tuning in majority of cases. Handles both heavily normalized data model and denormalized clickstream data. • Extensibility – Standard API (JDBC/ODBC/libpq) and integration points.
  • 64. Resources Pavan Pothukuchi | pavanpo@amazon.com | Chris Liu | cliu@coursera.org | Detail pages • http://aws.amazon.com/redshift • https://aws.amazon.com/marketplace/redshift/ Best practices • http://docs.aws.amazon.com/redshift/latest/dg/c_loading-data-best-practices.html • http://docs.aws.amazon.com/redshift/latest/dg/c_designing-tables-best-practices.html • http://docs.aws.amazon.com/redshift/latest/dg/c-optimizing-query-performance.html Related breakout sessions • Deep Dive on Amazon QuickSight (2:15–3:15 pm) • Getting Started with Amazon QuickSight (2:15–3:15 pm) • Database Migration: Simple, Cross-Engine and Cross-Platform Migrations with Minimal Downtime (4:45–5:45 pm)