SlideShare ist ein Scribd-Unternehmen logo
1 von 43
Downloaden Sie, um offline zu lesen
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Alex Sinner, Solutions Architecture PMO – Amazon Web Services
Luuk Linssen, Product Manager - Bannerconnect
May 24, 2016
Amazon Redshift Deep Dive
Fast, fully managed, petabyte-scale data warehousing for less than $1,000/TB/year
Amazon Redshift
Amazon Redshift delivers performance
“[Amazon] Redshift is twenty times faster than Hive.” (5x–20x reduction in query times) link
“Queries that used to take hours came back in seconds. Our analysts are orders of magnitude
more productive.” (20x–40x reduction in query times) link
“…[Amazon Redshift] performance has blown away everyone here (we generally see 50–
100x speedup over Hive).” link
“Team played with [Amazon] Redshift today and concluded it is ****** awesome. Un-indexed
complex queries returning in < 10s.”
“Did I mention it's ridiculously fast? We'll be using it immediately to provide our analysts an
alternative to Hadoop.”
“We saw… 2x improvement in query times.”
Channel “We regularly process multibillion row datasets and we do that in a matter of hours.” link
Amazon Redshift system architecture
10 GigE
(HPC)
Ingestion
Backup
Restore
SQL Clients/BI Tools
128GB RAM
16TB disk
16 cores
Amazon S3/Amazon EMR/Amazon DynamoDB/SSH
JDBC/ODBC
128GB RAM
16TB disk
16 cores
Compute Node
128GB RAM
16TB disk
16 cores
Compute Node
128GB RAM
16TB disk
16 cores
Compute Node
Leader
Node
A deeper look at compute node architecture
Leader Node
Dense compute nodes
Large
•  2 slices/cores
•  15 GB RAM
•  160 GB SSD
8XL
•  32 slices/cores
•  244 GB RAM
•  2.56 TB SSD
Dense storage nodes
X-large
•  2 slices/4 cores
•  31 GB RAM
•  2 TB HDD
8XL
•  16 slices/ 36 cores
•  244 GB RAM
•  16 TB HDD
Data Ingestion
Summary of Best Practices
Table Design
ü  Choose the best Sort Key
ü  Choose the best Distribution Key
ü  Compression Encodings – use automatic compression
Loading Data
ü  Use COPY command (not INSERT)
ü  Load multiple, compressed files in a single COPY
command, in Sort Key order
Use multiple input files to maximize throughput
COPY command
Each slice loads one file at a time
A single input file means
only one slice is ingesting data
Instead of full bandwidth, you
only get 1/16th
Use multiple input files to maximize throughput
COPY command
Use at least as many input files
as you have slices
With 16 input files, all slices are
working so you maximize
throughput from S3
Scale linearly as you add nodes
Primary keys and manifest files
Amazon Redshift doesn’t enforce primary key constraints
•  If you load data multiple times, Amazon Redshift won’t complain
•  If you declare primary keys, the optimizer will expect the data to
be unique
Use manifest files to control exactly what is loaded and
how to respond if input files are missing
•  Define a JSON manifest on Amazon S3
•  Ensures that the cluster loads exactly what you want
Data hygiene
Analyze tables regularly
•  Every single load for popular columns
•  Weekly for all columns
•  Look SVV_TABLE_INFO(stats_off) for stale stats
•  Look stl_alert_event_log for missing stats
Vacuum tables regularly
•  Weekly is a good target
•  Number of unsorted blocks as trigger
•  Look SVV_TABLE_INFO(unsorted, empty)
•  Deep copy might be faster for high percent unsorted
(20% unsorted usually is faster to deep copy)
Automatic compression is a good thing (mostly)
Better performance, lower costs
Samples data automatically when COPY into an empty table
•  Samples up to 100,000 rows and picks optimal encoding
Regular ETL process using temp or staging tables:
Turn off automatic compression
•  Use analyze compression to determine the right encodings
•  Bake those encodings into your DML
Be careful when compressing your sort keys
Zone maps store min/max per block
After we know which block(s) contain the range,
we know which row offsets to scan
Highly compressed sort keys means many rows
per block
You’ll scan more data blocks than you need
If your sort keys compress significantly more
than your data columns, you might want to skip
compression of sortkey column(s)
Check SVV_TABLE_INFO(skew_sortkey1)
COL1 COL2
Keep your columns as narrow as possible
•  Buffers allocated based on declared
column width
•  Wider than needed columns mean
memory is wasted
•  Fewer rows fit into memory; increased
likelihood of queries spilling to disk
•  Check
SVV_TABLE_INFO(max_varchar)
Recent features
IAM Role support for COPY and UNLOAD
Use IAM roles to securely give access permission to COPY
or UNLOAD data from/to S3 / DynamoDB
You can associate up to 10 IAM roles with a cluster
Use IAM role ARN identifier in COPY/UNLOAD command
Restrict roles to specific Redshift users
New SQL functions
We add SQL functions regularly to expand Amazon Redshift’s query capabilities
Added 25+ window and aggregate functions since launch, including:
•  LISTAGG
•  [APPROXIMATE] COUNT
•  DROP IF EXISTS, CREATE IF NOT EXISTS
•  REGEXP_SUBSTR, _COUNT, _INSTR, _REPLACE
•  PERCENTILE_CONT, _DISC, MEDIAN
•  PERCENT_RANK, RATIO_TO_REPORT
We’ll continue iterating but also want to enable you to write your own
Scalar user-defined functions (UDFs)
You can write UDFs using Python 2.7
•  Syntax is largely identical to PostgreSQL UDF syntax
•  System and network calls within UDFs are prohibited
Comes with Pandas, NumPy, and SciPy pre-installed
•  You’ll also be able import your own libraries for even more
flexibility
Scalar UDF example
CREATE FUNCTION f_hostname (VARCHAR url)
  RETURNS varchar
IMMUTABLE AS $$
  import urlparse
  return urlparse.urlparse(url).hostname
$$ LANGUAGE plpythonu;
Scalar UDF examples from partners
http://www.looker.com/blog/amazon-redshift-user-
defined-functions
https://www.periscope.io/blog/redshift-user-defined-
functions-python.html
1-click deployment to launch, on
multiple regions around the world
Pay-as-you-go pricing with no long
term contracts required
Advanced Analytics Business IntelligenceData Integration
Interleaved sort keys
Compound sort keys
Records in Amazon
Redshift are stored in
blocks
For this illustration, let’s
assume that four records
fill a block
Records with a given
cust_id are all in one
block
However, records with a
given prod_id are spread
across four blocks
1
1
1
1
2
3
4
1
4
4
4
2
3
4
4
1
3
3
3
2
3
4
3
1
2
2
2
2
3
4
2
1
1  [1,1] [1,2] [1,3] [1,4]
2  [2,1] [2,2] [2,3] [2,4]
3  [3,1] [3,2] [3,3] [3,4]
4  [4,1] [4,2] [4,3] [4,4]
1 2 3 4
prod_id
cust_id
cust_id prod_id other columns blocks
Select sum(amt)
From big_tab
Where cust_id = (1234);
Select sum(amt)
From big_tab
Where prod_id = (5678);
1  [1,1] [1,2] [1,3] [1,4]
2  [2,1] [2,2] [2,3] [2,4]
3  [3,1] [3,2] [3,3] [3,4]
4  [4,1] [4,2] [4,3] [4,4]
1 2 3 4
prod_id
cust_id
Interleaved sort keys
Records with a given
cust_id are spread
across two blocks
Records with a given
prod_id are also spread
across two blocks
Data is sorted in equal
measures for both keys
1
1
2
2
2
1
2
3
3
4
4
4
3
4
3
1
3
4
4
2
1
2
3
3
1
2
2
4
3
4
1
1
cust_id prod_id other columns blocks
Usage
New keyword INTERLEAVED when defining sort keys
•  Existing syntax will still work and behavior is unchanged
•  You can choose up to 8 columns to include and can query with any
or all of them
No change needed to queries
[[ COMPOUND | INTERLEAVED ] SORTKEY ( column_name [, ...] ) ]
Migrating existing workloads
Typical ETL/ELT on legacy data warehouse
  One file per table, maybe a few if too big
  Many updates (“massage” the data)
  Every job clears the data, then load
  Count on primary key to block double loads
  High concurrency of load jobs
Two questions to ask
Why do you do what you do?
•  Many times, users don’t even know
What is the customer need?
•  Many times, needs do not match current practice
•  You might benefit from adding other AWS services
Open-source tools
https://github.com/awslabs/amazon-redshift-utils
Admin scripts
•  Collection of utilities for running diagnostics on your cluster.
Admin views
•  Collection of utilities for managing your cluster, generating schema DDL, and so on
Column encoding utility
•  Gives you the ability to apply optimal column encoding to an established schema with data already loaded
Analyze and vacuum utility
•  Gives you the ability to automate VACUUM and ANALYZE operations
Unload and copy utility
•  Helps you to migrate data between Amazon Redshift clusters or databases
all things programmatic
we are
passionate &
resourceful
people in
pursuit of
mastery
we are
Bannerconnect
90 employees
founded in 2004
Amsterdam & Sittard
old
situation
ü  700GB data per day
ü  servicing 100 accounts
today
ü  1.9TB data per day
ü  servicing 250 accounts
challenges
industry becomes more advanced
processing more data than ever!
from 700GB to 1.9TB on a daily basis
need of near real time data processing
not want to wait hours for new insights
why
amazon
upscale your cluster within a couple of clicks
on-premises hardware vs. virtual server
no upfront costs
our tech
setup
data
collection
data
storage Amazon
Redshift
processing
connecting
to external
platforms
programmatic
viewability
adserver
audiences
input
aggregated reports
audiences
data exports
output
client
cluster
(HDD)
big data
cluster
(HDD)
reporting
data
(SSD)
outcome
increase in speed
10 GB report went from minutes to seconds
increase in detail
all reports are now created on a hourly instead of daily basis
completely new audience insights
45GB per report per account
next
steps
expanding
current stack
Kinesis for (near) real
time data processing
lessons
learned
use the AWS best practices!
ü  load compressed files instead of
uncompressed files
ü  make sure load data is being split into
multiples files, where the number of files
is a multiple of the number of slices in the
cluster
ü  make sure to choose the best distribution
and sort key for your tables
our
contacts
contact us
+31 (0)46 707 4992
info@bannerconnect.net
online
www.bannerconnect.net
/bannerconnect
@bannerconnect
/company/bannerconnect
@wearebannerconnect
address
Poststraat 12
6135 KR, Sittard
The Netherlands
-
Karperstraat 10
1075 KZ, Amsterdam
The Netherlands
Thank you!

Weitere ähnliche Inhalte

Was ist angesagt?

AWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentationAWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentationVolodymyr Rovetskiy
 
Data Catalog & ETL - Glue & Athena
Data Catalog & ETL - Glue & AthenaData Catalog & ETL - Glue & Athena
Data Catalog & ETL - Glue & AthenaAmazon Web Services
 
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...DataStax
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudNoritaka Sekiyama
 
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...Amazon Web Services
 
Amazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and OptimizationAmazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and OptimizationAmazon Web Services
 
Introduction to NoSQL Databases
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL DatabasesDerek Stainer
 
iceberg introduction.pptx
iceberg introduction.pptxiceberg introduction.pptx
iceberg introduction.pptxDori Waldman
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon RedshiftKel Graham
 
Redshift performance tuning
Redshift performance tuningRedshift performance tuning
Redshift performance tuningCarlos del Cacho
 
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
(BDT401) Amazon Redshift Deep Dive: Tuning and Best PracticesAmazon Web Services
 
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...Amazon Web Services Korea
 
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Amazon Web Services
 

Was ist angesagt? (20)

Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
AWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentationAWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentation
 
Data Catalog & ETL - Glue & Athena
Data Catalog & ETL - Glue & AthenaData Catalog & ETL - Glue & Athena
Data Catalog & ETL - Glue & Athena
 
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
 
Introduction to sql
Introduction to sqlIntroduction to sql
Introduction to sql
 
PostgreSQL
PostgreSQL PostgreSQL
PostgreSQL
 
Amazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and OptimizationAmazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and Optimization
 
Introduction to NoSQL Databases
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL Databases
 
iceberg introduction.pptx
iceberg introduction.pptxiceberg introduction.pptx
iceberg introduction.pptx
 
The delta architecture
The delta architectureThe delta architecture
The delta architecture
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon Redshift
 
ElastiCache & Redis
ElastiCache & RedisElastiCache & Redis
ElastiCache & Redis
 
Redshift performance tuning
Redshift performance tuningRedshift performance tuning
Redshift performance tuning
 
Introduction to AWS Glue
Introduction to AWS Glue Introduction to AWS Glue
Introduction to AWS Glue
 
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
 
Introduction to Amazon DynamoDB
Introduction to Amazon DynamoDBIntroduction to Amazon DynamoDB
Introduction to Amazon DynamoDB
 
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
 
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
 

Andere mochten auch

Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift Amazon Web Services
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftAmazon Web Services
 
Getting started with Data Warehousing and BI
Getting started with Data Warehousing and BIGetting started with Data Warehousing and BI
Getting started with Data Warehousing and BIEdureka!
 
AWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing PerformanceAWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing PerformanceAmazon Web Services
 
The AWS Shared Responsibility Model: Presented by Amazon Web Services
The AWS Shared Responsibility Model: Presented by Amazon Web ServicesThe AWS Shared Responsibility Model: Presented by Amazon Web Services
The AWS Shared Responsibility Model: Presented by Amazon Web ServicesAlert Logic
 
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesDeep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesAmazon Web Services
 
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...Amazon Web Services
 
Women in Technology: Supporting Diversity in a Technical Workplace
Women in Technology: Supporting Diversity in a Technical WorkplaceWomen in Technology: Supporting Diversity in a Technical Workplace
Women in Technology: Supporting Diversity in a Technical WorkplaceAmazon Web Services
 
AWS IoT - Introduction - Pop-up Loft
AWS IoT - Introduction - Pop-up LoftAWS IoT - Introduction - Pop-up Loft
AWS IoT - Introduction - Pop-up LoftAmazon Web Services
 
AWSome Day Intro - Stockholm 20160308
AWSome Day Intro - Stockholm 20160308AWSome Day Intro - Stockholm 20160308
AWSome Day Intro - Stockholm 20160308Amazon Web Services
 
DevOps en Amazon: Un vistazo a nuestras herramientas y procesos
DevOps en Amazon: Un vistazo a nuestras herramientas y procesosDevOps en Amazon: Un vistazo a nuestras herramientas y procesos
DevOps en Amazon: Un vistazo a nuestras herramientas y procesosAmazon Web Services
 
Next Generation Open Data Platforms | AWS Public Sector Summit 2016
Next Generation Open Data Platforms | AWS Public Sector Summit 2016Next Generation Open Data Platforms | AWS Public Sector Summit 2016
Next Generation Open Data Platforms | AWS Public Sector Summit 2016Amazon Web Services
 

Andere mochten auch (20)

Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
Getting started with Data Warehousing and BI
Getting started with Data Warehousing and BIGetting started with Data Warehousing and BI
Getting started with Data Warehousing and BI
 
AWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing PerformanceAWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing Performance
 
The AWS Shared Responsibility Model: Presented by Amazon Web Services
The AWS Shared Responsibility Model: Presented by Amazon Web ServicesThe AWS Shared Responsibility Model: Presented by Amazon Web Services
The AWS Shared Responsibility Model: Presented by Amazon Web Services
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Amazon Redshift Masterclass
Amazon Redshift MasterclassAmazon Redshift Masterclass
Amazon Redshift Masterclass
 
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesDeep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
 
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
 
Women in Technology: Supporting Diversity in a Technical Workplace
Women in Technology: Supporting Diversity in a Technical WorkplaceWomen in Technology: Supporting Diversity in a Technical Workplace
Women in Technology: Supporting Diversity in a Technical Workplace
 
AWS IoT - Introduction - Pop-up Loft
AWS IoT - Introduction - Pop-up LoftAWS IoT - Introduction - Pop-up Loft
AWS IoT - Introduction - Pop-up Loft
 
Sundog Media Toolkit
Sundog Media Toolkit Sundog Media Toolkit
Sundog Media Toolkit
 
AWSome Day Intro - Stockholm 20160308
AWSome Day Intro - Stockholm 20160308AWSome Day Intro - Stockholm 20160308
AWSome Day Intro - Stockholm 20160308
 
Simplestream
SimplestreamSimplestream
Simplestream
 
AWS Mobile Hub
AWS Mobile HubAWS Mobile Hub
AWS Mobile Hub
 
Movidiam
MovidiamMovidiam
Movidiam
 
Ingest and storage options
Ingest and storage optionsIngest and storage options
Ingest and storage options
 
DevOps en Amazon: Un vistazo a nuestras herramientas y procesos
DevOps en Amazon: Un vistazo a nuestras herramientas y procesosDevOps en Amazon: Un vistazo a nuestras herramientas y procesos
DevOps en Amazon: Un vistazo a nuestras herramientas y procesos
 
Next Generation Open Data Platforms | AWS Public Sector Summit 2016
Next Generation Open Data Platforms | AWS Public Sector Summit 2016Next Generation Open Data Platforms | AWS Public Sector Summit 2016
Next Generation Open Data Platforms | AWS Public Sector Summit 2016
 

Ähnlich wie Amazon Redshift Deep Dive

Melhores práticas de data warehouse no Amazon Redshift
Melhores práticas de data warehouse no Amazon RedshiftMelhores práticas de data warehouse no Amazon Redshift
Melhores práticas de data warehouse no Amazon RedshiftAmazon Web Services LATAM
 
Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesMigrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceAmazon Web Services
 
Data & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon RedshiftData & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon RedshiftAmazon Web Services
 
London Redshift Meetup - July 2017
London Redshift Meetup - July 2017London Redshift Meetup - July 2017
London Redshift Meetup - July 2017Pratim Das
 
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013Amazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftAmazon Web Services
 
Leveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseLeveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseAmazon Web Services
 
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015Amazon Web Services Korea
 
ENT309 scaling up to your first 10 million users
ENT309 scaling up to your first 10 million usersENT309 scaling up to your first 10 million users
ENT309 scaling up to your first 10 million usersAmazon Web Services
 
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceAmazon Web Services
 
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon RedshiftAmazon Web Services
 
AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...
AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...
AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...Amazon Web Services
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift Amazon Web Services
 

Ähnlich wie Amazon Redshift Deep Dive (20)

Melhores práticas de data warehouse no Amazon Redshift
Melhores práticas de data warehouse no Amazon RedshiftMelhores práticas de data warehouse no Amazon Redshift
Melhores práticas de data warehouse no Amazon Redshift
 
Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesMigrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
 
Data & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon RedshiftData & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon Redshift
 
London Redshift Meetup - July 2017
London Redshift Meetup - July 2017London Redshift Meetup - July 2017
London Redshift Meetup - July 2017
 
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
Leveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseLeveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data Warehouse
 
Redshift overview
Redshift overviewRedshift overview
Redshift overview
 
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
 
ENT309 scaling up to your first 10 million users
ENT309 scaling up to your first 10 million usersENT309 scaling up to your first 10 million users
ENT309 scaling up to your first 10 million users
 
AWS glue technical enablement training
AWS glue technical enablement trainingAWS glue technical enablement training
AWS glue technical enablement training
 
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
 
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
 
AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...
AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...
AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish...
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift
 

Mehr von Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mehr von Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Kürzlich hochgeladen

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 

Kürzlich hochgeladen (20)

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 

Amazon Redshift Deep Dive

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Alex Sinner, Solutions Architecture PMO – Amazon Web Services Luuk Linssen, Product Manager - Bannerconnect May 24, 2016 Amazon Redshift Deep Dive
  • 2. Fast, fully managed, petabyte-scale data warehousing for less than $1,000/TB/year Amazon Redshift
  • 3. Amazon Redshift delivers performance “[Amazon] Redshift is twenty times faster than Hive.” (5x–20x reduction in query times) link “Queries that used to take hours came back in seconds. Our analysts are orders of magnitude more productive.” (20x–40x reduction in query times) link “…[Amazon Redshift] performance has blown away everyone here (we generally see 50– 100x speedup over Hive).” link “Team played with [Amazon] Redshift today and concluded it is ****** awesome. Un-indexed complex queries returning in < 10s.” “Did I mention it's ridiculously fast? We'll be using it immediately to provide our analysts an alternative to Hadoop.” “We saw… 2x improvement in query times.” Channel “We regularly process multibillion row datasets and we do that in a matter of hours.” link
  • 4. Amazon Redshift system architecture 10 GigE (HPC) Ingestion Backup Restore SQL Clients/BI Tools 128GB RAM 16TB disk 16 cores Amazon S3/Amazon EMR/Amazon DynamoDB/SSH JDBC/ODBC 128GB RAM 16TB disk 16 cores Compute Node 128GB RAM 16TB disk 16 cores Compute Node 128GB RAM 16TB disk 16 cores Compute Node Leader Node
  • 5. A deeper look at compute node architecture Leader Node Dense compute nodes Large •  2 slices/cores •  15 GB RAM •  160 GB SSD 8XL •  32 slices/cores •  244 GB RAM •  2.56 TB SSD Dense storage nodes X-large •  2 slices/4 cores •  31 GB RAM •  2 TB HDD 8XL •  16 slices/ 36 cores •  244 GB RAM •  16 TB HDD
  • 7. Summary of Best Practices Table Design ü  Choose the best Sort Key ü  Choose the best Distribution Key ü  Compression Encodings – use automatic compression Loading Data ü  Use COPY command (not INSERT) ü  Load multiple, compressed files in a single COPY command, in Sort Key order
  • 8. Use multiple input files to maximize throughput COPY command Each slice loads one file at a time A single input file means only one slice is ingesting data Instead of full bandwidth, you only get 1/16th
  • 9. Use multiple input files to maximize throughput COPY command Use at least as many input files as you have slices With 16 input files, all slices are working so you maximize throughput from S3 Scale linearly as you add nodes
  • 10. Primary keys and manifest files Amazon Redshift doesn’t enforce primary key constraints •  If you load data multiple times, Amazon Redshift won’t complain •  If you declare primary keys, the optimizer will expect the data to be unique Use manifest files to control exactly what is loaded and how to respond if input files are missing •  Define a JSON manifest on Amazon S3 •  Ensures that the cluster loads exactly what you want
  • 11. Data hygiene Analyze tables regularly •  Every single load for popular columns •  Weekly for all columns •  Look SVV_TABLE_INFO(stats_off) for stale stats •  Look stl_alert_event_log for missing stats Vacuum tables regularly •  Weekly is a good target •  Number of unsorted blocks as trigger •  Look SVV_TABLE_INFO(unsorted, empty) •  Deep copy might be faster for high percent unsorted (20% unsorted usually is faster to deep copy)
  • 12. Automatic compression is a good thing (mostly) Better performance, lower costs Samples data automatically when COPY into an empty table •  Samples up to 100,000 rows and picks optimal encoding Regular ETL process using temp or staging tables: Turn off automatic compression •  Use analyze compression to determine the right encodings •  Bake those encodings into your DML
  • 13. Be careful when compressing your sort keys Zone maps store min/max per block After we know which block(s) contain the range, we know which row offsets to scan Highly compressed sort keys means many rows per block You’ll scan more data blocks than you need If your sort keys compress significantly more than your data columns, you might want to skip compression of sortkey column(s) Check SVV_TABLE_INFO(skew_sortkey1) COL1 COL2
  • 14. Keep your columns as narrow as possible •  Buffers allocated based on declared column width •  Wider than needed columns mean memory is wasted •  Fewer rows fit into memory; increased likelihood of queries spilling to disk •  Check SVV_TABLE_INFO(max_varchar)
  • 16. IAM Role support for COPY and UNLOAD Use IAM roles to securely give access permission to COPY or UNLOAD data from/to S3 / DynamoDB You can associate up to 10 IAM roles with a cluster Use IAM role ARN identifier in COPY/UNLOAD command Restrict roles to specific Redshift users
  • 17. New SQL functions We add SQL functions regularly to expand Amazon Redshift’s query capabilities Added 25+ window and aggregate functions since launch, including: •  LISTAGG •  [APPROXIMATE] COUNT •  DROP IF EXISTS, CREATE IF NOT EXISTS •  REGEXP_SUBSTR, _COUNT, _INSTR, _REPLACE •  PERCENTILE_CONT, _DISC, MEDIAN •  PERCENT_RANK, RATIO_TO_REPORT We’ll continue iterating but also want to enable you to write your own
  • 18. Scalar user-defined functions (UDFs) You can write UDFs using Python 2.7 •  Syntax is largely identical to PostgreSQL UDF syntax •  System and network calls within UDFs are prohibited Comes with Pandas, NumPy, and SciPy pre-installed •  You’ll also be able import your own libraries for even more flexibility
  • 19. Scalar UDF example CREATE FUNCTION f_hostname (VARCHAR url)   RETURNS varchar IMMUTABLE AS $$   import urlparse   return urlparse.urlparse(url).hostname $$ LANGUAGE plpythonu;
  • 20. Scalar UDF examples from partners http://www.looker.com/blog/amazon-redshift-user- defined-functions https://www.periscope.io/blog/redshift-user-defined- functions-python.html
  • 21. 1-click deployment to launch, on multiple regions around the world Pay-as-you-go pricing with no long term contracts required Advanced Analytics Business IntelligenceData Integration
  • 23. Compound sort keys Records in Amazon Redshift are stored in blocks For this illustration, let’s assume that four records fill a block Records with a given cust_id are all in one block However, records with a given prod_id are spread across four blocks 1 1 1 1 2 3 4 1 4 4 4 2 3 4 4 1 3 3 3 2 3 4 3 1 2 2 2 2 3 4 2 1 1  [1,1] [1,2] [1,3] [1,4] 2  [2,1] [2,2] [2,3] [2,4] 3  [3,1] [3,2] [3,3] [3,4] 4  [4,1] [4,2] [4,3] [4,4] 1 2 3 4 prod_id cust_id cust_id prod_id other columns blocks Select sum(amt) From big_tab Where cust_id = (1234); Select sum(amt) From big_tab Where prod_id = (5678);
  • 24. 1  [1,1] [1,2] [1,3] [1,4] 2  [2,1] [2,2] [2,3] [2,4] 3  [3,1] [3,2] [3,3] [3,4] 4  [4,1] [4,2] [4,3] [4,4] 1 2 3 4 prod_id cust_id Interleaved sort keys Records with a given cust_id are spread across two blocks Records with a given prod_id are also spread across two blocks Data is sorted in equal measures for both keys 1 1 2 2 2 1 2 3 3 4 4 4 3 4 3 1 3 4 4 2 1 2 3 3 1 2 2 4 3 4 1 1 cust_id prod_id other columns blocks
  • 25. Usage New keyword INTERLEAVED when defining sort keys •  Existing syntax will still work and behavior is unchanged •  You can choose up to 8 columns to include and can query with any or all of them No change needed to queries [[ COMPOUND | INTERLEAVED ] SORTKEY ( column_name [, ...] ) ]
  • 27.
  • 28. Typical ETL/ELT on legacy data warehouse   One file per table, maybe a few if too big   Many updates (“massage” the data)   Every job clears the data, then load   Count on primary key to block double loads   High concurrency of load jobs
  • 29. Two questions to ask Why do you do what you do? •  Many times, users don’t even know What is the customer need? •  Many times, needs do not match current practice •  You might benefit from adding other AWS services
  • 30. Open-source tools https://github.com/awslabs/amazon-redshift-utils Admin scripts •  Collection of utilities for running diagnostics on your cluster. Admin views •  Collection of utilities for managing your cluster, generating schema DDL, and so on Column encoding utility •  Gives you the ability to apply optimal column encoding to an established schema with data already loaded Analyze and vacuum utility •  Gives you the ability to automate VACUUM and ANALYZE operations Unload and copy utility •  Helps you to migrate data between Amazon Redshift clusters or databases
  • 32. we are passionate & resourceful people in pursuit of mastery we are Bannerconnect 90 employees founded in 2004 Amsterdam & Sittard
  • 33. old situation ü  700GB data per day ü  servicing 100 accounts
  • 34. today ü  1.9TB data per day ü  servicing 250 accounts
  • 35. challenges industry becomes more advanced processing more data than ever! from 700GB to 1.9TB on a daily basis need of near real time data processing not want to wait hours for new insights
  • 36. why amazon upscale your cluster within a couple of clicks on-premises hardware vs. virtual server no upfront costs
  • 37. our tech setup data collection data storage Amazon Redshift processing connecting to external platforms programmatic viewability adserver audiences input aggregated reports audiences data exports output client cluster (HDD) big data cluster (HDD) reporting data (SSD)
  • 38. outcome increase in speed 10 GB report went from minutes to seconds increase in detail all reports are now created on a hourly instead of daily basis completely new audience insights 45GB per report per account
  • 39. next steps expanding current stack Kinesis for (near) real time data processing
  • 40. lessons learned use the AWS best practices! ü  load compressed files instead of uncompressed files ü  make sure load data is being split into multiples files, where the number of files is a multiple of the number of slices in the cluster ü  make sure to choose the best distribution and sort key for your tables
  • 41. our contacts contact us +31 (0)46 707 4992 info@bannerconnect.net online www.bannerconnect.net /bannerconnect @bannerconnect /company/bannerconnect @wearebannerconnect address Poststraat 12 6135 KR, Sittard The Netherlands - Karperstraat 10 1075 KZ, Amsterdam The Netherlands
  • 42.