A Closer Look at Amazon Redshift
Building Access Log Analysis System for Rainist
YoungJoon Jeong - AWS Solutions Architect
Time : 16:20 – 17:20
Sunghyun Hwang - CTO of Rainist
We start with the customer… and innovate
Customers told us… | We created…
---------------------------------------------------+--------------------------
Enterprise-class, Accelerated Computing Instances | X1, P2, G2, I3 Instances*
Managing databases is painful & difficult | Amazon RDS
SQL DBs do not perform well at scale | Amazon DynamoDB
Hadoop is difficult to deploy and manage | Amazon EMR
DWs are complex, costly, and slow | Amazon Redshift
Commercial DBs are punitive & expensive | Amazon Aurora
Streaming data is difficult to capture & analyze | Amazon Kinesis
BI Tools are expensive and hard to manage | Amazon QuickSight
*https://aws.amazon.com/intel/
AWS Big Data Portfolio
Stages: Collect, Store, Analyze
Services shown: Glacier, S3, DynamoDB, RDS, Aurora, Data Pipeline, CloudSearch, EMR, EC2, Redshift, Machine Learning, ElasticSearch, Database Migration, QuickSight, Amazon Athena, Kinesis Firehose, Import Export, Direct Connect, Kinesis, Kinesis Analytics
Global Footprint
16 Regions; 42 Availability Zones; 76 Edge Locations
Redshift
Relational data warehouse
Massively parallel; Petabyte scale
Fully managed
HDD and SSD Platforms
$1,000/TB/Year; starts at $0.25/hour
Amazon
Redshift
a lot faster
a lot simpler
a lot cheaper
Amazon Redshift
NTT Docomo | Telecom; FINRA | Financial Svcs; Philips | Healthcare; Yelp | Technology; NASDAQ | Financial Svcs
The Weather Company | Media; Nokia | Telecom; Pinterest | Technology; Foursquare | Technology; Coursera | Education
Coinbase | Bitcoin; Amazon | E-Commerce; Etix | Entertainment; Spuul | Entertainment; Vivaki | Ad Tech
Z2 | Gaming; Neustar | Ad Tech; SoundCloud | Technology; BeachMint | E-Commerce; Civis | Technology
Selected Amazon Redshift Customers
Redshift is used for mission-critical workloads
Payments to suppliers
and billing workflows
Web/Mobile clickstream
and event analysis
Recommendation and
predictive analytics
Financial and management
reporting
Amazon Redshift Architecture
Leader Node
Simple SQL end point
Stores metadata
Optimizes query plan
Coordinates query execution
Compute Nodes
Local columnar storage
Parallel/distributed execution of all queries, loads, backups, restores, resizes
Start at just $0.25/hour, grow to 2 PB (compressed)
DC1: SSD; scale from 160 GB to 326 TB
DS1/DS2: HDD; scale from 2 TB to 2 PB
Ingestion/Backup
Backup
Restore
JDBC/ODBC
10 GigE
(HPC)
Benefit #1: Amazon Redshift is fast
Dramatically less I/O
Column storage
Data compression
Zone maps
Direct-attached storage
Large data block sizes
analyze compression listing;
Table | Column | Encoding
---------+----------------+----------
listing | listid | delta
listing | sellerid | delta32k
listing | eventid | delta32k
listing | dateid | bytedict
listing | numtickets | bytedict
listing | priceperticket | delta32k
listing | totalprice | mostly32
listing | listtime | raw
Block 1: 10 | 13 | 14 | 26 | … | 100 | 245 | 324 (MIN: 10, MAX: 324)
Block 2: 375 | 393 | 417 | … | 512 | 549 | 623 (MIN: 375, MAX: 623)
Block 3: 637 | 712 | 809 | … | 834 | 921 | 959 (MIN: 637, MAX: 959)
Benefit #1: Amazon Redshift is fast
Sort Keys and Zone Maps
SELECT COUNT(*) FROM LOGS WHERE DATE = '09-JUNE-2016'
Unsorted Table
MIN: 01-JUNE-2016, MAX: 20-JUNE-2016
MIN: 08-JUNE-2016, MAX: 30-JUNE-2016
MIN: 12-JUNE-2016, MAX: 20-JUNE-2016
MIN: 02-JUNE-2016, MAX: 25-JUNE-2016
Sorted By Date
MIN: 01-JUNE-2016, MAX: 06-JUNE-2016
MIN: 07-JUNE-2016, MAX: 12-JUNE-2016
MIN: 13-JUNE-2016, MAX: 18-JUNE-2016
MIN: 19-JUNE-2016, MAX: 24-JUNE-2016
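The effect of sorting on zone maps can be checked with a small sketch. It assumes a zone map is simply a (MIN, MAX) pair per block and that a block is read only when the filter value falls inside that range. The function name `blocks_to_scan` is illustrative, not a Redshift API; the date ranges are the ones on the slide.

```python
from datetime import date


def blocks_to_scan(zone_map, target):
    """Return indexes of blocks whose [MIN, MAX] range could contain target."""
    return [i for i, (lo, hi) in enumerate(zone_map) if lo <= target <= hi]


def june(day):
    """Helper for dates in June 2016, the month used on the slide."""
    return date(2016, 6, day)


# Block MIN/MAX ranges copied from the slide.
unsorted_blocks = [(june(1), june(20)), (june(8), june(30)),
                   (june(12), june(20)), (june(2), june(25))]
sorted_blocks = [(june(1), june(6)), (june(7), june(12)),
                 (june(13), june(18)), (june(19), june(24))]

target = june(9)  # WHERE DATE = '09-JUNE-2016'
unsorted_hits = blocks_to_scan(unsorted_blocks, target)
sorted_hits = blocks_to_scan(sorted_blocks, target)

print(len(unsorted_hits), "of", len(unsorted_blocks), "blocks read when unsorted")
print(len(sorted_hits), "of", len(sorted_blocks), "blocks read when sorted by date")
```

With the slide's ranges, the unsorted table has three candidate blocks and the sorted table has one, which is the I/O saving the slide illustrates.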
Benefit #1: Amazon Redshift is fast
Parallel and Distributed
Query
Load
Export
Backup
Restore
Resize
ID | Name
---+---------------
1 | John Smith
2 | Jane Jones
3 | Peter Black
4 | Pat Partridge
5 | Sarah Cyan
6 | Brian Snail
Rows as grouped in the diagram:
Group 1: 1 John Smith, 4 Pat Partridge
Group 2: 2 Jane Jones, 5 Sarah Cyan
Group 3: 3 Peter Black, 6 Brian Snail
Benefit #1: Amazon Redshift is fast
Distribution Keys
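The row placement in the diagram (IDs 1 and 4, 2 and 5, 3 and 6 together) is what a round-robin spread over three targets produces. The sketch below reproduces it. It is an illustration only: the slides do not say which distribution style was used, the function names are hypothetical, and CRC32 merely stands in for whatever hash Redshift applies to a distribution key.

```python
import zlib


def distribute_round_robin(rows, slice_count):
    """Assign rows to slices in turn, regardless of their values."""
    slices = [[] for _ in range(slice_count)]
    for position, row in enumerate(rows):
        slices[position % slice_count].append(row)
    return slices


def distribute_by_key(rows, slice_count, key):
    """Assign rows by a hash of the key so equal keys share a slice.

    CRC32 is an assumption used for illustration, not Redshift's hash.
    """
    slices = [[] for _ in range(slice_count)]
    for row in rows:
        digest = zlib.crc32(str(key(row)).encode("utf-8"))
        slices[digest % slice_count].append(row)
    return slices


rows = [(1, "John Smith"), (2, "Jane Jones"), (3, "Peter Black"),
        (4, "Pat Partridge"), (5, "Sarah Cyan"), (6, "Brian Snail")]

even_slices = distribute_round_robin(rows, 3)
key_slices = distribute_by_key(rows + rows, 3, key=lambda row: row[0])

for number, contents in enumerate(even_slices, start=1):
    print("slice", number, [row[0] for row in contents])
```

Key-based placement matters for joins: when two tables are distributed on the join column, matching rows already sit on the same slice.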
Benefit #1: Amazon Redshift is fast
H/W optimized for I/O intensive workloads, 4GB/sec/node
Enhanced networking, over 1M packets/sec/node
Choice of storage type, instance size
Regular cadence of auto-patched improvements
Example: Our new Dense Storage (HDD) instance type
Improved memory 2x, compute 2x, disk throughput 1.5x
Cost: same as our prior generation!
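Benefit #2: Amazon Redshift is inexpensive
DS2 (HDD) | Price per hour for DS2.XL single node | Effective annual price per TB compressed
-------------------+----------------------------------------+------------------------------------------
On-Demand | $1.150 | $5,037
1 Year Reservation | $0.670 | $2,910
3 Year Reservation | $0.280 | $1,226
DC1 (SSD) | Price per hour for DC1.L single node | Effective annual price per TB compressed
-------------------+----------------------------------------+------------------------------------------
On-Demand | $0.300 | $15,768
1 Year Reservation | $0.190 | $9,360
3 Year Reservation | $0.110 | $5,782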
Pricing is simple
Number of nodes x price/hour
No charge for leader node
No up front costs
Pay as you go
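The pricing rule above is easy to work through. The sketch below assumes the effective annual figure is the hourly node price times 8,760 hours divided by the node's storage, using the 2 TB per DS2.XL node stated elsewhere in the deck. That assumption reproduces the DS2 On-Demand and 3 Year rows; the other rows on the slide may reflect rounding or reservation terms that are not modelled here. Function names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760


def cluster_cost_per_hour(compute_nodes, price_per_node_hour):
    """Number of nodes x price/hour; the leader node is not charged."""
    return compute_nodes * price_per_node_hour


def effective_annual_price_per_tb(price_per_node_hour, tb_per_node):
    """Yearly cost of one node divided by its storage in TB (assumed formula)."""
    return price_per_node_hour * HOURS_PER_YEAR / tb_per_node


ds2_on_demand = effective_annual_price_per_tb(1.150, 2)
ds2_three_year = effective_annual_price_per_tb(0.280, 2)

print("DS2.XL on-demand, per TB per year:", round(ds2_on_demand))
print("DS2.XL 3-year reservation, per TB per year:", round(ds2_three_year))
print("4 compute nodes at $0.25/hour:", cluster_cost_per_hour(4, 0.25), "per hour")
```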
Benefit #2: Amazon Redshift is inexpensive
Benefit #2: Amazon Redshift lets you start small and grow big
Dense Storage (DS2.XL)
2 TB HDD, 31 GB RAM, 2 slices/4 cores
Single Node (2 TB)
Cluster 2-32 Nodes (4 TB – 64 TB)
Dense Storage (DS2.8XL)
16 TB HDD, 244 GB RAM, 16 slices/36 cores, 10 GigE
Cluster 2-128 Nodes (32 TB – 2 PB)
Note: Nodes not to scale
Continuous/incremental backups
Multiple copies within cluster
Continuous and incremental backups to S3
Continuous and incremental backups across regions
Streaming restore
Amazon S3
Amazon S3
Region 1
Region 2
Benefit #3: Amazon Redshift is fully managed
Fault tolerance
Disk failures
Node failures
Network failures
Availability Zone/Region level disasters
Benefit #4: Security is built-in
• Load encrypted from S3
• SSL to secure data in transit
• ECDHE perfect forward secrecy
• Amazon VPC for network isolation
• Encryption to secure data at rest
• All blocks on disks & in Amazon S3 encrypted
• Block key, Cluster key, Master key (AES-256)
• On-premises HSM, AWS CloudHSM & KMS support
• Audit logging and AWS CloudTrail integration
• SOC 1/2/3, PCI-DSS, FedRAMP, BAA
10 GigE
(HPC)
Ingestion
Backup
Restore
Customer VPC
Internal
VPC
JDBC/ODBC
•Initial Release in US East (N. Virginia), US West (Oregon), EU (Ireland), Asia Pacific (Tokyo, Singapore, Sydney) Regions
•MANIFEST option for the COPY & UNLOAD commands
•SQL Functions: Most recent queries
•Resource-level IAM, CRC32
•Data Pipeline
•Event notifications, encryption, key rotation, audit logging, on-premises or AWS CloudHSM; PCI, SOC 1/2/3
•Cross-Region Snapshot Copy
•Audit features, cursor support, 500 concurrent client connections
•EIP Address for VPC Cluster
•New system views to tune table design and track WLM query queues
•Custom ODBC/JDBC drivers; Query Visualization
•Mobile Analytics auto export
•KMS for GovCloud Region; HIPAA BAA
•Interleaved sort keys
•New Dense Storage Nodes (DS2) with better RAM and CPU.
•New Reserved Storage Nodes: No, Partial & All Upfront Options
•Cross-region backups for KMS encrypted clusters
•Scalar UDFs in Python
•AVRO Ingestion; Kinesis Firehose; Database Migration Service (DMS)
•Modify Cluster Dynamically
•Tag-based permissions and BZIP2
•System Tables for query Tuning
•Dense Compute Nodes
•Gzip & Lzop; JSON , RegEx, Cursors
•EMR Data Loading & Bootstrap Action with COPY command; WLM concurrency limit to 50; support for the ECDH cipher suites for SSL connections; FedRAMP
•Cross-region ingestion
•Free trials & price reductions in Asia Pacific
•CloudWatch Alarm for Disk Usage
•AES 128-bit encryption; UTF-16; KMS Integration
•EU (Frankfurt); GovCloud Regions
•S3 Server-side encryption support for UNLOAD
•Tagging Support for Cost-allocation
•WLM Queue-Hopping for timed-out queries
•Append rows & Export to BZIP-2
•Lambda for Clusters in VPC; Data Schema Conversion Support from ML Console
•US West (N. California) Region.
Benefit #5: We innovate quickly
2013 2014 2015 2016
100+ new features added since launch
Release every two weeks
Automatic patching
Benefit #6: Amazon Redshift is powerful
• Approximate functions
• User defined functions
• Machine Learning
• Data Science
Amazon ML
Benefit #7: Amazon Redshift has a large ecosystem
Data Integration, Business Intelligence, Systems Integrators
DynamoDB
EMR
S3
EC2/SSH
RDS/Aurora
Amazon Redshift
Amazon Kinesis
Machine
Learning
Data Pipeline
CloudSearch
Mobile Analytics
Benefit #8: Service oriented architecture
Building Access Log Analysis System
Using AWS Redshift
Sunghyun Hwang
CTO of Rainist
Banksalad
MAU: 18,925 users (2015-06) to 360,031 users (2017-02)
MAU grew 1,910% compared with two years earlier
Cards issued: 29 (2015-06) to 779 (2017-02)
Card issuance grew 4,100% compared with two years earlier
Purpose of the log analysis system
1. Collect statistics for financial institutions
2. Use and analyze the logs as feedback for designing a better customer experience
3. Detect abnormal requests
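Financial institution performance report
1. Impressions, clicks, and applications across all products, per financial institution
2. Impressions, clicks, and applications for popular financial products, per financial institution
3. Merchants that users visit most often: number of entries and average amount entered
4. Merchants where users spend the most: number of entries and average amount entered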
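As a rough illustration of the first report item, the sketch below counts impressions, clicks, and applications per financial institution from a list of log events. The event shape `(institution, action)`, the action names, and the institution names are hypothetical; the slides do not show Rainist's actual log schema or queries.

```python
from collections import defaultdict

ACTIONS = ("impression", "click", "application")  # assumed action names


def summarize(events):
    """Count each action type per institution from (institution, action) pairs."""
    report = defaultdict(lambda: {action: 0 for action in ACTIONS})
    for institution, action in events:
        report[institution][action] += 1
    return dict(report)


# Made-up sample events, for illustration only.
sample_events = [
    ("Bank A", "impression"), ("Bank A", "impression"), ("Bank A", "click"),
    ("Bank B", "impression"), ("Bank B", "click"), ("Bank B", "application"),
]

summary = summarize(sample_events)
for institution, counts in sorted(summary.items()):
    print(institution, counts)
```

In a warehouse the same result would normally come from a GROUP BY over the log table; the Python version only shows the shape of the aggregation.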
Challenges in building a log analysis system
1. Log data keeps growing as the service grows
2016-01-30: 50,000 records
2017-02-14: 3,000,000 records
2. Data needed for analysis is fragmented (in an MSA environment)
3. A small team (one server engineer and one data engineer)
Banksalad's adoption of Redshift
1. Log data keeps growing as the service grows
• Performance suited to adopting a data warehouse
2. Data needed for analysis is fragmented (in an MSA environment)
• Easy pipeline construction using AWS services
3. A small team (one server engineer and one data engineer)
• A familiar, comfortable environment for the team (SQL, Postgres interface)
Banksalad log analysis system architecture
Components shown: Amazon Route 53, ELB, Amazon S3, AWS Data Pipeline, Amazon Redshift, Amazon ECS (Card Domain), Amazon RDS (Aurora), Amazon ECS (Data Analysis), …, Amazon SES, Amazon CloudWatch, AWS Data Pipeline
Banksalad log analysis system results
Big Data in real world
When your data sets become so large and diverse
that you have to start innovating around how to
collect, store, process, analyze and share them
Generate
Collect & Store
Analyze
Collaborate & Act
Individual AWS customers
generate over a PB/day
It’s never been easier to generate vast amounts of data
Amazon S3 lets you collect and store all this data
Store exabytes of
data in S3
Highly
Constrained
But how do you analyze it?
The Dark Data Problem
Most generated data is unavailable for analysis
[Chart: Data Volume by Year, 1990 to 2020; series: Generated Data, Available for Analysis]
Sources:
Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011
IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
The tyranny of “OR”
• Amazon Redshift
• Super-fast local disk performance
• Sophisticated query optimization
• Join-optimized data formats
• Query using standard SQL
• Optimized for data warehousing
• Amazon EMR
• Directly access data in S3
• Scale out to thousands of nodes
• Open data formats
• Popular big data frameworks
• Anything you can dream up and code
You don’t need to choose.
I shouldn’t have to choose
I want “all of the above”
Amazon Redshift Spectrum
S3
SQL
Fast @ exabyte scale
Elastic & highly available
On-demand, pay-per-query
High concurrency: Multiple
clusters access same data
No ETL: Query data in-place
using open file formats
Full Amazon Redshift
SQL support
Run SQL queries directly against data in S3 using thousands of nodes
Amazon Redshift Spectrum is fast
Leverages Amazon Redshift’s advanced cost-based optimizer
Pushes down projections, filters, aggregations and join reduction
Dynamic partition pruning to minimize data processed
Automatic parallelization of query execution against S3 data
Efficient join processing within the Amazon Redshift cluster
Amazon Redshift Spectrum is cost-effective
You pay for your Amazon Redshift cluster plus $5 per TB scanned from S3
Each query can leverage 1000s of Amazon Redshift Spectrum nodes
You can reduce the TB scanned and improve query performance by:
• Partitioning data
• Using a columnar file format
• Compressing data
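The three levers listed above all shrink the amount of data scanned, and the scan charge follows directly from it. The sketch below assumes the reductions multiply independently, which is a simplification. The $5 per TB price is from the slide; the input figures (10 TB of raw text, one tenth of the partitions, a quarter of the columns, 5x compression) are made up for illustration, and the function names are hypothetical.

```python
PRICE_PER_TB_SCANNED = 5.0  # USD, from the slide


def scanned_tb(raw_tb, partition_fraction=1.0, column_fraction=1.0,
               compression_ratio=1.0):
    """Estimate TB read from S3 after pruning, column selection, compression."""
    return raw_tb * partition_fraction * column_fraction / compression_ratio


def spectrum_scan_cost(tb):
    """Scan charge only; the Amazon Redshift cluster is billed separately."""
    return tb * PRICE_PER_TB_SCANNED


baseline_cost = spectrum_scan_cost(scanned_tb(10))
optimized_cost = spectrum_scan_cost(
    scanned_tb(10, partition_fraction=0.1, column_fraction=0.25,
               compression_ratio=5))

print(f"unoptimized scan: ${baseline_cost:.2f}")
print(f"partitioned, columnar, compressed: ${optimized_cost:.2f}")
```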
Amazon Redshift Spectrum is secure
End-to-end data encryption
Encrypt S3 data using SSE and AWS KMS
Encrypt all Amazon Redshift data using KMS, AWS CloudHSM or your on-premises HSMs
Enforce SSL with perfect forward secrecy using ECDHE
Virtual private cloud
Amazon Redshift leader node in your VPC. Compute nodes in private VPC. Spectrum nodes in private VPC, store no state.
Alerts & notifications
Communicate event-specific notifications via email, text message, or call with Amazon SNS
Audit logging
All API calls are logged using AWS CloudTrail
All SQL statements are logged within Amazon Redshift
Certifications & compliance
PCI/DSS, FedRAMP, SOC1/2/3, HIPAA/BAA
Amazon Redshift Spectrum uses standard SQL
Redshift Spectrum seamlessly integrates with your existing SQL & BI apps
Support for complex joins, nested queries & window functions
Support for data partitioned in S3 by any key
Date, Time and any other custom keys
e.g., Year, Month, Day, Hour
Is Amazon Redshift Spectrum useful if I don’t have an exabyte?
Your data will get bigger
On average, data warehousing volumes grow 10x every 5 years
The average Amazon Redshift customer doubles data each year
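The two growth rates quoted above can be put on the same footing with a little arithmetic. The sketch below assumes steady compound growth; the helper names are illustrative.

```python
def annual_factor(total_factor, years):
    """Yearly growth factor that compounds to total_factor over the period."""
    return total_factor ** (1 / years)


def growth_over(annual, years):
    """Total growth after compounding a yearly factor for the given years."""
    return annual ** years


industry_annual = annual_factor(10, 5)   # 10x every 5 years
redshift_five_year = growth_over(2, 5)   # doubling every year

print(f"10x in 5 years is about {industry_annual:.2f}x per year")
print(f"doubling yearly is {redshift_five_year}x in 5 years")
```

Under that assumption, 10x in five years is roughly 1.58x per year, while doubling every year compounds to 32x over the same five years.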
Amazon Redshift Spectrum makes data analysis simpler
Access your data without ETL pipelines
Teams using Amazon EMR, Athena & Redshift can collaborate using the same data lake
Amazon Redshift Spectrum improves availability and concurrency
Run multiple Amazon Redshift clusters against common data
Isolate jobs with tight SLAs from ad hoc analysis
The Emerging Analytics Architecture
Amazon Athena
Interactive Query
AWS Glue
ETL & Data Catalog
Layers: Storage, Serverless Compute, Data Processing
Amazon S3
Exabyte-scale Object Storage
Amazon Kinesis Firehose
Real-Time Data Streaming
Amazon EMR
Managed Hadoop Applications
AWS Lambda
Trigger-based Code Execution
AWS Glue Data Catalog
Hive-compatible Metastore
Amazon Redshift Spectrum
Fast @ Exabyte scale
Amazon Redshift
Petabyte-scale Data Warehousing
Over 20 customers helped preview Amazon Redshift Spectrum
Defining External Schema and Creating Tables
Define an external schema in Amazon Redshift using the Amazon Athena data catalog or your own Apache Hive Metastore
CREATE EXTERNAL SCHEMA <schema_name>
Query external tables using <schema_name>.<table_name>
Register external tables using Athena, your Hive Metastore client, or from Amazon Redshift with CREATE EXTERNAL TABLE syntax
CREATE EXTERNAL TABLE <table_name>
[PARTITIONED BY (<column_name> <data_type>, …)]
STORED AS file_format
LOCATION s3_location
[TABLE PROPERTIES (property_name=property_value, …)];
Amazon Redshift Spectrum – Current support
File formats
• Parquet
• CSV
• Sequence
• RCFile
• ORC (coming soon)
• RegExSerDe (coming soon)
Compression
• Gzip
• Snappy
• Lzo (coming soon)
• Bz2
Encryption
• SSE with AES256
• SSE KMS with default key
Column types
• Numeric: bigint, int, smallint, float, double and decimal
• Char/varchar/string
• Timestamp
• Boolean
• DATE type can be used only as a partitioning key
Table type
• Non-partitioned table (s3://mybucket/orders/..)
• Partitioned table (s3://mybucket/orders/date=YYYY-MM-DD/..)
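The partitioned layout above puts each day's files under a `date=YYYY-MM-DD` prefix, so a query for a date range only needs the matching prefixes. The sketch below builds those prefixes. It uses the example bucket path from the slide and hypothetical helper names, and it does not call S3 or Redshift.

```python
from datetime import date, timedelta


def partition_prefix(base, day):
    """Build the key=value style prefix for one day's partition."""
    return f"{base.rstrip('/')}/date={day.isoformat()}/"


def prefixes_for_range(base, first_day, last_day):
    """List the partition prefixes covering an inclusive date range."""
    days = (last_day - first_day).days + 1
    return [partition_prefix(base, first_day + timedelta(days=n))
            for n in range(days)]


prefixes = prefixes_for_range("s3://mybucket/orders",
                              date(2017, 2, 13), date(2017, 2, 15))
for prefix in prefixes:
    print(prefix)
```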
Converting to Parquet and ORC using Amazon EMR
You can use Hive CREATE TABLE AS SELECT to convert data
CREATE TABLE data_converted
STORED AS PARQUET
AS
SELECT col_1, col_2, col_3 FROM data_source;
Or use Spark: about 20 lines of PySpark code, running on Amazon EMR
• 1TB of text data reduced to 130 GB in Parquet format with snappy compression
• Total cost of EMR job to do this: $5
https://github.com/awslabs/aws-big-data-blog/tree/master/aws-blog-spark-parquet-conversion
Let's build an analytic query - #1
An author is releasing the 8th book in her popular series. How many should we order for Seattle? What were prior first few day sales?
Let's get the prior books she's written.
1 Table
2 Filters
SELECT
  P.ASIN,
  P.TITLE
FROM
  products P
WHERE
  P.TITLE LIKE '%Potter%' AND
  P.AUTHOR = 'J. K. Rowling';
Let's build an analytic query - #2
An author is releasing the 8th book in her popular series. How many should we order for Seattle? What were prior first few day sales?
Let's compute the sales of the prior books she's written in this series and return the top 20 values
2 Tables (1 S3, 1 local)
2 Filters
1 Join
2 Group By columns
1 Order By
1 Limit
1 Aggregation
SELECT
  P.ASIN,
  P.TITLE,
  SUM(D.QUANTITY * D.OUR_PRICE) AS SALES_sum
FROM
  s3.d_customer_order_item_details D,
  products P
WHERE
  D.ASIN = P.ASIN AND
  P.TITLE LIKE '%Potter%' AND
  P.AUTHOR = 'J. K. Rowling'
GROUP BY P.ASIN, P.TITLE
ORDER BY SALES_sum DESC
LIMIT 20;
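The shape of query #2 (join, two filters, grouped sum, ordering, limit) can be tried locally with Python's built-in `sqlite3` module. This is only a sketch: both tables are local SQLite tables here, whereas on the slide the order details live in S3 and are read through Redshift Spectrum, and every row of sample data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (asin TEXT, title TEXT, author TEXT);
CREATE TABLE d_customer_order_item_details (
    asin TEXT, quantity INTEGER, our_price REAL);

-- Made-up sample rows, for illustration only.
INSERT INTO products VALUES
    ('A1', 'Harry Potter Book One', 'J. K. Rowling'),
    ('A2', 'Harry Potter Book Two', 'J. K. Rowling'),
    ('B1', 'Unrelated Title', 'Someone Else');
INSERT INTO d_customer_order_item_details VALUES
    ('A1', 2, 10.0), ('A1', 1, 10.0), ('A2', 5, 12.0), ('B1', 9, 3.0);
""")

query = """
SELECT P.asin, P.title, SUM(D.quantity * D.our_price) AS sales_sum
FROM d_customer_order_item_details D, products P
WHERE D.asin = P.asin
  AND P.title LIKE '%Potter%'
  AND P.author = 'J. K. Rowling'
GROUP BY P.asin, P.title
ORDER BY sales_sum DESC
LIMIT 20
"""

rows = conn.execute(query).fetchall()
for asin, title, sales_sum in rows:
    print(asin, title, sales_sum)
conn.close()
```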
Let's build an analytic query - #3
An author is releasing the 8th book in her popular series. How many should we order for Seattle? What were prior first few day sales?
Let's compute the sales of the prior books she's written in this series and return the top 20 values, just for the first three days of sales of first editions
3 Tables (1 S3, 2 local)
5 Filters
2 Joins
3 Group By columns
1 Order By
1 Limit
1 Aggregation
1 Function
2 Casts
SELECT
  P.ASIN,
  P.TITLE,
  P.RELEASE_DATE,
  SUM(D.QUANTITY * D.OUR_PRICE) AS SALES_sum
FROM
  s3.d_customer_order_item_details D,
  asin_attributes A,
  products P
WHERE
  D.ASIN = P.ASIN AND
  P.ASIN = A.ASIN AND
  A.EDITION LIKE '%FIRST%' AND
  P.TITLE LIKE '%Potter%' AND
  P.AUTHOR = 'J. K. Rowling' AND
  D.ORDER_DAY :: DATE >= P.RELEASE_DATE AND
  D.ORDER_DAY :: DATE < dateadd(day, 3, P.RELEASE_DATE)
GROUP BY P.ASIN, P.TITLE, P.RELEASE_DATE
ORDER BY SALES_sum DESC
LIMIT 20;
Let's build an analytic query - #4
An author is releasing the 8th book in her popular series. How many should we order for Seattle? What were prior first few day sales?
Let's compute the sales of the prior books she's written in this series and return the top 20 values, just for the first three days of sales of first editions in the city of Seattle, WA, USA
4 Tables (1 S3, 3 local)
8 Filters
3 Joins
4 Group By columns
1 Order By
1 Limit
1 Aggregation
1 Function
2 Casts
SELECT
  P.ASIN,
  P.TITLE,
  R.POSTAL_CODE,
  P.RELEASE_DATE,
  SUM(D.QUANTITY * D.OUR_PRICE) AS SALES_sum
FROM
  s3.d_customer_order_item_details D,
  asin_attributes A,
  products P,
  regions R
WHERE
  D.ASIN = P.ASIN AND
  P.ASIN = A.ASIN AND
  D.REGION_ID = R.REGION_ID AND
  A.EDITION LIKE '%FIRST%' AND
  P.TITLE LIKE '%Potter%' AND
  P.AUTHOR = 'J. K. Rowling' AND
  R.COUNTRY_CODE = 'US' AND
  R.CITY = 'Seattle' AND
  R.STATE = 'WA' AND
  D.ORDER_DAY :: DATE >= P.RELEASE_DATE AND
  D.ORDER_DAY :: DATE < dateadd(day, 3, P.RELEASE_DATE)
GROUP BY P.ASIN, P.TITLE, R.POSTAL_CODE, P.RELEASE_DATE
ORDER BY SALES_sum DESC
LIMIT 20;
Now let's run that query over an exabyte of data in S3
Roughly 140 TB of customer item order detail records for each day over past 20 years.
190 million files across 15,000 partitions in S3. One partition per day for USA and rest of world.
Need a billion-fold reduction in data processed.
Running this query using a 1000 node Hive cluster would take over 5 years.*
• Compression: 5X
• Columnar file format: 10X
• Scanning with 2500 nodes: 2500X
• Static partition elimination: 2X
• Dynamic partition elimination: 350X
• Redshift's query optimizer: 40X
Total reduction: 3.5B X
* Estimated using 20 node Hive cluster & 1.4TB, assume linear
* Query used a 20 node DC1.8XLarge Amazon Redshift cluster
* Not actual sales data - generated for this demo based on data format used by Amazon Retail.
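The headline figures on this slide follow from simple arithmetic, which the sketch below reproduces from the numbers quoted above. It assumes decimal units (1 EB = 1,000,000 TB), 365-day years, and that the reduction factors multiply.

```python
from math import prod

# Dataset size: roughly 140 TB per day over 20 years.
total_tb = 140 * 365 * 20
total_eb = total_tb / 1_000_000

# One partition per day for USA and rest of world.
partitions = 365 * 20 * 2

# Reduction factors as listed on the slide.
factors = {
    "compression": 5,
    "columnar file format": 10,
    "scanning with 2500 nodes": 2500,
    "static partition elimination": 2,
    "dynamic partition elimination": 350,
    "query optimizer": 40,
}
total_reduction = prod(factors.values())

print(f"dataset: {total_tb:,} TB (about {total_eb:.2f} EB)")
print(f"partitions: {partitions:,}")
print(f"total reduction: {total_reduction:,}x")
```

The product comes to 3.5 billion, matching the slide's total, and the dataset works out to just over one exabyte in about 14,600 daily partitions, consistent with the quoted 15,000.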
Recap: Redshift and Redshift Spectrum are…
Relational data warehouse
Can be streamed in
Can be processed in real time
Can be expanded to exabyte scale
You can mix and match
On premises and cloud
Custom development and managed services
Infrastructure with managed scaling, security
Redshift can support
Thank you!
Questions?
One More Thing
Beyond Amazon Redshift
Kinesis Stream, Kinesis Firehose
Elastic MapReduce
Amazon Machine Learning
Amazon QuickSight
End of Document

Weitere ähnliche Inhalte

Was ist angesagt?

Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Chris Fregly
 
AWS Database Migration Service
AWS Database Migration ServiceAWS Database Migration Service
AWS Database Migration Servicetechugo
 
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Amazon Web Services
 
Building A Modern Data Analytics Architecture on AWS
Building A Modern Data Analytics Architecture on AWSBuilding A Modern Data Analytics Architecture on AWS
Building A Modern Data Analytics Architecture on AWSAmazon Web Services
 
ENT306 Migrating large Scale Data Sets to the Cloud
ENT306 Migrating large Scale Data Sets to the CloudENT306 Migrating large Scale Data Sets to the Cloud
ENT306 Migrating large Scale Data Sets to the CloudAmazon Web Services
 
New Database Migration Services & RDS Updates
New Database Migration Services & RDS UpdatesNew Database Migration Services & RDS Updates
New Database Migration Services & RDS UpdatesAmazon Web Services
 
(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...Amazon Web Services
 
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...Amazon Web Services
 
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMRBDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMRAmazon Web Services
 
SRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBSRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBAmazon Web Services
 
2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개
2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개
2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개Amazon Web Services Korea
 
Getting started with Amazon DynamoDB
Getting started with Amazon DynamoDBGetting started with Amazon DynamoDB
Getting started with Amazon DynamoDBAmazon Web Services
 
re:Invent Round-up, Time Stream, Quantum and Managed Blockchain
re:Invent Round-up, Time Stream, Quantum and Managed Blockchain re:Invent Round-up, Time Stream, Quantum and Managed Blockchain
re:Invent Round-up, Time Stream, Quantum and Managed Blockchain Amazon Web Services
 
Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...
Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...
Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...Amazon Web Services
 
Optimizing Storage for Big Data Analytics Workloads
Optimizing Storage for Big Data Analytics WorkloadsOptimizing Storage for Big Data Analytics Workloads
Optimizing Storage for Big Data Analytics WorkloadsAmazon Web Services
 
AWS Webcast - Amazon Kinesis and Apache Storm
AWS Webcast - Amazon Kinesis and Apache StormAWS Webcast - Amazon Kinesis and Apache Storm
AWS Webcast - Amazon Kinesis and Apache StormAmazon Web Services
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftAmazon Web Services
 
Optimizing Storage for Big Data/Analytics Workloads
Optimizing Storage for Big Data/Analytics WorkloadsOptimizing Storage for Big Data/Analytics Workloads
Optimizing Storage for Big Data/Analytics WorkloadsAmazon Web Services
 
SRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal Health
SRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal HealthSRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal Health
SRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal HealthAmazon Web Services
 

Was ist angesagt? (20)

Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
 
AWS Database Migration Service
AWS Database Migration ServiceAWS Database Migration Service
AWS Database Migration Service
 
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
 
Building A Modern Data Analytics Architecture on AWS
Building A Modern Data Analytics Architecture on AWSBuilding A Modern Data Analytics Architecture on AWS
Building A Modern Data Analytics Architecture on AWS
 
ENT306 Migrating large Scale Data Sets to the Cloud
ENT306 Migrating large Scale Data Sets to the CloudENT306 Migrating large Scale Data Sets to the Cloud
ENT306 Migrating large Scale Data Sets to the Cloud
 
New Database Migration Services & RDS Updates
New Database Migration Services & RDS UpdatesNew Database Migration Services & RDS Updates
New Database Migration Services & RDS Updates
 
(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
 
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
 
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMRBDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
 
SRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBSRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDB
 
2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개
2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개
2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개
 
Getting started with Amazon DynamoDB
Getting started with Amazon DynamoDBGetting started with Amazon DynamoDB
Getting started with Amazon DynamoDB
 
re:Invent Round-up, Time Stream, Quantum and Managed Blockchain
re:Invent Round-up, Time Stream, Quantum and Managed Blockchain re:Invent Round-up, Time Stream, Quantum and Managed Blockchain
re:Invent Round-up, Time Stream, Quantum and Managed Blockchain
 
Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...
Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...
Migrate from SQL Server or Oracle into Amazon Aurora using AWS Database Migra...
 
Optimizing Storage for Big Data Analytics Workloads
Optimizing Storage for Big Data Analytics WorkloadsOptimizing Storage for Big Data Analytics Workloads
Optimizing Storage for Big Data Analytics Workloads
 
AWS Webcast - Amazon Kinesis and Apache Storm
AWS Webcast - Amazon Kinesis and Apache StormAWS Webcast - Amazon Kinesis and Apache Storm
AWS Webcast - Amazon Kinesis and Apache Storm
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
Optimizing Storage for Big Data/Analytics Workloads
Optimizing Storage for Big Data/Analytics WorkloadsOptimizing Storage for Big Data/Analytics Workloads
Optimizing Storage for Big Data/Analytics Workloads
 
SRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal Health
SRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal HealthSRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal Health
SRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal Health
 
Real-Time Event Processing
Real-Time Event ProcessingReal-Time Event Processing
Real-Time Event Processing
 

2017 AWS DB Day | A Closer Look at Amazon Redshift

  • 1. A Closer Look at Amazon Redshift: Building Access Log Analysis System for Rainist. YoungJoon Jeong - AWS Solutions Architect; Sunghyun Hwang - CTO of Rainist. Time: 16:20 – 17:20
  • 2. We start with the customer… and innovate
      Customers told us… | We created…
      Enterprise-class, Accelerated Computing Instances | X1, P2, G2, I3 Instances*
      Managing databases is painful & difficult | Amazon RDS
      SQL DBs do not perform well at scale | Amazon DynamoDB
      Hadoop is difficult to deploy and manage | Amazon EMR
      DWs are complex, costly, and slow | Amazon Redshift
      Commercial DBs are punitive & expensive | Amazon Aurora
      Streaming data is difficult to capture & analyze | Amazon Kinesis
      BI tools are expensive and hard to manage | Amazon QuickSight
      *https://aws.amazon.com/intel/
  • 3. AWS Big Data Portfolio (Collect, Store, Analyze): Glacier, S3, DynamoDB, RDS, Aurora, Data Pipeline, CloudSearch, EMR, EC2, Redshift, Machine Learning, ElasticSearch, Database Migration, QuickSight, Amazon Athena, Kinesis Firehose, Import Export, Direct Connect, Kinesis, Kinesis Analytics
  • 4. Global Footprint 16 Regions; 42 Availability Zones; 76 Edge Locations Redshift
  • 5. Amazon Redshift: relational data warehouse; massively parallel, petabyte scale; fully managed; HDD and SSD platforms; $1,000/TB/year, starts at $0.25/hour. A lot faster, a lot simpler, a lot cheaper.
  • 6. Selected Amazon Redshift Customers: NTT Docomo (Telecom), FINRA (Financial Svcs), Philips (Healthcare), Yelp (Technology), NASDAQ (Financial Svcs), The Weather Company (Media), Nokia (Telecom), Pinterest (Technology), Foursquare (Technology), Coursera (Education), Coinbase (Bitcoin), Amazon (E-Commerce), Etix (Entertainment), Spuul (Entertainment), Vivaki (Ad Tech), Z2 (Gaming), Neustar (Ad Tech), SoundCloud (Technology), BeachMint (E-Commerce), Civis (Technology)
  • 7. Redshift is used for mission-critical workloads: payments to suppliers and billing workflows; web/mobile clickstream and event analysis; recommendation and predictive analytics; financial and management reporting
  • 8. Amazon Redshift Architecture. Leader Node: simple SQL end point; stores metadata; optimizes query plan; coordinates query execution. Compute Nodes: local columnar storage; parallel/distributed execution of all queries, loads, backups, restores, resizes. Start at just $0.25/hour, grow to 2 PB (compressed). DC1: SSD; scale from 160 GB to 326 TB. DS1/DS2: HDD; scale from 2 TB to 2 PB. Diagram labels: JDBC/ODBC; 10 GigE (HPC); Ingestion, Backup, Restore
  • 9. Benefit #1: Amazon Redshift is fast. Dramatically less I/O: column storage; data compression; zone maps; direct-attached storage; large data block sizes.
      analyze compression listing;
       Table   | Column         | Encoding
      ---------+----------------+----------
       listing | listid         | delta
       listing | sellerid       | delta32k
       listing | eventid        | delta32k
       listing | dateid         | bytedict
       listing | numtickets     | bytedict
       listing | priceperticket | delta32k
       listing | totalprice     | mostly32
       listing | listtime       | raw
      Zone map illustration: block 1 holds 10 | 13 | 14 | 26 | … | 100 | 245 | 324 (min 10, max 324); block 2 holds 375 | 393 | 417 | … | 512 | 549 | 623 (min 375, max 623); block 3 holds 637 | 712 | 809 | … | 834 | 921 | 959 (min 637, max 959)
  • 10. Benefit #1: Amazon Redshift is fast. Sort Keys and Zone Maps. Query: SELECT COUNT(*) FROM LOGS WHERE DATE = '09-JUNE-2016'
      Unsorted table, per-block ranges: MIN 01-JUNE-2016, MAX 20-JUNE-2016; MIN 08-JUNE-2016, MAX 30-JUNE-2016; MIN 12-JUNE-2016, MAX 20-JUNE-2016; MIN 02-JUNE-2016, MAX 25-JUNE-2016
      Sorted by date, per-block ranges: MIN 01-JUNE-2016, MAX 06-JUNE-2016; MIN 07-JUNE-2016, MAX 12-JUNE-2016; MIN 13-JUNE-2016, MAX 18-JUNE-2016; MIN 19-JUNE-2016, MAX 24-JUNE-2016
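The effect of sorting on zone maps can be checked with a small sketch. This is an illustration only, not Redshift's implementation: the function name `blocks_to_scan` and the helper `june` are made up here, and the block ranges are the ones shown on the slide. A block has to be read only if the filter value falls inside its min/max range.

```python
from datetime import date


def june(day):
    """Hypothetical helper: a date in June 2016, matching the slide's example."""
    return date(2016, 6, day)


def blocks_to_scan(zone_map, target):
    """Return indexes of blocks whose [min, max] range could contain target."""
    return [i for i, (lo, hi) in enumerate(zone_map) if lo <= target <= hi]


# Per-block (min, max) ranges copied from the slide.
unsorted_blocks = [
    (june(1), june(20)),
    (june(8), june(30)),
    (june(12), june(20)),
    (june(2), june(25)),
]
sorted_blocks = [
    (june(1), june(6)),
    (june(7), june(12)),
    (june(13), june(18)),
    (june(19), june(24)),
]

target = june(9)  # WHERE DATE = '09-JUNE-2016'
unsorted_hits = blocks_to_scan(unsorted_blocks, target)
sorted_hits = blocks_to_scan(sorted_blocks, target)

print("unsorted table: blocks to read =", unsorted_hits)
print("sorted table:   blocks to read =", sorted_hits)
```

With these ranges the unsorted table needs three of its four blocks read, while the table sorted by date needs only one.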
  • 11. Benefit #1: Amazon Redshift is fast. Parallel and distributed: Query, Load, Export, Backup, Restore, Resize
  • 12. Benefit #1: Amazon Redshift is fast. Distribution Keys
      Source table (ID, Name): 1 John Smith; 2 Jane Jones; 3 Peter Black; 4 Pat Partridge; 5 Sarah Cyan; 6 Brian Snail
      Distributed placement: (1 John Smith, 4 Pat Partridge); (2 Jane Jones, 5 Sarah Cyan); (3 Peter Black, 6 Brian Snail)
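The placement shown on this slide can be reproduced with a small sketch. It assumes three slices and uses `(id - 1) % 3` as a stand-in for the real distribution function; Redshift hashes the distribution key internally, so the modulo rule and the name `distribute` are illustrative assumptions chosen only because they reproduce the slide's grouping.

```python
from collections import defaultdict

# Rows copied from the slide: (ID, Name).
ROWS = [
    (1, "John Smith"),
    (2, "Jane Jones"),
    (3, "Peter Black"),
    (4, "Pat Partridge"),
    (5, "Sarah Cyan"),
    (6, "Brian Snail"),
]


def distribute(rows, num_slices):
    """Assign each row to a slice based on its key (hypothetical stand-in for hashing)."""
    slices = defaultdict(list)
    for row_id, name in rows:
        slice_index = (row_id - 1) % num_slices  # assumption, not Redshift's hash
        slices[slice_index].append((row_id, name))
    return dict(slices)


placement = distribute(ROWS, num_slices=3)
for slice_index in sorted(placement):
    print(f"slice {slice_index}: {placement[slice_index]}")
```

Because rows with the same key always land on the same slice, two tables distributed on the same join key can be joined without moving rows between nodes.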
  • 13. Benefit #1: Amazon Redshift is fast
  • 14. Benefit #1: Amazon Redshift is fast. H/W optimized for I/O-intensive workloads, 4 GB/sec/node; enhanced networking, over 1M packets/sec/node; choice of storage type, instance size; regular cadence of auto-patched improvements. Example: our new Dense Storage (HDD) instance type improved memory 2x, compute 2x, disk throughput 1.5x. Cost: same as our prior generation!
  • 15. Benefit #2: Amazon Redshift is inexpensive
  • 16. Benefit #2: Amazon Redshift is inexpensive
      DS2 (HDD), DS2.XL single node | Price per hour | Effective annual price per TB (compressed)
      On-Demand                     | $1.150         | $5,037
      1 Year Reservation            | $0.670         | $2,910
      3 Year Reservation            | $0.280         | $1,226
      DC1 (SSD), DC1.L single node  | Price per hour | Effective annual price per TB (compressed)
      On-Demand                     | $0.300         | $15,768
      1 Year Reservation            | $0.190         | $9,360
      3 Year Reservation            | $0.110         | $5,782
      Pricing is simple: number of nodes x price/hour; no charge for leader node; no up-front costs; pay as you go
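The "effective annual price per TB" column can be approximated from the hourly price. The sketch below assumes 8,760 hours per year and 2 TB per DS2.XL node (the capacity given for DS2.XL elsewhere in the talk); the function name is made up for illustration. Under those assumptions the DS2.XL On-Demand and 3 Year rows come out as on the slide, while the other rows do not reproduce exactly from the rounded hourly prices, so treat this as an approximation rather than the pricing formula.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def effective_annual_price_per_tb(price_per_hour, tb_per_node):
    """Annual cost of one node divided by its storage capacity in TB."""
    return price_per_hour * HOURS_PER_YEAR / tb_per_node


DS2_XL_TB = 2.0  # assumption: 2 TB HDD per DS2.XL node

on_demand = effective_annual_price_per_tb(1.150, DS2_XL_TB)
three_year = effective_annual_price_per_tb(0.280, DS2_XL_TB)

print(f"DS2.XL on-demand: ${on_demand:,.0f} per TB per year")
print(f"DS2.XL 3-year:    ${three_year:,.0f} per TB per year")
```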
  • 17. Benefit #2: Amazon Redshift lets you start small and grow big. Dense Storage (DS2.XL): 2 TB HDD, 31 GB RAM, 2 slices/4 cores; single node (2 TB); cluster of 2-32 nodes (4 TB – 64 TB). Dense Storage (DS2.8XL): 16 TB HDD, 244 GB RAM, 16 slices/36 cores, 10 GigE; cluster of 2-128 nodes (32 TB – 2 PB). Note: nodes not to scale
  • 18. Benefit #3: Amazon Redshift is fully managed. Continuous/incremental backups: multiple copies within cluster; continuous and incremental backups to S3; continuous and incremental backups across regions; streaming restore. Diagram: Amazon S3 in Region 1, Amazon S3 in Region 2
  • 19. Benefit #3: Amazon Redshift is fully managed. Fault tolerance: disk failures; node failures; network failures; Availability Zone/Region level disasters. Diagram: Amazon S3 in Region 1, Amazon S3 in Region 2
  • 20. Benefit #4: Security is built-in • Load encrypted from S3 • SSL to secure data in transit (ECDHE perfect forward secrecy) • Amazon VPC for network isolation • Encryption to secure data at rest: all blocks on disks & in Amazon S3 encrypted; block key, cluster key, master key (AES-256); on-premises HSM, AWS CloudHSM & KMS support • Audit logging and AWS CloudTrail integration • SOC 1/2/3, PCI-DSS, FedRAMP, BAA. Diagram labels: Customer VPC; Internal VPC; JDBC/ODBC; 10 GigE (HPC); Ingestion, Backup, Restore
  • 21. Benefit #5: We innovate quickly. 100+ new features added since launch; release every two weeks; automatic patching. Timeline 2013, 2014, 2015, 2016: • Initial release in US East (N. Virginia), US West (Oregon), EU (Ireland), Asia Pacific (Tokyo, Singapore, Sydney) Regions • MANIFEST option for the COPY & UNLOAD commands • SQL functions: most recent queries • Resource-level IAM, CRC32 • Data Pipeline • Event notifications, encryption, key rotation, audit logging, on-premises or AWS CloudHSM; PCI, SOC 1/2/3 • Cross-Region Snapshot Copy • Audit features, cursor support, 500 concurrent client connections • EIP address for VPC cluster • New system views to tune table design and track WLM query queues • Custom ODBC/JDBC drivers; Query Visualization • Mobile Analytics auto export • KMS for GovCloud Region; HIPAA BAA • Interleaved sort keys • New Dense Storage Nodes (DS2) with better RAM and CPU • New Reserved Storage Nodes: No, Partial & All Upfront options • Cross-region backups for KMS-encrypted clusters • Scalar UDFs in Python • AVRO ingestion; Kinesis Firehose; Database Migration Service (DMS) • Modify cluster dynamically • Tag-based permissions and BZIP2 • System tables for query tuning • Dense Compute Nodes • Gzip & Lzop; JSON, RegEx, cursors • EMR data loading & bootstrap action with COPY command; WLM concurrency limit to 50; support for the ECDH cipher suites for SSL connections; FedRAMP • Cross-region ingestion • Free trials & price reductions in Asia Pacific • CloudWatch alarm for disk usage • AES 128-bit encryption; UTF-16; KMS integration • EU (Frankfurt); GovCloud Regions • S3 server-side encryption support for UNLOAD • Tagging support for cost allocation • WLM queue-hopping for timed-out queries • Append rows & export to BZIP-2 • Lambda for clusters in VPC; Data Schema Conversion; support from ML Console • US West (N. California) Region
  • 22. Benefit #6: Amazon Redshift is powerful • Approximate functions • User defined functions • Machine Learning • Data Science Amazon ML
  • 23. Benefit #7: Amazon Redshift has a large ecosystem: Data Integration; Systems Integrators; Business Intelligence
  • 24. Benefit #8: Service oriented architecture. Amazon Redshift connects to: DynamoDB; EMR; S3; EC2/SSH; RDS/Aurora; Amazon Kinesis; Machine Learning; Data Pipeline; CloudSearch; Mobile Analytics
  • 25. Building Access Log Analysis System Using AWS Redshift. Sunghyun Hwang, CTO of Rainist
  • 37. Banksalad (뱅크샐러드). MAU: 18,925 users (2015-06) to 360,031 users (2017-02), 1910% growth compared with two years earlier. Card issuance: 29 cards (2015-06) to 779 cards (2017-02), 4100% growth compared with two years earlier
  • 38. Purpose of the log analysis system: 1. Collect statistics for financial institutions 2. Use and analyze the data as feedback for designing a better customer experience 3. Detect abnormal requests
  • 39. Performance report for financial institutions: 1. Impressions/clicks/applications for all products, per financial institution 2. Impressions/clicks/applications for popular financial products, per financial institution 3. Number of entries and average entered amount for the merchants users visit most 4. Number of entries and average entered amount for the merchants where users spend the most
  • 40. Challenges in building a log analysis system: 1. Log data keeps growing as the service grows
  • 41. Challenges in building a log analysis system: 1. Log data keeps growing as the service grows: 50,000 records (2016-01-30) to 3,000,000 records (2017-02-14)
  • 42. Challenges in building a log analysis system: 1. Log data keeps growing as the service grows: 50,000 records (2016-01-30) to 3,000,000 records (2017-02-14) 2. The data needed for analysis is fragmented (in an MSA environment) 3. A small team (one server engineer and one data engineer)
  • 43. Banksalad's adoption of Redshift: 1. Log data keeps growing as the service grows • Performance suitable for adopting a data warehouse 2. The data needed for analysis is fragmented (in an MSA environment) • Easy pipeline construction using AWS services 3. A small team (one server engineer and one data engineer) • A familiar and comfortable environment for the team (SQL, Postgres interface)
  • 44. Banksalad log analysis system architecture: Amazon Route53; ELB; Amazon S3; AWS Data Pipeline; Amazon Redshift; Amazon ECS (Card Domain); Amazon RDS (Aurora); Amazon ECS (Data Analysis); …; Amazon SES; Amazon CloudWatch; AWS Data Pipeline
  • 45. Banksalad log analysis system: results
  • 47. Big Data in the real world: when your data sets become so large and diverse that you have to start innovating around how to collect, store, process, analyze and share them
  • 48. Generate; Collect & Store; Analyze; Collaborate & Act. It's never been easier to generate vast amounts of data. Individual AWS customers generate over a PB/day
  • 49. Generate; Collect & Store; Analyze; Collaborate & Act. Amazon S3 lets you collect and store all this data. Individual AWS customers generate over a PB/day. Store exabytes of data in S3
  • 50. Generate; Collect & Store; Analyze; Collaborate & Act. But how do you analyze it? Individual AWS customers generate over a PB/day. Store exabytes of data in S3. Analyze: highly constrained
  • 51. The Dark Data Problem: most generated data is unavailable for analysis. [Chart: Data Volume by Year, 1990 to 2020; series: Generated Data, Available for Analysis.] Sources: Gartner, User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011; IDC, Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
  • 52. The tyranny of “OR”. Amazon Redshift: • Super-fast local disk performance • Sophisticated query optimization • Join-optimized data formats • Query using standard SQL • Optimized for data warehousing. Amazon EMR: • Directly access data in S3 • Scale out to thousands of nodes • Open data formats • Popular big data frameworks • Anything you can dream up and code
  • 53. You don’t need to choose. I shouldn’t have to choose I want “all of the above”
  • 54. Amazon Redshift Spectrum: run SQL queries directly against data in S3 using thousands of nodes. Fast @ exabyte scale; elastic & highly available; on-demand, pay-per-query; high concurrency: multiple clusters access same data; no ETL: query data in-place using open file formats; full Amazon Redshift SQL support
  • 55. Amazon Redshift Spectrum is fast: leverages Amazon Redshift's advanced cost-based optimizer; pushes down projections, filters, aggregations and join reduction; dynamic partition pruning to minimize data processed; automatic parallelization of query execution against S3 data; efficient join processing within the Amazon Redshift cluster
  • 56. Amazon Redshift Spectrum is cost-effective. You pay for your Amazon Redshift cluster plus $5 per TB scanned from S3. Each query can leverage 1000s of Amazon Redshift Spectrum nodes. You can reduce the TB scanned and improve query performance by: • Partitioning data • Using a columnar file format • Compressing data
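As a rough illustration of why reducing the data scanned matters, the sketch below applies the $5 per TB figure to a full scan of 1 TB of text and to the same data after conversion to 130 GB of Parquet (the conversion figure quoted later in the talk). It assumes 1 TB = 1,000 GB, a full scan in both cases, and no minimum charge or rounding; the function name is hypothetical.

```python
PRICE_PER_TB_SCANNED = 5.0  # USD per TB scanned from S3, from the slide


def scan_cost(tb_scanned):
    """Estimated S3 scan charge in USD for one query (cluster cost excluded)."""
    return tb_scanned * PRICE_PER_TB_SCANNED


text_cost = scan_cost(1.0)            # full scan of 1 TB of text
parquet_cost = scan_cost(130 / 1000)  # full scan of 130 GB of Parquet

print(f"text:    ${text_cost:.2f} per query")
print(f"parquet: ${parquet_cost:.2f} per query")
```

Partitioning and column pruning would lower the scanned volume further, since a query then reads only the partitions and columns it needs.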
  • 57. Amazon Redshift Spectrum is secure
      End-to-end data encryption: encrypt S3 data using SSE and AWS KMS; encrypt all Amazon Redshift data using KMS, AWS CloudHSM or your on-premises HSMs; enforce SSL with perfect forward secrecy using ECDHE
      Virtual private cloud: Amazon Redshift leader node in your VPC; compute nodes in private VPC; Spectrum nodes in private VPC, store no state
      Alerts & notifications: communicate event-specific notifications via email, text message, or call with Amazon SNS
      Audit logging: all API calls are logged using AWS CloudTrail; all SQL statements are logged within Amazon Redshift
      Certifications & compliance: PCI/DSS; FedRAMP; SOC 1/2/3; HIPAA/BAA
  • 58. Amazon Redshift Spectrum uses standard SQL. Redshift Spectrum seamlessly integrates with your existing SQL & BI apps. Support for complex joins, nested queries & window functions. Support for data partitioned in S3 by any key: date, time and any other custom keys, e.g., year, month, day, hour
  • 59. Is Amazon Redshift Spectrum useful if I don't have an exabyte? Your data will get bigger: on average, data warehousing volumes grow 10x every 5 years; the average Amazon Redshift customer doubles data each year. Amazon Redshift Spectrum makes data analysis simpler: access your data without ETL pipelines; teams using Amazon EMR, Athena & Redshift can collaborate using the same data lake. Amazon Redshift Spectrum improves availability and concurrency: run multiple Amazon Redshift clusters against common data; isolate jobs with tight SLAs from ad hoc analysis
  • 60. The Emerging Analytics Architecture. Layers: Storage; Serverless Compute; Data Processing. Amazon Athena: Interactive Query. AWS Glue: ETL & Data Catalog. Amazon S3: Exabyte-scale Object Storage. Amazon Kinesis Firehose: Real-Time Data Streaming. Amazon EMR: Managed Hadoop Applications. AWS Lambda: Trigger-based Code Execution. AWS Glue Data Catalog: Hive-compatible Metastore. Amazon Redshift Spectrum: Fast @ Exabyte scale. Amazon Redshift: Petabyte-scale Data Warehousing
  • 61. Over 20 customers helped preview Amazon Redshift Spectrum
  • 62. Defining External Schema and Creating Tables
      Define an external schema in Amazon Redshift using the Amazon Athena data catalog or your own Apache Hive Metastore:
        CREATE EXTERNAL SCHEMA <schema_name>
      Query external tables using <schema_name>.<table_name>
      Register external tables using Athena, your Hive Metastore client, or from Amazon Redshift
      CREATE EXTERNAL TABLE SCHEMA syntax:
        CREATE EXTERNAL TABLE <table_name>
        [PARTITIONED BY <column_name, data_type, …>]
        STORED AS file_format
        LOCATION s3_location
        [TABLE PROPERTIES property_name=property_value, …];
  • 63. Amazon Redshift Spectrum – Current support
      File formats: • Parquet • CSV • Sequence • RCFile • ORC (coming soon) • RegExSerDe (coming soon)
      Compression: • Gzip • Snappy • Lzo (coming soon) • Bz2
      Encryption: • SSE with AES256 • SSE KMS with default key
      Column types: • Numeric: bigint, int, smallint, float, double and decimal • Char/varchar/string • Timestamp • Boolean • DATE type can be used only as a partitioning key
      Table type: • Non-partitioned table (s3://mybucket/orders/..) • Partitioned table (s3://mybucket/orders/date=YYYY-MM-DD/..)
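The partitioned layout above (`s3://mybucket/orders/date=YYYY-MM-DD/..`) can be sketched as follows. The helper names are hypothetical and nothing here talks to S3; the point is only that a date filter maps to a small set of key prefixes, which is what lets partition pruning skip the remaining data.

```python
from datetime import date, timedelta


def partition_prefix(base, day):
    """Build the prefix for one daily partition, e.g. .../date=2016-06-09/."""
    return f"{base}/date={day.isoformat()}/"


def prefixes_for_range(base, start, end):
    """Prefixes for every day in [start, end), the only ones a date filter touches."""
    num_days = (end - start).days
    return [partition_prefix(base, start + timedelta(days=n)) for n in range(num_days)]


# Example: a query filtered to 8, 9 and 10 June 2016.
prefixes = prefixes_for_range(
    "s3://mybucket/orders", date(2016, 6, 8), date(2016, 6, 11)
)
for prefix in prefixes:
    print(prefix)
```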
  • 64. Converting to Parquet and ORC using Amazon EMR
      You can use Hive CREATE TABLE AS SELECT to convert data:
        CREATE TABLE data_converted STORED AS PARQUET AS SELECT col_1, col2, col3 FROM data_source
      Or use Spark: 20 lines of PySpark code, running on Amazon EMR • 1 TB of text data reduced to 130 GB in Parquet format with snappy compression • Total cost of EMR job to do this: $5
      https://github.com/awslabs/aws-big-data-blog/tree/master/aws-blog-spark-parquet-conversion
  • 65. Let's build an analytic query - #1. An author is releasing the 8th book in her popular series. How many should we order for Seattle? What were prior first few day sales? Let's get the prior books she's written. 1 Table; 2 Filters
        SELECT
          P.ASIN,
          P.TITLE
        FROM
          products P
        WHERE
          P.TITLE LIKE '%POTTER%' AND
          P.AUTHOR = 'J. K. Rowling'
  • 66. Let's build an analytic query - #2. An author is releasing the 8th book in her popular series. How many should we order for Seattle? What were prior first few day sales? Let's compute the sales of the prior books she's written in this series and return the top 20 values. 2 Tables (1 S3, 1 local); 2 Filters; 1 Join; 2 Group By columns; 1 Order By; 1 Limit; 1 Aggregation
        SELECT
          P.ASIN,
          P.TITLE,
          SUM(D.QUANTITY * D.OUR_PRICE) AS SALES_sum
        FROM
          s3.d_customer_order_item_details D,
          products P
        WHERE
          D.ASIN = P.ASIN AND
          P.TITLE LIKE '%Potter%' AND
          P.AUTHOR = 'J. K. Rowling'
        GROUP BY P.ASIN, P.TITLE
        ORDER BY SALES_sum DESC
        LIMIT 20;
  • 67. Let's build an analytic query - #3. An author is releasing the 8th book in her popular series. How many should we order for Seattle? What were prior first few day sales? Let's compute the sales of the prior books she's written in this series and return the top 20 values, just for the first three days of sales of first editions. 3 Tables (1 S3, 2 local); 5 Filters; 2 Joins; 3 Group By columns; 1 Order By; 1 Limit; 1 Aggregation; 1 Function; 2 Casts
        SELECT
          P.ASIN,
          P.TITLE,
          P.RELEASE_DATE,
          SUM(D.QUANTITY * D.OUR_PRICE) AS SALES_sum
        FROM
          s3.d_customer_order_item_details D,
          asin_attributes A,
          products P
        WHERE
          D.ASIN = P.ASIN AND
          P.ASIN = A.ASIN AND
          A.EDITION LIKE '%FIRST%' AND
          P.TITLE LIKE '%Potter%' AND
          P.AUTHOR = 'J. K. Rowling' AND
          D.ORDER_DAY :: DATE >= P.RELEASE_DATE AND
          D.ORDER_DAY :: DATE < dateadd(day, 3, P.RELEASE_DATE)
        GROUP BY P.ASIN, P.TITLE, P.RELEASE_DATE
        ORDER BY SALES_sum DESC
        LIMIT 20;
  • 68. Let's build an analytic query - #4. An author is releasing the 8th book in her popular series. How many should we order for Seattle? What were prior first few day sales? Let's compute the sales of the prior books she's written in this series and return the top 20 values, just for the first three days of sales of first editions in the city of Seattle, WA, USA. 4 Tables (1 S3, 3 local); 8 Filters; 3 Joins; 4 Group By columns; 1 Order By; 1 Limit; 1 Aggregation; 1 Function; 2 Casts
        SELECT
          P.ASIN,
          P.TITLE,
          R.POSTAL_CODE,
          P.RELEASE_DATE,
          SUM(D.QUANTITY * D.OUR_PRICE) AS SALES_sum
        FROM
          s3.d_customer_order_item_details D,
          asin_attributes A,
          products P,
          regions R
        WHERE
          D.ASIN = P.ASIN AND
          P.ASIN = A.ASIN AND
          D.REGION_ID = R.REGION_ID AND
          A.EDITION LIKE '%FIRST%' AND
          P.TITLE LIKE '%Potter%' AND
          P.AUTHOR = 'J. K. Rowling' AND
          R.COUNTRY_CODE = 'US' AND
          R.CITY = 'Seattle' AND
          R.STATE = 'WA' AND
          D.ORDER_DAY :: DATE >= P.RELEASE_DATE AND
          D.ORDER_DAY :: DATE < dateadd(day, 3, P.RELEASE_DATE)
        GROUP BY P.ASIN, P.TITLE, R.POSTAL_CODE, P.RELEASE_DATE
        ORDER BY SALES_sum DESC
        LIMIT 20;
  • 69. Now let's run that query over an exabyte of data in S3. Roughly 140 TB of customer item order detail records for each day over past 20 years. 190 million files across 15,000 partitions in S3. One partition per day for USA and rest of world. Need a billion-fold reduction in data processed. Running this query using a 1000 node Hive cluster would take over 5 years.*
      Compression ......................... 5X
      Columnar file format ................ 10X
      Scanning with 2500 nodes ............ 2500X
      Static partition elimination ........ 2X
      Dynamic partition elimination ....... 350X
      Redshift's query optimizer .......... 40X
      Total reduction ..................... 3.5B X
      * Estimated using 20 node Hive cluster & 1.4TB, assume linear
      * Query used a 20 node DC1.8XLarge Amazon Redshift cluster
      * Not actual sales data - generated for this demo based on data format used by Amazon Retail.
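The headline figures on this slide follow from simple arithmetic, sketched below. The data-volume and partition estimates assume 365 days per year and decimal units (1 EB = 1,000,000 TB); the variable names are made up for illustration.

```python
import math

# Reduction factors as listed on the slide.
reduction_factors = {
    "compression": 5,
    "columnar file format": 10,
    "scanning with 2500 nodes": 2500,
    "static partition elimination": 2,
    "dynamic partition elimination": 350,
    "Redshift's query optimizer": 40,
}
total_reduction = math.prod(reduction_factors.values())

# Data volume: roughly 140 TB per day over 20 years.
total_tb = 140 * 365 * 20

# Partitions: one per day for each of two regions (USA, rest of world).
partitions = 2 * 365 * 20

print(f"total reduction: {total_reduction:,}x")
print(f"total data:      {total_tb:,} TB (about {total_tb / 1_000_000:.2f} EB)")
print(f"partitions:      {partitions:,}")
```

The product of the six factors is 3.5 billion, the data volume comes to roughly one exabyte, and the partition count is close to the 15,000 quoted.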
  • 70. Recap: Redshift and Redshift Spectrum are… a relational data warehouse; can be streamed in; can be processed in real time; can be expanded to exabyte scale. You can mix and match: on premises and cloud; custom development and managed services; infrastructure with managed scaling, security. Redshift can support
  • 72. One More Thing Beyond Amazon Redshift