Amazon Redshift
for Data Analysts
D. Can Abacıgil, CTO, DataRow
Eren Baydemir, CEO, DataRow
www.datarow.com
Are you an Amazon Redshift user?
Have you used TeamSQL before?
Do you know what DataRow is?
Today’s Overview
Amazon Redshift System Overview
Cluster Management
Importing & Exporting Data
Break
Data Modeling and Table Design
Maintenance
Amazon Redshift System Overview
Amazon Redshift Architecture
Massively parallel, shared-nothing columnar architecture.
Leader node
- SQL endpoint
- Stores metadata
- Coordinates parallel SQL processing
Compute nodes
- Local, columnar storage
- Executes queries in parallel
- Load, unload, backup, restore
Amazon Redshift Spectrum nodes
- Execute queries directly against Amazon Simple Storage Service (Amazon S3)
Source: AWS Documentation
[Diagram: SQL clients / tools (DataRow), JDBC / ODBC, leader node, compute nodes, Amazon Simple Storage Service (S3)]
Amazon Redshift Performance
Massively Parallel Processing
Fast execution of the most complex queries operating on large amounts of data.
Columnar Data Storage
Drastically reduces the overall disk I/O requirements.
Data Compression
Reduces storage requirements, thereby reducing disk I/O, which improves query performance.
Query Optimizer
Implements significant enhancements and extensions for processing complex analytic queries.
Result Caching
Caches the results of certain types of queries in memory on the leader node.
Compiled Code
The leader node distributes fully optimized compiled code across all of the nodes of a cluster.
Cluster Management
Launch an Amazon Redshift Cluster
1. Decide on what type of node you’ll use
2. Figure out how many nodes to use
3. Configure the additional setup options
4. Configure the networking options
5. Launch the cluster
User Management
● Cluster Management Permissions
○ Authentication
■ AWS account root user
■ IAM user
■ IAM role
○ Access Control
Governs who can create and manage Amazon Redshift clusters, IP addresses, security groups, snapshots, and more.
● Access to Database Permissions
Control over a database’s objects, such as tables and views. You must be a superuser to create an Amazon Redshift user.
Importing & Exporting Data
Load Data Into Amazon Redshift
● Access Rights and Credentials
To let an Amazon Redshift cluster access and manipulate other AWS resources, you need to authenticate it.
Two options are available: role-based access and key-based access.
● Importing Data
The COPY command loads data into a table from data files or from an Amazon DynamoDB table.
● Sources to Load your Data
The COPY command can load data from several different sources:
○ Amazon S3
○ Amazon EMR Cluster
○ Remote Hosts
○ DynamoDB
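As a rough illustration of role-based loading from Amazon S3, the sketch below assembles a COPY statement as text. The table name, bucket path, and role ARN are hypothetical placeholders, and the helper `build_copy_statement` is not part of any library. For key-based access, the `IAM_ROLE` line would be replaced by `ACCESS_KEY_ID` and `SECRET_ACCESS_KEY` parameters.

```python
def build_copy_statement(table, s3_path, iam_role_arn, file_format="CSV"):
    """Return a Redshift COPY statement that loads `table` from Amazon S3.

    Uses role-based access. All argument values in the example below are
    hypothetical placeholders, not real resources.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS {file_format};"
    )


statement = build_copy_statement(
    table="sales",                                   # hypothetical table
    s3_path="s3://example-bucket/sales/",            # hypothetical bucket
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(statement)
```

The printed statement can then be run from any SQL client connected to the cluster.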
Overview of System Tables and Views
An Amazon Redshift cluster has many system tables and views you can query to
understand how your system behaves.
● STL_LOAD_ERRORS
Displays the records of all Amazon Redshift load errors.
● STL_FILE_SCAN
Returns the files that Amazon Redshift read while loading data via the COPY command.
● STL_S3CLIENT_ERROR
Records errors encountered by a slice while loading a file from Amazon S3.
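To make the first of these tables concrete, here is a minimal sketch that builds a query for the most recent load errors. The helper name `recent_load_errors_query` and the default limit are assumptions chosen for illustration; the selected columns are a subset of those in STL_LOAD_ERRORS.

```python
def recent_load_errors_query(limit=10):
    """Return a query listing the most recent rows in STL_LOAD_ERRORS."""
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT starttime, filename, line_number, colname, err_reason\n"
        "FROM stl_load_errors\n"
        "ORDER BY starttime DESC\n"
        f"LIMIT {limit};"
    )


print(recent_load_errors_query(5))
```

Running the resulting query right after a failed COPY usually points to the file, line, and column that caused the failure.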
Export Data from Amazon Redshift
● What is the UNLOAD command?
UNLOAD writes the result of a query to one or more files on Amazon S3.
● UNLOAD command syntax
To try the syntax, create a sample table and insert a few records into it.
● DataRow UNLOAD Command Wizard
Perform your UNLOAD command in seconds, and easily upload data to a table.
● Reading Data directly from Amazon Redshift
Access your data directly on Amazon Redshift.
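The UNLOAD syntax can be sketched as follows. The query, S3 prefix, and role ARN are hypothetical, and `build_unload_statement` is an illustrative helper rather than a DataRow or AWS API. Because UNLOAD takes the query as a quoted string, single quotes inside the query are doubled.

```python
def build_unload_statement(query, s3_prefix, iam_role_arn):
    """Return a Redshift UNLOAD statement that writes `query` results to S3."""
    # UNLOAD wraps the query in single quotes, so quotes inside it are doubled.
    escaped_query = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped_query}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )


unload = build_unload_statement(
    query="SELECT * FROM venue WHERE venuestate = 'NV'",   # hypothetical query
    s3_prefix="s3://example-bucket/unload/venue_",          # hypothetical prefix
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(unload)
```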
It’s Pizza Time!
Data Modeling and Table Design
Table Distribution Styles
● Understanding the Redshift Distribution Key
Redshift Distribution Keys (DIST Keys) determine where data is stored in Redshift.
● Amazon Redshift Distribution Styles
○ All
○ Even
○ Key
● Choosing the Right Distribution Style
As the DISTKEY, choose a column used in your queries that leads to the least skew. A good choice is a
column with many distinct values, such as a timestamp.
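A small sketch of how the three styles appear in a CREATE TABLE statement. The `events` table, its columns, and the helper `distribution_clause` are hypothetical, used only to show where the clause goes.

```python
DIST_STYLES = ("ALL", "EVEN", "KEY")


def distribution_clause(style, dist_key=None):
    """Return the distribution clause for a CREATE TABLE statement."""
    style = style.upper()
    if style not in DIST_STYLES:
        raise ValueError(f"unknown distribution style: {style}")
    if style == "KEY":
        if not dist_key:
            raise ValueError("DISTSTYLE KEY needs a DISTKEY column")
        return f"DISTSTYLE KEY DISTKEY ({dist_key})"
    if dist_key:
        raise ValueError("a DISTKEY column only applies to DISTSTYLE KEY")
    return f"DISTSTYLE {style}"


# Hypothetical table distributed on its timestamp column.
ddl = (
    "CREATE TABLE events (event_id BIGINT, event_time TIMESTAMP) "
    + distribution_clause("key", "event_time")
    + ";"
)
print(ddl)
```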
Understanding and Selecting Sort Keys
● Introduction to the Redshift Sort Key
The Redshift Sort Key determines the order in which the rows of a table are stored. Amazon Redshift supports
two kinds of Sort Keys:
○ Compound Sort Keys
○ Interleaved Sort Keys
● Choosing Sort Keys
Selecting the right kind requires knowing the queries that will run against the table.
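The two kinds differ only by a keyword in the table definition, as the sketch below shows. The column names and the helper `sort_key_clause` are hypothetical. A compound key favors queries that filter on its leading columns, while an interleaved key gives each column equal weight.

```python
SORT_KEY_KINDS = ("COMPOUND", "INTERLEAVED")


def sort_key_clause(columns, kind="COMPOUND"):
    """Return the sort key clause for a CREATE TABLE statement."""
    kind = kind.upper()
    if kind not in SORT_KEY_KINDS:
        raise ValueError(f"unknown sort key kind: {kind}")
    if not columns:
        raise ValueError("at least one sort key column is required")
    return f"{kind} SORTKEY ({', '.join(columns)})"


# Hypothetical columns, shown with both kinds of sort key.
compound = sort_key_clause(["event_date", "customer_id"])
interleaved = sort_key_clause(["event_date", "customer_id"], kind="interleaved")
print(compound)
print(interleaved)
```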
Column Compression Settings
● How Column Compression Works
It is possible to define a Column Compression Encoding manually or ask Amazon Redshift to select an
Encoding automatically during the execution of a COPY command.
● Compression Encoding
A compression encoding specifies the type of compression that is applied to a column of data values as
rows are added to a table.
● Analyze Compression
Performs a compression analysis on your data and returns suggestions for the compression encoding to
be used.
Choosing a Column Compression Type
The following statement creates a CUSTOMER table that has columns with various data types. This CREATE
TABLE statement shows one of many possible combinations of compression encodings for these columns.
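The statement itself is not preserved in this transcript. As a stand-in, the sketch below assembles a comparable CREATE TABLE statement; the columns and encodings are assumptions chosen for illustration, not the slide's original values. It also builds an ANALYZE COMPRESSION statement for the same table.

```python
# (column name, data type, compression encoding); all values are illustrative.
CUSTOMER_COLUMNS = [
    ("custkey", "INT", "DELTA"),
    ("custname", "VARCHAR(30)", "RAW"),
    ("city", "VARCHAR(30)", "TEXT255"),
    ("state", "CHAR(2)", "RAW"),
    ("zipcode", "CHAR(5)", "BYTEDICT"),
    ("start_date", "DATE", "DELTA32K"),
]


def build_create_table(table, columns):
    """Return a CREATE TABLE statement with an ENCODE clause per column."""
    lines = [f"    {name} {dtype} ENCODE {encoding}"
             for name, dtype, encoding in columns]
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);"


ddl = build_create_table("customer", CUSTOMER_COLUMNS)
# Asks Redshift to suggest encodings based on the data already in the table.
analyze_compression = "ANALYZE COMPRESSION customer;"
print(ddl)
print(analyze_compression)
```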
MAINTENANCE
Why Vacuum Amazon Redshift?
● Why Vacuum?
Amazon Redshift reclaims deleted space and sorts the new data when a VACUUM
command is issued.
● When to run Vacuum?
How often to run VACUUM depends on how much space needs to be reclaimed
and how much data is unsorted.
● Vacuum types
You can issue VACUUM either on a single table or on the complete database, by running a
query or by using DataRow.
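Run as a query, VACUUM might be assembled as sketched below. The table name and the helper `build_vacuum_statement` are hypothetical; the modes listed are a subset of what the command accepts.

```python
VACUUM_MODES = ("FULL", "SORT ONLY", "DELETE ONLY", "REINDEX")


def build_vacuum_statement(table=None, mode="FULL"):
    """Return a VACUUM statement for one table, or for the whole database
    when no table is given."""
    mode = mode.upper()
    if mode not in VACUUM_MODES:
        raise ValueError(f"unknown VACUUM mode: {mode}")
    if mode == "REINDEX" and table is None:
        raise ValueError("VACUUM REINDEX needs a table name")
    parts = ["VACUUM", mode]
    if table:
        parts.append(table)
    return " ".join(parts) + ";"


print(build_vacuum_statement())                       # whole database
print(build_vacuum_statement("sales", "sort only"))   # hypothetical table
```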
Why Redshift Analyze?
● Why Analyze?
The ANALYZE operation updates the statistical metadata that the query planner
uses to choose optimal plans.
● When to run Analyze?
The COPY command automatically performs an ANALYZE after it loads data into an empty table.
● How to run Analyze?
You can run the ANALYZE command as a query. Alternatively, and more
easily, you can use DataRow to perform an ANALYZE command.
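When run as a query, ANALYZE can target the whole database, one table, or selected columns of a table. The sketch below shows the three forms; the table and column names, and the helper `build_analyze_statement`, are hypothetical.

```python
def build_analyze_statement(table=None, columns=None):
    """Return an ANALYZE statement for the database, a table, or some columns."""
    if columns and not table:
        raise ValueError("columns can only be given together with a table")
    statement = "ANALYZE"
    if table:
        statement += f" {table}"
        if columns:
            statement += f" ({', '.join(columns)})"
    return statement + ";"


print(build_analyze_statement())                                  # whole database
print(build_analyze_statement("sales"))                           # one table
print(build_analyze_statement("sales", ["sale_date", "amount"]))  # two columns
```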
Monitoring Query Performance
Amazon Redshift provides performance metrics and data so that you can track the
health and performance of your clusters and databases.
You can get the following information about each query:
1. Query ID
2. Run time
3. Start time
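One way to retrieve these three values yourself is to query the STL_QUERY system table. This is a sketch under that assumption; the slides do not say which source is used. The helper name `recent_queries_query` and the column aliases are illustrative.

```python
def recent_queries_query(limit=20):
    """Return a query listing ID, start time, and run time of recent queries."""
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT query AS query_id,\n"
        "       starttime AS start_time,\n"
        "       DATEDIFF(milliseconds, starttime, endtime) AS run_time_ms\n"
        "FROM stl_query\n"
        "ORDER BY starttime DESC\n"
        f"LIMIT {limit};"
    )


print(recent_queries_query(10))
```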
LET’S KEEP IN TOUCH!
https://datarow.com
support@datarow.com
@getdatarow
