SlideShare ist ein Scribd-Unternehmen logo
1 von 24
How GE Deployed a Data
Warehouse on AWS using
Matillion ETL
Today’s Presenters
Brandon Chavis, Solutions Architect, Amazon Web Services
Matthew Scullion, Matillion
Ryan Oattes, Enterprise Architect, GE Water
Free Trial Available in AWS
Marketplace
Today’s Agenda
1. An overview of AWS and AWS Marketplace, with an emphasis on
Amazon Redshift
2. The GE Water success story with AWS and Matillion
3. Matillion ETL for Redshift – product demo
4. Q&A/Discussion
Relational data warehouse
Massively parallel; petabyte scale
Fully managed
HDD and SSD platforms
$1,000/TB/year; starts at $0.25/hour
Amazon
Redshift
a lot faster
a lot simpler
a lot cheaper
The Amazon Redshift view of data warehousing
 10x cheaper
 Easy to provision
 Higher DBA productivity
 10x faster
 No programming
 Easily leverage BI tools,
Hadoop, machine learning,
streaming
 Analysis inline with process
flows
 Pay as you go, grow as you
need
 Managed availability and
disaster recovery
Enterprise Big data SaaS
Selected Amazon Redshift customers
Amazon Redshift architecture
 Leader node
Simple SQL endpoint
Stores metadata
Optimizes query plan
Coordinates query execution
 Compute nodes
Local columnar storage
Parallel/distributed execution of all queries, loads,
backups, restores, resizes
 Start at just $0.25/hour, grow to 2 PB
(compressed)
DC1: SSD; scale from 160 GB to 326 TB
DS2: HDD; scale from 2 TB to 2 PB
Ingestion/backup
Backup
Restore
JDBC/ODBC
10 GigE
(HPC)
Benefit #1: Amazon Redshift is fast
 Parallel and distributed
Query
Load
Export
Backup
Restore
Resize
Benefit #1: Amazon Redshift is fast
 Dramatically less I/O
Column storage
Data compression
Zone maps
Direct-attached storage
Large data block sizes
analyze compression listing;
Table | Column | Encoding
---------+----------------+----------
listing | listid | delta
listing | sellerid | delta32k
listing | eventid | delta32k
listing | dateid | bytedict
listing | numtickets | bytedict
listing | priceperticket | delta32k
listing | totalprice | mostly32
listing | listtime | raw
10 | 13 | 14 | 26 |…
… | 100 | 245 | 324
375 | 393 | 417…
… 512 | 549 | 623
637 | 712 | 809 …
… | 834 | 921 | 959
10
324
375
623
637
959
Benefit #1: Amazon Redshift is fast
Hardware optimized for I/O intensive workloads, 4 GB/sec/node
Enhanced networking, over 1 million packets/sec/node
Choice of storage type, instance size
Regular cadence of autopatched improvements
Benefit #2: Amazon Redshift is inexpensive
Ds2 (HDD)
Price per hour for
DW1.XL single node
Effective annual
price per TB compressed
On demand $ 0.850 $ 3,725
1-year reservation $ 0.500 $ 2,190
3-year reservation $ 0.228 $ 999
Dc1 (SSD)
Price per hour for
DW2.L single node
Effective annual
price per TB compressed
On demand $ 0.250 $ 13,690
1-year reservation $ 0.161 $ 8,795
3-year reservation $ 0.100 $ 5,500
Pricing is simple
Number of nodes x price/hour
No charge for leader node
No upfront costs
Pay as you go
Benefit #3: Amazon Redshift is fully managed
Continuous/incremental backups
Multiple copies within cluster
Continuous and incremental backups
to Amazon S3
Continuous and incremental backups
across regions
Streaming restore
Amazon S3
Amazon S3
Region 1
Region 2
Benefit #3: Amazon Redshift is fully managed
Amazon S3
Amazon S3
Region 1
Region 2
Fault tolerance
Disk failures
Node failures
Network failures
Availability Zone/region level disasters
Benefit #4: Security is built in
• Load encrypted from Amazon S3
• SSL to secure data in transit
• ECDHE perfect forward security
• Amazon VPC for network isolation
• Encryption to secure data at rest
– All blocks on disks and in Amazon S3 encrypted
– Block key, cluster key, master key (AES-256)
– On-premises HSM and AWS CloudHSM support
• Audit logging and AWS CloudTrail integration
• SOC 1/2/3, PCI-DSS, FedRAMP, BAA
10 GigE
(HPC)
Ingestion
Backup
Restore
Customer VPC
Internal
VPC
JDBC/ODBC
Benefit #5: We innovate quickly
 Well over 125 new features added since launch
 Release every two weeks
 Automatic patching
Service Launch (2/14)
PDX (4/2)
Temp Credentials (4/11)
DUB (4/25)
SOC1/2/3 (5/8)
Unload Encrypted Files
NRT (6/5)
JDBC Fetch Size (6/27)
Unload logs (7/5)
SHA1 Builtin (7/15)
4 byte UTF-8 (7/18)
Sharing snapshots (7/18)
Statement Timeout (7/22)
Timezone, Epoch, Autoformat (7/25)
WLM Timeout/Wildcards (8/1)
CRC32 Builtin, CSV, Restore Progress
(8/9)
Resource Level IAM (8/9)
PCI (8/22)
UTF-8 Substitution (8/29)
JSON, Regex, Cursors (9/10)
Split_part, Audit tables (10/3)
SIN/SYD (10/8)
HSM Support (11/11)
Kinesis EMR/HDFS/SSH copy,
Distributed Tables, Audit
Logging/CloudTrail, Concurrency, Resize
Perf., Approximate Count Distinct, SNS
Alerts, Cross Region Backup (11/13)
Distributed Tables, Single Node Cursor
Support, Maximum Connections to 500
(12/13)
EIP Support for VPC Clusters (12/28)
New query monitoring system tables and
diststyle all (1/13)
Redshift on DW2 (SSD) Nodes (1/23)
Compression for COPY from SSH, Fetch
size support for single node clusters, new
system tables with commit stats,
row_number(), strotol() and query
termination (2/13)
Resize progress indicator & Cluster
Version (3/21)
Regex_Substr, COPY from JSON (3/25)
50 slots, COPY from EMR, ECDHE
ciphers (4/22)
3 new regex features, Unload to single
file, FedRAMP(5/6)
Rename Cluster (6/2)
Copy from multiple regions,
percentile_cont, percentile_disc (6/30)
Free Trial (7/1)
pg_last_unload_count (9/15)
AES-128 S3 encryption (9/29)
UTF-16 support (9/29)
Matillion ETL for Redshift and AWS
With Matillion ETL for Redshift on AWS
Marketplace, customers enjoy a range of benefits
building data warehouses on Amazon Redshift
quickly and simply:
 ELT architecture: Using the power of the
Redshift cluster for transformation. Fast and
always in touch with the data
 Intuitive and productive: Cloud 1st and
designed for AWS and Amazon Redshift. Makes
leveraging Amazon Redshift simple and
productive
 Launched via AWS Marketplace: simple to
install and scale, secure, integrated utility billing
• General Electric has been in business for over 140 years,
investing $5.4B annually in R&D (6% of Revenue)
• Augmenting our Operational Technology depth with Digital.
• Focus on Industrial Internet of Things, and creating insights
based on business and machine data.
• Knew we needed best in class partners to let us focus on
what we do best; GE migrating 9000 workloads into AWS.
Our Challenge: Raise the bar on Data Warehouse scalability,
integration, stability and development velocity.
• Needed scalability for machine and business data, as GE
increasingly digitizes.
• Self-serve BI strategy meant we had to maintain our current
compute capabilities.
• Increasingly critical dependencies require rock-solid platform.
• Desired more intuitive and accessible analytics solution.
Our solution: AWS and Matillion ETL for Redshift
SAP
10 x DC1 Nodes
Amazon Redshift Cluster
Staging DWH
Matillion ETL
M3.Large
ELT
Tableau
CDC Data Replication (HVR)
Key Numbers:
45%operating cost
reduction
6months from
ideation to go-live
<$100PoC pilot cost
Outcomes for GE:
• Preserved performance and added stability.
• Simplified operations with managed Redshift solution.
• OPEX savings. No CAPEX required.
• Fast development from concept to reality.
• Highly resilient, seamless scaling
Live Product Demonstration
 Launch a free 14-day trial of Matillion ETL for Amazon Redshift on the AWS Marketplace
– https://aws.amazon.com/marketplace
 Get tutorials and videos at the Matillion YouTube channel:
– https://www.youtube.com/user/MatillionVideos
 Try Amazon Redshift for free. Get 750 free DC1.Large hours per month for 2 months.
– https://aws.amazon.com/redshift
 Support and documentation
– https://redshiftsupport.matillion.com
Next steps and further information
Q&A
Free Trial Available in AWS
Marketplace

Weitere ähnliche Inhalte

Andere mochten auch

REA Sydney Customer Appreciation Day
REA Sydney Customer Appreciation DayREA Sydney Customer Appreciation Day
REA Sydney Customer Appreciation DayAmazon Web Services
 
AWS Partner Presentation - Sonian
AWS Partner Presentation - SonianAWS Partner Presentation - Sonian
AWS Partner Presentation - SonianAmazon Web Services
 
Getting Started with Amazon DynamoDB
Getting Started with Amazon DynamoDBGetting Started with Amazon DynamoDB
Getting Started with Amazon DynamoDBAmazon Web Services
 
AWS Customer Presentation - AdaptiveBlue
AWS Customer Presentation - AdaptiveBlueAWS Customer Presentation - AdaptiveBlue
AWS Customer Presentation - AdaptiveBlueAmazon Web Services
 
Accelerating DevOps Pipelines with AWS
Accelerating DevOps Pipelines with AWSAccelerating DevOps Pipelines with AWS
Accelerating DevOps Pipelines with AWSAmazon Web Services
 
STP205 Making it Big Without Breaking the Bank - AWS re: Invent 2012
STP205 Making it Big Without Breaking the Bank - AWS re: Invent 2012STP205 Making it Big Without Breaking the Bank - AWS re: Invent 2012
STP205 Making it Big Without Breaking the Bank - AWS re: Invent 2012Amazon Web Services
 
Track 3 - Atelier 3 - Assurez l’agilité et la profitabilité de votre business...
Track 3 - Atelier 3 - Assurez l’agilité et la profitabilité de votre business...Track 3 - Atelier 3 - Assurez l’agilité et la profitabilité de votre business...
Track 3 - Atelier 3 - Assurez l’agilité et la profitabilité de votre business...Amazon Web Services
 
Your First Week on Amazon Web Services
Your First Week on Amazon Web ServicesYour First Week on Amazon Web Services
Your First Week on Amazon Web ServicesAmazon Web Services
 
Best Practices in Architecting for the Cloud Webinar - Jinesh Varia
Best Practices in Architecting for the Cloud Webinar - Jinesh VariaBest Practices in Architecting for the Cloud Webinar - Jinesh Varia
Best Practices in Architecting for the Cloud Webinar - Jinesh VariaAmazon Web Services
 
Design Patterns for Developers - Technical 201
Design Patterns for Developers - Technical 201Design Patterns for Developers - Technical 201
Design Patterns for Developers - Technical 201Amazon Web Services
 
Unlocking the Value of your Data Featuring AWS Enterprise Use Cases
Unlocking the Value of your Data Featuring AWS Enterprise Use CasesUnlocking the Value of your Data Featuring AWS Enterprise Use Cases
Unlocking the Value of your Data Featuring AWS Enterprise Use CasesAmazon Web Services
 
AWS Summit Auckland 2014 | Understanding AWS Security
AWS Summit Auckland 2014 | Understanding AWS Security AWS Summit Auckland 2014 | Understanding AWS Security
AWS Summit Auckland 2014 | Understanding AWS Security Amazon Web Services
 
MED301 Is My CDN Performing? - AWS re: Invent 2012
MED301 Is My CDN Performing? - AWS re: Invent 2012MED301 Is My CDN Performing? - AWS re: Invent 2012
MED301 Is My CDN Performing? - AWS re: Invent 2012Amazon Web Services
 
Amazon WorkSpaces - Fully Managed Desktops in the Cloud
Amazon WorkSpaces - Fully Managed Desktops in the Cloud Amazon WorkSpaces - Fully Managed Desktops in the Cloud
Amazon WorkSpaces - Fully Managed Desktops in the Cloud Amazon Web Services
 

Andere mochten auch (20)

Scalability and Availability
Scalability and AvailabilityScalability and Availability
Scalability and Availability
 
Analytics in the Cloud
Analytics in the CloudAnalytics in the Cloud
Analytics in the Cloud
 
REA Sydney Customer Appreciation Day
REA Sydney Customer Appreciation DayREA Sydney Customer Appreciation Day
REA Sydney Customer Appreciation Day
 
6 rules for innovation
6 rules for innovation6 rules for innovation
6 rules for innovation
 
AWS Blackbelt NINJA Dojo
AWS Blackbelt NINJA DojoAWS Blackbelt NINJA Dojo
AWS Blackbelt NINJA Dojo
 
AWS Partner Presentation - Sonian
AWS Partner Presentation - SonianAWS Partner Presentation - Sonian
AWS Partner Presentation - Sonian
 
Getting Started with Amazon DynamoDB
Getting Started with Amazon DynamoDBGetting Started with Amazon DynamoDB
Getting Started with Amazon DynamoDB
 
AWS Customer Presentation - AdaptiveBlue
AWS Customer Presentation - AdaptiveBlueAWS Customer Presentation - AdaptiveBlue
AWS Customer Presentation - AdaptiveBlue
 
Accelerating DevOps Pipelines with AWS
Accelerating DevOps Pipelines with AWSAccelerating DevOps Pipelines with AWS
Accelerating DevOps Pipelines with AWS
 
STP205 Making it Big Without Breaking the Bank - AWS re: Invent 2012
STP205 Making it Big Without Breaking the Bank - AWS re: Invent 2012STP205 Making it Big Without Breaking the Bank - AWS re: Invent 2012
STP205 Making it Big Without Breaking the Bank - AWS re: Invent 2012
 
Track 3 - Atelier 3 - Assurez l’agilité et la profitabilité de votre business...
Track 3 - Atelier 3 - Assurez l’agilité et la profitabilité de votre business...Track 3 - Atelier 3 - Assurez l’agilité et la profitabilité de votre business...
Track 3 - Atelier 3 - Assurez l’agilité et la profitabilité de votre business...
 
Your First Week on Amazon Web Services
Your First Week on Amazon Web ServicesYour First Week on Amazon Web Services
Your First Week on Amazon Web Services
 
Best Practices in Architecting for the Cloud Webinar - Jinesh Varia
Best Practices in Architecting for the Cloud Webinar - Jinesh VariaBest Practices in Architecting for the Cloud Webinar - Jinesh Varia
Best Practices in Architecting for the Cloud Webinar - Jinesh Varia
 
Design Patterns for Developers - Technical 201
Design Patterns for Developers - Technical 201Design Patterns for Developers - Technical 201
Design Patterns for Developers - Technical 201
 
Unlocking the Value of your Data Featuring AWS Enterprise Use Cases
Unlocking the Value of your Data Featuring AWS Enterprise Use CasesUnlocking the Value of your Data Featuring AWS Enterprise Use Cases
Unlocking the Value of your Data Featuring AWS Enterprise Use Cases
 
AWS Summit Auckland 2014 | Understanding AWS Security
AWS Summit Auckland 2014 | Understanding AWS Security AWS Summit Auckland 2014 | Understanding AWS Security
AWS Summit Auckland 2014 | Understanding AWS Security
 
Building mobile apps on aws
Building mobile apps on awsBuilding mobile apps on aws
Building mobile apps on aws
 
MED301 Is My CDN Performing? - AWS re: Invent 2012
MED301 Is My CDN Performing? - AWS re: Invent 2012MED301 Is My CDN Performing? - AWS re: Invent 2012
MED301 Is My CDN Performing? - AWS re: Invent 2012
 
Amazon WorkSpaces - Fully Managed Desktops in the Cloud
Amazon WorkSpaces - Fully Managed Desktops in the Cloud Amazon WorkSpaces - Fully Managed Desktops in the Cloud
Amazon WorkSpaces - Fully Managed Desktops in the Cloud
 
Getting to MVP
Getting to MVPGetting to MVP
Getting to MVP
 

Mehr von Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mehr von Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Kürzlich hochgeladen

Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDELiveplex
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfJamie (Taka) Wang
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 

Kürzlich hochgeladen (20)

Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
 
20230104 - machine vision
20230104 - machine vision20230104 - machine vision
20230104 - machine vision
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 

How GE Deployed a Data Warehouse on AWS using Matillion ETL

  • 1. How GE Deployed a Data Warehouse on AWS using Matillion ETL
  • 2. Today’s Presenters Brandon Chavis, Solutions Architect, Amazon Web Services Matthew Scullion, Matillion Ryan Oattes, Enterprise Architect, GE Water Free Trial Available in AWS Marketplace
  • 3. Today’s Agenda 1. An overview of AWS and AWS Marketplace, with an emphasis on Amazon Redshift 2. The GE Water success story with AWS and Matillion 3. Matillion ETL for Redshift – product demo 4. Q&A/Discussion
  • 4. Relational data warehouse Massively parallel; petabyte scale Fully managed HDD and SSD platforms $1,000/TB/year; starts at $0.25/hour Amazon Redshift a lot faster a lot simpler a lot cheaper
  • 5. The Amazon Redshift view of data warehousing  10x cheaper  Easy to provision  Higher DBA productivity  10x faster  No programming  Easily leverage BI tools, Hadoop, machine learning, streaming  Analysis inline with process flows  Pay as you go, grow as you need  Managed availability and disaster recovery Enterprise Big data SaaS
  • 7. Amazon Redshift architecture  Leader node Simple SQL endpoint Stores metadata Optimizes query plan Coordinates query execution  Compute nodes Local columnar storage Parallel/distributed execution of all queries, loads, backups, restores, resizes  Start at just $0.25/hour, grow to 2 PB (compressed) DC1: SSD; scale from 160 GB to 326 TB DS2: HDD; scale from 2 TB to 2 PB Ingestion/backup Backup Restore JDBC/ODBC 10 GigE (HPC)
  • 8. Benefit #1: Amazon Redshift is fast  Parallel and distributed Query Load Export Backup Restore Resize
  • 9. Benefit #1: Amazon Redshift is fast  Dramatically less I/O Column storage Data compression Zone maps Direct-attached storage Large data block sizes analyze compression listing; Table | Column | Encoding ---------+----------------+---------- listing | listid | delta listing | sellerid | delta32k listing | eventid | delta32k listing | dateid | bytedict listing | numtickets | bytedict listing | priceperticket | delta32k listing | totalprice | mostly32 listing | listtime | raw 10 | 13 | 14 | 26 |… … | 100 | 245 | 324 375 | 393 | 417… … 512 | 549 | 623 637 | 712 | 809 … … | 834 | 921 | 959 10 324 375 623 637 959
  • 10. Benefit #1: Amazon Redshift is fast Hardware optimized for I/O intensive workloads, 4 GB/sec/node Enhanced networking, over 1 million packets/sec/node Choice of storage type, instance size Regular cadence of autopatched improvements
  • 11. Benefit #2: Amazon Redshift is inexpensive Ds2 (HDD) Price per hour for DW1.XL single node Effective annual price per TB compressed On demand $ 0.850 $ 3,725 1-year reservation $ 0.500 $ 2,190 3-year reservation $ 0.228 $ 999 Dc1 (SSD) Price per hour for DW2.L single node Effective annual price per TB compressed On demand $ 0.250 $ 13,690 1-year reservation $ 0.161 $ 8,795 3-year reservation $ 0.100 $ 5,500 Pricing is simple Number of nodes x price/hour No charge for leader node No upfront costs Pay as you go
  • 12. Benefit #3: Amazon Redshift is fully managed Continuous/incremental backups Multiple copies within cluster Continuous and incremental backups to Amazon S3 Continuous and incremental backups across regions Streaming restore Amazon S3 Amazon S3 Region 1 Region 2
  • 13. Benefit #3: Amazon Redshift is fully managed Amazon S3 Amazon S3 Region 1 Region 2 Fault tolerance Disk failures Node failures Network failures Availability Zone/region level disasters
  • 14. Benefit #4: Security is built in • Load encrypted from Amazon S3 • SSL to secure data in transit • ECDHE perfect forward security • Amazon VPC for network isolation • Encryption to secure data at rest – All blocks on disks and in Amazon S3 encrypted – Block key, cluster key, master key (AES-256) – On-premises HSM and AWS CloudHSM support • Audit logging and AWS CloudTrail integration • SOC 1/2/3, PCI-DSS, FedRAMP, BAA 10 GigE (HPC) Ingestion Backup Restore Customer VPC Internal VPC JDBC/ODBC
  • 15. Benefit #5: We innovate quickly  Well over 125 new features added since launch  Release every two weeks  Automatic patching Service Launch (2/14) PDX (4/2) Temp Credentials (4/11) DUB (4/25) SOC1/2/3 (5/8) Unload Encrypted Files NRT (6/5) JDBC Fetch Size (6/27) Unload logs (7/5) SHA1 Builtin (7/15) 4 byte UTF-8 (7/18) Sharing snapshots (7/18) Statement Timeout (7/22) Timezone, Epoch, Autoformat (7/25) WLM Timeout/Wildcards (8/1) CRC32 Builtin, CSV, Restore Progress (8/9) Resource Level IAM (8/9) PCI (8/22) UTF-8 Substitution (8/29) JSON, Regex, Cursors (9/10) Split_part, Audit tables (10/3) SIN/SYD (10/8) HSM Support (11/11) Kinesis EMR/HDFS/SSH copy, Distributed Tables, Audit Logging/CloudTrail, Concurrency, Resize Perf., Approximate Count Distinct, SNS Alerts, Cross Region Backup (11/13) Distributed Tables, Single Node Cursor Support, Maximum Connections to 500 (12/13) EIP Support for VPC Clusters (12/28) New query monitoring system tables and diststyle all (1/13) Redshift on DW2 (SSD) Nodes (1/23) Compression for COPY from SSH, Fetch size support for single node clusters, new system tables with commit stats, row_number(), strotol() and query termination (2/13) Resize progress indicator & Cluster Version (3/21) Regex_Substr, COPY from JSON (3/25) 50 slots, COPY from EMR, ECDHE ciphers (4/22) 3 new regex features, Unload to single file, FedRAMP(5/6) Rename Cluster (6/2) Copy from multiple regions, percentile_cont, percentile_disc (6/30) Free Trial (7/1) pg_last_unload_count (9/15) AES-128 S3 encryption (9/29) UTF-16 support (9/29)
  • 16. Matillion ETL for Redshift and AWS With Matillion ETL for Redshift on AWS Marketplace, customers enjoy a range of benefits building data warehouses on Amazon Redshift quickly and simply:  ELT architecture: Using the power of the Redshift cluster for transformation. Fast and always in touch with the data  Intuitive and productive: Cloud 1st and designed for AWS and Amazon Redshift. Makes leveraging Amazon Redshift simple and productive  Launched via AWS Marketplace: simple to install and scale, secure, integrated utility billing
  • 17. • General Electric has been in business for over 140 years, investing $5.4B annually in R&D (6% of Revenue) • Augmenting our Operational Technology depth with Digital. • Focus on Industrial Internet of Things, and creating insights based on business and machine data. • Knew we needed best in class partners to let us focus on what we do best; GE migrating 9000 workloads into AWS.
  • 18. Our Challenge: Raise the bar on Data Warehouse scalability, integration, stability and development velocity. • Needed scalability for machine and business data, as GE increasingly digitizes. • Self-serve BI strategy meant we had to maintain our current compute capabilities. • Increasingly critical dependencies require rock-solid platform. • Desired more intuitive and accessible analytics solution.
  • 19. Our solution: AWS and Matillion ETL for Redshift SAP 10 x DC1 Nodes Amazon Redshift Cluster Staging DWH Matillion ETL M3.Large ELT Tableau CDC Data Replication (HVR)
  • 20. Key Numbers: 45%operating cost reduction 6months from ideation to go-live <$100PoC pilot cost
  • 21. Outcomes for GE: • Preserved performance and added stability. • Simplified operations with managed Redshift solution. • OPEX savings. No CAPEX required. • Fast development from concept to reality. • Highly resilient, seamless scaling
  • 23.  Launch a free 14-day trial of Matillion ETL for Amazon Redshift on the AWS Marketplace – https://aws.amazon.com/marketplace  Get tutorials and videos at the Matillion YouTube channel: – https://www.youtube.com/user/MatillionVideos  Try Amazon Redshift for free. Get 750 free DC1.Large hours per month for 2 months. – https://aws.amazon.com/redshift  Support and documentation – https://redshiftsupport.matillion.com Next steps and further information
  • 24. Q&A Free Trial Available in AWS Marketplace