Cloud and Amazon Redshift
Rahul Pathak, Amazon Redshift Product Management
Nicolas Brisoux, Informatica Cloud Platform Adoption
Darren Cunningham, Informatica Cloud Marketing
@infacloud #redshift
Today’s Agenda
• Informatica and Amazon Strategic Partnership
• Amazon Redshift Overview
• Informatica Cloud Redshift Connector
• Demonstration
• Discussion
• Next Steps
Informatica: The Information Management Leader
B2B Data Exchange
Informatica supports the
requirements of cross-organizational
data exchange, so users apply
familiar & trusted data integration
tools and techniques to the growing
practice of B2B data integration.
Cloud Data Integration
Enterprise Data Integration
Complex Event Processing
Informatica received high praise for
its services from customers. For
deployments involving systems
monitoring use cases, Informatica
offers a five-day stand‐up of
RulePoint.
Ultra Messaging
In spite of the new
entrants, Informatica remains the
market leader in this highly
demanding part of the messaging
market.
Data Quality
Master Data Management
Application ILM
Informatica Cloud: our fastest growing product line
Today’s Focus: Cloud Data Integration
Informatica Cloud and Amazon Redshift:
Enabling cost-effective data warehousing
• Redshift Connector pre-release announced in February
• General availability this month (August)
InformaticaCloud.com/Amazon-Redshift
Rahul Pathak | rapathak@amazon.com | @rahulpathak
Senior Product Manager
Amazon Redshift
AWS Database Services
• Amazon RDS: Fully managed SQL database service for OLTP workloads
• Amazon DynamoDB: Fully managed NoSQL service for massively scalable, high throughput, low latency workloads
• Amazon Redshift: Fully managed fast and powerful, petabyte-scale data warehouse service
• Amazon ElastiCache: Fully managed Memcached-compliant in-memory caching service
We set out to build…
A fast and powerful, petabyte-scale data warehouse that is:
A Lot Faster
A Lot Cheaper
A Lot Simpler
Amazon Redshift
Data warehousing done the AWS way
• Pay as you go, no up front costs
• Fast, cheap, easy to use
• SQL
• Easy to provision
Common Customer Use Cases
Traditional Enterprise DW
• Reduce costs by extending DW rather than adding HW
• Migrate completely from existing DW systems
• Respond faster to business; provision in minutes
Companies with Big Data
• Improve performance by an order of magnitude
• Make more data available for analysis
• Access business data via standard reporting tools
SaaS Companies
• Add analytic functionality to applications
• Scale DW capacity as demand grows
• Reduce HW & SW costs by an order of magnitude
Progress Since Launch on Feb 14, 2013
• Fastest growing service in AWS history
• Well over 1,000 customers; adding over 100 per week
• Obtained SOC1 & SOC2 certification with more in progress
• Deployed in US East (N. Virginia), US West (Oregon), EU
(Ireland) and Asia Pacific (Tokyo)
• Additional global regions coming soon
Amazon Redshift Customers
• 5x – 20x reduction in query times; 4x cost reduction over HIVE
• 20x – 40x reduction in query times
• Nokia: 50% reduction in costs, 2x improvement in query times
Amazon Redshift Customer: bit.ly
“When we want to answer a
question with Redshift, we
just write a SQL query and
get an answer within a few
minutes – if not seconds.”
- Sean O’Connor, Engineer at bit.ly
Bit.ly provides social link sharing
analytics, managing over 300
million shortens and 5 billion
clicks each month
Amazon Redshift Customer: HasOffers
“Amazon Redshift introduces a
major opportunity to improve
the performance of our real-
time reporting, allowing us to
run queries up to 50 times faster
than our current OLAP solution.”
- Niek Sanders, VP of Engineering, HasOffers
HasOffers records and reports
billions of desktop and mobile
interactions for performance
marketers
Amazon Redshift Customer: Infor
“This is the formula for fast and broad
adoption, where customers can get
consistent, accurate, and useful
data fast - in weeks not months or
years.”
- Ali Shadman, SVP, Business Cloud & Upgrades, Infor
Infor is the world’s third largest
ERP vendor, serving over 70,000
customers in 194 countries
Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
• Large data block sizes
ID Age State Amount
123 20 CA 500
345 25 WA 250
678 40 FL 125
957 37 WA 375
• With row storage you do
unnecessary I/O
• To get total amount, you
have to read everything
Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
• Large data block sizes
• With column storage, you
only read the data you need
ID Age State Amount
123 20 CA 500
345 25 WA 250
678 40 FL 125
957 37 WA 375
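The difference between the two layouts can be sketched with the four sample rows from the slide. This is a toy model only: it counts values touched rather than disk blocks, and the function names are made up for illustration. It says nothing about how Amazon Redshift lays out data internally.

```python
# Toy model: how many values must be read to total the Amount column?
COLUMNS = ["ID", "Age", "State", "Amount"]
ROWS = [
    (123, 20, "CA", 500),
    (345, 25, "WA", 250),
    (678, 40, "FL", 125),
    (957, 37, "WA", 375),
]


def total_amount_row_store(rows):
    """Row layout: each row is stored together, so every field is read."""
    amount_idx = COLUMNS.index("Amount")
    values_read = 0
    total = 0
    for row in rows:
        values_read += len(row)  # the whole row comes off disk
        total += row[amount_idx]
    return total, values_read


def total_amount_column_store(rows):
    """Column layout: each column is stored together, so only Amount is read."""
    columns = {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}
    amounts = columns["Amount"]
    return sum(amounts), len(amounts)


row_total, row_reads = total_amount_row_store(ROWS)
col_total, col_reads = total_amount_column_store(ROWS)
print("row store:   ", row_total, "from", row_reads, "values read")
print("column store:", col_total, "from", col_reads, "values read")
```

Both layouts give the same total; the column layout reaches it by touching one column instead of all four.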
Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
• Large data block sizes
• Columnar compression saves
space & reduces I/O
• Amazon Redshift analyzes
and compresses your data
analyze compression listing;
Table | Column | Encoding
---------+----------------+----------
listing | listid | delta
listing | sellerid | delta32k
listing | eventid | delta32k
listing | dateid | bytedict
listing | numtickets | bytedict
listing | priceperticket | delta32k
listing | totalprice | mostly32
listing | listtime | raw
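To show why an encoding such as delta can save space, here is a minimal sketch of delta encoding in general. It is not Amazon Redshift's on-disk format, and the sample listid values are hypothetical. The idea is that slowly increasing integers become a run of small differences, which need fewer bytes than the full values.

```python
def delta_encode(values):
    """Keep the first value, then store each value as a difference from the previous one."""
    if not values:
        return []
    encoded = [values[0]]
    for previous, current in zip(values, values[1:]):
        encoded.append(current - previous)
    return encoded


def delta_decode(encoded):
    """Rebuild the original values by accumulating the differences."""
    decoded = []
    running = 0
    for position, delta in enumerate(encoded):
        running = delta if position == 0 else running + delta
        decoded.append(running)
    return decoded


# Hypothetical listid values: large numbers that grow in small steps.
list_ids = [100001, 100002, 100003, 100005, 100008]
encoded = delta_encode(list_ids)
print(encoded)
print(delta_decode(encoded) == list_ids)
```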
Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
• Large data block sizes
• Track the minimum
and maximum value for
each block
• Skip over blocks that
don’t contain the data
needed for a given query
• Minimize unnecessary I/O
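The block-skipping idea above can be sketched in a few lines. This is an illustration under assumptions: the block contents are hypothetical Age values, blocks are plain Python lists, and the helper names are invented here. It is not Amazon Redshift's implementation.

```python
def build_zone_map(blocks):
    """Record the (minimum, maximum) value held in each block."""
    return [(min(block), max(block)) for block in blocks]


def scan_equal(blocks, zone_map, target):
    """Find values equal to target, reading only blocks whose range can contain it."""
    matches = []
    blocks_read = 0
    for block, (low, high) in zip(blocks, zone_map):
        if target < low or target > high:
            continue  # the zone map proves the block cannot match, so skip it
        blocks_read += 1
        matches.extend(value for value in block if value == target)
    return matches, blocks_read


# Hypothetical blocks of Age values; sorted data keeps the ranges from overlapping.
blocks = [[20, 22, 25], [37, 38, 40], [51, 55, 60]]
zone_map = build_zone_map(blocks)
matches, blocks_read = scan_equal(blocks, zone_map, 40)
print(matches, "found after reading", blocks_read, "of", len(blocks), "blocks")
```

The more the block ranges overlap, the fewer blocks can be skipped, which is why the ordering of the data matters for this technique.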
Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
• Large data block sizes
• Use direct-attached storage
to maximize throughput
• Hardware optimized for high
performance data
processing
• Large block sizes to make
the most of each read
• Amazon Redshift manages
durability for you
Amazon Redshift architecture
• Leader Node
– SQL endpoint
– Stores metadata
– Coordinates query execution
• Compute Nodes
– Local, columnar storage
– Execute queries in parallel
– Load, backup, restore via
Amazon S3
– Parallel load from Amazon
DynamoDB
• Single node version available
[Diagram labels: JDBC/ODBC; 10 GigE (HPC); Ingestion, Backup, Restore]
Amazon Redshift runs on optimized hardware
HS1.8XL: 128 GB RAM, 16 Cores, 24 Spindles, 16 TB compressed user storage, 2 GB/sec scan rate
HS1.XL: 16 GB RAM, 2 Cores, 3 Spindles, 2 TB compressed customer storage
• Optimized for I/O intensive workloads
• High disk density
• Runs in HPC - fast network
• HS1.8XL available on Amazon EC2
Amazon Redshift lets you start small and grow big
Extra Large Node (HS1.XL)
3 spindles, 2 TB, 16 GB RAM, 2 cores
Single Node (2 TB)
Cluster 2-32 Nodes (4 TB – 64 TB)
Eight Extra Large Node (HS1.8XL)
24 spindles, 16 TB, 128 GB RAM, 16 cores, 10 GigE
Cluster 2-100 Nodes (32 TB – 1.6 PB)
Note: Nodes not to scale
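The capacity ranges on this slide are node count multiplied by storage per node. A small sketch of that arithmetic follows. The dictionary layout and function name are assumptions made for illustration, and the standalone single-node option is not modeled.

```python
# Per-node storage and multi-node cluster limits as stated on the slide.
NODE_TYPES = {
    "HS1.XL": {"tb_per_node": 2, "min_nodes": 2, "max_nodes": 32},
    "HS1.8XL": {"tb_per_node": 16, "min_nodes": 2, "max_nodes": 100},
}


def cluster_capacity_tb(node_type, node_count):
    """Return compressed storage in TB for a multi-node cluster."""
    spec = NODE_TYPES[node_type]
    if not spec["min_nodes"] <= node_count <= spec["max_nodes"]:
        raise ValueError(
            f"{node_type} clusters take {spec['min_nodes']} to {spec['max_nodes']} nodes"
        )
    return spec["tb_per_node"] * node_count


for node_type, spec in NODE_TYPES.items():
    smallest = cluster_capacity_tb(node_type, spec["min_nodes"])
    largest = cluster_capacity_tb(node_type, spec["max_nodes"])
    print(f"{node_type}: {smallest} TB to {largest} TB")
```

The largest HS1.8XL cluster comes to 1,600 TB, which is the 1.6 PB quoted on the slide.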
Amazon Redshift is priced to let you analyze all your data
Simple Pricing
Number of Nodes x Cost per Hour
No charge for Leader Node
No upfront costs
Pay as you go
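                   | Price Per Hour for HS1.XL Single Node | Effective Hourly Price Per TB | Effective Annual Price per TB
-------------------+---------------------------------------+-------------------------------+-------------------------------
On-Demand          | $0.850                                | $0.425                        | $3,723
1 Year Reservation | $0.500                                | $0.250                        | $2,190
3 Year Reservation | $0.228                                | $0.114                        | $999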
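The per-TB figures can be reproduced from the hourly node price. The sketch below assumes that the effective hourly price per TB is the node price divided by the 2 TB of an HS1.XL node, and that the annual figure uses 8,760 hours per year. Those assumptions match the slide's numbers, but the slide does not state how they were derived.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, assumed
TB_PER_HS1_XL_NODE = 2     # from the HS1.XL specification in this deck

# Hourly prices for a single HS1.XL node, as listed on the slide.
NODE_PRICE_PER_HOUR = {
    "On-Demand": 0.850,
    "1 Year Reservation": 0.500,
    "3 Year Reservation": 0.228,
}


def effective_prices(price_per_node_hour):
    """Return (hourly price per TB, annual price per TB) for one HS1.XL node."""
    hourly_per_tb = price_per_node_hour / TB_PER_HS1_XL_NODE
    annual_per_tb = hourly_per_tb * HOURS_PER_YEAR
    return hourly_per_tb, annual_per_tb


for option, price in NODE_PRICE_PER_HOUR.items():
    hourly, annual = effective_prices(price)
    print(f"{option}: ${hourly:.3f} per TB-hour, ${annual:,.0f} per TB-year")
```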
Amazon Redshift is easy to use
• Provision in minutes
• Monitor query
performance
• Point and click resize
• Built in security
• Automatic backups
Slides not intended for redistribution.
Amazon Redshift has security built-in
• SSL to secure data in transit
• Encryption to secure data at rest
– AES-256; hardware accelerated
– All blocks on disks and in Amazon
S3 encrypted
• No direct access to compute
nodes
• Amazon VPC support
[Diagram labels: Customer VPC; Internal Security Group; JDBC/ODBC; 10 GigE (HPC); Ingestion, Backup, Restore]
Amazon Redshift continuously backs up your data and
recovers from failures
• Replication within the cluster and backup to Amazon S3 to maintain
multiple copies of data at all times
• Backups to Amazon S3 are continuous, automatic, and incremental
– Designed for eleven nines of durability
• Continuous monitoring and automated recovery from failures of drives
and nodes
• Able to restore snapshots to any Availability Zone within a region
Amazon Redshift works with your existing analysis tools
More coming soon…
JDBC/ODBC
Amazon Redshift
Amazon Redshift integrates with multiple data sources
• Amazon Elastic MapReduce
• Amazon DynamoDB
• Amazon Elastic Compute Cloud (EC2)
• AWS Storage Gateway Service
• Amazon Simple Storage Service (S3)
• Corporate Data Center
• Amazon Relational Database Service (RDS)
• Amazon Redshift
Today’s Agenda
• Informatica and Amazon Strategic Partnership
• Amazon Redshift Overview
• Informatica Cloud Redshift Connector
• Demonstration
• Discussion
• Next Steps
Informatica Cloud Architecture Overview
[Diagram with callouts 1 to 4; labels: Secure Agent, Your Company, Marketplace, Amazon Redshift]
Map Once. Deploy Anywhere.
ON PREMISE | HADOOP | 3rd PARTY APPLICATIONS | CLOUD
Cloud Amazon Redshift
Connector Demo
Nicolas Brisoux, Cloud Platform Adoption
Best practices to remember…
• The Amazon S3 bucket that holds the data files must be
created in the same region as your cluster
• Files are deleted from the Amazon S3 bucket when the upload is
complete
• Choose a batch size where the number of batches
matches the number of slices in your cluster
• Each XL node has 2 slices, each 8XL node has 16
• If you have a 2 node XL cluster and 40,000 rows of data,
choose a batch size of 10,000
• The Informatica Cloud Redshift connector can maximize
Amazon’s parallel processing capabilities this way
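The batch-size rule above reduces to a short calculation: count the slices in the cluster, then divide the row count by that number. The sketch below uses the slice counts stated on this slide. The function name and the choice to round up when rows do not divide evenly are assumptions for illustration, not connector behavior.

```python
import math

# Slices per node, as stated on the slide.
SLICES_PER_NODE = {"XL": 2, "8XL": 16}


def suggested_batch_size(node_type, node_count, row_count):
    """Pick a batch size so the number of batches matches the number of slices."""
    slices = SLICES_PER_NODE[node_type] * node_count
    # Rounding up keeps the batch count at or below the slice count.
    return math.ceil(row_count / slices)


# The slide's example: a 2 node XL cluster (4 slices) and 40,000 rows.
print(suggested_batch_size("XL", 2, 40_000))
```

With 4 slices and 40,000 rows this yields the batch size of 10,000 given in the example, so each slice receives one batch.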
Informatica Cloud Amazon Redshift demonstration
[Diagram components: Firewall, Informatica Cloud, Secure Agent, Metadata Mappings]
1 Authenticate and retrieve Data Synchronization Task
2 Retrieve Account Data
3 Perform lookup on SLA level
4 Put Account Data & SLA Level into Flat File
5 Transfer compressed Flat File
6 Initiate load from Amazon S3
7 Load data into Amazon Redshift
PowerCenter Mappings and Informatica Cloud
• If you want to reuse your existing PowerCenter mappings
with Informatica Cloud and Redshift you have 2 options:
1 Use the PowerCenter Repository Manager to export your
existing workflows and import them into Informatica Cloud
using the PowerCenter Tasks feature
Or…
2 Keep your existing mappings in PowerCenter and stage the
data
• Create a DSS task in Informatica Cloud to move the data to
Redshift from the staging area
• This task can be managed from PowerCenter
Why Informatica Cloud Integration for Redshift?
1 Map Once, Deploy Anywhere
2 Rapid Connectivity & Deployment
3 Advanced Integration Delivered Easily
4 Excellence in batch and real-time integration
InformaticaCloud.com
Next Steps
• Get started with Amazon Redshift
• Get started with Informatica Cloud
• InformaticaCloud.com
• Learn more about our Redshift Connector
• InformaticaCloud.com/Amazon-Redshift
Discussion
Rahul Pathak, Amazon Redshift Product Management
Nicolas Brisoux, Informatica Cloud Platform Adoption
Darren Cunningham, Informatica Cloud Marketing
@infacloud #redshift
InformaticaCloud.com
Big Data in the Cloud with Informatica Cloud and Amazon Redshift

  • 2. Today’s Agenda • Informatica and Amazon Strategic Partnership • Amazon Redshift Overview • Informatica Cloud Redshift Connector • Demonstration • Discussion • Next Steps
  • 3. Informatica: The Information Management Leader • B2B Data Exchange: Informatica supports the requirements of cross-organizational data exchange, so users apply familiar & trusted data integration tools and techniques to the growing practice of B2B data integration. • Cloud Data Integration • Enterprise Data Integration • Complex Event Processing: Informatica received high praise for its services from customers. For deployments involving systems monitoring use cases, Informatica offers a five-day stand-up of RulePoint. • Ultra Messaging: In spite of the new entrants, Informatica remains the market leader in this highly demanding part of the messaging market. • Data Quality • Master Data Management • Application ILM
  • 4. Informatica Cloud: our fastest growing product line Today’s Focus: Cloud Data Integration
  • 5. Informatica Cloud and Amazon Redshift: Enabling cost-effective data warehousing • Redshift Connector pre-release announced in February • General availability this month (August) InformaticaCloud.com/Amazon-Redshift
  • 6. Rahul Pathak | rapathak@amazon.com | @rahulpathak Senior Product Manager Amazon Redshift
  • 7. AWS Database Services • Amazon RDS: Fully managed SQL database service for OLTP workloads • Amazon DynamoDB: Fully managed NoSQL service for massively scalable, high throughput, low latency workloads • Amazon Redshift: Fully managed, fast and powerful, petabyte-scale data warehouse service • Amazon ElastiCache: Fully managed Memcached-compliant in-memory caching service
  • 8. We set out to build… A fast and powerful, petabyte-scale data warehouse that is: A Lot Faster A Lot Cheaper A Lot Simpler Amazon Redshift
  • 9. Data warehousing done the AWS way • Pay as you go, no up front costs • Fast, cheap, easy to use • SQL • Easy to provision
  • 10. Common Customer Use Cases • Reduce costs by extending DW rather than adding HW • Migrate completely from existing DW systems • Respond faster to business; provision in minutes • Improve performance by an order of magnitude • Make more data available for analysis • Access business data via standard reporting tools • Add analytic functionality to applications • Scale DW capacity as demand grows • Reduce HW & SW costs by an order of magnitude Traditional Enterprise DW Companies with Big Data SaaS Companies
  • 11. Progress Since Launch on Feb 14, 2013 • Fastest growing service in AWS history • Well over 1,000 customers; adding over 100 per week • Obtained SOC1 & SOC2 certification with more in progress • Deployed in US East (N. Virginia), US West (Oregon), EU (Ireland) and Asia Pacific (Tokyo) • Additional global regions coming soon
  • 12. Amazon Redshift Customers • 5x – 20x reduction in query times; 4x cost reduction over HIVE • 20x – 40x reduction in query times • Nokia: 50% reduction in costs, 2x improvement in query times
  • 13. Amazon Redshift Customer: bit.ly “When we want to answer a question with Redshift, we just write a SQL query and get an answer within a few minutes – if not seconds.” - Sean O’Connor, Engineer at bit.ly Bit.ly provides social link sharing analytics, managing over 300 million shortens and 5 billion clicks each month
  • 14. Amazon Redshift Customer: HasOffers “Amazon Redshift introduces a major opportunity to improve the performance of our real-time reporting, allowing us to run queries up to 50 times faster than our current OLAP solution.” - Niek Sanders, VP of Engineering, HasOffers HasOffers records and reports billions of desktop and mobile interactions for performance marketers
  • 15. Amazon Redshift Customer: Infor “This is the formula for fast and broad adoption, where customers can get consistent, accurate, and useful data fast - in weeks not months or years.” - Ali Shadman, SVP, Business Cloud & Upgrades, Infor Infor is the world’s third largest ERP vendor, serving over 70,000 customers in 194 countries
  • 16. Amazon Redshift dramatically reduces I/O • Data compression • Zone maps • Direct-attached storage • Large data block sizes • With row storage you do unnecessary I/O • To get total amount, you have to read everything
        ID  | Age | State | Amount
        123 | 20  | CA    | 500
        345 | 25  | WA    | 250
        678 | 40  | FL    | 125
        957 | 37  | WA    | 375
  • 17. Amazon Redshift dramatically reduces I/O • Data compression • Zone maps • Direct-attached storage • Large data block sizes • With column storage, you only read the data you need
        ID  | Age | State | Amount
        123 | 20  | CA    | 500
        345 | 25  | WA    | 250
        678 | 40  | FL    | 125
        957 | 37  | WA    | 375
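The row-versus-column contrast on slides 16 and 17 can be illustrated with a minimal Python sketch. It is an in-memory toy built from the four rows shown on the slides, not a description of how Amazon Redshift lays out blocks on disk. The names `rows`, `columns`, and both functions are hypothetical, and counting "values touched" is only a stand-in for I/O.

```python
# Toy illustration of row vs. column layout using the table from the slides.
# "Values touched" stands in for I/O; real engines read disk blocks instead.

rows = [
    (123, 20, "CA", 500),
    (345, 25, "WA", 250),
    (678, 40, "FL", 125),
    (957, 37, "WA", 375),
]

# Column layout: one list per column, built from the same rows.
names = ("id", "age", "state", "amount")
columns = {name: [row[i] for row in rows] for i, name in enumerate(names)}


def total_amount_row_store(rows):
    """Sum Amount from a row layout; every field of every row is read."""
    touched = 0
    total = 0
    for row in rows:
        touched += len(row)  # the whole row comes back from storage
        total += row[3]
    return total, touched


def total_amount_column_store(columns):
    """Sum Amount from a column layout; only the Amount column is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


print(total_amount_row_store(rows))        # (1250, 16)
print(total_amount_column_store(columns))  # (1250, 4)
```

Both layouts return the same total, but the column layout touches 4 values instead of 16, which is the point the slides make about unnecessary I/O.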
  • 18. Amazon Redshift dramatically reduces I/O • Column storage • Data compression • Zone maps • Direct-attached storage • Large data block sizes • Columnar compression saves space & reduces I/O • Amazon Redshift analyzes and compresses your data
        analyze compression listing;
         Table   | Column         | Encoding
        ---------+----------------+----------
         listing | listid         | delta
         listing | sellerid       | delta32k
         listing | eventid        | delta32k
         listing | dateid         | bytedict
         listing | numtickets     | bytedict
         listing | priceperticket | delta32k
         listing | totalprice     | mostly32
         listing | listtime       | raw
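Several of the encodings recommended on slide 18 are delta variants. As a rough sketch of why a column of slowly increasing IDs compresses well, the Python below implements generic delta encoding: keep the first value, then store only the differences between consecutive values. This is the textbook technique, not Amazon Redshift's on-disk format; the function names and the sample `listids` values are hypothetical.

```python
# Generic delta encoding sketch (not Redshift's actual storage format).
# The sample values below are made up for illustration.


def delta_encode(values):
    """Return the first value followed by consecutive differences."""
    if not values:
        return []
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    values = []
    running = 0
    for d in deltas:
        running += d
        values.append(running)
    return values


listids = [1001, 1002, 1003, 1005, 1008]
encoded = delta_encode(listids)
print(encoded)                # [1001, 1, 1, 2, 3]
print(delta_decode(encoded))  # [1001, 1002, 1003, 1005, 1008]
```

After the first entry, the encoded values are small numbers that fit in far fewer bytes than the originals, which is where the space and I/O savings come from.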
  • 19. Amazon Redshift dramatically reduces I/O • Column storage • Data compression • Zone maps • Direct-attached storage • Large data block sizes • Keep track of the minimum and maximum value for each block • Skip over blocks that don’t contain the data needed for a given query • Minimize unnecessary I/O
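The block-skipping idea on slide 19 can be sketched in a few lines of Python. This is a simplified model that assumes each block is a plain list of integers and the query is an equality filter; `build_zone_map` and `scan_equal` are hypothetical names, not Redshift APIs.

```python
# Simplified zone-map sketch: per-block min/max lets a scan skip blocks
# that cannot contain the target value. Blocks here are plain Python lists.


def build_zone_map(blocks):
    """Record (min, max) for every block."""
    return [(min(block), max(block)) for block in blocks]


def scan_equal(blocks, zone_map, target):
    """Return matching values and the number of blocks actually read."""
    matches = []
    blocks_read = 0
    for block, (lo, hi) in zip(blocks, zone_map):
        if target < lo or target > hi:
            continue  # the zone map proves this block has no match
        blocks_read += 1
        matches.extend(v for v in block if v == target)
    return matches, blocks_read


blocks = [[1, 3, 5, 7], [8, 9, 12, 15], [16, 18, 21, 25]]
zone_map = build_zone_map(blocks)
print(scan_equal(blocks, zone_map, 12))  # ([12], 1)
```

Only one of the three blocks is read for this lookup. The benefit depends on the data being roughly ordered; if every block spans the full value range, nothing can be skipped.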
  • 20. Amazon Redshift dramatically reduces I/O • Column storage • Data compression • Zone maps • Direct-attached storage • Large data block sizes • Use direct-attached storage to maximize throughput • Hardware optimized for high performance data processing • Large block sizes to make the most of each read • Amazon Redshift manages durability for you
  • 21. Amazon Redshift architecture • Leader Node – SQL endpoint – Stores metadata – Coordinates query execution • Compute Nodes – Local, columnar storage – Execute queries in parallel – Load, backup, restore via Amazon S3 – Parallel load from Amazon DynamoDB • Single node version available 10 GigE (HPC) Ingestion Backup Restore JDBC/ODBC
  • 22. Amazon Redshift runs on optimized hardware HS1.8XL: 128 GB RAM, 16 Cores, 24 Spindles, 16 TB compressed user storage, 2 GB/sec scan rate HS1.XL: 16 GB RAM, 2 Cores, 3 Spindles, 2 TB compressed customer storage • Optimized for I/O intensive workloads • High disk density • Runs in HPC - fast network • HS1.8XL available on Amazon EC2
  • 23. Amazon Redshift lets you start small and grow big Extra Large Node (HS1.XL) 3 spindles, 2 TB, 16 GB RAM, 2 cores Single Node (2 TB) Cluster 2-32 Nodes (4 TB – 64 TB) Eight Extra Large Node (HS1.8XL) 24 spindles, 16 TB, 128 GB RAM, 16 cores, 10 GigE Cluster 2-100 Nodes (32 TB – 1.6 PB) Note: Nodes not to scale
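The capacity ranges on slide 23 follow from multiplying node count by per-node storage. The short Python sketch below reproduces that arithmetic from the figures on the slide; the `NODE_TYPES` table and `cluster_capacity_tb` function are hypothetical names used only for illustration.

```python
# Cluster capacity arithmetic using the node sizes and limits from slide 23.
# The single-node (2 TB) HS1.XL option is separate and not modeled here.

NODE_TYPES = {
    "HS1.XL": {"tb_per_node": 2, "min_nodes": 2, "max_nodes": 32},
    "HS1.8XL": {"tb_per_node": 16, "min_nodes": 2, "max_nodes": 100},
}


def cluster_capacity_tb(node_type, nodes):
    """Return compressed storage in TB for a multi-node cluster."""
    spec = NODE_TYPES[node_type]
    if not spec["min_nodes"] <= nodes <= spec["max_nodes"]:
        raise ValueError(f"{node_type} clusters take "
                         f"{spec['min_nodes']}-{spec['max_nodes']} nodes")
    return nodes * spec["tb_per_node"]


print(cluster_capacity_tb("HS1.XL", 2))     # 4
print(cluster_capacity_tb("HS1.XL", 32))    # 64
print(cluster_capacity_tb("HS1.8XL", 2))    # 32
print(cluster_capacity_tb("HS1.8XL", 100))  # 1600
```

The last result, 1,600 TB, is the 1.6 PB upper bound quoted on the slide.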
  • 24. Amazon Redshift is priced to let you analyze all your data • Simple Pricing: Number of Nodes x Cost per Hour • No charge for Leader Node • No upfront costs • Pay as you go
                           | Price Per Hour for HS1.XL Single Node | Effective Hourly Price Per TB | Effective Annual Price per TB
        On-Demand          | $ 0.850                               | $ 0.425                       | $ 3,723
        1 Year Reservation | $ 0.500                               | $ 0.250                       | $ 2,190
        3 Year Reservation | $ 0.228                               | $ 0.114                       | $ 999
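The "effective" columns on slide 24 can be derived from the hourly node price. The Python sketch below assumes an HS1.XL node holds 2 TB (as stated on slides 22 and 23) and a year of 8,760 hours; the hourly prices are taken from the slide as given, and the function name is hypothetical.

```python
# Reproduces the effective per-TB prices on slide 24 from the hourly prices.
# Assumes 2 TB per HS1.XL node and 24 * 365 hours per year.

HOURS_PER_YEAR = 24 * 365  # 8,760
TB_PER_HS1_XL_NODE = 2

hourly_price_per_node = {
    "On-Demand": 0.850,
    "1 Year Reservation": 0.500,
    "3 Year Reservation": 0.228,
}


def effective_prices(price_per_node_hour, tb_per_node=TB_PER_HS1_XL_NODE):
    """Return (hourly price per TB, annual price per TB)."""
    per_tb_hour = price_per_node_hour / tb_per_node
    per_tb_year = per_tb_hour * HOURS_PER_YEAR
    return per_tb_hour, per_tb_year


for plan, price in hourly_price_per_node.items():
    per_tb_hour, per_tb_year = effective_prices(price)
    print(f"{plan}: ${per_tb_hour:.3f}/TB/hour, ${per_tb_year:,.0f}/TB/year")
```

The 3-year figure works out to $998.64, which the slide rounds to $999.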
  • 25. Amazon Redshift is easy to use • Provision in minutes • Monitor query performance • Point and click resize • Built in security • Automatic backups Slides not intended for redistribution.
  • 26. Amazon Redshift has security built-in • SSL to secure data in transit • Encryption to secure data at rest – AES-256; hardware accelerated – All blocks on disks and in Amazon S3 encrypted • No direct access to compute nodes • Amazon VPC support 10 GigE (HPC) Ingestion Backup Restore Customer VPC Internal Security Group JDBC/ODBC
  • 27. Amazon Redshift continuously backs up your data and recovers from failures • Replication within the cluster and backup to Amazon S3 to maintain multiple copies of data at all times • Backups to Amazon S3 are continuous, automatic, and incremental – Designed for eleven nines of durability • Continuous monitoring and automated recovery from failures of drives and nodes • Able to restore snapshots to any Availability Zone within a region
  • 28. Amazon Redshift works with your existing analysis tools More coming soon… JDBC/ODBC Amazon Redshift
  • 29. Amazon Redshift integrates with multiple data sources Amazon Elastic MapReduce Amazon DynamoDB Amazon Elastic Compute Cloud (EC2) AWS Storage Gateway Service Amazon Simple Storage Service (S3) Corporate Data Center Amazon Relational Database Service (RDS) Amazon Redshift
  • 30. Today’s Agenda • Informatica and Amazon Strategic Partnership • Amazon Redshift Overview • Informatica Cloud Redshift Connector • Demonstration • Discussion • Next Steps
  • 31. Informatica Cloud Architecture Overview (diagram labels: Secure Agent, Your Company, Marketplace, Amazon Redshift; callouts 1 to 4)
  • 32. Map Once. Deploy Anywhere. ON PREMISE HADOOP 3rd PARTY APPLICATIONS CLOUD
  • 33. Cloud Amazon Redshift Connector Demo Nicolas Brisoux, Cloud Platform Adoption
  • 34. Best practices to remember… • The Amazon S3 bucket that holds the data files must be created in the same region as your cluster • Files are deleted from Amazon S3 bucket when upload is complete • Choose a batch size where the number of batches matches the number of slices in your cluster • Each XL node has 2 slices, each 8XL node has 16 • If you have a 2 node XL cluster and 40,000 rows of data, choose a batch size of 10,000 • The Informatica Cloud Redshift connector can maximize Amazon’s parallel processing capabilities this way
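The batch-size guidance on slide 34 reduces to simple arithmetic: count the slices in the cluster, then divide the row count by that number so there is one batch per slice. The Python sketch below uses the slices-per-node figures from the slide; the function names are hypothetical, and rounding up when the rows do not divide evenly is an assumption rather than documented connector behavior.

```python
# Batch-size arithmetic from slide 34: one batch per slice.
# Slices per node come from the slide (XL = 2, 8XL = 16).

SLICES_PER_NODE = {"XL": 2, "8XL": 16}


def slice_count(node_type, nodes):
    """Total slices in a cluster of identical nodes."""
    return SLICES_PER_NODE[node_type] * nodes


def batch_size(total_rows, node_type, nodes):
    """Rows per batch so the number of batches matches the slice count.

    Uses ceiling division so leftover rows do not create an extra batch
    (an assumption for illustration).
    """
    slices = slice_count(node_type, nodes)
    return -(-total_rows // slices)


print(slice_count("XL", 2))         # 4
print(batch_size(40_000, "XL", 2))  # 10000
```

This matches the slide's example: a 2-node XL cluster has 4 slices, so 40,000 rows are split into 4 batches of 10,000.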
  • 35. Informatica Cloud Amazon Redshift demonstration (diagram labels: Firewall, Informatica Cloud Secure Agent, Metadata Mappings) 1. Authenticate and retrieve Data Synchronization Task 2. Retrieve Account Data 3. Perform lookup on SLA level 4. Put Account Data & SLA Level into Flat File 5. Transfer compressed Flat File 6. Initiate load from Amazon S3 7. Load data into Amazon Redshift
  • 36. PowerCenter Mappings and Informatica Cloud • If you want to reuse your existing PowerCenter mappings with Informatica Cloud and Redshift, you have 2 options: 1. Use the PowerCenter Repository Manager to export your existing workflows and import them into Informatica Cloud using the PowerCenter Tasks feature. 2. Keep your existing mappings in PowerCenter and stage the data, then create a DSS task in Informatica Cloud to move the data to Redshift from the staging area; this task can be managed from PowerCenter.
  • 37. Why Informatica Cloud Integration for Redshift? 1. Map Once, Deploy Anywhere 2. Rapid Connectivity & Deployment 3. Advanced Integration Delivered Easily 4. Excellence in batch and real-time integration InformaticaCloud.com
  • 38. Next Steps • Get started with Amazon Redshift • Get started with Informatica Cloud • InformaticaCloud.com • Learn more about our Redshift Connector • InformaticaCloud.com/Amazon-Redshift
  • 39. Discussion Rahul Pathak, Amazon Redshift Product Management Nicolas Brisoux, Informatica Cloud Platform Adoption Darren Cunningham, Informatica Cloud Marketing @infacloud #redshift InformaticaCloud.com

Editor's notes

  1. Announced Redshift. Provision multiple database nodes on demand. Start large petabyte-scale data warehousing projects sooner. Offload raw data from on-premise databases for cost effective processing.
  2. Use Amazon Redshift for easy scalability. Migrate completely from existing DW to Amazon Redshift. Analyze data that was previously too expensive to put into a DW. Deploy Redshift because provisioning existing DW systems takes months. Replace HIVE with Amazon Redshift if they were using HIVE to save money.
  3. Encryption enhancements
  4. Airbnb: 5x – 20x reduction in query times; 4x reduction in cost over HIVE. Accordant Media: 20x – 40x reduction in query times. Meteor Entertainment: Queries across millions of rows running in < 10s. Nokia: 50% reduction in costs, 2x improvement in query times.
  5. Queries across billions of rows running in < 1 min
  6. Using Amazon Redshift to power its upcoming SkyVault product. Fully managed by Infor to enable customers to run business analytics. Chose Redshift for performance, cost, ease-of-use, and scalability.
  7. Read only the data you need
  8. Read only the data you need
  9. Read only the data you need
  10. Read only the data you need
  11. Read only the data you need
  12. Informatica Cloud is powered by Vibe, the same technology that powers the virtual data machine that runs the secure agent. Thus, you use Informatica Cloud to store the various metadata mappings, and at run time the data moves directly from source to target through the execution of the Vibe Secure Agent.
  13. Vibe is the industry’s first and only embeddable virtual data machine to access, aggregate and manage data – regardless of data type, source, volume, compute platform or user. It lets you map once, and deploy anywhere. So you can take logic that may have been defined on-premise, then move it to the cloud, and then move it to Hadoop or embed it in an application, without recoding. This makes your architecture faster, more flexible, and future-proof. Business benefits: five times faster turnaround from business idea to solution; adapt the technology to your business, not vice versa; utilize all your data, regardless of location, type or volume. IT benefits: five times faster project delivery; eliminate skills gaps for adopting new technologies and approaches; reduce the cost of maintaining a complex assortment of technologies.