© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Building Serverless Analytics on AWS
Ivan Cheng
Solutions Architect
AWS
Steven Hsieh
Engineer
TrendMicro
COLLECT | STORE | PROCESS/ANALYZE | CONSUME
Data Answers
Time to answer (Latency)
Throughput
Cost
Data Processing
START HERE WITH A BUSINESS CASE
To answer new questions quickly, we look to a
modern data architecture design
Massive upfront costs
Overprovisioned capacity
Long implementation times
Pay as you go, for what you use
Decoupled pipelines and engines
Experimentation platform
Ingest/collect | Store | Process/analyze | Consume/visualize
Data Is Changing, Analytics Are Adopting
Capture and store new data at PB-EB scale
Do new types of analytics in a cost-effective way
New types of analytics
• Machine learning
• Big data processing
• Real-time analytics
• Full-text search
More data lakes and analytics than anywhere else
More than 10,000 data lakes on AWS
AWS Analytics Portfolio
Broadest and deepest portfolio, purpose-built for builders
Visualization, Engagement, & Machine Learning: QuickSight | SageMaker | Comprehend | Lex | Polly | Rekognition | Translate | Transcribe | Deep Learning AMIs | + 10 more
Analytics: Redshift | EMR (Spark & Hadoop) | Athena | Elasticsearch Service | Kinesis Data Analytics | Glue (Spark & Python)
Data Lake Infrastructure & Management: S3/Glacier | Glue | Lake Formation
Data Movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Managed Streaming for Kafka
Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Kinesis Video Streams | S3 | Redshift | EMR | Athena | Kinesis | Elasticsearch Service | AI Services | QuickSight
Durable and available; Exabyte scale
Secure, compliant, auditable
Rapid ingest and transformation
Schema on read
Decoupling of compute and storage
On-demand resources, tiering, cost choices
Data Lake Robust Infrastructure
Data Analytics Pipeline
Data sources: Web logs / cookies | ERP | Connected devices
Ingest: Amazon Kinesis | Database Migration Service | AWS Snowball | Amazon MSK
Store: Amazon S3
Catalog: AWS Glue
Process & Analyze: Amazon Athena | Amazon EMR | Amazon Redshift | Amazon Elasticsearch
Consume: BI Tools | Jupyter Notebooks | Amazon API Gateway | Amazon QuickSight
Cloud Services Evolution
Virtual machines | Managed services | Serverless
Serverless analytics
Deliver on-demand analytics on the data lake
Devices | Web | Sensors | Social | Kinesis Data Firehose | S3 Data lake | Glue (ETL & Data Catalog) | Athena | QuickSight | AI/ML
• Serverless. Zero infrastructure. Zero administration
• Never pay for idle resources
• Availability and fault tolerance built in
• Automatically scales resources with usage
Amazon Athena: Interactive Analysis
Interactive query service to analyze data in Amazon S3 using standard SQL
No infrastructure to set up or manage and no data to load
Supports Multiple Data Formats – Define Schema on Demand
Fast. Really Fast.
Interactive performance even for large datasets. Athena automatically executes queries in parallel, so most results come back within seconds.
Start Querying Instantly
Athena is serverless. Just point to your data in Amazon S3, define the schema, and start querying using the built-in query editor.
Open. Powerful. Standard
Amazon Athena uses Presto with ANSI SQL support and works with a variety of standard data formats, including CSV, JSON, ORC, Avro, and Parquet.
Pay Per Query
With Amazon Athena, you pay only for the queries that you run. You are charged $5 per terabyte scanned by your queries.
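The pay-per-query rate above can be turned into a rough estimate. The sketch below is only illustrative: the function name is hypothetical, a terabyte is assumed to be 10^12 bytes, and any per-query minimum or rounding that the service may apply is ignored.

```python
# Rough Athena cost estimate from the $5 per terabyte rate quoted on the slide.
# Assumption: 1 TB = 10**12 bytes; minimum charges and rounding are ignored.
BYTES_PER_TB = 10**12
PRICE_PER_TB_USD = 5.0


def athena_query_cost(bytes_scanned: int, price_per_tb: float = PRICE_PER_TB_USD) -> float:
    """Return the estimated charge in USD for a query that scanned bytes_scanned bytes."""
    return bytes_scanned / BYTES_PER_TB * price_per_tb


if __name__ == "__main__":
    # A query that scans 200 GB of data.
    print(f"${athena_query_cost(200 * 10**9):.2f}")
```

Because the charge follows bytes scanned, anything that reduces the scan (compression, columnar formats such as Parquet or ORC, partitioning) lowers the estimate in proportion.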
Amazon S3 Amazon Athena
Data catalog
Data Engineer Data Consumer
AWS Tools and SDKs
AWS Management Console
Amazon QuickSight
Amazon SageMaker
User
Analyst
Data Scientist
Data consumption – Automated Reporting
athena.startQueryExecution("SELECT * FROM business_view")
1. Schedule query
2. Track QueryID for status
3. Query results to Amazon S3
4. New file trigger
5. Job complete notification
Email notification
Query_ID
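Steps 1 to 3 of the flow above can be sketched in Python. This is a minimal sketch, not the presenters' implementation: `run_report` is a hypothetical helper, and the `athena` argument is assumed to be a boto3 Athena client (for example `boto3.client("athena")`). The S3 file trigger and the notification (steps 4 and 5) are left to S3 event configuration and are not shown.

```python
import time


def run_report(athena, sql, output_location, poll_seconds=2.0, max_polls=150):
    """Start an Athena query, track its query ID, and return (query_id, result_path).

    athena is assumed to be a boto3 Athena client; output_location is an
    s3:// prefix where Athena writes the result file.
    """
    started = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll the execution status until the query reaches a terminal state.
    for _ in range(max_polls):
        execution = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]
        state = execution["Status"]["State"]
        if state == "SUCCEEDED":
            return query_id, execution["ResultConfiguration"]["OutputLocation"]
        if state in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"query {query_id} ended in state {state}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"query {query_id} did not finish after {max_polls} polls")
```

A scheduled job could call `run_report(boto3.client("athena"), "SELECT * FROM business_view", "s3://<your-bucket>/reports/")`, where the bucket is a placeholder.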
Athena Workgroups
Athena Workgroups are used to isolate queries
between different teams, workloads, or applications,
and to set limits on the amount of data each query or the
entire workgroup can process
Workload Isolation | Query Metrics | Cost Controls
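As an illustration of the per-query limit, the sketch below builds the request for the boto3 Athena `create_work_group` call. The helper name, workgroup name, bucket, and limit value are all hypothetical, and workgroup-wide limits are not covered here.

```python
def workgroup_request(name, output_location, per_query_limit_bytes):
    """Build keyword arguments for the boto3 Athena create_work_group call.

    The workgroup isolates one team's queries, publishes query metrics to
    CloudWatch, and cancels any query that scans more than the given limit.
    """
    return {
        "Name": name,
        "Configuration": {
            "ResultConfiguration": {"OutputLocation": output_location},
            # Force queries in this workgroup to use the settings above.
            "EnforceWorkGroupConfiguration": True,
            "PublishCloudWatchMetricsEnabled": True,
            "BytesScannedCutoffPerQuery": per_query_limit_bytes,
        },
    }


if __name__ == "__main__":
    # Hypothetical values: a 10 GB per-query limit for a "team-a" workgroup.
    request = workgroup_request("team-a", "s3://example-bucket/athena/", 10 * 1000**3)
    print(request)
    # With AWS credentials available, the request could be sent with:
    #   boto3.client("athena").create_work_group(**request)
```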
Visualize your data with your favorite tools
Featured Athena Partners
Amazon QuickSight
Why QuickSight
Scalable
From 10 users to 10,000, QuickSight seamlessly grows
with you with no need for additional servers or
infrastructure.
No Servers to Manage
QuickSight is a fully managed cloud service. There is
no infrastructure to maintain or upgrade and no
upfront costs.
Fully integrated
QuickSight integrates with your other AWS services
and data sources giving you everything you need to
build an end-to-end cloud analytics solution.
Pay For What You Use
Instead of buying costly licenses for all of your users,
QuickSight allows you to share dashboards and reports
and only pay when users access them.
Connect to your data, wherever it is
QuickSight allows you to connect to AWS data sources, private VPC subnets, on-premises and
hosted databases, and third-party business applications.
On-premises
Securely connect to on-premises databases and flat files like Excel and CSV
• Excel
• CSV
• Teradata
• MySQL
• SQL Server
• PostgreSQL
In the cloud
Connect to hosted databases, big data formats, and secure VPCs
• Redshift
• RDS
• S3
• Athena
• Aurora
• Teradata
• MySQL
• Presto
• Spark
• SQL Server
• PostgreSQL
• MariaDB
• Snowflake
• IoT Analytics
Applications
Connect directly to third-party business applications
• Salesforce
• Square
• Adobe Analytics
• Jira
• ServiceNow
• Twitter
• GitHub
Demo Scenario
Amazon S3 (Processed Data) | Glue Data Catalog | Amazon Athena | Amazon QuickSight
Building AWS Multi Account Cost
Analytics Solution at Scale
Steven Hsieh
Engineer
TrendMicro
About Me
Steven Hsieh
Background
Pillars of
Design Principles for Cost Optimization
• Adopt a consumption model
• Measure overall efficiency
• Stop spending money on data center operations
• Analyze and attribute expenditure
• Use managed services to reduce cost of ownership
Pay as you go / need
Challenges
Large Scale Accounts
• Almost 400 accounts
• Hard to manage via the AWS console
Multiple Data Sources
• Billing data
• Utilization data of AWS services (e.g., EC2, S3)
Challenges
Permission Management
• Multiple teams
• Authorization for different teams
Insight for Better Design
• Finding insights for design improvement
• Providing utilization visibility for design changes
Other solutions we have tried…
AWS Billing Console
• Hard to use at large scale
• Single data source
Amazon Redshift
• Cost model
• ETL
3rd-Party BI Tool
• Expensive license fee
• Additional operation cost
Ideas
• Data persistence in Amazon S3
• Data querying via Amazon Athena
• Dashboard / Reporting via Amazon QuickSight
Challenges
Global Accelerator
• Using SQS to trigger parallel tasks
• Lambda limitations:
  • Timeout: 15 minutes
  • /tmp: 512 MB
• Spot instance interruptions
• Fargate limitations:
  • Container storage: 10 GB
  • Run-task: 10
• Using assume role to collect data across accounts
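The cross-account collection in the last bullet can be sketched with the STS `assume_role` call. This is an assumption about how such a collector might look, not the presenters' code: the helper name, the role name, and the session name are hypothetical, and `sts` is assumed to be a boto3 STS client.

```python
def session_credentials(sts, account_id, role_name, session_name="cost-collector"):
    """Assume a role in another account and return keyword arguments for boto3.client.

    sts is assumed to be a boto3 STS client; role_name is a role that the
    target account has created and that trusts the collector account.
    """
    role_arn = f"arn:aws:iam::{account_id}:role/{role_name}"
    response = sts.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    creds = response["Credentials"]
    # Temporary credentials, named the way boto3.client expects them.
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

A worker triggered by an SQS message carrying one account ID could then create a client for that account, for example `boto3.client("cloudwatch", **session_credentials(boto3.client("sts"), account_id, "<role-name>"))`, so that each account is processed as an independent parallel task.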
• Using SNS to track data upload results
• Preprocessing data before uploading to S3
• Only the creator can modify datasets in QuickSight
• Create views in Athena
Global Accelerator
• Web application hosted in Fargate
• Lambda integration with QuickSight for the embedded URL
• Using ALB to handle all HTTPS interaction
• Permissions & metadata in DynamoDB
• ADFS federation using Cognito
• Performance improvement via AWS Global Accelerator
• Web security enhancement via AWS WAF
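The Lambda integration for the embedded URL might look roughly like the sketch below. It assumes the QuickSight `get_dashboard_embed_url` API with IAM identity; the helper name and the default session length are hypothetical, and `quicksight` is assumed to be a boto3 QuickSight client running under a role that is allowed to request embed URLs.

```python
def dashboard_embed_url(quicksight, account_id, dashboard_id, session_minutes=60):
    """Return a short-lived URL that lets a web page embed a QuickSight dashboard.

    quicksight is assumed to be a boto3 QuickSight client; the caller's IAM
    identity decides which QuickSight user the session is mapped to.
    """
    response = quicksight.get_dashboard_embed_url(
        AwsAccountId=account_id,
        DashboardId=dashboard_id,
        IdentityType="IAM",
        SessionLifetimeInMinutes=session_minutes,
    )
    return response["EmbedUrl"]
```

In a setup like the one on the slide, the web application would call the Lambda function after checking the user's permissions, then load the returned URL in an iframe.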
Quick Development & Evaluation
Low Utilization & Right Sizing
• Trusted Advisor checks
  • Low-utilization EC2 instances: CPU was 10% or less and network I/O was 5 MB or less on 4 or more days during the last 14 days
• Right sizing
  • Analyze metric data to recommend the proper instance type and size
  • Awareness of NIC driver and Linux virtualization type issues
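The low-utilization rule quoted above can be written as a small check over daily metrics. This is a sketch under stated assumptions: `DailyUsage` and `is_low_utilization` are hypothetical names, and the daily CPU and network figures are assumed to have been aggregated already (for example from CloudWatch).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DailyUsage:
    cpu_percent: float  # daily CPU utilization, in percent
    network_mb: float   # daily network I/O, in MB


def is_low_utilization(
    days: Sequence[DailyUsage],
    cpu_limit: float = 10.0,
    network_limit_mb: float = 5.0,
    min_days: int = 4,
    window: int = 14,
) -> bool:
    """Apply the rule: low CPU and low network I/O on min_days or more of the last window days."""
    recent = days[-window:]
    low_days = sum(
        1
        for day in recent
        if day.cpu_percent <= cpu_limit and day.network_mb <= network_limit_mb
    )
    return low_days >= min_days


if __name__ == "__main__":
    busy = DailyUsage(cpu_percent=55.0, network_mb=900.0)
    idle = DailyUsage(cpu_percent=3.0, network_mb=1.2)
    # Ten busy days followed by four idle days within the 14-day window.
    print(is_low_utilization([busy] * 10 + [idle] * 4))
```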
Saving Polar Bear
• Analyzing the CPU utilization pattern
• Turning off non-production instances can save almost 70% of the cost
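One way to see where a figure near 70% can come from is to compare scheduled running hours with an always-on week. The schedule below (10 hours a day, 5 days a week) is an assumption for illustration, not a number from the talk, and the sketch treats cost as proportional to running hours, which ignores storage and other fixed charges.

```python
HOURS_PER_WEEK = 24 * 7


def schedule_savings(hours_per_day: float, days_per_week: int) -> float:
    """Fraction of always-on instance hours saved by running only on a schedule."""
    running_hours = hours_per_day * days_per_week
    return 1 - running_hours / HOURS_PER_WEEK


if __name__ == "__main__":
    # Assumed schedule: non-production instances run 10 hours on 5 weekdays.
    print(f"{schedule_savings(10, 5):.1%}")
```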
Recap
• Using a cost-effective way to build an end-to-end BI solution
  • 2 power users $36 + ALB $18 = $54
• Using a flexible reporting architecture to integrate with multiple data sources
• Quick wins & timely data-driven decisions
  • Validating innovative ideas (e.g., the potential savings of the polar bear project)
Summary
• More organizations are building data lakes on the cloud to stay competitive.
• AWS provides the broadest and deepest portfolio of databases and
analytics services, including machine learning.
• Serverless analytics helps you build modern data pipelines with
increased agility and lower cost.
• Learn more at: https://aws.amazon.com/big-data/datalakes-and-analytics/
Thank you!
Ivan Cheng
Solutions Architect
AWS
Steven Hsieh
Engineer
TrendMicro

 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Building Serverless Analytics on AWS

  • 1.
  • 2. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Building Serverless Analytics on AWS Ivan Cheng Solutions Architect AWS Steven Hsieh Engineer TrendMicro
  • 3. Collect, store, process/analyze, consume. Data in, answers out. Time to answer (latency), throughput, cost. Data processing: start here, with a business case.
  • 4. To answer new questions quickly, we look to a modern data architecture design. Massive upfront costs, overprovisioned capacity, long implementation times. Pay as you go, for what you use; decoupled pipelines and engines; experimentation platform. Stages: ingest/collect, store, process/analyze, consume/visualize.
  • 5. Data is changing, analytics are adapting. Capture and store new data at PB-EB scale. Run new types of analytics in a cost-effective way: machine learning, big data processing, real-time analytics, full-text search.
  • 6. More data lakes and analytics than anywhere else. More than 10,000 data lakes on AWS.
  • 7. AWS Analytics Portfolio: broadest and deepest portfolio, purpose-built for builders; + 10 more. Analytics: Redshift, EMR (Spark & Hadoop), Athena, Elasticsearch Service, Kinesis Data Analytics, Glue (Spark & Python). Data lake infrastructure & management: S3/Glacier, Glue, Lake Formation. Visualization, engagement, & machine learning: QuickSight, SageMaker, Comprehend, Lex, Polly, Rekognition, Translate, Transcribe, Deep Learning AMIs. Data movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Managed Streaming for Kafka.
  • 8. Data lake on robust infrastructure. Services shown: Snowball, Snowmobile, Kinesis Data Firehose, Kinesis Data Streams, S3, Redshift, EMR, Athena, Kinesis, Elasticsearch Service, Kinesis Video Streams, AI Services, QuickSight. Properties: durable and available, exabyte scale; secure, compliant, auditable; rapid ingest and transformation; schema on read; decoupling of compute and storage; on-demand resources, tiering, cost choices.
  • 9. Data analytics pipeline. Data sources: web logs / cookies, ERP, connected devices. Ingest: Amazon Kinesis, Database Migration Service, AWS Snowball, Amazon MSK. Store: Amazon S3. Catalog: AWS Glue. Process & analyze: Amazon Athena, Amazon EMR, Amazon Redshift, Amazon Elasticsearch. Consume: BI tools, Jupyter notebooks, Amazon API Gateway, Amazon QuickSight.
  • 10. Cloud services evolution: virtual machines, managed services, serverless.
  • 11. Serverless analytics: deliver on-demand analytics on the data lake. Sources: devices, web, sensors, social. Services shown: Kinesis Data Firehose, S3 data lake, Glue (ETL & Data Catalog), Athena, QuickSight, AI/ML. Serverless: zero infrastructure, zero administration. Never pay for idle resources. Availability and fault tolerance built in. Automatically scales resources with usage.
  • 12. Amazon Athena: interactive analysis. Interactive query service to analyze data in Amazon S3 using standard SQL. No infrastructure to set up or manage and no data to load. Supports multiple data formats; define schema on demand. Start querying instantly: Athena is serverless. Just point to your data in Amazon S3, define the schema, and start querying using the built-in query editor. Fast, really fast: interactive performance even for large datasets. Athena automatically executes queries in parallel, so most results come back within seconds. Open, powerful, standard: Amazon Athena uses Presto with ANSI SQL support and works with a variety of standard data formats, including CSV, JSON, ORC, Avro, and Parquet. Pay per query: with Amazon Athena, you pay only for the queries that you run. You are charged $5 per terabyte scanned by your queries.
  • 13. Amazon S3, Amazon Athena, data catalog. Data engineer, data consumer. AWS Tools and SDKs, AWS Management Console, Amazon QuickSight, Amazon SageMaker. User, analyst, data scientist.
  • 14. Data consumption: automated reporting. athena.startQueryExecution("SELECT * FROM business_view") 1. Schedule query 2. Track QueryID for status 3. Query results to Amazon S3 4. New file trigger 5. Job complete notification (email notification with Query_ID)
  • 15. Athena Workgroups are used to isolate queries between different teams, workloads, or applications, and to set limits on the amount of data each query or the entire workgroup can process. Workload isolation, query metrics, cost controls.
  • 16. Visualize your data with your favorite tools. Featured Athena partners. Amazon QuickSight.
  • 17. Why QuickSight. Scalable: from 10 users to 10,000, QuickSight seamlessly grows with you with no need for additional servers or infrastructure. No servers to manage: QuickSight is a fully managed cloud service. There is no infrastructure to maintain or upgrade and no upfront costs. Fully integrated: QuickSight integrates with your other AWS services and data sources, giving you everything you need to build an end-to-end cloud analytics solution. Pay for what you use: instead of buying costly licenses for all of your users, QuickSight allows you to share dashboards and reports and only pay when users access them.
  • 18. Connect to your data, wherever it is. QuickSight allows you to connect to AWS data sources, private VPC subnets, on-premises and hosted databases, and third-party business applications. On-premises: securely connect to on-premises databases and flat files like Excel and CSV (Excel, CSV, Teradata, MySQL, SQL Server, PostgreSQL). In the cloud: connect to hosted databases, big data formats, and secure VPCs (Redshift, RDS, S3, Athena, Aurora, Teradata, MySQL, Presto, Spark, SQL Server, PostgreSQL, MariaDB, Snowflake, IoT Analytics). Applications: connect directly to third-party business applications (Salesforce, Square, Adobe Analytics, Jira, ServiceNow, Twitter, GitHub).
  • 19.
  • 20. Demo scenario: Amazon S3 (processed data), Glue Data Catalog, Amazon Athena, Amazon QuickSight.
  • 21. Building AWS Multi Account Cost Analytics Solution at Scale. Steven Hsieh, Engineer, TrendMicro
  • 22.
  • 23. About Me: Steven Hsieh
  • 24. Background
  • 25. Pillars of
  • 26. Design Principles for Cost Optimization • Adopt a consumption model (pay as you go / need) • Measure overall efficiency • Stop spending money on data center operations • Analyze and attribute expenditure • Use managed services to reduce cost of ownership
  • 27. Challenges. Large number of accounts • Almost 400 accounts • Hard to manage via the AWS console. Multiple data sources • Billing data • Utilization data of AWS services (e.g., EC2, S3)
  • 28. Challenges. Permission management • Multiple teams • Authorization for different teams. Insight for better design • Finding insights for design improvement • Providing utilization visibility for design changes
  • 29. Other solutions we have tried… AWS Billing Console • Hard to use at large scale • Single data source. Amazon Redshift • Cost model • ETL. Third-party BI tool • Expensive license fee • Additional operation cost
  • 30. Ideas • Data persistence in Amazon S3 • Data querying via Amazon Athena • Dashboard / reporting via Amazon QuickSight
  • 31. Challenges
  • 32. Global Accelerator
  • 33. • Using SQS to trigger parallel tasks • Lambda limitations: timeout 15 minutes; /tmp 512 MB • Spot instance interruptions • Fargate limitations: container storage 10 GB; run-task 10 • Using assume role to collect data across accounts
  • 34. • Using SNS to trace data upload results • Preprocessing data before uploading to S3 • Only the creator can modify datasets in QuickSight • Create views in Athena
  • 35. Global Accelerator • Web application hosted on Fargate • Lambda integration with QuickSight for the embedded URL • Using ALB to handle all HTTPS interaction • Permissions & metadata in DynamoDB • ADFS federation using Cognito • Performance improvement via AWS Global Accelerator • Web security enhancement via AWS WAF
  • 36. Quick Development & Evaluation
  • 37. Low Utilization & Right Sizing • Trusted Advisor checks • Low-utilization EC2 instances: CPU was 10% or less and network I/O was 5 MB or less on 4 or more days during the last 14 days • Right sizing • Analyze metric data to recommend the proper instance type and size • Awareness of NIC driver and Linux virtualization type issues
  • 38. Saving Polar Bear • Analyzing the CPU utilization pattern • Turning off non-production instances can save almost 70% of the cost
  • 39. Recap • A cost-effective way to build an end-to-end BI solution • 2 power users $36 + ALB $18 = $54 • A flexible reporting architecture that integrates with multiple data sources • Quick wins & timely data-driven decisions • Validating innovation ideas (e.g., the potential saving of the polar bear project)
  • 40. Summary • More organizations are building data lakes in the cloud to stay competitive • AWS provides the broadest and deepest portfolio of database and analytics services, including machine learning • Serverless analytics helps you build a modern data pipeline with increased agility and lower cost • Learn more at: https://aws.amazon.com/big-data/datalakes-and-analytics/
  • 41. Thank you! Ivan Cheng, Solutions Architect, AWS. Steven Hsieh, Engineer, TrendMicro
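
Slide 12 quotes Athena pricing of $5 per terabyte scanned. As a rough illustration of what that means per query, the sketch below converts bytes scanned into dollars. It assumes a binary terabyte (1024^4 bytes) and ignores any minimum charge or rounding the service may apply; the function name is hypothetical.

```python
BYTES_PER_TB = 1024 ** 4      # assumption: binary terabyte
PRICE_PER_TB_USD = 5.0        # rate quoted on slide 12


def athena_query_cost(bytes_scanned: int,
                      price_per_tb: float = PRICE_PER_TB_USD) -> float:
    """Estimate the cost in USD of one query from the bytes it scanned."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    return bytes_scanned / BYTES_PER_TB * price_per_tb


if __name__ == "__main__":
    # A query that scans 200 GiB of data.
    print(f"${athena_query_cost(200 * 1024 ** 3):.4f}")
```

Because the charge follows bytes scanned, columnar formats such as Parquet or ORC (both listed on the slide) typically lower the bill by letting a query read only the columns it needs.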
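
Steps 1 to 3 of the automated reporting flow on slide 14 (start a query, track its ID, let results land in Amazon S3) might be sketched as follows. This is a minimal sketch, not the presenters' implementation: the client is passed in so that a `boto3.client("athena")` object can be used, and the function name, polling interval, and output location are assumptions. Steps 4 and 5 (the S3 new-file trigger and the notification) are not shown.

```python
import time
from typing import Any, Callable


def run_report_query(athena: Any, sql: str, output_location: str,
                     poll_seconds: float = 2.0, max_polls: int = 150,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Start an Athena query and poll until it finishes; return its ID."""
    started = athena.start_query_execution(
        QueryString=sql,
        # Athena writes the result file under this S3 prefix.
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    for _ in range(max_polls):
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state == "SUCCEEDED":
            return query_id
        if state in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"query {query_id} ended in state {state}")
        sleep(poll_seconds)  # still QUEUED or RUNNING

    raise TimeoutError(f"query {query_id} did not finish in time")
```

With boto3 installed and credentials configured, a call could look like `run_report_query(boto3.client("athena"), "SELECT * FROM business_view", "s3://my-report-bucket/results/")`, where the bucket name is a placeholder.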
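
Slide 33 mentions using assume role to collect data across accounts. One way this could look is sketched below, assuming each member account exposes an IAM role with the same name. The role name `CostCollectorRole`, the session name, and the function name are hypothetical; the STS client is passed in so that a `boto3.client("sts")` object can be used.

```python
from typing import Any, Dict


def assume_account_role(sts: Any, account_id: str,
                        role_name: str = "CostCollectorRole",   # hypothetical
                        session_name: str = "cost-analytics") -> Dict[str, str]:
    """Return temporary credentials for a role in another account.

    The returned keys match the keyword arguments that boto3.client()
    accepts for explicit credentials.
    """
    role_arn = f"arn:aws:iam::{account_id}:role/{role_name}"
    response = sts.assume_role(RoleArn=role_arn,
                               RoleSessionName=session_name)
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

A collector could then build a per-account client with `boto3.client("s3", **assume_account_role(boto3.client("sts"), account_id))` and repeat that for each account it covers.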
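
The low-utilization rule quoted on slide 37 (CPU at 10% or less and network I/O at 5 MB or less on 4 or more of the last 14 days) can be written as a small predicate. This sketch assumes one record per day, with `cpu_percent` and `network_io_mb` being whatever daily figures the metric source provides; the slide does not say how they are aggregated, and the type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DailyUsage:
    cpu_percent: float     # daily CPU utilization figure, in percent
    network_io_mb: float   # daily network I/O, in MB


def is_low_utilization(days: Sequence[DailyUsage],
                       cpu_max: float = 10.0,
                       network_max_mb: float = 5.0,
                       min_low_days: int = 4,
                       window: int = 14) -> bool:
    """Apply the slide's rule to the most recent `window` days (oldest first)."""
    recent = days[-window:]
    low_days = sum(
        1 for day in recent
        if day.cpu_percent <= cpu_max and day.network_io_mb <= network_max_mb
    )
    return low_days >= min_low_days
```

Both conditions must hold on the same day for it to count as a low day, which is how the wording on the slide reads.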
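
Slide 38 states that turning off non-production instances can save almost 70% of the cost. The slide does not give the schedule behind that figure. As a plausibility check only, the sketch below assumes cost is proportional to running hours and uses a hypothetical schedule of 10 hours a day on 5 days a week; storage and other charges that continue while an instance is stopped are ignored.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def weekly_saving_fraction(on_hours_per_day: float,
                           on_days_per_week: int) -> float:
    """Fraction of always-on cost saved by running only on a schedule."""
    on_hours = on_hours_per_day * on_days_per_week
    if not 0 <= on_hours <= HOURS_PER_WEEK:
        raise ValueError("schedule must fit within one week")
    return 1 - on_hours / HOURS_PER_WEEK


if __name__ == "__main__":
    # Hypothetical schedule: 10 hours a day, Monday to Friday.
    print(f"{weekly_saving_fraction(10, 5):.1%}")
```

Under that assumed schedule an instance runs 50 of 168 hours, so the saving comes out at roughly 70%, in line with the figure on the slide.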