© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Specialist Solutions Architect, Data and Analytics, EMEA
July 5th, 2017
Full Stack Analytics on AWS
Ian Robinson
Forces and Trends
Cost Optimization
Licenses
Hardware
Data center and operations
Dark Data
Prematurely discarding data
Agility
Experimentation (data & tools)
Democratised Access to Data
Time-to-first-results
Terminate failed experiments early
From BI to Data Science
In-house data science
From back office to product
Storage is the Gravity for Cloud Applications
Storage is Job #1
Foundations: Storage, Discovery and Lifecycle
Secure, governed, scalable, cheap
Storage & Catalog
Secure, cost-effective storage in Amazon
S3. Robust metadata in AWS Catalog
Amazon EFS
File
Amazon EBS
Amazon EC2
Instance Store
Block
Amazon S3 Amazon Glacier
Object
Data Transfer
AWS Direct
Connect
AWS
Snowball
ISV Connectors Amazon
Kinesis
Firehose
S3 Transfer
Acceleration
Storage
Gateway
AWS Storage Platforms
Amazon S3 Amazon Glacier
Object
Object Storage is Foundational
EC2 Lambda EMR
Data
Pipeline
Kinesis
CloudFront
RDS DynamoDB Redshift
Database
Analytics
Compute
Elastic
Transcoder
Content Delivery
S3 Data Lifecycle and Events
Standard: active data
Standard - Infrequent Access: infrequently accessed data
Amazon Glacier: archive data
Create
Delete
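The create, tier and delete stages above can be expressed as an S3 lifecycle configuration. The following is a minimal sketch, not a recommended policy: the day thresholds, the `logs/` prefix and the helper names are assumptions made for illustration. The dictionary follows the shape accepted by boto3's `put_bucket_lifecycle_configuration`.

```python
import json


def build_lifecycle_rules(prefix, ia_after_days=30, glacier_after_days=90,
                          delete_after_days=365):
    """Return a lifecycle configuration that tiers, then expires, objects.

    The thresholds are illustrative assumptions, not values from the deck.
    """
    if not ia_after_days < glacier_after_days < delete_after_days:
        raise ValueError("expected ia < glacier < delete day thresholds")
    return {
        "Rules": [{
            "ID": "tier-and-expire-" + (prefix.strip("/") or "all"),
            "Filter": {"Prefix": prefix},
            "Status": "Enabled",
            # Standard -> Standard - Infrequent Access -> Glacier
            "Transitions": [
                {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                {"Days": glacier_after_days, "StorageClass": "GLACIER"},
            ],
            # Final stage of the lifecycle: delete
            "Expiration": {"Days": delete_after_days},
        }]
    }


def apply_lifecycle(bucket, config):
    """Apply the configuration; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder runs without boto3
    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config)


if __name__ == "__main__":
    print(json.dumps(build_lifecycle_rules("logs/"), indent=2))
```

Keeping the rule builder separate from the API call makes the rule easy to review before it is applied to a bucket.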
Data Catalog
Scalable (secure, versioned, durable) storage +
Immutable data at every stage of its lifecycle +
Versioned schema and metadata
=
Data discovery, lineage and governance
AWS Glue: Components
Data Catalog
Crawl, store, search metadata in different data stores
Populate in a Hive metastore compliant catalog
Job Execution
Fully managed orchestration & execution of ETL jobs
Serverless execution model – no need to pre-provision
resources
Job Authoring
Author, edit, share ETL jobs using your favorite tools
Store, share, re-use ETL code/script with Git integration
Manage table metadata through a Hive
metastore API or Hive SQL. Supported by
tools such as Hive, Presto, Spark, etc.
We added a few extensions:
• Search metadata for data discovery
• Connection info – JDBC URLs, credentials
• Classification for identifying and parsing files
• Versioning of table metadata as schemas
evolve and other metadata are updated
Populate using Hive DDL, bulk import, or
automatically through crawlers.
Glue Data Catalog
[Diagram: a crawler enumerates S3 objects (file 1, file 2, … file N), identifies each file's type and parses it into a semi-structured per-file schema (nested int, char, bool, array and struct fields), then combines the per-file schemas into a semi-structured unified schema]
custom classifiers
app log parser
metrics parser
…
system classifiers
JSON parser
CSV parser
Apache log parser
…
Crawlers: Automatic Schema Inference
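To make the per-file schema and unified schema idea concrete, here is a toy sketch of schema inference over parsed JSON documents. It is an illustration only and does not claim to reproduce how AWS Glue crawlers work: the type names, the merge rules and the function names are all assumptions.

```python
def infer_schema(value):
    """Map a parsed JSON value to a simple type description."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "char"
    if isinstance(value, list):
        element = None
        for item in value:
            element = merge_schemas(element, infer_schema(item))
        return {"array": element}
    if isinstance(value, dict):
        return {"struct": {k: infer_schema(v) for k, v in value.items()}}
    return "null"


def merge_schemas(a, b):
    """Combine two schemas into one that can describe both inputs."""
    if a is None or a == "null":
        return b
    if b is None or b == "null":
        return a
    if a == b:
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        if "struct" in a and "struct" in b:
            # Union of fields; fields present in both are merged recursively
            fields = dict(a["struct"])
            for name, schema in b["struct"].items():
                fields[name] = merge_schemas(fields.get(name), schema)
            return {"struct": fields}
        if "array" in a and "array" in b:
            return {"array": merge_schemas(a["array"], b["array"])}
    if isinstance(a, str) and isinstance(b, str) and {a, b} == {"int", "double"}:
        return "double"  # widen int to double
    return "char"  # assumed fallback: conflicting types become strings


def unified_schema(documents):
    """Infer a per-document schema, then fold them into one schema."""
    unified = None
    for document in documents:
        unified = merge_schemas(unified, infer_schema(document))
    return unified
```

For example, `unified_schema([{"id": 1, "tags": ["a"]}, {"id": 2, "active": True}])` yields a struct with the fields `id`, `tags` and `active`.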
Security is Job #0
Data Access & Authorisation
Give your users easy and secure access
Storage & Catalog
Secure, cost-effective storage in Amazon
S3. Robust metadata in AWS Catalog
Protect and Secure
Use entitlements to ensure data is secure and users’ identities are verified
AWS implements security at the data level,
not tool-by-tool
IAM
Amazon
S3
Amazon
ElastiCache
Amazon
DynamoDB
Amazon
EMR
Amazon
Kinesis
Amazon
Athena
Service API Access
Third Party Ecosystem Security Tools
Amazon
S3
AWS
CloudTrail
http://amzn.to/2tSimHj
Amazon
Athena
Access Logging
API Logging
Access Log
Analytics
IAM
Amazon
EMR
http://amzn.to/2si6RqS
+ storage level support for access logging and audit
Additional S3 Security Practices
Use S3 bucket policies:
• Restrict access by IP
address
• Restrict deletes
• Enforce encryption use
Restrict deletes to require
MFA Authentication
Use Versioning!!!
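The bucket policy practices above (restrict by IP address, enforce encryption, require MFA for deletes) could look roughly like the policy document built below. This is a sketch under assumptions: the bucket name, the CIDR range and the choice of `aws:kms` as the required encryption are illustrative, and a deny-by-IP statement this broad can lock out legitimate callers, so test it on a non-production bucket first.

```python
import json


def build_bucket_policy(bucket, allowed_cidrs):
    """Return a bucket policy with three Deny statements."""
    bucket_arn = "arn:aws:s3:::" + bucket
    objects_arn = bucket_arn + "/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Restrict access by IP address
                "Sid": "DenyOutsideAllowedNetworks",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, objects_arn],
                "Condition": {
                    "NotIpAddress": {"aws:SourceIp": list(allowed_cidrs)}
                },
            },
            {   # Enforce encryption use on upload
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": objects_arn,
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            },
            {   # Restrict deletes to MFA-authenticated requests
                "Sid": "DenyDeleteWithoutMFA",
                "Effect": "Deny",
                "Principal": "*",
                "Action": ["s3:DeleteObject", "s3:DeleteObjectVersion"],
                "Resource": objects_arn,
                "Condition": {"Null": {"aws:MultiFactorAuthAge": "true"}},
            },
        ],
    }


if __name__ == "__main__":
    # "example-data-lake" and 203.0.113.0/24 are placeholder values
    policy = build_bucket_policy("example-data-lake", ["203.0.113.0/24"])
    print(json.dumps(policy, indent=2))
```

The resulting JSON can be attached with the S3 `put_bucket_policy` call or through the console.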
AWS Server-Side encryption
AWS managed key infrastructure
AWS Key Management Service
Automated key rotation & auditing
Integration with other AWS services
AWS CloudHSM
Dedicated Tenancy SafeNet Luna SA HSM Device
Common Criteria EAL4+, NIST FIPS 140-2
Encryption Options
Extensible and Hybrid Crypto Integration for AWS Services
class myCrypt implements EncryptionMaterialsProvider
Amazon
Redshift
On Premises
HSM
Kinesis Firehose
Data Access & Authorisation
Give your users easy and secure access
Data Ingestion
Get your data into S3
quickly and securely
Storage & Catalog
Secure, cost-effective storage in Amazon
S3. Robust metadata in AWS Catalog
Protect and Secure
Use entitlements to ensure data is secure and users’ identities are verified
Data Ingestion into S3
S3 Transfer Acceleration
S3 Bucket
AWS Edge
Location
Uploader
Optimized
Throughput!
Typically 50%-400% faster
Change your endpoint, not your code
No firewall exceptions or client
software required
59 global edge locations
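"Change your endpoint, not your code" can be illustrated with a short sketch. It assumes Transfer Acceleration is already enabled on the bucket; the function names are hypothetical. The accelerate hostname pattern and the `use_accelerate_endpoint` client option are the standard ones.

```python
def accelerate_endpoint(bucket, dualstack=False):
    """Return the Transfer Acceleration endpoint URL for a bucket."""
    if "." in bucket:
        # Acceleration requires bucket names without periods
        raise ValueError("bucket names containing '.' cannot be accelerated")
    suffix = ("s3-accelerate.dualstack.amazonaws.com" if dualstack
              else "s3-accelerate.amazonaws.com")
    return "https://{}.{}".format(bucket, suffix)


def accelerated_client():
    """Return an S3 client that uses the accelerate endpoint.

    Needs boto3 and AWS credentials; imported lazily so the URL helper
    above works without boto3 installed.
    """
    import boto3
    from botocore.config import Config
    return boto3.client(
        "s3", config=Config(s3={"use_accelerate_endpoint": True}))
```

Upload code written against a regular S3 client should work unchanged with the client returned by `accelerated_client()`.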
[Chart: time in hours for a 500 GB upload to a bucket in Singapore from edge locations in Rio de Janeiro, Warsaw, New York, Atlanta, Madrid, Virginia, Melbourne, Paris, Los Angeles, Seattle, Tokyo and Singapore, comparing the public internet with S3 Transfer Acceleration]
How Fast is S3 Transfer Acceleration?
S3 Transfer Acceleration
Stream Events to S3 Using Kinesis Firehose
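A minimal sketch of sending one event to a Firehose delivery stream that lands in S3. The delivery stream name and the event fields are assumptions; the newline terminator is a common convention so that the objects written to S3 contain one JSON document per line.

```python
import json


def to_firehose_record(event):
    """Serialise an event as newline-terminated JSON bytes."""
    payload = json.dumps(event, sort_keys=True) + "\n"
    return {"Data": payload.encode("utf-8")}


def send_event(stream_name, event):
    """Send one record; needs boto3, credentials and an existing stream."""
    import boto3  # imported lazily so the serialiser runs without boto3
    return boto3.client("firehose").put_record(
        DeliveryStreamName=stream_name,
        Record=to_firehose_record(event))


if __name__ == "__main__":
    # Hypothetical click event, printed rather than sent
    print(to_firehose_record({"user": "u-1", "action": "click"}))
```

For higher volumes, `put_record_batch` takes a list of the same record dictionaries.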
Write Database Changes to S3 with DMS
<schema_name>/<table_name>/LOAD001.csv
<schema_name>/<table_name>/LOAD002.csv
<schema_name>/<table_name>/<time-stamp>.csv
Full Load
Change Data Capture
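Downstream jobs often need to tell full-load files from change data capture files. The sketch below classifies S3 keys using the naming pattern shown above; the regular expressions, the function name and the sample keys are assumptions, and the exact file names DMS produces may differ from the slide's simplified form.

```python
import re

# <schema_name>/<table_name>/LOAD001.csv
FULL_LOAD = re.compile(r"^(?P<schema>[^/]+)/(?P<table>[^/]+)/LOAD\d+\.csv$")
# <schema_name>/<table_name>/<time-stamp>.csv
CDC = re.compile(r"^(?P<schema>[^/]+)/(?P<table>[^/]+)/[^/]+\.csv$")


def classify_dms_key(key):
    """Return (kind, schema, table) for a DMS output key, or None."""
    match = FULL_LOAD.match(key)
    if match:
        return ("full_load", match.group("schema"), match.group("table"))
    match = CDC.match(key)
    if match:
        return ("cdc", match.group("schema"), match.group("table"))
    return None
```

The full-load pattern is tested first because a `LOAD` file name would also match the more general CDC pattern.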
Kinesis Firehose
Athena
Query Service Glue
Data Access & Authorisation
Give your users easy and secure access
Data Ingestion
Get your data into S3
quickly and securely
Processing & Analytics
Use of predictive and prescriptive
analytics to gain better understanding
Storage & Catalog
Secure, cost-effective storage in Amazon
S3. Robust metadata in AWS Catalog
Protect and Secure
Use entitlements to ensure data is secure and users’ identities are verified
Machine Learning
Predictive analytics
Amazon AI
Glue: Managed ETL
• Serverless job execution
• PySpark transformations
• Monitoring, metrics and
notifications
• Combine with AWS Lambda
and AWS Step Functions for
complex data orchestrations
Analyzing Streaming Data…
Amazon Kinesis Analytics
• Interact with streaming data in real time using SQL
• Build fully managed and elastic stream processing
applications that process data for real-time visualizations
and alarms
SELECT STREAM author,
count(author) OVER ONE_MINUTE
FROM Tweets
WHERE text LIKE '%#AWSSummit%'
WINDOW ONE_MINUTE AS
(PARTITION BY author
RANGE INTERVAL '1' MINUTE PRECEDING);
Amazon Kinesis Analytics – Simple SQL Interface
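For readers less familiar with windowed SQL, the query above can be mimicked in plain Python: for each matching tweet, emit how many tweets the same author posted in the preceding minute. This is an illustrative sketch rather than how Kinesis Analytics is implemented; the class name is hypothetical and timestamps are assumed to arrive in non-decreasing order, in seconds.

```python
from collections import defaultdict, deque


class SlidingAuthorCount:
    """Per-author count over a trailing time window."""

    def __init__(self, window_seconds=60.0, hashtag="#AWSSummit"):
        self.window_seconds = window_seconds
        self.hashtag = hashtag
        self._times = defaultdict(deque)  # author -> timestamps in window

    def observe(self, author, text, timestamp):
        """Return the author's running count, or None if filtered out."""
        if self.hashtag not in text:  # WHERE text LIKE '%#AWSSummit%'
            return None
        times = self._times[author]   # PARTITION BY author
        times.append(timestamp)
        # RANGE INTERVAL '1' MINUTE PRECEDING: drop rows older than the window
        while times[0] < timestamp - self.window_seconds:
            times.popleft()
        return len(times)             # count(author) OVER ONE_MINUTE


if __name__ == "__main__":
    counter = SlidingAuthorCount()
    print(counter.observe("ann", "Hello #AWSSummit", 0.0))
    print(counter.observe("ann", "Still here #AWSSummit", 30.0))
```

Like the SQL version, this emits one result per input row rather than one result per window.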
Analyzing Streaming Data… and Data at Rest
Amazon Athena
• No Infrastructure or administration
• Zero spin up time
• Transparent upgrades
• Query data in its raw format
• AVRO, Text, CSV, JSON, weblogs, AWS service logs
• Convert to an optimized form like ORC or Parquet for the
best performance and lowest cost
• No loading of data, no ETL required
• Stream data directly from Amazon S3, take advantage
of Amazon S3 durability and availability
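A sketch of submitting a query to Athena from Python. The database name, table name and results location are placeholders; the request shape follows the `StartQueryExecution` API, which needs an S3 location for query results.

```python
def build_query_request(sql, database, output_location):
    """Build keyword arguments for Athena's start_query_execution call."""
    if not output_location.startswith("s3://"):
        raise ValueError("query results must be written to an s3:// location")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request):
    """Submit the query and return its execution id.

    Needs boto3 and AWS credentials; the query runs asynchronously, so
    callers poll get_query_execution until it finishes.
    """
    import boto3  # imported lazily so the builder runs without boto3
    response = boto3.client("athena").start_query_execution(**request)
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Placeholder table, database and bucket names
    print(build_query_request(
        "SELECT author, count(*) AS tweets FROM tweets GROUP BY author",
        "analytics",
        "s3://example-athena-results/queries/"))
```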
Simple Query editor
with syntax highlighting
and autocomplete
Data Catalog
Query History, Saved Queries, and
Catalog Management
QuickSight allows you to connect to data from a wide variety of AWS, third-party, and on-premises sources including Amazon Athena
Amazon RDS
Amazon S3
Amazon Redshift
Amazon Athena
Using Amazon Athena with Amazon QuickSight
Building Smarter Applications
Add Machine Learning Capabilities
Amazon Machine Learning Service
Batch and online predictions
Train using data in S3, RDS and
Redshift
Amazon EMR
Comprehensive machine learning
libraries (e.g. Spark MLlib, Anaconda)
Provision analytics clusters in minutes,
autoscale with data volume or query
demand
Amazon AI Services
Amazon Polly – Lifelike Text-to-Speech
47 voices, 24 languages
Low-latency, real time
Amazon Rekognition – Image Analysis
Object and scene detection
Facial analysis
Amazon Lex – Conversational Engine
Speech and text recognition
Enterprise connectors
Demographic Data
Facial Landmarks
Sentiment Expressed
Image Quality
Facial Analysis with Rekognition
Brightness: 25.84
Sharpness: 160
General Attributes
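The attributes listed above map onto fields of the Rekognition `DetectFaces` response. The sketch below pulls a few of them out of one `FaceDetails` entry. The helper names are hypothetical, and apart from the brightness and sharpness figures taken from the slide, the sample values are invented for illustration.

```python
def summarise_face(face):
    """Reduce one FaceDetails entry to a few headline attributes."""
    quality = face.get("Quality", {})
    emotions = face.get("Emotions", [])
    top = max(emotions, key=lambda e: e["Confidence"], default=None)
    age = face.get("AgeRange", {})
    return {
        "brightness": quality.get("Brightness"),           # image quality
        "sharpness": quality.get("Sharpness"),
        "top_emotion": top["Type"] if top else None,       # sentiment expressed
        "age_range": (age.get("Low"), age.get("High")),    # demographic data
        "gender": face.get("Gender", {}).get("Value"),
        "landmark_count": len(face.get("Landmarks", [])),  # facial landmarks
    }


def detect_faces(bucket, key):
    """Call Rekognition for an image in S3; needs boto3 and credentials."""
    import boto3  # imported lazily so summarise_face runs without boto3
    response = boto3.client("rekognition").detect_faces(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        Attributes=["ALL"])
    return [summarise_face(face) for face in response["FaceDetails"]]


if __name__ == "__main__":
    sample = {
        "Quality": {"Brightness": 25.84, "Sharpness": 160},  # from the slide
        "Emotions": [{"Type": "HAPPY", "Confidence": 93.1},  # invented
                     {"Type": "CALM", "Confidence": 4.2}],
        "AgeRange": {"Low": 26, "High": 43},                 # invented
        "Gender": {"Value": "Female", "Confidence": 99.0},   # invented
        "Landmarks": [{"Type": "eyeLeft", "X": 0.4, "Y": 0.3}],
    }
    print(summarise_face(sample))
```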
Up to ~40k CUDA cores
Pre-configured CUDA drivers
Jupyter notebook with Python2,
Python3, Anaconda
CloudFormation Template
AWS Marketplace – one-click deploy
AWS Deep Learning AMI
Scaling Distributed Experiments
• Inception v3 model
• Increasing machines
from 1 to 47
• 2x faster than
TensorFlow if using
more than 10 machines
Example MXNet User | TuSimple | Autonomous Driving
Kinesis Firehose
Athena
Query Service Glue
Machine Learning
Predictive analytics
Data Access & Authorisation
Give your users easy and secure access
Data Ingestion
Get your data into S3
quickly and securely
Processing & Analytics
Use of predictive and prescriptive
analytics to gain better understanding
Storage & Catalog
Secure, cost-effective storage in Amazon
S3. Robust metadata in AWS Catalog
Protect and Secure
Use entitlements to ensure data is secure and users’ identities are verified
Amazon AI
Thank You
Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Kürzlich hochgeladen

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 

Kürzlich hochgeladen (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

Full Stack Analytics on AWS - AWS Summit Cape Town 2017

  • 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Specialist Solutions Architect, Data and Analytics, EMEA July 5th, 2017 Full Stack Analytics on AWS Ian Robinson
  • 2. Forces and Trends Cost Optimization Licenses Hardware Data center and operations Dark Data Prematurely discarding data Agility Experimentation (data & tools) Democratised Access to Data Time-to-first-results Terminate failed experiments early From BI to Data Science In-house data science From back office to product
  • 3. Storage is the Gravity for Cloud Applications
  • 5. Foundations: Storage, Discovery and Lifecycle Secure, governed, scalable, cheap Storage & Catalog Secure, cost-effective storage in Amazon S3. Robust metadata in AWS Catalog
  • 6. Amazon EFS File Amazon EBS Amazon EC2 Instance Store Block Amazon S3 Amazon Glacier Object Data Transfer AWS Direct Connect AWS Snowball ISV Connectors Amazon Kinesis Firehose S3 Transfer Acceleration Storage Gateway AWS Storage Platforms
  • 7. Amazon S3 Amazon Glacier Object Object Storage is Foundational EC2 Lambda EMR Data Pipeline Kinesis CloudFront RDS DynamoDB Redshift Database Analytics Compute Elastic Transcoder Content Delivery
  • 8. S3 Data Lifecycle and Events Standard Active data Archive data Infrequently accessed data Standard - Infrequent Access Amazon Glacier Create Delete
  • 9. Data Catalog Scalable (secure, versioned, durable) storage + Immutable data at every stage of its lifecycle + Versioned schema and metadata = Data discovery, lineage and governance
  • 10. AWS Glue: Components Data Catalog Crawl, store, search metadata in different data stores Populate a Hive metastore-compliant catalog Job Execution Fully managed orchestration & execution of ETL jobs Serverless execution model – no need to pre-provision resources Job Authoring Author, edit, share ETL jobs using your favorite tools Store, share, re-use ETL code/script with Git integration
  • 11. Manage table metadata through a Hive metastore API or Hive SQL. Supported by tools such as Hive, Presto, Spark, etc. We added a few extensions: • Search metadata for data discovery • Connection info – JDBC URLs, credentials • Classification for identifying and parsing files • Versioning of table metadata as schemas evolve and other metadata are updated Populate using Hive DDL, bulk import, or automatically through crawlers. Glue Data Catalog
  • 12. semi-structured per-file schema semi-structured unified schema identify file type and parse files enumerate S3 objects file 1 file 2 file N … int array intchar struct char int array struct char bool int int arraybool int char char int custom classifiers app log parser metrics parser … system classifiers JSON parser CSV parser Apache log parser … Crawlers: Automatic Schema Inference
  • 14. Data Access & Authorisation Give your users easy and secure access Storage & Catalog Secure, cost-effective storage in Amazon S3. Robust metadata in AWS Catalog Protect and Secure Use entitlements to ensure data is secure and users’ identities are verified
  • 15. AWS implements security at the data level, not tool-by-tool IAM Amazon S3 Amazon ElastiCache Amazon DynamoDB Amazon EMR Amazon Kinesis Amazon Athena Service API Access
  • 16. Third Party Ecosystem Security Tools Amazon S3 AWS CloudTrail http://amzn.to/2tSimHj Amazon Athena Access Logging API Logging Access Log Analytics IAM Amazon EMR http://amzn.to/2si6RqS + storage level support for access logging and audit
  • 17. Additional S3 Security Practices Use S3 bucket policies: • Restrict access by IP address • Restrict deletes • Enforce encryption use Restrict deletes to require MFA Authentication Use Versioning!!!
  • 18. AWS Server-Side encryption AWS managed key infrastructure AWS Key Management Service Automated key rotation & auditing Integration with other AWS services AWS CloudHSM Dedicated Tenancy SafeNet Luna SA HSM Device Common Criteria EAL4+, NIST FIPS 140-2 Encryption Options
  • 19. Extensible and Hybrid Crypto Integration for AWS Services class myCrypt implements EncryptionMaterialsProvider Amazon Redshift On Premises HSM
  • 20. Kinesis Firehose Data Access & Authorisation Give your users easy and secure access Data Ingestion Get your data into S3 quickly and securely Storage & Catalog Secure, cost-effective storage in Amazon S3. Robust metadata in AWS Catalog Protect and Secure Use entitlements to ensure data is secure and users’ identities are verified
  • 22. S3 Transfer Acceleration S3 Bucket AWS Edge Location Uploader Optimized Throughput! Typically 50%-400% faster Change your endpoint, not your code No firewall exceptions or client software required 59 global edge locations
  • 23. Rio De Janeiro Warsaw New York Atlanta Madrid Virginia Melbourne Paris Los Angeles Seattle Tokyo Singapore Time[hrs.] 500 GB upload from these edge locations to a bucket in Singapore Public Internet How Fast is S3 Transfer Acceleration? S3 Transfer Acceleration
  • 24. Stream Events to S3 Using Kinesis Firehose
  • 25. Write Database Changes to S3 with DMS <schema_name>/<table_name>/LOAD001.csv <schema_name>/<table_name>/LOAD002.csv <schema_name>/<table_name>/<time-stamp>.csv Full Load Change Data Capture
  • 26. Kinesis Firehose Athena Query Service Glue Data Access & Authorisation Give your users easy and secure access Data Ingestion Get your data into S3 quickly and securely Processing & Analytics Use of predictive and prescriptive analytics to gain better understanding Storage & Catalog Secure, cost-effective storage in Amazon S3. Robust metadata in AWS Catalog Protect and Secure Use entitlements to ensure data is secure and users’ identities are verified Machine Learning Predictive analytics Amazon AI
  • 27. Glue: Managed ETL • Serverless job execution • PySpark transformations • Monitoring, metrics and notifications • Combine with AWS Lambda and AWS Step Functions for complex data orchestrations
  • 29. Amazon Kinesis Analytics • Interact with streaming data in real time using SQL • Build fully managed and elastic stream processing applications that process data for real-time visualizations and alarms
  • 30. SELECT STREAM author, count(author) OVER ONE_MINUTE FROM Tweets WHERE text LIKE '%#AWSSummit%' WINDOW ONE_MINUTE AS (PARTITION BY author RANGE INTERVAL '1' MINUTE PRECEDING); Amazon Kinesis Analytics – Simple SQL Interface
  • 31. Analyzing Streaming Data… and Data at Rest
  • 32. Amazon Athena • No infrastructure or administration • Zero spin up time • Transparent upgrades • Query data in its raw format • AVRO, Text, CSV, JSON, weblogs, AWS service logs • Convert to an optimized form like ORC or Parquet for the best performance and lowest cost • No loading of data, no ETL required • Stream data directly from Amazon S3, take advantage of Amazon S3 durability and availability
  • 33. Simple Query editor with syntax highlighting and autocomplete Data Catalog Query History, Saved Queries, and Catalog Management
  • 34. QuickSight allows you to connect to data from a wide variety of AWS, third-party, and on-premises sources including Amazon Athena Amazon RDS Amazon S3 Amazon Redshift Amazon Athena Using Amazon Athena with Amazon QuickSight
  • 37. Add Machine Learning Capabilities Amazon Machine Learning Service Batch and online predictions Train using data in S3, RDS and Redshift Amazon EMR Comprehensive machine learning libraries (e.g. Spark MLlib, Anaconda) Provision analytics clusters in minutes, autoscale with data volume or query demand
  • 38. Amazon AI Services Amazon Polly – Lifelike Text-to-Speech 47 voices, 24 languages Low-latency, real time Amazon Rekognition – Image Analysis Object and scene detection Facial analysis Amazon Lex – Conversational Engine Speech and text recognition Enterprise connectors
  • 39. Demographic Data Facial Landmarks Sentiment Expressed Image Quality Facial Analysis with Rekognition Brightness: 25.84 Sharpness: 160 General Attributes
  • 40. Up to ~40k CUDA cores Pre-configured CUDA drivers Jupyter notebook with Python2, Python3, Anaconda CloudFormation Template AWS Marketplace – one-click deploy AWS Deep Learning AMI
  • 41. Scaling Distributed Experiments • Inception v3 model • Increasing machines from 1 to 47 • 2x faster than TensorFlow if using more than 10 machines
  • 42. Example MXNet User | TuSimple|Autonomous Driving
  • 43. Kinesis Firehose Athena Query Service Glue Machine Learning Predictive analytics Data Access & Authorisation Give your users easy and secure access Data Ingestion Get your data into S3 quickly and securely Processing & Analytics Use of predictive and prescriptive analytics to gain better understanding Storage & Catalog Secure, cost-effective storage in Amazon S3. Robust metadata in AWS Catalog Protect and Secure Use entitlements to ensure data is secure and users’ identities are verified Amazon AI
  • 44. Thank You Full Stack Analytics on AWS
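
The bucket-policy practices listed on slide 17 (restrict access by IP address, restrict deletes to MFA, enforce encryption) can be sketched as a policy document. This is a minimal sketch under stated assumptions, not a vetted policy: the bucket name `example-data-lake`, the CIDR range `203.0.113.0/24` and the helper `build_bucket_policy` are all hypothetical. The condition keys (`aws:SourceIp`, `s3:x-amz-server-side-encryption`, `aws:MultiFactorAuthPresent`) are standard IAM/S3 condition keys.

```python
import json


def build_bucket_policy(bucket, allowed_cidr):
    """Build an S3 bucket policy as a plain dict (hypothetical helper)."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    objects_arn = f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Restrict access by IP address: deny requests from outside the range.
                "Sid": "DenyOutsideAllowedRange",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, objects_arn],
                "Condition": {"NotIpAddress": {"aws:SourceIp": allowed_cidr}},
            },
            {
                # Enforce encryption use: deny uploads that omit the SSE header.
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": objects_arn,
                "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
            },
            {
                # Restrict deletes: deny them unless the caller authenticated with MFA.
                "Sid": "DenyDeleteWithoutMFA",
                "Effect": "Deny",
                "Principal": "*",
                "Action": ["s3:DeleteObject", "s3:DeleteObjectVersion"],
                "Resource": objects_arn,
                "Condition": {
                    "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
                },
            },
        ],
    }


# Placeholder bucket name and documentation-only CIDR range.
policy = build_bucket_policy("example-data-lake", "203.0.113.0/24")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The resulting JSON string could be attached with boto3's `put_bucket_policy(Bucket=..., Policy=policy_json)`. A blanket IP-based deny can lock out administrators, so a policy like this should be tried on a test bucket first.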

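The sliding-window query on slide 30 counts, for each matching tweet, how many tweets the same author posted in the preceding minute. The sketch below emulates that logic in plain Python to make the window semantics concrete; it is an illustration only, not how Kinesis Analytics is implemented. The function `sliding_counts` and the sample tweets are hypothetical.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60  # RANGE INTERVAL '1' MINUTE PRECEDING


def sliding_counts(events, hashtag="#AWSSummit", window_seconds=WINDOW_SECONDS):
    """Emit (author, count) for each matching event.

    `events` is an iterable of (timestamp_seconds, author, text) tuples
    in time order. The count covers the author's matching events in the
    window ending at the current event (PARTITION BY author).
    """
    recent = defaultdict(deque)  # author -> timestamps still inside the window
    results = []
    for ts, author, text in events:
        if hashtag not in text:  # WHERE text LIKE '%#AWSSummit%'
            continue
        window = recent[author]
        window.append(ts)
        # Drop timestamps that are more than one window older than this event.
        while window[0] < ts - window_seconds:
            window.popleft()
        results.append((author, len(window)))
    return results


# Made-up sample data for illustration.
tweets = [
    (0, "ana", "Hello #AWSSummit"),
    (10, "ben", "Lunch time"),
    (30, "ana", "Great talk #AWSSummit"),
    (45, "ben", "#AWSSummit keynote"),
    (100, "ana", "Wrapping up #AWSSummit"),
]
counts = sliding_counts(tweets)
print(counts)
```

Each author keeps an independent window, so a burst from one author does not affect another author's count.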
Editor's notes

  1. Any modern data and analytics architecture must address a number of forces and trends
  2. Technologies come and go. Data has a geological lifespan. Store all your data, forever, at every stage of its lifecycle. Apply it using the appropriate technology. As we talk to customers on their cloud journey and walk through our cloud adoption framework, at each step of the process (Strategy -> Plan -> Build/Iterate -> Run), storage is a critical element to be considered. In fact, storage is central to virtually every workload running in AWS today. So as we begin to think about either cloud native or cloud migration strategies, think about storage as a strategic, foundational element. Once your data is stored in the cloud, the world of AWS service offerings opens to you.
  3. A good architecture reduces irreversibility, and allows you to defer decisions to a later point in time while locking in key parameters, such as cost. That is, you can anticipate how much a deferred decision will cost when you have to make it. With our foundational storage capability we don’t want to have to make upfront, irreversible decisions about capacity or data format. What if I have to choose capacity up front, but then exceed it? I close down certain opportunities. What if I never use my reserved capacity? I pay for wasted space. If I have to choose a specific format, perhaps a proprietary format, up front, I potentially exclude certain opportunities in the future. Related to this, we don’t want to choose a foundational storage strategy that results in a “pager architecture”. We don’t want to be in a position where, in order to guarantee durability and availability of our data, we always need someone on call to replace a failed node and restore 3x redundancy. We want something that provides for near-infinite scalability, a range of data formats, and high durability and availability guarantees. S3 allows you to keep your data in a secure, cheap, near-infinitely scalable environment, without having to make up-front decisions about capacity or data formats. To that extent, S3 is the most architecturally significant element in your data architecture.
  4. Storage is more than just the protocol or interface. It’s the lifeblood of application design and renewed architectures. Our customers have taught us that they need two things: scale and trust. 1. Make sure I can grow. 2. Make sure I can access what I need when I need it (and of course help me keep costs down). The suite of transfer services that support customers in their migrations means more choice. Large batches, incremental changes, constant streams or seamless integration are all part of the storage offering. Today we’re going to talk about two of the newest ways to do cloud data migration, Snowball and S3 Transfer Acceleration.
  5. By convention, S3 has been at the heart of our “data lake” architecture for many years, but more and more it is being integrated with our data and analytics services: RDS, Redshift, etc. back up to S3; Athena can query against S3 using SQL; Redshift Spectrum can join data in S3 with data in Redshift; Kinesis Firehose ingests streaming data into S3; EMR can treat S3 as near-infinite capacity, highly durable HDFS. And so on…
  6. S3 isn’t just dumb storage: it allows you to manage data lifecycle and act on data events. S3 Standard is the general-purpose storage class, with high durability, availability and performance, for when you don’t want to think about your data access patterns. Glacier is archival storage with 3-5 hours of retrieval time, at low cost, with pricing starting at seven-tenths of a cent per GB. Many AWS customers store backups or log files that are almost never read, or whose access frequency drops as the data ages, but which still need immediate access. S3 Standard-IA is a new storage class on Amazon S3 designed for colder or less frequently accessed workloads. It offers the same high performance, high throughput and low latency as S3 Standard, at low cost, with storage starting at one and a quarter cents per GB. If you think about the typical lifecycle of data, newly created active data is accessed very frequently. In our example, take a new video clip you share with your friends and family. People will be consuming this new data actively: the new video will be played back, shared and commented on very frequently. As the video becomes older, a smaller number of people will engage, and it will be less frequently accessed. If you don’t want to think about your data access patterns but just want high durability, availability and performance from Amazon S3, you can simply select S3 Standard. For data that is less frequently accessed, you can leverage Amazon S3 Standard-IA to save on cost while still benefiting from the same durability and performance as S3 Standard. At some point your data will be ready to be archived, because no one is actively interacting with it and you need to keep it for record keeping etc. In addition to transitioning your data to Standard-IA as its characteristics change, you can also leverage Amazon S3 Standard-IA for new data that fits the bill for infrequently accessed data. For example, you can leverage the Standard-IA storage class to store detailed application logs that you analyze infrequently and save on storage cost.
  7. Near infinite (secure, versioned, highly durable) storage + immutable data at every stage of its lifecycle + versioned metadata = data lineage and governance
  8. Need better graphics. Have asked Jason for some. Data Catalog: A metadata store that automatically organizes the metadata for all your data assets across your business. You can organize and search your assets. ETL system: An engine that automatically generates ETL scripts and allows you to orchestrate, monitor, refine and manage your jobs
  9. Identity, authorization, entitlements Auditing, key management, conformance
  10. Securely control access to all digital resources based on users, groups, and application roles In S3 we can control access at the bucket and even at the object level: not only who can access an object, but what they can do with it
  11. But you can layer on additional security. Apache Ranger or Knox: pluggable security layer for Hive, allowing AD-federated access to data. Comprehensive auditing of all data access API calls via CloudTrail, which you can then analyze with Athena.
  12. We can say here that these strategies can help give additional protection against ransomware attacks. For additional security, enable MFA (multi-factor authentication) delete, which requires additional authentication to change the versioning state of your bucket or to permanently delete an object version. MFA delete requires both your security credentials and a code from an approved authentication device. This provides protection even if your account credentials end up with the wrong person or a malicious employee. Versioning protects against, and lets you recover from, unintended user deletes or application logic failures, with no performance penalty. It keeps all versions: new uploads are stored separately, and on delete the latest version is maintained and a delete marker is added. You can retrieve deleted objects or roll back to a previous version. A bucket has 3 versioning states: default (no versions saved; deleted objects cannot be retrieved), versioning-enabled (as discussed, versions of overwritten or deleted objects are saved), and suspended (all saved versions are maintained, but new versions are not created).
  13. AWS offers a number of encryption options that allow you to vary the security based on where the key is stored and who has access to it. AWS SSE: at the simplest level you can take advantage of AWS server-side encryption (SSE). It is integrated with S3 and Redshift, and encrypts automatically and transparently, which makes it very easy to use. AWS KMS: create and control encryption keys, and rotate them. It is centralized and fully managed, so you focus on your encryption needs, not infrastructure. CloudHSM: dedicated-tenancy hardware security modules. Certified infrastructure where AWS has absolutely no access to your encryption information. Designed to destroy the keys rather than allow you access into the system. Clustered for HA so that your keys are secure and durable.
  14. HSM = hardware security module. Custom encryption materials provider: use any strategy for providing encryption materials, such as integrating with existing key management systems.
  15. One-off migrations. Batch uploads. Streaming data: whether streaming events from IoT devices, logs, clickstream providers, or change data capture events from an on-prem relational
  16. Transferring large files over a long distance can be challenging, for example if you are moving data across continents, moving a large number of objects, or if your customers are a long way from AWS regions. S3 Transfer Acceleration (XA) accelerates transfers to S3 using the AWS edge network. It leverages POP locations to ensure your transfers travel a shorter distance on the public internet and then travel the remaining portion over an optimized route via the Amazon backbone. The variable public-internet portion of the path is shorter with XA, and the Amazon backbone has much more stable connectivity than the internet: a freeway with a performance booster, because we know the road is open. FASTER OR FREE: there is no cost for using XA if the upload is not faster. If the accelerated transfer performs the same as a normal upload, you don't pay for XA. S3-XA uses standard TCP and HTTP, so it does not require any firewall exceptions or custom software installation. Although the primary use is upload/ingestion to S3, we have seen customers use it for downloads, for example sharing video on an individual basis where very few people pull the video. Downloading through XA is a fast path when files are not pulled down frequently (single user versus many), and customers were seeing a benefit from pulling files down from S3 in the quickest way possible by leveraging S3-XA. For files that are frequently accessed, we recommend CloudFront, which caches your data at the edge. Customers also use XA to improve upload availability when uploading files across the globe over spotty or poor internet connections. (FS) People associate performance with availability: when an upload or download takes a long time, they assume it's not working properly, and a customer may not be patient and may cancel. The answer is to finish faster.
  17. Time it takes to upload a 500GB object to SIN from various locations (yellow bar, blue bar). The greater the difference between the two bars, the greater the improvement. 1/ The farther away your bucket, the more benefit from moving over the AWS network. 2/ The larger the file you upload, the more benefit you'll see. A lot less variability over all locations.
  18. Transform data: ETL or a computationally expensive operation on a dataset that emits another dataset. Analytics over streaming data: understand something about the current state of the world, whether end users or system behaviours. Interactive, ad hoc, exploratory analysis over our data. Build predictive analytics that help us forecast the future and pre-emptively act on the insights we generate.
  19. You can then edit these transformations, if necessary, using the tools and technologies you already know, such as Python, Spark, Git and your favorite integrated developer environment (IDE), and share them with other AWS Glue users.
  20. Work with fast-moving data in an accessible language, SQL: manipulate data on the stream, describe filters, projections and functions, and emit another data source as a stream, which you could push to Kibana, for example.
  21. Sliding window: for each new record that appears on the stream, we emit an output by applying aggregates on tweets in the preceding 1-minute window.
  22. Ad hoc, interactive, exploratory analyses over TB or PB data in S3
  23. P2 instances: GPU-accelerated instances. Up to 16 GPUs per instance, nearly 40,000 CUDA cores. CUDA is NVIDIA’s GPU-accelerated parallel computing programming model.
  24. Inception: an architecture for training an image recognition system. 16 x p2.16xlarge: scale out beyond a single instance and get near-linear scalability, with over half a million parallel processing cores.
  25. US- and China-based technology company developing autonomous driving technology. Mission: set a new standard on safety, reliability, and efficiency in the trucking industry.
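The versioning behaviour described in note 12 (all versions kept, a delete only adds a delete marker, earlier versions can be recovered) can be illustrated with a toy in-memory model. This is a sketch for intuition only and not the S3 API: the class `VersionedBucket`, its method names, and the sample key are all hypothetical.

```python
class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket (not the S3 API)."""

    DELETE_MARKER = object()

    def __init__(self):
        self._versions = {}  # key -> list of versions, oldest first

    def put(self, key, body):
        # An overwrite adds a new version; older versions are kept.
        self._versions.setdefault(key, []).append(body)

    def delete(self, key):
        # A plain delete only places a delete marker on top of the stack.
        self._versions.setdefault(key, []).append(self.DELETE_MARKER)

    def get(self, key):
        stack = self._versions.get(key, [])
        if not stack or stack[-1] is self.DELETE_MARKER:
            return None  # the object looks deleted to a normal read
        return stack[-1]

    def remove_latest_version(self, key):
        # Permanently removing the newest version (or delete marker)
        # exposes the one before it: an undelete or a rollback.
        self._versions[key].pop()


bucket = VersionedBucket()
bucket.put("report.csv", "v1")
bucket.put("report.csv", "v2")
bucket.delete("report.csv")
after_delete = bucket.get("report.csv")      # None: hidden by the marker

bucket.remove_latest_version("report.csv")   # remove the delete marker
after_undelete = bucket.get("report.csv")    # "v2"

bucket.remove_latest_version("report.csv")   # remove v2 itself
after_rollback = bucket.get("report.csv")    # "v1"
```

In this model `remove_latest_version` stands in for permanently deleting an object version, which is the kind of operation that note 12 says MFA delete requires extra authentication for.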
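To make the sliding window in note 21 concrete, here is a minimal sketch in plain Python rather than streaming SQL. It assumes each record is a (timestamp in seconds, hashtag) pair arriving in timestamp order, and that the new record itself counts as part of its window. The names `sliding_counts` and `WINDOW_SECONDS` and the sample tweets are illustrative and not from the talk.

```python
from collections import Counter, deque

WINDOW_SECONDS = 60  # the 1-minute window from note 21


def sliding_counts(records, window=WINDOW_SECONDS):
    """Yield (timestamp, counts) once per incoming record.

    counts covers every record whose timestamp lies in the interval
    (timestamp - window, timestamp], including the new record.
    Records are assumed to arrive in timestamp order.
    """
    buffer = deque()    # records currently inside the window
    counts = Counter()  # running count per hashtag
    for ts, tag in records:
        buffer.append((ts, tag))
        counts[tag] += 1
        # Evict records that have fallen out of the window.
        while buffer[0][0] <= ts - window:
            _, old_tag = buffer.popleft()
            counts[old_tag] -= 1
            if counts[old_tag] == 0:
                del counts[old_tag]
        yield ts, dict(counts)


# Hypothetical sample stream: (seconds, hashtag)
tweets = [(0, "#aws"), (20, "#aws"), (50, "#glue"), (70, "#aws")]
results = list(sliding_counts(tweets))
for ts, window_counts in results:
    print(ts, window_counts)
```

Unlike a tumbling window, which emits once per fixed interval, this emits one result per input record. The tweet at second 70 pushes the tweet at second 0 out of the window, so the count for `#aws` stays at 2 rather than rising to 3.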
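The core counts quoted in notes 23 and 24 can be checked with a few lines of arithmetic. The per-GPU figure of 2,496 cores is an assumption brought in for this sketch and does not appear in the notes. The GPU and instance counts are the ones the notes give.

```python
GPUS_PER_INSTANCE = 16  # p2.16xlarge, from note 23
CORES_PER_GPU = 2496    # assumption: per-GPU CUDA core count, not in the notes
INSTANCES = 16          # cluster size from note 24

cores_per_instance = GPUS_PER_INSTANCE * CORES_PER_GPU
cluster_cores = INSTANCES * cores_per_instance

print(cores_per_instance)  # 39936, i.e. "nearly 40,000"
print(cluster_cores)       # 638976, i.e. "over half a million"
```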