Architecting Data in the
AWS Ecosystem
Seth Luersen
Head of Training, MemSQL
2
what is happening within AWS
overall data landscape
benefits of using MemSQL in EC2
3
Modern Data
Relational
SQL
Schema
Structured
Operational
Non-Relational
NoSQL
Schema-less
Unstructured
Analytical
4
How to Match
Database
Data-driven
application
5
Workloads
Data
Shape
Size
Compute
6
Shape
Columnstore Aggregations and table scans
Document Index and store docs for query on any property
Graph Persist and retrieve relationships
Key-Value Query by key with fast ingest and high throughput
Rowstore Operate on a row or row set
Time-Series Store and process sequence
Unstructured Get and put objects
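The shape list above pairs each storage shape with the workload it suits. As a minimal sketch, that pairing can be written as a lookup table. The names `SHAPE_WORKLOADS` and `shapes_for` are hypothetical and chosen only for illustration; the descriptions are copied from the slide.

```python
# Illustrative lookup of the data shapes listed on the slide.
# SHAPE_WORKLOADS and shapes_for are hypothetical names, not from the talk.
SHAPE_WORKLOADS = {
    "columnstore": "aggregations and table scans",
    "document": "index and store docs for query on any property",
    "graph": "persist and retrieve relationships",
    "key-value": "query by key with fast ingest and high throughput",
    "rowstore": "operate on a row or row set",
    "time-series": "store and process sequence",
    "unstructured": "get and put objects",
}


def shapes_for(keyword: str) -> list[str]:
    """Return the shapes whose workload description mentions the keyword."""
    needle = keyword.lower()
    return sorted(s for s, w in SHAPE_WORKLOADS.items() if needle in w)
```

Calling `shapes_for("aggregations")` returns the columnstore entry, which matches the slide's pairing of columnstores with aggregations and table scans.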
7
Size
Limit Bounded or Unbounded to a size
Working Set 30 years cold
Caching Last 10 minutes of hot
Result size 1 row at 100 bytes
2 million rows at 200 MB
Monolith One big refrigerator
Partition Natural boundaries for distribution
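The two result-size figures above are consistent with each other if every row is about 100 bytes wide. A small sketch of that arithmetic, assuming a fixed row width and decimal megabytes (both are assumptions, the slide does not state them):

```python
ROW_BYTES = 100  # per-row size taken from "1 row at 100 bytes"


def result_size_bytes(rows: int, row_bytes: int = ROW_BYTES) -> int:
    """Size of a result set, assuming every row has the same width."""
    return rows * row_bytes


def to_megabytes(n_bytes: int) -> float:
    """Convert bytes to decimal megabytes (1 MB = 1,000,000 bytes)."""
    return n_bytes / 1_000_000


# 2 million rows of 100 bytes each come to 200 MB.
two_million_rows_mb = to_megabytes(result_size_bytes(2_000_000))
```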
8
Compute
Aggregations Average, Count, Sum on 1 trillion rows
Batch 50 million rows per batch
Concurrency 10,000 requests per second
Streaming Ingest 1 million rows ingest per second
Latency SLAs for sub-second response
Transactions Singleton operations
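The compute figures above can be related to each other with simple rate arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 100-byte row width is borrowed from the Size slide as an assumption.

```python
def seconds_to_ingest(rows: int, rows_per_second: int) -> float:
    """Time needed to ingest `rows` at a steady ingest rate."""
    return rows / rows_per_second


def ingest_bandwidth_mb_s(rows_per_second: int, row_bytes: int) -> float:
    """Sustained ingest bandwidth in decimal MB per second."""
    return rows_per_second * row_bytes / 1_000_000


# One 50-million-row batch at 1 million rows per second.
batch_seconds = seconds_to_ingest(50_000_000, 1_000_000)

# Bandwidth at 1 million rows per second, assuming 100-byte rows.
bandwidth = ingest_bandwidth_mb_s(1_000_000, 100)
```

Under these assumptions a 50-million-row batch corresponds to 50 seconds of streaming ingest, and the stream itself needs roughly 100 MB per second of sustained write bandwidth.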
9
Choice
One Size
Fits All
Use Case
Specific
10
Navigating the Data Landscape
NoSQL
Database
Data
Warehouse
Data Lake
Non-relational
Relational
Analytical Operational
NoSQL
Database
Data
Warehouse
Data Lake
11
Navigate in AWS
DynamoDB
RDS
Aurora
MySQL
PostgreSQL
MariaDB
SQL Server
Oracle
S3
Non-relational
Relational
ElastiCache
Analytical Operational
DAX
Kinesis
Analytics
Redshift
Athena
Elasticsearch
EMR
Elastic
MapReduce
Hadoop
Spark
Presto
HBase
12
General Use Cases
singletons
system of record
content
blobs
descriptive
predictive
prescriptive
Operational Analytics
columnar
partitions
billions of rows
heavy push down
batch writes
few updates
in-memory
compute
13
Analytical Dimensions
billions of rows
cached / in-memory
partitions
computed result set
files
unstructured
schema-less
relational
snowflake
etl
batches
aggregations
query latency
pushdown
shape
compute
size
NoSQL
Database
Data
Warehouse
Data Lake
14
Navigate Analytical
DynamoDB
RDS
Aurora
MySQL
PostgreSQL
MariaDB
SQL Server
Oracle
EMR
Elastic
MapReduce
Hadoop
Spark
Presto
HBase
S3
Non-relational
Relational
ElastiCache
Athena
Analytical Operational
Kinesis
Analytics
Redshift
Elasticsearch
DAX
15
Amazon EMR
Elastic Reliable Secure Easy
Hadoop, Spark, HBase, Presto
Clickstream Analytics, Real-time Analytics, Log Analysis,
ETL, Predictive Analytics
Big Data Framework
Retries failed tasks for Hadoop
Replaces poorly performing instances
16
Amazon Redshift
Scalable Secure Inexpensive Fast
Fast, powerful, and simple data warehousing;
Massively parallel, petabyte scale
Scale by resizing
Columnar performance
$1000 per TB per year
Data Warehouse
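The "$1000 per TB per year" figure above lends itself to a quick cost estimate. A minimal sketch, assuming the price scales linearly with stored terabytes and ignoring any tiering or discounts (the slide states neither); the function names are hypothetical.

```python
PRICE_PER_TB_YEAR_USD = 1000  # figure quoted on the slide


def yearly_cost_usd(terabytes: float) -> float:
    """Yearly cost, assuming the per-TB price scales linearly."""
    return terabytes * PRICE_PER_TB_YEAR_USD


def monthly_cost_usd(terabytes: float) -> float:
    """Yearly cost spread evenly over twelve months."""
    return yearly_cost_usd(terabytes) / 12


# A petabyte-scale warehouse (1 PB = 1,000 TB in decimal units).
petabyte_per_year = yearly_cost_usd(1000)
```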
17
Amazon S3 + Athena
Query
Instantly
Pay Per
Query
ANSI
SQL
Serverless
Easy
No infrastructure to set up or manage
SQL to query S3 files
JDBC / ODBC
Multiple data formats
Relational Joins
S3 upload latency
Data Lake
18
Elasticsearch Service
Easy to
Use
Open Source
API
Secure Fully
Managed
Easy to deploy, secure, operate, and scale Elasticsearch
Log analytics, full text search, & application monitoring
Logstash
Kibana
NoSQL Full Text Search
19
Analytics Summary
Amazon Redshift Amazon S3 + Athena
serverless ad-hoc query
process, prepare, and index key-value / document
low latency
per query $$$
non-relational
multiple enterprise data sources
multiple data formats
20
General Use Cases
singletons
system of record
content
blobs
descriptive
predictive
prescriptive
Operational Analytics
hot – caching
singletons –
small compute
size
low latency
high throughput
high concurrency
ACID, HA, DR
21
Operational Dimensions
shape
size
bounded
unbounded
monolithic
partitioned
rows
key-values
documents
relational
schema
velocity
ingest
compute
NoSQL
Database
Data
Warehouse
Data Lake
22
Navigate AWS
RDS
Aurora
MySQL
PostgreSQL
MariaDB
SQL Server
Oracle
S3
Non-relational
Relational
ElastiCache
Analytical Operational
Kinesis
Analytics
Redshift
DynamoDB
Athena
Elasticsearch
DAX
EMR
Elastic
MapReduce
Hadoop
Spark
Presto
HBase
23
Amazon RDS
Administer
Easily
Highly
Scalable
Available,
Durable
SSD
Speed
Managed relational database service;
Six popular database engines
Amazon Aurora is multi-AZ durable
Database
24
Amazon ElastiCache
Scale
Easily
Secure,
Hardened
Available,
Reliable
Extreme
Performance
Managed, in-memory data store;
Redis or Memcached
Add to database to improve read latency
Good hit rate if working set fits in cache
Cost: possible stale cache reads
In-memory Database
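The read-latency benefit and the stale-read cost described above both follow from how a cache is placed beside a database. The sketch below is a toy cache-aside pattern built on in-process dictionaries. It is not the ElastiCache, Redis, or Memcached API; the class and method names are hypothetical.

```python
class CacheAside:
    """Toy cache-aside reader: check the cache first, fall back to the database."""

    def __init__(self, database: dict):
        self.database = database  # stands in for the backing database
        self.cache = {}           # stands in for the in-memory cache
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]      # fast path, may be stale
        self.misses += 1
        value = self.database[key]      # slow path, always current
        self.cache[key] = value         # populate the cache for next time
        return value

    def write_database_only(self, key, value):
        """A write that bypasses the cache, which is what makes reads stale."""
        self.database[key] = value


store = CacheAside({"user:1": "alice"})
first = store.read("user:1")             # miss, loaded from the database
store.write_database_only("user:1", "bob")
second = store.read("user:1")            # hit, still returns the old value
```

Here `second` is still `"alice"` even though the database now holds `"bob"`. Real deployments usually bound this staleness with expiry times or explicit invalidation.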
25
Amazon DynamoDB
Fully
Managed
Auto
Scaling
AZ
Replication
Consistent
Performance
NoSQL database for document and key-value store
Automatic provisioning
Auto-scaled tables serve millions of requests per second
Millisecond latency
Fault tolerant availability
No relational capabilities
NoSQL
26
Amazon DynamoDB Accelerator (DAX)
Fully
Managed
No Stale
Cache Reads
Extreme
Performance
Fully managed write-through cache for DynamoDB
Reduces millisecond latency to microseconds
Fast NoSQL
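A write-through cache avoids stale reads by updating the cache and the table in the same write path. The following toy sketch illustrates the idea with in-process dictionaries. It is not the DAX client API, and the names `WriteThroughCache`, `put`, and `get` are hypothetical.

```python
class WriteThroughCache:
    """Toy write-through cache: every write goes to the table and the cache."""

    def __init__(self, table: dict):
        self.table = table  # stands in for the backing table
        self.cache = {}     # stands in for the cache layer

    def put(self, key, value):
        self.table[key] = value   # write to the source of truth
        self.cache[key] = value   # and keep the cached copy in step

    def get(self, key):
        if key not in self.cache:
            self.cache[key] = self.table[key]  # read-through on a miss
        return self.cache[key]


items = WriteThroughCache({"k": 1})
before = items.get("k")
items.put("k", 2)
after = items.get("k")
```

Reads stay current only for writes that pass through the cache. A write made directly to the backing table would not be visible until the cached entry is refreshed.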
27
Operational Summary
Amazon RDS Amazon DynamoDB
bounded
unbounded
key-value / document
rows
relational
non-relational
monolith
partitioned
velocity
push-down compute
fast ingest with DAX
28
Strategic Planning Assumptions
By 2017, as "NoSQL" ceases to distinguish
DBMSs, data and analytics leaders will
select multimodel and/or specific document,
key-value, graph and wide-column DBMSs.
Gartner Critical Capabilities for Operational Database Management Systems
Published: 6 October 2016
Analyst(s): Merv Adrian, Donald Feinberg, Nick Heudecker, Terilyn Palanca, Rick Greenwald
29
Navigating the Data Landscape
NoSQL
No
Problem
Database
Data
Warehouse
Data Lake
Non-relational
Relational
Analytical Operational
30
Navigating the Data Landscape
Database
Data
Warehouse
Data Lake
Non-relational
Relational
Analytical Operational
31
Simplify the Data Landscape
Converged Data Warehouse Database
Data Lake (AWS S3)
Non-relational
Relational
Analytical Operational
HTAP, HOAP, Translytical
32
Latency Holding Back the Enterprise
Lengthy Query Execution
Slow query responses
Slow reports
No real-time response
Limited User Access
Single threaded operations
Challenge with mixed workloads
Single box performance
Slow Data Loading
Batch processing
Hours to load
Sampled data views
33
The Enterprise Requires Performance
Fast Queries
Scalable SQL
Real-time dashboards
Live data access
Scalable User Access
Multi-threaded processing
Converged transactions and analytics
Scale-out for performance
Live Loading
Stream data
On-the-fly transformation
Multiple sources
34
The Database for Real-Time Applications
Delivering Operational Analytics at Scale
Run
Anywhere
Any cloud, hybrid, or multicloud
On-premises
Low cost standard hardware
Scale
Transactions and Analytics
Petabyte scale
In-memory and disk-based
Unified mixed workload architecture
Power
Real-Time Applications
Fast ingestion and queries
Operational capabilities
Multi-model and data support
35
Durable Distributed Storage
Highly Available
Online replication ensures
data consistency and protects
against outages
Big Data Capacity
Petabyte scale with up to
10x compression and instant
query retrieval
Distributed and Durable
Store and process on clusters
of machines for performance
and persistence
36
MemSQL Unified Architecture
Historical Data
Disk-optimized tables
with compression for
fast analytic queries
Live Data
Memory optimized tables
for analyzing real-time
events
Streaming Ingest
Real-time data pipelines
with exactly-once
semantics
37
Drive Real-Time Insights
• Rich analytics with Scalable SQL
• Support for JSON, Geospatial,
Key-Value
• Fast Query Vectorization and
Compilation
• User Defined Functions
38
Deliver Real-Time ETL
Load
Guarantee message delivery with
exactly-once semantics
Transform
Map and enrich data with user defined
functions or Spark transformations
Extract
Ingest from Apache Kafka or Spark
Change data capture or bulk load
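One common way to get exactly-once loading from a replayable source is to record the last committed offset together with the loaded rows, then skip anything at or below that offset on redelivery. The sketch below shows that generic technique. It is not a description of how MemSQL Pipelines are implemented; the class, the `transform` helper, and the record layout are hypothetical, and offsets are assumed to increase within a single partition.

```python
def transform(record: dict) -> dict:
    """Example enrichment step: add the tweet length to each record."""
    return {**record, "length": len(record["tweet"])}


class ExactlyOnceLoader:
    """Toy loader that ignores redelivered messages by tracking offsets."""

    def __init__(self):
        self.rows = []
        self.committed_offset = -1  # nothing loaded yet

    def load_batch(self, batch):
        """Load (offset, record) pairs and return how many rows were new."""
        new = [(o, r) for o, r in batch if o > self.committed_offset]
        if not new:
            return 0
        # In a real database these two updates would share one transaction,
        # so a crash cannot leave rows loaded without the offset recorded.
        self.rows.extend(transform(r) for _, r in new)
        self.committed_offset = max(o for o, _ in new)
        return len(new)


loader = ExactlyOnceLoader()
first_load = loader.load_batch([(0, {"tweet": "hi"}), (1, {"tweet": "hey"})])
# Offset 1 is redelivered along with a new message at offset 2.
second_load = loader.load_batch([(1, {"tweet": "hey"}), (2, {"tweet": "yo"})])
```

The redelivered message at offset 1 is skipped, so each message contributes exactly one row.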
39
Simple Setup -> CREATE PIPELINE
memsql> CREATE PIPELINE twitter_pipeline AS
-> LOAD DATA KAFKA "public-kafka.memcompute.com:9092/tweets-json"
-> INTO TABLE tweets
-> (id, tweet);
Query OK, (0.89 sec)
memsql> START PIPELINE twitter_pipeline;
Query OK, (0.01 sec)
40
Ecosystem Overview
Streaming Ingest Live Data Historical Data
Real-Time Data
Messaging and
Transforms
Historical Data BI Dashboards
Kafka Spark
Relational Hadoop Amazon S3
Bare Metal, Virtual Machines, Containers On-Premises, Cloud, As a Service
Real-Time Applications
Tableau Looker Microstrategy
41
Amazon EC2 + MemSQL
Size
Memory
Size
Compute
Size
Storage
ANSI
SQL
Build a cluster in minutes
Pipelines for ingest
Easy to deploy with MemSQL Ops
High Availability
ACID
Data Warehouse
and Database
42
AWS Aurora MemSQL
Dataset easily fits
under 500 GB
Single server compute
Write-centric without reads
Dataset from
100 GB to 1 PB
Horizontal scale
Simultaneous read and write
workloads
Database from AWS and MemSQL
43
Redshift MemSQL
No requirement for
fast data ingest
No requirement for
concurrency
Fast data ingest required
Support for high concurrency
Data Warehouse from AWS and MemSQL
Thank You!

Enabling Real-Time Analytics for IoT
 
Real-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLReal-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQL
 
Driving the On-Demand Economy with Predictive Analytics
Driving the On-Demand Economy with Predictive AnalyticsDriving the On-Demand Economy with Predictive Analytics
Driving the On-Demand Economy with Predictive Analytics
 
Tapjoy: Building a Real-Time Data Science Service for Mobile Advertising
Tapjoy: Building a Real-Time Data Science Service for Mobile AdvertisingTapjoy: Building a Real-Time Data Science Service for Mobile Advertising
Tapjoy: Building a Real-Time Data Science Service for Mobile Advertising
 
The Real-Time CDO and the Cloud-Forward Path to Predictive Analytics
The Real-Time CDO and the Cloud-Forward Path to Predictive AnalyticsThe Real-Time CDO and the Cloud-Forward Path to Predictive Analytics
The Real-Time CDO and the Cloud-Forward Path to Predictive Analytics
 

KĂĽrzlich hochgeladen


Architecting Data in the AWS Ecosystem

  • 1. Architecting Data in the AWS Ecosystem. Seth Luersen, Head of Training, MemSQL.
  • 2. What is happening within AWS; the overall data landscape; benefits of using MemSQL in EC2.
  • 6. Shape. Columnstore: aggregations and table scans. Document: index and store docs for query on any property. Graph: persist and retrieve relationships. Key-Value: query by key with fast ingest and high throughput. Rowstore: operate on a row or row set. Time-Series: store and process sequences. Unstructured: get and put objects.
  • 7. Size. Limit: bounded or unbounded in size. Working Set: 30 years of cold data. Caching: last 10 minutes of hot data. Result size: 1 row at 100 bytes, or 2 million rows at 200 MB. Monolith: one big refrigerator. Partition: natural boundaries for distribution.
  • 8. Compute. Aggregations: average, count, sum on 1 trillion rows. Batch: 50 million rows per batch. Concurrency: 10,000 requests per second. Streaming Ingest: 1 million rows ingested per second. Latency: SLAs for sub-second response. Transactions: singleton operations.
  • 10. Navigating the Data Landscape: NoSQL, Database, Data Warehouse, Data Lake. Axes: Non-relational / Relational, Analytical / Operational.
  • 11. Navigate in AWS (NoSQL, Database, Data Warehouse, Data Lake): DynamoDB; RDS (Aurora, MySQL, PostgreSQL, MariaDB, SQL Server, Oracle); S3; ElastiCache; DAX; Kinesis Analytics; Redshift; Athena; Elasticsearch; EMR (Elastic MapReduce: Hadoop, Spark, Presto, HBase). Axes: Non-relational / Relational, Analytical / Operational.
  • 12. General Use Cases: singletons, system of record, content, blobs, descriptive, predictive, prescriptive. Categories: Operational, Analytics.
  • 13. Analytical Dimensions (shape, compute, size): columnar, partitions, billions of rows, heavy push down, batch writes, few updates, in-memory, compute; billions of rows, cached / in-memory, partitions, computed result set; files, unstructured, schema-less, relational, snowflake; ETL, batches, aggregations, query latency, pushdown.
  • 14. Navigate Analytical (NoSQL, Database, Data Warehouse, Data Lake): DynamoDB; RDS (Aurora, MySQL, PostgreSQL, MariaDB, SQL Server, Oracle); EMR (Elastic MapReduce: Hadoop, Spark, Presto, HBase); S3; ElastiCache; Athena; Kinesis Analytics; Redshift; Elasticsearch; DAX. Axes: Non-relational / Relational, Analytical / Operational.
  • 15. Amazon EMR (Big Data Framework). Elastic; Reliable; Secure; Easy. Hadoop, Spark, HBase, Presto. Clickstream analytics, real-time analytics, log analysis, ETL, predictive analytics. Retries failed tasks for Hadoop. Replaces poorly performing instances.
  • 16. Amazon Redshift (Data Warehouse). Scalable; Secure; Inexpensive; Fast. Fast, powerful, and simple data warehousing; massively parallel, petabyte scale. Scale by resizing. Columnar performance. $1000 per TB per year.
  • 17. Amazon S3 + Athena (Data Lake). Query Instantly; Pay Per Query; ANSI SQL; Serverless; Easy. No infrastructure to set up or manage. SQL to query S3 files. JDBC / ODBC. Multiple data formats. Relational joins. S3 upload latency.
  • 18. Elasticsearch Service (NoSQL Full Text Search). Easy to Use; Open Source API; Secure; Fully Managed. Easy to deploy, secure, operate, and scale Elasticsearch. Log analytics, full text search, and application monitoring. Logstash, Kibana.
  • 19. Analytics Summary: Amazon Redshift; Amazon S3 + Athena; serverless; ad-hoc query; process, prepare, and index; key-value / document; low latency; per query $$$; non-relational; multiple enterprise data sources; multiple data formats.
  • 20. General Use Cases: singletons, system of record, content, blobs, descriptive, predictive, prescriptive. Categories: Operational, Analytics.
  • 21. Operational Dimensions: hot – caching; singletons – small compute size; low latency; high throughput; high concurrency; ACID, HA, DR. Shape and size: bounded, unbounded, monolithic, partitioned, rows, key-values, documents, relational schema. Velocity: ingest, compute.
  • 22. Navigate AWS (NoSQL, Database, Data Warehouse, Data Lake): RDS (Aurora, MySQL, PostgreSQL, MariaDB, SQL Server, Oracle); S3; ElastiCache; Kinesis Analytics; Redshift; DynamoDB; Athena; Elasticsearch; DAX; EMR (Elastic MapReduce: Hadoop, Spark, Presto, HBase). Axes: Non-relational / Relational, Analytical / Operational.
  • 23. Amazon RDS (Database). Administer Easily; Highly Scalable; Available, Durable; SSD Speed. Managed relational database service; six popular database engines. Amazon Aurora is multi-AZ durable.
  • 24. Amazon ElastiCache (In-memory Database). Scale Easily; Secure, Hardened; Available, Reliable; Extreme Performance. Managed, in-memory data store; Redis or Memcached. Add to a database to improve read latency. Good hit rate if the working set fits in cache. The price is stale cache reads.
  • 25. Amazon DynamoDB (NoSQL). Fully Managed; Auto Scaling; AZ Replication; Consistent Performance. NoSQL database for document and key-value storage. Automatic provisioning. Auto-scaled tables serve millions of requests per second. Millisecond latency. Fault-tolerant availability. No relational capabilities.
  • 26. Amazon DynamoDB Accelerator (DAX) (Fast NoSQL). Fully Managed; No Stale Cache Reads; Extreme Performance. Fully managed write-through cache for DynamoDB. Reduces millisecond latency to microseconds.
  • 27. Operational Summary. Amazon RDS: bounded; rows; relational; monolith. Amazon DynamoDB: unbounded; key-value / document; non-relational; partitioned. Velocity: push-down compute (RDS); fast ingest with DAX (DynamoDB).
  • 28. Strategic Planning Assumptions: "By 2017, as 'NoSQL' ceases to distinguish DBMSs, data and analytics leaders will select multimodel and/or specific document, key-value, graph and wide-column DBMSs." Gartner, Critical Capabilities for Operational Database Management Systems, published 6 October 2016. Analysts: Merv Adrian, Donald Feinberg, Nick Heudecker, Terilyn Palanca, Rick Greenwald.
  • 29. Navigating the Data Landscape: NoSQL No Problem, Database, Data Warehouse, Data Lake. Axes: Non-relational / Relational, Analytical / Operational.
  • 30. Navigating the Data Landscape: Database, Data Warehouse, Data Lake. Axes: Non-relational / Relational, Analytical / Operational.
  • 31. Simplify the Data Landscape: Converged Data Warehouse / Database; Data Lake (AWS S3). HTAP, HOAP, Translytical. Axes: Non-relational / Relational, Analytical / Operational.
  • 32. Latency Holding Back the Enterprise. Lengthy Query Execution: slow query responses; slow reports; no real-time response. Limited User Access: single-threaded operations; challenge with mixed workloads; single-box performance. Slow Data Loading: batch processing; hours to load; sampled data views.
  • 33. The Enterprise Requires Performance. Fast Queries: scalable SQL; real-time dashboards; live data access. Scalable User Access: multi-threaded processing; converged transactions and analytics; scale-out for performance. Live Loading: stream data; on-the-fly transformation; multiple sources.
  • 34. The Database for Real-Time Applications: Delivering Operational Analytics at Scale. Run Anywhere: any cloud, hybrid, or multicloud; on-premises; low-cost standard hardware. Scale Transactions and Analytics: petabyte scale; in-memory and disk-based; unified mixed workload architecture. Power Real-Time Applications: fast ingestion and queries; operational capabilities; multi-model and data support.
  • 35. Durable Distributed Storage. Highly Available: online replication ensures data consistency and protects against outages. Big Data Capacity: petabyte scale with up to 10x compression and instant query retrieval. Distributed and Durable: store and process on clusters of machines for performance and persistence.
  • 36. MemSQL Unified Architecture. Historical Data: disk-optimized tables with compression for fast analytic queries. Live Data: memory-optimized tables for analyzing real-time events. Streaming Ingest: real-time data pipelines with exactly-once semantics.
  • 37. Drive Real-Time Insights: rich analytics with scalable SQL; support for JSON, geospatial, and key-value data; fast query vectorization and compilation; user-defined functions.
  • 38. Deliver Real-Time ETL. Extract: ingest from Apache Kafka or Spark; change data capture or bulk load. Transform: map and enrich data with user-defined functions or Spark transformations. Load: guarantee message delivery with exactly-once semantics.
  • 39. Simple Setup -> CREATE PIPELINE
      memsql> CREATE PIPELINE twitter_pipeline AS
          -> LOAD DATA KAFKA "public-kafka.memcompute.com:9092/tweets-json"
          -> INTO TABLE tweets
          -> (id, tweet);
      Query OK, (0.89 sec)
      memsql> START PIPELINE twitter_pipeline;
      Query OK, (0.01 sec)
  • 40. Ecosystem Overview: Streaming Ingest; Live Data; Historical Data; Real-Time Data; Messaging and Transforms; Historical Data; BI Dashboards; Kafka; Spark; Relational; Hadoop; Amazon S3; Bare Metal, Virtual Machines, Containers; On-Premises, Cloud, As a Service; Real-Time Applications; Tableau; Looker; MicroStrategy.
  • 41. Amazon EC2 + MemSQL (Data Warehouse and Database). Size Memory; Size Compute; Size Storage; ANSI SQL. Build a cluster in minutes. Pipelines for ingest. Easy to deploy with MemSQL Ops. High availability. ACID.
  • 42. Database from AWS and MemSQL. AWS Aurora: dataset easily fits under 500 GB; single-server compute; write-centric without reads. MemSQL: dataset from 100 GB to 1 PB; horizontal scale; simultaneous read and write workloads.
  • 43. Data Warehouse from AWS and MemSQL. Redshift: no requirement for fast data ingest; no requirement for concurrency. MemSQL: fast data ingest required; support for high concurrency.
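
Slides 24 and 26 contrast two caching behaviours: a cache added in front of a database, where "the price is stale cache reads", and a write-through cache with "no stale cache reads". The sketch below illustrates that difference in plain Python. It is not the ElastiCache or DAX API: the `Database`, `CacheAside`, and `WriteThrough` classes and the `user:1` key are hypothetical stand-ins, and the cache-aside variant assumes writes bypass the cache with no expiry or invalidation.

```python
class Database:
    """Hypothetical stand-in for the system of record (not a real AWS client)."""

    def __init__(self):
        self.rows = {}

    def read(self, key):
        return self.rows.get(key)

    def write(self, key, value):
        self.rows[key] = value


class CacheAside:
    """Cache filled on read misses; writes go only to the database.

    Assumption: no TTL and no invalidation, so a cached copy can go stale.
    """

    def __init__(self, db):
        self.db = db
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.db.read(key)
        self.cache[key] = value  # populate the cache for later reads
        return value

    def write(self, key, value):
        self.db.write(key, value)  # the cached copy is left untouched


class WriteThrough(CacheAside):
    """Same read path, but every write updates the database and the cache."""

    def write(self, key, value):
        self.db.write(key, value)
        self.cache[key] = value


def demo():
    """Run the same read, write, read sequence against both strategies."""
    results = {}
    for name, strategy in (("cache_aside", CacheAside), ("write_through", WriteThrough)):
        db = Database()
        db.write("user:1", "v1")
        store = strategy(db)
        store.read("user:1")                  # miss: loads "v1" into the cache
        store.write("user:1", "v2")           # update the record
        results[name] = store.read("user:1")  # hit: served from the cache
    return results


if __name__ == "__main__":
    print(demo())
```

In this sketch the cache-aside read after the update still returns the old value, while the write-through read returns the new one. Real deployments usually bound staleness with expiry times or explicit invalidation, which this example leaves out.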