Data Con LA 2019 - Patterns for Persistence and Streaming in Cloud Architectures by Jeffrey Carpenter

Patterns for Persistence and Streaming in Cloud Architectures
Jeff Carpenter
Director of Developer Advocacy
community.datastax.com | @jscarp

Photoshopping by Steve Halladay

© DataStax, All Rights Reserved.
Agenda
1 Context – Monolith to Microservices, On-Prem to Cloud
2 Selecting Infrastructure, Then and Now
3 Persistence Patterns – Featuring Cassandra
4 Persistence + Streaming – Featuring Kafka
5 Resources

Agenda
1 Context – Monolith to Microservices, On-Prem to Cloud

Old School Enterprise Architecture
Monolithic Application
RDBMS
All tables
ACID Transactions
Joins
Indexes
Other Apps
Integration by database

Transitional Architecture
Monolithic Application
RDBMS
Services
NoSQL, NewSQL, RDBMS?
Other Apps
Integration by API

Microservices in the Cloud
Clients
Applications
Services
On Prem DC, AWS DC A, AWS DC B, GCP DC

Agenda
2 Selecting Infrastructure, Then and Now

Tasks of the Architect
Defining Components and Interfaces
Identifying Patterns
Managing the –ilities
Making tradeoffs

Infrastructure Selection – Then

Infrastructure Selection – Now?

Quality Attribute Bingo - Then
• Performance • Scalability • Availability • Reliability
• Extensibility • Modularity • Reusability • Monitorability
• Deployability • Maintainability • Usability • Cost

Data Infrastructure Criteria - Now
DX (developer experience)
Performance
Availability
Security
Flexibility
Cost

Minimizing Cost of Change - Abstraction
Service
API
Business Logic
Data Access
Messaging
Database
Queue / Stream
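
One way to read this slide: the business logic depends only on a data access interface and a messaging interface, so the database or the queue behind them can be swapped without touching the service. The sketch below is a minimal illustration under that reading. `VideoStore`, `EventPublisher`, `VideoCatalogService`, the in-memory implementations, and the `video-added` topic name are all hypothetical names invented for this example, not part of any DataStax or Kafka API.

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol


class VideoStore(Protocol):
    """Data access boundary: the only view of the database the service has."""

    def save(self, video_id: str, name: str) -> None: ...

    def find(self, video_id: str) -> Optional[str]: ...


class EventPublisher(Protocol):
    """Messaging boundary: the only view of the queue or stream."""

    def publish(self, topic: str, event: dict) -> None: ...


@dataclass
class InMemoryVideoStore:
    """Stand-in for a real database; swap it without changing the service."""

    rows: dict = field(default_factory=dict)

    def save(self, video_id: str, name: str) -> None:
        self.rows[video_id] = name

    def find(self, video_id: str) -> Optional[str]:
        return self.rows.get(video_id)


@dataclass
class InMemoryPublisher:
    """Stand-in for a real queue or stream; it only records what was sent."""

    sent: list = field(default_factory=list)

    def publish(self, topic: str, event: dict) -> None:
        self.sent.append((topic, event))


class VideoCatalogService:
    """Business logic: knows the two interfaces, not the infrastructure."""

    def __init__(self, store: VideoStore, publisher: EventPublisher) -> None:
        self._store = store
        self._publisher = publisher

    def add_video(self, video_id: str, name: str) -> None:
        if not name.strip():
            raise ValueError("video name must not be empty")
        self._store.save(video_id, name)
        self._publisher.publish("video-added", {"video_id": video_id, "name": name})


store = InMemoryVideoStore()
publisher = InMemoryPublisher()
service = VideoCatalogService(store, publisher)
service.add_video("v1", "Intro to Cassandra")
```

Replacing `InMemoryVideoStore` with a class backed by a real driver would change one constructor argument and nothing in `VideoCatalogService`, which is the cost-of-change argument the slide makes.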

Agenda
3 Persistence Patterns – Featuring Cassandra

Microservices and Polyglot Persistence
Service A
Service B
Service C
Service D
Service E
Data stores: Tabular, Key-value (cache), Document, Graph, Relational
Data categories: Core application data, Reference data, Content, Highly networked data, Legacy low volume data

Apache Cassandra Overview
• First developed by Facebook
• Top-level Apache project since 2010
• Partitioned row store
• Distributed, decentralized
• Elastic scalability / high performance
• High availability / fault tolerant
• Tuneable consistency
• Cassandra Query Language (CQL)
Apache Cassandra ® Apache Software Foundation
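
The "tuneable consistency" bullet can be made concrete with a little arithmetic. With replication factor RF, a quorum is floor(RF / 2) + 1 replicas, and a read is guaranteed to overlap the latest acknowledged write when the number of replicas read plus the number written exceeds RF. The helper names below (`quorum`, `read_sees_latest_write`) are hypothetical and only illustrate that rule; they are not driver calls.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a majority."""
    return replication_factor // 2 + 1


def read_sees_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


RF = 3
q = quorum(RF)

quorum_both = read_sees_latest_write(q, q, RF)    # QUORUM reads and writes
one_both = read_sees_latest_write(1, 1, RF)       # ONE reads and writes
one_read_all_write = read_sees_latest_write(1, RF, RF)  # ONE read, ALL write
```

Lower levels on both sides trade that overlap guarantee for latency and availability, which is the tradeoff the tuning exposes.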

KillrVideo – A video sharing application
https://github.com/KillrVideo
https://killrvideo.github.io

KillrVideo High Level Architecture
Your Browser
Web Application
KillrVideo Services
Technology Choices
• Node.js
• Falcor
• Java / C# / Node.js / Python
• gRPC
• etcd
• DataStax Drivers
• DataStax Enterprise including Apache Cassandra & Spark, Graph
Deployment
• Download and run locally via Docker
• Deployed in AWS using DataStax Managed Services: http://killrvideo.com/

Application Workflow in KillrVideo
User logs into site
Show basic information about user
Show videos added by a user
Show comments posted by a user
Search for a video by tag
Show latest videos added to the site
Show comments for a video
Show ratings for a video
Show video and its details

Queries in KillrVideo to Support Workflows
Users
• User logs into site: Find user by email address
• Show basic information about user: Find user by id
Comments
• Show comments for a video: Find comments by video (latest first)
• Show comments posted by a user: Find comments by user (latest first)
Ratings
• Show ratings for a video: Find ratings by video

Designing Tables Based on Queries
• Show video and its details: Find video by id
• Show videos added by a user: Find videos by user (latest first)
-- Supports "Find video by id": one partition per video.
CREATE TABLE videos (
    videoid uuid,
    userid uuid,
    name text,
    description text,
    location text,
    location_type int,
    preview_image_location text,
    tags set<text>,
    added_date timestamp,
    PRIMARY KEY (videoid)
);
-- Supports "Find videos by user (latest first)": one partition per user,
-- rows clustered by added_date descending.
CREATE TABLE user_videos (
    userid uuid,
    added_date timestamp,
    videoid uuid,
    name text,
    preview_image_location text,
    PRIMARY KEY (userid, added_date, videoid)
) WITH CLUSTERING ORDER BY (added_date DESC, videoid ASC);
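
The two tables above hold overlapping data because each one answers exactly one query. The sketch below models that idea in plain Python dictionaries rather than against a live cluster, so it runs anywhere. The function names (`add_video`, `find_video_by_id`, `find_videos_by_user`) are hypothetical, the write path that updates both tables is an assumption about how an application would keep them in step, and the in-memory sort only approximates what the clustering order does on disk.

```python
import uuid
from datetime import datetime, timezone

# In-memory stand-ins for the two tables (not a Cassandra driver).
videos = {}        # videoid -> row, models PRIMARY KEY (videoid)
user_videos = {}   # userid -> list of rows, models one partition per user


def add_video(userid, name, added_date):
    """Write the same video to both tables, one per query it must serve."""
    videoid = uuid.uuid4()
    videos[videoid] = {
        "videoid": videoid, "userid": userid,
        "name": name, "added_date": added_date,
    }
    partition = user_videos.setdefault(userid, [])
    partition.append({"added_date": added_date, "videoid": videoid, "name": name})
    # Approximate CLUSTERING ORDER BY (added_date DESC, videoid ASC):
    # sort by the secondary key first, then rely on sort stability.
    partition.sort(key=lambda row: row["videoid"].int)
    partition.sort(key=lambda row: row["added_date"], reverse=True)
    return videoid


def find_video_by_id(videoid):
    """Find video by id: a single lookup in the videos table."""
    return videos.get(videoid)


def find_videos_by_user(userid, limit=10):
    """Find videos by user, latest first: read the head of one partition."""
    return user_videos.get(userid, [])[:limit]


alice = uuid.uuid4()
first = add_video(alice, "First upload", datetime(2019, 1, 1, tzinfo=timezone.utc))
second = add_video(alice, "Second upload", datetime(2019, 6, 1, tzinfo=timezone.utc))
latest_names = [row["name"] for row in find_videos_by_user(alice)]
```

Neither read needs a join or a sort at query time; the cost is the extra write, which is the usual trade in query-first modeling.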

Delivery Models for Cloud (Data) Infrastructure
Enterprise Versions
• Pro
– Certification and Support
– Additional features
– Security
• Con
– Licensing cost
– Cost of change
Open Source
• Pro
– Free for dev and prod
– Visibility and modifiability
• Con
– Cost to maintain expertise
– Dependence on community
Managed Services
• Pro
– Ease of adoption
– Lowest time to prod
– Pay as you go
• Con
– Observability obscured
– Cost management

Comparing Scale-out Databases

Microservices and Polyglot Persistence
Service A
Service B
Service C
Service D
Service E
Data stores: Tabular, Key-value (cache), Document, Graph, Relational
Data categories: Core application data, Reference data, Content, Highly networked data, Legacy low volume data

Should a Service be Polyglot?
Hotel Service
Cassandra: Primary store (tabular)
Key-value (Redis, etc.): Name-to-ID mapping?

Emerging - Multi-model Databases
Service A
Service B
Service C
Service D
Key-value semantics
CQL
JSON
Gremlin
DSE database
DSE Graph

Agenda
4 Persistence + Streaming – Featuring Kafka

Apache Kafka Overview
• First developed by LinkedIn
• Top-level Apache Project since 2012
• Distributed streaming platform
• Used for real-time data pipelines and streaming applications
• Horizontal scalability / high performance
• High availability / Fault tolerance
• Stream persistence and querying (KSQL)
• Connect framework
Apache Kafka ® Apache Software Foundation

Kafka Concepts
• Topics
– Collection of key/value pairs
– Append-only
– Can be partitioned
• Producers
• Consumers
– Separate offsets

Kafka Concepts
• Streams applications
– Combined Producer/Consumer
• KSQL
– Query language used by stream applications

Kafka Concepts
• Brokers
• Clusters
• Connect Framework
– Sources
– Sinks

Cassandra + Kafka – Similarities and Distinctives
• Concepts in common
– Distributed Systems
– Partitioning / Hashing
– Replication
• Slight differences in implementation
– Multi-DC
– Log-structured
– TTL / retention
• Cassandra excels at…
– High volume, write intensive data storage workloads at scale
– Suitable as a system of record
– High performance searching via DSE
• Kafka excels at…
– Streaming data to/from services and legacy data sources
– Acting upon changes in data from multiple sources (aka pipelines)
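
Partitioning by hash and replication, the concepts the slide lists as common to both systems, can be sketched with a simple hash ring: hash the key to a position, then place copies on the next few distinct nodes. This is a simplified model under stated assumptions. `Ring`, `token`, and `replicas` are hypothetical names, SHA-256 is an arbitrary stand-in hash, and neither Cassandra's nor Kafka's actual placement logic is reproduced here.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a position on the ring (stand-in hash function)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Nodes sit at hashed positions; a key is owned by the nodes after it."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        self.ring = sorted((token(node), node) for node in nodes)
        self.tokens = [position for position, _ in self.ring]

    def replicas(self, key):
        """Return the nodes holding copies of the data for this key."""
        # First node clockwise from the key's position, wrapping at the end.
        start = bisect.bisect_right(self.tokens, token(key)) % len(self.ring)
        count = min(self.replication_factor, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
owners = ring.replicas("video-42")
```

Every client that hashes the same key reaches the same owners without consulting a central directory, which is what lets both systems stay decentralized as they scale out.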

Cassandra + Kafka
Better Together – using the best of both

Pattern 1: Cassandra + Kafka in Microservices
Some Producer
My microservice
• Consume topic(s)
• Read / write data
• Publish to topic(s)
DataStax Enterprise
Other consumers
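
The three steps of this pattern (consume from a topic, read and write data, publish to a topic) can be sketched with in-memory stand-ins. Everything named here is hypothetical: `InMemoryBroker` replaces Kafka with a simple queue per topic (it removes events on consume, which a real log-based topic does not), a dictionary replaces the database table, and `RatingsService` with its topic names is invented for illustration.

```python
from collections import deque


class InMemoryBroker:
    """Stand-in for a streaming platform: one FIFO queue per topic."""

    def __init__(self):
        self.topics = {}

    def publish(self, topic, event):
        self.topics.setdefault(topic, deque()).append(event)

    def consume(self, topic):
        queue = self.topics.get(topic)
        return queue.popleft() if queue else None


class RatingsService:
    """Consumes rating events, stores them, and publishes the new average."""

    IN_TOPIC = "user-rated-video"
    OUT_TOPIC = "rating-stored"

    def __init__(self, broker, table):
        self.broker = broker
        self.table = table  # (videoid, userid) -> rating

    def handle_next(self):
        event = self.broker.consume(self.IN_TOPIC)       # 1. consume topic
        if event is None:
            return False
        key = (event["videoid"], event["userid"])
        self.table[key] = event["rating"]                 # 2. write data
        ratings = [rating for (videoid, _), rating in self.table.items()
                   if videoid == event["videoid"]]        # 2. read data
        self.broker.publish(self.OUT_TOPIC, {             # 3. publish to topic
            "videoid": event["videoid"],
            "average": sum(ratings) / len(ratings),
        })
        return True


broker = InMemoryBroker()
table = {}
service = RatingsService(broker, table)

broker.publish("user-rated-video", {"videoid": "v1", "userid": "u1", "rating": 4})
broker.publish("user-rated-video", {"videoid": "v1", "userid": "u2", "rating": 2})
while service.handle_next():
    pass
```

Other consumers can subscribe to the output topic without the service knowing about them, which keeps the services decoupled.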

Cassandra + Kafka – KillrVideo Example
KillrVideo Services
User Management, Video Catalog, Ratings
• Read and write data
DataStax Enterprise
• UserCreated
• YouTubeVideoAdded
• UserRatedVideo
Suggested Videos Service
• Populate graph
• Graph recommender traversal
DSE Graph

Takeaways
Flexibility in selection of databases per microservice
Select and deploy infrastructure based on scale
Use queues to coordinate data synchronization
Use abstraction to minimize the cost of change

Agenda
5 Resources

DataStax Academy
• Free self-paced courses
• DS201: Apache Cassandra™
• DS210: Operations
• DS220: Data Modeling
• DS310: Search
• DS320: Analytics
• DS330: Graph
• Kafka Connector Getting Started
https://academy.datastax.com

Docker and DataStax
• Where
– https://hub.docker.com/u/datastax/
– https://github.com/datastax/docker-images/tree/master/datastax-docker-image-examples
• We provide
– Docker images for DSE, Studio, OpsCenter
– Docker-compose configuration files
– Sample Deployments
• We support
– Installation on dev before 6.7
– Installation on prod from 6.7 (December 2018)

Live Coding on Twitch
• Live coding sessions with advocates and guests
• Working through the challenges of building distributed systems
• Join the conversation and ask questions
• Twitch Rewind: Kafka Connector
– https://www.youtube.com/watch?v=2_BidDK5zGE
https://www.twitch.tv/datastaxacademy

Resources – DataStax Kafka Connector
• Blog
– https://www.datastax.com/2018/12/introducing-the-datastax-apache-kafka-connector
• Download
– https://academy.datastax.com/downloads#connectors
• Docs
– https://docs.datastax.com/en/kafka/doc/index.html
• Demonstration
– https://github.com/clun/kafka-dse/tree/driver2
• Examples
– https://github.com/datastax/kafka-examples

Thank You!
Come visit our booth!
