A session in the DevNet Zone at Cisco Live, Berlin. Big data and the Internet of Things (IoT) are two of the hottest categories in information technology today, yet there are significant challenges when trying to create an end-to-end solution. The worlds of "IT" and “IoT" differ in terms of programming interfaces, protocols, security frameworks, and application lifecycle management. In this talk we will describe proven ways to overcome challenges when deploying a complete “device to datacenter” system, including how to stream IoT telemetry into big data repositories; how to perform real-time analytics on machine data; and how to close the loop with reliable, secure command and control back out to remote control systems and other devices.
1. Dmitri
Chtchourov,
MANTL
Data
Pla6orm,
Microservices
and
BigData
Services
Innova?on
Architect,
CIS
CTO
Group
2. Agenda
Problem
&
Opportunity
What
do
we
want
to
do?
What
is
in
it
for
us?
How
does
it
work?
What
have
we
done
so
far?
Anatomy
of
a
Service
Reference
Architectures
and
real
use
cases
PuMng
it
all
together
3. Problem & Opportunity
Rapid
innova?on
in
compu?ng
and
applica?on
development
services
No
single
service
is
op.mal
for
all
solu.ons
Customers
want
to
run
mul.ple
services
in
a
single
cluster
and
run
mul.ple
clusters
in
Intercloud
environment
...to
maximize
u,liza,on
...to
share
data
between
services
…Complex/BigData
and
Microservices
together
4. Technologies matrix*
Service
Product
Cloud/Virtualiza?on
CIS/AWS/Metacloud/UCS…
Provisioning
Open
Stack/Terraform
Automa?on
Ansible
Clustering
&
Resource
Management
Mesos,
Marathon,
Docker
Load
Balancing
Avi
Networks
ETL
&
Data
Shaping
StreamSets
Log
Data
Gathering
Logstash
Metrics
Gathering
CollectD,
Avi
Networks
Messaging
KaUa,
Solace
Data
Storing
(Batch)
HDFS
Data
Storing
(OLTP/Real-‐?me)
Cassandra
Data
Storing
(Indexing)
Elas?c
search
Data
Processing
Apache
Spark
Visualiza?on
Zoomdata
*Subset
example
5. Cloud
Management
Data
Collect
Data
Storage
Data
Processing
Visualisa.on
Technologies stack
6. Datacenter and solution today
VM7
or
BM7
VM8
or
BM8
VM4
or
BM4
VM5
or
BM5
VM6
or
BM6
VM1
or
BM1
Visualization Service
Data Ingestion Service
Analytics Service
• Configuration and management
of 3 separate clusters
• Resources stay idle if service
is not active
• Need to move data between
clusters for each service
VM2
or
BM2
VM3
or
BM3
VM1
or
BM1
VM2
or
BM2
VM3
or
BM3
7. What do we want to do?
Data
Inges?on
Service
Analy?cs
Service
Visualiza?on
Service
….to
maximize
u,liza,on
...to
share
data
between
services
Shared
cluster
Mul.ple
clusters
9. What is in it for us?
Maximize
u.liza.on
Deliver
more
services
with
smaller
footprint
Shared
clusters
for
all
services
Easier
deployment
and
management
with
unified
service
plaCorm
Shared
data
between
services
Faster
and
more
compe..ve
services
and
solu.ons
Combine
paradigms
for
flexibility
and
func.onality
Run
complex
services
and
microservices
in
the
single
environment
10. How does this work?
Mesos
Slave
Spark
Task
Executor
Mesos
Executor
Mesos
Slave
Docker
Executor
Docker
Executor
Mesos
Master
Task
#1
Task
#2
./python
XYZ
java
-‐jar
XYZ.jar
./xyz
Mesos
Master
Mesos
Master
Spark
Service
Scheduler
Marathon
Service
Scheduler
Zookeeper
quorum
11. How does this work?
Mesos
provides
fine
grained
resource
isola+on
Mesos
Slave
Process
Spark
Task
Executor
Mesos
Executor
Task
#1
Task
#2
./python
XYZ
Compute
Node
Executor
Container
(cgroups)
12. How does this work?
Mesos
provides
scalability
Mesos
Slave
Process
Spark
Task
Executor
Task
#1
Task
#2
./ruby
XYZ
Compute
Node
Python
executor
finished,
more
available
resources
more
Spark
Container
(cgroups)
Task
#3
Task
#4
13. How does this work?
Mesos has no single point of failure
Mesos
Master
Mesos
Master
Mesos
Master
VM1
or
BM1
VM2
or
BM2
VM3
or
BM3
VM4
or
BM4
VM5
or
BM5
Services keep running if VM fails!
14. How does this work?
Master node can failover
Mesos
Master
Mesos
Master
Mesos
Master
VM1
or
BM1
VM2
or
BM2
VM3
or
BM3
VM4
or
BM4
VM5
or
BM5
Services keep running if Mesos Master fails!
15. How does this work?
Slave
process
can
failover
Tasks
keep
running
if
Mesos
Slave
Process
fails!
Mesos
Slave
Process
Spark
Task
Executor
Task
#1
Task
#2
./ruby
XYZ
Compute
Node
Task
#3
Task
#4
16. How does this work?
Can
deploy
in
many
environments
Get
orchestrated
by
Openstack,
Ansible
(scripts),
Cloudbreak
True
Hybrid
Cloud
deployment:
CIS,
AWS,
UCS,
vSphere,
other
CIS/
CIS/AWS/Metpod/vSphere/UCS…
Terraform
REST
API
REST
API
Scripted
provisioning
Direct
provisioning
Policy,
Auto-‐scaling
VM1
or
BM1
VM2
or
BM2
VM3
or
BM3
VM4
or
BM4
VM5
or
BM5
17. How does this work?
Microservices managed and scaled separately
Microservices managed by Mesos in a single platform
Microservices architecture for Mesos frameworks and other
components
CIS/AWS/Metacloud/vSphere/UCS…
Terraform
Spark
Executor
N
Spark
Executor 1
Spark
Scheduler
Kafka
Broker N
Kafka
Broker 1
Kafka
Scheduler
Docker Docker
TraefikMicroservices …
REST API
REST API
Scripted provisioning
Direct
provisioning
Policy, Auto-scaling
VM1
or
BM1
VM2
or
BM2
VM3
or
BM3
VM4
or
BM4
VM5
or
BM5
18. What have we done so far?
Working
with
partners
on
adop.ng
and
co-‐developing
services
Partners
Co-‐development
Partners
19. Anatomy of the service/framework
Riak
is
Basho
Technologies
distributed
highly
available
database
Op?mized
Mul?-‐Datacenter
opera?on
We
are
working
together
with
Basho
Labs
on
developing
and
tes?ng
their
Mesos
Service
version
of
the
product
20. Riak Use Cases
Online / Commerce
● Session Control
● Shopping Cart
● Product Ratings and Reviews
Internet of Things
● Connected Device Data
● Sensor Data
● Log Data
Content Management
● Storing Unstructured Data
● Content Personalization
● Advertising Data
Gaming
● Store Leaderboard Info
● Store Bet Transactions
● Online Chat
Digital Communications
● Online Community Chat
● Notification and Alerting
● Mobile Messaging
21. Development phases
Phase
0:
Package
applica?on
in
Docker
container
to
deploy
on
Mesos
Phase
1:
Convert
applica?on
to
Microservices
Architecture
to
deploy
as
Mesos
applica?on
with
mul?ple
components
Phase
2:
Create
an
intelligent
scalable
Mesos
service
based
on
the
applica?on
27. Anatomy of the service/framework
Zoomdata
is
distributed
highly
available
large
scale
visualiza?on
pla6orm
Op?mized
very
big
data
set
micro-‐query
analy?cs
We
are
working
together
with
Zoomdata
on
developing
and
tes?ng
their
Mesos
Service
version
of
the
product
29. Zoomdata Service: Mesos + Kubernetes
Mesos
Slave
Mesos
Master
Mesos
Slave
Mesos
Slave
Zoomdata
web
app
Mongodb
Spark
Worker
Spark
Executor
Spark
Executor
Proxy
(haproxy,
nginx)
Kubernetes
Mongo
Service/RC
Kubernetes
Spark-‐Proxy
Service/RC
Spark-‐Proxy
Zoomdata
web
app
Zoomdata
web
app
Kubernetes
Framework
Kubernetes
Zoomdata
Scheduler
Service/RC
Zoomdata
Scheduler
ProxyGen
Script
User
● Every
component
(Zoomdata
App,
MongoDB,
Spark-‐Proxy,
Scheduler)
must
be
started
in
independent
K8s
Pod
and
there
must
be
exactly
one
MongoDB,
Spark-‐Proxy
and
Scheduler
Pods
meanwhile
Zoomdata
App
can
be
scaled
with
help
Kubernetes
Replica?on
Controller.
● There
must
be
defined
Kubernetes
Service
for
MongoDB,
Spark-‐Proxy,
Scheduler
as
they
will
be
used
in
Zoomdata’s
App
Pod.
Every
docker
container
will
have
env
variables
for
every
present
Service
injected
automa?cally.
30. Anatomy of the service/framework
StreamSets
is
an
open
source
con?nuous
big
data
ingest
infrastructure
Accelerates
?me
to
analysis
with
unprecedented
transparency
and
processing
to
data
in
mo?on.
Cluster
deployments
JVM,
Docker,
Spark
Streaming
on
Mesos
Con?nuous
Opera?ons
to
Minimize
down?me
31. Advantages of Streamsets
Adaptable
Data
flow
-‐
Design
and
execute
intent-‐driven
data
flows
in
a
graphical
IDE
Instream
Sani?za?on
-‐
transform
and
process
the
data
on
the
fly
Intelligent
Monitoring
-‐
Get
early
warnings,
detect
anomalies
and
take
ac?on
Link
origins
to
des?na?ons
with in-stream data preparation
32. Streamsets Data Pipeline
MESOS
Streamsets Data Collector (SDC) Architecture
Cluster
Streaming
mode
Data
Collector
runs
as
an
applica?on
within
Spark
Streaming,
Spark
Streaming
runs
on
Mesos
cluster
manager
to
process
data
from
a
KaUa
cluster.
The
Data
Collector
uses
a
cluster
manager
and
a
cluster
applica?on
to
spawn
workers
as
needed.
Cluster
Batch
Mode
:
Data
Collector
processes
all
available
data
from
HDFS
and
then
stops
the
pipeline.
MapReduce
generate
addi?onal
worker
nodes
as
needed.
Standalone
mode
Single
Data
Collector
process
runs
the
pipeline.
A
pipeline
runs
in
standalone
mode
by
default.
33. MANTL Data Platform Overview
A
modern,
baneries
included
pla6orm
for
rapidly
deploying
globally
distributed
services.
Mantl’s
goal
is
to
provide
a
fully
func?onal,
instrumented,
and
portable
container
based
PaaS
for
your
business
at
the
push
of
a
bunon
1)
Easy
deployment
and
configura?on
on
different
pla6orms
2)
High
availability
and
self-‐healing
3)
Mul?-‐datacenter
support
4)
Linear
scalability
5)
Smart
resource
management
6)
Wide
range
of
supported
frameworks
34. MANTL nodes
Consul
for
service
discovery
Mesos
cluster
manager
Marathon
for
cluster
management
Docker
container
run?me
Zookeeper
for
configura?on
management
Docker
containers
Any
Mesos-‐based
workloads
Traefik
for
proxying
external
traffic
into
services
running
in
the
cluster
35. Security
&
Opera.on
Frameworks
PlaCorm
Support
Mantl Components
Core
Components
➢ Data
Storage
-‐
Riak,
Cassandra,
HDFS
➢ Data
processing
-‐
Spark
➢ Security
-‐
Vault
➢ Data
inges?on
–
KaYa
➢ Metrics
collec?on
-‐
Collectd
➢ Logs
forwarding
-‐
Logstash
➢ Provisioning
-‐
Terraform,
Ansible
➢ Cluster
management
–
Mesos,
Marathon
➢ Service
discovery
and
configura?on
management
-‐
Consul,
Zookeeper,
Traefik
➢ Container
run?me
-‐
Docker
➢ Cisco
Cloud
Services,
Cisco
MetaCloud
➢ Amazon
Web
Services
➢ Google
Compute
Engine
➢ Openstack
➢ DigitalOcean
➢ Bare
Metal
➢ Autoscaling
and
high
availability
➢ Applica?on
load
balancer
➢ Applica?on
dynamic
firewall
➢ Manage
Linux
user
accounts
➢ Authen?ca?on
and
authoriza?on
for
Consul,
Mesos,
Marathon
36. Long
Running
Services
Big
Data
Processing
Batch
Scheduling
Supported Mesos Frameworks
Data
Storage
Mesos
makes
it
easy
to
develop
distributed
systems
by
providing
high-‐level
building
blocks.
37. ANALYTICS
PLATFORM
MANAGEMENT
Data
Inges?on
• KaUa,
Streamsets
configurators
Data
Storage
Riak,
Cassandra,
HDFS
Model
DevOps
Machine
learning
MLLib,
Spark
Model
Deployment
• Model
loading,
versioning
Cluster
Management
&
Scheduling
Cluster
manager
Mesos
Cluster
Management
long
running
service
Marathon
Service
Discovery
Consul
Distributed
Virtual
network
Calico
ETCD
ADVANCED
ANALYTICS
APPS
Analy?cs
Accelerators
as
Apps
• Forecas?ng,
NLP,
op?miza?on,
enrichment
etc.
SPECIALIZED
ADVANCED
ANALYTICS
MODELS
Consul?ng
Services
Design,
Build,
Deploy
Maintain,
Manage
Performance
DASHBOARDS
ZoomData
Tableau,
Qlik,
Spo6ire,
Excel/BI
Cubes
…
BUSINESS
APPS
Custom
ZoomData
Visualiza?ons
(D3)
Custom
Applica?ons
Customer
System
Integra?on
CUSTOMIZATION
&
MANAGED
SERVICES
CISCO
INTERCLOUD
Customization MANTL Data Platform
38. Sample Architecture for Batch Data
Processing
Cassandra
Elastic
Search
Spark
Spark Mllib
Riak
Kibana
Dashboard
VisualisationStorage
Stream
Sets
I/P in multiple formats
Text, logs and json from
various storage source.
Spark application process data
and store to elastic search or
Cassandra or Riak storage for
visualization else it stores in
HDFS
Machine learning algorithm for
data science application
Zoomdata
Data
Discovery
D3
Web
Application
HDFS
StreamSets Data Collector
runs as an application in
Spark Streaming to pull
data from origin to spark
CSV, Tab
delimited etc.
LOG file
JSON
TEXT
39. Sample Architecture for Data Streaming
Kafka
Cassandra
Elastic
Search
Spark
Spark Mllib
Riak
Kibana
Dashboard
VisualisationStorage
Stream
Sets
Streaming network from
different sources
Kafka is used for collecting streaming data and
data is consumed through consumer API by
Streamset for further processing.
Spark application process data
and store to elastic search or
Cassandra or Riak storage.
Machine learning algorithm for
data science application
Zoomdata
Data
Discovery
D3
Web
Application
40. Use Case 1 - Shipped Analytics
Collect log metric from cluster to analyze and drive Alert/Recommend
engine
• Alert Engine - produces alert messages on a basis of some conditions.
• Trend Engine - produces trend messages related to data aggregation.
• Policy Engine – derives from Alert and Trend Engines produces policy
messages which contain recommendations.
41. Use Case 1 - Shipped Analytics Architecture
Central
Cluster
Probe
Probe
Probe
43. • Iden?fy
the
top
technology
trends
by
analyzing
public
data
and
open
source
projects
• Use
machine
learning
to
process
a
wide
range
of
public
data
available
on
the
world
wide
web
and
iden?fy
high
poten?al
emerging
technologies
• Publish
results
to
a
web-‐based
dashboards
and
refresh
results
regularly
Use case 2 - Emerging Top technology using Public data
45. Use Case 2 - Dataflow
APIs
RSS
feeds
Scraping
Numeric
Network
Data
Text
data
(ar?cles,
blogs)
Staging
tables
Interac.ve
D3
Dashboards
Websites
Data Sources
D a t a
Extractors
Data
Storage
Data Processing Machine Learning Visualization
Below we used the framework to execute the project in CIS Data Platform
46. Lambda Reference Architecture
Monitoring
/
Analy?cs
Cluster
(local,
Texas-‐3)
Global
Monitoring
/
Analy?cs
Cluster
(global,
Texas-‐1)
Monitoring
/
Analy?cs
Cluster
(local,
Ams.
-‐1
)
Monitoring
/
Analy?cs
Cluster
(local,
Lon.-‐1)
Local
components
and
deployment
is
the
same
as
global,
just
smaller
Real-‐.me
and
batch
processing
(Lambda),
anomaly
detec.on,
visualiza.on
SSL
KaUa
SSL
SSL
MQTT
47. MANTL Data Platform in Practice:
putting it all together
Working
on
advanced
enabling
technology
–
Mesos,
K8S,
Orchestra?on
Working
on
developing
individual
components
–
dev
&
co-‐dev:
Zoomdata,
Riak,
Streamsets,
etc.
PuMng
together
reference
architectures
and
real
solu?ons
to
test
and
further
develop
the
technology
Provide
innova?on
and
advanced
services
to
customers
Pla6orm
to
develop
and
deliver
Microservices
and
Data
applica?ons
49. Next steps
Con?nue
partnerships
and
co-‐devlopment
efforts
with
industry
leaders
to
deliver
innova?on
Con?nue
applying
new
developed
technology
to
real
use
cases
and
PoC
with
customers
and
partners
Con?nue
working
closely
with
A&E
and
Product
teams
on
produc?za?on
roadmap
Work
with
A&E
team
closely
on
priori?za?on
of
our
R&D
ac?vi?es
to
stay
closely
aligned
50. Anatomy of the service/framework
Elas?csearch
is
a
highly
scalable
open-‐source
full-‐text
search
and
analy?cs
engine
Allows
to
store,
search,
and
analyse
big
volumes
of
data
quickly
and
in
near
real
?me
Underlying
technology
in
applica?on
to
Op?mize
complex
search
in
Big
data
We
are
working
together
with
Elas?c
developing
and
tes?ng
their
UTILIZING
Mesos
cluster
to
run
Elas?csearch
51. Elas?csearch
on
Mesos
Cluster
Elas?c
framework
scheduler
Marathon
framework
scheduler
Chronos
framework
scheduler
Zookeper
Chronos
Executor
Marathon
Executor
HA
Proxy
node
Step
1:
Mesos
Cluster
with
Marathon
&
Chronos
running
Step
2:Elas?c
framework
installa?on
on
MESOS
Master
with
a
configured
#
of
mesos
slaves
to
be
launched
Step
3:
Deploys
the
ES
executore
in
MESOS
slaves
Step
4:
ES
nodes
discovery
and
Zookeper
pugin
in
ES
nodes
Step
5
Using
plugin
nodes
find
each
other
and
search
is
op?mized
at
cluster
level
Elas?csearch
executor
&
Zookeper
pugin
52. MANTL Architecture – Datacenters
Control
nodes
manage
the
cluster
and
resource
nodes.
Containers
automa?cally
register
themselves
into
DNS
so
that
other
services
can
locate
them.
Once
WAN
joining
is
configured,
each
cluster
can
locate
services
in
other
data
centres
via
DNS
or
Consul
API
Single
Datacentre
Mul?ple
Datacentre
53. Client Client Client
RPC
over DNSmask
RPC
over DNSmask
LAN gossip
over DNSmask
Server
Server
(Leader)
Server
replication replication
Lead
forwarding
Internet
Server
Server
(Leader)
Server
replication replication
Lead
forwarding
Datacenter 1
Datacenter 2
Remote DC
forwarding
WAN gossip
TCP&UDP
Consul
➢ Service discovery
➢ Client health-checking
➢ Key-value store for
configurations
➢ Multi-datacenter support
54. Mesos features
Mesos
makes
it
easy
to
develop
distributed
systems
by
providing
high-‐level
building
blocks.
➢ Scalability
➢ Fault-tolerance and self-healing
➢ Resource isolation
➢ Fine Grained resource elasticity
57. ZK
ZK
ZK
Zookeeper quorum
JN
JN
JN
Shared edits
DataNode
DataNode
Active
NameNode
Zookeeper
Failover Controller
Active
NameNode
Zookeeper
Failover Controller
DataNode
Heartbeat Heartbeat
Write Read
Active NN state
monitoring
Standby NN state
monitoring
Monitor and
maintain active
lock
Monitor and try
to take active
lock
➢ Used
to
store
and
distribute
data
accross
a
cluster
➢ Is
a
base
for
batch
analy?c
processing
➢ Is
highly
available
and
fault
tolerant
➢ Automa?cally
scaled
and
self-‐
healing
with
Mesos
HDFS framework
59. * Smart broker.id assignment
* Preservation of broker placement
* Rolling restarts
* Easy cluster scale-up
Mesos framework for Kafka
60. Ø Fault
tolerant
job
scheduler
Ø handles
dependencies
and
ISO8601
based
schedules
Ø Flexible
Job
Scheduling
Ø Supports
arbitrarily
long
dependency
chain
Ø supports
the
defini?on
of
jobs
triggered
by
the
comple?on
of
other
jobs
Mesos framework for Chronos
61. How MANTL Data Platform for business
application
• Cisco Data Platform can be used to build custom applications or
service for various analysis and Data analytics initiative.
• Companies can streamline Data ingestion, process,
manipulate , analyse and visualize data all in single
Infrastructure
62. Yali Load Testing Framework
Yali
Elas?csearch
KaUa
Cassandra
HDFS
Plugins
Kafka
Cassandra
HDFS
Storage
Elasticsearch
Generate data to load test storage
63. Elasticsearch Plugin Testing Results
Job
Host
config
Elas.csearch
config
Execu.on
threads
Batches
Records/
batch/thread
Average
response
from
ES,
s
Records/s
Record
size,
b
Records
generated
*
10^6
Execu.on
.me,
min
Win7,
4
cpu,
16
ram
Cluster:
CentOs
6.7,
Elas?csearch
2.1.1,
VPN
network,
2
master(4
cpu,
16
ram),
15
worker
nodes(8
cpu,
32
ram)
12
60
50000
78
6804
280
36
84
Local:
CentOs
6.4,
Elas?csearch
2.1.1,
VMware
virtual
network,
single
node
(2
Core
cpu,
8
Gb
ram)
4
60
10000
1,6
14768
280
2,4
2,5
Records