SlideShare ist ein Scribd-Unternehmen logo
1 von 39
Downloaden Sie, um offline zu lesen
EXTENDING DEVOPS TO
BIG DATA APPLICATIONS
WITH KUBERNETES
ABOUT ME
NICOLA FERRARO
Software Engineer at Red Hat
Working on Apache Camel, Fabric8, JBoss Fuse, Fuse Integration Services for
Openshift
Previously Solution Architect for Big Data systems at Engineering (IT)
@ni_ferraro
SUMMARY
Overview of Big Data Systems and Apache Spark
DevOps, Docker, Kubernetes and Openshift Origin
Demo: Oshinko, Kafka, Spring-Boot and Fabric8 tools
BIG DATA
Big Data ≈ Machine Learning at Scale
Big Data systems are capable of
handling data with high:
Volume
Terabytes, petabytes, ...
Log files
Transactions
​Velocity
Streaming ​data
Micro batching
Near real-time
​Variety
Structured data​
Images
Videos
Free text
Volume
Velocity Variety
Here
> 100
machines
EVOLUTION OF BIG DATA SOFTWARE
http://www.slideshare.net/nicolaferraro/a-brief-history-of-big-data-48525942
Hadoop Pig Hive
(a whole zoo)
Spark
EVOLUTION OF BIG DATA
INFRASTRUCTURE
Commodity hardware
(Really ?)
Specialized hardware
(Appliances)
The Cloud
EVOLUTION OF
ARCHITECTURES:
BATCH
The first Big Data architecture was heavily
based on Hadoop: MapReduce + HDFS.
Some commercial advertising motivations:
Extract meaningful information from raw
data: "companies analyze around <<put-a-
random-number-between-0-and-20-
here>> % of the data they produce"
More data = more precision for machine
learning algorithms = better insights
Assist the decision making process: predict
the future using all the available data
SELECT page, count(*) FROM logs
ARCHITECTURES: BATCH
Hadoop (HDFS + MapReduce) Nodes
App Server
Logs + Data
(Flume)
Report (DWH)
Client
Batch
Processing
(Hive)
MR
EVOLUTION OF
ARCHITECTURES:
HYBRID
Second generation architectures were focused
to improve user experience through machine
learning:
Not everything could be executed in
streaming, for performance reasons
Execute heavyweight steps offline in
batches (generate the "view", e.g. a
machine learning model)
Execute lightweight steps online with
streaming applications
The view must be refreshed periodically
+
ARCHITECTURES: HYBRID (LAMBDA)
Hadoop Nodes
App Server
Client
Batch
Processing
NoSQL + Messaging + Other
Streaming
Processing
EVOLUTION OF
ARCHITECTURES:
STREAMING
The new generation of architectures are
streaming only:
Provide immediate feedback to the user
No need to execute offline work
Provide a personalized experience to each
user
Here we focus (mainly) on
streaming applications ...
ARCHITECTURES: STREAMING (KAPPA)
Hadoop Nodes
App Server
Client
NoSQL + Messaging + Other
Streaming
Processing
Here we focus (mainly) on
streaming applications ...
APACHE SPARK
One of the main reason of Spark success is that
it allows to define, by design, with a unified
model:
Batch processing and streaming
(MapReduce / Storm)
Multi-language: core in Scala, can be also
used in Java, Python and R
Declarative semantics (SQL), procedural
and functional programming (Hive /
MapReduce / Pig)
General purpose data processing and
machine learning (MapReduce / Mahout)
Logistic regression in Spark vs.
Hadoop (it was so in 2013)
And the performance are really impressive!
SPARK ARCHITECTURE
Driver
Cluster
Manager
Driver
(2nd app)
Workers
Driver App
(main)
Executor Executor
Single Worker
Tasks:
"do something on
a data partition"
Assign
executors to
the driver app
Define the app
workflow
Do the
dirty job
Logical viewPhysical view
SPARK ARCHITECTURE: DATA
DISTRIBUTION
HDFS / S3
Spark
Executo
r
Spark
Executo
r
Spark
Executo
r
r/w
Spark
Driver
Kafka
Partition
Spark
Executo
r
Spark
Executo
r
Spark
Executo
r
Spark
Driver
Kafka
Partition
Kafka
Partition
HDFS / S3 HDFS / S3
Why distribution?
... to spread the
work across
thousands of
workers
SPARK DEMO
A simple quickstart showing a batch
application doing a word count.
The purpose is showing which part of the
software is executed in the driver and which
parts are sent to the remote executors.
SPARK: WHAT CAN GO WRONG ???
Programming with Spark is awesome, but there
are a lot of problems you need to solve:
Most of the problems do not happen in a
standalone cluster (serialization issues,
classloading issues, network issues, ...)
Something that works well with a small dataset
hangs forever in production: you must test it
earlier with a lot of data!
Spark applications must be tuned: decide
caching steps, decide partitioning, reduce the
number of shuffles, do not "collect()"
For streaming apps:
Exactly-once message semantics cannot be
guaranteed all the times: you have to write
idempotent operations to take into account
duplicate messages
org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCl
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$Closur
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:1
at org.apache.spark.SparkContext.clean(SparkContext.scala:2055)
at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:324)
at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:323)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationSco
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationSco
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.RDD.map(RDD.scala:323)
AND SPARK IS THE MINOR
PROBLEMEvery operation is executed by different software
distributed in hundreds of machines.
Production different from the development
environment: difficult to maintain software
aligned (patches and upgrades of NoSQL
databases, message brokers, ...): you need to
test the integration with different versions of
external systems
The whole system randomly doesn't work: e.g.
you expect a message in a queue or you expect
a result in a query, and your expectations are
not satisfied
Debugging is hard: it's better to fix bugs
earlier, but bugs appear late in production!
There are consistency problems in some
NoSQL databases: the CAP theorem​
Too many steps from development to release,
with different teams involved: development, qa,
operations I can continue...
THE 2 MAIN PROBLEMS
1
The local development environment is usually too different
from production (replication and versions), so you find bugs late
2
Big Data applications are difficult to test effectively, because
configuring a good testing environment is a hard and long task:
it's difficult to debug bugs all together
How? we can apply DevOps practices with Kubernetes ...
Easy Solution:
Test early
Test more
Test the production software
SUMMARY
Overview of Big Data Systems and Apache Spark
DevOps, Docker, Kubernetes and Openshift Origin
Demo: Oshinko, Kafka, Spring-Boot and Fabric8 tools
DEVOPS
No common definition
Combination of software development, qa
and operations: working together to release
working software frequently
Applying common dev and agile techniques
to operations
It's a cultural change, rather than just a
change in development and administration
tools
Common practices:
Infrastructure as code
Continuous integration
Continuous delivery
DEV
OPS
WHY DEVOPS
Increased communication and
collaboration among: dev, ops, qa,
business
Fast time to market: accomodating
business and customer needs quickly to
have competitive advantage
Embrace change, do not fear it
Embrace innovation (and team
engagement), through usage of new
technologies
Improved quality of software through
automated testing
DEV
OPS
A DEVOPS WORKFLOW
The key point of DevOps is being able to
collaborate on the infrastructure the same
way you collaborate on software:
Common shared repository (SCM)
Pull requests
Peer review
Automated checks (correctness, quality)
Continuous integration
Continuous delivery
Testing the infrastructure like you test
software: scalability, fault tolerance, backup,
recovery options.
DevOps practices become difficult to apply
and uneffective when your infrastructure does
not follow the software release cycle.
DOCKER
Docker has revolutionized the way we build
and ship software.
Containers can be created using a
Dockerfile (example on the right)
There are at least 10 different ways to
create Docker containers
A container is not a VM (even if it seems
really close to it)
A container hosts a single application
Containers need to be orchestrated to
deliver their full power
FROM ubuntu:16.04
RUN apt-key adv --keyserver hkp://keyserver.ubuntu.com:
RUN echo "deb http://repo.mongodb.org/apt/ubuntu $(cat
RUN apt-get update && apt-get install -y mongodb-org
# Create the MongoDB data directory
RUN mkdir -p /data/db
# Expose port #27017 from the container to the host
EXPOSE 27017
# Set /usr/bin/mongod as the dockerized entry-point app
ENTRYPOINT ["/usr/bin/mongod"]
DOCKER ARCHITECTURE
Docker is the building block of infrastructure as code:
The image is completely described by the Dockerfile
Any container is stateless (although stateful data can be attached with volumes).
This means:
You don't ssh your server and change the configuration
You rebuild a new version of the application when you need to change something
"Deployment" means basically replacing a container with a different container
KUBERNETES
Cloud platform, with several deployment
options (including local and private cloud), to
Orchestrate (Docker) containers:
Born at Google
Production ready (Pokemon Go!)
Provides:
Application composition in namespaces
Virtual networks
Service discovery
Load balancing
Auto (and manual) scaling
Automatic recovery
Health checking
Rolling upgrades
Monitoring
Logging
....
kubectl run --image=nginx nginx-app --port=80
oc run --image=nginx nginx-app --port=80
# or better..
# oc new-app nginx --name nginx-app2 BOTH Kuebertes and Openshift Origin
are Open Source projects!
KUBERNETES ARCHITECTURE
Very high level architecture of Kubernetes.
Note: it resembles the Spark architecture seen in previous slides (with the Kubernetes master playing the
role of the cluster manager), but we want to deploy a whole Spark application inside Kubernetes
(including the driver).
KUBERNETES FOR
LOCAL DEV
Minikube
https://github.com/kubernetes/minikube
OPENSHIFT FOR
LOCAL DEV
Minishift
"oc cluster up"
https://github.com/minishift/minishift
$ minikube start
Starting local Kubernetes cluster...
Running pre-create checks...
Creating machine...
Starting local Kubernetes cluster...
$ kubectl run hello-minikube --image=gcr.io/google_containers/echoserver:1.4 --port=8080
$ minishift start
...
$ oc cluster up
-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
...
SUMMARY
Overview of Big Data Systems and Apache Spark
DevOps, Docker, Kubernetes and Openshift Origin
Demo: Oshinko, Kafka, Spring-Boot and Fabric8 tools
OPENSHIFT ORIGIN DEMO
A simple recommender system using collaborative filtering on Openshift origin.
Deploy a:
Spring-boot microservice
Kafka Broker
Spark Cluster
Into openshift origin.
Source code:
https://github.com/nicolaferraro/voxxed-bigdata-spark
Rating **** Rating ****
Recommendations
Recommendations
Spark Cluster
SPARK ON
KUBERNETES
The problem:
A Spark application needs a cluster of n
worker machines to run
You need 1 cluster manager node
The application runs on 1 driver node
We want to deploy everything
inside Kubernetes or Openshift Origin
Basically, every piece of the Spark architecture
should become a POD.
There are multiple solutions. I'm going to
introduce the Oshinko project.
http://radanalytics.io/
https://github.com/radanalyticsio
Driver
Cluster
Manager
Recall the Spark architecture
PODs
Worker Worker Worker
OSHINKO
Making Apache Spark Cloud Native on Openshift Origin.
W WW
CM
D
Spark Cluster
W WW
CM
D
Spark Cluster
1 cluster per application!
Deployed in the application namespace
Oshinko console
DEMO TIME
We will deploy the Oshinko rest server and then deploy the spark
streaming application. It will connect to Kafka to retrieve ratings
and return personalized (not really) recommendations.
Sources:
https://github.com/nicolaferraro/voxxed-bigdata-spark
I used a basic implementation of the slope one algorithm
( ) for collaborative
filtering.
It's just an example of "real life" custom Spark application that
sometimes doesn't work :)
https://en.wikipedia.org/wiki/Slope_One
TESTING
One of the major advantages of making a cloud native Spark application is the
possibility of executing system tests on software identical to production.
Production pods
Optional testing tools or load
generators
Test driver (inside or outside the namespace)
You can create virtual test environments on the fly and do assertions.
Demo: using fabric8-arquillian and the fabric8 kubernetes client
TESTING: SOME SCENARIOS
Some tests that you may want to run:
Deploy the production infrastructure (or part of it) and run a suite of functional
tests to ensure every piece works as expected to produce the result
Deploy the production infrastructure and do performance tests using external
or injected data
Deploy the production infrastructure (or part of it) and check that every piece is
healthy. Then randomly kill pods while sending a load to test the system
availability (system tests)
The Chaos Monkey!
FABRIC8
We have used some tools from Fabric8:
Fabric8 Maven Plugin: to build Kubernetes resources for Java projects
Docker Maven Plugin: used by the fabric8 maven plugin to create docker images
Fabric8 Arquillian: to create system tests from artifacts using the fabric8 plugin
Fabric8 Kubernetes Client: to interact with Kubernetes resources (pods, services, etc.)
Fabric8 is a complete development platform for
Kubernetes!
Uses Jenkins Pipelines and to provide:
Continuous integration
Continuous delivery
Out of the box!
https://fabric8.io/
QUESTIONS?
Q/A
What about storage and data locality? In cloud native applications, using a storage
system like s3 has many advantages over HDFS (costs, availability, usage patterns in
testing). Performance of HDFS are superior (6x), but you can use larger clusters for a
limited amount of time to gain the same performance. More info:
Can you scale the application directly from Openshift, or you always need to
change the code? Sure, you can use Openshift to scale the application. Beware that, in
a pure DevOps approach, changing the configuration of a system at runtime is not
something you should do, especially on production. You lose the benefits of having all
the software and configuration in a "executable" state, and you lose the effectiveness of
system tests because tests will not run on an environment identical to production. As an
alternative, you can use the auto-scaling feature (enabling it in the code).
Is the Spark demo using a machine learning algorithm? The demo is using a basic
collaborative filtering algorithm called "slope one" (the focus of this talk is not machine
learning). Spark provides many algorithms out of the box, also for collaborative filtering.
Many of them are supposed to be executed in a batch application, rather than in a
streaming app.
http://qr.ae/TAF4cN
THANKS
Follow me on Twitter!
@ni_ferraro

Weitere ähnliche Inhalte

Was ist angesagt?

Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopEvans Ye
 
Migrating Airflow-based Apache Spark Jobs to Kubernetes – the Native Way
Migrating Airflow-based Apache Spark Jobs to Kubernetes – the Native WayMigrating Airflow-based Apache Spark Jobs to Kubernetes – the Native Way
Migrating Airflow-based Apache Spark Jobs to Kubernetes – the Native WayDatabricks
 
Real time data viz with Spark Streaming, Kafka and D3.js
Real time data viz with Spark Streaming, Kafka and D3.jsReal time data viz with Spark Streaming, Kafka and D3.js
Real time data viz with Spark Streaming, Kafka and D3.jsBen Laird
 
Hadoop on Docker
Hadoop on DockerHadoop on Docker
Hadoop on DockerRakesh Saha
 
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...Spark Summit
 
Apache Spark on K8S Best Practice and Performance in the Cloud
Apache Spark on K8S Best Practice and Performance in the CloudApache Spark on K8S Best Practice and Performance in the Cloud
Apache Spark on K8S Best Practice and Performance in the CloudDatabricks
 
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...Spark Summit
 
What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?DataWorks Summit
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersDataWorks Summit/Hadoop Summit
 
Yahoo - Moving beyond running 100% of Apache Pig jobs on Apache Tez
Yahoo - Moving beyond running 100% of Apache Pig jobs on Apache TezYahoo - Moving beyond running 100% of Apache Pig jobs on Apache Tez
Yahoo - Moving beyond running 100% of Apache Pig jobs on Apache TezDataWorks Summit
 
Hw09 Building Data Intensive Apps A Closer Look At Trending Topics.Org
Hw09   Building Data Intensive Apps  A Closer Look At Trending Topics.OrgHw09   Building Data Intensive Apps  A Closer Look At Trending Topics.Org
Hw09 Building Data Intensive Apps A Closer Look At Trending Topics.OrgCloudera, Inc.
 
OSCON 2013 - Planning an OpenStack Cloud - Tom Fifield
OSCON 2013 - Planning an OpenStack Cloud - Tom FifieldOSCON 2013 - Planning an OpenStack Cloud - Tom Fifield
OSCON 2013 - Planning an OpenStack Cloud - Tom FifieldOSCON Byrum
 
Getting Started Running Apache Spark on Apache Mesos
Getting Started Running Apache Spark on Apache MesosGetting Started Running Apache Spark on Apache Mesos
Getting Started Running Apache Spark on Apache MesosPaco Nathan
 
HDFS on Kubernetes—Lessons Learned with Kimoon Kim
HDFS on Kubernetes—Lessons Learned with Kimoon KimHDFS on Kubernetes—Lessons Learned with Kimoon Kim
HDFS on Kubernetes—Lessons Learned with Kimoon KimDatabricks
 
Apache Spark 2.4 Bridges the Gap Between Big Data and Deep Learning
Apache Spark 2.4 Bridges the Gap Between Big Data and Deep LearningApache Spark 2.4 Bridges the Gap Between Big Data and Deep Learning
Apache Spark 2.4 Bridges the Gap Between Big Data and Deep LearningDataWorks Summit
 

Was ist angesagt? (20)

Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
 
Migrating Airflow-based Apache Spark Jobs to Kubernetes – the Native Way
Migrating Airflow-based Apache Spark Jobs to Kubernetes – the Native WayMigrating Airflow-based Apache Spark Jobs to Kubernetes – the Native Way
Migrating Airflow-based Apache Spark Jobs to Kubernetes – the Native Way
 
Real time data viz with Spark Streaming, Kafka and D3.js
Real time data viz with Spark Streaming, Kafka and D3.jsReal time data viz with Spark Streaming, Kafka and D3.js
Real time data viz with Spark Streaming, Kafka and D3.js
 
Hadoop on Docker
Hadoop on DockerHadoop on Docker
Hadoop on Docker
 
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
 
Apache Spark on K8S Best Practice and Performance in the Cloud
Apache Spark on K8S Best Practice and Performance in the CloudApache Spark on K8S Best Practice and Performance in the Cloud
Apache Spark on K8S Best Practice and Performance in the Cloud
 
Migrating pipelines into Docker
Migrating pipelines into DockerMigrating pipelines into Docker
Migrating pipelines into Docker
 
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
 
Spark Working Environment in Windows OS
Spark Working Environment in Windows OSSpark Working Environment in Windows OS
Spark Working Environment in Windows OS
 
What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
 
Yahoo - Moving beyond running 100% of Apache Pig jobs on Apache Tez
Yahoo - Moving beyond running 100% of Apache Pig jobs on Apache TezYahoo - Moving beyond running 100% of Apache Pig jobs on Apache Tez
Yahoo - Moving beyond running 100% of Apache Pig jobs on Apache Tez
 
Hw09 Building Data Intensive Apps A Closer Look At Trending Topics.Org
Hw09   Building Data Intensive Apps  A Closer Look At Trending Topics.OrgHw09   Building Data Intensive Apps  A Closer Look At Trending Topics.Org
Hw09 Building Data Intensive Apps A Closer Look At Trending Topics.Org
 
OSCON 2013 - Planning an OpenStack Cloud - Tom Fifield
OSCON 2013 - Planning an OpenStack Cloud - Tom FifieldOSCON 2013 - Planning an OpenStack Cloud - Tom Fifield
OSCON 2013 - Planning an OpenStack Cloud - Tom Fifield
 
Getting Started Running Apache Spark on Apache Mesos
Getting Started Running Apache Spark on Apache MesosGetting Started Running Apache Spark on Apache Mesos
Getting Started Running Apache Spark on Apache Mesos
 
Google cloud Dataflow & Apache Flink
Google cloud Dataflow & Apache FlinkGoogle cloud Dataflow & Apache Flink
Google cloud Dataflow & Apache Flink
 
HDFS on Kubernetes—Lessons Learned with Kimoon Kim
HDFS on Kubernetes—Lessons Learned with Kimoon KimHDFS on Kubernetes—Lessons Learned with Kimoon Kim
HDFS on Kubernetes—Lessons Learned with Kimoon Kim
 
Hadoop on-mesos
Hadoop on-mesosHadoop on-mesos
Hadoop on-mesos
 
Apache Spark 2.4 Bridges the Gap Between Big Data and Deep Learning
Apache Spark 2.4 Bridges the Gap Between Big Data and Deep LearningApache Spark 2.4 Bridges the Gap Between Big Data and Deep Learning
Apache Spark 2.4 Bridges the Gap Between Big Data and Deep Learning
 

Andere mochten auch

Monitoring, Logging and Tracing on Kubernetes
Monitoring, Logging and Tracing on KubernetesMonitoring, Logging and Tracing on Kubernetes
Monitoring, Logging and Tracing on KubernetesMartin Etmajer
 
Understanding Kubernetes
Understanding KubernetesUnderstanding Kubernetes
Understanding KubernetesTu Pham
 
Kubernetes in 30 minutes (2017/03/10)
Kubernetes in 30 minutes (2017/03/10)Kubernetes in 30 minutes (2017/03/10)
Kubernetes in 30 minutes (2017/03/10)lestrrat
 
"On-premises" FaaS on Kubernetes
"On-premises" FaaS on Kubernetes"On-premises" FaaS on Kubernetes
"On-premises" FaaS on KubernetesAlex Casalboni
 
An Introduction to Kubernetes
An Introduction to KubernetesAn Introduction to Kubernetes
An Introduction to KubernetesImesh Gunaratne
 
Kubernets Helm - Okay so my cluster's up, how do I manage all the sh*t to run...
Kubernets Helm - Okay so my cluster's up, how do I manage all the sh*t to run...Kubernets Helm - Okay so my cluster's up, how do I manage all the sh*t to run...
Kubernets Helm - Okay so my cluster's up, how do I manage all the sh*t to run...Mike Splain
 
GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...
GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...
GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...Oleg Shalygin
 
Lessons Learned: Using Concourse In Production
Lessons Learned: Using Concourse In ProductionLessons Learned: Using Concourse In Production
Lessons Learned: Using Concourse In ProductionShingo Omura
 
Extend and build on Kubernetes
Extend and build on KubernetesExtend and build on Kubernetes
Extend and build on KubernetesStefan Schimanski
 
Security best practices for kubernetes deployment
Security best practices for kubernetes deploymentSecurity best practices for kubernetes deployment
Security best practices for kubernetes deploymentMichael Cherny
 
Kubernetes and lastminute.com: our course towards better scalability and proc...
Kubernetes and lastminute.com: our course towards better scalability and proc...Kubernetes and lastminute.com: our course towards better scalability and proc...
Kubernetes and lastminute.com: our course towards better scalability and proc...Michele Orsi
 
Oxalide Workshop #5 - Docker avancé & Kubernetes
Oxalide Workshop #5 - Docker avancé & KubernetesOxalide Workshop #5 - Docker avancé & Kubernetes
Oxalide Workshop #5 - Docker avancé & KubernetesLudovic Piot
 
Docker and kubernetes
Docker and kubernetesDocker and kubernetes
Docker and kubernetesDongwon Kim
 
Continuous delivery of microservices with kubernetes - Quintor 27-2-2017
Continuous delivery of microservices with kubernetes - Quintor 27-2-2017Continuous delivery of microservices with kubernetes - Quintor 27-2-2017
Continuous delivery of microservices with kubernetes - Quintor 27-2-2017Arjen Wassink
 
Microservices at scale with docker and kubernetes - AMS JUG 2017
Microservices at scale with docker and kubernetes - AMS JUG 2017Microservices at scale with docker and kubernetes - AMS JUG 2017
Microservices at scale with docker and kubernetes - AMS JUG 2017Arjen Wassink
 
Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1
Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1
Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1Etsuji Nakai
 
London Adapt or Die: Kubernetes, Containers and Cloud - The MoD Story
London Adapt or Die: Kubernetes, Containers and Cloud - The MoD StoryLondon Adapt or Die: Kubernetes, Containers and Cloud - The MoD Story
London Adapt or Die: Kubernetes, Containers and Cloud - The MoD StoryApigee | Google Cloud
 
Global C4IR-1 Masterclass Beart - DevicePilot 2017
Global C4IR-1 Masterclass Beart - DevicePilot 2017Global C4IR-1 Masterclass Beart - DevicePilot 2017
Global C4IR-1 Masterclass Beart - DevicePilot 2017Justin Hayward
 

Andere mochten auch (20)

Monitoring, Logging and Tracing on Kubernetes
Monitoring, Logging and Tracing on KubernetesMonitoring, Logging and Tracing on Kubernetes
Monitoring, Logging and Tracing on Kubernetes
 
Understanding Kubernetes
Understanding KubernetesUnderstanding Kubernetes
Understanding Kubernetes
 
Kubernetes in 30 minutes (2017/03/10)
Kubernetes in 30 minutes (2017/03/10)Kubernetes in 30 minutes (2017/03/10)
Kubernetes in 30 minutes (2017/03/10)
 
"On-premises" FaaS on Kubernetes
"On-premises" FaaS on Kubernetes"On-premises" FaaS on Kubernetes
"On-premises" FaaS on Kubernetes
 
An Introduction to Kubernetes
An Introduction to KubernetesAn Introduction to Kubernetes
An Introduction to Kubernetes
 
From Code to Kubernetes
From Code to KubernetesFrom Code to Kubernetes
From Code to Kubernetes
 
Kubernets Helm - Okay so my cluster's up, how do I manage all the sh*t to run...
Kubernets Helm - Okay so my cluster's up, how do I manage all the sh*t to run...Kubernets Helm - Okay so my cluster's up, how do I manage all the sh*t to run...
Kubernets Helm - Okay so my cluster's up, how do I manage all the sh*t to run...
 
GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...
GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...
GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...
 
Lessons Learned: Using Concourse In Production
Lessons Learned: Using Concourse In ProductionLessons Learned: Using Concourse In Production
Lessons Learned: Using Concourse In Production
 
Extend and build on Kubernetes
Extend and build on KubernetesExtend and build on Kubernetes
Extend and build on Kubernetes
 
Security best practices for kubernetes deployment
Security best practices for kubernetes deploymentSecurity best practices for kubernetes deployment
Security best practices for kubernetes deployment
 
Kubernetes and lastminute.com: our course towards better scalability and proc...
Kubernetes and lastminute.com: our course towards better scalability and proc...Kubernetes and lastminute.com: our course towards better scalability and proc...
Kubernetes and lastminute.com: our course towards better scalability and proc...
 
Oxalide Workshop #5 - Docker avancé & Kubernetes
Oxalide Workshop #5 - Docker avancé & KubernetesOxalide Workshop #5 - Docker avancé & Kubernetes
Oxalide Workshop #5 - Docker avancé & Kubernetes
 
Docker and kubernetes
Docker and kubernetesDocker and kubernetes
Docker and kubernetes
 
Continuous delivery of microservices with kubernetes - Quintor 27-2-2017
Continuous delivery of microservices with kubernetes - Quintor 27-2-2017Continuous delivery of microservices with kubernetes - Quintor 27-2-2017
Continuous delivery of microservices with kubernetes - Quintor 27-2-2017
 
Kubernetes on aws
Kubernetes on awsKubernetes on aws
Kubernetes on aws
 
Microservices at scale with docker and kubernetes - AMS JUG 2017
Microservices at scale with docker and kubernetes - AMS JUG 2017Microservices at scale with docker and kubernetes - AMS JUG 2017
Microservices at scale with docker and kubernetes - AMS JUG 2017
 
Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1
Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1
Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1
 
London Adapt or Die: Kubernetes, Containers and Cloud - The MoD Story
London Adapt or Die: Kubernetes, Containers and Cloud - The MoD StoryLondon Adapt or Die: Kubernetes, Containers and Cloud - The MoD Story
London Adapt or Die: Kubernetes, Containers and Cloud - The MoD Story
 
Global C4IR-1 Masterclass Beart - DevicePilot 2017
Global C4IR-1 Masterclass Beart - DevicePilot 2017Global C4IR-1 Masterclass Beart - DevicePilot 2017
Global C4IR-1 Masterclass Beart - DevicePilot 2017
 

Ähnlich wie Extending DevOps to Big Data Applications with Kubernetes

A DevOps guide to Kubernetes
A DevOps guide to KubernetesA DevOps guide to Kubernetes
A DevOps guide to KubernetesPaul Czarkowski
 
Kubernetes for the PHP developer
Kubernetes for the PHP developerKubernetes for the PHP developer
Kubernetes for the PHP developerPaul Czarkowski
 
OpenSouthCode 2016 - Accenture DevOps Platform 2016-05-07
OpenSouthCode 2016  - Accenture DevOps Platform 2016-05-07OpenSouthCode 2016  - Accenture DevOps Platform 2016-05-07
OpenSouthCode 2016 - Accenture DevOps Platform 2016-05-07Jorge Hidalgo
 
Docker module 1
Docker module 1Docker module 1
Docker module 1Liang Bo
 
Why everyone is excited about Docker (and you should too...) - Carlo Bonamic...
Why everyone is excited about Docker (and you should too...) -  Carlo Bonamic...Why everyone is excited about Docker (and you should too...) -  Carlo Bonamic...
Why everyone is excited about Docker (and you should too...) - Carlo Bonamic...Codemotion
 
Deploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and KubernetesDeploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and KubernetesPetteriTeikariPhD
 
Capital onehadoopintro
Capital onehadoopintroCapital onehadoopintro
Capital onehadoopintroDoug Chang
 
codemotion-docker-2014
codemotion-docker-2014codemotion-docker-2014
codemotion-docker-2014Carlo Bonamico
 
The DevOps paradigm - the evolution of IT professionals and opensource toolkit
The DevOps paradigm - the evolution of IT professionals and opensource toolkitThe DevOps paradigm - the evolution of IT professionals and opensource toolkit
The DevOps paradigm - the evolution of IT professionals and opensource toolkitMarco Ferrigno
 
The DevOps Paradigm
The DevOps ParadigmThe DevOps Paradigm
The DevOps ParadigmNaLUG
 
OWASP WTE - Now in the Cloud!
OWASP WTE - Now in the Cloud!OWASP WTE - Now in the Cloud!
OWASP WTE - Now in the Cloud!Matt Tesauro
 
Containers, Docker, and Microservices: the Terrific Trio
Containers, Docker, and Microservices: the Terrific TrioContainers, Docker, and Microservices: the Terrific Trio
Containers, Docker, and Microservices: the Terrific TrioJérôme Petazzoni
 
The world of Docker and Kubernetes
The world of Docker and Kubernetes The world of Docker and Kubernetes
The world of Docker and Kubernetes vty
 
Kubernetes Java Operator
Kubernetes Java OperatorKubernetes Java Operator
Kubernetes Java OperatorAnthony Dahanne
 
A Fabric/Puppet Build/Deploy System
A Fabric/Puppet Build/Deploy SystemA Fabric/Puppet Build/Deploy System
A Fabric/Puppet Build/Deploy Systemadrian_nye
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonBenjamin Bengfort
 
What's New in Docker - February 2017
What's New in Docker - February 2017What's New in Docker - February 2017
What's New in Docker - February 2017Patrick Chanezon
 

Ähnlich wie Extending DevOps to Big Data Applications with Kubernetes (20)

A DevOps guide to Kubernetes
A DevOps guide to KubernetesA DevOps guide to Kubernetes
A DevOps guide to Kubernetes
 
Kubernetes for the PHP developer
Kubernetes for the PHP developerKubernetes for the PHP developer
Kubernetes for the PHP developer
 
OpenSouthCode 2016 - Accenture DevOps Platform 2016-05-07
OpenSouthCode 2016  - Accenture DevOps Platform 2016-05-07OpenSouthCode 2016  - Accenture DevOps Platform 2016-05-07
OpenSouthCode 2016 - Accenture DevOps Platform 2016-05-07
 
Docker module 1
Docker module 1Docker module 1
Docker module 1
 
Why everyone is excited about Docker (and you should too...) - Carlo Bonamic...
Why everyone is excited about Docker (and you should too...) -  Carlo Bonamic...Why everyone is excited about Docker (and you should too...) -  Carlo Bonamic...
Why everyone is excited about Docker (and you should too...) - Carlo Bonamic...
 
Deploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and KubernetesDeploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and Kubernetes
 
Capital onehadoopintro
Capital onehadoopintroCapital onehadoopintro
Capital onehadoopintro
 
codemotion-docker-2014
codemotion-docker-2014codemotion-docker-2014
codemotion-docker-2014
 
PaaSing Your Code Around
PaaSing Your Code AroundPaaSing Your Code Around
PaaSing Your Code Around
 
The DevOps paradigm - the evolution of IT professionals and opensource toolkit
The DevOps paradigm - the evolution of IT professionals and opensource toolkitThe DevOps paradigm - the evolution of IT professionals and opensource toolkit
The DevOps paradigm - the evolution of IT professionals and opensource toolkit
 
The DevOps Paradigm
The DevOps ParadigmThe DevOps Paradigm
The DevOps Paradigm
 
OWASP WTE - Now in the Cloud!
OWASP WTE - Now in the Cloud!OWASP WTE - Now in the Cloud!
OWASP WTE - Now in the Cloud!
 
Docker 101
Docker 101 Docker 101
Docker 101
 
Containers, Docker, and Microservices: the Terrific Trio
Containers, Docker, and Microservices: the Terrific TrioContainers, Docker, and Microservices: the Terrific Trio
Containers, Docker, and Microservices: the Terrific Trio
 
The world of Docker and Kubernetes
The world of Docker and Kubernetes The world of Docker and Kubernetes
The world of Docker and Kubernetes
 
Kubernetes Java Operator
Kubernetes Java OperatorKubernetes Java Operator
Kubernetes Java Operator
 
Docker Ecosystem on Azure
Docker Ecosystem on AzureDocker Ecosystem on Azure
Docker Ecosystem on Azure
 
A Fabric/Puppet Build/Deploy System
A Fabric/Puppet Build/Deploy SystemA Fabric/Puppet Build/Deploy System
A Fabric/Puppet Build/Deploy System
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and Python
 
What's New in Docker - February 2017
What's New in Docker - February 2017What's New in Docker - February 2017
What's New in Docker - February 2017
 

Mehr von Nicola Ferraro

Camel Day Italia 2021 - Camel K
Camel Day Italia 2021 - Camel KCamel Day Italia 2021 - Camel K
Camel Day Italia 2021 - Camel KNicola Ferraro
 
ApacheCon NA - Apache Camel K: connect your Knative serverless applications w...
ApacheCon NA - Apache Camel K: connect your Knative serverless applications w...ApacheCon NA - Apache Camel K: connect your Knative serverless applications w...
ApacheCon NA - Apache Camel K: connect your Knative serverless applications w...Nicola Ferraro
 
ApacheCon NA - Apache Camel K: a cloud-native integration platform
ApacheCon NA - Apache Camel K: a cloud-native integration platformApacheCon NA - Apache Camel K: a cloud-native integration platform
ApacheCon NA - Apache Camel K: a cloud-native integration platformNicola Ferraro
 
Analyzing Data at Scale with Apache Spark
Analyzing Data at Scale with Apache SparkAnalyzing Data at Scale with Apache Spark
Analyzing Data at Scale with Apache SparkNicola Ferraro
 
Integrating Applications: the Reactive Way
Integrating Applications: the Reactive WayIntegrating Applications: the Reactive Way
Integrating Applications: the Reactive WayNicola Ferraro
 
Cloud Native Applications on Kubernetes: a DevOps Approach
Cloud Native Applications on Kubernetes: a DevOps ApproachCloud Native Applications on Kubernetes: a DevOps Approach
Cloud Native Applications on Kubernetes: a DevOps ApproachNicola Ferraro
 
A brief history of "big data"
A brief history of "big data"A brief history of "big data"
A brief history of "big data"Nicola Ferraro
 

Mehr von Nicola Ferraro (7)

Camel Day Italia 2021 - Camel K
Camel Day Italia 2021 - Camel KCamel Day Italia 2021 - Camel K
Camel Day Italia 2021 - Camel K
 
ApacheCon NA - Apache Camel K: connect your Knative serverless applications w...
ApacheCon NA - Apache Camel K: connect your Knative serverless applications w...ApacheCon NA - Apache Camel K: connect your Knative serverless applications w...
ApacheCon NA - Apache Camel K: connect your Knative serverless applications w...
 
ApacheCon NA - Apache Camel K: a cloud-native integration platform
ApacheCon NA - Apache Camel K: a cloud-native integration platformApacheCon NA - Apache Camel K: a cloud-native integration platform
ApacheCon NA - Apache Camel K: a cloud-native integration platform
 
Analyzing Data at Scale with Apache Spark
Analyzing Data at Scale with Apache SparkAnalyzing Data at Scale with Apache Spark
Analyzing Data at Scale with Apache Spark
 
Integrating Applications: the Reactive Way
Integrating Applications: the Reactive WayIntegrating Applications: the Reactive Way
Integrating Applications: the Reactive Way
 
Cloud Native Applications on Kubernetes: a DevOps Approach
Cloud Native Applications on Kubernetes: a DevOps ApproachCloud Native Applications on Kubernetes: a DevOps Approach
Cloud Native Applications on Kubernetes: a DevOps Approach
 
A brief history of "big data"
A brief history of "big data"A brief history of "big data"
A brief history of "big data"
 

Kürzlich hochgeladen

Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingrknatarajan
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesPrabhanshu Chaturvedi
 
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsRussian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxfenichawla
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...Call Girls in Nagpur High Profile
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGSIVASHANKAR N
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 

Kürzlich hochgeladen (20)

Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and Properties
 
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsRussian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 

Extending DevOps to Big Data Applications with Kubernetes

  • 1. EXTENDING DEVOPS TO BIG DATA APPLICATIONS WITH KUBERNETES
  • 2. ABOUT ME NICOLA FERRARO Software Engineer at Red Hat Working on Apache Camel, Fabric8, JBoss Fuse, Fuse Integration Services for Openshift Previously Solution Architect for Big Data systems at Engineering (IT) @ni_ferraro
  • 3. SUMMARY Overview of Big Data Systems and Apache Spark DevOps, Docker, Kubernetes and Openshift Origin Demo: Oshinko, Kafka, Spring-Boot and Fabric8 tools
  • 4. BIG DATA Big Data ≈ Machine Learning at Scale Big Data systems are capable of handling data with high: Volume Terabytes, petabytes, ... Log files Transactions ​Velocity Streaming ​data Micro batching Near real-time ​Variety Structured data​ Images Videos Free text Volume Velocity Variety Here > 100 machines
  • 5. EVOLUTION OF BIG DATA SOFTWARE http://www.slideshare.net/nicolaferraro/a-brief-history-of-big-data-48525942 Hadoop Pig Hive (a whole zoo) Spark
  • 6. EVOLUTION OF BIG DATA INFRASTRUCTURE Commodity hardware (Really ?) Specialized hardware (Appliances) The Cloud
  • 7. EVOLUTION OF ARCHITECTURES: BATCH The first Big Data architecture was heavily based on Hadoop: MapReduce + HDFS. Some commercial advertising motivations: Extract meaningful information from raw data: "companies analyze around <<put-a- random-number-between-0-and-20- here>> % of the data they produce" More data = more precision for machine learning algorithms = better insights Assist the decision making process: predict the future using all the available data SELECT page, count(*) FROM logs
  • 8. ARCHITECTURES: BATCH Hadoop (HDFS + MapReduce) Nodes App Server Logs + Data (Flume) Report (DWH) Client Batch Processing (Hive) MR
  • 9. EVOLUTION OF ARCHITECTURES: HYBRID Second generation architectures were focused to improve user experience through machine learning: Not everything could be executed in streaming, for performance reasons Execute heavyweight steps offline in batches (generate the "view", e.g. a machine learning model) Execute lightweight steps online with streaming applications The view must be refreshed periodically +
  • 10. ARCHITECTURES: HYBRID (LAMBDA) Hadoop Nodes App Server Client Batch Processing NoSQL + Messaging + Other Streaming Processing
  • 11. EVOLUTION OF ARCHITECTURES: STREAMING The new generation of architectures are streaming only: Provide immediate feedback to the user No need to execute offline work Provide a personalized experience to each user Here we focus (mainly) on streaming applications ...
  • 12. ARCHITECTURES: STREAMING (KAPPA) Hadoop Nodes App Server Client NoSQL + Messaging + Other Streaming Processing Here we focus (mainly) on streaming applications ...
  • 13. APACHE SPARK One of the main reason of Spark success is that it allows to define, by design, with a unified model: Batch processing and streaming (MapReduce / Storm) Multi-language: core in Scala, can be also used in Java, Python and R Declarative semantics (SQL), procedural and functional programming (Hive / MapReduce / Pig) General purpose data processing and machine learning (MapReduce / Mahout) Logistic regression in Spark vs. Hadoop (it was so in 2013) And the performance are really impressive!
  • 14. SPARK ARCHITECTURE Driver Cluster Manager Driver (2nd app) Workers Driver App (main) Executor Executor Single Worker Tasks: "do something on a data partition" Assign executors to the driver app Define the app workflow Do the dirty job Logical viewPhysical view
  • 15. SPARK ARCHITECTURE: DATA DISTRIBUTION HDFS / S3 Spark Executo r Spark Executo r Spark Executo r r/w Spark Driver Kafka Partition Spark Executo r Spark Executo r Spark Executo r Spark Driver Kafka Partition Kafka Partition HDFS / S3 HDFS / S3 Why distribution? ... to spread the work across thousands of workers
  • 16. SPARK DEMO A simple quickstart showing a batch application doing a word count. The purpose is showing which part of the software is executed in the driver and which parts are sent to the remote executors.
  • 17. SPARK: WHAT CAN GO WRONG ??? Programming with Spark is awesome, but there are a lot of problems you need to solve: Most of the problems do not happen in a standalone cluster (serialization issues, classloading issues, network issues, ...) Something that works well with a small dataset hangs forever in production: you must test it earlier with a lot of data! Spark applications must be tuned: decide caching steps, decide partitioning, reduce the number of shuffles, do not "collect()" For streaming apps: Exactly-once message semantics cannot be guaranteed all the times: you have to write idempotent operations to take into account duplicate messages org.apache.spark.SparkException: Task not serializable at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCl at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$Closur at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:1 at org.apache.spark.SparkContext.clean(SparkContext.scala:2055) at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:324) at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:323) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationSco at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationSco at org.apache.spark.rdd.RDD.withScope(RDD.scala:316) at org.apache.spark.rdd.RDD.map(RDD.scala:323)
  • 18. AND SPARK IS THE MINOR PROBLEMEvery operation is executed by different software distributed in hundreds of machines. Production different from the development environment: difficult to maintain software aligned (patches and upgrades of NoSQL databases, message brokers, ...): you need to test the integration with different versions of external systems The whole system randomly doesn't work: e.g. you expect a message in a queue or you expect a result in a query, and your expectations are not satisfied Debugging is hard: it's better to fix bugs earlier, but bugs appear late in production! There are consistency problems in some NoSQL databases: the CAP theorem​ Too many steps from development to release, with different teams involved: development, qa, operations I can continue...
  • 19. THE 2 MAIN PROBLEMS 1 The local development environment is usually too different from production (replication and versions), so you find bugs late 2 Big Data applications are difficult to test effectively, because configuring a good testing environment is a hard and long task: it's difficult to debug bugs all together How? we can apply DevOps practices with Kubernetes ... Easy Solution: Test early Test more Test the production software
  • 20. SUMMARY Overview of Big Data Systems and Apache Spark DevOps, Docker, Kubernetes and Openshift Origin Demo: Oshinko, Kafka, Spring-Boot and Fabric8 tools
  • 21. DEVOPS No common definition Combination of software development, qa and operations: working together to release working software frequently Applying common dev and agile techniques to operations It's a cultural change, rather than just a change in development and administration tools Common practices: Infrastructure as code Continuous integration Continuous delivery DEV OPS
  • 22. WHY DEVOPS Increased communication and collaboration among: dev, ops, qa, business Fast time to market: accomodating business and customer needs quickly to have competitive advantage Embrace change, do not fear it Embrace innovation (and team engagement), through usage of new technologies Improved quality of software through automated testing DEV OPS
  • 23. A DEVOPS WORKFLOW The key point of DevOps is being able to collaborate on the infrastructure the same way you collaborate on software: Common shared repository (SCM) Pull requests Peer review Automated checks (correctness, quality) Continuous integration Continuous delivery Testing the infrastructure like you test software: scalability, fault tolerance, backup, recovery options. DevOps practices become difficult to apply and uneffective when your infrastructure does not follow the software release cycle.
  • 24. DOCKER Docker has revolutionized the way we build and ship software. Containers can be created using a Dockerfile (example on the right) There are at least 10 different ways to create Docker containers A container is not a VM (even if it seems really close to it) A container hosts a single application Containers need to be orchestrated to deliver their full power FROM ubuntu:16.04 RUN apt-key adv --keyserver hkp://keyserver.ubuntu.com: RUN echo "deb http://repo.mongodb.org/apt/ubuntu $(cat RUN apt-get update && apt-get install -y mongodb-org # Create the MongoDB data directory RUN mkdir -p /data/db # Expose port #27017 from the container to the host EXPOSE 27017 # Set /usr/bin/mongod as the dockerized entry-point app ENTRYPOINT ["/usr/bin/mongod"]
  • 25. DOCKER ARCHITECTURE Docker is the building block of infrastructure as code: The image is completely described by the Dockerfile Any container is stateless (although stateful data can be attached with volumes). This means: You don't ssh your server and change the configuration You rebuild a new version of the application when you need to change something "Deployment" means basically replacing a container with a different container
  • 26. KUBERNETES Cloud platform, with several deployment options (including local and private cloud), to Orchestrate (Docker) containers: Born at Google Production ready (Pokemon Go!) Provides: Application composition in namespaces Virtual networks Service discovery Load balancing Auto (and manual) scaling Automatic recovery Health checking Rolling upgrades Monitoring Logging .... kubectl run --image=nginx nginx-app --port=80 oc run --image=nginx nginx-app --port=80 # or better.. # oc new-app nginx --name nginx-app2 BOTH Kuebertes and Openshift Origin are Open Source projects!
  • 27. KUBERNETES ARCHITECTURE Very high level architecture of Kubernetes. Note: it resembles the Spark architecture seen in previous slides (with the Kubernetes master playing the role of the cluster manager), but we want to deploy a whole Spark application inside Kubernetes (including the driver).
  • 28. KUBERNETES FOR LOCAL DEV Minikube https://github.com/kubernetes/minikube OPENSHIFT FOR LOCAL DEV Minishift "oc cluster up" https://github.com/minishift/minishift $ minikube start Starting local Kubernetes cluster... Running pre-create checks... Creating machine... Starting local Kubernetes cluster... $ kubectl run hello-minikube --image=gcr.io/google_containers/echoserver:1.4 --port=8080 $ minishift start ... $ oc cluster up -- Checking OpenShift client ... OK -- Checking Docker client ... OK ...
  • 29. SUMMARY Overview of Big Data Systems and Apache Spark DevOps, Docker, Kubernetes and Openshift Origin Demo: Oshinko, Kafka, Spring-Boot and Fabric8 tools
  • 30. OPENSHIFT ORIGIN DEMO A simple recommender system using collaborative filtering on Openshift origin. Deploy a: Spring-boot microservice Kafka Broker Spark Cluster Into openshift origin. Source code: https://github.com/nicolaferraro/voxxed-bigdata-spark Rating **** Rating **** Recommendations Recommendations Spark Cluster
  • 31. SPARK ON KUBERNETES The problem: A Spark application needs a cluster of n worker machines to run You need 1 cluster manager node The application runs on 1 driver node We want to deploy everything inside Kubernetes or Openshift Origin Basically, every piece of the Spark architecture should become a POD. There are multiple solutions. I'm going to introduce the Oshinko project. http://radanalytics.io/ https://github.com/radanalyticsio Driver Cluster Manager Recall the Spark architecture PODs Worker Worker Worker
  • 32. OSHINKO Making Apache Spark Cloud Native on Openshift Origin. W WW CM D Spark Cluster W WW CM D Spark Cluster 1 cluster per application! Deployed in the application namespace Oshinko console
  • 33. DEMO TIME We will deploy the Oshinko rest server and then deploy the spark streaming application. It will connect to Kafka to retrieve ratings and return personalized (not really) recommendations. Sources: https://github.com/nicolaferraro/voxxed-bigdata-spark I used a basic implementation of the slope one algorithm ( ) for collaborative filtering. It's just an example of "real life" custom Spark application that sometimes doesn't work :) https://en.wikipedia.org/wiki/Slope_One
  • 34. TESTING One of the major advantages of making a cloud native Spark application is the possibility of executing system tests on software identical to production. Production pods Optional testing tools or load generators Test driver (inside or outside the namespace) You can create virtual test environments on the fly and do assertions. Demo: using fabric8-arquillian and the fabric8 kubernetes client
  • 35. TESTING: SOME SCENARIOS Some tests that you may want to run: Deploy the production infrastructure (or part of it) and run a suite of functional tests to ensure every piece works as expected to produce the result Deploy the production infrastructure and do performance tests using external or injected data Deploy the production infrastructure (or part of it) and check that every piece is healthy. Then randomly kill pods while sending a load to test the system availability (system tests) The Chaos Monkey!
  • 36. FABRIC8 We have used some tools from Fabric8: Fabric8 Maven Plugin: to build Kubernetes resources for Java projects Docker Maven Plugin: used by the fabric8 maven plugin to create docker images Fabric8 Arquillian: to create system tests from artifacts using the fabric8 plugin Fabric8 Kubernetes Client: to interact with Kubernetes resources (pods, services, etc.) Fabric8 is a complete development platform for Kubernetes! Uses Jenkins Pipelines and to provide: Continuous integration Continuous delivery Out of the box! https://fabric8.io/
  • 38. Q/A What about storage and data locality? In cloud native applications, using a storage system like s3 has many advantages over HDFS (costs, availability, usage patterns in testing). Performance of HDFS are superior (6x), but you can use larger clusters for a limited amount of time to gain the same performance. More info: Can you scale the application directly from Openshift, or you always need to change the code? Sure, you can use Openshift to scale the application. Beware that, in a pure DevOps approach, changing the configuration of a system at runtime is not something you should do, especially on production. You lose the benefits of having all the software and configuration in a "executable" state, and you lose the effectiveness of system tests because tests will not run on an environment identical to production. As an alternative, you can use the auto-scaling feature (enabling it in the code). Is the Spark demo using a machine learning algorithm? The demo is using a basic collaborative filtering algorithm called "slope one" (the focus of this talk is not machine learning). Spark provides many algorithms out of the box, also for collaborative filtering. Many of them are supposed to be executed in a batch application, rather than in a streaming app. http://qr.ae/TAF4cN
  • 39. THANKS Follow me on Twitter! @ni_ferraro