Keptn is a new open-source framework for automated operations and continuous delivery of cloud native applications running on k8s, OpenShift, CloudFoundry ...
This presentation was used at meetups to explain WHY we are building keptn, which problems it solves, and how!
4. Confidential 4
MTTI
Mean Time to Innovation
MTTR
Mean Time to Remediate
4.8 days
4 hours
~ 10min
12.5 days 2 days ~ 1 hour
The reality and the evidence support the need for ACM!
https://dynatrace.ai/acsurvey
Only < 5% is „Cloud Native“
5. Building Blocks for ACM/Cloud Natives!
Goals: Increase Quality & Level of Automation; Increase Speed & Reduce Costs
Pillars: AUTOMATE QUALITY, AUTOMATE DEPLOYMENT, AUTOMATE MONITORING, AUTOMATE OPERATIONS
Building blocks: Automated Testing, Continuous Performance, Auto Quality Gates, Feature Flagging, Adaptive Scaling, Auto Roll-Back, Canary Releases, Blue/Green Deployments, Auto-Remediation
Strategically Used as Pipeline Feature
6. That is why we are building keptn
Because cloud native delivery and operations are a BIG challenge for enterprises!
7. First: solves the Continuous Delivery Problem!
Quote: “Pipelines seem to be becoming our new future unmanageable legacy code!”
9. Continuous Delivery – Launch control
Launch operations are supervised and
controlled from several control rooms (also
known as firing rooms). The controllers are
in control of pre-launch checks, the booster
and spacecraft. Once the rocket has cleared
the launch tower (usually within the first
10–15 seconds), control is switched
over to the Mission Control Center.
10. Continuous Operations – Mission Control
A mission control center (MCC, sometimes
called a flight control center or operations
center) is a facility that manages space flights,
usually from the point of launch until landing
or the end of the mission. It is part of
the ground segment of spacecraft operations.
A staff of flight controllers and other support
personnel monitor all aspects of the mission
using telemetry, and send commands to the
vehicle using ground stations.
11. Second: has Continuous Operations at its Core!
Quote: “We spend more time in manual communication than remediating issues”
[Figure: incident phases ENGAGE, TRIAGE, FIND & ASSEMBLE, RESOLVE, RESTORE, shown before and after, with MANUAL COMMUNICATION marked in both]
NUMBER OF ISSUES
BEFORE: mostly manual
AFTER: mostly automated
12. keptn accelerates building autonomous clouds
Mission Control: “Automated Operations” = Day 2 Ops
Launch Control: “Continuous Deployment” = Day 1 Ops
Event-driven runbook automation
Production problems can be automatically remediated in real time by executing runbooks that require no manual intervention.
Self-healing blue/green deployments
Deployments that follow the “Operations as Code” paradigm automatically remediate problems and get your deployment pipeline working again in under a minute.
Automated multistage unbreakable delivery pipelines
GitOps-enabled delivery pipelines with automated quality gates support automated testing and monitoring-as-a-service.
13. Designed for modern applications
GitOps-based collaboration
All keptn workflows are based on the GitOps paradigm.
Operator patterns for all logic components
Logic components can be reused for other operational tasks.
Monitoring and operations as code
Developer-friendly definition of monitoring and operational tasks.
Built on and for Kubernetes
Built for modern cloud-native environments.
Event-driven and serverless
Powerful with a minimal resource footprint.
Pluggable tooling
All tools leveraged by keptn can be replaced based on your tool preferences.
14. Keptn Use Cases
• Installation
  • One-Line Installation: on most popular k8s platforms
  • Zero-Touch Toolchain Integration: No custom tool integrations needed
  • Re-Think Pipelines: Gone are the days of custom pipeline coding!
  • Zero-Touch Cloud Native Services: Enables GitOps event-driven CD/CO for your services
• Continuous Delivery
  • Automated Multi-Stage Delivery: Risk-Free auto deployment through multi-stage delivery pipelines
  • Automated Quality Gates: stops bad changes before production using Pitometer
  • Self-Healing Blue/Green Deployments: reverts bad changes before impacting end-users
  • Zero-Touch Toolchain Updates: Add/Remove/Replace tools without custom coding
  • Mastering Continuous Delivery: Risk-Free Automated Deployments
• Continuous Operations
  • Self-Healing Production: Automated Problem Remediation (Scale-Up, Scale-Down, ...)
  • Self-Healing/Continuous BizDevOps: Automated Business Operations Optimization, Turn on/off feature flags based on conversion rates ...
  • Zero-Touch Toolchain Updates: Add/Remove/Replace tools without custom coding
  • Auto-Protect Production & Business: Stop DDoS attacks, Redirect Bot Traffic ...
  • Chaos-Driven Operation Readiness: Chaos Engineering to validate your production self-healing
15. One-Line Installation: $ keptn install
[Figure: tool categories Config, ChatOps, IT Autom, Deploy, Test, Observe]
16. Zero-Touch Toolchain Integration: $ keptn wear uniform <GitHub, Slack ...>
[Figure: tool categories Config, ChatOps, IT Autom, Deploy, Test, Observe]
17. Re-Think Pipelines: $ keptn create project keptn-sample {stage(perf),prod(bg)}
[Figure: two-stage pipeline]
STAGING: Direct Update (C, D)
PROD: Blue/Green Update (C, D)
18. Zero-Touch Cloud Native Services: $ keptn onboard service myservice [xxx.yaml]
[Figure: two-stage pipeline with the onboarded service]
STAGING: Direct Update (C, D), PLACEHOLDER
PROD: Blue/Green Update (C, D), PLACEHOLDER
19. Automated Multi-Stage Delivery: $ keptn new artifact myservice:1.0.0
[Figure: artifact 1.0.0 moving through the two-stage pipeline]
STAGING: Direct Update, Performance, Score, Promote? (C, D, T, O)
PROD: Blue/Green Update, Score, Keep? (C, D, T, O)
Version shown: 1.0.0
Scores shown: 90/100 and 75/100
Decisions shown: PROMOTE, KEEP
20. A Quick word on Pitometer: Automated Deployment Validation
Pitometer Specfile: Metric Source & Query; Grading Details & Metric Score; Total Scoring Objectives
Source: Allocated Bytes (from Prometheus)
Grader: > 2GB: 0 Points; < 2GB: 20 Points
If value: 3GB, Score: 0
Source: Conversion Rate (Dynatrace)
Grader: < 2%: 0 Points; < 5%: 10 Points; > 5%: 20 Points
If value: 3.9%, Score: 10
Total Score: 10
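The grading on this slide can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Pitometer's actual API or specfile format: the function names are hypothetical, and the treatment of values that fall exactly on a threshold (2GB, 5%) is an assumption, since the slide does not define it.

```python
def grade_allocated_gb(value_gb: float) -> int:
    """Grade allocated memory: > 2GB earns 0 points, otherwise 20 points.

    Assumption: a value of exactly 2GB is treated as passing.
    """
    return 0 if value_gb > 2.0 else 20


def grade_conversion_rate(percent: float) -> int:
    """Grade conversion rate: < 2% earns 0, < 5% earns 10, otherwise 20.

    Assumption: a value of exactly 5% earns the top score.
    """
    if percent < 2.0:
        return 0
    if percent < 5.0:
        return 10
    return 20


def total_score(allocated_gb: float, conversion_percent: float) -> int:
    """Sum the per-metric scores into the total score."""
    return grade_allocated_gb(allocated_gb) + grade_conversion_rate(conversion_percent)


if __name__ == "__main__":
    # Values from the slide: 3GB allocated, 3.9% conversion rate.
    print(total_score(3.0, 3.9))
```

With the slide's example values (3GB and 3.9%), the two metric scores are 0 and 10, giving the total score of 10 shown above.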
21. Automated Quality Gates: $ keptn new artifact myservice:2.0.0
[Figure: artifact 2.0.0 entering the two-stage pipeline]
STAGING: Direct Update, Performance, Score, Promote? (C, D, T, O)
PROD: Blue/Green Update, Score, Keep? (C, D, T, O)
Versions shown: 1.0.0, 2.0.0
Score shown: 45/100
Decision shown: ABORT
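A quality gate of this kind boils down to comparing a score against a pass threshold. The sketch below is a minimal illustration, not keptn's implementation: the `Decision` enum, the `quality_gate` function, and in particular the `pass_ratio` of 0.8 are assumptions, because the slides show scores and decisions but never state the threshold.

```python
from enum import Enum


class Decision(Enum):
    PROMOTE = "promote"
    ABORT = "abort"


def quality_gate(score: float, max_score: float = 100, pass_ratio: float = 0.8) -> Decision:
    """Decide whether an artifact may move on to the next stage.

    pass_ratio is an assumed threshold; the slides do not specify one.
    """
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    if score / max_score >= pass_ratio:
        return Decision.PROMOTE
    return Decision.ABORT


if __name__ == "__main__":
    # A score of 45/100, as shown on the slide, fails the assumed gate.
    print(quality_gate(45).name)
```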
22. Self-Healing Blue/Green Deployments: $ keptn new artifact myservice:3.0.0
[Figure: artifact 3.0.0 moving through the two-stage pipeline]
STAGING: Direct Update, Performance, Score, Promote? (C, D, T, O)
PROD: Blue/Green Update, Score, Keep? (C, D, T, O)
Versions shown: 1.0.0, 2.0.0, 3.0.0
Scores shown: 85/100 and 80/100
Decisions shown: PROMOTE, REVERT
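The keep-or-revert idea behind a self-healing blue/green deployment can be sketched as follows. This is a simplified model under stated assumptions, not how keptn switches traffic: `BlueGreenService`, `release`, the `evaluate` callback, and the `keep_ratio` threshold are all hypothetical names and values introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BlueGreenService:
    name: str
    live_version: str            # version currently receiving traffic
    previous_version: str = ""   # version kept on standby for a revert

    def deploy(self, new_version: str) -> None:
        """Switch traffic to the new version and keep the old one on standby."""
        self.previous_version = self.live_version
        self.live_version = new_version

    def revert(self) -> None:
        """Switch traffic back to the standby version."""
        if not self.previous_version:
            raise RuntimeError("no previous version to revert to")
        self.live_version = self.previous_version
        self.previous_version = ""


def release(service: BlueGreenService, new_version: str,
            evaluate: Callable[[str], float], keep_ratio: float = 0.8) -> str:
    """Deploy, score the new version (0..100), then keep or revert it.

    keep_ratio is an assumed threshold; the slides do not specify one.
    """
    service.deploy(new_version)
    score = evaluate(new_version)
    if score / 100 >= keep_ratio:
        return "KEEP"
    service.revert()
    return "REVERT"


if __name__ == "__main__":
    svc = BlueGreenService("myservice", "1.0.0")
    print(release(svc, "3.0.0", lambda version: 50.0), svc.live_version)
```

Keeping the previous version on standby is what makes the revert cheap: no rebuild or redeploy is needed, only a traffic switch.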
23. Zero-Touch Toolchain Updates: $ keptn update uniform <+neo,+end2end,+spinnaker>
[Figure: two-stage pipeline after the toolchain update]
STAGING: Direct Update, Performance, Score, Promote? (C, D, T, O)
PROD: Blue/Green Update, Score, Keep? (C, D, O)
End2End (T)
Versions shown: 1.0.0, 2.0.0, 3.0.0
24. Mastering Continuous Delivery: $ keptn new artifact myservice:4.0.0
[Figure: artifact 4.0.0 moving through the updated two-stage pipeline]
STAGING: Direct Update, Performance, Score, Promote? (C, D, T, O)
PROD: Blue/Green Update, Score, Keep? (C, D, O)
End2End (T)
Versions shown: 1.0.0, 2.0.0, 3.0.0, 4.0.0
Scores shown: 95/100 and 90/100
Decisions shown: KEEP, PROMOTE
25. Self-Healing Production: $ keptn new problem <Services, Root Cause>
[Figure: PROD running version 4, with the remediation steps Evaluate, Decide, Act, Notify, Escalate]
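The Evaluate, Decide, Act, Notify, Escalate steps can be sketched as a small event handler. This is an illustrative sketch only, not keptn's remediation service: the `handle_problem` function, the problem dictionary keys, and the runbook registry are hypothetical, and escalation here simply means handing the problem to a human via a callback.

```python
from typing import Callable, Dict

# A runbook takes a service name and returns True if the remediation succeeded.
Runbook = Callable[[str], bool]


def handle_problem(problem: Dict[str, str],
                   runbooks: Dict[str, Runbook],
                   notify: Callable[[str], None],
                   escalate: Callable[[str], None]) -> str:
    """Run one pass of evaluate, decide, act, notify, escalate."""
    # Evaluate: read the affected service and root cause from the problem event.
    service = problem["service"]
    root_cause = problem["root_cause"]

    # Decide: pick a runbook for this root cause, if one is registered.
    runbook = runbooks.get(root_cause)
    if runbook is None:
        escalate(f"{service}: no runbook for '{root_cause}'")
        return "escalated"

    # Act: execute the runbook without manual intervention.
    succeeded = runbook(service)

    # Notify: report what was done.
    notify(f"{service}: ran runbook for '{root_cause}', success={succeeded}")

    # Escalate: hand over to a human if the automated action did not help.
    if not succeeded:
        escalate(f"{service}: runbook for '{root_cause}' failed")
        return "escalated"
    return "remediated"


if __name__ == "__main__":
    notifications, escalations = [], []
    runbooks = {"high_load": lambda service: True}  # e.g. a scale-up action
    result = handle_problem({"service": "myservice", "root_cause": "high_load"},
                            runbooks, notifications.append, escalations.append)
    print(result, notifications, escalations)
```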
26. We are building keptn to re-shape this reality
MTTI = Mean Time to Innovation
MTTR = Mean Time to Remediate
4.8 days
4 hours
~ 10min
12.5 days 2 days ~ 1 hour
GROW this number!
Numbers based on our survey – https://dynatrace.ai/acsurvey
Many teams we have spoken with build their own:
Deployment Pipelines: Combination of OpenSource & Commercial Tools for Deployment Automation!
Testing Pipelines: Combination of OpenSource & Commercial Tools for Test Execution
Quality Gates: Most often done manually. Some are investing in automated validation!
Auto Remediation: Mostly done manually, with a trend towards simple remediation actions