4. Pre-req:
Instrument Your Applications
- R.E.D. Method ( Rate / Errors / Duration )
- Many service meshes and ingress controllers auto-expose RED via Prometheus
- Flagger can integrate directly with these metrics
@capileigh stealthybox
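A minimal sketch of what the RED numbers look like when computed by hand, assuming each request is recorded with a duration and an HTTP status. The `Request` type, the `red_summary` helper, and the choice of "status >= 500 counts as an error" are assumptions for illustration; in practice a service mesh or Prometheus client library would usually export these for you.

```python
from dataclasses import dataclass


@dataclass
class Request:
    duration_s: float  # how long the request took, in seconds
    status: int        # HTTP status code returned


def red_summary(requests, window_s):
    """Summarize Rate, Errors, and Duration for one observation window."""
    total = len(requests)
    # Assumption: only 5xx responses count as errors.
    errors = sum(1 for r in requests if r.status >= 500)
    durations = sorted(r.duration_s for r in requests)
    if total:
        # Nearest-rank 95th percentile, using integer math for the rank.
        rank = (95 * total + 99) // 100
        p95 = durations[rank - 1]
    else:
        p95 = 0.0
    return {
        "rate_per_s": total / window_s,
        "error_ratio": errors / total if total else 0.0,
        "p95_duration_s": p95,
    }
```

Calling `red_summary(requests, window_s=60)` on one minute of traffic gives the three numbers a canary analysis would typically compare between the old and new version.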
6. Decision Making
Tech doesn’t solve problems without People
Engineering + Product reviewing metrics
Optional communication to Stakeholder for promoting a Canary
Team Review of key release risks and struggles
7. Feature Gate Strategy
Traffic Shifting
Good for infrequently hit, read workloads
Header Matching / User Indication
Client can flag if a User has “joined the beta program”
User Segmentation ( ex: device-type or browser User Agent )
A/B Testing
Traffic Sampling, Replay, and Mirroring
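The gating options above can be sketched as a single routing decision. This is an illustration only: the `choose_backend` function, the `X-Beta-User` header name, and the `canary_user_agents` parameter are hypothetical, and a real mesh or ingress controller would express the same rules in its own routing config rather than in application code.

```python
import hashlib


def choose_backend(headers, user_id, canary_percent, canary_user_agents=()):
    """Return "canary" or "primary" for one request."""
    # Header matching: the client flags users who joined the beta program.
    # "X-Beta-User" is a made-up header name for this sketch.
    if headers.get("X-Beta-User") == "true":
        return "canary"

    # User segmentation: send selected browsers or device types to the canary.
    user_agent = headers.get("User-Agent", "")
    if any(fragment in user_agent for fragment in canary_user_agents):
        return "canary"

    # Traffic shifting: hash the user id into a stable bucket from 0 to 99,
    # so the same user keeps landing on the same side as the weight grows.
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return "canary" if bucket < canary_percent else "primary"
```

Hashing the user id instead of picking at random keeps each user's experience consistent across requests, which matters for A/B testing.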
8. Parallel Change
General migration strategy for breaking changes to load-bearing interfaces ( Classes, Database Schemas, APIs, etc. )
Expand
Migrate
Contract
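A small sketch of the three phases, assuming a record whose `name` field is being renamed to `full_name`. The field names and helper functions are hypothetical; the point is only that old and new readers both keep working until the final step.

```python
def expand(row):
    """Expand: add the new field alongside the old one."""
    out = dict(row)
    out.setdefault("full_name", out.get("name"))
    return out


def read_name(row):
    """Migrate: callers prefer the new field and fall back to the old one."""
    return row["full_name"] if "full_name" in row else row["name"]


def contract(row):
    """Contract: drop the old field once no caller reads it anymore."""
    return {key: value for key, value in row.items() if key != "name"}
```

Each phase can ship as its own release, so a problem found during Migrate can be rolled back without losing data.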
11. How do I even do SRE?
Why?
12. Progressive Delivery allows you to sample real traffic and measure your mutations
( you need metrics retention )
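One way to picture "measure your mutations" is a promotion check that compares the canary's error ratio with the primary's over the same window. This is a sketch under stated assumptions: `should_promote`, its thresholds, and the minimum sample size are invented for illustration and are not how any particular tool decides.

```python
def should_promote(primary_errors, primary_total, canary_errors, canary_total,
                   max_ratio_increase=0.01, min_samples=100):
    """Return True when the canary looks no worse than the primary."""
    # Too little traffic means the comparison is not trustworthy yet.
    if canary_total < min_samples or primary_total == 0:
        return False
    primary_ratio = primary_errors / primary_total
    canary_ratio = canary_errors / canary_total
    # Allow a small tolerance so noise alone does not block a release.
    return canary_ratio - primary_ratio <= max_ratio_increase
```

The counts have to come from stored metrics covering the whole analysis window, which is why retention matters.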
13. SRE Jargon -- Service Levels
SLI: INDICATOR
This is a metric -- typically quantifiable in an automated manner
SLO: OBJECTIVE
SLA: AGREEMENT
This is an enforceable contract between the Service Provider and the Consumer
Consumer could be internal or external
Often Formal + Culturally or Legally binding
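To make the first two terms concrete, here is a minimal sketch that treats the SLI as a measured availability ratio and the SLO as a target value for that ratio. Both helper names and the example numbers are assumptions; the SLA is a contract between people and is not something the code can express.

```python
def availability_sli(good_events, total_events):
    """SLI: the measured fraction of events that were good."""
    # With no traffic there is nothing to count against the service.
    return good_events / total_events if total_events else 1.0


def meets_slo(sli, objective):
    """SLO: the target the measured SLI is compared against."""
    return sli >= objective
```

For example, `meets_slo(availability_sli(9990, 10000), 0.995)` checks a measured 99.9% availability against a 99.5% objective.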