Increasing visibility of distributed systems in production
@PierreVincent
June 16th, 2017
RebelCon, Cork
Pierre Vincent
SRE Manager at Poppulo
techblog.poppulo.com
Hierarchy of Service Reliability
(Mikey Dickerson)
Let’s start here
Reaching production is only the beginning
No system is immune to failure:
Design for recovery
When distributing a system, we’re also distributing the places where things might go wrong
Healthchecks
Is it running?
Can it perform its task?
Can it accept more work?
Healthcheck strategies
Broadcast, Register, Expose
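As a rough illustration of the "Expose" strategy, the sketch below answers the three healthcheck questions and maps them to an HTTP status and JSON body that a health endpoint could return. The names `HealthInputs`, `health_report` and the queue-based capacity limit are assumptions made for this example; they are not taken from the talk.

```python
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass
class HealthInputs:
    dependencies_ok: bool   # assumption: e.g. database / message queue reachable
    queued_jobs: int        # work currently waiting
    max_queued_jobs: int    # hypothetical capacity limit


def health_report(inputs: HealthInputs) -> Tuple[int, str]:
    """Return an (HTTP status, JSON body) pair answering the three questions."""
    checks = {
        # If this code executes at all, the process is running.
        "running": True,
        # Can it perform its task? Here: are its dependencies reachable.
        "can_perform_task": inputs.dependencies_ok,
        # Can it accept more work? Here: is there room left in the queue.
        "can_accept_more_work": inputs.queued_jobs < inputs.max_queued_jobs,
    }
    healthy = all(checks.values())
    status = 200 if healthy else 503
    body = json.dumps({"healthy": healthy, "checks": checks}, sort_keys=True)
    return status, body


status, body = health_report(HealthInputs(True, queued_jobs=3, max_queued_jobs=10))
print(status, body)
```

Returning 503 when any check fails is one possible convention; a service could equally report the three answers separately so that a load balancer and an orchestrator can act on different ones.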
Time-series metrics
System metrics: network latency
Application metrics: error rates
Business metrics: customer conversions
[Diagram: Servers / VMs, Appliances / Infra and Services feed a metrics collector and a metrics query engine, which drive dashboards and alerts]
[Diagram: Servers / VMs, Appliances / Infra and Services each expose /metrics, collected by Prometheus]
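To make the /metrics idea concrete, here is a minimal hand-rolled sketch of a counter rendered in the Prometheus text exposition format. It deliberately avoids any client library; the `Counter` class and the metric name `orders_failed_total` are hypothetical and exist only for this example. A real service would more likely use an official Prometheus client.

```python
from collections import defaultdict


class Counter:
    """A monotonically increasing counter with optional labels (sketch only)."""

    def __init__(self, name: str, help_text: str):
        self.name = name
        self.help_text = help_text
        self._values = defaultdict(float)

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        # Each distinct label set is its own time series.
        key = tuple(sorted(labels.items()))
        self._values[key] += amount

    def render(self) -> str:
        """Render the counter as text that a /metrics endpoint could serve."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            label_str = ",".join(f'{k}="{v}"' for k, v in key)
            series = f"{self.name}{{{label_str}}}" if label_str else self.name
            lines.append(f"{series} {value}")
        return "\n".join(lines) + "\n"


# Hypothetical metric: one line of instrumentation at each failure site.
orders_failed = Counter("orders_failed_total", "Orders that failed to process")
orders_failed.inc(service="shipping")
orders_failed.inc(service="shipping")
orders_failed.inc(service="billing")
print(orders_failed.render(), end="")
```

The point of the sketch is the shape of the workflow: instrumenting is a single `inc` call, and the collector only needs to fetch plain text over HTTP.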
Usability of metrics tooling is key to adoption
Instrument code
Query metrics
Create dashboards
Define rules & thresholds
Limit alerting to user-impacting symptoms
Expose dashboards to diagnose causes
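A symptom-based rule can be as small as a check on the error rate users actually experience. The sketch below is only an illustration: the function names and the 5% threshold are assumptions, not values from the talk.

```python
def error_rate(total_requests: int, failed_requests: int) -> float:
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    if total_requests == 0:
        return 0.0
    return failed_requests / total_requests


def should_page(total_requests: int, failed_requests: int,
                threshold: float = 0.05) -> bool:
    """Alert on the user-facing symptom (too many failed requests),
    not on a possible cause such as high CPU on one host."""
    return error_rate(total_requests, failed_requests) > threshold


print(should_page(1000, 80))   # 8% of requests failing
print(should_page(1000, 10))   # 1% of requests failing
```

Causes such as CPU, memory or queue depth would then live on dashboards, where they help diagnose an alert rather than trigger one.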
Overlaying changes with production metrics
Source: Ian Malpass (Etsy), Measure Anything, Measure Everything
https://codeascraft.com/2011/02/15/measure-anything-measure-everything
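One way to picture the overlay is as a lookup: given the time a metric started misbehaving, list the changes that landed shortly before it. The sketch below assumes a hypothetical `Change` record and a 30-minute window; both are illustrative choices, and the sample changes are invented.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Change:
    at: datetime
    description: str


def changes_before(anomaly_at: datetime, changes: List[Change],
                   window: timedelta = timedelta(minutes=30)) -> List[Change]:
    """Return changes within `window` before the anomaly, most recent first."""
    candidates = [c for c in changes
                  if anomaly_at - window <= c.at <= anomaly_at]
    return sorted(candidates, key=lambda c: c.at, reverse=True)


# Hypothetical change log and anomaly time.
deploy = Change(datetime(2017, 6, 16, 10, 0), "deploy: orders service")
config = Change(datetime(2017, 6, 16, 10, 20), "config: connection pool size")
older = Change(datetime(2017, 6, 16, 8, 0), "deploy: billing service")
anomaly = datetime(2017, 6, 16, 10, 25)

for change in changes_before(anomaly, [deploy, config, older]):
    print(change.at.isoformat(), change.description)
```

On a dashboard the same idea usually appears as vertical markers drawn over the time-series graph at each deployment or configuration change.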
Making sense of logs
Centralise logs
Common searchable format
Correlation IDs
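A minimal sketch of a common, searchable format with a correlation ID, using Python's standard `logging` and `contextvars` modules. The `[svc=...][trace=...]` layout mirrors the log lines shown in this deck; `TraceFilter`, `build_logger` and the way the ID is set are assumptions for illustration (in practice the ID would typically arrive in a request header).

```python
import contextvars
import io
import logging

# Holds the correlation ID for the request currently being handled.
trace_id = contextvars.ContextVar("trace_id", default="-")


class TraceFilter(logging.Filter):
    """Attach the service name and current trace ID to every record."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def filter(self, record: logging.LogRecord) -> bool:
        record.svc = self.service
        record.trace = trace_id.get()
        return True


def build_logger(service: str, stream=None) -> logging.Logger:
    logger = logging.getLogger(f"svc.{service}")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(levelname)s [svc=%(svc)s][trace=%(trace)s] %(message)s"))
    handler.addFilter(TraceFilter(service))
    logger.handlers.clear()
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()            # stands in for a centralised log sink
log = build_logger("H", stream=buffer)
trace_id.set("a1b2c3")            # assumption: taken from the incoming request
log.error("Failed to save order")
print(buffer.getvalue(), end="")
```

Because every service writes the same fields in the same order, a single search for the trace ID returns the whole story of one request.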
Tracing
[Diagram: services A, B, C, D, E, F, G, H and J; trace a1b2c3 follows one request across several of them]
ERROR [svc=H][trace=a1b2c3] Failed to save order
Cause: Cassandra timeout exception
ERROR [svc=F][trace=a1b2c3] Failed to complete order
Cause: Shipping service responded with 500
ERROR [svc=A][trace=a1b2c3] Failed to process order
Cause: Order process manager responded with 500
INFO [svc=G][trace=a1b2c3] Items verified in stock
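Once logs are centralised, following a request comes down to grouping lines by their trace ID. The sketch below parses lines in the format shown above; the regular expression, the `group_by_trace` helper and the extra unrelated log line are assumptions for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as: ERROR [svc=H][trace=a1b2c3] Failed to save order
LINE = re.compile(
    r"^(?P<level>\w+) \[svc=(?P<svc>\w+)\]\[trace=(?P<trace>\w+)\] (?P<message>.*)$")


def group_by_trace(lines):
    """Group (service, level, message) tuples by trace ID, in log order."""
    traces = defaultdict(list)
    for line in lines:
        match = LINE.match(line)
        if match:  # lines in another format are skipped
            traces[match["trace"]].append(
                (match["svc"], match["level"], match["message"]))
    return dict(traces)


logs = [
    "ERROR [svc=H][trace=a1b2c3] Failed to save order",
    "ERROR [svc=F][trace=a1b2c3] Failed to complete order",
    "ERROR [svc=A][trace=a1b2c3] Failed to process order",
    "INFO [svc=G][trace=a1b2c3] Items verified in stock",
    "INFO [svc=B][trace=ffff00] Unrelated request",  # hypothetical extra line
]

for svc, level, message in group_by_trace(logs)["a1b2c3"]:
    print(f"{svc}: {level} {message}")
```

Read in order, the grouped lines show the failure starting in service H and propagating up through F to A, which is the chain the slide illustrates.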
Visibility enables operability
Visibility allows justifiable decisions
Visibility builds trust
“If you can’t monitor a service, you don’t know what’s happening, and if you’re blind to what’s happening, you can’t be reliable.”
N. Murphy, J. Petoff, C. Jones, B. Beyer, Site Reliability Engineering
Questions?

Editor's notes

  1. Maslow's hierarchy of needs: food > safety > love > esteem > fulfilment. The reliability equivalent, from the bottom up. Monitoring: see how things are working and get notified when they’re not. Incident response: once we’re notified, how do we mitigate (turn off a feature / add capacity)? Postmortem/RCA: what went wrong, and how do we fix it durably? Testing/release procedures: test what tends to go wrong, to catch things beforehand. Capacity planning: understanding load, dynamically balancing load, circuit breaking, etc. Development: design the system for reliability, focusing on where things tend to be brittle. Product: fulfilment of a reliable product. “Monitoring enables service owners to make rational decisions about the impact of changes to the service, apply the scientific method to incident response, and of course ensure their reason for existence: to measure the service’s alignment with business goals”
  2. We used to spend most of our time coding and not testing; then came TDD, and now not unit testing is widely agreed to be the outlier. But are we still spending most of our time developing? Apps that haven’t reached production = just playing around. Production is the real deal, yet we see it as the finish line, when it’s the opposite.
  3. Things will never run perfectly. If nothing goes wrong in your system, either nobody is using it or you just don’t know about it. There is only so much we can think about. There are diminishing returns in designing/coding for perfection: there is much more value for money in admitting that things will go wrong, and that in those cases our focus is to know about it ASAP and to have as much information as possible to find the source of the problem.
  4. What I mean by distributed systems (DS): no longer 1 web app supported by 1 DB. A number of separate parts, responsible for different things, that talk to each other over the network. Independently scalable, independently deployable, independently failing. Clusters of databases, message queues. Multiple servers, DCs, clouds. Nobody knew distributing would be so complicated! ;)
  5. Broadcast: simple network broadcast. Register: registration with a heartbeat in an HA store (etcd, zk…). Expose: expose health to others, e.g. via an HTTP endpoint; requires some form of service discovery.
  6. All levels of metrics are important. Different teams might be responsible for different things. The levels are not exclusive. We need the ability to correlate different levels.
  7. If adding metrics is simple, every developer will do it. One-line instrumentation with tools like Prometheus or StatsD. Integration with graphing tools and alerting tools.
  8. Not going to expand on alerting: it would require an entire talk (or several)! Alerting on symptoms reduces noise > the first action is to mitigate effects, then track down the cause. Use dashboards to troubleshoot > the first place to go to validate theories.
  9. Problems are mostly caused by changes. Overlay production changes on time-series. Deployment/config change > correlate with a change in system behaviour.
  10. Aggregated logs are just more logs in one place. We need to make sense of them: correlation IDs, tracing.
  11. Search for a trace / timing of traces. This is profiling on a live environment! Example of a DNS issue we tracked down: - The error rate of a peer dependency went up - Tracked down to a breach of our SLO on the API - Requests to a particular dependency were slow, but there was no evidence of that dependency being slow to respond - Monitoring disproved that the dependency was slow to respond - This pointed at something between the 2 services - Added internal Zipkin tracing inside the calling service - Tracked down to slow DNS lookups caused by a bad resolv configuration
  12. Having a fuller picture = less guesswork. It is impossible to reason about a system when flying blind. Monitoring allows us to adopt a scientific approach to explaining production systems > find evidence of problems > make hypotheses about issues > correlate issues with recent changes > prove/disprove hypotheses.
  13. Shining a light on your system gives you the real picture. Internal changes backed by guesswork are an anti-pattern: they make things more complicated without evidence to back them up, and there is no way to quantify whether things get better or worse.
  14. Encourage “information radiators” = no hiding (from others and from ourselves) > needs a culture of safety. Distilled dashboards and status pages for other parts of the business = spread visibility higher up (e.g. support) = build confidence and trust of stakeholders.