SlideShare ist ein Scribd-Unternehmen logo
1 von 12
Downloaden Sie, um offline zu lesen
Practical Monitoring Techniques
Today's Talk
●
Our Mission
●
Current Tools
●
Increasing Coverage
●
PD Schedules
●
Automatic Self Healing
●
Bots And Alerts channels
●
Events Dashboard
●
Dashboard Accessibility
●
Best Practices Summary
Our Mission
Back up culture with the proper tools to support it
Current Tools
●
Metrics collections: Collectd, statsd, Cloudwatch
●
Monitoring: Sensu, NewRelic
●
Alert channels: PagerDuty, emails, slack
●
Dashboards: Grafana, CloudWatch, NewRelic
●
Application testing: E2E Testing System
●
Internal tools: Sensu mobile, events system,
Sensu bar and more
Increasing Coverage
●
Automatic collection of basic
system and 3rd party metrics
for new instances
●
Add alerts automatically for
new instance of existed
subscriber
●
Each Developer / DevOps is
responsible for monitoring his
application / infrastructure
●
Easy method to add new
alerts and dashboards
●
Automatic events flow
Pager Schedules
●
Divided into logical groups of ownership
●
Schedule has escalation point
●
On call should be able to connect and respond to
issues in his area
●
Easy method to override schedule
●
Ability to contact relevant on call
●
Ability to page relevant on call
Automatic Self Healing
●
Better MTTR
●
Avoid waking On Call if
possible
●
Log activity to float
recurrent issues
●
Limit the healing to avoid
restart loops
●
Make sure to sync
Healer Alert↔
Bots, Integrations and Alerts Channels
●
Alerts channels: Emails, slack, PD mobile, sms, calls
●
Integrations: Sensu to PD/Slack, CloudWatch to PD,
3rd party (EX: CouchBase, NewRelic, etc) to PD,
●
Slack Bot:
Events Dashboard
●
Simple Rest API for sending events
●
Clean timeline view to spot production events
●
Connections between events (“depends on” and “dependents”)
●
Detailed view for each event
Accessibility
●
Available from everywhere by mobile
●
Easy to ack, resolve, mute alerts
●
Slack bots to reach help
●
Automatically get graph with the alert
●
Ability to search, edit, copy, etc alerts
●
Treat alerts management as code (SVC, DB,
backups, etc)
Best Practices Summary
●
Share the pain
●
Automate base metrics
●
Automate healing
●
Make help reachable
●
Make it easy to add alerts and dashboards
●
Use warning levels as soft events to avoid phone calls at night
●
Automate graphs in alerts
●
Positive alerting system check each day
●
Dependencies between alerts
●
Postmortems
Questions

Weitere ähnliche Inhalte

Andere mochten auch

DevOps Roadtrip Minneapolis
DevOps Roadtrip Minneapolis DevOps Roadtrip Minneapolis
DevOps Roadtrip Minneapolis VictorOps
 
DevOps/Flow workshop for agile india 2015
DevOps/Flow workshop for agile india 2015DevOps/Flow workshop for agile india 2015
DevOps/Flow workshop for agile india 2015Yuval Yeret
 
Run IT Support the DevOps Way
Run IT Support the DevOps WayRun IT Support the DevOps Way
Run IT Support the DevOps WayAtlassian
 
Jelastic - DevOps PaaS Business with Docker Support for Service Providers
Jelastic - DevOps PaaS Business with Docker Support for Service ProvidersJelastic - DevOps PaaS Business with Docker Support for Service Providers
Jelastic - DevOps PaaS Business with Docker Support for Service ProvidersJelastic Multi-Cloud PaaS
 
Paris Devops - Monitoring And Feature Toggle Pattern With JMX
Paris Devops - Monitoring And Feature Toggle Pattern With JMXParis Devops - Monitoring And Feature Toggle Pattern With JMX
Paris Devops - Monitoring And Feature Toggle Pattern With JMXCyrille Le Clerc
 
DevOps in the Cloud with Microsoft Azure
DevOps in the Cloud with Microsoft AzureDevOps in the Cloud with Microsoft Azure
DevOps in the Cloud with Microsoft Azuregjuljo
 
DevOps monitoring: Feedback loops in enterprise environments
DevOps monitoring: Feedback loops in enterprise environmentsDevOps monitoring: Feedback loops in enterprise environments
DevOps monitoring: Feedback loops in enterprise environmentsJonah Kowall
 
VIZBI 2015 Tutorial: Cytoscape, IPython, Docker, and Reproducible Network Dat...
VIZBI 2015 Tutorial: Cytoscape, IPython, Docker, and Reproducible Network Dat...VIZBI 2015 Tutorial: Cytoscape, IPython, Docker, and Reproducible Network Dat...
VIZBI 2015 Tutorial: Cytoscape, IPython, Docker, and Reproducible Network Dat...Keiichiro Ono
 

Andere mochten auch (10)

DevOps Roadtrip Minneapolis
DevOps Roadtrip Minneapolis DevOps Roadtrip Minneapolis
DevOps Roadtrip Minneapolis
 
DevOps/Flow workshop for agile india 2015
DevOps/Flow workshop for agile india 2015DevOps/Flow workshop for agile india 2015
DevOps/Flow workshop for agile india 2015
 
Devoxx 2014 monitoring
Devoxx 2014 monitoringDevoxx 2014 monitoring
Devoxx 2014 monitoring
 
Run IT Support the DevOps Way
Run IT Support the DevOps WayRun IT Support the DevOps Way
Run IT Support the DevOps Way
 
Jelastic - DevOps PaaS Business with Docker Support for Service Providers
Jelastic - DevOps PaaS Business with Docker Support for Service ProvidersJelastic - DevOps PaaS Business with Docker Support for Service Providers
Jelastic - DevOps PaaS Business with Docker Support for Service Providers
 
Paris Devops - Monitoring And Feature Toggle Pattern With JMX
Paris Devops - Monitoring And Feature Toggle Pattern With JMXParis Devops - Monitoring And Feature Toggle Pattern With JMX
Paris Devops - Monitoring And Feature Toggle Pattern With JMX
 
Devops the Microsoft Way
Devops the Microsoft WayDevops the Microsoft Way
Devops the Microsoft Way
 
DevOps in the Cloud with Microsoft Azure
DevOps in the Cloud with Microsoft AzureDevOps in the Cloud with Microsoft Azure
DevOps in the Cloud with Microsoft Azure
 
DevOps monitoring: Feedback loops in enterprise environments
DevOps monitoring: Feedback loops in enterprise environmentsDevOps monitoring: Feedback loops in enterprise environments
DevOps monitoring: Feedback loops in enterprise environments
 
VIZBI 2015 Tutorial: Cytoscape, IPython, Docker, and Reproducible Network Dat...
VIZBI 2015 Tutorial: Cytoscape, IPython, Docker, and Reproducible Network Dat...VIZBI 2015 Tutorial: Cytoscape, IPython, Docker, and Reproducible Network Dat...
VIZBI 2015 Tutorial: Cytoscape, IPython, Docker, and Reproducible Network Dat...
 

Ähnlich wie Practical Monitoring Techniques Overview

Anitha_Resume_BigData
Anitha_Resume_BigDataAnitha_Resume_BigData
Anitha_Resume_BigDataAnitha Bade
 
Choosing the Right Real Estate Management Software - A Comprehensive Guide.do...
Choosing the Right Real Estate Management Software - A Comprehensive Guide.do...Choosing the Right Real Estate Management Software - A Comprehensive Guide.do...
Choosing the Right Real Estate Management Software - A Comprehensive Guide.do...SculptSoft Private Limited
 
Modern incident management
Modern incident management Modern incident management
Modern incident management OpsGenie
 
Copy of webinar modern incident management (1)
Copy of webinar  modern incident management (1)Copy of webinar  modern incident management (1)
Copy of webinar modern incident management (1)Pırıl Kavlak
 
Observability for Application Developers (1)-1.pptx
Observability for Application Developers (1)-1.pptxObservability for Application Developers (1)-1.pptx
Observability for Application Developers (1)-1.pptxOpsTree solutions
 
(ONLINE) ITIL Indonesia Community – Meetup “ITIL Introduction: Incident and P...
(ONLINE) ITIL Indonesia Community – Meetup “ITIL Introduction: Incident and P...(ONLINE) ITIL Indonesia Community – Meetup “ITIL Introduction: Incident and P...
(ONLINE) ITIL Indonesia Community – Meetup “ITIL Introduction: Incident and P...ITIL Indonesia
 
Event management slide share
Event management slide shareEvent management slide share
Event management slide shareKADAMBINI SHREE
 
Sigma Conso Consolidation & Reporting
Sigma Conso Consolidation & ReportingSigma Conso Consolidation & Reporting
Sigma Conso Consolidation & Reportingsigmaconsoasia
 
Flexible Custom Workflows for Banner ERP and the Campus
Flexible Custom Workflows for Banner ERP and the CampusFlexible Custom Workflows for Banner ERP and the Campus
Flexible Custom Workflows for Banner ERP and the CampusBonitasoft
 
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital OneUsing H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital OneSri Ambati
 
Cutting Costs with Automation
Cutting Costs with AutomationCutting Costs with Automation
Cutting Costs with AutomationVaporware
 
Transition to a modern data platform
Transition to a modern data platform Transition to a modern data platform
Transition to a modern data platform Michael Ghen
 
Tracking and Controlling Technical Documentation Projects
Tracking and Controlling Technical Documentation ProjectsTracking and Controlling Technical Documentation Projects
Tracking and Controlling Technical Documentation ProjectsSaiff Solutions, Inc.
 
Data driven @startups
Data driven @startups Data driven @startups
Data driven @startups IIMBNSRCEL
 
Observability in highly distributed systems
Observability in highly distributed systemsObservability in highly distributed systems
Observability in highly distributed systemsDevOps Indonesia
 
Architecting for analytics
Architecting for analyticsArchitecting for analytics
Architecting for analyticsRob Winters
 
ML_Internship Presentation_Infidata_2021.pptx
ML_Internship Presentation_Infidata_2021.pptxML_Internship Presentation_Infidata_2021.pptx
ML_Internship Presentation_Infidata_2021.pptxAltafSMT
 
Core Areas of a CA- Interlinked with computers
Core Areas of a CA- Interlinked with computersCore Areas of a CA- Interlinked with computers
Core Areas of a CA- Interlinked with computersShikha Gupta
 

Ähnlich wie Practical Monitoring Techniques Overview (20)

Anitha_Resume_BigData
Anitha_Resume_BigDataAnitha_Resume_BigData
Anitha_Resume_BigData
 
Choosing the Right Real Estate Management Software - A Comprehensive Guide.do...
Choosing the Right Real Estate Management Software - A Comprehensive Guide.do...Choosing the Right Real Estate Management Software - A Comprehensive Guide.do...
Choosing the Right Real Estate Management Software - A Comprehensive Guide.do...
 
Modern incident management
Modern incident management Modern incident management
Modern incident management
 
Copy of webinar modern incident management (1)
Copy of webinar  modern incident management (1)Copy of webinar  modern incident management (1)
Copy of webinar modern incident management (1)
 
3 types of monitoring for 2020
3 types of monitoring for 20203 types of monitoring for 2020
3 types of monitoring for 2020
 
Observability for Application Developers (1)-1.pptx
Observability for Application Developers (1)-1.pptxObservability for Application Developers (1)-1.pptx
Observability for Application Developers (1)-1.pptx
 
Accurate systems - ERP
Accurate systems - ERPAccurate systems - ERP
Accurate systems - ERP
 
(ONLINE) ITIL Indonesia Community – Meetup “ITIL Introduction: Incident and P...
(ONLINE) ITIL Indonesia Community – Meetup “ITIL Introduction: Incident and P...(ONLINE) ITIL Indonesia Community – Meetup “ITIL Introduction: Incident and P...
(ONLINE) ITIL Indonesia Community – Meetup “ITIL Introduction: Incident and P...
 
Event management slide share
Event management slide shareEvent management slide share
Event management slide share
 
Sigma Conso Consolidation & Reporting
Sigma Conso Consolidation & ReportingSigma Conso Consolidation & Reporting
Sigma Conso Consolidation & Reporting
 
Flexible Custom Workflows for Banner ERP and the Campus
Flexible Custom Workflows for Banner ERP and the CampusFlexible Custom Workflows for Banner ERP and the Campus
Flexible Custom Workflows for Banner ERP and the Campus
 
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital OneUsing H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
 
Cutting Costs with Automation
Cutting Costs with AutomationCutting Costs with Automation
Cutting Costs with Automation
 
Transition to a modern data platform
Transition to a modern data platform Transition to a modern data platform
Transition to a modern data platform
 
Tracking and Controlling Technical Documentation Projects
Tracking and Controlling Technical Documentation ProjectsTracking and Controlling Technical Documentation Projects
Tracking and Controlling Technical Documentation Projects
 
Data driven @startups
Data driven @startups Data driven @startups
Data driven @startups
 
Observability in highly distributed systems
Observability in highly distributed systemsObservability in highly distributed systems
Observability in highly distributed systems
 
Architecting for analytics
Architecting for analyticsArchitecting for analytics
Architecting for analytics
 
ML_Internship Presentation_Infidata_2021.pptx
ML_Internship Presentation_Infidata_2021.pptxML_Internship Presentation_Infidata_2021.pptx
ML_Internship Presentation_Infidata_2021.pptx
 
Core Areas of a CA- Interlinked with computers
Core Areas of a CA- Interlinked with computersCore Areas of a CA- Interlinked with computers
Core Areas of a CA- Interlinked with computers
 

Mehr von Ariel Moskovich (13)

Consul scale
Consul scaleConsul scale
Consul scale
 
Kafka ops-new
Kafka ops-newKafka ops-new
Kafka ops-new
 
Docker appsflyer
Docker appsflyerDocker appsflyer
Docker appsflyer
 
Advanced Code Flow, Notes From the Field
Advanced Code Flow, Notes From the FieldAdvanced Code Flow, Notes From the Field
Advanced Code Flow, Notes From the Field
 
Consul
ConsulConsul
Consul
 
sensu
sensusensu
sensu
 
devopstools
devopstoolsdevopstools
devopstools
 
kafka
kafkakafka
kafka
 
Bouncer
BouncerBouncer
Bouncer
 
Devopstools
DevopstoolsDevopstools
Devopstools
 
Kafka aws
Kafka awsKafka aws
Kafka aws
 
Docker in prod
Docker in prodDocker in prod
Docker in prod
 
Docker tlv
Docker tlvDocker tlv
Docker tlv
 

Kürzlich hochgeladen

MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxfenichawla
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduitsrknatarajan
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...Call Girls in Nagpur High Profile
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsRussian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...ranjana rawat
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 

Kürzlich hochgeladen (20)

MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsRussian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 

Practical Monitoring Techniques Overview

  • 2. Today's Talk ● Our Mission ● Current Tools ● Increasing Coverage ● PD Schedules ● Automatic Self Healing ● Bots And Alerts channels ● Events Dashboard ● Dashboard Accessibility ● Best Practices Summary
  • 3. Our Mission Back up culture with the proper tools to support it
  • 4. Current Tools ● Metrics collections: Collectd, statsd, Cloudwatch ● Monitoring: Sensu, NewRelic ● Alert channels: PagerDuty, emails, slack ● Dashboards: Grafana, CloudWatch, NewRelic ● Application testing: E2E Testing System ● Internal tools: Sensu mobile, events system, Sensu bar and more
  • 5. Increasing Coverage ● Automatic collection of basic system and 3rd party metrics for new instances ● Add alerts automatically for new instance of existed subscriber ● Each Developer / DevOps is responsible for monitoring his application / infrastructure ● Easy method to add new alerts and dashboards ● Automatic events flow
  • 6. Pager Schedules ● Divided into logical groups of ownership ● Schedule has escalation point ● On call should be able to connect and respond to issues in his area ● Easy method to override schedule ● Ability to contact relevant on call ● Ability to page relevant on call
  • 7. Automatic Self Healing ● Better MTTR ● Avoid waking On Call if possible ● Log activity to float recurrent issues ● Limit the healing to avoid restart loops ● Make sure to sync Healer Alert↔
  • 8. Bots, Integrations and Alerts Channels ● Alerts channels: Emails, slack, PD mobile, sms, calls ● Integrations: Sensu to PD/Slack, CloudWatch to PD, 3rd party (EX: CouchBase, NewRelic, etc) to PD, ● Slack Bot:
  • 9. Events Dashboard ● Simple Rest API for sending events ● Clean timeline view to spot production events ● Connections between events (“depends on” and “dependents”) ● Detailed view for each event
  • 10. Accessibility ● Available from everywhere by mobile ● Easy to ack, resolve, mute alerts ● Slack bots to reach help ● Automatically get graph with the alert ● Ability to search, edit, copy, etc alerts ● Treat alerts management as code (SVC, DB, backups, etc)
  • 11. Best Practices Summary ● Share the pain ● Automate base metrics ● Automate healing ● Make help reachable ● Make it easy to add alerts and dashboards ● Use warning levels as soft events to avoid phone calls at night ● Automate graphs in alerts ● Positive alerting system check each day ● Dependencies between alerts ● Postmortems