SlideShare ist ein Scribd-Unternehmen logo
1 von 35
Monitoring Patterns for
Mitigating Technical Risk
@Forter
#1 risk
Slow or bad (500) API responses
Auto-healing
because humans are slow
SLA, Failover, Degradation, Throttling
Alerting
Detect, Filter, Alert, Diagnostics
SLA
Performance Data Loss Business Logic
TX Processing Low Latency Nope Best Effort
Stream
Processing High Throughput Best Effort Best Effort
Batch Processing High Volume Nope Reconciliation
Automatic Failover
http fencing (Incapsula)
http load balancing (ELB)
instance restart (Scaling Group)
process restart (upstart)
exceptions bubble up and crash
Graceful Degradation
nginx (lua)
expressjs (nodejs)
storm (java)
Stability
Code
Changes
Throttling (without back-pressure)
request priority reduced when TX/sec > thresh
Different priority →
Different queue →
Different worker
lower priority inside queue for test probes
Detect -> Filter -> Alert -> Manual Diagnostics
Alerting
Detection
filter &
route
alert
diagnostics
redundancy
CloudWatch/CollectD - fast, no root cause
App events (exceptions) - too noisy, root cause
Pingdom probes - low coverage, reliable
Internal probes - better coverage, false alarms
cloudwatch
pagerduty
alert
(no riemann)
system test
pagerduty
alert
(riemann needed)
filter tests using a state machine
filter tests using a state machine
(tagged "apisRegression"
(pagerduty-test-dispatch "1234567892ed295d91"))
(defn pagerduty-test-dispatch
[key]
(let [pd (pagerduty key)]
(changed-state {:init "passed"}
(where (state "passed") (:resolve pd))
(where (state "failed") (:trigger pd)))))
re-open manually resolved alert
re-open manually resolved alert
(tagged "apisRegression"
(pagerduty-test-dispatch "1234567892ed295d91"))
(defn pagerduty-test-dispatch
[key]
(let [pd (pagerduty key)]
(sdo
(changed-state {:init "passed"}
(where (state "passed") (:resolve pd)))
(where (state "failed")
(by [:host :service]
(throttle 1 60 (:trigger pd)))))))
Diagnostics - Storm topology timing
Diagnostics - Storm timelines
#2 risk
Slowing down merchant's website
Alerting
Monitor each and every browser
Aggregations (per browser type)
Notify on thresholds
Monitoring our javascript snippet
Timeouts
Exceptions by browser
Exception aggregation
Monitoring new versions
Riemann's Index (server monitoring)
key (host+service) event TTL
10.0.0.1-redisfree { "metric":"5"} 60
10.0.0.1-probe1 {"state":"failed"} 300
10.0.0.2-probe1 { "state":"passed"} 300
Riemann's Index
key (host+service) event TTL
199.25.1.1-1234 {"state":"loaded"} 300
199.25.2.1-4567 {"state":"downloaded"} 300
199.25.3.1-8901 {"state":"loaded"} 300
For our use case:
host=browser-ip, service=cookie
Riemann's state machine
(index)
stores last event and creates expired events (TTL)
(by [:host :service] stream)
creates a new stream for each host/service
(by-host-service stream) - forter's fork only
also closes stream when TTL expires
(defn calc-load-time [& children]
(by-host-service
(changed :state {:pairs? true}
(smap (fn [[previous current]]
(cond
(and (= (:state previous) "downloaded")
(= (:state current) "loaded"))
(assoc previous :metric (- (:time current) (:time previous)))
(and (= (:state previous) "downloaded")
(= (:state current) "expired"))
(assoc previous :metric (* JS_TIMEOUT 1000))))
children))))
(defn aggregate-by-browser [& children]
(by [:browser]
(fixed-time-window 60
(sdo
(smap folds/median
(tag "median-load-time"
children))
(smap folds/count
(tag "load-count"
children)))))))
#3 risk
Wrong decision (approve/decline)
Alerting
Anomaly detection
Motivation
Control false alarms mathematically
Threshold per customer
Threshold seasonality
Alert me if
the probability that we decline
more than k out of n transactions
given probability p
is 1 in a million (t=0.0001%)
n number of tx (30 minutes)
k number of declined txs (30 minutes)
p per customer declined/total (24 hours)
t alert threshold
Binomial Distribution Assumption
External events are uncorrelated
What happens when a customer retries the
same Tx because the first one was declined?
Questions?
email itai@forter.com
http://tech.forter.com
http://www.softwarearchitectureaddict.com

Weitere Àhnliche Inhalte

Was ist angesagt?

Building an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarBuilding an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarScyllaDB
 
End to-end monitoring with the prometheus operator - Max Inden
End to-end monitoring with the prometheus operator - Max IndenEnd to-end monitoring with the prometheus operator - Max Inden
End to-end monitoring with the prometheus operator - Max IndenParis Container Day
 
Apache Arrow Flight Overview
Apache Arrow Flight OverviewApache Arrow Flight Overview
Apache Arrow Flight OverviewJacques Nadeau
 
Monitoring via Datadog
Monitoring via DatadogMonitoring via Datadog
Monitoring via DatadogKnoldus Inc.
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introductionRico Chen
 
Building infrastructure as code using Terraform - DevOps Krakow
Building infrastructure as code using Terraform - DevOps KrakowBuilding infrastructure as code using Terraform - DevOps Krakow
Building infrastructure as code using Terraform - DevOps KrakowAnton Babenko
 
Compiler Case Study - Design Patterns in C#
Compiler Case Study - Design Patterns in C#Compiler Case Study - Design Patterns in C#
Compiler Case Study - Design Patterns in C#CodeOps Technologies LLP
 
Monitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusMonitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusGrafana Labs
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
Parquet Hadoop Summit 2013
Parquet Hadoop Summit 2013Parquet Hadoop Summit 2013
Parquet Hadoop Summit 2013Julien Le Dem
 
Prometheus and Grafana
Prometheus and GrafanaPrometheus and Grafana
Prometheus and GrafanaLhouceine OUHAMZA
 
Apache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and DevelopersApache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and Developersconfluent
 
Docker Swarm for Beginner
Docker Swarm for BeginnerDocker Swarm for Beginner
Docker Swarm for BeginnerShahzad Masud
 
Scaling Microservices with Kubernetes
Scaling Microservices with KubernetesScaling Microservices with Kubernetes
Scaling Microservices with KubernetesDeivid Hahn Fração
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaAIMDek Technologies
 
Terraform modules and best-practices - September 2018
Terraform modules and best-practices - September 2018Terraform modules and best-practices - September 2018
Terraform modules and best-practices - September 2018Anton Babenko
 
Airflow presentation
Airflow presentationAirflow presentation
Airflow presentationIlias Okacha
 
Managing Egress with Istio
Managing Egress with IstioManaging Egress with Istio
Managing Egress with IstioSolo.io
 

Was ist angesagt? (20)

Building an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarBuilding an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache Pulsar
 
End to-end monitoring with the prometheus operator - Max Inden
End to-end monitoring with the prometheus operator - Max IndenEnd to-end monitoring with the prometheus operator - Max Inden
End to-end monitoring with the prometheus operator - Max Inden
 
Apache Arrow Flight Overview
Apache Arrow Flight OverviewApache Arrow Flight Overview
Apache Arrow Flight Overview
 
Monitoring via Datadog
Monitoring via DatadogMonitoring via Datadog
Monitoring via Datadog
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introduction
 
Building infrastructure as code using Terraform - DevOps Krakow
Building infrastructure as code using Terraform - DevOps KrakowBuilding infrastructure as code using Terraform - DevOps Krakow
Building infrastructure as code using Terraform - DevOps Krakow
 
Compiler Case Study - Design Patterns in C#
Compiler Case Study - Design Patterns in C#Compiler Case Study - Design Patterns in C#
Compiler Case Study - Design Patterns in C#
 
Monitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusMonitoring Kubernetes with Prometheus
Monitoring Kubernetes with Prometheus
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
Parquet Hadoop Summit 2013
Parquet Hadoop Summit 2013Parquet Hadoop Summit 2013
Parquet Hadoop Summit 2013
 
Prometheus and Grafana
Prometheus and GrafanaPrometheus and Grafana
Prometheus and Grafana
 
Apache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and DevelopersApache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and Developers
 
Docker Swarm for Beginner
Docker Swarm for BeginnerDocker Swarm for Beginner
Docker Swarm for Beginner
 
Scaling Microservices with Kubernetes
Scaling Microservices with KubernetesScaling Microservices with Kubernetes
Scaling Microservices with Kubernetes
 
Terraform
TerraformTerraform
Terraform
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Terraform modules and best-practices - September 2018
Terraform modules and best-practices - September 2018Terraform modules and best-practices - September 2018
Terraform modules and best-practices - September 2018
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
 
Airflow presentation
Airflow presentationAirflow presentation
Airflow presentation
 
Managing Egress with Istio
Managing Egress with IstioManaging Egress with Istio
Managing Egress with Istio
 

Andere mochten auch

Car Alarms & Smoke Alarms [Monitorama]
Car Alarms & Smoke Alarms [Monitorama]Car Alarms & Smoke Alarms [Monitorama]
Car Alarms & Smoke Alarms [Monitorama]Dan Slimmon
 
DevOps & Security from an Enterprise Toolsmith's Perspective
DevOps & Security from an Enterprise Toolsmith's PerspectiveDevOps & Security from an Enterprise Toolsmith's Perspective
DevOps & Security from an Enterprise Toolsmith's Perspectivedev2ops
 
Revolutionizing Enterprise Software Development through Continuous Delivery &...
Revolutionizing Enterprise Software Development through Continuous Delivery &...Revolutionizing Enterprise Software Development through Continuous Delivery &...
Revolutionizing Enterprise Software Development through Continuous Delivery &...People10 Technosoft Private Limited
 
AMIS 25: DevOps Best Practice for Oracle SOA and BPM
AMIS 25: DevOps Best Practice for Oracle SOA and BPMAMIS 25: DevOps Best Practice for Oracle SOA and BPM
AMIS 25: DevOps Best Practice for Oracle SOA and BPMMatt Wright
 
Achieving DevOps using Open Source Tools in the Enterprise
Achieving DevOps using Open Source Tools in the EnterpriseAchieving DevOps using Open Source Tools in the Enterprise
Achieving DevOps using Open Source Tools in the EnterpriseCollabNet
 
Continuous Deployment at Etsy: A Tale of Two Approaches
Continuous Deployment at Etsy: A Tale of Two ApproachesContinuous Deployment at Etsy: A Tale of Two Approaches
Continuous Deployment at Etsy: A Tale of Two ApproachesRoss Snyder
 
DevOps: A Culture Transformation, More than Technology
DevOps: A Culture Transformation, More than TechnologyDevOps: A Culture Transformation, More than Technology
DevOps: A Culture Transformation, More than TechnologyCA Technologies
 
Accenture DevOps: Delivering applications at the pace of business
Accenture DevOps: Delivering applications at the pace of businessAccenture DevOps: Delivering applications at the pace of business
Accenture DevOps: Delivering applications at the pace of businessAccenture Technology
 

Andere mochten auch (10)

Car Alarms & Smoke Alarms [Monitorama]
Car Alarms & Smoke Alarms [Monitorama]Car Alarms & Smoke Alarms [Monitorama]
Car Alarms & Smoke Alarms [Monitorama]
 
DevOps & Security from an Enterprise Toolsmith's Perspective
DevOps & Security from an Enterprise Toolsmith's PerspectiveDevOps & Security from an Enterprise Toolsmith's Perspective
DevOps & Security from an Enterprise Toolsmith's Perspective
 
Revolutionizing Enterprise Software Development through Continuous Delivery &...
Revolutionizing Enterprise Software Development through Continuous Delivery &...Revolutionizing Enterprise Software Development through Continuous Delivery &...
Revolutionizing Enterprise Software Development through Continuous Delivery &...
 
AMIS 25: DevOps Best Practice for Oracle SOA and BPM
AMIS 25: DevOps Best Practice for Oracle SOA and BPMAMIS 25: DevOps Best Practice for Oracle SOA and BPM
AMIS 25: DevOps Best Practice for Oracle SOA and BPM
 
Achieving DevOps using Open Source Tools in the Enterprise
Achieving DevOps using Open Source Tools in the EnterpriseAchieving DevOps using Open Source Tools in the Enterprise
Achieving DevOps using Open Source Tools in the Enterprise
 
Continuous Deployment at Etsy: A Tale of Two Approaches
Continuous Deployment at Etsy: A Tale of Two ApproachesContinuous Deployment at Etsy: A Tale of Two Approaches
Continuous Deployment at Etsy: A Tale of Two Approaches
 
DevOps
DevOpsDevOps
DevOps
 
DevOps: A Culture Transformation, More than Technology
DevOps: A Culture Transformation, More than TechnologyDevOps: A Culture Transformation, More than Technology
DevOps: A Culture Transformation, More than Technology
 
Introducing DevOps
Introducing DevOpsIntroducing DevOps
Introducing DevOps
 
Accenture DevOps: Delivering applications at the pace of business
Accenture DevOps: Delivering applications at the pace of businessAccenture DevOps: Delivering applications at the pace of business
Accenture DevOps: Delivering applications at the pace of business
 

Ähnlich wie Monitoring patterns for mitigating technical risk

"How about no grep and zabbix?". ELK based alerts and metrics.
"How about no grep and zabbix?". ELK based alerts and metrics."How about no grep and zabbix?". ELK based alerts and metrics.
"How about no grep and zabbix?". ELK based alerts and metrics.Vladimir Pavkin
 
Logging makes perfect - Riemann, Elasticsearch and friends
Logging makes perfect - Riemann, Elasticsearch and friendsLogging makes perfect - Riemann, Elasticsearch and friends
Logging makes perfect - Riemann, Elasticsearch and friendsItamar
 
Failure Self Defense: Defend your App against failures in a (micro) services ...
Failure Self Defense: Defend your App against failures in a (micro) services ...Failure Self Defense: Defend your App against failures in a (micro) services ...
Failure Self Defense: Defend your App against failures in a (micro) services ...Tony Fabeen
 
Zabbix Smart problem detection - FISL 2015 workshop
Zabbix Smart problem detection - FISL 2015 workshopZabbix Smart problem detection - FISL 2015 workshop
Zabbix Smart problem detection - FISL 2015 workshopZabbix
 
Chicago Flink Meetup: Flink's streaming architecture
Chicago Flink Meetup: Flink's streaming architectureChicago Flink Meetup: Flink's streaming architecture
Chicago Flink Meetup: Flink's streaming architectureRobert Metzger
 
Flink Streaming Hadoop Summit San Jose
Flink Streaming Hadoop Summit San JoseFlink Streaming Hadoop Summit San Jose
Flink Streaming Hadoop Summit San JoseKostas Tzoumas
 
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMAnalitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMjavier ramirez
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixBrendan Gregg
 
Exploring the Final Frontier of Data Center Orchestration: Network Elements -...
Exploring the Final Frontier of Data Center Orchestration: Network Elements -...Exploring the Final Frontier of Data Center Orchestration: Network Elements -...
Exploring the Final Frontier of Data Center Orchestration: Network Elements -...Puppet
 
Practical non blocking microservices in java 8
Practical non blocking microservices in java 8Practical non blocking microservices in java 8
Practical non blocking microservices in java 8Michal Balinski
 
Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)Stephan Ewen
 
Nagios Conference 2007 | Nagios in very large Environments by Werner Neunteufl
Nagios Conference 2007 | Nagios in very large Environments by Werner NeunteuflNagios Conference 2007 | Nagios in very large Environments by Werner Neunteufl
Nagios Conference 2007 | Nagios in very large Environments by Werner NeunteuflNETWAYS
 
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv Amazon Web Services
 
Reactive Microservices with JRuby and Docker
Reactive Microservices with JRuby and DockerReactive Microservices with JRuby and Docker
Reactive Microservices with JRuby and DockerJohn Scattergood
 
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015Robert Metzger
 
Network Automation with Salt and NAPALM: a self-resilient network
Network Automation with Salt and NAPALM: a self-resilient networkNetwork Automation with Salt and NAPALM: a self-resilient network
Network Automation with Salt and NAPALM: a self-resilient networkCloudflare
 
Monitoring with Syslog and EventMachine
Monitoring with Syslog and EventMachineMonitoring with Syslog and EventMachine
Monitoring with Syslog and EventMachineWooga
 
Cisco Router Security
Cisco Router SecurityCisco Router Security
Cisco Router Securitykktamang
 
The post release technologies of Crysis 3 (Slides Only) - Stewart Needham
The post release technologies of Crysis 3 (Slides Only) - Stewart NeedhamThe post release technologies of Crysis 3 (Slides Only) - Stewart Needham
The post release technologies of Crysis 3 (Slides Only) - Stewart NeedhamStewart Needham
 
FME Server Meets the Challenge of Real-time
FME Server Meets the Challenge of Real-timeFME Server Meets the Challenge of Real-time
FME Server Meets the Challenge of Real-timeSafe Software
 

Ähnlich wie Monitoring patterns for mitigating technical risk (20)

"How about no grep and zabbix?". ELK based alerts and metrics.
"How about no grep and zabbix?". ELK based alerts and metrics."How about no grep and zabbix?". ELK based alerts and metrics.
"How about no grep and zabbix?". ELK based alerts and metrics.
 
Logging makes perfect - Riemann, Elasticsearch and friends
Logging makes perfect - Riemann, Elasticsearch and friendsLogging makes perfect - Riemann, Elasticsearch and friends
Logging makes perfect - Riemann, Elasticsearch and friends
 
Failure Self Defense: Defend your App against failures in a (micro) services ...
Failure Self Defense: Defend your App against failures in a (micro) services ...Failure Self Defense: Defend your App against failures in a (micro) services ...
Failure Self Defense: Defend your App against failures in a (micro) services ...
 
Zabbix Smart problem detection - FISL 2015 workshop
Zabbix Smart problem detection - FISL 2015 workshopZabbix Smart problem detection - FISL 2015 workshop
Zabbix Smart problem detection - FISL 2015 workshop
 
Chicago Flink Meetup: Flink's streaming architecture
Chicago Flink Meetup: Flink's streaming architectureChicago Flink Meetup: Flink's streaming architecture
Chicago Flink Meetup: Flink's streaming architecture
 
Flink Streaming Hadoop Summit San Jose
Flink Streaming Hadoop Summit San JoseFlink Streaming Hadoop Summit San Jose
Flink Streaming Hadoop Summit San Jose
 
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMAnalitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at Netflix
 
Exploring the Final Frontier of Data Center Orchestration: Network Elements -...
Exploring the Final Frontier of Data Center Orchestration: Network Elements -...Exploring the Final Frontier of Data Center Orchestration: Network Elements -...
Exploring the Final Frontier of Data Center Orchestration: Network Elements -...
 
Practical non blocking microservices in java 8
Practical non blocking microservices in java 8Practical non blocking microservices in java 8
Practical non blocking microservices in java 8
 
Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)
 
Nagios Conference 2007 | Nagios in very large Environments by Werner Neunteufl
Nagios Conference 2007 | Nagios in very large Environments by Werner NeunteuflNagios Conference 2007 | Nagios in very large Environments by Werner Neunteufl
Nagios Conference 2007 | Nagios in very large Environments by Werner Neunteufl
 
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
 
Reactive Microservices with JRuby and Docker
Reactive Microservices with JRuby and DockerReactive Microservices with JRuby and Docker
Reactive Microservices with JRuby and Docker
 
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
 
Network Automation with Salt and NAPALM: a self-resilient network
Network Automation with Salt and NAPALM: a self-resilient networkNetwork Automation with Salt and NAPALM: a self-resilient network
Network Automation with Salt and NAPALM: a self-resilient network
 
Monitoring with Syslog and EventMachine
Monitoring with Syslog and EventMachineMonitoring with Syslog and EventMachine
Monitoring with Syslog and EventMachine
 
Cisco Router Security
Cisco Router SecurityCisco Router Security
Cisco Router Security
 
The post release technologies of Crysis 3 (Slides Only) - Stewart Needham
The post release technologies of Crysis 3 (Slides Only) - Stewart NeedhamThe post release technologies of Crysis 3 (Slides Only) - Stewart Needham
The post release technologies of Crysis 3 (Slides Only) - Stewart Needham
 
FME Server Meets the Challenge of Real-time
FME Server Meets the Challenge of Real-timeFME Server Meets the Challenge of Real-time
FME Server Meets the Challenge of Real-time
 

KĂŒrzlich hochgeladen

why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
CALL ON ➄8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂
CALL ON ➄8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂CALL ON ➄8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂
CALL ON ➄8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂anilsa9823
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfWilly Marroquin (WillyDevNET)
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto GonzĂĄlez Trastoy
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....ShaimaaMohamedGalal
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
(Genuine) Escort Service Lucknow | Starting â‚č,5K To @25k with A/C đŸ§‘đŸœâ€â€ïžâ€đŸ§‘đŸ» 89...
(Genuine) Escort Service Lucknow | Starting â‚č,5K To @25k with A/C đŸ§‘đŸœâ€â€ïžâ€đŸ§‘đŸ» 89...(Genuine) Escort Service Lucknow | Starting â‚č,5K To @25k with A/C đŸ§‘đŸœâ€â€ïžâ€đŸ§‘đŸ» 89...
(Genuine) Escort Service Lucknow | Starting â‚č,5K To @25k with A/C đŸ§‘đŸœâ€â€ïžâ€đŸ§‘đŸ» 89...gurkirankumar98700
 

KĂŒrzlich hochgeladen (20)

why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
CALL ON ➄8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂
CALL ON ➄8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂CALL ON ➄8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂
CALL ON ➄8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂
 
Call Girls In Mukherjee Nagar đŸ“± 9999965857 đŸ€© Delhi đŸ«Š HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar đŸ“±  9999965857  đŸ€© Delhi đŸ«Š HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar đŸ“±  9999965857  đŸ€© Delhi đŸ«Š HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar đŸ“± 9999965857 đŸ€© Delhi đŸ«Š HOT AND SEXY VVIP 🍎 SE...
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
(Genuine) Escort Service Lucknow | Starting â‚č,5K To @25k with A/C đŸ§‘đŸœâ€â€ïžâ€đŸ§‘đŸ» 89...
(Genuine) Escort Service Lucknow | Starting â‚č,5K To @25k with A/C đŸ§‘đŸœâ€â€ïžâ€đŸ§‘đŸ» 89...(Genuine) Escort Service Lucknow | Starting â‚č,5K To @25k with A/C đŸ§‘đŸœâ€â€ïžâ€đŸ§‘đŸ» 89...
(Genuine) Escort Service Lucknow | Starting â‚č,5K To @25k with A/C đŸ§‘đŸœâ€â€ïžâ€đŸ§‘đŸ» 89...
 

Monitoring patterns for mitigating technical risk

Hinweis der Redaktion

  1. riemann is an event streaming processing server written in clojure tailored for monitoring. It receives input from the infrastructure (OS/queues/DBs) and instrumented applications (all prog languages) processes the events (enrichement, filtering, aggregation) and forwards them for visualization and alerting. explain the pros/cons of infra monitoring (plugins), application monitoring (custom events), and system probes. the need for processing before a machine wakes you up at night (the role of pagerduty in all of this) and the need for visualization when you wake up at night.
  2. riemann is an event streaming processing server written in clojure tailored for monitoring. It receives input from the infrastructure (OS/queues/DBs) and instrumented applications (all prog languages) processes the events (enrichement, filtering, aggregation) and forwards them for visualization and alerting. explain the pros/cons of infra monitoring (plugins), application monitoring (custom events), and system probes. the need for processing before a machine wakes you up at night (the role of pagerduty in all of this) and the need for visualization when you wake up at night.
  3. riemann is an event streaming processing server written in clojure tailored for monitoring. It receives input from the infrastructure (OS/queues/DBs) and instrumented applications (all prog languages) processes the events (enrichement, filtering, aggregation) and forwards them for visualization and alerting. explain the pros/cons of infra monitoring (plugins), application monitoring (custom events), and system probes. the need for processing before a machine wakes you up at night (the role of pagerduty in all of this) and the need for visualization when you wake up at night.
  4. riemann is an event streaming processing server written in clojure tailored for monitoring. It receives input from the infrastructure (OS/queues/DBs) and instrumented applications (all prog languages) processes the events (enrichement, filtering, aggregation) and forwards them for visualization and alerting. explain the pros/cons of infra monitoring (plugins), application monitoring (custom events), and system probes. the need for processing before a machine wakes you up at night (the role of pagerduty in all of this) and the need for visualization when you wake up at night.
  5. riemann is an event streaming processing server written in clojure tailored for monitoring. It receives input from the infrastructure (OS/queues/DBs) and instrumented applications (all prog languages) processes the events (enrichement, filtering, aggregation) and forwards them for visualization and alerting. explain the pros/cons of infra monitoring (plugins), application monitoring (custom events), and system probes. the need for processing before a machine wakes you up at night (the role of pagerduty in all of this) and the need for visualization when you wake up at night.
  6. On the left: example for cloudwatch alert for the load balancer detecting REST API error 500 (internal server error) On the right: example for escalation policies. alert for a system test (probe) that failed.
  7. On the left: example for cloudwatch alert for the load balancer detecting REST API error 500 (internal server error) On the right: example for escalation policies. alert for a system test (probe) that failed.
  8. PagerDuty has an alerts panel. An alert is either triggered or resolved. PD assigns the alert to whoever is on duty and then notifies him based on predefined rules (for example call if the alert has not been resolved for 15 minutes). However, PagerDuty API has a throttler. We cannot send every test result to PD every time the test runs. In essence we want to replicate the riemann state into PD. Only when the riemann state changes (a test that previously passed now fails or vica versa) we update PD. Riemann makes it easy to define a state machine by storing the last's event per [host+service] combination.
  9. PagerDuty has an alerts panel. An alert is either triggered or resolved. PD assigns the alert to whoever is on duty and then notifies him based on predefined rules (for example call if the alert has not been resolved for 15 minutes). However, PagerDuty API has a throttler. We cannot send every test result to PD every time the test runs. In essence we want to replicate the riemann state into PD. Only when the riemann state changes (a test that previously passed now fails or vica versa) we update PD. Riemann makes it easy to define a state machine by storing the last's event per [host+service] combination.
  10. But what happens if we manually resolved the alert on PD and the test keeps failing? You would want the alert to be repoened again. This means that we need to filter passed test events using state machine, but not filter failed tests at all. We just throttle them to no more than 1 per minute per test permutation [host service] combination.
  11. But what happens if we manually resolved the alert on PD and the test keeps failing? You would want the alert to be repoened again. This means that we need to filter passed test events using state machine, but not filter failed tests at all. We just throttle them to no more than 1 per minute per test permutation [host service] combination.
  12. filtering widgets that were not been possible without riemann enrichmenent. The quick filters allow fast zoom-in on the branch you are working on, and werether you would like to see test probes or not.