From Ceilometer
to Telemetry
Not so alarming!

A Julien Danjou & Nick Barcet presentation
for
OpenStack in action! 4
on the 5th December 2013
Speakers
Nick Barcet
VP Products @ eNovance
Co-founded the Ceilometer project at the Folsom
summit and led the project through incubation
Julien Danjou
Ceilometer Lead Dev @ eNovance
Has been a core Ceilometer contributor from the
outset, taking over the PTL reins for Havana
State of the project
● Officially named OpenStack Telemetry
● Havana is the first integrated release
● Community growth
○ Grizzly: 30 contributors, 267 commits
○ Havana: 57 contributors, 434 commits
What was done during
the Havana cycle?
UDP transport
● Faster, stateless
● Lighter (msgpack encoding)
but…

● No guaranteed delivery
● Not signed
▶ Use case: gathering metrics for alarms
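The trade-off above can be sketched as a fire-and-forget sender. This is a minimal illustration, not Ceilometer's own publisher: it uses JSON from the standard library as a stand-in for msgpack, and the function names, host and port are hypothetical.

```python
import json
import socket

# Largest payload that fits in a single IPv4 UDP datagram.
MAX_UDP_PAYLOAD = 65507


def encode_sample(sample):
    """Serialize a sample to bytes (JSON here, as a stand-in for msgpack)."""
    payload = json.dumps(sample, separators=(",", ":")).encode("utf-8")
    if len(payload) > MAX_UDP_PAYLOAD:
        raise ValueError("sample does not fit in one UDP datagram")
    return payload


def send_sample(sample, host, port):
    """Send one sample as a single datagram: no ack, no retry, no signature."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(encode_sample(sample), (host, port))
```

A call such as send_sample({"counter_name": "cpu_util", "counter_volume": 12.5}, "127.0.0.1", 4952) returns as soon as the datagram is handed to the kernel, which is why a lost sample goes unnoticed. That is acceptable for alarm metrics, where the next sample soon replaces a missing one.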
Improved API
● Group samples by fields when requesting
statistics (?groupby[]=user_id)
● Limit the number of items returned (?limit=42)
● Provide links to other resources in the API
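As a rough sketch of how a client could build such requests, the helper below appends the two query parameters shown above to an API path. The helper name, the host and the example paths are assumptions made for illustration; only the parameter spellings come from the slide.

```python
from urllib.parse import urlencode


def api_url(base_url, path, groupby=(), limit=None):
    """Build an API URL with optional groupby[] and limit parameters."""
    params = [("groupby[]", field) for field in groupby]
    if limit is not None:
        params.append(("limit", limit))
    # Keep the brackets literal so the URL reads like the slide.
    query = urlencode(params, safe="[]")
    url = f"{base_url}{path}"
    return f"{url}?{query}" if query else url


# Hypothetical endpoint and paths, for illustration only.
BASE = "http://ceilometer.example:8777"
stats_url = api_url(BASE, "/v2/meters/cpu_util/statistics", groupby=["user_id"])
list_url = api_url(BASE, "/v2/meters/cpu_util", limit=42)
```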
Send your own samples
Users or operators can
send samples
➔ Leverage the
statistics
➔ Usable for alarming

POST /v2/meters/mymeter
[{
    "counter_type": "gauge",
    "counter_unit": "megabyte",
    "counter_volume": 142.0,
    "user_id": "efd87807-12d2-4b38-9c70-5f5c2ac427ff",
    "project_id": "35b17138-b364-4e6a-a131-8f3099c5be68",
    "resource_id": "bd9431c1-8d69-4ad3-803a-8d4a6b89fd36",
    "resource_metadata": {
        "name1": "value1",
        "name2": "value2"
    },
    "source": "mypaasplatform",
    "timestamp": "2013-09-10T20:34:13.711330"
}]
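A minimal sketch of preparing that POST from Python, using only the standard library. The host, the token value and the function name are hypothetical, and the sketch assumes the deployment authenticates requests with a Keystone token passed in X-Auth-Token. The request is built but not sent.

```python
import json
import urllib.request


def build_sample_request(base_url, meter, samples, token):
    """Prepare (but do not send) a POST of user-defined samples."""
    return urllib.request.Request(
        url=f"{base_url}/v2/meters/{meter}",
        data=json.dumps(samples).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,
        },
        method="POST",
    )


# A subset of the fields from the slide.
sample = {
    "counter_type": "gauge",
    "counter_unit": "megabyte",
    "counter_volume": 142.0,
    "resource_id": "bd9431c1-8d69-4ad3-803a-8d4a6b89fd36",
    "resource_metadata": {"name1": "value1", "name2": "value2"},
    "source": "mypaasplatform",
}
request = build_sample_request(
    "http://ceilometer.example:8777", "mymeter", [sample], "TOKEN")
# urllib.request.urlopen(request) would actually send it.
```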
New storage backends
Database TTL
Previously:
No way to purge data.
Ceilometer produces a lot of data
(gigabytes per day)
Now:
ceilometer-expirer will drop data older
than the configured time-to-live delay
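The expiry rule can be sketched in a few lines. This is an illustration of the idea only, not the ceilometer-expirer implementation: the function name and the in-memory list of samples are assumptions, whereas the real tool works against the storage backend.

```python
from datetime import datetime, timedelta


def expire_samples(samples, ttl_seconds, now):
    """Return only the samples whose timestamp is within the TTL window."""
    cutoff = now - timedelta(seconds=ttl_seconds)
    return [s for s in samples if s["timestamp"] >= cutoff]


now = datetime(2013, 12, 5, 12, 0, 0)
stored = [
    {"id": 1, "timestamp": now - timedelta(days=40)},
    {"id": 2, "timestamp": now - timedelta(days=1)},
]
# With a 30-day time-to-live, only the one-day-old sample is kept.
kept = expire_samples(stored, ttl_seconds=30 * 24 * 3600, now=now)
```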
Hyper-V

➔ Disk, network and CPU usage
New meters
● API endpoints
○ Meter the requests made to API servers (Neutron,
Glance, Nova, Swift, etc.)

● Neutron bandwidth
○ Meter the bandwidth consumed by each project
○ Traffic labeled as configured by operator
(based on source/destination)
Neutron Traffic Labels
[Diagram: VM traffic is labeled by its other end: "Ext" for the Internet, "Compute" for other VMs, "Object" for Swift nodes.]
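To make the labeling idea concrete, here is a small sketch that classifies traffic by remote address and sums it per project and label. The label names come from the slide; the address ranges, function names and flow tuples are hypothetical, and this is not how Neutron itself implements metering.

```python
import ipaddress

# Hypothetical operator configuration: label -> remote network.
LABEL_RULES = [
    ("Compute", ipaddress.ip_network("10.0.0.0/16")),
    ("Object", ipaddress.ip_network("10.1.0.0/16")),
]
DEFAULT_LABEL = "Ext"  # anything else is treated as Internet traffic


def label_for(remote_ip):
    """Return the first label whose network contains the remote address."""
    addr = ipaddress.ip_address(remote_ip)
    for label, network in LABEL_RULES:
        if addr in network:
            return label
    return DEFAULT_LABEL


def bandwidth_by_project(flows):
    """Sum bytes per (project_id, label) from (project_id, remote_ip, bytes)."""
    totals = {}
    for project_id, remote_ip, nbytes in flows:
        key = (project_id, label_for(remote_ip))
        totals[key] = totals.get(key, 0) + nbytes
    return totals
```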
Alarms

Regularly watch meter statistics and trigger
actions based on threshold crossings.
Alarms architecture
[Diagram: the Ceilometer alarm evaluator and the Ceilometer API are linked over HTTP; the evaluator and several Ceilometer alarm notifiers are linked over the RPC bus; each notifier triggers an action: Webhook, SMS, e-mail…]
Alarm types
● Threshold alarms
Triggered once a value crosses a threshold
“Call a Webhook as soon as CPU usage goes above 80%”

● Combination alarms
Triggered once all alarms in the combination are triggered
“Call a Webhook as soon as alarm “foo” and alarm “bar” are
triggered”
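Both alarm types can be sketched as small pure functions. This is a simplified illustration rather than the evaluator's actual logic: it assumes an alarm fires only when every one of the most recent periods crosses the threshold, and the function names are hypothetical.

```python
import operator

COMPARATORS = {
    "lt": operator.lt, "le": operator.le, "eq": operator.eq,
    "ne": operator.ne, "ge": operator.ge, "gt": operator.gt,
}


def threshold_state(values, comparison_operator, threshold, evaluation_periods):
    """Return 'alarm', 'ok' or 'insufficient data' for the latest periods.

    Assumes evaluation_periods >= 1 and one statistic value per period.
    """
    recent = values[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "insufficient data"
    compare = COMPARATORS[comparison_operator]
    if all(compare(v, threshold) for v in recent):
        return "alarm"
    return "ok"


def combination_state(states):
    """A combination alarm fires once every underlying alarm has fired."""
    return "alarm" if states and all(s == "alarm" for s in states) else "ok"
```

For the CPU example above, threshold_state([70, 85, 90], "gt", 80, 2) evaluates the last two periods (85 and 90) against the 80% threshold.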
Alarms API
POST /v2/alarms

GET /v2/alarms/foobar
PUT /v2/alarms/foobar

{
    "alarm_actions": ["http://site:8000/alarm"],
    "insufficient_data_actions": ["http://site:8000/nodata"],
    "ok_actions": ["http://site:8000/ok"],
    "comparison_operator": "gt",
    "description": "An alarm",
    "evaluation_periods": 2,
    "matching_metadata": {"key_name": "key_value"},
    "meter_name": "storage.objects",
    "name": "SwiftObjectAlarm",
    "period": 240,
    "statistic": "avg",
    "threshold": 200.0
}

DELETE /v2/alarms/foobar
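The three *_actions fields pair each alarm state with a list of URLs. A minimal sketch of that mapping follows, assuming actions run only on a state change; the function name and the state strings used as dictionary keys are illustrative assumptions.

```python
ACTION_FIELDS = {
    "alarm": "alarm_actions",
    "ok": "ok_actions",
    "insufficient data": "insufficient_data_actions",
}


def actions_for_transition(alarm, previous_state, new_state):
    """Return the URLs to notify, or [] when the state did not change."""
    if new_state == previous_state:
        return []
    return list(alarm.get(ACTION_FIELDS[new_state], []))


# The action lists from the alarm definition above.
alarm = {
    "name": "SwiftObjectAlarm",
    "alarm_actions": ["http://site:8000/alarm"],
    "insufficient_data_actions": ["http://site:8000/nodata"],
    "ok_actions": ["http://site:8000/ok"],
}
urls = actions_for_transition(alarm, "ok", "alarm")
```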
Heat & auto-scaling
[Diagram: the Heat Engine creates alarms through the Ceilometer API service and injects user metadata into the instance in my_stack; the Compute Agent monitors instances; the Alarm evaluator triggers the alarm.]
Heat & auto-scaling
[Diagram: Ceilometer alarming reaches the Heat Engine, which scales out the stack and injects user metadata; my_stack now holds three instances.]
Heat & auto-scaling
[Diagram: same flow as the previous slide; my_stack has scaled out to five instances.]
Events storage
(Almost) all OpenStack components send notifications on
events: let’s store them.
➔ Useful to be able to re-generate samples
➔ Useful to generate new samples we did not think about
➔ Allows double-entry accounting
➔ Auditability
Not yet complete, to be continued in Icehouse
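One way to picture re-generating samples from stored events is to replay them through per-type converter functions. The event layout (event_type, generated, traits) and the converter below are hypothetical and only illustrate the idea; they are not the storage schema.

```python
def samples_from_events(events, converters):
    """Replay stored events through per-type converters to rebuild samples."""
    samples = []
    for event in events:
        convert = converters.get(event["event_type"])
        if convert is not None:
            samples.append(convert(event))
    return samples


def instance_created(event):
    # Hypothetical converter: one 'instance' sample per creation event.
    return {
        "counter_name": "instance",
        "counter_type": "gauge",
        "counter_volume": 1,
        "resource_id": event["traits"]["instance_id"],
        "timestamp": event["generated"],
    }


events = [
    {"event_type": "compute.instance.create.end",
     "generated": "2013-12-05T10:00:00",
     "traits": {"instance_id": "vm-1"}},
    {"event_type": "image.upload",
     "generated": "2013-12-05T10:05:00",
     "traits": {}},
]
rebuilt = samples_from_events(
    events, {"compute.instance.create.end": instance_created})
```

Adding a converter later and replaying the same stored events is what would make it possible to produce samples nobody thought about when the events were recorded.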
Exciting ideas for Icehouse
we’re going to hack on.
General improvements
● Split the collector in two logical pieces
● Rely on notification for samples rather than
RPC
● Bring the SQLAlchemy and MongoDB drivers
almost to parity
● Support for hardware polling
● Support Ironic
API improvements
● Complex filtering and query DSL
x OR y AND z

● /v2/samples
(a.k.a. /v2/meters without the meter)
● Return rate rather than absolute value
● More statistics functions (rate of change,
moving-window averages…)
● Bulk requests
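Two of the proposed statistics functions are easy to sketch client-side. The code below only illustrates what "rate of change" and "moving-window average" mean. It is not a proposed API, and the function names and the (timestamp, value) input shape are assumptions.

```python
def rates_of_change(samples):
    """Per-second rate between consecutive (timestamp_seconds, value) samples."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t1 > t0:  # skip duplicate or out-of-order timestamps
            rates.append((v1 - v0) / (t1 - t0))
    return rates


def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

For a cumulative meter sampled every 10 seconds, rates_of_change([(0, 100), (10, 150), (20, 250)]) turns absolute values into per-second rates, which is the "rate rather than absolute value" idea from the list above.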
Alarming
● Exclude low sample counts
● Allow time-constrained alarms
Distributed polling
Leveraging Tooz and Taskflow to distribute
tasks among workers (agents).
★ Ability to distribute the polling
★ Replace alarm evaluator custom distributor
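The general idea of spreading polling work across agents can be sketched with plain hash partitioning. This stand-in does not use Tooz or Taskflow and leaves out group membership and failure handling; the function names and worker identifiers are hypothetical.

```python
import hashlib


def owner_of(resource_id, workers):
    """Pick the worker responsible for a resource by hashing its id."""
    digest = hashlib.sha256(resource_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(workers)
    # Sort so every agent computes the same answer from the same member list.
    return sorted(workers)[index]


def partition(resource_ids, workers):
    """Group resources by the worker that should poll them."""
    assignment = {w: [] for w in workers}
    for rid in resource_ids:
        assignment[owner_of(rid, workers)].append(rid)
    return assignment
```

Each agent can run partition() locally on the same list of workers and poll only its own share. A coordination library would be what keeps that list of workers consistent as agents join and leave.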
OpenStack
Telemetry

Ceilometer

#openstack-ceilometer @ Freenode

The end.
Backup slides
Heat & auto-scaling
[Diagram: the Compute Agent reports samples about the instance in my_stack to the Meter store; the Alarm evaluator queries stats through the API service; the Heat Engine provides alarm rules to Ceilometer.]