Dynamic AWS Server Usage
Using Nagios Core
or
How to pay only for what you need
Eric Loyd
eric@bitnetix.com
877.33.VOICE
@Bitnetix @SmartVox
About Bitnetix
About Eric Loyd and Bitnetix
Founder and CEO of Bitnetix Incorporated
VoIP services and IT/network consulting
Over 25 Years in IT and management at places like
Eastman Kodak
Frontier Communications / Global Crossing
Rochester Institute of Technology
Bitnetix started its eighth year in July, 2013
Digital Rochester GREAT Award Finalist in:
2012 for Communications Technology
2013 for Rising Star
Using Nagios since 2004
© 2013 Bitnetix Incorporated
History of SmartVox:
Bitnetix’s VoIP Platform
History of SmartVox, our VoIP Platform
Pre-2012 – not yet called SmartVox
Bitnetix primarily focused on IT consulting
VoIP service was ~10% of business with servers
located primarily at client sites
Custom Asterisk-based servers running FreePBX
We ran the customer’s network, so we had control over VoIP
2012 – Focus switched to VoIP
Focused now on hosted VoIP solutions
Made use of Amazon Web Services EC2 VPS
One per customer with no proxies* or media servers
Network/bandwidth was the only customer responsibility
History of SmartVox, our VoIP Platform
2013 – SmartVox name born
Copyright, trademark, domain name, biz cards, etc.
Third generation born with multiple
proxies, registrars, configuration servers, and media
servers
June – Started Mission Matrix program & sales
AWS architecture leveraged for geography
Each customer gets own EC2 server
Proxies to closest zone, secondary “to the west”
Media servers located in zones based on the number of
simultaneous calls, conferences, etc.
Voicemails (VMs) and call detail records (CDRs) stored in database
Brief Overview of AWS
AWS EC2 Concepts
AWS – Amazon Web Services
Collection of cloud-based services:
Storage (S3), DNS (Route 53), CDN, Server (EC2)
EC2 - Elastic Compute Cloud
Virtual servers in AWS datacenters (zones)
US (3 = VA, CA, OR), EU (1), Asia (3), SA (1)
Persistent storage & flexible IP address assignment
Pay by the hour that it’s up, plus storage and bandwidth
Spot instances – “temporary” EC2 servers
Bring online as needed, terminated when shut down
AWS EC2 Costs
LOTS of variables, but reasonable potential costs:
Reserved servers cost about $2.00 per day
Reserved instance pricing is contractual and static, based on size
Spot servers cost between $0.50-$2.50 per day
Spot instance pricing is dynamic, we assume ~$0.10 per hour
We quantize concurrent calls into 50-call blocks
One media server = 50 calls = 1 spot instance
Two media servers = 100 calls = 2 spot instances
Bandwidth and storage will add ~10%
Reducing AWS usage reduces cost
We keep these savings for ourselves. Shhhh!!!
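The arithmetic on this slide can be sketched in a few lines. This is a minimal Python sketch, assuming the slide's round numbers (50 calls per media server, roughly $0.10 per spot-instance hour, roughly 10% extra for bandwidth and storage). The function and constant names are illustrative and not part of SmartVox.

```python
import math

CALLS_PER_SERVER = 50        # slide: concurrent calls are quantized into 50-call blocks
SPOT_PRICE_PER_HOUR = 0.10   # slide: assumed average spot price in USD
OVERHEAD = 0.10              # slide: bandwidth and storage add ~10%


def media_servers_needed(concurrent_calls: int) -> int:
    """Number of 50-call media servers (one spot instance each) required."""
    if concurrent_calls <= 0:
        return 0
    return math.ceil(concurrent_calls / CALLS_PER_SERVER)


def hourly_cost(concurrent_calls: int) -> float:
    """Estimated hourly cost in USD, including the ~10% overhead."""
    servers = media_servers_needed(concurrent_calls)
    return round(servers * SPOT_PRICE_PER_HOUR * (1 + OVERHEAD), 2)


if __name__ == "__main__":
    for calls in (50, 100, 120):
        print(calls, media_servers_needed(calls), hourly_cost(calls))
```

Because capacity is bought in whole 50-call blocks, 101 concurrent calls cost the same as 150 in this model.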
Why Nagios?
Why Nagios?
Extensive experience using it for clients
Bitnetix is a Nagios reseller
Needed centralized monitoring software
Integrate with Twitter for notifications
Integrate with Eventum via email for trouble tickets
Zero cost
Framework
Leverage SSH, HTTP, check_mk and livestatus!!
Custom checks and notifications (very important)
Ability to “cookie cutter” installs for AWS
Initial Hurdles
Customer Premises Equipment (CPE)
No real control over CPE choices
Routers block some traffic, “help” other traffic incorrectly
Need to be able to remotely [re-]configure phones
Figure out how to “cookie-cutter” EC2 servers
Customer boxes and SIP endpoints
Proxies and media servers
Wanted to monitor upstream providers as well
How to separate apparent from actual failure
Something’s broken, but overall service functional
SmartVox Provisioning
Process and Automation
SmartVox Network
DNS SRV records are key to redundant servers
[Diagram: call flow from Provider to Border Proxy to Customer Proxy, with two border proxies and three customer proxies shown]
Provider: sends incoming calls to one or more border proxies
Border Proxy: figures out which customer should receive the calls
Customer Proxy: sends the call on to the correct phone/media server (VM, etc.)
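To illustrate how SRV records give redundancy, here is a minimal Python sketch that renders zone-file lines in the standard SRV layout (priority, weight, port, target), where a lower priority value is tried first. The domain and proxy host names are made up for the example, and the 3540-second TTL is an assumption taken from the 59-minute TTLs mentioned later in the deck.

```python
from typing import List, Sequence


def srv_records(domain: str, proxies: Sequence[str],
                ttl: int = 3540, port: int = 5060) -> List[str]:
    """Render one SIP SRV line per proxy, listed in order of preference.

    The first proxy gets the lowest priority value, so clients try it first
    and fall back to the next one if it does not answer.
    """
    lines = []
    for rank, target in enumerate(proxies):
        priority = 10 * (rank + 1)  # lower value = preferred
        lines.append(
            f"_sip._udp.{domain}. {ttl} IN SRV {priority} 0 {port} {target}."
        )
    return lines


if __name__ == "__main__":
    # Hypothetical names: closest zone first, then the secondary "to the west".
    for line in srv_records("customer.example.com",
                            ["proxy-east.example.net", "proxy-west.example.net"]):
        print(line)
```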
Provisioning Process
SmartVox AWS EC2 Provisioning Database
Customer information
Account (location/division/etc) information
Number of phones*, VM boxes, etc.
Computes how many proxies customer needs
DNS SRV records created for batch updates
Media server/VM entries created automatically
Phone provisioning info created automatically
Automatically places order for phones* (+some)
Phones drop-shipped to customer in about 3 days
AWS EC2 Automation: Spot Instance API
Create spot instance -> gives request ID
Instance created from a SmartVox-created base image
Wait a bit -> query request ID -> get instance ID
Query instance -> get IP address
Update DNS with server information and IP
Update Nagios with server information and IP
When spot instances shut down, they terminate
No more expense for “burstable resources”
This sounds like a Nagios event handler…
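The request, wait, and query sequence above can be sketched as a polling loop. This is a minimal sketch, not the SmartVox implementation: `get_instance_id` and `get_public_ip` are hypothetical callables standing in for whatever EC2 client is used, and the attempt count and delay are arbitrary assumptions.

```python
import time
from typing import Callable, Optional


def wait_for_instance_ip(
    request_id: str,
    get_instance_id: Callable[[str], Optional[str]],
    get_public_ip: Callable[[str], Optional[str]],
    attempts: int = 30,
    delay_seconds: float = 20.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until the spot request is fulfilled and the instance has an IP.

    Both callables are hypothetical wrappers; each returns None while the
    value is not available yet. Returns None if all attempts are used up.
    """
    for _ in range(attempts):
        instance_id = get_instance_id(request_id)
        if instance_id is not None:
            ip = get_public_ip(instance_id)
            if ip is not None:
                return ip
        sleep(delay_seconds)
    return None
```

Once an address comes back, the caller would queue the DNS update and regenerate the Nagios host file, as the slide describes.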
AWS EC2 Automation: Our Custom Image
SmartVox media server image includes Asterisk
Asterisk told to exit after waiting for calls to terminate
Startup script shuts down system after Asterisk exits
Instant “spot instance”
Bring it online when needed, and terminate as required
Same basic idea for starting/stopping proxies
These tend to be more static than media servers
Platform can be adjusted automatically
COGS adjusts appropriately
Hey, let’s hook this up to Nagios!!
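The "exit, then shut down" behaviour of the image could look roughly like the sketch below. It assumes Asterisk is started in the foreground with `asterisk -f` and asked to drain with the CLI command `core stop gracefully`; the wrapper functions and the injectable `run` parameter are illustrative choices, not the actual SmartVox startup script.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], object]


def _run(cmd: Sequence[str]) -> object:
    """Default runner: execute the command and wait for it to finish."""
    return subprocess.run(list(cmd), check=False)


def request_graceful_stop(run: Runner = _run) -> None:
    """Ask Asterisk to exit once all active calls have ended."""
    run(["asterisk", "-rx", "core stop gracefully"])


def run_until_asterisk_exits(run: Runner = _run) -> None:
    """Startup wrapper: block on Asterisk, then power the system off."""
    run(["asterisk", "-f"])         # -f keeps Asterisk in the foreground, so this call blocks
    run(["shutdown", "-h", "now"])  # per the slides, a spot instance terminates when shut down
```

Passing the runner in as a parameter keeps the sketch testable without starting Asterisk or halting a machine.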
AWS EC2 Automation: More ideas
Quick aside about spot instances. Useful for:
Database dumps
Spot instance turned up to do MySQL copies
Run reports, dump, compress, purge, etc., then terminate
Distributing web server load
Pop up another server and add to DNS
Instant on-demand capacity
Anything that you only want to do repeatedly
but not for a long time, and only when you
want to (or maybe if you have to)
Use Nagios for:
Provisioning
Monitoring
Capacity Planning
Provisioning
Rather than create EC2s, we just update Nagios
Automatically regenerate SIP proxy and media server
dynamic_hosts.cfg file as part of provisioning process
Nagios looks for host up, doesn’t find it, fires off handler
Event handler queries EC2 to see if it’s being turned up (~10
min) or just not running. If it’s not running, it starts it.
DNS is batch updated every hour. 59 min TTLs
Phone provisioning handled via automatic extract from
database to create HTTP served configuration files
Master/slave “config servers” (also in AWS) to send all
this stuff to customers, with a URL to activate phones
Entire process from signature to functional < 1 week
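The host event handler logic described above might be sketched as follows. Nagios can pass the `$HOSTSTATE$` and `$HOSTSTATETYPE$` macros to a handler; everything else here is an assumption: `instance_status` and `start_instance` are hypothetical hooks into EC2, the status strings are invented, and the "investigate" branch for an instance that is running but unreachable is not covered by the slides.

```python
from typing import Callable


def handle_host_event(
    host_state: str,                     # Nagios $HOSTSTATE$: UP, DOWN or UNREACHABLE
    state_type: str,                     # Nagios $HOSTSTATETYPE$: SOFT or HARD
    instance_status: Callable[[], str],  # hypothetical: "pending", "running" or "absent"
    start_instance: Callable[[], None],  # hypothetical: kicks off the spot request
) -> str:
    """Decide what to do when Nagios reports a host state change."""
    if host_state != "DOWN" or state_type != "HARD":
        return "ignored"        # only act on confirmed outages
    status = instance_status()
    if status == "pending":
        return "waiting"        # still being turned up (about 10 minutes)
    if status == "running":
        return "investigate"    # assumption: up in EC2 but not answering checks
    start_instance()            # not running at all, so start it
    return "started"
```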
Monitoring
Nagios looks for hosts (see previous slide)
Automatically creates them if needed
Note that SIP proxies are not spot instances
Dedicated to lifespan of customer/account so they are
only terminated as part of de-provisioning process
Nagios looks at health of services
Determine if we have faults, outages, etc.
Can potentially reroute automatically (DNS SRV!)
Store performance info for capacity calculations
Notifications via Twitter and email
Come back tomorrow at 10:30 for how this works
Capacity Planning
Quantize by 50 simultaneous calls per server
Perf data used to calculate historical usage
Can use cron to automatically add/remove servers
Nagios figures out “deltac” in current usage
If deltac = 0, we are just right (OK)
If deltac < 0, we have too much capacity (WARN)
If deltac > 0, we need more capacity (CRITICAL)
Event handler looks at state and either does
nothing, tells least used box to stop Asterisk, or adds
another box to the mix (see provisioning)
Capacity (and costs) dynamically adjust with usage
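The three-way decision in the event handler can be sketched like this. The state names match the Nagios `$SERVICESTATE$` values; the action labels, the `calls_per_box` mapping, and the host names in the usage line are hypothetical.

```python
from typing import Dict, Optional, Tuple


def choose_action(service_state: str,
                  calls_per_box: Dict[str, int]) -> Tuple[str, Optional[str]]:
    """Map the deltac service state to a capacity action.

    calls_per_box maps media server name to its current concurrent calls.
    """
    if service_state == "WARNING" and calls_per_box:
        # Too much capacity: drain the box carrying the fewest calls.
        least_used = min(calls_per_box, key=calls_per_box.get)
        return ("stop_asterisk", least_used)
    if service_state == "CRITICAL":
        # Not enough capacity: trigger provisioning of another box.
        return ("add_box", None)
    return ("none", None)


if __name__ == "__main__":
    print(choose_action("WARNING", {"media1": 12, "media2": 3}))
```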
Capacity Planning: DeltaC
deltac – Custom Nagios module
Looks at the last three times it ran on a particular host
Quantized by 50 calls = change in 50-call volumes
If deltac = 0 then we return an OK state
If deltac < 0 then we are dropping call volumes
and can SSH to a box and tell Asterisk to stop
This will then stop the spot instance and reduce cost
If deltac > 0 then we are gaining call volumes
and trigger provisioning process
This will start a spot instance and increase cost
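A check along these lines could be sketched as below. The slides do not give the exact formula, so this assumes deltac is the change in 50-call blocks between the oldest and newest of the last three samples. The return codes follow the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL); the function names and output text are illustrative.

```python
import math
from typing import Sequence, Tuple

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes
BLOCK = 50                       # slide: quantize by 50 simultaneous calls


def blocks(calls: int) -> int:
    """Number of 50-call blocks needed for a given call count."""
    return math.ceil(calls / BLOCK) if calls > 0 else 0


def deltac(samples: Sequence[int]) -> int:
    """Assumed definition: block change across the last three samples."""
    recent = list(samples)[-3:]
    if len(recent) < 2:
        return 0
    return blocks(recent[-1]) - blocks(recent[0])


def check(samples: Sequence[int]) -> Tuple[int, str]:
    """Return (exit code, status line with perfdata) for Nagios."""
    d = deltac(samples)
    if d < 0:
        return WARNING, f"WARNING - call volume dropping | deltac={d}"
    if d > 0:
        return CRITICAL, f"CRITICAL - call volume rising | deltac={d}"
    return OK, f"OK - capacity matches usage | deltac={d}"
```

A real plugin would print the status line and exit with the code, so that the event handler sees the matching service state.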
Event Handler:
DeltaC
How DeltaC Works
Let’s assume we’re creating a new host
ec2-request-spot-instances ami-58296831 -p 0.04 \
    --key "BTC EC2" --group Asterisk --instance-type m1.medium \
    -n 1 --type one-time
Get back a “spotInstanceRequestId” (sir-722f4e34)
ec2-describe-spot-instance-requests sir-722f4e34
Get back an “instanceId” (i-6488e31f)
ec2-describe-instances i-6488e31f
Get back public IP address (ipAddress) of this machine
Now we have IP address and (internal) name
Populate DNS batch update queue
Regenerate /usr/local/nagios/etc/objects/dynamic_hosts.cfg
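The final step, regenerating the dynamic hosts file, could be sketched as a small renderer for Nagios host object definitions. The `use`, `host_name`, and `address` directives are standard Nagios host directives; the `generic-host` template name, the host names, and the addresses are assumptions for illustration, and the real file likely carries more directives (an event handler, for instance).

```python
from typing import Dict


def render_host(name: str, address: str, template: str = "generic-host") -> str:
    """Render one Nagios host definition."""
    return (
        "define host {\n"
        f"    use        {template}\n"
        f"    host_name  {name}\n"
        f"    address    {address}\n"
        "}\n"
    )


def render_dynamic_hosts(hosts: Dict[str, str]) -> str:
    """Render all hosts, sorted by name so the file is stable between runs."""
    return "\n".join(render_host(n, a) for n, a in sorted(hosts.items()))


if __name__ == "__main__":
    print(render_dynamic_hosts({"media1": "203.0.113.10"}))
```

Writing the result to a temporary file and renaming it over `dynamic_hosts.cfg` before reloading Nagios would avoid a half-written config being picked up.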
DeltaC Saves Lives Money
Small percentage changes in usage result in large changes in Cost Of Goods
For example:
Calls   Boxes   Cost/hour   Cost/year
100     2       $0.20       ~$75
500     10      $1.00       ~$375
2000    20      $2.00       ~$750
5000    50      $5.00       ~$2000
Questions?
Eric Loyd
eric@bitnetix.com
877.33.VOICE
@Bitnetix @SmartVox

Nagios Conference 2013 - Eric Loyd - Dynamic AWS Server Usage Using Nagios Core

  • 1. Dynamic AWS Server Usage Using Nagios Core or How to pay only for what you need Eric Loyd eric@bitnetix.com 877.33.VOICE @Bitnetix @SmartVox
  • 3. 3 About Eric Loyd and Bitnetix Founder and CEO of Bitnetix Incorporated VoIP services and IT/network consulting Over 25 Years in IT and management at places like Eastman Kodak Frontier Communications / Global Crossing Rochester Institute of Technology Bitnetix started its eighth year in July, 2013 Digital Rochester GREAT Award Finalist in: 2012 for Communications Technology 2013 for Rising Star Using Nagios since 2004 © 2013 Bitnetix Incorporated
  • 5. 5 History of SmartVox, our VoIP Platform Pre-2012 – not yet called SmartVox Bitnetix primarily focused on IT consulting VoIP service was ~10% of business with servers located primarily at client sites Custom Asterisk-based servers running FreePBX We ran customer’s network so we had control over VoIP 2012 – Focus switched to VoIP Focused now on hosted VoIP solutions Made use of Amazon Web Services EC2 VPS One per customer with no proxies* or media servers Network/bandwidth was only customer responsiblity © 2013 Bitnetix Incorporated
  • 6. 6 History of SmartVox, our VoIP Platform 2013 – SmartVox name born Copyright, trademark, domain name, biz cards, etc. Third generation born with multiple proxies, registrars, configuration servers, and media servers June – Started Mission Matrix program & sales AWS architecture leveraged for geography Each customer gets own EC2 server Proxies to closest zone, secondary “to the west” Media servers located in zones base on number of simultaneous calls, conferences, etc. VMs and CDRs stored in database © 2013 Bitnetix Incorporated
  • 8. 8 AWS EC2 Concepts AWS – Amazon Web Services Collection of cloud-based services: Storage (S3), DNS (Route 53), CDN, Server (EC2) EC2 - Elastic Compute Cloud Virtual servers in AWS datacenters (zones) US (3 = VA, CA, OR), EU (1), Asia (3), SA (1) Persistent storage & flexible IP address assignment Pay by the hour that it’s up, storage and bandwidth Spot instances – “temporary” EC2 servers Bring online as needed, terminated when shut down © 2013 Bitnetix Incorporated
  • 9. 9 AWS EC2 Costs LOTS of variables, but reasonable potential costs: Reserved servers cost about $2.00 per day Reserved instance pricing is contractual and static, based on size Spot servers cost between $0.50-$2.50 per day Spot instance pricing is dynamic, we assume ~$0.10 per hour We quantize concurrent calls into 50-call blocks One media server = 50 calls = 1 spot instance Two media servers = 100 calls = 2 spot instances Bandwidth and storage will add ~10% Reducing AWS usage reduces cost We keep these savings for ourselves. Shhhh!!! © 2013 Bitnetix Incorporated
  • 11. 11 Why Nagios? Extensive experience using it for clients Bitnetix is a Nagios reseller Needed centralized monitoring software Integrate with Twitter for notifications Integrate with Eventum via email for trouble tickets Zero cost Framework Leverage SSH, HTTP, check_mk and livestatus!! Custom checks and notifications (very important) Ability to “cookie cutter” installs for AWS © 2013 Bitnetix Incorporated
  • 12. 12 Initial Hurdles Customer Premises Equipment No real control over CPE choices Routers block some traffic, “help” other traffic incorrectly Need to be able to remotely [re-]configure phones Figure out how to “cookie-cutter” EC2 servers Customer boxes and SIP endpoints Proxies and media servers Wanted to monitor upstream providers as well How to separate apparent from actual failure Something’s broken, but overall service functional
  • 14. 14 SmartVox Network DNS SRV records are key to redundant servers [Diagram: Provider sends incoming calls to one/more border proxies -> Border Proxy figures out what customer should receive the calls -> Customer Proxy sends the call on to the correct phone/media server (VM, etc)]
  • 15. 15 Provisioning Process SmartVox AWS EC2 Provisioning Database Customer information Account (location/division/etc) information Number of phones*, VM boxes, etc. Computes how many proxies customer needs DNS SRV records created for batch updates Media server/VM entries created automatically Phone provisioning info created automatically Automatically places order for phones* (+some) Phones drop-shipped to customer in about 3 days
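To make the "DNS SRV records created for batch updates" step concrete, here is a rough sketch of rendering SRV lines for a customer's proxies. The record layout follows the standard SRV format (priority, weight, port, target). The `_sip._udp` service label, port 5060, the priority spacing, and all host names in the usage note are assumptions for illustration; the 59-minute TTL is the value quoted on the later provisioning slide.

```python
TTL_SECONDS = 59 * 60  # the deck mentions 59-minute TTLs


def srv_records(service_domain, proxies, port=5060, weight=10):
    """Render one SRV line per proxy; list order sets priority (lower wins)."""
    lines = []
    for index, target in enumerate(proxies):
        # First proxy (closest zone) gets priority 10, the next one 20, and so on.
        priority = 10 * (index + 1)
        lines.append(
            f"_sip._udp.{service_domain}. {TTL_SECONDS} IN SRV "
            f"{priority} {weight} {port} {target}."
        )
    return lines
```

Calling `srv_records("acme.example.com", ["proxy-va.example.net", "proxy-or.example.net"])` would list the closest proxy first and the secondary "to the west" as the fallback.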
  • 16. 16 AWS EC2 Automation: Spot Instance API Create spot instance -> gives request ID Instance created with SmartVox-created base image Wait a bit -> query request ID -> get instance ID Query instance -> get IP address Update DNS with server information and IP Update Nagios with server information and IP When spot instances shut down, they terminate No more expense for “burstable resources” This sounds like a Nagios event handler…
  • 17. 17 AWS EC2 Automation: Our Custom Image SmartVox media server image includes Asterisk Asterisk told to exit after waiting for calls to terminate Startup script shuts down system after Asterisk exits Instant “spot instance” Bring it online when needed, and terminate as required Same basic idea for starting/stopping proxies These tend to be more static than media servers Platform can be adjusted automatically COGS adjusts appropriately Hey, let’s hook this up to Nagios!!
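The "startup script shuts down system after Asterisk exits" idea might look roughly like the sketch below. The deck does not show the actual script, so this is an assumption-laden illustration: it supposes Asterisk is run in the foreground with `asterisk -f`, that the graceful exit is requested with the Asterisk CLI command `core stop gracefully`, and that the box is halted with `shutdown -h now`. The function names and the injectable `run` parameter are invented so the logic can be exercised without touching a real system.

```python
import subprocess

# Assumed commands; adjust paths and flags for the actual image.
ASTERISK_FOREGROUND = ["asterisk", "-f"]   # run Asterisk without forking
SHUTDOWN_NOW = ["shutdown", "-h", "now"]   # halting the box ends the spot instance
GRACEFUL_STOP = ["asterisk", "-rx", "core stop gracefully"]


def run_until_exit_then_halt(run=subprocess.call):
    """Block while Asterisk runs, then halt the machine once it exits."""
    exit_code = run(ASTERISK_FOREGROUND)
    run(SHUTDOWN_NOW)
    return exit_code


def request_graceful_stop(run=subprocess.call):
    """Ask Asterisk to exit once active calls have finished."""
    return run(GRACEFUL_STOP)
```

The first function would run on the media server at boot; the second is what a monitoring host could trigger over SSH when capacity should shrink.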
  • 18. 18 AWS EC2 Automation: More ideas Quick aside about spot instances. Useful for: Database dumps Spot instance turned up to do MySQL copies Run reports, dump, compress, purge, etc & terminate Distributing web server load Pop up another server and add to DNS Instant on-demand capacity Anything that you only want to do repeatedly but not for a long time, and only when you want to (or maybe if you have to)
  • 20. 20 Provisioning Rather than create EC2s, we just update Nagios Automatically regenerate SIP proxy and media server dynamic_hosts.cfg file as part of provisioning process Nagios looks for host up, doesn’t find it, fires off handler Event handler queries EC2 to see if it’s being turned up (~10 min) or just not running. If it’s not running, it starts it. DNS is batch updated every hour. 59 min TTLs Phone provisioning handled via automatic extract from database to create HTTP served configuration files Master/slave “config servers” (also in AWS) to send all this stuff to customers, with a URL to activate phones Entire process from signature to functional < 1 week
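Regenerating `dynamic_hosts.cfg` amounts to writing one Nagios `define host` block per server. The sketch below shows one possible generator. The directive names (`use`, `host_name`, `address`, `event_handler`) are standard Nagios object directives, but the template name `generic-host`, the handler command name `start-ec2-instance`, and the host names and addresses in the usage note are assumptions, not values from the deck.

```python
def host_definition(host_name, address, template="generic-host",
                    event_handler="start-ec2-instance"):
    """Return one Nagios `define host` block as text."""
    return (
        "define host {\n"
        f"    use            {template}\n"
        f"    host_name      {host_name}\n"
        f"    address        {address}\n"
        f"    event_handler  {event_handler}\n"
        "}\n"
    )


def render_dynamic_hosts(hosts):
    """Join definitions for an iterable of (host_name, address) pairs."""
    return "\n".join(host_definition(name, addr) for name, addr in hosts)
```

A provisioning job could write `render_dynamic_hosts([("media-01", "203.0.113.10")])` to the file and then reload Nagios so the new host is checked, found down, and handed to the event handler.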
  • 21. 21 Monitoring Nagios looks for hosts (see previous slide) Automatically creates them if needed Note that SIP proxies are not spot instances Dedicated to lifespan of customer/account so they are only terminated as part of de-provisioning process Nagios looks at health of services Determine if we have faults, outages, etc. Can potentially reroute automatically (DNS SRV!) Store performance info for capacity calculations Notifications via Twitter and email Come back tomorrow at 10:30 for how this works
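The host event handler described on these two slides boils down to a small decision: do nothing unless the host is down, wait if EC2 is already bringing it up, otherwise start it. The following is a hedged sketch of that decision only. The host state and state type strings match the values of the Nagios `$HOSTSTATE$` and `$HOSTSTATETYPE$` macros, and the EC2 state names are the standard instance states; acting only on HARD states and the returned action labels are assumptions made for the example.

```python
def handler_action(host_state, state_type, ec2_state):
    """Decide what a host event handler should do.

    host_state: "UP", "DOWN" or "UNREACHABLE"
    state_type: "SOFT" or "HARD"
    ec2_state:  EC2 instance state name, or None if no instance exists
    """
    # Assumption: only react once Nagios has confirmed the host is down.
    if host_state != "DOWN" or state_type != "HARD":
        return "nothing"
    # The instance is being turned up (can take ~10 minutes): be patient.
    if ec2_state in ("pending", "running"):
        return "wait"
    # Not running at all: kick off the provisioning request.
    return "start"
```

Keeping the decision separate from the EC2 and DNS calls makes it easy to check the branches without starting real servers.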
  • 22. 22 Capacity Planning Quantize by 50 simultaneous calls per server Perf data used to calculate historical usage Can use cron to automatically add/remove servers Nagios figures out “deltac” in current usage If deltac = 0, we are just right (OK) If deltac < 0, we have too much capacity (WARN) If deltac > 0, we need more capacity (CRITICAL) Event handler looks at state and either does nothing, tells least used box to stop Asterisk, or adds another box to the mix (see provisioning) Capacity (and costs) dynamically adjust with usage
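The three-way choice the capacity event handler makes can be written as a small pure function. This is a sketch under assumptions: the state strings match the Nagios `$SERVICESTATE$` macro values, while the `calls_by_host` mapping (host name to current call count), the action labels, and the host names in the usage note are invented for illustration.

```python
def capacity_action(service_state, calls_by_host):
    """Map a deltac service state to an action.

    Returns (action, host): ("stop", least_used_host), ("add", None)
    or ("none", None).
    """
    if service_state == "WARNING" and calls_by_host:
        # Too much capacity: retire the box carrying the fewest calls.
        least_used = min(calls_by_host, key=calls_by_host.get)
        return ("stop", least_used)
    if service_state == "CRITICAL":
        # Not enough capacity: add another box via provisioning.
        return ("add", None)
    return ("none", None)
```

For example, `capacity_action("WARNING", {"media-01": 12, "media-02": 3})` selects `media-02` as the box to tell Asterisk to stop on.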
  • 23. 23 Capacity Planning: DeltaC deltac – Custom Nagios module Looks at the last three times it ran on a particular host Quantized by 50 calls = change in 50-call volumes If deltac = 0 then we return an OK state If deltac < 0 then we are dropping call volumes and can SSH to a box and tell Asterisk to stop This will then stop the spot instance and reduce cost If deltac > 0 then we are gaining call volumes and trigger provisioning process This will start a spot instance and increase cost
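The deck does not show how deltac is computed, so the following is only one plausible reading: quantize each call-count reading into 50-call blocks and compare the newest of the last three readings with the oldest. The exit codes are the standard Nagios plugin codes (0 OK, 1 WARNING, 2 CRITICAL); everything else, including the function names and the choice of comparison, is an assumption.

```python
import math

CALLS_PER_BLOCK = 50
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def blocks(calls):
    """Number of 50-call blocks needed for a given concurrent call count."""
    return math.ceil(calls / CALLS_PER_BLOCK) if calls > 0 else 0


def deltac(samples):
    """Change in 50-call blocks across the last three readings (oldest first)."""
    recent = samples[-3:]
    if not recent:
        return 0
    return blocks(recent[-1]) - blocks(recent[0])


def nagios_state(delta):
    """0 -> OK, negative -> WARNING (shrink), positive -> CRITICAL (grow)."""
    if delta == 0:
        return OK
    return WARNING if delta < 0 else CRITICAL
```

Because readings are quantized first, small fluctuations inside one 50-call block leave deltac at 0 and the check stays OK.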
  • 25. 25 How DeltaC Works Let’s assume we’re creating a new host: ec2-request-spot-instances ami-58296831 -p 0.04 --key "BTC EC2" --group Asterisk --instance-type m1.medium -n 1 --type one-time -> get back a “spotInstanceRequestId” (sir-722f4e34); ec2-describe-spot-instance-requests sir-722f4e34 -> get back an “instanceId” (i-6488e31f); ec2-describe-instances i-6488e31f -> get back public IP address (ipAddress) of this machine. Now we have IP address and (internal) name Populate DNS batch update queue Regenerate /usr/local/nagios/etc/objects/dynamic_hosts.cfg
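The same request, poll, and lookup sequence can be sketched in Python. Note the swap: the slide uses the `ec2-*` command-line tools, whereas this example assumes a client object shaped like the boto3 EC2 client (`request_spot_instances`, `describe_spot_instance_requests`, `describe_instances`). The AMI, bid price, key name, group, and instance type are copied from the slide; the function name, polling interval, and retry count are assumptions. The client is passed in, so the flow can be tried against a stub without AWS credentials.

```python
import time


def launch_spot_media_server(ec2, poll_seconds=15, max_polls=40, sleep=time.sleep):
    """Request one spot instance and return (request_id, instance_id, public_ip)."""
    response = ec2.request_spot_instances(
        SpotPrice="0.04",
        InstanceCount=1,
        Type="one-time",
        LaunchSpecification={
            "ImageId": "ami-58296831",
            "KeyName": "BTC EC2",
            "SecurityGroups": ["Asterisk"],
            "InstanceType": "m1.medium",
        },
    )
    request_id = response["SpotInstanceRequests"][0]["SpotInstanceRequestId"]

    # "Wait a bit": poll until the request has been given an instance ID.
    for _ in range(max_polls):
        described = ec2.describe_spot_instance_requests(
            SpotInstanceRequestIds=[request_id]
        )
        instance_id = described["SpotInstanceRequests"][0].get("InstanceId")
        if instance_id:
            break
        sleep(poll_seconds)
    else:
        raise TimeoutError(f"spot request {request_id} was not fulfilled")

    # Look up the public IP; it may still be None while the instance is pending.
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    public_ip = reservations[0]["Instances"][0].get("PublicIpAddress")
    return request_id, instance_id, public_ip
```

The returned IP address and instance name are what would then feed the DNS batch update queue and the regenerated Nagios host file.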
  • 26. 26 DeltaC Saves Lives Money Small percentage changes in usage result in large changes in Cost Of Goods. For example: 100 calls • 2 boxes • $0.20/hour • ~$75/year; 500 calls • 10 boxes • $1.00/hour • ~$375/year; 2000 calls • 20 boxes • $2.00/hour • ~$750/year; 5000 calls • 50 boxes • $5.00/hour • ~$2000/year