© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Bob Wilkinson, GM – Amazon CloudWatch
Jon Madison, Manager, Product Engineering – Rackspace
May 21, 2018
AWS Online Tech Talks
Gaining Better Observability of Your VMs
with Amazon CloudWatch
Speakers
Bob Wilkinson – AWS
GM, Amazon CloudWatch
Jon Madison – Rackspace
Manager, Product Engineering
Agenda
• Introduction to Amazon CloudWatch
• Monitoring & Observability
• CloudWatch Agent in Action
• Rackspace – Scaling Operations with CloudWatch
• Closing
Introduction to Amazon
CloudWatch
Amazon CloudWatch at a Glance
MONITOR
• Get metrics on key resources
• Observe application and operational health
• Monitor custom metrics and log files
ACT
• SNS notifications
• Automated alarm actions
• Event-driven corrective actions
ANALYZE
• Visualize through Dashboards
• 1-sec granularity
• Unified operational view
• 15 months of data retention
Gain System-Wide Visibility into Resource Utilization, Application Performance, and Operational Health
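The "1-sec granularity" point corresponds to high-resolution custom metrics. The following is a minimal sketch of one request entry, assuming it would be sent with boto3's `put_metric_data`; the metric name `QueueDepth`, the namespace `MyApp`, and the `Service` dimension are hypothetical and do not come from the talk.

```python
from datetime import datetime, timezone


def build_metric_datum(name, value, unit="Count", high_resolution=True, dimensions=None):
    """Build one entry for the MetricData list of a PutMetricData request."""
    return {
        "MetricName": name,
        "Dimensions": [{"Name": k, "Value": v} for k, v in (dimensions or {}).items()],
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
        # StorageResolution=1 marks the metric as high resolution (1 second);
        # 60 is the standard resolution.
        "StorageResolution": 1 if high_resolution else 60,
    }


# Hypothetical metric and dimension names, for illustration only.
datum = build_metric_datum("QueueDepth", 42, dimensions={"Service": "checkout"})

# With AWS credentials configured, the datum could be sent like this (not run here):
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(Namespace="MyApp", MetricData=[datum])
```

The sketch only builds the payload, so it can be inspected without an AWS account.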
“Amazon CloudWatch monitors more than 800 trillion metric observations, triggers more than 2 trillion events, and ingests more than 50 petabytes of logs per month (as of March 2018)”
Monitoring & Observability
From Monitoring to Observability
MONITORING
• Reports overall system health
OBSERVABILITY
• Granular insights into system behavior
• Detailed metrics, enhanced monitoring with alerting, visualization, and log aggregation & analytics
• Used for debugging, complex troubleshooting, system performance, etc.
Observability is Challenging
• Complex applications & microservices
• Agile infrastructure
• Distributed systems
• Disparate tooling
• High customer expectations
Observability Needs Granular Metrics
EC2 Instance Metrics
• CPUUtilization
• DiskReadBytes
• DiskReadOps
• DiskWriteBytes
• DiskWriteOps
• NetworkIn
• NetworkOut
• NetworkPacketsIn
• NetworkPacketsOut
+
Metrics Available with the CloudWatch Agent
Default Metrics
• CPU: cpu_time_guest, cpu_time_guest_nice, cpu_time_idle, cpu_time_iowait, cpu_time_irq, cpu_time_nice, cpu_time_softirq, cpu_time_steal, cpu_time_system, cpu_time_user, cpu_usage_guest, cpu_usage_guest_nice, cpu_usage_idle, cpu_usage_iowait, cpu_usage_irq, cpu_usage_nice, cpu_usage_softirq, cpu_usage_steal, cpu_usage_system, cpu_usage_user
• Disk: disk_free, disk_inodes_free, disk_inodes_total, disk_inodes_used, disk_total, disk_used, disk_used_percent, diskio_io_time, diskio_iops_in_progress, diskio_read_bytes, diskio_read_time, diskio_reads, diskio_write_bytes, diskio_write_time, diskio_writes
• Memory: mem_active, mem_available, mem_available_percent, mem_buffered, mem_cached, mem_free, mem_inactive, mem_total, mem_used, mem_used_percent
• Network: net_bytes_recv, net_bytes_sent, net_drop_in, net_drop_out, net_err_in, net_err_out, net_packets_recv, net_packets_sent
• Network Statistics: netstat_tcp_close, netstat_tcp_close_wait, netstat_tcp_closing, netstat_tcp_established, netstat_tcp_fin_wait1, netstat_tcp_fin_wait2, netstat_tcp_last_ack, netstat_tcp_listen, netstat_tcp_none, netstat_tcp_syn_recv, netstat_tcp_syn_sent, netstat_tcp_time_wait, netstat_udp_socket
• Processes: processes_blocked, processes_dead, processes_idle, processes_paging, processes_running, processes_sleeping, processes_stopped, processes_total, processes_total_threads, processes_wait, processes_zombies
• Swap: swap_free, swap_used, swap_used_percent
+
Metric Enhancements
• Contextual Resource Information
• Custom Dimensions
• Autodetect Region
• Aggregate Metrics
• 1-Second Resolution
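To make the in-guest metrics concrete, here is a minimal sketch that builds a CloudWatch agent configuration selecting a few of them. It assumes the agent's JSON configuration layout (`agent`, `metrics`, `metrics_collected`, `append_dimensions`); the chosen measurements, the interval, and the `Team` dimension are illustrative choices, not values from the talk.

```python
import json


def build_agent_metrics_config(interval_seconds=60, custom_dimensions=None):
    """Return an agent configuration dict that collects a few in-guest metrics."""
    collected = {
        "cpu": {"measurement": ["cpu_usage_idle", "cpu_usage_iowait"], "totalcpu": True},
        "mem": {"measurement": ["mem_used_percent"]},
        "disk": {"measurement": ["used_percent"], "resources": ["*"]},
        "swap": {"measurement": ["swap_used_percent"]},
    }
    if custom_dimensions:
        # Custom dimensions are attached per metric section.
        for section in collected.values():
            section["append_dimensions"] = dict(custom_dimensions)
    return {
        "agent": {"metrics_collection_interval": interval_seconds},
        "metrics": {
            # Appends an EC2 dimension so each metric can be tied to its instance.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": collected,
        },
    }


# Hypothetical values: a 10-second interval and a "Team" dimension.
config = build_agent_metrics_config(interval_seconds=10, custom_dimensions={"Team": "ops"})
print(json.dumps(config, indent=2))
```

The printed JSON is the kind of file the agent reads at startup; check the measurement names against the agent documentation for the target OS before using it.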
CloudWatch Agent in Action
The CloudWatch Agent Simplifies Observability
Unified Agent
• Metrics & logs
• For EC2 and on-premises servers
• Linux & Windows
Enhanced Metrics & Logs
• Collect in-guest system metrics
• Appends EC2 dimensions
• Custom dimensions
Getting Started Experience
• Install and configure with AWS Systems Manager integration
• Provides defaults specific to OS type (Windows vs. Linux)
• Basic, Standard, or Advanced options (complement EC2 or granular per-resource metrics)
• Log collection (can specify multiple file_paths)
• Migrate from previous CW Logs agent
• Curated metric set specific to environment
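Log collection with several file paths can be sketched as follows, assuming the agent's `logs` / `logs_collected` / `files` / `collect_list` configuration layout. The file paths and the log group naming scheme are hypothetical.

```python
import json


def build_logs_config(file_paths, log_group_prefix="/vm"):
    """Return the "logs" section of an agent config, one collect_list entry per file."""
    collect_list = []
    for path in file_paths:
        collect_list.append({
            "file_path": path,
            # Hypothetical naming scheme: one log group per file path.
            "log_group_name": f"{log_group_prefix}{path}",
            # The agent substitutes the instance ID for this placeholder.
            "log_stream_name": "{instance_id}",
        })
    return {"logs": {"logs_collected": {"files": {"collect_list": collect_list}}}}


# Hypothetical Linux log files.
logs_config = build_logs_config(["/var/log/messages", "/var/log/secure"])
print(json.dumps(logs_config, indent=2))
```

Each entry maps one file to a log group and stream, so adding another file is a one-line change to the list passed in.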
Ideal Path to Gain Observability of your VMs
1. Install the CloudWatch Agent
2. Collect Metrics and Logs
3. Build and View Dashboards
4. Create Alarms & Actions
5. Generate New Time Series Using Metric Math
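As an illustration of the Metric Math step, the sketch below builds a query list in the shape expected by the CloudWatch `GetMetricData` API, deriving a percentage from two agent metrics. It assumes the agent publishes to the `CWAgent` namespace with only an `InstanceId` dimension; the instance ID is hypothetical, and the agent already reports `mem_used_percent` directly, so the expression is here purely to show the pattern.

```python
def memory_used_percent_queries(instance_id, period=60):
    """Build MetricDataQueries that derive a new time series from two agent metrics."""

    def stat(query_id, metric_name):
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "CWAgent",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
                },
                "Period": period,
                "Stat": "Average",
            },
            "ReturnData": False,  # input series only; not returned to the caller
        }

    return [
        stat("used", "mem_used"),
        stat("total", "mem_total"),
        {
            "Id": "pct",
            # The expression refers to the other queries by their Id.
            "Expression": "100 * used / total",
            "Label": "Memory used (%)",
            "ReturnData": True,
        },
    ]


# Hypothetical instance ID.
queries = memory_used_percent_queries("i-0123456789abcdef0")

# With credentials configured, the list could be passed as (not run here):
#   boto3.client("cloudwatch").get_metric_data(
#       MetricDataQueries=queries, StartTime=start, EndTime=end)
```

Only the expression query returns data, so the caller receives a single derived series.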
Rackspace – Scaling Operations
with Amazon CloudWatch
About Rackspace
• AWS Premier Consulting Partner and audited Managed Service Provider (MSP)
• A leader in the 2017 Gartner Magic Quadrant for Public Cloud Infrastructure Managed Service Providers, Worldwide
• Managed Service Provider to over half the Fortune 100
• Provides a full-stack portfolio, from managed operations and applications to security, professional services, and enterprise migration and transformation
Key Use Cases Driving Customer Value
1. Solving enterprise challenges of cost optimization and cost governance
2. Supporting day-to-day customer operations
3. Enabling automation to lower time to diagnose and resolve infrastructure and operating system issues
1. Cost Control With Amazon CloudWatch
• Rackspace provides services for Cost and Performance Optimization
• CloudWatch replaces expensive on-premises infrastructure monitoring
• The CloudWatch Agent provides on-instance insight into disk size, memory/swap utilization, etc.
• Rackspace recently saved a customer ~$500k on their yearly AWS spend by:
  • Right-sizing instances based on CloudWatch metric performance insights
  • Consolidating unused instances
  • Migrating to new instance families
• Proposed Reserved Instance (RI) purchases would add ~$100k in savings
• Rackspace also has tooling to manage spend alerting, bill consolidation, and cost allocation / chargeback
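Right-sizing from metric data can be sketched as a simple filter over per-instance utilization. This is not Rackspace's method; the thresholds, instance IDs, and sample values below are all hypothetical and only illustrate how CPU and memory averages might flag candidates for a smaller instance type.

```python
from statistics import mean


def rightsizing_candidates(samples, cpu_threshold=20.0, mem_threshold=30.0):
    """Return instance IDs whose average CPU and memory use are both below the thresholds.

    `samples` maps an instance ID to {"cpu": [...], "mem": [...]} lists of percentages.
    """
    candidates = []
    for instance_id, series in samples.items():
        if not series["cpu"] or not series["mem"]:
            continue  # no data points: do not guess
        if mean(series["cpu"]) < cpu_threshold and mean(series["mem"]) < mem_threshold:
            candidates.append(instance_id)
    return sorted(candidates)


# Hypothetical utilization samples (percent).
usage = {
    "i-aaa": {"cpu": [5.0, 7.5, 6.0], "mem": [18.0, 22.0, 20.0]},
    "i-bbb": {"cpu": [65.0, 70.0, 80.0], "mem": [55.0, 60.0, 58.0]},
    "i-ccc": {"cpu": [4.0, 3.0], "mem": [75.0, 80.0]},
}
print(rightsizing_candidates(usage))  # ['i-aaa']
```

The memory check is the part that needs the agent: `i-ccc` looks idle by CPU alone but is excluded once in-guest memory use is considered.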
2. Scale Operations With Amazon CloudWatch
• Rackspace provides services for 24x7x365 Fanatical Support
• CloudWatch is used to provide infrastructure monitoring and alerting for hundreds of AWS customer accounts.
• Rackspace acts as first line of response for Infrastructure and Operating System alarms.
• The CloudWatch Agent enables monitoring of Operating System performance and logs, and dashboards increase context for our operations team.
3. Automation With Amazon CloudWatch
• CloudWatch and the CloudWatch Agent integrate with other Amazon services (SNS, Lambda, etc.) to provide automation opportunities
• Examples:
  • Low disk space reports
  • Restart runaway processes
  • Run diagnostic tools in response to instance metric alarms (top, free, etc.)
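The alarm-to-automation path (alarm, SNS, Lambda) can be sketched as a Lambda-style handler that reads the alarm notification and picks a follow-up action. The field names assume the JSON that CloudWatch alarms publish to SNS (`NewStateValue`, `Trigger`, and `Dimensions` with lowercase `name`/`value` keys); the metric-to-action mapping and the action names are hypothetical.

```python
import json


def handler(event, context=None):
    """Turn SNS-delivered CloudWatch alarm notifications into a list of planned actions."""
    actions = []
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        if message.get("NewStateValue") != "ALARM":
            continue  # ignore OK and INSUFFICIENT_DATA transitions
        trigger = message.get("Trigger", {})
        dimensions = {d["name"]: d["value"] for d in trigger.get("Dimensions", [])}
        metric = trigger.get("MetricName")
        # Hypothetical mapping from metric name to a follow-up action.
        if metric == "disk_used_percent":
            action = "report_low_disk_space"
        elif metric in ("cpu_usage_user", "CPUUtilization"):
            action = "run_diagnostics"  # e.g. capture `top` and `free` output
        else:
            action = "notify_operator"
        actions.append({"instance": dimensions.get("InstanceId"), "action": action})
    return actions


# Hypothetical alarm notification for local testing.
sample_message = {
    "AlarmName": "low-disk",
    "NewStateValue": "ALARM",
    "Trigger": {
        "MetricName": "disk_used_percent",
        "Namespace": "CWAgent",
        "Dimensions": [{"name": "InstanceId", "value": "i-0123456789abcdef0"}],
    },
}
sample_event = {"Records": [{"Sns": {"Message": json.dumps(sample_message)}}]}
print(handler(sample_event))
```

The handler only decides what to do; running the diagnostics themselves would be a separate step, for example through AWS Systems Manager.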
Recommended Best Practices
• Use the in-instance visibility of the CloudWatch Agent to maximize your cost optimization strategy
• Leverage CloudWatch Agent and Dashboards to provide an increase in context for Operations and reduce MTTR
• CloudWatch can tie into other systems to increase automated handling and diagnosis of issues
Closing
Closing Thoughts
1. Have a strong monitoring & observability strategy
2. Focus on collecting health and behavior metrics
3. Improve performance, minimize production impacts, and control costs for a better end-user experience
Next Steps
Get started with CloudWatch for free today
aws.amazon.com/cloudwatch
Many applications can operate within our monthly free tier limits
• Basic monitoring
• 10 custom or detailed metrics
• 10 alarms
• 3 dashboards of 50 metrics each
• 1 million API requests
Install the CloudWatch Agent
docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent
Use Metric Math to create new time series
docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/using-metric-math

Weitere ähnliche Inhalte

Was ist angesagt?

Being Well Architected in the Cloud
Being Well Architected in the CloudBeing Well Architected in the Cloud
Being Well Architected in the CloudAdrian Hornsby
 
Real Time Analytics On AWS: Optimized Architectures
Real Time Analytics On AWS: Optimized ArchitecturesReal Time Analytics On AWS: Optimized Architectures
Real Time Analytics On AWS: Optimized ArchitecturesAmazon Web Services
 
Replicate and Manage Data Using Managed Databases and Serverless Technologies
Replicate and Manage Data Using Managed Databases and Serverless Technologies Replicate and Manage Data Using Managed Databases and Serverless Technologies
Replicate and Manage Data Using Managed Databases and Serverless Technologies Amazon Web Services
 
ENT314 Automate Best Practices and Operational Health for Your AWS Resources
ENT314 Automate Best Practices and Operational Health for Your AWS ResourcesENT314 Automate Best Practices and Operational Health for Your AWS Resources
ENT314 Automate Best Practices and Operational Health for Your AWS ResourcesAmazon Web Services
 
AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...
AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...
AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...Amazon Web Services
 
Machine Learning Inference at the Edge (IOT322-R1) - AWS re:Invent 2018
Machine Learning Inference at the Edge (IOT322-R1) - AWS re:Invent 2018Machine Learning Inference at the Edge (IOT322-R1) - AWS re:Invent 2018
Machine Learning Inference at the Edge (IOT322-R1) - AWS re:Invent 2018Amazon Web Services
 
AWS re:Invent 2016: Setting the Stage for Instant Success: Getting the Most O...
AWS re:Invent 2016: Setting the Stage for Instant Success: Getting the Most O...AWS re:Invent 2016: Setting the Stage for Instant Success: Getting the Most O...
AWS re:Invent 2016: Setting the Stage for Instant Success: Getting the Most O...Amazon Web Services
 
AWSome Day London January 2016 Intro
AWSome Day London January 2016 IntroAWSome Day London January 2016 Intro
AWSome Day London January 2016 IntroIan Massingham
 
Build Enterprise-Grade Serverless Apps
Build Enterprise-Grade Serverless Apps Build Enterprise-Grade Serverless Apps
Build Enterprise-Grade Serverless Apps Amazon Web Services
 
Governance @ Scale: Compliance Automation in AWS | AWS Public Sector Summit 2017
Governance @ Scale: Compliance Automation in AWS | AWS Public Sector Summit 2017Governance @ Scale: Compliance Automation in AWS | AWS Public Sector Summit 2017
Governance @ Scale: Compliance Automation in AWS | AWS Public Sector Summit 2017Amazon Web Services
 
BDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWSBDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWSAmazon Web Services
 
Analyzing Streams: Data Analytics Week SF
Analyzing Streams: Data Analytics Week SFAnalyzing Streams: Data Analytics Week SF
Analyzing Streams: Data Analytics Week SFAmazon Web Services
 
Automatisierte Kontrolle und Transparenz in der AWS Cloud – Autopilot für Com...
Automatisierte Kontrolle und Transparenz in der AWS Cloud – Autopilot für Com...Automatisierte Kontrolle und Transparenz in der AWS Cloud – Autopilot für Com...
Automatisierte Kontrolle und Transparenz in der AWS Cloud – Autopilot für Com...AWS Germany
 
Getting Started with Amazon WorkSpaces
Getting Started with Amazon WorkSpacesGetting Started with Amazon WorkSpaces
Getting Started with Amazon WorkSpacesAmazon Web Services
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightAmazon Web Services
 
ENT318 Innovate Faster on Salesforce Heroku and AWS
ENT318 Innovate Faster on Salesforce Heroku and AWSENT318 Innovate Faster on Salesforce Heroku and AWS
ENT318 Innovate Faster on Salesforce Heroku and AWSAmazon Web Services
 
Deep Dive on AWS Cloud Data Migration Services
Deep Dive on AWS Cloud Data Migration ServicesDeep Dive on AWS Cloud Data Migration Services
Deep Dive on AWS Cloud Data Migration ServicesAmazon Web Services
 
AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...
AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...
AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...Amazon Web Services
 

Was ist angesagt? (20)

Being Well Architected in the Cloud
Being Well Architected in the CloudBeing Well Architected in the Cloud
Being Well Architected in the Cloud
 
Real Time Analytics On AWS: Optimized Architectures
Real Time Analytics On AWS: Optimized ArchitecturesReal Time Analytics On AWS: Optimized Architectures
Real Time Analytics On AWS: Optimized Architectures
 
Replicate and Manage Data Using Managed Databases and Serverless Technologies
Replicate and Manage Data Using Managed Databases and Serverless Technologies Replicate and Manage Data Using Managed Databases and Serverless Technologies
Replicate and Manage Data Using Managed Databases and Serverless Technologies
 
ENT314 Automate Best Practices and Operational Health for Your AWS Resources
ENT314 Automate Best Practices and Operational Health for Your AWS ResourcesENT314 Automate Best Practices and Operational Health for Your AWS Resources
ENT314 Automate Best Practices and Operational Health for Your AWS Resources
 
AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...
AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...
AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...
 
Machine Learning Inference at the Edge (IOT322-R1) - AWS re:Invent 2018
Machine Learning Inference at the Edge (IOT322-R1) - AWS re:Invent 2018Machine Learning Inference at the Edge (IOT322-R1) - AWS re:Invent 2018
Machine Learning Inference at the Edge (IOT322-R1) - AWS re:Invent 2018
 
AWS re:Invent 2016: Setting the Stage for Instant Success: Getting the Most O...
AWS re:Invent 2016: Setting the Stage for Instant Success: Getting the Most O...AWS re:Invent 2016: Setting the Stage for Instant Success: Getting the Most O...
AWS re:Invent 2016: Setting the Stage for Instant Success: Getting the Most O...
 
AWSome Day London January 2016 Intro
AWSome Day London January 2016 IntroAWSome Day London January 2016 Intro
AWSome Day London January 2016 Intro
 
Build Enterprise-Grade Serverless Apps
Build Enterprise-Grade Serverless Apps Build Enterprise-Grade Serverless Apps
Build Enterprise-Grade Serverless Apps
 
Governance @ Scale: Compliance Automation in AWS | AWS Public Sector Summit 2017
Governance @ Scale: Compliance Automation in AWS | AWS Public Sector Summit 2017Governance @ Scale: Compliance Automation in AWS | AWS Public Sector Summit 2017
Governance @ Scale: Compliance Automation in AWS | AWS Public Sector Summit 2017
 
BDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWSBDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWS
 
Analyzing Streams: Data Analytics Week SF
Analyzing Streams: Data Analytics Week SFAnalyzing Streams: Data Analytics Week SF
Analyzing Streams: Data Analytics Week SF
 
Automatisierte Kontrolle und Transparenz in der AWS Cloud – Autopilot für Com...
Automatisierte Kontrolle und Transparenz in der AWS Cloud – Autopilot für Com...Automatisierte Kontrolle und Transparenz in der AWS Cloud – Autopilot für Com...
Automatisierte Kontrolle und Transparenz in der AWS Cloud – Autopilot für Com...
 
Value, TCO & Cost Optimisation
Value, TCO & Cost OptimisationValue, TCO & Cost Optimisation
Value, TCO & Cost Optimisation
 
Getting Started with Amazon WorkSpaces
Getting Started with Amazon WorkSpacesGetting Started with Amazon WorkSpaces
Getting Started with Amazon WorkSpaces
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSight
 
ENT318 Innovate Faster on Salesforce Heroku and AWS
ENT318 Innovate Faster on Salesforce Heroku and AWSENT318 Innovate Faster on Salesforce Heroku and AWS
ENT318 Innovate Faster on Salesforce Heroku and AWS
 
Deep Dive on AWS Cloud Data Migration Services
Deep Dive on AWS Cloud Data Migration ServicesDeep Dive on AWS Cloud Data Migration Services
Deep Dive on AWS Cloud Data Migration Services
 
AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...
AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...
AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...
 
Cost Optimization at Scale
 Cost Optimization at Scale Cost Optimization at Scale
Cost Optimization at Scale
 

Ähnlich wie Gaining Better Observability of Your VMs with Amazon CloudWatch - AWS Online Tech Talks

Governance@scale - Governance of Multi-Account, Large-Scale AWS Environments ...
Governance@scale - Governance of Multi-Account, Large-Scale AWS Environments ...Governance@scale - Governance of Multi-Account, Large-Scale AWS Environments ...
Governance@scale - Governance of Multi-Account, Large-Scale AWS Environments ...Amazon Web Services
 
Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018
Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018
Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018Amazon Web Services
 
How to Effectively Plan for Disaster Recovery on AWS (CMP204-S) - AWS re:Inve...
How to Effectively Plan for Disaster Recovery on AWS (CMP204-S) - AWS re:Inve...How to Effectively Plan for Disaster Recovery on AWS (CMP204-S) - AWS re:Inve...
How to Effectively Plan for Disaster Recovery on AWS (CMP204-S) - AWS re:Inve...Amazon Web Services
 
Serverless on AWS: Architectural Patterns and Best Practices
Serverless on AWS: Architectural Patterns and Best PracticesServerless on AWS: Architectural Patterns and Best Practices
Serverless on AWS: Architectural Patterns and Best PracticesVladimir Simek
 
Proven Methodologies for Accelerating Your Cloud Journey (ENT308-S) - AWS re:...
Proven Methodologies for Accelerating Your Cloud Journey (ENT308-S) - AWS re:...Proven Methodologies for Accelerating Your Cloud Journey (ENT308-S) - AWS re:...
Proven Methodologies for Accelerating Your Cloud Journey (ENT308-S) - AWS re:...Amazon Web Services
 
Simplify Operations, Compliance and Governance using AWS Systems Manager
Simplify Operations, Compliance and Governance using AWS Systems ManagerSimplify Operations, Compliance and Governance using AWS Systems Manager
Simplify Operations, Compliance and Governance using AWS Systems ManagerAmazon Web Services
 
Fully Realizing the Microservices Vision with Service Mesh (DEV312-S) - AWS r...
Fully Realizing the Microservices Vision with Service Mesh (DEV312-S) - AWS r...Fully Realizing the Microservices Vision with Service Mesh (DEV312-S) - AWS r...
Fully Realizing the Microservices Vision with Service Mesh (DEV312-S) - AWS r...Amazon Web Services
 
Designing for Operability: Getting the Last Nines in Five-Nines Availability ...
Designing for Operability: Getting the Last Nines in Five-Nines Availability ...Designing for Operability: Getting the Last Nines in Five-Nines Availability ...
Designing for Operability: Getting the Last Nines in Five-Nines Availability ...Amazon Web Services
 
Migration Planning with AWS Application Discovery Service - ENT308 - Chicago ...
Migration Planning with AWS Application Discovery Service - ENT308 - Chicago ...Migration Planning with AWS Application Discovery Service - ENT308 - Chicago ...
Migration Planning with AWS Application Discovery Service - ENT308 - Chicago ...Amazon Web Services
 
DevOps, CI/CD, cost management, and security on AWS
DevOps, CI/CD, cost management, and security on AWSDevOps, CI/CD, cost management, and security on AWS
DevOps, CI/CD, cost management, and security on AWSTom Laszewski
 
The Quest for Continuous ATO: A Case Study Featuring the US Intelligence Comm...
The Quest for Continuous ATO: A Case Study Featuring the US Intelligence Comm...The Quest for Continuous ATO: A Case Study Featuring the US Intelligence Comm...
The Quest for Continuous ATO: A Case Study Featuring the US Intelligence Comm...Amazon Web Services
 
How Trek10 Uses Datadog's Distributed Tracing to Improve AWS Lambda Projects ...
How Trek10 Uses Datadog's Distributed Tracing to Improve AWS Lambda Projects ...How Trek10 Uses Datadog's Distributed Tracing to Improve AWS Lambda Projects ...
How Trek10 Uses Datadog's Distributed Tracing to Improve AWS Lambda Projects ...Amazon Web Services
 
Quickly and easily build, train, and deploy machine learning models at any scale
Quickly and easily build, train, and deploy machine learning models at any scaleQuickly and easily build, train, and deploy machine learning models at any scale
Quickly and easily build, train, and deploy machine learning models at any scaleAWS Germany
 
Aurora Serverless: Scalable, Cost-Effective Application Deployment (DAT336) -...
Aurora Serverless: Scalable, Cost-Effective Application Deployment (DAT336) -...Aurora Serverless: Scalable, Cost-Effective Application Deployment (DAT336) -...
Aurora Serverless: Scalable, Cost-Effective Application Deployment (DAT336) -...Amazon Web Services
 
Too Many Tools? How AWS Systems Manager Bridges Operational Models - AWS Summ...
Too Many Tools? How AWS Systems Manager Bridges Operational Models - AWS Summ...Too Many Tools? How AWS Systems Manager Bridges Operational Models - AWS Summ...
Too Many Tools? How AWS Systems Manager Bridges Operational Models - AWS Summ...Amazon Web Services
 
Nirav Kothari: Well-Architected - Operational Excellence Instructor Led Lab.pdf
Nirav Kothari: Well-Architected - Operational Excellence Instructor Led Lab.pdfNirav Kothari: Well-Architected - Operational Excellence Instructor Led Lab.pdf
Nirav Kothari: Well-Architected - Operational Excellence Instructor Led Lab.pdfAmazon Web Services
 
Operational Excellence with Containerized Workloads Using AWS Fargate (CON320...
Operational Excellence with Containerized Workloads Using AWS Fargate (CON320...Operational Excellence with Containerized Workloads Using AWS Fargate (CON320...
Operational Excellence with Containerized Workloads Using AWS Fargate (CON320...Amazon Web Services
 
How can your business benefit from going Serverless
How can your business benefit from going ServerlessHow can your business benefit from going Serverless
How can your business benefit from going ServerlessAmazon Web Services
 
Estate and Patch Management Infrastructure and Operations as Code
Estate and Patch Management Infrastructure and Operations as CodeEstate and Patch Management Infrastructure and Operations as Code
Estate and Patch Management Infrastructure and Operations as CodeAmazon Web Services
 
2019 03-13-implementing microservices by ddd
2019 03-13-implementing microservices by ddd2019 03-13-implementing microservices by ddd
2019 03-13-implementing microservices by dddKim Kao
 

Ähnlich wie Gaining Better Observability of Your VMs with Amazon CloudWatch - AWS Online Tech Talks (20)

Governance@scale - Governance of Multi-Account, Large-Scale AWS Environments ...
Governance@scale - Governance of Multi-Account, Large-Scale AWS Environments ...Governance@scale - Governance of Multi-Account, Large-Scale AWS Environments ...
Governance@scale - Governance of Multi-Account, Large-Scale AWS Environments ...
 
Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018
Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018
Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018
 
How to Effectively Plan for Disaster Recovery on AWS (CMP204-S) - AWS re:Inve...
How to Effectively Plan for Disaster Recovery on AWS (CMP204-S) - AWS re:Inve...How to Effectively Plan for Disaster Recovery on AWS (CMP204-S) - AWS re:Inve...
How to Effectively Plan for Disaster Recovery on AWS (CMP204-S) - AWS re:Inve...
 
Serverless on AWS: Architectural Patterns and Best Practices
Serverless on AWS: Architectural Patterns and Best PracticesServerless on AWS: Architectural Patterns and Best Practices
Serverless on AWS: Architectural Patterns and Best Practices
 
Proven Methodologies for Accelerating Your Cloud Journey (ENT308-S) - AWS re:...
Proven Methodologies for Accelerating Your Cloud Journey (ENT308-S) - AWS re:...Proven Methodologies for Accelerating Your Cloud Journey (ENT308-S) - AWS re:...
Proven Methodologies for Accelerating Your Cloud Journey (ENT308-S) - AWS re:...
 
Simplify Operations, Compliance and Governance using AWS Systems Manager
Simplify Operations, Compliance and Governance using AWS Systems ManagerSimplify Operations, Compliance and Governance using AWS Systems Manager
Simplify Operations, Compliance and Governance using AWS Systems Manager
 
Fully Realizing the Microservices Vision with Service Mesh (DEV312-S) - AWS r...
Fully Realizing the Microservices Vision with Service Mesh (DEV312-S) - AWS r...Fully Realizing the Microservices Vision with Service Mesh (DEV312-S) - AWS r...
Fully Realizing the Microservices Vision with Service Mesh (DEV312-S) - AWS r...
 
Designing for Operability: Getting the Last Nines in Five-Nines Availability ...
Designing for Operability: Getting the Last Nines in Five-Nines Availability ...Designing for Operability: Getting the Last Nines in Five-Nines Availability ...
Designing for Operability: Getting the Last Nines in Five-Nines Availability ...
 
Migration Planning with AWS Application Discovery Service - ENT308 - Chicago ...
Migration Planning with AWS Application Discovery Service - ENT308 - Chicago ...Migration Planning with AWS Application Discovery Service - ENT308 - Chicago ...
Migration Planning with AWS Application Discovery Service - ENT308 - Chicago ...
 
DevOps, CI/CD, cost management, and security on AWS
DevOps, CI/CD, cost management, and security on AWSDevOps, CI/CD, cost management, and security on AWS
DevOps, CI/CD, cost management, and security on AWS
 
The Quest for Continuous ATO: A Case Study Featuring the US Intelligence Comm...
The Quest for Continuous ATO: A Case Study Featuring the US Intelligence Comm...The Quest for Continuous ATO: A Case Study Featuring the US Intelligence Comm...
The Quest for Continuous ATO: A Case Study Featuring the US Intelligence Comm...
 
How Trek10 Uses Datadog's Distributed Tracing to Improve AWS Lambda Projects ...
How Trek10 Uses Datadog's Distributed Tracing to Improve AWS Lambda Projects ...How Trek10 Uses Datadog's Distributed Tracing to Improve AWS Lambda Projects ...
How Trek10 Uses Datadog's Distributed Tracing to Improve AWS Lambda Projects ...
 
Quickly and easily build, train, and deploy machine learning models at any scale
Quickly and easily build, train, and deploy machine learning models at any scaleQuickly and easily build, train, and deploy machine learning models at any scale
Quickly and easily build, train, and deploy machine learning models at any scale
 
Aurora Serverless: Scalable, Cost-Effective Application Deployment (DAT336) -...
Aurora Serverless: Scalable, Cost-Effective Application Deployment (DAT336) -...Aurora Serverless: Scalable, Cost-Effective Application Deployment (DAT336) -...
Aurora Serverless: Scalable, Cost-Effective Application Deployment (DAT336) -...
 
Too Many Tools? How AWS Systems Manager Bridges Operational Models - AWS Summ...
Too Many Tools? How AWS Systems Manager Bridges Operational Models - AWS Summ...Too Many Tools? How AWS Systems Manager Bridges Operational Models - AWS Summ...
Too Many Tools? How AWS Systems Manager Bridges Operational Models - AWS Summ...
 
Nirav Kothari: Well-Architected - Operational Excellence Instructor Led Lab.pdf
Nirav Kothari: Well-Architected - Operational Excellence Instructor Led Lab.pdfNirav Kothari: Well-Architected - Operational Excellence Instructor Led Lab.pdf
Nirav Kothari: Well-Architected - Operational Excellence Instructor Led Lab.pdf
 
Operational Excellence with Containerized Workloads Using AWS Fargate (CON320...
Operational Excellence with Containerized Workloads Using AWS Fargate (CON320...Operational Excellence with Containerized Workloads Using AWS Fargate (CON320...
Operational Excellence with Containerized Workloads Using AWS Fargate (CON320...
 
How can your business benefit from going Serverless
How can your business benefit from going ServerlessHow can your business benefit from going Serverless
How can your business benefit from going Serverless
 
Estate and Patch Management Infrastructure and Operations as Code
Estate and Patch Management Infrastructure and Operations as CodeEstate and Patch Management Infrastructure and Operations as Code
Estate and Patch Management Infrastructure and Operations as Code
 
2019 03-13-implementing microservices by ddd
2019 03-13-implementing microservices by ddd2019 03-13-implementing microservices by ddd
2019 03-13-implementing microservices by ddd
 

Gaining Better Observability of Your VMs with Amazon CloudWatch - AWS Online Tech Talks

  • 1. AWS Online Tech Talks: Gaining Better Observability of Your VMs with Amazon CloudWatch
      • Bob Wilkinson, GM – Amazon CloudWatch
      • Jon Madison, Manager, Product Engineering – Rackspace
      • May 21, 2018
      • © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 2. Speakers
      • Bob Wilkinson – AWS, GM, Amazon CloudWatch
      • Jon Madison – Rackspace, Manager, Product Engineering
  • 3. Agenda
      • Introduction to Amazon CloudWatch
      • Monitoring & Observability
      • CloudWatch Agent in Action
      • Rackspace – Scaling Operations with CloudWatch
      • Closing
  • 4. Introduction to Amazon CloudWatch
  • 5. Amazon CloudWatch at a Glance
      Gain System-Wide Visibility into Resource Utilization, Application Performance, and Operational Health
      MONITOR
      • Get metrics on key resources
      • Observe application and operational health
      • Monitor custom metrics and log files
      ACT
      • SNS notifications
      • Automated alarm actions
      • Event-driven corrective actions
      ANALYZE
      • Visualize through Dashboards
      • 1-sec granularity
      • Unified operational view
      • 15 months of data retention
  • 6. “Amazon CloudWatch monitors more than 800 trillion metric observations, triggers more than 2 trillion events, and ingests more than 50 petabytes of logs per month (*as of March 2018)”
  • 7. Monitoring & Observability
  • 8. From Monitoring to Observability
      MONITORING
      • Reports overall system health
      OBSERVABILITY
      • Granular insights into system behavior
      • Detailed metrics, enhanced monitoring with alerting, visualization, and log aggregation & analytics
      • Used for debugging, complex troubleshooting, system performance, etc.
  • 9. Observability is Challenging
      • Complex applications & microservices
      • Agile infrastructure
      • Distributed systems
      • Disparate tooling
      • High customer expectations
  • 10. Observability Needs Granular Metrics
      Default Metrics
      • EC2 Instance Metrics: CPUUtilization, DiskReadBytes, DiskReadOps, DiskWriteBytes, DiskWriteOps, NetworkIn, NetworkOut, NetworkPacketsIn, NetworkPacketsOut
      + Metrics Available with the CloudWatch Agent
      • CPU: cpu_time_guest, cpu_time_guest_nice, cpu_time_idle, cpu_time_iowait, cpu_time_irq, cpu_time_nice, cpu_time_softirq, cpu_time_steal, cpu_time_system, cpu_time_user, cpu_usage_guest, cpu_usage_guest_nice, cpu_usage_idle, cpu_usage_iowait, cpu_usage_irq, cpu_usage_nice, cpu_usage_softirq, cpu_usage_steal, cpu_usage_system, cpu_usage_user
      • Disk: disk_free, disk_inodes_free, disk_inodes_total, disk_inodes_used, disk_total, disk_used, disk_used_percent, diskio_io_time, diskio_iops_in_progress, diskio_read_bytes, diskio_read_time, diskio_reads, diskio_write_bytes, diskio_write_time, diskio_writes
      • Memory: mem_active, mem_available, mem_available_percent, mem_buffered, mem_cached, mem_free, mem_inactive, mem_total, mem_used, mem_used_percent
      • Network: net_bytes_recv, net_bytes_sent, net_drop_in, net_drop_out, net_err_in, net_err_out, net_packets_recv, net_packets_sent
      • Network Statistics: netstat_tcp_close, netstat_tcp_close_wait, netstat_tcp_closing, netstat_tcp_established, netstat_tcp_fin_wait1, netstat_tcp_fin_wait2, netstat_tcp_last_ack, netstat_tcp_listen, netstat_tcp_none, netstat_tcp_syn_recv, netstat_tcp_syn_sent, netstat_tcp_time_wait, netstat_udp_socket
      • Processes: processes_blocked, processes_dead, processes_idle, processes_paging, processes_running, processes_sleeping, processes_stopped, processes_total, processes_total_threads, processes_wait, processes_zombies
      • Swap: swap_free, swap_used, swap_used_percent
      + Metric Enhancements
      • Contextual Resource Information
      • Custom Dimensions
      • Autodetect Region
      • Aggregate Metrics
      • 1-Second Resolution
  • 11. CloudWatch Agent in Action
  • 12. The CloudWatch Agent Simplifies Observability
      Unified Agent
      • Metrics & logs
      • For EC2 and on-premises servers
      • Linux & Windows
      Enhanced Metrics & Logs
      • Collect in-guest system metrics
      • Appends EC2 dimensions
      • Custom dimensions
  • 13. Getting Started Experience
      • Install and Configure with AWS Systems Manager Integration
      • Provides defaults specific to OS-type (Windows vs Linux)
      • Basic, Standard, or Advanced options (complement EC2 or granular per resource metrics)
      • Log collection (can specify multiple file_paths)
      • Migrate from previous CW Logs agent
      • Curated metric set specific to environment
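As a rough illustration of what such a configuration might look like, the sketch below builds a small CloudWatch agent configuration as a Python dictionary and prints it as JSON. The chosen measurements, the log group name `example-app-logs`, the log paths, and the helper `build_agent_config` are assumptions made for illustration; they are not taken from the talk, and the configuration the agent's setup wizard produces will differ.

```python
import json


def build_agent_config(log_paths, interval_seconds=60):
    """Return a CloudWatch agent configuration as a plain dict (illustrative)."""
    return {
        "agent": {"metrics_collection_interval": interval_seconds},
        "metrics": {
            # Attach the EC2 instance ID as a dimension on collected metrics.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                "mem": {"measurement": ["mem_used_percent"]},
                "disk": {"measurement": ["used_percent"], "resources": ["*"]},
                "swap": {"measurement": ["swap_used_percent"]},
            },
        },
        "logs": {
            "logs_collected": {
                "files": {
                    # One entry per file path; several paths can be listed.
                    "collect_list": [
                        {
                            "file_path": path,
                            "log_group_name": "example-app-logs",  # hypothetical name
                            "log_stream_name": "{instance_id}",
                        }
                        for path in log_paths
                    ]
                }
            }
        },
    }


# Hypothetical log locations, used only to show multiple file paths.
config = build_agent_config(["/var/log/messages", "/var/log/example-app/*.log"])
print(json.dumps(config, indent=2))
```

The printed JSON could be saved as the agent's configuration file or stored for distribution through AWS Systems Manager, in line with the integration the slide mentions.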
  • 14. Ideal Path to Gain Observability of your VMs
      1. Install the CloudWatch Agent
      2. Collect Metrics and Logs
      3. Build and View Dashboards
      4. Create Alarms & Actions
      5. Generate New Time Series Using Metric Math
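The Metric Math step can be sketched as follows. The first function builds a list of metric data queries in the shape accepted by the CloudWatch GetMetricData API, where an expression derives a memory-used percentage from two agent metrics. The second function performs the same arithmetic locally so the result is easy to check. The function names, the choice of `mem_used` and `mem_total`, and the assumption that the metrics carry only an `InstanceId` dimension in the `CWAgent` namespace are illustrative, not taken from the talk.

```python
def memory_percent_queries(instance_id, period_seconds=60):
    """Build GetMetricData-style queries deriving memory used (%) via an expression."""

    def stat(query_id, metric_name):
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "CWAgent",  # assumed agent namespace
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
                },
                "Period": period_seconds,
                "Stat": "Average",
            },
            # The raw series are inputs only; just the derived series is returned.
            "ReturnData": False,
        }

    return [
        stat("m1", "mem_used"),
        stat("m2", "mem_total"),
        {"Id": "e1", "Expression": "100 * m1 / m2", "Label": "Memory used (%)"},
    ]


def evaluate_locally(used, total):
    """Apply the same arithmetic as the expression to two aligned series."""
    return [100.0 * u / t for u, t in zip(used, total)]


queries = memory_percent_queries("i-0123456789abcdef0")  # hypothetical instance ID
derived = evaluate_locally([2.0, 3.0], [8.0, 8.0])
print(derived)
```

In practice the query list would be passed to an AWS SDK client rather than evaluated by hand; the local function exists only to make the arithmetic visible.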
  • 20. Rackspace – Scaling Operations with Amazon CloudWatch
  • 21. About Rackspace
      • AWS Premier Consulting Partner and audited Managed Service Provider (MSP)
      • A leader in the 2017 Gartner Magic Quadrant for Public Cloud Infrastructure Managed Service Providers, Worldwide
      • Managed Service Provider to over half the Fortune 100
      • Provides a full-stack portfolio, from managed operations and applications to security, professional services, and Enterprise migration and transformation
  • 22. Key Use Cases Driving Customer Value
      1. Solving Enterprise challenges of cost optimization and cost governance
      2. Supporting day-to-day Customer Operations
      3. Enabling automation to lower time to diagnose and resolve infrastructure and Operating System issues
  • 23. 1. Cost Control With Amazon CloudWatch
      • Rackspace provides services for Cost and Performance Optimization
      • CloudWatch replaces expensive, on-premises infrastructure monitoring
      • The CloudWatch Agent provides on-instance insight for disk size, memory/swap utilization, etc.
      • Rackspace recently saved a customer ~$500k on their yearly AWS spend by:
          • Right-sizing instances based on CloudWatch metric performance insights
          • Consolidating unused instances
          • Migrating to new instance families
      • Proposed Reserved Instance (RI) purchases would add ~$100k in savings
      • Rackspace also has tooling to manage spend alerting, bill consolidation and cost allocation / chargeback
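A right-sizing check of the kind described above might, under simple assumptions, look like the sketch below. It flags instances whose peak CPU and memory utilization both stay under a threshold over the observed window. The thresholds, the sample data, and the function name `downsize_candidates` are hypothetical; the talk does not describe Rackspace's actual criteria.

```python
def downsize_candidates(samples, cpu_max=20.0, mem_max=40.0):
    """Return instance IDs whose peak CPU and memory stay below the thresholds.

    samples maps an instance ID to {"cpu": [...], "mem": [...]}, where each list
    holds utilization percentages over the observation window.
    """
    candidates = []
    for instance_id, series in sorted(samples.items()):
        # Use the peak, not the average, so short bursts are not overlooked.
        if max(series["cpu"]) < cpu_max and max(series["mem"]) < mem_max:
            candidates.append(instance_id)
    return candidates


# Hypothetical utilization samples (percent).
samples = {
    "i-aaa": {"cpu": [5.0, 12.0, 9.0], "mem": [20.0, 25.0, 22.0]},
    "i-bbb": {"cpu": [10.0, 85.0, 40.0], "mem": [30.0, 35.0, 33.0]},
}
candidates = downsize_candidates(samples)
print(candidates)
```

Memory figures such as these are only available with in-guest collection, which is why the agent's metrics matter for this kind of analysis.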
  • 24. 2. Scale Operations With Amazon CloudWatch
      • Rackspace provides services for 24x7x365 Fanatical Support
      • CloudWatch is used to provide infrastructure monitoring and alerting for hundreds of AWS customer accounts.
      • Rackspace acts as first line of response for Infrastructure and Operating System alarms.
      • The CloudWatch Agent enables monitoring of Operating System performance and logs, and dashboards increase context for our operations team.
  • 25. 3. Automation With Amazon CloudWatch
      • CloudWatch and the CloudWatch Agent integrate with other Amazon services (SNS, Lambda, etc.) to provide automation opportunities
      • Examples:
          • Low disk space reports
          • Restart runaway processes
          • Run diagnostic tools in response to instance metric alarms (top, free, etc.)
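One way such automation could be wired together is a Lambda function subscribed to the SNS topic that an alarm notifies. The sketch below parses the alarm notification and chooses a follow-up action by metric name. The action names in `ACTIONS`, the alarm name, and the sample event are hypothetical, and the assumption that the alarm message lists its dimensions with lowercase `name` and `value` keys should be verified against a real notification. The handler only returns its decisions; it does not call any AWS service.

```python
import json

# Hypothetical mapping from agent metric names to follow-up actions.
ACTIONS = {
    "disk_used_percent": "report_low_disk_space",
    "cpu_usage_user": "capture_top_output",
    "mem_used_percent": "capture_free_output",
}


def handler(event, context=None):
    """Turn SNS-delivered CloudWatch alarm notifications into action decisions."""
    decisions = []
    for record in event["Records"]:
        # SNS delivers the alarm payload as a JSON string.
        alarm = json.loads(record["Sns"]["Message"])
        if alarm.get("NewStateValue") != "ALARM":
            continue  # ignore OK and INSUFFICIENT_DATA transitions
        trigger = alarm.get("Trigger", {})
        dims = {d["name"]: d["value"] for d in trigger.get("Dimensions", [])}
        decisions.append(
            {
                "alarm": alarm.get("AlarmName"),
                "instance_id": dims.get("InstanceId"),
                "action": ACTIONS.get(trigger.get("MetricName"), "notify_operator"),
            }
        )
    return decisions


# Hypothetical event, shaped like an SNS notification for an alarm.
sample_event = {
    "Records": [
        {
            "Sns": {
                "Message": json.dumps(
                    {
                        "AlarmName": "low-disk",
                        "NewStateValue": "ALARM",
                        "Trigger": {
                            "MetricName": "disk_used_percent",
                            "Namespace": "CWAgent",
                            "Dimensions": [
                                {"name": "InstanceId", "value": "i-0123456789abcdef0"}
                            ],
                        },
                    }
                )
            }
        }
    ]
}
result = handler(sample_event)
print(result)
```

Returning decisions rather than acting on them keeps the parsing logic easy to test; a real function would hand each decision to whatever tooling runs the diagnostic or remediation.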
  • 26. Recommended Best Practices
      • Use the in-instance visibility of the CloudWatch Agent to maximize your cost optimization strategy
      • Leverage CloudWatch Agent and Dashboards to provide an increase in context for Operations and reduce MTTR
      • CloudWatch can tie into other systems to increase automated handling and diagnosis of issues
  • 27. Closing
  • 28. Closing Thoughts
      1. Have a strong monitoring & observability strategy
      2. Focus on collecting health and behavior metrics
      3. Improve performance, minimize production impacts and control costs for a better end-user experience
  • 29. Next Steps
      • Get started with CloudWatch for free today: aws.amazon.com/cloudwatch
      • Many applications can operate within our monthly free tier limits:
          • Basic monitoring
          • 10 custom or detailed metrics
          • 10 alarms
          • 3 dashboards of 50 metrics each
          • 1 million API requests
      • Install the CloudWatch Agent: docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent
      • Use Metric Math to create new time series: docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/using-metric-math