View full webinar on demand at http://nginx.com/resources/webinars/nginx-high-availability-monitoring/
NGINX High Availability and Monitoring
1. NGINX High Availability and Monitoring
Introduced by Andrew Alexeev
Presented by Owen Garrett
NGINX, Inc.
2. About this webinar
No one likes a broken website. Learn about some of the techniques that NGINX
users employ to ensure that server failures are detected and worked around, so that
you too can build large-scale, highly-available web services.
4. The causes of downtime
“Through 2015, 80% of outages impacting mission-critical services will be caused by people and process issues, and more than 50% of those outages will be caused by change/configuration/release integration and hand-off issues.”
Ronni J. Colville and George Spafford, Gartner: Configuration Management for Virtual and Cloud Infrastructures
(Chart: outage causes split between “People and Process” and “Hardware failures, disasters”)
6. What is NGINX?
• Proxy: caching, load balancing of HTTP traffic
• Web Server: serve content from disk
• Application Server: FastCGI, uWSGI, Passenger…
• Application Acceleration: SSL and SPDY termination
• Performance Monitoring
• High Availability
• Advanced features: bandwidth management, content-based routing, request manipulation, response rewriting, authentication, video delivery, mail proxy, geolocation
8. NGINX usage: 22% of the top 1 million websites, 37% of the top 1,000 websites
9. NGINX and NGINX Plus
NGINX F/OSS (nginx.org): large community of >100 third-party modules
10. NGINX and NGINX Plus
NGINX Plus adds advanced load-balancing features, ease of management, and commercial support on top of NGINX F/OSS
12. Quick review of load balancing
server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}
upstream backend {
    server webserver1:80;
    server webserver2:80;
    server webserver3:80;
    server webserver4:80;
}
13. Three NGINX Techniques for High Availability
1. NGINX: basic error checks
2. NGINX Plus: advanced health checks
3. Live software upgrades
14. 1. Basic Error Checks
• Monitor transactions as they happen
– Retry transactions that ‘fail’ where possible
– Mark failed servers as dead
15. Basic Error Checks
server {
    listen 80;
    location / {
        proxy_pass http://backend;
        proxy_next_upstream error timeout;  # also: http_500, http_502, http_503, http_504, off
    }
}
upstream backend {
    server webserver1:80 max_fails=1 fail_timeout=10s;
    server webserver2:80 max_fails=1 fail_timeout=10s;
    server webserver3:80 max_fails=1 fail_timeout=10s;
    server webserver4:80 max_fails=1 fail_timeout=10s;
}
16. More sophisticated retries
server {
    listen 80;
    location / {
        # On error/timeout, try the upstream group one more time
        error_page 502 504 = @fallback;
        proxy_pass http://backend;
        proxy_next_upstream off;
    }
    location @fallback {
        proxy_pass http://backend;
        proxy_next_upstream off;
    }
}
17. 2. Advanced Health Checks
• “Synthetic Transactions”
– Probe server health
– Complex, custom tests are possible
– Available in NGINX Plus
18. Advanced Health Checks
server {
    listen 80;
    location / {
        proxy_pass http://backend;
        health_check;
    }
}
upstream backend {
    zone backend 64k;
    server webserver1:80;
    server webserver2:80;
    server webserver3:80;
    server webserver4:80;
}
health_check parameters:
interval = period between checks
fails = failure count before a server is marked dead
passes = pass count before a server is marked alive again
uri = custom URI to probe
Defaults: interval=5s, fails=1, passes=1, uri=/
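As a sketch of these parameters in use (the /healthz URI and the values are illustrative, not the defaults):

```nginx
location / {
    proxy_pass http://backend;
    # Probe a hypothetical /healthz every 10s; 3 consecutive failures mark
    # the server dead, 2 consecutive passes bring it back
    health_check interval=10 fails=3 passes=2 uri=/healthz;
}
```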
19. Advanced usage
server {
    listen 80;
    location / {
        proxy_pass http://backend;
        health_check uri=/test.php match=statusok;
        proxy_set_header Host www.foo.com;
    }
}
match statusok {
    # Used for /test.php health check
    status 200;
    header Content-Type = text/html;
    body ~ "Server[0-9]+ is alive";
}
Health checks inherit all parameters from the enclosing location block. match blocks define the success criteria for a health check.
20. Edge cases – variables in configuration
server {
    location / {
        proxy_pass http://backend;
        health_check;
        proxy_set_header Host $host;
    }
}
This may not work as expected. Remember: the health_check probes run in the context of the enclosing location, and variables such as $host are derived from a client request that a synthetic probe does not have.
21. Edge cases – variables in configuration
server {
    location / {
        proxy_pass http://backend;
        health_check;
        proxy_set_header Host $host;
    }
}
This may not work as expected. Remember: the health_check probes run in the context of the enclosing location.
server {
    location /internal-check {
        internal;
        proxy_pass http://backend;
        health_check;
        proxy_set_header Host www.foo.com;
    }
}
This is the common alternative: use a custom URI for the location, tag the location as internal, and set headers manually. Also useful for authentication.
22. Examples of using health checks
• Verify that pages don’t contain errors
• Run internal tests (e.g. test.php => DB connect)
• Managed removal of servers:
$ touch $DOCROOT/isactive.txt
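The managed-removal idea can be sketched in configuration: point the health check at a file that operators create or delete on each backend. The file name isactive.txt matches the slide; everything else here is illustrative.

```nginx
upstream backend {
    zone backend 64k;
    server webserver1:80;
    server webserver2:80;
}
server {
    location / {
        proxy_pass http://backend;
        # A server receives traffic only while isactive.txt exists in its
        # docroot; delete the file to drain it without touching this config
        health_check uri=/isactive.txt;
    }
}
```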
23. Advantages of ‘Health Checks’
• Run tests asynchronously (find errors faster)
• Custom tests (not related to ‘real’ traffic)
• More flexibility to specify success/error
25. Slow start
• When a failed server recovers (whether detected by basic error checks or advanced health checks), ramp traffic up to it gradually:
upstream backends {
    zone backends 64k;
    server webserver1 slow_start=30s;
}
26. NGINX Plus status monitoring
http://demo.nginx.com/ and http://demo.nginx.com/status
The status dashboard (web) and /status endpoint (JSON) report:
Total data and connections
Current data and connections
Split per ‘server zone’
Cache statistics
Upstream statistics: traffic, health and error status
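The JSON endpoint lends itself to scripted alerting. The exact schema differs between NGINX Plus versions, so this is only a sketch that assumes a simplified shape (an "upstreams" map whose peers carry "server" and "state" fields); adapt the field names to the output of your own /status endpoint.

```python
import json

def down_peers(status_json):
    """Return (upstream, server) pairs for peers whose state is not 'up'.

    Assumes a simplified shape: {"upstreams": {name: {"peers": [...]}}}.
    Real NGINX Plus status output varies by version; adjust field names.
    """
    status = json.loads(status_json)
    failed = []
    for name, upstream in status.get("upstreams", {}).items():
        for peer in upstream.get("peers", []):
            if peer.get("state") != "up":
                failed.append((name, peer.get("server")))
    return failed

# Illustrative sample, not real /status output
sample = json.dumps({
    "upstreams": {
        "backend": {
            "peers": [
                {"server": "webserver1:80", "state": "up"},
                {"server": "webserver2:80", "state": "unhealthy"},
            ]
        }
    }
})
print(down_peers(sample))  # [('backend', 'webserver2:80')]
```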
27. 3. Live software upgrades
• Upgrade your NGINX binary on-the-fly
– No downtime
– No dropped connections
28. No downtime – ever!
• Reload configuration with SIGHUP
# nginx -s reload
• Re-exec binary with copy-and-signal
http://nginx.org/en/docs/control.html#upgrade
(Diagram: one NGINX parent process managing several sets of NGINX workers)
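The copy-and-signal procedure from the page above can be sketched as a signal sequence. The pid-file path is a common default and may differ on your system; run this only against a live master process, after installing the new binary over the old path.

```shell
PID=/var/run/nginx.pid
kill -USR2 "$(cat $PID)"          # start a new master; old pid moves to nginx.pid.oldbin
kill -WINCH "$(cat $PID.oldbin)"  # gracefully shut down the old workers
# verify the new binary is serving traffic, then retire the old master:
kill -QUIT "$(cat $PID.oldbin)"
# to roll back instead: send HUP to the old master and QUIT to the new one
```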
29. In summary...
NGINX F/OSS: basic error checks and retry logic; on-the-fly upgrades
NGINX Plus: advanced health checks with slow start; extended status monitoring
Compared to other load balancers and ADCs, NGINX Plus is uniquely well-suited
to a devops-driven environment.
30. Closing thoughts
• 37% of the busiest websites use NGINX
– In most situations, it’s a drop-in extension
• Check out the blogs on nginx.com
• Future webinars: nginx.com/webinars
Try NGINX F/OSS (nginx.org) or NGINX Plus (nginx.com)
Editor’s notes
Story starts with a single guy, Igor Sysoev
What was originally a tool for managing concurrency has evolved into a web application accelerator
Not because of vision but user driven innovation
http://www.networkworld.com/careers/2004/0105man.html
http://www.evolven.com/blog/downtime-outages-and-failures-understanding-their-true-costs.html
Cost of downtime: reputation, PPC ads, job losses; e.g. £1m/hour of downtime for one UK service
How do we reduce this 80%? We need infrastructure that works with our processes, is tightly integrated with our devops practices, that we can work with rather than battle against.
http://www.evolven.com/blog/downtime-outages-and-failures-understanding-their-true-costs.html
Misconfigurations Have Major Impact on Performance
The IT Process Institute's Visible Ops Handbook reports that "80% of unplanned outages are due to ill-planned changes made by administrators ("operations staff") or developers." (Visible Ops). Getting to the bottom of the matter, the Enterprise Management Association reports that 60% of availability and performance errors are the result of misconfigurations. The little changes that are implemented to the environment and system configuration parameters all the time.
A recent Gartner study projected that "Through 2015, 80% of outages impacting mission-critical services will be caused by people and process issues, and more than 50% of those outages will be caused by change/configuration/release integration and hand-off issues." (Ronni J. Colville and George Spafford Configuration Management for Virtual and Cloud Infrastructures)
Manual configuration errors can cost companies up to $72,000 per hour in Web application downtime. While application maintenance costs are increasing at a rate of 20% annually, 35% of those polled said at least one-quarter of their downtime was caused by configuration errors. (How much will you spend on application downtime this year?)
As we go through this presentation, we’ll highlight some of the new features that are specific to nginx plus
proxy_next_upstream error timeout is default
server max_fails=1 fail_timeout=10s is default
502 = Bad Gateway; 504 = Gateway Timeout
Health-check defaults: 5s interval, 1 fail, 1 pass, URI = /
Remember that a significant proportion of failures occur because of errors in process.
NGINX Plus is more flexible and can be more easily accommodated by your standard devops process.