Building Web APIs that Scale
1. Building Web APIs that Scale
Designing for Graceful Degradation
Evan Cooke, Twilio, CTO
@emcooke
2. Safe Harbor
Safe harbor statement under the Private Securities Litigation Reform Act of 1995:
This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties
materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results
expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be
deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other
financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any
statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services.
The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new
functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our
operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of
intellectual property and other litigation, risks associated with possible mergers and acquisitions, the immature market in which we
operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new
releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization
and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of
salesforce.com, inc. is included in our annual report on Form 10-Q for the most recent fiscal quarter ended July 31, 2012. This
document and others containing important disclosures are available on the SEC Filings section of the Investor Information section of
our Web site.
Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently
available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based
upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-
looking statements.
5. Cloud services and the APIs they power are
becoming the backbone of modern society. APIs
support the apps that structure how we work, play,
and communicate.
6. Twilio
Observations today based on
experience building @twilio
• Founded in 2008
• Infrastructure APIs to automate
phone and SMS
communications
• 120 Employees
• >1000 servers running 24x7
10. Goal Today
Support graceful degradation of API
performance under extreme load
No Failure
11. Why Failures?
[Diagram: Incoming Requests -> Load Balancer -> AAA -> Throttling -> App Servers, each with a fixed Worker Pool of workers (W) -> Throttling]
13. Problem Summary
• Cloud services often use worker pools
to handle incoming requests
• When load goes beyond size of the
worker pool, requests fail
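The failure mode above can be sketched in a few lines. This is a minimal illustration, not Twilio's implementation: `WorkerPool`, `try_accept`, and `simulate_burst` are hypothetical names, and a semaphore stands in for a real thread or process pool.

```python
import threading


class WorkerPool:
    """Fixed-size pool: a request either gets a worker right away or fails."""

    def __init__(self, size):
        self._free = threading.BoundedSemaphore(size)

    def try_accept(self):
        # Non-blocking acquire: False means every worker is already busy.
        return self._free.acquire(blocking=False)

    def release(self):
        # Called when a worker finishes its request.
        self._free.release()


def simulate_burst(pool_size, burst):
    """Send `burst` simultaneous slow requests; return (accepted, failed)."""
    pool = WorkerPool(pool_size)
    accepted = sum(1 for _ in range(burst) if pool.try_accept())
    return accepted, burst - accepted


if __name__ == "__main__":
    print(simulate_burst(pool_size=4, burst=10))
```

With 4 workers and a burst of 10 slow requests, the 6 requests that arrive after the pool fills have nowhere to go.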
14. Queues to the rescue?
[Diagram: Incoming Requests -> queue -> Process & Respond]
1. If we respond synchronously, each item in the queue
still ties up a worker. Doh!
2. If we close the incoming connection and free the
worker, then we need an asynchronous callback to
respond to the request. Doh!
15. Observation 1
A synchronous web API is often much
easier for developers to integrate because
it avoids the additional complexity of callbacks
Implication: Responding to requests
synchronously is often preferable to queuing
the request and responding with an
asynchronous callback
16. Synchronous vs. Asynchronous Interfaces
Take POST data from a web form, send it to a geo lookup API, store the
result in the DB, and return a status page to the user

Sync:
    d = read_form();
    geo = api->lookup(d);
    db->store(d, geo);
    return "success";

Async:
    d = read_form();
    api->lookup(d);

    # in /geo-result
    db->store(d, geo);
    ws->send("success");

An async interface needs a separate URL handler
and a websocket connection to return the result
17. Observation 2
For many APIs, taking additional time to
service a request is better than failing
that specific request
Implication: In many cases, it is better to service
a request with some delay than to fail it
18. Observation 3
It is better to fail some requests than all
incoming requests
Implication: Under load, it may be better to
selectively drop expensive requests that can’t
be serviced and allow the others
24. Non-blocking IO
                                    Time
req = 'GET /';                      1
req.append('\r\n\r\n');             1
socket.write(req, fn() {            10
    socket.read(fn(resp) {          10
        print(resp);                10
    });
});
reactor.run_forever();
No delay blocking the worker waiting for IO
25. Request Response Decoupling
req = 'GET /';
req.append('\r\n\r\n');
socket.write(req, fn() {
    socket.read(fn(resp) {
        print(resp);
    });
});
reactor.run_forever();
Using this approach we can decouple the socket of an
incoming connection from the processing of that connection
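To make the decoupling concrete, here is a small sketch. It uses the standard library's asyncio rather than the reactor pseudocode above so it runs without extra dependencies; `handle_request`, `serve_all`, and `timed_run` are hypothetical names, and `asyncio.sleep` stands in for waiting on a socket.

```python
import asyncio
import time


async def handle_request(request_id, io_delay):
    # Stand-in for waiting on a socket. While this request waits,
    # control returns to the event loop and other requests make progress.
    await asyncio.sleep(io_delay)
    return request_id


async def serve_all(n_requests, io_delay):
    # All requests are in flight at once on a single thread.
    tasks = [handle_request(i, io_delay) for i in range(n_requests)]
    return await asyncio.gather(*tasks)


def timed_run(n_requests=200, io_delay=0.05):
    """Return (results, elapsed seconds) for n_requests concurrent waits."""
    start = time.monotonic()
    results = asyncio.run(serve_all(n_requests, io_delay))
    return results, time.monotonic() - start


if __name__ == "__main__":
    results, elapsed = timed_run()
    print(len(results), "requests in", round(elapsed, 2), "seconds")
```

Handled one after another, 200 waits of 0.05 seconds would take about 10 seconds. Here the waits overlap, so no request ties up a worker while it is idle.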
27. Callback Spaghetti
Example of callback nesting complexity with Python Twisted (also node.js)
req = 'GET /'
req += '\r\n\r\n'

def r(resp):
    print(resp)

def w(_):
    # the write callback receives a result we do not need
    socket.read().addCallback(r)

socket.write(req).addCallback(w)
28. inlineCallbacks to the Rescue
We can clean up the callbacks using deferred generators and
inline callbacks (similar frameworks also exist for js)
req = 'GET /'
req += '\r\n\r\n'
yield socket.write(req)
resp = yield socket.read()
print(resp)
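The same flattened, sequential style can be tried end to end with the standard library. The sketch below swaps in asyncio's `await` for Twisted's `yield` (a different framework from the one on the slide) and starts a toy local server so the example is self-contained. `handle`, `fetch`, and `main` are hypothetical names.

```python
import asyncio


async def handle(reader, writer):
    # Toy server: read the request head, answer with a fixed response.
    await reader.readuntil(b"\r\n\r\n")
    writer.write(b"HTTP/1.0 200 OK\r\n\r\nhello")
    await writer.drain()
    writer.close()


async def fetch(host, port):
    reader, writer = await asyncio.open_connection(host, port)
    writer.write(b"GET / HTTP/1.0\r\n\r\n")
    await writer.drain()        # plays the role of `yield socket.write(req)`
    resp = await reader.read()  # plays the role of `resp = yield socket.read()`
    writer.close()
    return resp


async def main():
    # Port 0 asks the OS for any free port on localhost.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    try:
        return await fetch("127.0.0.1", port)
    finally:
        server.close()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The body of `fetch` reads top to bottom like blocking code, yet each `await` hands control back to the event loop while the socket is idle.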
31. Event Python gevent
“gevent is a coroutine-based Python
networking library that uses greenlet to
provide a high-level synchronous API on
top of the libevent event loop.”
Natively asynchronous
socket.write()
resp = socket.read()
print resp
32. gevent Example
Simple Echo Server
Easy sequential model yet fully asynchronous
from gevent.server import StreamServer

def echo(socket, address):
    print('New connection from %s:%s' % address)
    socket.sendall(b'Welcome to the echo server!\r\n')
    # makefile() gives a file-like object so we can use readline()
    fileobj = socket.makefile(mode='rwb')
    line = fileobj.readline()
    fileobj.write(line)
    fileobj.flush()
    print("echoed %r" % line)

if __name__ == '__main__':
    server = StreamServer(('0.0.0.0', 6000), echo)
    server.serve_forever()
33. gevent Example
Simple Echo Server (same code as slide 32)
However, for production use gevent still needs
daemonization, logging, and other servicification
functionality, such as what Twisted’s twistd provides
34. Async Services with Ginkgo
Ginkgo is a simple framework for
composing asynchronous gevent services
with common configuration, logging,
daemonizing, etc.
https://github.com/progrium/ginkgo
Let’s look at a simple example that implements a
TCP and HTTP server...
35. Ginkgo Example
Import WSGI/TCP Servers
import gevent
from gevent.pywsgi import WSGIServer
from gevent.server import StreamServer
from ginkgo.core import Service
36. Ginkgo Example
HTTP Handler
def handle_http(env, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    print('new http request!')
    return [b"hello world"]
37. Ginkgo Example
TCP Handler
def handle_tcp(socket, address):
    print('new tcp connection!')
    while True:
        socket.send(b'hello\n')
        gevent.sleep(1)
38. Ginkgo Example
Service Composition
import gevent
from gevent.pywsgi import WSGIServer
from gevent.server import StreamServer
from ginkgo.core import Service

def handle_http(env, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    print('new http request!')
    return [b"hello world"]

def handle_tcp(socket, address):
    print('new tcp connection!')
    while True:
        socket.send(b'hello\n')
        gevent.sleep(1)

# Compose the TCP and HTTP servers into one service
app = Service()
app.add_service(StreamServer(('127.0.0.1', 1234), handle_tcp))
app.add_service(WSGIServer(('127.0.0.1', 8080), handle_http))
app.serve_forever()
39. Toward a Fully Asynchronous API
Using Ginkgo or another async
framework, let’s look at our web-worker
architecture and see how we can modify
it to become fully asynchronous
40. The Old Way
[Diagram: Incoming Requests -> Load Balancer -> AAA -> Throttling -> App Servers, each with a fixed Worker Pool of workers (W) -> Throttling]
41. [Diagram: Incoming Requests -> Load Balancer -> Async Servers]
Step 1 - Let’s start by replacing our threaded
workers with asynchronous app servers
42. [Diagram: Incoming Requests -> Load Balancer -> Async Servers]
Huzzah, now idle open connections will use very
few server resources
Step 1 - Let’s start by replacing our threaded
workers with asynchronous app servers
43. [Diagram: Incoming Requests -> Load Balancer -> AAA -> Async Servers]
Step 2 - Define an authentication and authorization
layer to identify the user and the resource requested
44. AAA Manager
Goal: Perform authentication,
authorization, and accounting for each
incoming API request
Extract key parameters
• Account
• Resource Type
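Extracting the two key parameters might look like the sketch below. The URL layout `/Accounts/<account id>/<Resource>` and the names `RequestKey` and `extract_key` are assumptions made for illustration. The slides do not specify how the AAA layer reads a request, and credential checking is left out.

```python
from typing import NamedTuple, Optional


class RequestKey(NamedTuple):
    account: str
    resource_type: str


def extract_key(path: str) -> Optional[RequestKey]:
    """Pull (account, resource type) out of a path such as
    /Accounts/<account id>/<Resource>. Returns None if the path
    does not follow that (assumed) layout."""
    parts = [p for p in path.split("/") if p]
    if "Accounts" not in parts:
        return None
    i = parts.index("Accounts")
    if len(parts) < i + 3:
        return None
    return RequestKey(account=parts[i + 1], resource_type=parts[i + 2])


if __name__ == "__main__":
    print(extract_key("/Accounts/AC123/Calls"))
```

The resulting pair can then be handed to whatever component decides how to treat the request.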
45. [Diagram: Incoming Requests -> Load Balancer -> AAA -> Throttling -> Async Servers, with the Throttling layer consulting a Concurrency Manager]
Step 3 - Add a concurrency manager that
determines whether to throttle each request
46. Concurrency Manager
Goal: Determine whether to delay or drop
an individual request to limit access to
API resources
Possible inputs
• By Account
• By Resource Type
• By Availability of Dependent Resources
47. Concurrency Manager
What we’ve found useful
• Tuple (Account, Resource Type)
Supports multi-tenancy
• Protection between accounts
• Protection within an account between resource
types, e.g., Calls & SMS
48. Concurrency Manager
Concurrency manager returns one of
1. Allow the request immediately
2. Delay the request before it is
processed
3. Drop the request and return an error:
HTTP 429 - Concurrency Limit
Reached
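One way to picture the three outcomes is a counter per (Account, Resource Type) tuple with two thresholds. The sketch below is an assumption about how such a manager could work, not the one described in the talk: the soft and hard limits, `Decision`, `admit`, and `done` are all hypothetical.

```python
from collections import defaultdict
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DELAY = "delay"
    DROP = "drop"  # the caller would respond with HTTP 429


class ConcurrencyManager:
    """Tracks in-flight requests per (account, resource type)."""

    def __init__(self, soft_limit, hard_limit):
        self.soft_limit = soft_limit  # above this, requests are delayed
        self.hard_limit = hard_limit  # at this, requests are dropped
        self._in_flight = defaultdict(int)

    def admit(self, account, resource_type):
        key = (account, resource_type)
        current = self._in_flight[key]
        if current >= self.hard_limit:
            return Decision.DROP
        # Allowed and delayed requests both count as in flight.
        self._in_flight[key] = current + 1
        if current >= self.soft_limit:
            return Decision.DELAY
        return Decision.ALLOW

    def done(self, account, resource_type):
        # Call when an allowed or delayed request finishes.
        key = (account, resource_type)
        if self._in_flight[key] > 0:
            self._in_flight[key] -= 1
```

Because the counters are keyed by the tuple, a busy account does not use up another account's budget, and heavy Calls traffic within one account does not block its SMS requests.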
49. [Diagram: Incoming Requests -> Load Balancer -> AAA -> Throttling -> Async Servers (with Concurrency Manager) -> Throttling -> Dependent Services]
Step 4 - Provide for concurrency control
between the servers and backend resources
50. Conclusion 1
A synchronous web API is often much
easier for developers to integrate because
it avoids the additional complexity of callbacks
The proposed asynchronous API framework
provides for synchronous API calls
without worrying about worker pools filling up.
It is also easy to add callbacks where needed.
51. Conclusion 2
For many APIs, taking additional time to
service a request is better than failing
that specific request
The proposed asynchronous API framework
provides the ability to inject delay into the
processing of incoming requests rather than
dropping them.
52. Example of Delay Injection
[Chart: Load and Latency]
Spread load across a
longer time period
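Delay injection can be sketched with a semaphore in front of a constrained backend: requests over the limit wait their turn instead of failing. This uses the standard library's asyncio rather than gevent so it runs without extra dependencies. `call_backend`, `run_burst`, and the limit of 5 are assumptions made for illustration.

```python
import asyncio


async def call_backend(limiter, stats, work_time):
    # Requests beyond the limit wait here instead of failing.
    async with limiter:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(work_time)  # stand-in for the backend call
        stats["active"] -= 1
    return "ok"


async def run_burst(n_requests, limit, work_time=0.01):
    """Fire n_requests at once; return (results, peak backend concurrency)."""
    limiter = asyncio.Semaphore(limit)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(call_backend(limiter, stats, work_time) for _ in range(n_requests))
    )
    return results, stats["peak"]


if __name__ == "__main__":
    results, peak = asyncio.run(run_burst(n_requests=20, limit=5))
    print(len(results), "served, peak backend concurrency", peak)
```

All 20 requests succeed, but the backend never sees more than 5 at a time. The burst is spread over a longer period at the cost of added latency for the requests that waited.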
53. Conclusion 3
It is better to fail some incoming
requests than to fail all requests
The proposed asynchronous API framework
provides the ability to selectively drop requests
to limit contention on limited resources
54. Example of Dropping Requests
[Chart: Load; Latency of /x (dropped); Latency of /*]
Drop only the requests that we must
due to scarce backend resources
55. Summary
Async frameworks like gevent allow you to easily
decouple a request from access to constrained
resources
[Chart: Request Latency over Time, with the API outage marked]