Although Launchpad's core web application does not use Twisted, most other parts of Launchpad that have to deal with concurrency do. This talk surveys those areas and discusses the problems we ran into and the design patterns we found helpful.
1. How we use Twisted in
Launchpad
KiwiPyCon 2009
Michael Hudson, Canonical Ltd
michael.hudson@canonical.com
2. Introduction
• This talk attempts to present some “real
world” use of Twisted as part of Launchpad,
a large – one could even say “enterprise
scale” – open source application.
• First, a survey of where Twisted is used
• Then a more detailed example
3. Twisted?
• “Event-driven networking engine written in
Python and licensed under the MIT license”
• Has abstractions for handling concurrency
without going insane
• (in other words, it doesn’t use threads)
• http://twistedmatrix.com/trac/
• Not a web framework!
4. Launchpad?
• Collaboration and hosting platform for
software projects
• Particularly: Ubuntu
• Code hosting, bugs, translations, …
• The service: https://launchpad.net
• The code: https://launchpad.net/launchpad
(licensed under Affero GPLv3)
5. Why Twisted?
• Honestly: not completely sure, the decision
was made before I started
• Generally a high quality product, even back
in 2004 when Launchpad was new
• Wide range of supported protocols (SSH,
SFTP, HTTP), easy to add more
• Solid process management
6. Where Twisted?
• Everywhere there’s concurrency
• (Apart from the web application, that’s a
“thread per request” Zope web application)
• Codehosting, librarian, branch puller, code
imports, build farm, mirror prober…
• In more detail…
7. Codehosting
SSH
• A Twisted Conch server listens on
bazaar.launchpad.net:22
• Custom authentication: keys checked by
querying an XML-RPC server
• Custom file system for SFTP, maps external
paths to internal ones based on branch db id
• Launches and tracks “bzr serve” processes
for bzr+ssh branch access
8. Librarian
• Simple application for storing files
• Written using twisted.web and a simple
custom upload protocol
• Very simple: upload files, get an HTTP URL to
download them from
• Simple, but very effective; manages many
terabytes of data
9. Branch Puller
• Copies branch data from where it is
uploaded to a read only area
• Uses Twisted for process management, not
network access
• Twisted code quite generic: dispatches jobs
to subprocesses, monitors them for activity
• Will talk more about this “ProcessMonitor”
later in the talk
10. Code Imports
• As far as Twisted usage goes, similar to puller
• Runs code import in subprocess, monitors
for activity, informs database of progress
• Can take from seconds to weeks to
complete
11. The Build Farm
(a.k.a. Soyuz)
• XML-RPC client and server for dispatching
builds to builders
• The sort of environment where things go
wrong a lot; the code is by now very robust
against timeouts and similar failures
12. The Build Farm
(a.k.a. Soyuz)
• The firewall does not allow build machines
(buildds) to make network connections
• They run an XML-RPC server that has
methods like:
• “ping”: are you alive
• “status”: what are you doing
• “build”: start doing this
• Runs build as subprocess, monitors output
13. The Build Farm
(a.k.a. Soyuz)
• Buildd-manager is a daemon process that
periodically:
• calls the “status” method on every builder
(in parallel), then
• dispatches pending builds to idle builders
• fetches completed builds from builders
over HTTP
14. Mirror prober
• Checks that Ubuntu mirrors are up to date
• Highly parallel HTTP client
• Robust timeout handling
15. Quick Twisted
Jargon Primer 1
• Deferred: a result you don’t have yet
• E.g. the result of making an XML-RPC call
across the network
• Events: successfully got result, failed
somehow
• Interesting fact: if some operation returns a
Deferred, you need to worry about it failing
– Deferreds highlight “integration points”
16. Quick Twisted
Jargon Primer 2
• A Protocol represents a network connection:
• Handles data in an asynchronous manner
• Events: “connection made”, “data received”,
“connection lost”
• A ProcessProtocol represents a subprocess:
• Similar, but processes have multiple streams
• “connection lost” becomes “process exited”
17. Example:
ProcessMonitorProtocol
• Use case:
• Run a subprocess
• Report its activity and output so that it
can be summarized on a web page
• Kill if no progress shown for too long
• Kill harder (SIGKILL) if it doesn’t die
after SIGINT
• Builds on ProcessProtocol
18. Example:
ProcessMonitorProtocol
• Race conditions galore:
• Process exits just as you’re reporting
progress
• An attempt to report progress fails just as
you receive output
• Production experience helped us beat these
out of the code :-)
19. Example:
ProcessMonitorProtocol
• ProcessMonitorProtocol serializes notifications
and event handling with a DeferredLock – a
convenience that essentially prevents
callbacks from one deferred running until
another’s have completed
20. Example:
ProcessMonitorProtocol
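class Example(ProcessMonitorProtocol):
    """Reports activity on all output.

    self.endpoint is an XML-RPC proxy.
    """

    def outReceived(self, data):
        # Any output counts as progress, so restart the inactivity timer.
        self.resetTimeout()
        # Report progress; notifications are serialized by the base class.
        self.runNotification(
            self.endpoint.callRemote, "progress")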
21. Other Canonical
uses of Twisted
• Landscape (system management/monitoring):
• client side: various Twisted processes
talking over DBUS
• server side: long running/unruly processes
managed by a Twisted daemon
• Ubuntu One (“your personal cloud”):
• File sharing client and server both
implemented using Twisted