PSS Guest Lecture

Asynchronous IO Programming

20th April 2006

by Henrik Thostrup Jensen

  1. PSS Guest Lecture: Asynchronous IO Programming, 20th April 2006, by Henrik Thostrup Jensen
  2. Outline
      ● Asynchronous IO
          – What, why, when
          – What about threads
      ● Programming Asynchronously
          – Berkeley sockets, select, poll
          – epoll, KQueue
          – Posix AIO, libevent, Twisted
  3. What is Asynchronous IO
      ● Also called event-driven or non-blocking
      ● Performing IO without blocking
          – Changing blocking operations to be non-blocking
          – Requires program reorganization
      ● The brain must follow
      ● The program must never block
          – Instead ask what can be done without blocking
  4. Why Asynchronous IO
      ● Concurrency is really difficult
          – Coordinating threads and locks is non-trivial
          – Very non-trivial
          – Don't make things hard
      ● Threads and locks are not free
          – They take up resources and require kernel switches
      ● Often true concurrency is not needed
      ● Asynchronous programs are race free
          – By definition
  5. Drawbacks
      ● The program must not block
          – Only a problem when learning
          – And with crappy APIs
      ● Removes "normal" program flow
          – Due to mixing of IO channels
          – Requires state machinery or continuations
      ● Hard to use multiple CPU cores
          – But doable
  6. Program Structure
      ● One process must handle several sockets
          – State is not shared between connections
          – Often solved using state machines
          – Makes program flow more restricted
      ● Debugging is different, but not harder
          – Threads and locks are darn hard to debug
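The per-connection state machines mentioned on this slide could look roughly like the sketch below. The wire format (a decimal length, a newline, then that many bytes of body) and the names MessageParser and feed are made up for illustration; they are not from the lecture.

```python
class MessageParser:
    """State machine for a hypothetical wire format: b"<length>\\n<body>"."""

    def __init__(self):
        self.state = "WAIT_LENGTH"   # or "WAIT_BODY"
        self.buffer = b""            # bytes received but not yet consumed
        self.expected = 0            # body length announced by the header
        self.messages = []           # complete message bodies

    def feed(self, data):
        """Called with whatever bytes the socket delivered; never blocks."""
        self.buffer += data
        while True:
            if self.state == "WAIT_LENGTH":
                if b"\n" not in self.buffer:
                    return                       # header incomplete, wait for more
                header, self.buffer = self.buffer.split(b"\n", 1)
                self.expected = int(header)
                self.state = "WAIT_BODY"
            else:
                if len(self.buffer) < self.expected:
                    return                       # body incomplete, wait for more
                self.messages.append(self.buffer[:self.expected])
                self.buffer = self.buffer[self.expected:]
                self.state = "WAIT_LENGTH"


# One parser per connection: no state is shared between them.
parser = MessageParser()
parser.feed(b"5\nhe")        # header complete, body only partly there
parser.feed(b"llo3\nab")     # finishes "hello", starts the next message
parser.feed(b"c")            # finishes "abc"
```

The point of the design is that feed() returns as soon as it runs out of data, so the main loop can go and serve another socket.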
  7. Avoiding Asynchronous IO
      ● Not suitable for everything
      ● When true concurrency is needed
          – Very seldom
          – Utilizing several CPUs with shared data
      ● Or when badly suited
          – Long running computations
              ● Can be split, but bad abstraction
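To make "can be split" concrete, here is a minimal sketch in which a long computation is written as a generator that gives up control after each chunk of work. The function names and the chunk size are illustrative assumptions, and the round-robin runner stands in for a real event loop.

```python
from collections import deque


def sum_of_squares(n, chunk=1000):
    """Compute sum(i*i for i < n) in slices, yielding between slices."""
    total = 0
    for start in range(0, n, chunk):
        for i in range(start, min(start + chunk, n)):
            total += i * i
        yield                # hand control back so IO can be serviced
    return total


def run_cooperatively(tasks):
    """Round-robin over named generators; returns {name: result}."""
    results = {}
    queue = deque(tasks.items())
    while queue:
        name, gen = queue.popleft()
        try:
            next(gen)                    # run one slice of work
            queue.append((name, gen))    # not finished: back of the queue
        except StopIteration as stop:
            results[name] = stop.value   # the generator's return value
    return results


results = run_cooperatively({
    "small": sum_of_squares(10),
    "large": sum_of_squares(2500),
})
```

It works, but the computation now has to know about scheduling, which is the "bad abstraction" the slide complains about.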
  8. Are Threads Evil
      ● No, just misunderstood
          – Used when inappropriate
      ● Coordination is the problem
          – Often goes wrong
      ● Not free
          – Spawning a thread for each incoming connection
          – 500 incoming connections per second => problem
          – Thread pools used for compensating
  9. Setting Things Straight
      ● Async. IO and threads can do the same
          – Probably includes performance
      ● It's about having the best abstraction
          – Or: Making things easy
          – Concurrency is not known to simplify things
  10. Async. Programming
      ● First introduced with Berkeley sockets
      ● Then came select, later poll
      ● Recent trends
          – epoll, KQueue, Posix AIO, Twisted
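  11. Berkeley Sockets
      // needs <sys/socket.h>, <fcntl.h>, <unistd.h>, <errno.h>, <stdlib.h>
      int s = socket(family, type, protocol);
      fcntl(s, F_SETFL, O_NONBLOCK);
      // bind+listen or connect (also async.)
      void *buffer = malloc(1024);
      ssize_t retval = read(s, buffer, 1024);
      // read() returns -1 and sets errno; it never returns -EAGAIN itself
      if (retval == -1 && errno == EAGAIN) {
          // No data available yet: reschedule command
      }
      ● Unsuitable for many sockets
          – Many kernel switches
          – The try/except paradigm is ill suited for IO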
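The same try-and-reschedule pattern can be sketched in Python, where a read that would block raises BlockingIOError (the EAGAIN / EWOULDBLOCK case) instead of returning an error code. The helper name try_read and the use of a local socket pair are assumptions made to keep the example self-contained.

```python
import socket


def try_read(sock, size=1024):
    """Return received bytes, or None if the read would have blocked."""
    try:
        return sock.recv(size)
    except BlockingIOError:      # errno EAGAIN / EWOULDBLOCK
        return None


a, b = socket.socketpair()       # two connected local sockets
a.setblocking(False)             # on Unix this sets O_NONBLOCK
first = try_read(a)              # nothing has been sent yet
b.sendall(b"ping")
second = try_read(a)             # the data is now waiting
a.close()
b.close()
```

With many sockets this means one system call per socket per attempt, which is the "many kernel switches" drawback listed on the slide.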
  12. Select
      int select(int n, fd_set *readfds, fd_set *writefds,
                 fd_set *exceptfds, struct timeval *timeout);
      ● Monitors three sets of file descriptors for changes
          – Tells which actions can be done without blocking
      ● pselect() variant to avoid signal race
      ● Has a limit on the maximum number of file descriptors
      ● Availability: *nix, BSD, Windows
          – Reasonably portable
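A small sketch of the same call through Python's select module, which wraps select() and takes the three sets as lists. The helper name readable_sockets and the socket pairs are illustrative assumptions.

```python
import select
import socket


def readable_sockets(socks, timeout=0.0):
    """Return the subset of `socks` that can be read without blocking."""
    readable, _writable, _errored = select.select(socks, [], [], timeout)
    return readable


a, b = socket.socketpair()
c, d = socket.socketpair()
b.sendall(b"x")                              # only `a` now has data waiting
ready = readable_sockets([a, c], timeout=1.0)
for sock in (a, b, c, d):
    sock.close()
```

Note that the full list of sockets is handed to the kernel on every call.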
  13. Poll
      int poll(struct pollfd *ufds, unsigned int nfds, int timeout);
      struct pollfd {
          int fd;           /* file descriptor */
          short events;     /* requested events */
          short revents;    /* returned events */
      };
      ● Basically a select() with a different API
      ● No limit on file descriptors
      ● Availability: *nix, BSD
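For comparison, a sketch of the same readiness check with Python's select.poll, which is available on Unix-like systems only. The helper name poll_readable is an assumption; each register() call corresponds to filling in one pollfd entry.

```python
import select
import socket


def poll_readable(socks, timeout_ms=1000):
    """Return the sockets in `socks` that poll() reports as readable."""
    poller = select.poll()
    by_fd = {}
    for sock in socks:
        poller.register(sock.fileno(), select.POLLIN)   # requested events
        by_fd[sock.fileno()] = sock
    ready = poller.poll(timeout_ms)                     # list of (fd, revents)
    return [by_fd[fd] for fd, revents in ready if revents & select.POLLIN]


a, b = socket.socketpair()
c, d = socket.socketpair()
d.sendall(b"x")                  # only `c` has data waiting
ready = poll_readable([a, c])
for sock in (a, b, c, d):
    sock.close()
```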
  14. State vs. Stateless
      ● select() and poll() are stateless
          – This forces them to be O(n) operations
      ● The kernel must scan all file descriptors
          – This tends to suck for large numbers of file descriptors
      ● Having state means more code in the kernel
          – Continuously monitors a set of file descriptors
          – Turns an O(n) operation into O(1)
  15. Epoll 1/2
      int epfd = epoll_create(EPOLL_QUEUE_LEN);
      int client_sock = socket(family, type, protocol);
      static struct epoll_event ev;
      ev.events = EPOLLIN | EPOLLOUT | EPOLLERR;
      ev.data.fd = client_sock;
      int res = epoll_ctl(epfd, EPOLL_CTL_ADD, client_sock, &ev);

      // Main loop
      struct epoll_event events[MAX_EVENTS];  // buffer that epoll_wait() fills in
      while (1) {
          int nfds = epoll_wait(epfd, events, MAX_EVENTS, TIMEOUT);
          for (int i = 0; i < nfds; i++) {
              int fd = events[i].data.fd;
              handle_io_on_socket(fd);
          }
      }
  16. Epoll 2/2
      ● Has state in the kernel
          – File descriptors must be added and removed
          – Slightly more complex API
      ● Outperforms select and poll
          – When the number of open files is high
      ● Availability: Linux only (introduced in 2.5)
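The add/remove bookkeeping can be sketched with Python's select.epoll, which exists only on Linux. The function name epoll_demo and the socket pair are assumptions for illustration; the register and unregister calls play the role of epoll_ctl with EPOLL_CTL_ADD and EPOLL_CTL_DEL.

```python
import select
import socket


def epoll_demo():
    """Register one socket, poll before and after data arrives (Linux only)."""
    a, b = socket.socketpair()
    ep = select.epoll()
    ep.register(a.fileno(), select.EPOLLIN)    # interest is stored in the kernel
    before = ep.poll(0)                        # nothing readable yet
    b.sendall(b"x")
    after = ep.poll(1.0)                       # list of (fd, event mask)
    summary = [(fd == a.fileno(), bool(mask & select.EPOLLIN))
               for fd, mask in after]
    ep.unregister(a.fileno())                  # must be removed explicitly
    ep.close()
    a.close()
    b.close()
    return before, summary
```

Unlike select() and poll(), the set of descriptors is passed once, and each wait only returns the ones that are ready.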
  17. KQueue
      ● We've had enough APIs for now
      ● Works much like epoll
      ● Also does disk IO, signals, file/process events
      ● Arguably best interface
      ● Availability: BSD
  18. libevent
      ● Event based library
          – Portable (Linux, BSD, Solaris, Windows)
          – Slightly better abstraction
      ● Can use select, poll, epoll, KQueue, /dev/poll
          – Possible to change between them
      ● Can also be used for benchmarking :-)
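libevent itself is a C library, so the sketch below does not use it. It uses Python's selectors module as an analogous abstraction to show what "possible to change between them" means: the same code runs on top of select() or whatever backend the platform prefers. The function name wait_for_data and the "conn-a" label are made up for the example.

```python
import selectors
import socket


def wait_for_data(selector_class):
    """Register one socket with the given backend and wait until readable."""
    a, b = socket.socketpair()
    sel = selector_class()
    sel.register(a, selectors.EVENT_READ, data="conn-a")
    b.sendall(b"hello")
    events = sel.select(timeout=1.0)           # list of (key, event mask)
    labels = [key.data for key, mask in events if mask & selectors.EVENT_READ]
    payload = a.recv(1024)
    sel.close()
    a.close()
    b.close()
    return labels, payload


# Same code, two backends: plain select() and the platform default
# (epoll on Linux, kqueue on BSD and macOS).
backends = [selectors.SelectSelector, selectors.DefaultSelector]
results = [wait_for_data(cls) for cls in backends]
```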
  19. libevent benchmark 1/2
      [Figure: benchmark chart, not preserved in the transcript]
  20. libevent benchmark 2/2
      [Figure: benchmark chart, not preserved in the transcript]
  21. Asynchronous Disk IO
      ● What about disk IO
          – Often blocks so little time that it can be ignored
          – But sometimes it cannot
              ● Databases are the prime target
      ● Sockets have been asynchronous for many years
          – With disk IO limping behind
      ● Posix AIO to the rescue
  22. Posix AIO
      ● Relatively new standard
          – Introduced in Linux in 2005
              ● Was in several vendor trees before that
          – Often emulated in libc using threads
      ● Also does vector operations
      ● API Sample:
      int aio_read(struct aiocb *aiocbp);
      int aio_suspend(const struct aiocb * const iocbs[], int niocb,
                      const struct timespec *timeout);
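The sketch below does not call aio_read() itself. It only illustrates the "emulated using threads" idea from the slide: the read is handed to a worker thread and the caller collects the result later. The function names and the temporary file are assumptions for illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    """Blocking read; runs on a worker thread, not in the main loop."""
    with open(path, "rb") as f:
        return f.read()


def demo():
    # Create a small file to read back.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"disk data")
        path = tmp.name
    try:
        with ThreadPoolExecutor(max_workers=1) as pool:
            future = pool.submit(read_file, path)   # start the read
            # The caller is free to do other work here.
            return future.result(timeout=5)         # wait for completion
    finally:
        os.unlink(path)
```

Submitting the job loosely corresponds to aio_read(), and waiting on the future to aio_suspend() followed by fetching the result.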
  23. Twisted
      ● An event-driven framework
      ● Lives at: www.twistedmatrix.com
      ● Started as game network library
      ● Rather large
          – Around 200K lines of Python
          – Basic support for 30+ protocols
      ● Has web, mail, ssh clients and servers
      ● Also has its own database API
      ● And security infrastructure (pluggable)
  24. Twisted vs. the world
      ● Probably the most advanced framework in its class
      ● Only real competitor is ACE
          – Twisted borrows a lot of inspiration from it
          – ACE is also historically important
      ● Some claim that java.nio and C# Async are also asynchronous frameworks
  25. Twisted Architecture
      ● Tries hard to keep things orthogonal
      ● Reactor-Transport-Factory-Protocol-Application
          – As opposed to mingled together libraries
      ● Makes changes very easy (once mastered)
          – Often is about combining things the right way
          – Also means more boilerplate code
  26. Twisted Echo Server
      from twisted.internet import reactor, protocol

      class Echo(protocol.Protocol):
          def dataReceived(self, data):
              self.transport.write(data)

      factory = protocol.ServerFactory()
      factory.protocol = Echo
      reactor.listenTCP(8000, factory)
      reactor.run()
      ● Lots of magic behind the curtain
          – Protocols, Factories and the Reactor
  27. The Twisted Reactor
      ● The main loop of Twisted
      ● Don't call us, we'll call you (framework)
          – You insert code into the reactor
      ● Also does scheduling
          – Future calls, looping calls
      ● Interchangeable
          – select, poll, WFMO, IOCP, CoreFoundation, KQueue
          – Integrates with GTK, Qt, wxWidgets
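What a reactor does can be sketched without Twisted. The MiniReactor class below is a toy stand-in written for this example, not Twisted's implementation or API: it waits for socket readiness, calls back into user code, and runs scheduled future calls.

```python
import heapq
import itertools
import selectors
import socket
import time


class MiniReactor:
    """Toy main loop: dispatches readable sockets and timed calls."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()
        self.timers = []                    # heap of (when, seq, func, args)
        self.sequence = itertools.count()   # tie-breaker for equal times
        self.running = False

    def add_reader(self, sock, callback):
        self.selector.register(sock, selectors.EVENT_READ, callback)

    def remove_reader(self, sock):
        self.selector.unregister(sock)

    def call_later(self, delay, func, *args):
        when = time.monotonic() + delay
        heapq.heappush(self.timers, (when, next(self.sequence), func, args))

    def stop(self):
        self.running = False

    def run(self):
        self.running = True
        while self.running:
            timeout = None
            if self.timers:
                timeout = max(0.0, self.timers[0][0] - time.monotonic())
            if self.selector.get_map():
                # Wait for IO, but no longer than the next timer allows.
                for key, _mask in self.selector.select(timeout):
                    key.data(key.fileobj)   # "we'll call you"
            elif timeout is not None:
                time.sleep(timeout)         # only timers left
            else:
                break                       # nothing left to wait for
            now = time.monotonic()
            while self.timers and self.timers[0][0] <= now:
                _when, _seq, func, args = heapq.heappop(self.timers)
                func(*args)


def demo():
    reactor = MiniReactor()
    log = []
    a, b = socket.socketpair()

    def on_readable(sock):
        log.append(sock.recv(1024))
        reactor.remove_reader(sock)
        reactor.stop()

    reactor.add_reader(a, on_readable)
    reactor.call_later(0.01, b.sendall, b"tick")   # a scheduled future call
    reactor.run()
    a.close()
    b.close()
    return log
```

User code never calls recv() on its own initiative; it registers callbacks and the loop decides when they run.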
  28. Factories and Protocols
      ● Factories produce protocols
          – On incoming connections
      ● A protocol represents a connection
          – Provides methods such as dataReceived
          – Which are called from the reactor
      ● The application is built on top of protocols
          – May use several transports, factories, and protocols
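The division of labour can be sketched with plain Python classes. These are simplified stand-ins written for this example: the method names echo Twisted's, but the signatures are simplified and the FakeTransport class is invented so that no network is needed.

```python
class FakeTransport:
    """Stands in for a socket; records what the protocol writes."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


class UppercaseProtocol:
    """One instance per connection; holds that connection's state."""

    def __init__(self, factory):
        self.factory = factory
        self.transport = None

    def connectionMade(self, transport):
        self.transport = transport
        self.factory.connections += 1

    def dataReceived(self, data):
        # In a real framework the reactor calls this when bytes arrive.
        self.transport.write(data.upper())


class UppercaseFactory:
    """Builds a fresh protocol for every incoming connection."""

    def __init__(self):
        self.connections = 0      # state shared across connections lives here

    def buildProtocol(self):
        return UppercaseProtocol(self)


# Simulate two incoming connections being handled by one factory.
factory = UppercaseFactory()
transports = [FakeTransport(), FakeTransport()]
protocols = []
for transport in transports:
    proto = factory.buildProtocol()
    proto.connectionMade(transport)
    protocols.append(proto)

protocols[0].dataReceived(b"hello")
protocols[1].dataReceived(b"world")
```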
  29. Twisted Deferreds
      ● The single most confusing aspect in Twisted
          – Removes the concept of stack execution
      ● Vital to understand as they are the program flow
          – An alternative to state machines
      ● Think of them as a one-shot continuation
          – Or: An object representing what should happen
          – Or: A promise that data will be delivered
          – Or: A callback
  30. Deferred Example
      ● Fetching a web page:
      from twisted.web.client import getPage

      def gotPage(page):
          print("I got the page: %s" % page)

      def fetchPage():  # wrapper added so that `return` is valid
          deferred = getPage("http://www.cs.aau.dk/")
          deferred.addCallback(gotPage)
          return deferred
      ● The getPage function takes time to complete
          – But blocking is prohibited
      ● When the page is retrieved the deferred will fire
          – And callbacks will be invoked
  31. Deferreds and Errors
      from twisted.web.client import getPage

      def gotPage(page):
          print("I got the page: %s" % page)

      def getPage_errorback(error):
          print("Didn't get the page, me so sad - reason: %s" % error)

      deferred = getPage("http://www.cs.aau.dk/")
      deferred.addCallback(gotPage)
      deferred.addErrback(getPage_errorback)
      ● Separates normal program flow from error handling
  32. Chaining Callbacks
      from twisted.web.client import getPage

      deferred = getPage("http://www.cs.aau.dk/")
      deferred.addCallback(gotPage)
      deferred.addCallback(changePage)
      deferred.addCallback(uploadPage)
      ● Often several sequential actions are needed
      ● The result from gotPage is provided as argument to changePage
      ● The execution stack disappears
          – Causes headaches during learning
          – Coroutines can make program more stack-like
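How results and errors travel down a callback chain can be shown with a toy class. MiniDeferred below is a simplified stand-in written for this example, not Twisted's Deferred: it only models passing each return value to the next callback and routing exceptions to the next errback.

```python
class MiniDeferred:
    """Toy one-shot result holder with a callback/errback chain."""

    def __init__(self):
        self.chain = []        # list of (callback, errback) pairs
        self.fired = False
        self.result = None

    def addCallbacks(self, callback, errback=None):
        self.chain.append((callback, errback))
        if self.fired:
            self._run()        # result already known: run immediately
        return self

    def addCallback(self, callback):
        return self.addCallbacks(callback, None)

    def addErrback(self, errback):
        return self.addCallbacks(None, errback)

    def callback(self, result):
        """Fire the deferred: the awaited result has arrived."""
        self.fired = True
        self.result = result
        self._run()

    def _run(self):
        while self.chain:
            callback, errback = self.chain.pop(0)
            failed = isinstance(self.result, Exception)
            handler = errback if failed else callback
            if handler is None:
                continue       # wrong kind of handler: pass result along
            try:
                self.result = handler(self.result)
            except Exception as exc:
                self.result = exc


def demo():
    log = []

    def got_page(page):
        log.append("got")
        return page                        # handed to the next callback

    def change_page(page):
        return page.replace("old", "new")

    def record(page):
        log.append(page)
        return page

    def upload_page(page):
        raise IOError("upload failed")     # switches to the error path

    def on_error(error):
        log.append("error: %s" % error)

    d = MiniDeferred()
    d.addCallback(got_page).addCallback(change_page).addCallback(record)
    d.addCallback(upload_page)
    d.addErrback(on_error)
    d.callback("old page")                 # the page "arrives"; the chain runs
    return log
```

Nothing in the chain runs until callback() is invoked, which is why there is no conventional call stack to follow.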
  33. Deferreds and Coroutines
      from twisted.internet import defer

      @defer.deferredGenerator
      def myFunction(self):
          # getPageStream is not defined on the slide
          d = defer.waitForDeferred(getPageStream("http://www.cs.aau.dk"))
          yield d
          stream = d.getResult()
          while True:
              d = defer.waitForDeferred(stream.read())
              yield d
              content = d.getResult()
              if content is None:
                  break
              print(content)
      ● Notice: yield instead of return
          – The function is reentrant
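The mechanism behind this style can be sketched with a plain generator and a small driver. Pending and run_coroutine are names invented for this example and are not part of Twisted; the sketch also assumes a Pending is never fired before the generator has yielded it.

```python
class Pending:
    """Bare-bones one-shot result holder (stand-in for a Deferred)."""

    def __init__(self):
        self.waiters = []

    def add_callback(self, fn):
        self.waiters.append(fn)

    def fire(self, value):
        for fn in self.waiters:
            fn(value)


def run_coroutine(gen):
    """Drive a generator: each yielded Pending resumes it when fired."""
    def step(value=None):
        try:
            pending = gen.send(value)    # run until the next yield
        except StopIteration:
            return                       # the coroutine has finished
        pending.add_callback(step)       # resume here when the result arrives
    step()


def demo():
    chunks = [Pending(), Pending(), Pending()]
    seen = []

    def reader():
        # Reads like sequential code, yet never blocks.
        for pending in chunks:
            content = yield pending
            if content is None:
                break
            seen.append(content)

    run_coroutine(reader())
    chunks[0].fire("first")
    chunks[1].fire("second")
    chunks[2].fire(None)                 # end of stream
    return seen
```

Each yield hands control back to the driver, and the generator's local variables survive until it is resumed, which is what makes the code look stack-like again.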
  34. Why Twisted
      ● Grasping the power of Twisted is difficult
          – Also hard to explain
      ● Makes cross protocol integration very easy
      ● Provides a lot of functionality and power
          – But one needs to know how to use it
      ● Drawbacks
          – Steep learning curve
          – Portability: Applications cannot be ported to Twisted
  35. Summary
      ● Asynchronous IO
          – An alternative to threads
          – Never block
          – Requires program reorganization
      ● Asynchronous Programming
          – Stateless: Berkeley sockets, select, poll
          – State: epoll, KQueue
          – Disk: Posix AIO
          – libevent and Twisted
      ● Exercises?