How my visualization tools use little memory:
  A tale of incrementalization and laziness

     Eugene Kirpichov <ekirpichov@gmail.com>

      St.-Petersburg Functional Programming User Group


                      Nov 7, 2012
Outline

   Introduction

   splot

   tplot

   Types of stream operations

   The transition

   Optimization

   Lessons learnt
Introduction


   http://jkff.info/software/timeplotters
   Tools for visualizing program behavior from logs, optimized for
   one-liners.
       timeplot — quantitative graphs about multiple event
       streams.
            Histograms of event durations, of types of events, regular
            line/dot plots etc.
       splot — Gantt-like chart about a large number of concurrent
       processes.
            Bird's-eye view of thread/process/machine interaction patterns
Live demo




            Live demo
Why optimize




  Because they were slow and memory-hungry (boxing, thunks).
      Took minutes for hundreds of thousands of events
      Took hours or crashed for millions of events
          Crashing was the last straw. I couldn’t do my job.
How to optimize




   Rewrite in C or Java? We can do better!
       It would be a defeat :)
       Would rewrite and re-debug everything
       Would learn nothing
   Turned out worth it.
splot


   The easy one.


 Before                              After
   1. Read input into a list           1. Read input into a list
   2. Traverse to calibrate axes       2. Traverse to calibrate axes
   3. Traverse to render               3. Read input into a list again
                                       4. Traverse to render
                                     Lazy IO does the rest.
   Cost: no output to window, no input from stdin.
   Code tour.
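
The "After" column can be sketched roughly as below. This is a minimal illustration, not splot's actual code: the `Event` type, the "<time> <track>" line format, and the names `parseEvent`, `timeRange`, `renderLine` and `twoPass` are all assumptions made for the example. Because each pass walks its own lazily read list, neither list has to be held in memory as a whole.

```haskell
import Data.List (foldl')

-- Hypothetical event type: (timestamp, track name). Not splot's real one.
type Event = (Double, String)

-- Parse a line of the illustrative form "<time> <track>".
parseEvent :: String -> Event
parseEvent line = case words line of
  (t : rest) -> (read t, unwords rest)
  []         -> error "parseEvent: empty line"

-- Pass 1: a strict left fold that keeps only the running (min, max) time.
timeRange :: [Event] -> (Double, Double)
timeRange = foldl' step (1 / 0, -1 / 0)
  where
    step (lo, hi) (t, _) =
      let lo' = min lo t
          hi' = max hi t
      in lo' `seq` hi' `seq` (lo', hi')

-- Pass 2 stand-in for rendering: place each event on a 0..1 time axis.
renderLine :: (Double, Double) -> Event -> String
renderLine (lo, hi) (t, track) = track ++ " @ " ++ show ((t - lo) / (hi - lo))

-- Read the file twice; each lazy list can be collected as it is consumed.
twoPass :: FilePath -> IO [String]
twoPass path = do
  range <- timeRange . map parseEvent . lines <$> readFile path
  range `seq` return ()  -- finish pass 1 before starting pass 2
  events <- map parseEvent . lines <$> readFile path
  return (map (renderLine range) events)
```

Reading the input twice is also why this approach needs a real file: a stream such as stdin cannot be replayed for the second pass.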
tplot: why it can NOT be done




   The hard one.
       Complex data flow: events:tracks = M:M.
       Uses Chart, which keeps data in memory and for good reasons.
   Code tour: old version.
tplot: why it CAN be done




   Reading too much = drawing too much ⇒ Chart is not a problem.

   Building “data to render” for Chart in 1 pass seemed possible.

   Code tour: PlotData, Render.hs
tplot: main idea



   Push-based:
       List representation unchanged and hidden
       Push item
       Get result (at any moment)

   Code tour: StreamSummary.
   Live example: average + profiling.
   Applicative average.
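
A push-based summary with an applicative average might look like the sketch below. It is a reconstruction under assumptions: the record fields `push` and `result` and the helpers `foldSummary`, `sumS`, `countS`, `averageS` and `summarize` are illustrative names, and the real StreamSummary module may be shaped differently.

```haskell
import Data.List (foldl')

-- Hypothetical push-based summary: the internal state is hidden in closures.
data Summary a r = Summary
  { push   :: a -> Summary a r  -- feed one item, get the updated summary
  , result :: r                 -- current answer, available at any moment
  }

-- Build a summary from a left fold; the state is forced on every step.
foldSummary :: (s -> a -> s) -> s -> (s -> r) -> Summary a r
foldSummary step s0 done = go s0
  where go s = s `seq` Summary { push = \a -> go (step s a), result = done s }

instance Functor (Summary a) where
  fmap f (Summary p r) = Summary (fmap f . p) (f r)

-- Two summaries combined applicatively consume the same input in one pass.
instance Applicative (Summary a) where
  pure x = Summary (const (pure x)) x
  Summary pf f <*> Summary px x = Summary (\a -> pf a <*> px a) (f x)

sumS :: Summary Double Double
sumS = foldSummary (+) 0 id

countS :: Summary a Int
countS = foldSummary (\n _ -> n + 1) 0 id

-- Average = sum / count, computed in a single traversal.
averageS :: Summary Double Double
averageS = (/) <$> sumS <*> (fromIntegral <$> countS)

-- Driver: push every item, forcing the summary at each step.
summarize :: Summary a r -> [a] -> r
summarize s = result . foldl' push s
```

With these definitions, `summarize averageS [1, 2, 3, 4]` traverses the list once, and `result` can be read after any number of pushes.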
Types of stream operations




        Concept     Logical type          Actual type
       Summarize     [a] → b             Summary a b
       Transform    [a] → [b]      Summary b r → Summary a r
        Generate     a → [b]         Summary b r → (a → r)


   Composition pipelines become quite funny.

   Code tour: RLE etc.
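
The three rows of the table can be illustrated as follows. This is a sketch under assumptions: the `Summary` record and every function name here (`mapS`, `filterS`, `rleS`, `upToG`, ...) are made up for the example, and the RLE shown is one possible push-based formulation rather than the one from the code tour.

```haskell
import Data.List (foldl')

-- Hypothetical push-based summary, modelled on the slide's "Summary a r".
data Summary a r = Summary { push :: a -> Summary a r, result :: r }

-- Summarize ([a] -> b): two small consumers.
sumS :: Summary Int Int
sumS = go 0
  where go s = s `seq` Summary (\a -> go (s + a)) s

listS :: Summary a [a]  -- keeps every item; for demonstration only
listS = go []
  where go acc = Summary (\a -> go (a : acc)) (reverse acc)

-- Transform ([a] -> [b]): wrap the downstream consumer.
mapS :: (a -> b) -> Summary b r -> Summary a r
mapS f s = s `seq` Summary (\a -> mapS f (push s (f a))) (result s)

filterS :: (a -> Bool) -> Summary a r -> Summary a r
filterS p s =
  s `seq` Summary (\a -> filterS p (if p a then push s a else s)) (result s)

-- Run-length encoding as a transform; the pending run is flushed into
-- a copy of the downstream whenever a result is requested.
rleS :: Eq a => Summary (a, Int) r -> Summary a r
rleS down = Summary (\a -> run a 1 down) (result down)
  where
    run x n d = n `seq` d `seq` Summary next (result (push d (x, n)))
      where
        next a
          | a == x    = run x (n + 1) d
          | otherwise = run a 1 (push d (x, n))

-- Generate (a -> [b]): push every generated item, then read the result.
upToG :: Summary Int r -> Int -> r
upToG s n = result (go 1 s)
  where
    go i acc
      | i > n     = acc
      | otherwise = let acc' = push acc i in acc' `seq` go (i + 1) acc'

summarize :: Summary a r -> [a] -> r
summarize s = result . foldl' push s
```

One way pipelines end up looking unusual: in `upToG (filterS even (mapS (* 10) sumS)) 5` the final consumer is written innermost, while the data flows from the outside in (generate 1..5, keep the even numbers, multiply by 10, sum).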
tplot: new architecture




   New architecture:
       Push-based builders for all plot types
       Driver loop to feed input events to output tracks
       When done, summarize and render
   Code tour: driver loop, plots (vs old code).
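
A driver loop of this kind could be sketched as below. Everything here is hypothetical (the `Event` and `Track` types, `feed`, `drive`, and the use of `Data.Map.Strict` from the containers package); it only shows the shape: one event may feed several tracks, each track owns a push-based builder, and results are read once the input is exhausted.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as M

-- Hypothetical push-based summary used as a plot builder.
data Summary a r = Summary { push :: a -> Summary a r, result :: r }

-- Hypothetical event: (timestamp, key).
type Event = (Double, String)

-- A track decides which events it wants and how to accumulate them.
data Track r = Track
  { accepts :: Event -> Bool
  , builder :: Summary Event r
  }

-- Example builder: count the events seen so far.
countS :: Summary a Int
countS = go 0
  where go n = n `seq` Summary (\_ -> go (n + 1)) n

-- Offer one event to one track, forcing the updated builder.
feed :: Event -> Track r -> Track r
feed e t
  | accepts t e = let b = push (builder t) e in b `seq` t { builder = b }
  | otherwise   = t

-- Driver loop: offer every event to every track, then read all results.
drive :: M.Map String (Track r) -> [Event] -> M.Map String r
drive tracks = M.map (result . builder) . foldl' step tracks
  where step ts e = M.map (feed e) ts
```

The strict map and the `seq` in `feed` are there so that each event is fully absorbed before the next one is read; rendering would then start from the map that `drive` returns.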
tplot: the transition




   I wanted to keep things working all the time.
     1. Change interface — make the change possible.
     2. Change implementation — make the change real.
tplot: the transition


   Before: Completely non-incremental.
       (interface) Separate building and rendering
       Split into modules
       (interface) Explicit 2 passes, both potentially incremental
       Toying with incremental combinators to get a feeling for them
       (interface) Make all plots have an incremental interface
       (but actually use toList bridge)
       (implementation) Incrementalize plots one by one
   After: Completely incremental.
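
The "incremental interface, but actually a toList bridge" step can be pictured like this. It is a sketch, not the project's code: `viaList` and the max-gap example are invented for illustration. The point is that callers switch to the push interface first, and each plot is made truly incremental later without touching them again.

```haskell
-- Hypothetical push-based summary.
data Summary a r = Summary { push :: a -> Summary a r, result :: r }

-- Bridge (hypothetical name): any list function behind the push interface.
-- It still accumulates the whole input, so it saves no memory yet.
viaList :: ([a] -> r) -> Summary a r
viaList f = go []
  where go acc = Summary (\a -> go (a : acc)) (f (reverse acc))

-- Old-style, list-based computation: largest gap between timestamps.
maxGapOld :: [Double] -> Double
maxGapOld ts = maximum (0 : zipWith (-) (drop 1 ts) ts)

-- Step 1: incremental interface, old implementation.
maxGapBridged :: Summary Double Double
maxGapBridged = viaList maxGapOld

-- Step 2: truly incremental; keeps only the previous timestamp and best gap.
maxGapS :: Summary Double Double
maxGapS = Summary first 0
  where
    first t = go t 0
    go prev best = prev `seq` best `seq`
      Summary (\t -> go t (max best (t - prev))) best
```

Both versions have the same type, so swapping `maxGapBridged` for `maxGapS` is a local change that can be checked by comparing their results on the same input.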
Not so easy



   Memory leaks due to insufficient strictness:
   “push isn’t pushing hard enough.”

   Question
   What remains unevaluated until too late, and why?

   Profilers didn’t help at all. Debug.Trace is insufficient: it outputs
   lines and I need hierarchy, not sequence.
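
One common shape of such a leak, shown as a minimal sketch (the `Summary` record and both sums are illustrative, and this is not necessarily the leak that occurred in tplot): a push that stores an unevaluated expression instead of a value.

```haskell
import Data.List (foldl')

-- Hypothetical push-based summary.
data Summary a r = Summary { push :: a -> Summary a r, result :: r }

-- Leaky: each push can wrap the old state in another unevaluated (s + a),
-- so a chain of thunks grows until the result is finally demanded.
sumLazy :: Summary Int Int
sumLazy = go 0
  where go s = Summary (\a -> go (s + a)) s

-- Strict: the state is forced whenever the new summary is forced.
sumStrict :: Summary Int Int
sumStrict = go 0
  where go s = s `seq` Summary (\a -> go (s + a)) s

-- The driver must force each new summary too, hence foldl' and not foldl.
summarize :: Summary a r -> [a] -> r
summarize s = result . foldl' push s
```

Both versions return the same answer, which is what makes the problem hard to spot: the difference shows up only in memory use on large inputs.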
Enter Debug.HTrace




  cabal install htrace
  “Code” “tour”. Live demo: average.
Why not X?




  Iteratee, conduit, pipes, ...
       Scary, so I tried to get as far as I could without them
       Ended up very small, simple and nice, so I didn’t look back
       Also educational
Lessons learnt




       Lazy IO is ok for this task
       Debugging laziness is hard: I wouldn’t start a high-risk
       real-time project in Haskell
       Profilers are almost useless for debugging laziness
       Debug.Trace is better, Debug.HTrace much better
       Push-based incremental processing is fun and easy
