Reducers
                         A library and model for collection processing in Clojure




                                                                             Leonardo Borges
                                                                             @leonardo_borges
                                                                             http://www.leonardoborges.com
                                                                             http://www.thoughtworks.com
Thursday, 30 August 12
                         ...in 20 mins or less
Reducers huh? Here’s the gist




                         You get parallel versions of reduce, map and filter



                                             Ta-da! I’m done!

                                     and well under my 20 min limit :)

Alright, alright I’m kidding




How do reducers make parallelism possible?



                                   • JVM’s Fork/Join framework
                                   • Reduction Transformers




Before we start - this is bleeding edge stuff
                         Java requirements

                         • Fork/Join framework
                          • Java 7 [1] or
                          • Java 6 + the JSR166 jar [2]
                         Clojure requirements

                         • 1.5.0-* (this is still MASTER on github [3] as of 30/08/2012)


                                                                       [1] - http://jdk7.java.net/
                                                                       [2] - http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166.jar
                                                                       [3] - https://github.com/clojure/clojure
The Fork/Join Framework

                         •Based on divide and conquer
                         •Work stealing algorithm
                         •Uses deques - double ended queues.
                         •Progressively divides the workload into tasks, up to a threshold
                         •Once it finishes one task, it pops another one from its deque
                         •After at least two tasks have finished, results can be combined/joined
                         •Idle workers can pop tasks from the deques of workers which fall behind
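
The bullets above can be sketched directly against the JVM API. This is only an illustration, not how the reducers library itself is written: `fj-task`, `fj-sum` and `parallel-sum` are hypothetical names, the threshold is assumed to be at least 1, and a fresh pool is created per call purely to keep the example self-contained.

```clojure
(import '(java.util.concurrent ForkJoinPool ForkJoinTask))

(defn- fj-task
  "Wraps a zero-argument function in a ForkJoinTask."
  ^ForkJoinTask [^Callable f]
  (ForkJoinTask/adapt f))

(defn- fj-sum
  "Sums vector v, halving it until a segment has at most `threshold` items.
   Assumes threshold >= 1."
  [v threshold]
  (if (<= (count v) threshold)
    (reduce + v)                               ; small enough: work sequentially
    (let [mid   (quot (count v) 2)
          right (fj-task #(fj-sum (subvec v mid) threshold))]
      (.fork right)                            ; queue the right half; idle workers may steal it
      (+ (fj-sum (subvec v 0 mid) threshold)   ; process the left half on this worker
         (.join right)))))                     ; combine/join the two partial results

(defn parallel-sum
  "Runs fj-sum inside a Fork/Join pool and returns the total."
  [v threshold]
  (let [pool (ForkJoinPool.)]
    (try
      (.invoke pool (fj-task #(fj-sum v threshold)))
      (finally (.shutdown pool)))))

(parallel-sum (vec (range 1000)) 100) ;; 499500
```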




Text is boring


Fork/Join algorithm - simplified view

[Animation: two workers, Worker 1 and Worker 2, each with its own deque]

                         • Workload is put in “deques”
                         • ...and progressively halved
                         • ...up to a configured threshold
                         • Workers process their tasks and combine the results
                         • Idle workers can “steal” items from other workers
                         • Combining continues until the final result
Let’s talk about Reducers

                         Motivations

                         • Performance
                          • via less allocation
                          • via parallelism (leverage Fork/Join)

                         Issues

                         • Lists and Seqs are sequential
                         • map / filter implies order

A closer look at what map does
                              ;; a naive map implementation
                              (defn map [f coll]
                                (if (seq coll)
                                  (cons (f (first coll)) (map f (rest coll)))
                                  '()))

                         • Recursion
                         • Order
                         • Laziness (not shown)
                         • Consumes List
                         • Builds List

                         Oh, and it also applies the function to each item before putting
                         the result into the new list. This is what mapping means!

Reduction Transformers


                         • Idea is to build map / filter on top of reduce to break from sequentiality
                         • map / filter then builds nothing and consumes nothing
                         • It changes what reduce means to the collection by transforming the reducing
                         functions




What map is really all about
                         (defn mapping [f]
                           (fn [f1]
                             (fn [result input]
                               (f1 result (f input)))))
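
The same trick works for filter. The sketch below is not from the slides: `filtering` is written by analogy with `mapping` (which is repeated here so the block runs on its own), and it shows that reduction transformers compose with plain `comp`.

```clojure
;; mapping, as on the slide: transforms a reducing function f1
(defn mapping [f]
  (fn [f1]
    (fn [result input]
      (f1 result (f input)))))

;; filtering, by analogy: only passes inputs that satisfy pred on to f1
(defn filtering [pred]
  (fn [f1]
    (fn [result input]
      (if (pred input)
        (f1 result input)
        result))))

(reduce ((filtering even?) +) 0 [1 2 3 4])
;; 6

;; keep the even inputs, then increment them; no intermediate list is built
(reduce ((comp (filtering even?) (mapping inc)) conj) [] [1 2 3 4])
;; [3 5]
```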




But wait!
                         If map doesn’t consume the list any longer, who does?

                             • reduce does!
                             • Since Clojure 1.4, reduce lets the collection reduce itself
                              (through the CollReduce protocol; fold does the same through CollFold)
                              • Think of what this means for tree-like structures such as
                               vectors
                             • This is key to leveraging the Fork/Join framework




Now we can use mapping to create reducing functions
                               (reduce ((mapping inc) +) 0 [1 2 3 4])
                               ;; 14




                                    ;; ((mapping inc) +) is equivalent to:
                                    (fn [result input]
                                      (+ result (inc input)))




                             (reduce ((mapping inc) conj) [] [1 2 3 4])
                             ;; [2 3 4 5]




                                    ;; ((mapping inc) conj) is equivalent to:
                                    (fn [result input]
                                      (conj result (inc input)))


                                  But it feels awkward to use it in this form
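
For comparison, this is roughly how the same computations read with the library's own `r/map` and `r/filter`, which take the collection as an argument and return something reducible. A small sketch, assuming Clojure 1.5 or later with `clojure.core.reducers` on the classpath:

```clojure
(require '[clojure.core.reducers :as r])

;; r/map returns a reducible "recipe"; nothing is computed until reduce runs
(reduce + 0 (r/map inc [1 2 3 4]))
;; 14

;; into is built on reduce, so it can pour a reducer into a vector
(into [] (r/map inc [1 2 3 4]))
;; [2 3 4 5]

(into [] (r/filter even? (r/map inc [1 2 3 4])))
;; [2 4]
```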

What do we have so far?


                         • Performance has improved thanks to fewer allocations
                          • No intermediate lists need to be built (see Haskell’s Stream Fusion [4])
                         • However, reduce is still sequential




                                                                                        [4] - http://bit.ly/streamFusion
Enter fold

                         • Takes the sequentiality out of foldl, foldr and reduce
                         • Potentially parallel (falls back to standard reduce otherwise)
                         • Reduce/Combine strategy (think Fork/Join Framework)
                         • Segments the collection
                         • Runs multiple reduces in parallel
                         • Uses a combining function to join/reduce results


                                    (defn fold [combinef reducef coll]
                                      ...)
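
A few ways to call `fold`, as a sketch rather than an exhaustive tour. The vector `v` and the partition size of 100 are made up for illustration; with only two arguments the reducing function doubles as the combining function.

```clojure
(require '[clojure.core.reducers :as r])

(def v (vec (range 1000)))

;; reducef only: + is used both to reduce segments and to combine them
(r/fold + v)
;; 499500

;; explicit combinef and reducef
(r/fold + + v)
;; 499500

;; explicit partition size (the default is 512)
(r/fold 100 + + v)
;; 499500

;; reducef and combinef can differ: count items per segment, then add the counts
(r/fold + (fn [acc _] (inc acc)) v)
;; 1000
```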


The combining function is a monoid
                         • An associative binary function with an identity element
                         • All the following functions are equivalent monoids

                                                      +
                                                      (+ 2 3) ; 5
                                                      (+) ; 0




                                                (defn my-+
                                                  ([] 0)
                                                  ([a b] (+ a b)))

                                                (my-+ 2 3) ; 5
                                                (my-+) ; 0




                                (require '[clojure.core.reducers :as r])

                                (def my-+
                                  (r/monoid + (fn [] 0)))

                                (my-+ 2 3) ; 5
                                (my-+) ; 0
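
To make the idea concrete, here is a hand-rolled helper in the spirit of `r/monoid`. It is a sketch for illustration only (`my-monoid` and `merge-counts` are hypothetical names): the zero-argument arity supplies the identity value and the two-argument arity combines.

```clojure
(defn my-monoid
  "Builds a function that returns (ctor) with no args and (op a b) with two."
  [op ctor]
  (fn
    ([] (ctor))
    ([a b] (op a b))))

;; a monoid over maps of counts: the identity is {}, combining adds the counts
(def merge-counts (my-monoid (partial merge-with +) hash-map))

(merge-counts)
;; {}

(merge-counts {:a 1} {:a 2 :b 1})
;; {:a 3, :b 1}
```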



fold by examples


                         ;; all examples assume the reducers library is available as r
                         (ns reducers-playground.core
                           (:require [clojure.core.reducers :as r]))




fold by examples:
                          increment all even positive integers up to 10 million
                                          and add them all up
                     ;; these were taken from Rich’s reducers talk
                     (def my-vector (into [] (range 10000000)))

                     (time (reduce + (map inc (filter even? my-vector))))
                     ;; 500msecs

                     (time (reduce + (r/map inc (r/filter even? my-vector))))
                     ;; 260msecs

                     (time (r/fold + (r/map inc (r/filter even? my-vector))))
                     ;; 130msecs
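
The timings depend on the machine, but the three forms should always agree on the answer. A quick sanity check on a vector small enough to verify by hand (`small-vector` is made up for this sketch):

```clojure
(require '[clojure.core.reducers :as r])

(def small-vector (vec (range 10)))

;; evens are 0 2 4 6 8; incremented they are 1 3 5 7 9, which sum to 25
(def seq-sum  (reduce + (map inc (filter even? small-vector))))
(def red-sum  (reduce + (r/map inc (r/filter even? small-vector))))
(def fold-sum (r/fold + (r/map inc (r/filter even? small-vector))))

[seq-sum red-sum fold-sum]
;; [25 25 25]
```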

fold by examples:
                                    standard word count

                (def wiki-dump (slurp "subset-wiki-dump50")) ;50 MB

                (defn count-words [text]
                  (reduce
                   (fn [memo word]
                      (assoc memo word (inc (get memo word 0))))
                   {}
                   (map #(.toLowerCase %) (into [] (re-seq #"\w+" text)))))

                (time (count-words wiki-dump)) ;; 45 secs


fold by examples:
                                     parallel word count

                (def wiki-dump (slurp "subset-wiki-dump50")) ;50 MB

                (defn p-count-words [text]
                  (r/fold
                   ;; combining fn: (partial merge-with +) merges the partial computations;
                   ;; hash-map is called with no arguments to provide a seed value
                   (r/monoid (partial merge-with +) hash-map)
                   (fn [memo word]
                     (assoc memo word (inc (get memo word 0))))
                   (r/map #(.toLowerCase %) (into [] (re-seq #"\w+" text)))))

                (time (p-count-words wiki-dump)) ;; 30 secs
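
The Wikipedia dump is not needed to see that the parallel version is correct. A small check on an in-memory string (`sample` is made up for this sketch), comparing against the built-in `frequencies`:

```clojure
(require '[clojure.core.reducers :as r])

(defn p-count-words [text]
  (r/fold
   (r/monoid (partial merge-with +) hash-map)   ; combining fn and its seed
   (fn [memo word]
     (assoc memo word (inc (get memo word 0)))) ; reducing fn, run per segment
   (r/map #(.toLowerCase ^String %) (into [] (re-seq #"\w+" text)))))

(def sample "The cat and the hat and THE bat")

(p-count-words sample)
;; {"the" 3, "cat" 1, "and" 2, "hat" 1, "bat" 1}

;; same answer as the sequential built-in
(= (p-count-words sample)
   (frequencies (map #(.toLowerCase ^String %) (re-seq #"\w+" sample))))
;; true
```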


fold by examples:
                               Load 100k records into PostgreSQL



                  (import '(java.io BufferedReader FileReader))

                  (def records
                    (into [] (line-seq
                               (BufferedReader. (FileReader. "dump.txt")))))




fold by examples:
                                      Load 100k records into PostgreSQL


                         (time (doseq [record records]
                                 (let [tokens (clojure.string/split record #"\t")]
                                   (insert users/users
                                           (values {
                                                     :account-id (nth tokens 0)
                                                     ...
                                                     })))))

                         ;; 90 secs
fold by examples:
                         Load 100k records into PostgreSQL in parallel
(time (r/fold
       +
       (r/map (fn [record]
                (let [tokens (clojure.string/split record #"\t")]
                  (insert users/users
                          (values {
                                    :account-id (nth tokens 0)
                                    ...
                                    }))
                  1)) ; each record contributes 1, so the fold returns the number of inserts
              records)))

;; 50 secs
When to use it

                         • Exploring decision trees
                         • Image processing
                         • As a building block for bigger, distributed systems such as Datomic and
                          Cascalog (maybe around parallel aggregators)
                         • Basically any list-intensive program


                                    But the tools are available to anyone, so be creative!

Resources

                         • The Anatomy of a Reducer - http://bit.ly/anatomyReducers
                         • Rich’s announcement post on Reducers - http://bit.ly/reducersANN
                         • Rich Hickey - Reducers - EuroClojure 2012 - http://bit.ly/reducersVideo
                          (this presentation was heavily inspired by this video)
                         • The Source on github - http://bit.ly/reducersCore



Thanks!

                             Questions?

                                 Leonardo Borges
                                @leonardo_borges
                         http://www.leonardoborges.com
                          http://www.thoughtworks.com


Clojure Reducers / clj-syd Aug 2012

  • 1. Reducers: A library and model for collection processing in Clojure (... in 20 mins or less). Leonardo Borges, @leonardo_borges, http://www.leonardoborges.com, http://www.thoughtworks.com. Thursday, 30 August 12
  • 6. Reducers huh? Here’s the gist: you get parallel versions of reduce, map and filter. Ta-da! I’m done! And well under my 20 min limit :)
  • 7. Alright, alright, I’m kidding
  • 9. How do reducers make parallelism possible?
      • JVM’s Fork/Join framework
      • Reduction Transformers
  • 10. Before we start - this is bleeding edge stuff
      Java requirements
      • Fork/Join framework
        • Java 7 [1] or
        • Java 6 + the JSR166 jar [2]
      Clojure requirements
      • 1.5.0-* (this is still MASTER on github [3] as of 30/08/2012)
      [1] - http://jdk7.java.net/
      [2] - http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166.jar
      [3] - https://github.com/clojure/clojure
  • 18. The Fork/Join Framework
      • Based on divide and conquer
      • Work-stealing algorithm
      • Uses deques (double-ended queues)
      • Progressively divides the workload into tasks, up to a threshold
      • Once a worker has finished one task, it pops another one from its deque
      • After at least two tasks have finished, results can be combined/joined
      • Idle workers can pop tasks from the deques of workers which fall behind
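The divide, fork and join steps listed above can be sketched directly against the JVM classes from Clojure. This is only an illustration under stated assumptions: `sum-task` and `fork-join-sum` are hypothetical helpers, a fresh `ForkJoinPool` is created per call for simplicity, and the threshold is passed in by hand. It needs Java 7 or later.

```clojure
(import '(java.util.concurrent ForkJoinPool RecursiveTask))

(defn sum-task
  "Builds a Fork/Join task that sums the items of vector v in [lo, hi).
  Chunks of at most `threshold` items are reduced sequentially."
  [v lo hi threshold]
  (proxy [RecursiveTask] []
    (compute []
      (if (<= (- hi lo) threshold)
        ;; small enough: plain sequential reduce
        (reduce + 0 (subvec v lo hi))
        ;; otherwise halve the range
        (let [mid (quot (+ lo hi) 2)
              ^RecursiveTask left  (sum-task v lo mid threshold)
              ^RecursiveTask right (sum-task v mid hi threshold)]
          (.fork left)                          ; queue left half, may be stolen
          (+ (.invoke right) (.join left)))))))  ; run right here, then combine

(defn fork-join-sum
  "Sums vector v on a Fork/Join pool. threshold must be positive."
  [v threshold]
  {:pre [(pos? threshold)]}
  (.invoke (ForkJoinPool.) (sum-task v 0 (count v) threshold)))
```

For example, `(fork-join-sum (vec (range 100)) 10)` splits the range until chunks hold at most 10 items and combines the partial sums on the way back up.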
  • 19. Text is boring
  • 20-42. Fork/Join algorithm - simplified view [animated diagram, not reproduced]. Captions, in order: Workload is put in “deques”; ...and progressively halved; ...up to a configured threshold; Worker 1 / Worker 2; Combine; Idle workers can “steal” items from other workers; Final result
  • 45. Let’s talk about Reducers
      Motivations
      • Performance
        • via less allocation
        • via parallelism (leverage Fork/Join)
      Issues
      • Lists and Seqs are sequential
      • map / filter implies order
  • 53. A closer look at what map does
      ;; a naive map implementation
      (defn map [f coll]
        (if (seq coll)
          (cons (f (first coll)) (map f (rest coll)))
          '()))
      • Recursion
      • Order
      • Laziness (not shown)
      • Consumes List
      • Builds List
      Oh, and it also applies the function to each item before putting the result into the new list. This is what mapping means!
  • 57. Reduction Transformers
      • Idea is to build map / filter on top of reduce to break from sequentiality
      • map / filter then builds nothing and consumes nothing
      • It changes what reduce means to the collection by transforming the reducing functions
  • 58. What map is really all about
      (defn mapping [f]
        (fn [f1]
          (fn [result input]
            (f1 result (f input)))))
  • 59. But wait! If map doesn’t consume the list any longer, who does?
      • reduce does!
      • Since Clojure 1.4, reduce lets the collection reduce itself (through the CollReduce protocol; fold does the same through CollFold)
      • Think of what this means for tree-like structures such as vectors
      • This is key to leveraging the Fork/Join framework
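To make "the collection reduces itself" concrete, here is a minimal sketch of a custom type that implements `CollReduce`. The `Countdown` type is hypothetical and exists only for illustration; it counts from n down to 1 and, to stay short, ignores early termination via `reduced`.

```clojure
(require '[clojure.core.protocols :as p]
         '[clojure.core.reducers :as r])

;; Hypothetical collection: the numbers n, n-1, ..., 1.
(deftype Countdown [n]
  p/CollReduce
  (coll-reduce [this f]
    ;; no seed given: ask the reducing function for its identity
    (p/coll-reduce this f (f)))
  (coll-reduce [_ f init]
    ;; the collection drives the loop itself; no seq is ever built
    (loop [acc init, i n]
      (if (pos? i)
        (recur (f acc i) (dec i))
        acc))))

(reduce + 0 (Countdown. 4))               ; 4 + 3 + 2 + 1
(reduce + 0 (r/map inc (Countdown. 4)))   ; 5 + 4 + 3 + 2
```

Note that `r/map` works on `Countdown` unchanged: it only wraps the reducing function and hands the actual traversal back to the collection.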
  • 64. Now we can use mapping to create reducing functions
      (reduce ((mapping inc) +) 0 [1 2 3 4])
      ;; 14
      ;; ((mapping inc) +) behaves like:
      (fn [result input]
        (+ result (inc input)))

      (reduce ((mapping inc) conj) [] [1 2 3 4])
      ;; [2 3 4 5]
      ;; ((mapping inc) conj) behaves like:
      (fn [result input]
        (conj result (inc input)))
      But it feels awkward to use it in this form
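The same trick works for filter, and the resulting transformers compose with plain `comp`. The sketch below repeats `mapping` from the slides so it is self-contained; `filtering` and `inc-evens` are names chosen here for illustration and do not appear on the slides.

```clojure
(defn mapping [f]
  (fn [f1]
    (fn [result input]
      (f1 result (f input)))))

(defn filtering [pred]
  (fn [f1]
    (fn [result input]
      (if (pred input)
        (f1 result input)   ; keep: pass the input on
        result))))          ; drop: return the accumulator untouched

;; Keep the even inputs, then increment them. Still no intermediate list.
(def inc-evens (comp (filtering even?) (mapping inc)))

(reduce (inc-evens +) 0 [1 2 3 4])     ; 8
(reduce (inc-evens conj) [] [1 2 3 4]) ; [3 5]
```

The transformer never mentions a collection or a result type; the caller decides both by picking the reducing function (`+`, `conj`) and the seed.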
  • 65. What do we have so far?
      • Performance has been improved due to fewer allocations
      • No intermediary lists need to be built (see Haskell’s StreamFusion [4])
      • However, reduce is still sequential
      [4] - http://bit.ly/streamFusion
  • 73. Enter fold
      • Takes the sequentiality out of foldl, foldr and reduce
      • Potentially parallel (falls back to standard reduce otherwise)
      • Reduce/Combine strategy (think Fork/Join Framework)
      • Segments the collection
      • Runs multiple reduces in parallel
      • Uses a combining function to join/reduce results
      (defn fold [combinef reducef coll] ...)
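A small sketch of how the two functions passed to `fold` differ, assuming the reducers library is loaded as `r`. `count-evens` is a hypothetical helper: the reducing function works on individual items inside a segment, while the combining function merges the per-segment counts and, called with no arguments, supplies the seed. The four-argument form also takes the segment size.

```clojure
(require '[clojure.core.reducers :as r])

(defn count-evens
  "Counts the even numbers in vector v. With three arguments the default
  segment size is used; the optional n sets it explicitly."
  ([v]
   (r/fold +                                    ; combinef: (+) is 0, (+ a b) merges
           (fn [n x] (if (even? x) (inc n) n))  ; reducef: one item at a time
           v))
  ([n v]
   (r/fold n
           +
           (fn [acc x] (if (even? x) (inc acc) acc))
           v)))
```

Here `+` is a valid combining function because its zero-argument call returns the identity, which is what the monoid slides that follow spell out.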
  • 77. The combining function is a monoid
      • A binary function with an identity element
      • All the following functions are equivalent monoids

      +
      (+ 2 3) ; 5
      (+)     ; 0

      (defn my-+
        ([] 0)
        ([a b] (+ a b)))
      (my-+ 2 3) ; 5
      (my-+)     ; 0

      (require '[clojure.core.reducers :as r])
      (def my-+ (r/monoid + (fn [] 0)))
      (my-+ 2 3) ; 5
      (my-+)     ; 0
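`r/monoid` is handy when the combining operation has no zero-argument form of its own. As a sketch (the names `into-vector` and `evens` are made up for this example), `into` needs a seed, so it is paired with `vector` as the identity constructor and then used to gather filtered items into a vector.

```clojure
(require '[clojure.core.reducers :as r])

;; (into-vector)     => []        identity, via (vector)
;; (into-vector a b) => (into a b) combine two partial results
(def into-vector (r/monoid into vector))

(defn evens
  "Collects the even items of vector v, folding segments with conj
  and joining the partial vectors with into."
  [v]
  (r/fold into-vector conj (r/filter even? v)))

(evens (vec (range 10))) ; [0 2 4 6 8]
```

The partial results are combined left to right, so the output keeps the order of the input even when segments are reduced in parallel.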
  • 78. fold by examples
      ;; all examples assume the reducers library is available as r
      (ns reducers-playground.core
        (:require [clojure.core.reducers :as r]))
  • 87. fold by examples: increment all even positive integers up to 10 million and add them all up
      ;; these were taken from Rich’s reducers talk
      (def my-vector (into [] (range 10000000)))

      (time (reduce + (map inc (filter even? my-vector))))
      ;; 500 msecs
      (time (reduce + (r/map inc (r/filter even? my-vector))))
      ;; 260 msecs
      (time (r/fold + (r/map inc (r/filter even? my-vector))))
      ;; 130 msecs
  • 88-89. fold by examples: standard word count

      (def wiki-dump (slurp "subset-wiki-dump50")) ; 50 MB

      (defn count-words [text]
        (reduce
          (fn [memo word]
            (assoc memo word (inc (get memo word 0))))
          {}
          (map #(.toLowerCase %) (into [] (re-seq #"\w+" text)))))

      (time (count-words wiki-dump))
      ;; 45 secs
  • 90-95. fold by examples: parallel word count

      (def wiki-dump (slurp "subset-wiki-dump50")) ; 50 MB

      (defn p-count-words [text]
        (r/fold
          ;; Combining fn: will be called to merge the partial computations,
          ;; and with no arguments to provide a seed value.
          (r/monoid (partial merge-with +) hash-map)
          ;; Reducing fn
          (fn [memo word]
            (assoc memo word (inc (get memo word 0))))
          (r/map #(.toLowerCase %) (into [] (re-seq #"\w+" text)))))

      (time (p-count-words wiki-dump))
      ;; 30 secs
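The two roles of the combining fn can be tried at the REPL on a small input. This is a sketch under assumptions: the names `merge-counts` and `count-word` and the sample sentence are made up here, and the `^String` hint is added only to avoid reflection.

```clojure
(require '[clojure.core.reducers :as r])

;; r/monoid builds a fn that returns (hash-map) when called with no
;; arguments and applies merge-with + when called with two.
(def merge-counts (r/monoid (partial merge-with +) hash-map))

(defn count-word
  "Reducing fn: bump the count for a single word."
  [memo word]
  (assoc memo word (inc (get memo word 0))))

(defn p-count-words [text]
  (r/fold merge-counts
          count-word
          (r/map #(.toLowerCase ^String %)
                 (into [] (re-seq #"\w+" text)))))

(println (merge-counts))                        ; the seed value
(println (merge-counts {"a" 1} {"a" 2 "b" 1}))  ; two partial results merged
(println (= (p-count-words "The quick fox and the lazy dog")
            (frequencies ["the" "quick" "fox" "and" "the" "lazy" "dog"])))
```

With an input this short the fold stays below the default partition size and runs as a plain reduce; the parallel path only kicks in for larger vectors.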
  • 96. fold by examples: Load 100k records into PostgreSQL

      (import '(java.io BufferedReader FileReader))

      (def records
        (into [] (line-seq (BufferedReader. (FileReader. "dump.txt")))))
  • 97-98. fold by examples: Load 100k records into PostgreSQL

      (time
        (doseq [record records]
          (let [tokens (clojure.string/split record #"\t")]
            (insert users/users
              (values {:account-id (nth tokens 0)
                       ...})))))
      ;; 90 secs
  • 99-100. fold by examples: Load 100k records into PostgreSQL in parallel

      (time
        (r/fold +
          (r/map
            (fn [record]
              (let [tokens (clojure.string/split record #"\t")]
                (do
                  (insert users/users
                    (values {:account-id (nth tokens 0)
                             ...}))
                  1)))
            records)))
      ;; 50 secs
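The shape of the parallel load can be tried without a database. In the sketch below everything database-related is a stand-in: `insert-user!` is a hypothetical function that appends to an atom instead of calling the real `insert`, and `sample-records` is made-up tab-separated data, not the contents of dump.txt.

```clojure
(require '[clojure.core.reducers :as r]
         '[clojure.string :as str])

;; Hypothetical stand-in for the database: rows are collected in an atom.
(def inserted (atom []))

(defn insert-user!
  "Hypothetical replacement for the real insert; safe to call from several threads."
  [row]
  (swap! inserted conj row))

;; Made-up tab-separated records standing in for the lines of the dump file.
(def sample-records
  (into [] (map #(str "account-" % "\tname-" %) (range 2000))))

(def inserted-count
  (r/fold +
          (r/map (fn [record]
                   (let [tokens (str/split record #"\t")]
                     (insert-user! {:account-id (nth tokens 0)})
                     1))            ; each record contributes 1 to the total
                 sample-records)))

(println inserted-count (count @inserted))
```

Because the work is done for its side effect, the order in which rows arrive is not guaranteed, and whatever performs the real insert has to tolerate being called from several threads at once.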
  • 101-106. When to use it
      • Exploring decision trees
      • Image processing
      • As a building block for bigger, distributed systems such as Datomic and Cascalog (maybe around parallel aggregators)
      • Basically any list-intensive program
    But the tools are available to anyone, so be creative!
  • 107. Resources
      • The Anatomy of a Reducer - http://bit.ly/anatomyReducers
      • Rich’s announcement post on Reducers - http://bit.ly/reducersANN
      • Rich Hickey - Reducers - EuroClojure 2012 - http://bit.ly/reducersVideo (this presentation was heavily inspired by this video)
      • The Source on GitHub - http://bit.ly/reducersCore
  • 108. Thanks! Questions? Leonardo Borges @leonardo_borges http://www.leonardoborges.com http://www.thoughtworks.com