SlideShare ist ein Scribd-Unternehmen logo
1 von 17
Downloaden Sie, um offline zu lesen
Sparse Data Structures for Weighted Bipartite
                 Matching
                        E. Jason Riedy
                     Dr. James Demmel
          (. . . and thanks to the BeBOP group)


SIAM Workshop on Combinatorial Scientific Computing 2004
Use Sparse Matrix Optimizations...




Take a fixed, simple algorithm: Auction alg. for matchings
    Repeated iterations over a sparse graph.
What’s expensive, and is there anything we can do about it?
    Take an idea from optimizing sparse matrix-vector products.
A little speed-up in some cases, but there are more ideas
available...
Where’s the Time Going?



Auction algorithm: Iterative, greedy algorithm bipartite matching:
          1. An unmatched row i finds a “most
             profitable” column j
                  π(i) = maxj b(i, j) − p(i)
          2. Row i places a bid for column j.
                  Bid price raised until j is no longer the best
                  choice. (Min. increment µ)
          3. Highest bid gets the matching (i, j).
Time linear in entries examined...
Number of entries examined is problem-dependent.

           4.5
            4
           3.5
            3
time (s)




           2.5
            2
           1.5
            1
           0.5
            0
                 0      5e+06   1e+07    1.5e+07    2e+07    2.5e+07   3e+07
                                number of entries examined
Expensive Inner Loop!
1.3 GHz Itanium 2
10


 8


 6


 4


 2


 0
  100    150        200    250       300     350   400   450
                          cycles per entry
Verifying...
Using kcachegrind (N.Nethercote and J.Weidendorfer) and valgrind
(J. Seward).
And Locating...


No obvious culprits in the instructions...
And Locating...


But considering cache effects!
Auction’s Inner Loop



                                 Price
Index
Entry



        value = entry - price
        save largest...
Auction’s Inner Loop


Same accesses as sparse matrix-vector multiplication!
                                                        Price
      Index
      Entry



                     y += a(i,j) * x(j)
Performance Through Blocking?




(Images swiped from Berkeley’s BeBOP group.)
Performance Through Blocking?




(Images swiped from Berkeley’s BeBOP group.)
Performance Through Blocking?




More entries, but 1.5× performance on Pentium 3!
(Images swiped from Berkeley’s BeBOP group.)
Blocking Speeds Some Matches
Finite element matrix from Vavasis (in UF collection):
Blocking Speeds Some Matches
Blocking Speeds Some Matches
Observations



A blocked graph data structure may provide additional
performance if:
    you iterate over whole rows,
    the graph / matrix has runs of columns, and
    you’re willing to use an automated tuning system.
Maximizing the runs: linear arrangement. Hard, but there may be
cheap heuristics. Only worth-while if you’re performing many
iterations. (For mat-vec, often > 50 computations of Ax.)

Weitere ähnliche Inhalte

Was ist angesagt?

International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
[JavaOne 2011] Models for Concurrent Programming
[JavaOne 2011] Models for Concurrent Programming[JavaOne 2011] Models for Concurrent Programming
[JavaOne 2011] Models for Concurrent ProgrammingTobias Lindaaker
 
Faster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select DictionariesFaster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select DictionariesRakuten Group, Inc.
 
Project Fortress
Project FortressProject Fortress
Project FortressAlex Miller
 
Input and output in C++
Input and output in C++Input and output in C++
Input and output in C++Nilesh Dalvi
 
constructor with default arguments and dynamic initialization of objects
constructor with default arguments and dynamic initialization of objectsconstructor with default arguments and dynamic initialization of objects
constructor with default arguments and dynamic initialization of objectsKanhaiya Saxena
 
Review constdestr
Review constdestrReview constdestr
Review constdestrrajudasraju
 
ICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via RandomizationICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via RandomizationHidekazu Oiwa
 
CBSE Question Paper Computer Science with C++ 2011
CBSE Question Paper Computer Science with C++ 2011CBSE Question Paper Computer Science with C++ 2011
CBSE Question Paper Computer Science with C++ 2011Deepak Singh
 
Magical float repr
Magical float reprMagical float repr
Magical float reprdickinsm
 
O caml2014 leroy-slides
O caml2014 leroy-slidesO caml2014 leroy-slides
O caml2014 leroy-slidesOCaml
 
Tpr star tree
Tpr star treeTpr star tree
Tpr star treeWin Yu
 
Using Akka Futures
Using Akka FuturesUsing Akka Futures
Using Akka FuturesKnoldus Inc.
 
05211201 A D V A N C E D D A T A S T R U C T U R E S A N D A L G O R I...
05211201  A D V A N C E D  D A T A  S T R U C T U R E S   A N D   A L G O R I...05211201  A D V A N C E D  D A T A  S T R U C T U R E S   A N D   A L G O R I...
05211201 A D V A N C E D D A T A S T R U C T U R E S A N D A L G O R I...guestd436758
 
Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)Brendan Eich
 
Fast Identification of Heavy Hitters by Cached and Packed Group Testing
Fast Identification of Heavy Hitters by Cached and Packed Group TestingFast Identification of Heavy Hitters by Cached and Packed Group Testing
Fast Identification of Heavy Hitters by Cached and Packed Group TestingRakuten Group, Inc.
 

Was ist angesagt? (20)

International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
[JavaOne 2011] Models for Concurrent Programming
[JavaOne 2011] Models for Concurrent Programming[JavaOne 2011] Models for Concurrent Programming
[JavaOne 2011] Models for Concurrent Programming
 
Faster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select DictionariesFaster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select Dictionaries
 
Project Fortress
Project FortressProject Fortress
Project Fortress
 
Input and output in C++
Input and output in C++Input and output in C++
Input and output in C++
 
constructor with default arguments and dynamic initialization of objects
constructor with default arguments and dynamic initialization of objectsconstructor with default arguments and dynamic initialization of objects
constructor with default arguments and dynamic initialization of objects
 
Review constdestr
Review constdestrReview constdestr
Review constdestr
 
Fluent14
Fluent14Fluent14
Fluent14
 
Tiap
TiapTiap
Tiap
 
ICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via RandomizationICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via Randomization
 
CBSE Question Paper Computer Science with C++ 2011
CBSE Question Paper Computer Science with C++ 2011CBSE Question Paper Computer Science with C++ 2011
CBSE Question Paper Computer Science with C++ 2011
 
Magical float repr
Magical float reprMagical float repr
Magical float repr
 
Java Day-5
Java Day-5Java Day-5
Java Day-5
 
O caml2014 leroy-slides
O caml2014 leroy-slidesO caml2014 leroy-slides
O caml2014 leroy-slides
 
Tpr star tree
Tpr star treeTpr star tree
Tpr star tree
 
Using Akka Futures
Using Akka FuturesUsing Akka Futures
Using Akka Futures
 
05211201 A D V A N C E D D A T A S T R U C T U R E S A N D A L G O R I...
05211201  A D V A N C E D  D A T A  S T R U C T U R E S   A N D   A L G O R I...05211201  A D V A N C E D  D A T A  S T R U C T U R E S   A N D   A L G O R I...
05211201 A D V A N C E D D A T A S T R U C T U R E S A N D A L G O R I...
 
Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)
 
Fast Identification of Heavy Hitters by Cached and Packed Group Testing
Fast Identification of Heavy Hitters by Cached and Packed Group TestingFast Identification of Heavy Hitters by Cached and Packed Group Testing
Fast Identification of Heavy Hitters by Cached and Packed Group Testing
 
Strings
StringsStrings
Strings
 

Andere mochten auch

Micro-benchmarking the Tera MTA
Micro-benchmarking the Tera MTAMicro-benchmarking the Tera MTA
Micro-benchmarking the Tera MTAJason Riedy
 
Cross domain sentiment classification via spectral feature alignment
Cross domain sentiment classification via spectral feature alignmentCross domain sentiment classification via spectral feature alignment
Cross domain sentiment classification via spectral feature alignmentlau
 
Graph Matching
Graph MatchingGraph Matching
Graph Matchinggraphitech
 
Simple algorithm & hopcroft karp for bipartite graph
Simple algorithm & hopcroft karp for bipartite graphSimple algorithm & hopcroft karp for bipartite graph
Simple algorithm & hopcroft karp for bipartite graphMiguel Pereira
 
Intro to sentiment analysis
Intro to sentiment analysisIntro to sentiment analysis
Intro to sentiment analysisTimea Turdean
 
Maximum Matching in General Graphs
Maximum Matching in General GraphsMaximum Matching in General Graphs
Maximum Matching in General GraphsAhmad Khayyat
 
[ACM-ICPC] Dinic's Algorithm
[ACM-ICPC] Dinic's Algorithm[ACM-ICPC] Dinic's Algorithm
[ACM-ICPC] Dinic's AlgorithmChih-Hsuan Kuo
 

Andere mochten auch (10)

Graphs
GraphsGraphs
Graphs
 
Micro-benchmarking the Tera MTA
Micro-benchmarking the Tera MTAMicro-benchmarking the Tera MTA
Micro-benchmarking the Tera MTA
 
JDeodorant: Clone Refactoring
JDeodorant: Clone RefactoringJDeodorant: Clone Refactoring
JDeodorant: Clone Refactoring
 
Cross domain sentiment classification via spectral feature alignment
Cross domain sentiment classification via spectral feature alignmentCross domain sentiment classification via spectral feature alignment
Cross domain sentiment classification via spectral feature alignment
 
Lect12 graph mining
Lect12 graph miningLect12 graph mining
Lect12 graph mining
 
Graph Matching
Graph MatchingGraph Matching
Graph Matching
 
Simple algorithm & hopcroft karp for bipartite graph
Simple algorithm & hopcroft karp for bipartite graphSimple algorithm & hopcroft karp for bipartite graph
Simple algorithm & hopcroft karp for bipartite graph
 
Intro to sentiment analysis
Intro to sentiment analysisIntro to sentiment analysis
Intro to sentiment analysis
 
Maximum Matching in General Graphs
Maximum Matching in General GraphsMaximum Matching in General Graphs
Maximum Matching in General Graphs
 
[ACM-ICPC] Dinic's Algorithm
[ACM-ICPC] Dinic's Algorithm[ACM-ICPC] Dinic's Algorithm
[ACM-ICPC] Dinic's Algorithm
 

Ähnlich wie Sparse Data Structures for Weighted Bipartite Matching

Understanding Javascript Engines
Understanding Javascript Engines Understanding Javascript Engines
Understanding Javascript Engines Parashuram N
 
20190927 generative models_aia
20190927 generative models_aia20190927 generative models_aia
20190927 generative models_aiaYi-Fan Liou
 
机器学习Adaboost
机器学习Adaboost机器学习Adaboost
机器学习AdaboostShocky1
 
Optimizing array-based data structures to the limit
Optimizing array-based data structures to the limitOptimizing array-based data structures to the limit
Optimizing array-based data structures to the limitRoman Leventov
 
JavaOne 2012 - JVM JIT for Dummies
JavaOne 2012 - JVM JIT for DummiesJavaOne 2012 - JVM JIT for Dummies
JavaOne 2012 - JVM JIT for DummiesCharles Nutter
 
Hidden Truths in Dead Software Paths
Hidden Truths in Dead Software PathsHidden Truths in Dead Software Paths
Hidden Truths in Dead Software PathsBen Hermann
 
Beating Floating Point at its Own Game: Posit Arithmetic
Beating Floating Point at its Own Game: Posit ArithmeticBeating Floating Point at its Own Game: Posit Arithmetic
Beating Floating Point at its Own Game: Posit Arithmeticinside-BigData.com
 
Design and Analysis of Algorithms Lecture Notes
Design and Analysis of Algorithms Lecture NotesDesign and Analysis of Algorithms Lecture Notes
Design and Analysis of Algorithms Lecture NotesSreedhar Chowdam
 
Orthogonal Functional Architecture
Orthogonal Functional ArchitectureOrthogonal Functional Architecture
Orthogonal Functional ArchitectureJohn De Goes
 
Øredev 2011 - JVM JIT for Dummies (What the JVM Does With Your Bytecode When ...
Øredev 2011 - JVM JIT for Dummies (What the JVM Does With Your Bytecode When ...Øredev 2011 - JVM JIT for Dummies (What the JVM Does With Your Bytecode When ...
Øredev 2011 - JVM JIT for Dummies (What the JVM Does With Your Bytecode When ...Charles Nutter
 
Time and Space Complexity Analysis.pptx
Time and Space Complexity Analysis.pptxTime and Space Complexity Analysis.pptx
Time and Space Complexity Analysis.pptxdudelover
 
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...Spark Summit
 
Engineering fast indexes (Deepdive)
Engineering fast indexes (Deepdive)Engineering fast indexes (Deepdive)
Engineering fast indexes (Deepdive)Daniel Lemire
 
Haskell for data science
Haskell for data scienceHaskell for data science
Haskell for data scienceJohn Cant
 
Scala collections wizardry - Scalapeño
Scala collections wizardry - ScalapeñoScala collections wizardry - Scalapeño
Scala collections wizardry - ScalapeñoSagie Davidovich
 
Chapter3 bp
Chapter3   bpChapter3   bp
Chapter3 bpkumar tm
 
ScalaCheck
ScalaCheckScalaCheck
ScalaCheckBeScala
 

Ähnlich wie Sparse Data Structures for Weighted Bipartite Matching (20)

Understanding Javascript Engines
Understanding Javascript Engines Understanding Javascript Engines
Understanding Javascript Engines
 
20190927 generative models_aia
20190927 generative models_aia20190927 generative models_aia
20190927 generative models_aia
 
机器学习Adaboost
机器学习Adaboost机器学习Adaboost
机器学习Adaboost
 
Optimizing array-based data structures to the limit
Optimizing array-based data structures to the limitOptimizing array-based data structures to the limit
Optimizing array-based data structures to the limit
 
JavaOne 2012 - JVM JIT for Dummies
JavaOne 2012 - JVM JIT for DummiesJavaOne 2012 - JVM JIT for Dummies
JavaOne 2012 - JVM JIT for Dummies
 
Hidden Truths in Dead Software Paths
Hidden Truths in Dead Software PathsHidden Truths in Dead Software Paths
Hidden Truths in Dead Software Paths
 
Quality Python Homework Help
Quality Python Homework HelpQuality Python Homework Help
Quality Python Homework Help
 
Beating Floating Point at its Own Game: Posit Arithmetic
Beating Floating Point at its Own Game: Posit ArithmeticBeating Floating Point at its Own Game: Posit Arithmetic
Beating Floating Point at its Own Game: Posit Arithmetic
 
Design and Analysis of Algorithms Lecture Notes
Design and Analysis of Algorithms Lecture NotesDesign and Analysis of Algorithms Lecture Notes
Design and Analysis of Algorithms Lecture Notes
 
Neural networks
Neural networksNeural networks
Neural networks
 
Orthogonal Functional Architecture
Orthogonal Functional ArchitectureOrthogonal Functional Architecture
Orthogonal Functional Architecture
 
Øredev 2011 - JVM JIT for Dummies (What the JVM Does With Your Bytecode When ...
Øredev 2011 - JVM JIT for Dummies (What the JVM Does With Your Bytecode When ...Øredev 2011 - JVM JIT for Dummies (What the JVM Does With Your Bytecode When ...
Øredev 2011 - JVM JIT for Dummies (What the JVM Does With Your Bytecode When ...
 
Time and Space Complexity Analysis.pptx
Time and Space Complexity Analysis.pptxTime and Space Complexity Analysis.pptx
Time and Space Complexity Analysis.pptx
 
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
 
Engineering fast indexes (Deepdive)
Engineering fast indexes (Deepdive)Engineering fast indexes (Deepdive)
Engineering fast indexes (Deepdive)
 
Haskell for data science
Haskell for data scienceHaskell for data science
Haskell for data science
 
Scala collections wizardry - Scalapeño
Scala collections wizardry - ScalapeñoScala collections wizardry - Scalapeño
Scala collections wizardry - Scalapeño
 
Analysis
AnalysisAnalysis
Analysis
 
Chapter3 bp
Chapter3   bpChapter3   bp
Chapter3 bp
 
ScalaCheck
ScalaCheckScalaCheck
ScalaCheck
 

Mehr von Jason Riedy

Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFJason Riedy
 
LAGraph 2021-10-13
LAGraph 2021-10-13LAGraph 2021-10-13
LAGraph 2021-10-13Jason Riedy
 
Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFJason Riedy
 
Graph analysis and novel architectures
Graph analysis and novel architecturesGraph analysis and novel architectures
Graph analysis and novel architecturesJason Riedy
 
GraphBLAS and Emus
GraphBLAS and EmusGraphBLAS and Emus
GraphBLAS and EmusJason Riedy
 
Reproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to ArchitectureReproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to ArchitectureJason Riedy
 
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...Jason Riedy
 
ICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to ArchitectureICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to ArchitectureJason Riedy
 
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisJason Riedy
 
Novel Architectures for Applications in Data Science and Beyond
Novel Architectures for Applications in Data Science and BeyondNovel Architectures for Applications in Data Science and Beyond
Novel Architectures for Applications in Data Science and BeyondJason Riedy
 
Characterization of Emu Chick with Microbenchmarks
Characterization of Emu Chick with MicrobenchmarksCharacterization of Emu Chick with Microbenchmarks
Characterization of Emu Chick with MicrobenchmarksJason Riedy
 
CRNCH 2018 Summit: Rogues Gallery Update
CRNCH 2018 Summit: Rogues Gallery UpdateCRNCH 2018 Summit: Rogues Gallery Update
CRNCH 2018 Summit: Rogues Gallery UpdateJason Riedy
 
Augmented Arithmetic Operations Proposed for IEEE-754 2018
Augmented Arithmetic Operations Proposed for IEEE-754 2018Augmented Arithmetic Operations Proposed for IEEE-754 2018
Augmented Arithmetic Operations Proposed for IEEE-754 2018Jason Riedy
 
Graph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New ArchitecturesGraph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New ArchitecturesJason Riedy
 
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsCRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsJason Riedy
 
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsCRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsJason Riedy
 
A New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph AnalysisA New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph AnalysisJason Riedy
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs Jason Riedy
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming GraphsHigh-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming GraphsJason Riedy
 
Updating PageRank for Streaming Graphs
Updating PageRank for Streaming GraphsUpdating PageRank for Streaming Graphs
Updating PageRank for Streaming GraphsJason Riedy
 

Mehr von Jason Riedy (20)

Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoF
 
LAGraph 2021-10-13
LAGraph 2021-10-13LAGraph 2021-10-13
LAGraph 2021-10-13
 
Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoF
 
Graph analysis and novel architectures
Graph analysis and novel architecturesGraph analysis and novel architectures
Graph analysis and novel architectures
 
GraphBLAS and Emus
GraphBLAS and EmusGraphBLAS and Emus
GraphBLAS and Emus
 
Reproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to ArchitectureReproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to Architecture
 
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
 
ICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to ArchitectureICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to Architecture
 
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
 
Novel Architectures for Applications in Data Science and Beyond
Novel Architectures for Applications in Data Science and BeyondNovel Architectures for Applications in Data Science and Beyond
Novel Architectures for Applications in Data Science and Beyond
 
Characterization of Emu Chick with Microbenchmarks
Characterization of Emu Chick with MicrobenchmarksCharacterization of Emu Chick with Microbenchmarks
Characterization of Emu Chick with Microbenchmarks
 
CRNCH 2018 Summit: Rogues Gallery Update
CRNCH 2018 Summit: Rogues Gallery UpdateCRNCH 2018 Summit: Rogues Gallery Update
CRNCH 2018 Summit: Rogues Gallery Update
 
Augmented Arithmetic Operations Proposed for IEEE-754 2018
Augmented Arithmetic Operations Proposed for IEEE-754 2018Augmented Arithmetic Operations Proposed for IEEE-754 2018
Augmented Arithmetic Operations Proposed for IEEE-754 2018
 
Graph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New ArchitecturesGraph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New Architectures
 
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsCRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
 
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsCRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
 
A New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph AnalysisA New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph Analysis
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming GraphsHigh-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs
 
Updating PageRank for Streaming Graphs
Updating PageRank for Streaming GraphsUpdating PageRank for Streaming Graphs
Updating PageRank for Streaming Graphs
 

Kürzlich hochgeladen

From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 

Kürzlich hochgeladen (20)

From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 

Sparse Data Structures for Weighted Bipartite Matching

  • 1. Sparse Data Structures for Weighted Bipartite Matching E. Jason Riedy Dr. James Demmel (. . . and thanks to the BeBOP group) SIAM Workshop on Combinatorial Scientific Computing 2004
  • 2. Use Sparse Matrix Optimizations... Take a fixed, simple algorithm: Auction alg. for matchings Repeated iterations over a sparse graph. What’s expensive, and is there anything we can do about it? Take an idea from optimizing sparse matrix-vector products. A little speed-up in some cases, but there are more ideas available...
  • 3. Where’s the Time Going? Auction algorithm: Iterative, greedy algorithm bipartite matching: 1. An unmatched row i finds a “most profitable” column j π(i) = maxj b(i, j) − p(i) 2. Row i places a bid for column j. Bid price raised until j is no longer the best choice. (Min. increment µ) 3. Highest bid gets the matching (i, j).
  • 4. Time linear in entries examined... Number of entries examined is problem-dependent. 4.5 4 3.5 3 time (s) 2.5 2 1.5 1 0.5 0 0 5e+06 1e+07 1.5e+07 2e+07 2.5e+07 3e+07 number of entries examined
  • 5. Expensive Inner Loop! 1.3 GHz Itanium 2 10 8 6 4 2 0 100 150 200 250 300 350 400 450 cycles per entry
  • 6. Verifying... Using kcachegrind (N.Nethercote and J.Weidendorfer) and valgrind (J. Seward).
  • 7. And Locating... No obvious culprits in the instructions...
  • 9. Auction’s Inner Loop Price Index Entry value = entry - price save largest...
  • 10. Auction’s Inner Loop Same accesses as sparse matrix-vector multiplication! Price Index Entry y += a(i,j) * x(j)
  • 11. Performance Through Blocking? (Images swiped from Berkeley’s BeBOP group.)
  • 12. Performance Through Blocking? (Images swiped from Berkeley’s BeBOP group.)
  • 13. Performance Through Blocking? More entries, but 1.5× performance on Pentium 3! (Images swiped from Berkeley’s BeBOP group.)
  • 14. Blocking Speeds Some Matches Finite element matrix from Vavasis (in UF collection):
  • 17. Observations A blocked graph data structure may provide additional performance if: you iterate over whole rows, the graph / matrix has runs of columns, and you’re willing to use an automated tuning system. Maximizing the runs: linear arrangement. Hard, but there may be cheap heuristics. Only worth-while if you’re performing many iterations. (For mat-vec, often > 50 computations of Ax.)