La programación genérica ofrece un mecanismo para abstraer componentes software sin traducir esta abstracción en un sobrecoste en el rendimiento. Esto ha permitido poder expresar patrones y algoritmos de uso común, permitiendo que los programadores de aplicación centrarse en los detalles del software de aplicación. Esta idea ha sido, durante décadas, uno de los principios de la Standard Template Library (STL) del lenguaje C++. El mismo principio puede aplicarse los esqueletos paralelos y los patrones de patrones de paralelismo, que aplican la idea de patrón de diseño al caso de la programación paralela. En este caso, un parámetro adicional de configuración es el modelo de ejecución a utilizar. En esta charla, consideraré diversos aspectos de la integración de las ideas de programación paralela y programación genérica. Daré una visión muy general del estado actual en C++17 y me centraré en un enfoque alternativo basado en la biblioteca GrPPI (Generic and reusable Parallel Pattern Interface).
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Simplificando el paralelismo con Patrones Genéricos Paralelos
1. An introduction to GrPPI
An introduction to GrPPI
Generic Reusable Parallel Patterns Interface
J. Daniel Garcia
ARCOS Group
University Carlos III of Madrid
Spain
November 2019
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 1/80
2. An introduction to GrPPI
Warning
c This work is under Attribution-NonCommercial-
NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
license.
You are free to Share — copy and redistribute the ma-
terial in any medium or format.
b You must give appropriate credit, provide a link to the
license, and indicate if changes were made. You may
do so in any reasonable manner, but not in any way that
suggests the licensor endorses you or your use.
e You may not use the material for commercial purposes.
d If you remix, transform, or build upon the material, you
may not distribute the modified material.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 2/80
3. An introduction to GrPPI
ARCOS@uc3m
UC3M: A young international research oriented university.
ARCOS: An applied research group.
Lines: High Performance Computing, Big data,
Cyberphysical Systems, and Programming models for
application improvement.
Programming Models for Application Improvement:
Provide programming tools for improving:
Performance.
Energy efficiency.
Maintainability.
Correctness.
Standardization:
ISO/IEC JTC/SC22/WG21. ISO C++ standards committe.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 3/80
4. An introduction to GrPPI
Acknowledgements
Project 801091 “ASPIDE: Exascale programming models for
extreme programming funded by the European Commission
through H2020 FET-HPC (2018-2020).
Project ICT 644235 “REPHRASE: REfactoring Parallel
Heterogeneous Resource-aware Applications” funded by the
European Commission through H2020 program (2015-2018).
Project TIN2016-79673-P “Towards Unification of HPC and
Big Data Paradigms” funded by the Spanish Ministry of
Economy and Competitiveness (2016-2019).
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 4/80
5. An introduction to GrPPI
GrPPI team
Main team
J. Daniel Garcia (UC3M, lead).
Javier Garcia Blas (UC3M).
David del Río (UC3M).
Javier Fernández (UC3M).
Cooperation
Marco Danelutto (Univ. Pisa)
Massimo Torquati (Univ. Pisa)
Marco Aldinucci (Univ. Torino)
Fabio Tordini (Univ. Torino)
Plácido Fernández (UC3M-CERN).
Manuel F. Dolz (Univ. Jaume I).
. . .
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 5/80
6. An introduction to GrPPI
Introduction
1 Introduction
2 Simple use
3 Patterns in GrPPI
4 Evaluation
5 Conclusions
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 6/80
7. An introduction to GrPPI
Introduction
Sequential Programming versus Parallel Programming
Sequential programming
Well-known set of control-structures embedded in
programming languages.
Control structures inherently sequential.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 7/80
8. An introduction to GrPPI
Introduction
Sequential Programming versus Parallel Programming
Sequential programming
Well-known set of control-structures embedded in
programming languages.
Control structures inherently sequential.
Traditional Parallel programming
Constructs adapting sequential control structures to the
parallel world (e.g. parallel-for).
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 7/80
9. An introduction to GrPPI
Introduction
Sequential Programming versus Parallel Programming
Sequential programming
Well-known set of control-structures embedded in
programming languages.
Control structures inherently sequential.
Traditional Parallel programming
Constructs adapting sequential control structures to the
parallel world (e.g. parallel-for).
But wait!
What if we had constructs that could be both sequential
and parallel?
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 7/80
10. An introduction to GrPPI
Introduction
Software design
There are two ways of constructing a software design:
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 8/80
11. An introduction to GrPPI
Introduction
Software design
There are two ways of constructing a software design:
One way is
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 8/80
12. An introduction to GrPPI
Introduction
Software design
There are two ways of constructing a software design:
One way is
to make it so simple that there are obviously no defi-
ciencies,
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 8/80
13. An introduction to GrPPI
Introduction
Software design
There are two ways of constructing a software design:
One way is
to make it so simple that there are obviously no defi-
ciencies,
and the other way is
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 8/80
14. An introduction to GrPPI
Introduction
Software design
There are two ways of constructing a software design:
One way is
to make it so simple that there are obviously no defi-
ciencies,
and the other way is
to make it so complicated that there are no obvious
deficiencies.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 8/80
15. An introduction to GrPPI
Introduction
Software design
There are two ways of constructing a software design:
One way is
to make it so simple that there are obviously no defi-
ciencies,
and the other way is
to make it so complicated that there are no obvious
deficiencies.
The first method is far more difficult.
C.A.R Hoare
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 8/80
16. An introduction to GrPPI
Introduction
Adding two vectors
Traditional way
using numvec = std::vector<double>;
numvec add(const numvec & v1, const numvec & v2) {
numvec res;
res.reserve(v1.size() ) ; // Asume equal sizes
for (int i =0; i <v1.size() ; ++i) {
res.push_back(v1[i]+v2[i]) ;
}
return res;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 9/80
17. An introduction to GrPPI
Introduction
Adding two vectors
Traditional way
using numvec = std::vector<double>;
numvec add(const numvec & v1, const numvec & v2) {
numvec res;
res.reserve(v1.size() ) ; // Asume equal sizes
for (int i =0; i <v1.size() ; ++i) {
res.push_back(v1[i]+v2[i]) ;
}
return res;
}
Adds additional constraints.
Traversing in-order.
Potential mistakes.
i<v1.size() versus i<=v1.size().
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 9/80
18. An introduction to GrPPI
Introduction
Adding two vectors
The STL way (C++98/03)
using numvec = std::vector<double>;
numvec add(const numvec & v1, const numvec & v2) {
numvec res;
res.reserve(v1.size() ) ; // Asume equal sizes
std :: transform(v1.begin(), v1.end(), v2.begin() ,
std :: back_inserter(res),
std :: plus<>{}) ;
return res;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 10/80
19. An introduction to GrPPI
Introduction
Adding two vectors
The STL way (C++98/03)
using numvec = std::vector<double>;
numvec add(const numvec & v1, const numvec & v2) {
numvec res;
res.reserve(v1.size() ) ; // Asume equal sizes
std :: transform(v1.begin(), v1.end(), v2.begin() ,
std :: back_inserter(res),
std :: plus<>{}) ;
return res;
}
Minimize off-by-one mistakes.
Type specific optimizations.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 10/80
20. An introduction to GrPPI
Introduction
Adding two vectors
The STL way (C++14)
using numvec = std::vector<double>;
numvec add(const numvec & v1, const numvec & v2) {
numvec res;
res.reserve(v1.size() ) ; // Asume equal sizes
std :: transform(v1.begin(), v1.end(), v2.begin() ,
std :: back_inserter(res),
[]( auto x, auto y) { return x+y; }) ;
return res;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 11/80
21. An introduction to GrPPI
Introduction
Adding two vectors
The STL way (C++14)
using numvec = std::vector<double>;
numvec add(const numvec & v1, const numvec & v2) {
numvec res;
res.reserve(v1.size() ) ; // Asume equal sizes
std :: transform(v1.begin(), v1.end(), v2.begin() ,
std :: back_inserter(res),
[]( auto x, auto y) { return x+y; }) ;
return res;
}
Use (possibly generic) lambdas .
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 11/80
22. An introduction to GrPPI
Introduction
A brief history of patterns
From building and architecture (Cristopher Alexander):
1977: A Pattern Language: Towns, Buildings, Construction.
1979: The timeless way of buildings.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 12/80
23. An introduction to GrPPI
Introduction
A brief history of patterns
From building and architecture (Cristopher Alexander):
1977: A Pattern Language: Towns, Buildings, Construction.
1979: The timeless way of buildings.
To software design (Gamma et al.):
1993: Design Patterns: abstraction and reuse of object
oriented design. ECOOP.
1995: Design Patterns. Elements of Reusable
Object-Oriented Software.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 12/80
24. An introduction to GrPPI
Introduction
A brief history of patterns
From building and architecture (Cristopher Alexander):
1977: A Pattern Language: Towns, Buildings, Construction.
1979: The timeless way of buildings.
To software design (Gamma et al.):
1993: Design Patterns: abstraction and reuse of object
oriented design. ECOOP.
1995: Design Patterns. Elements of Reusable
Object-Oriented Software.
To parallel programming (McCool, Reinders, Robinson):
2012: Structured Parallel Programming: Patterns for
Efficient Computation.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 12/80
25. An introduction to GrPPI
Introduction
Parallel Patterns
Levels of patterns:
Software Architecture Patterns.
Design Patterns.
Algorithm Strategy Patterns.
Implementation patterns.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 13/80
26. An introduction to GrPPI
Introduction
Parallel Patterns
Levels of patterns:
Software Architecture Patterns.
Design Patterns.
Algorithm Strategy Patterns.
Implementation patterns.
Algorithm Strategy Pattern or Algorithmic Skeleton
A way to codify best practices in parallel programming in
a reusable way.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 13/80
27. An introduction to GrPPI
Introduction
Parallel Patterns
Levels of patterns:
Software Architecture Patterns.
Design Patterns.
Algorithm Strategy Patterns.
Implementation patterns.
Algorithm Strategy Pattern or Algorithmic Skeleton
A way to codify best practices in parallel programming in
a reusable way.
A single Parallel Pattern may have different realizations
in different programming models.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 13/80
28. An introduction to GrPPI
Introduction
Example: map pattern
Apply a single elemental operation to different data
items.
SAXPY: Sequential
for (size_t i=0; i<n; ++i) {
y[ i ] = a ∗ x[ i ] + y[ i ];
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 14/80
29. An introduction to GrPPI
Introduction
Example: map pattern
Apply a single elemental operation to different data
items.
SAXPY: Sequential
for (size_t i=0; i<n; ++i) {
y[ i ] = a ∗ x[ i ] + y[ i ];
}
SAXPY: OpenMP
#pragma omp parallel for
for (size_t i=0; i<n; ++i) {
y[ i ] = a ∗ x[ i ] + y[ i ];
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 14/80
30. An introduction to GrPPI
Introduction
Example: map pattern
Apply a single elemental operation to different data
items.
SAXPY: Sequential
for (size_t i=0; i<n; ++i) {
y[ i ] = a ∗ x[ i ] + y[ i ];
}
SAXPY: OpenMP
#pragma omp parallel for
for (size_t i=0; i<n; ++i) {
y[ i ] = a ∗ x[ i ] + y[ i ];
}
SAXPY: TBB
tbb :: parallel_for (tbb :: blocked_range<int>{0,n},
[&](tbb :: blocked_range<int> r) {
for (auto i : r) {
y[ i ] = a ∗ x[ i ] + y[ i ];
}
}) ;
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 14/80
31. An introduction to GrPPI
Introduction
Ideals for Parallel Patterns
Applications should be expressed independently of the
execution model.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 15/80
32. An introduction to GrPPI
Introduction
Ideals for Parallel Patterns
Applications should be expressed independently of the
execution model.
Computations should be expressed in terms of
structured composable patterns.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 15/80
33. An introduction to GrPPI
Introduction
Ideals for Parallel Patterns
Applications should be expressed independently of the
execution model.
Computations should be expressed in terms of
structured composable patterns.
Multiple back-ends should be offered with simple
switching mechanisms.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 15/80
34. An introduction to GrPPI
Introduction
Ideals for Parallel Patterns
Applications should be expressed independently of the
execution model.
Computations should be expressed in terms of
structured composable patterns.
Multiple back-ends should be offered with simple
switching mechanisms.
Interface should integrate seamlessly with modern C++
and its standard library.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 15/80
35. An introduction to GrPPI
Introduction
Ideals for Parallel Patterns
Applications should be expressed independently of the
execution model.
Computations should be expressed in terms of
structured composable patterns.
Multiple back-ends should be offered with simple
switching mechanisms.
Interface should integrate seamlessly with modern C++
and its standard library.
Applications should be able to take advantage of modern
C++ language features.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 15/80
36. An introduction to GrPPI
Introduction
C++17: Execution policies
An execution policy allows to disambiguate parallel
algorithms overloads.
sequenced_policy: Not parallelized.
std::execution::seq.
parallel_policy: Execution can be parallelized in multiple
threads.
std::execution::par.
parallel_unsequenced_policy: Execution can be
parallelized, vectorized, or migrated across threads.
std::execution::par_unseq.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 16/80
37. An introduction to GrPPI
Introduction
C++17: Parallel algorithms
Most algorithms in <algorithm> and <numeric> have
execution policy based versions.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 17/80
38. An introduction to GrPPI
Introduction
C++17: Parallel algorithms
Most algorithms in <algorithm> and <numeric> have
execution policy based versions.
SAXPY: C++17
std :: transform(std::execution::par,
y.begin(), y.end(), x.begin(), y.begin(),
[=]( auto vx, auto vy) { return a ∗ vx + vy; }) ;
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 17/80
39. An introduction to GrPPI
Introduction
C++17: Parallel algorithms
Most algorithms in <algorithm> and <numeric> have
execution policy based versions.
SAXPY: C++17
std :: transform(std::execution::par,
y.begin(), y.end(), x.begin(), y.begin(),
[=]( auto vx, auto vy) { return a ∗ vx + vy; }) ;
However . . .
No fine control knobs.
Lack of task parallelism and stream parallelism.
OK as an entry level to parallelism.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 17/80
40. An introduction to GrPPI
Introduction
GrPPI
https://github.com/arcosuc3m/grppi
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 18/80
41. An introduction to GrPPI
Introduction
GrPPI
https://github.com/arcosuc3m/grppi
Generic reusable Parallel Pattern Interface.
A header only library (might change).
A set of execution policies.
A set of type safe generic algorithms.
Requires C++14.
Apache 2.0 License.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 18/80
42. An introduction to GrPPI
Simple use
1 Introduction
2 Simple use
3 Patterns in GrPPI
4 Evaluation
5 Conclusions
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 19/80
43. An introduction to GrPPI
Simple use
Example: Transforming a sequence
Given a sequence of frames generate a new sequence of
frames in grayscale.
struct frame { /∗ ... ∗/ };
frame togray(const frame & f);
frame[0] frame[1] frame[2] frame[3] frame[4]
frame[0] frame[1] frame[2] frame[3] frame[4]
togray() togray() togray() togray() togray()
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 20/80
44. An introduction to GrPPI
Simple use
Transforming a sequence
Traditional explicit loop
using frameseq = std::vector<frame>;
frameseq seq_togray(const frameseq & s) {
frameseq r;
r.reserve(s.size() ) ;
// Requires processing in−order
for (const auto & f : s) {
r.push_back(togray(f));
}
return r;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 21/80
45. An introduction to GrPPI
Simple use
Transforming a sequence
STL way
using frameseq = std::vector<frame>;
frameseq seq_togray(const frameseq & s) {
frameseq r;
r.reserve(s.size() ) ;
std :: transform(s.begin(), s.end(), std :: back_inserter(r), togray);
return r;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 22/80
46. An introduction to GrPPI
Simple use
Transforming a sequence
Parallel STL way (C++17)
using frameseq = std::vector<frame>;
frameseq seq_togray(const frameseq & s) {
frameseq r;
r.reserve(s.size () ) ;
// No execution order assumed
std :: transform(std :: execution::par,
s.begin(), s.end(), std :: back_inserter(r), togray);
return r;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 23/80
47. An introduction to GrPPI
Simple use
Transforming a sequence
GrPPI
using frameseq = std::vector<frame>;
frameseq seq_togray(const frameseq & s) {
frameseq r(s.size() );
grppi :: sequential_execution seq;
grppi :: map(seq, s.begin(), s.end(), r.begin(), togray);
return r;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 24/80
48. An introduction to GrPPI
Simple use
Transforming a sequence
GrPPI + lambda
using frameseq = std::vector<frame>;
frameseq seq_togray(const frameseq & s) {
frameseq r(s.size() );
grppi :: sequential_execution seq;
grppi :: map(seq, s.begin(), s.end(), r.begin(),
[]( const frame & f) { return filter (f,64); }) ;
return r;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 25/80
49. An introduction to GrPPI
Simple use
Transforming a sequence
GrPPI + lambda - iterators
using frameseq = std::vector<frame>;
frameseq seq_togray(const frameseq & s) {
frameseq r(s.size() );
grppi :: sequential_execution seq;
grppi :: map(seq, s, r,
[]( const frame & f) { return filter (f,64); }) ;
return r;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 26/80
50. An introduction to GrPPI
Patterns in GrPPI
1 Introduction
2 Simple use
3 Patterns in GrPPI
4 Evaluation
5 Conclusions
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 27/80
51. An introduction to GrPPI
Patterns in GrPPI
Controlling execution
3 Patterns in GrPPI
Controlling execution
Patterns overview
Data patterns
Streaming patterns
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 28/80
52. An introduction to GrPPI
Patterns in GrPPI
Controlling execution
Execution types
Execution model is encapsulated in execution types.
Always provided as first argument to patterns.
Current concrete execution types:
Sequential: sequential_execution.
ISO C++ Threads: parallel_execution_native.
OpenMP: parallel_execution_omp.
Intel TBB: parallel_execution_tbb.
FastFlow: parallel_execution_ff.
Run-time polymorphic wrapper through type erasure:
dynamic_execution.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 29/80
53. An introduction to GrPPI
Patterns in GrPPI
Controlling execution
Execution model properties
Some execution types allow finer configurtion.
Example: Concurrency degree.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 30/80
54. An introduction to GrPPI
Patterns in GrPPI
Controlling execution
Execution model properties
Some execution types allow finer configurtion.
Example: Concurrency degree.
Interface:
ex.set_concurrency_degree(4);
int n = ex.concurrency_degree();
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 30/80
55. An introduction to GrPPI
Patterns in GrPPI
Controlling execution
Execution model properties
Some execution types allow finer configurtion.
Example: Concurrency degree.
Interface:
ex.set_concurrency_degree(4);
int n = ex.concurrency_degree();
Default values:
Sequential ⇒ 1.
Native ⇒ std::thread::hardware_concurrency().
OpenMP ⇒ omp_get_num_threads().
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 30/80
56. An introduction to GrPPI
Patterns in GrPPI
Patterns overview
3 Patterns in GrPPI
Controlling execution
Patterns overview
Data patterns
Streaming patterns
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 31/80
57. An introduction to GrPPI
Patterns in GrPPI
Patterns overview
A classification
Data patterns: Express computations over a data set.
map, reduce, map/reduce, stencil.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 32/80
58. An introduction to GrPPI
Patterns in GrPPI
Patterns overview
A classification
Data patterns: Express computations over a data set.
map, reduce, map/reduce, stencil.
Task patterns: Express task composition.
divide/conquer.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 32/80
59. An introduction to GrPPI
Patterns in GrPPI
Patterns overview
A classification
Data patterns: Express computations over a data set.
map, reduce, map/reduce, stencil.
Task patterns: Express task composition.
divide/conquer.
Streaming patterns: Express computations ver a
(possibly unbounded) data set.
pipeline.
Specialized stages: farm, filter, reduction, iteration.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 32/80
60. An introduction to GrPPI
Patterns in GrPPI
Data patterns
3 Patterns in GrPPI
Controlling execution
Patterns overview
Data patterns
Streaming patterns
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 33/80
61. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Patterns on data sets
A data pattern performs an operation on one or more data
sets that are already in memory.
Input:
One or more data sets.
Operations.
Output:
A data set (map, stencil).
A single value (reduce, map/reduce).
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 34/80
62. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Maps on data sequences
A map pattern applies an operation to every element in a a
data set, generating a new data set.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 35/80
63. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Maps on data sequences
A map pattern applies an operation to every element in a a
data set, generating a new data set.
Unidimensional:
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 35/80
64. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Maps on data sequences
A map pattern applies an operation to every element in a a
data set, generating a new data set.
Unidimensional:
Multidimensional:
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 35/80
65. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Single sequences mapping
Double all elements in vector sequentially
std :: vector<double> double_elements(const std::vector<double> & v)
{
std :: vector<double> res(v.size());
grppi :: sequential_execution seq;
grppi :: map(seq, v, res,
[]( double x) { return 2∗x; }) ;
return res;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 36/80
66. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Single sequences mapping
Double all elements in vector with OpenMP
std :: vector<double> double_elements(const std::vector<double> & v)
{
std :: vector<double> res(v.size());
grppi :: parallel_execution_omp omp;
grppi :: map(omp, v, res,
[]( double x) { return 2∗x; }) ;
return res;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 37/80
67. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Multiple sequences mapping
Add two vectors
template <typename Execution>
std :: vector<double> add_vectors(const Execution & ex,
const std::vector<double> & v1,
const std::vector<double> & v2)
{
auto size = std :: min(v1.size() , v2.size() );
std :: vector<double> res(size);
grppi :: map(ex, zip(v1, v2), res,
[]( double x, double y) { return x+y; }) ;
return res;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 38/80
68. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Multiple sequences mapping
Add three vectors
template <typename Execution>
std :: vector<double> add_vectors(const Execution & ex,
const std::vector<double> & v1,
const std::vector<double> & v2,
const std::vector<double> & v3)
{
auto size = std :: min(v1.size() , v2.size() );
std :: vector<double> res(size);
grppi :: map(ex, zip(v1,v2,v3), res,
[]( double x, double y, double z) { return x+y+z; });
return res;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 39/80
69. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Heterogeneous mapping
The result can be from a different type.
Complex vector from real and imaginary vectors
template <typename Execution>
std :: vector<complex<double>> create_cplx(const Execution & ex,
const std::vector<double> & re,
const std::vector<double> & im)
{
auto size = std :: min(re.size() , im.size() );
std :: vector<complex<double>> res(size);
grppi :: map(ex, zip(re,im), res,
[]( double r, double i) −> complex<double> { return {r,i}; });
return res;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 40/80
70. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Reductions on data sequences
A reduce pattern combines all values in a data set using a
binary combination operation.
id
result
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 41/80
71. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Homogeneous reduction
Add a sequence of values
template <typename Execution>
double add_sequence(const Execution & ex, const vector<double> & v)
{
return grppi :: reduce(ex, v, 0.0,
[]( double x, double y) { return x+y; }) ;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 42/80
72. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Heterogeneous reduction
Add areas of shapes
template <typename Execution>
double add_areas(const Execution & ex, const std::vector<shape> & shapes)
{
return grppi :: reduce(ex, shapes, 0.0,
[]( double a, const auto & s) {
return a + s.area()() ;
}) ;
}
Note: Better expressed as a map reduce.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 43/80
73. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Map/reduce pattern
A map/reduce pattern combines a map pattern and a
reduce pattern into a single pattern.
1 One or more data sets are mapped applying a
transformation operation.
2 The results are combined by a reduction operation.
A map/reduce could be also expressed by the
composition of a map and a reduce.
However, map/reduce may potentially fuse both stages,
allowing for extra optimizations.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 44/80
74. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Map/reduce
id
result
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 45/80
75. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Single sequence map/reduce
Sum of squares
template <typename Execution>
double sum_squares(const Execution & ex, const std::vector<double> & v)
{
return grppi :: map_reduce(ex, v, 0.0,
[]( double x) { return x∗x; }
[]( double x, double y) { return x+y; }
);
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 46/80
76. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Map/reduce on two data sets
Scalar product
template <typename Execution>
double scalar_product(const Execution & ex,
const std::vector<double> & v1,
const std::vector<double> & v2)
{
return grppi :: map_reduce(ex, zip(v1,v2), 0.0,
[]( double x, double y) { return x∗y; },
[]( double x, double y) { return x+y; }) ;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 47/80
77. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Heterogeneous reduction versus map/reduce
Add areas of shapes
template <typename Execution>
int add_areas(const Execution & ex, const std::vector<shape> & shapes)
{
return grppi :: map_reduce(ex, shapes, 0.0,
[]( const auto & s) { return s.area(); },
[]( double a, double b) { return a+b; } );
}
Note: Better than previous heterogeneous reduce.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 48/80
78. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Cannonical map/reduce
Given a sequence of words, produce a container where:
The key is the word.
The value is the number of occurrences of that word.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 49/80
79. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Cannonical map/reduce
Given a sequence of words, produce a container where:
The key is the word.
The value is the number of occurrences of that word.
Word frequencies
template <typename Execution>
auto word_freq(const Execution & ex, const std::vector<std::string> & words)
{
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 49/80
80. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Cannonical map/reduce
Given a sequence of words, produce a container where:
The key is the word.
The value is the number of occurrences of that word.
Word frequencies
template <typename Execution>
auto word_freq(const Execution & ex, const std::vector<std::string> & words)
{
using namespace std;
using dictionary = std :: map<string,int>;
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 49/80
81. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Cannonical map/reduce
Given a sequence of words, produce a container where:
The key is the word.
The value is the number of occurrences of that word.
Word frequencies
template <typename Execution>
auto word_freq(const Execution & ex, const std::vector<std::string> & words)
{
using namespace std;
using dictionary = std :: map<string,int>;
return grppi :: map_reduce(ex, words, dictionary{},
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 49/80
82. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Cannonical map/reduce
Given a sequence of words, produce a container where:
The key is the word.
The value is the number of occurrences of that word.
Word frequencies
template <typename Execution>
auto word_freq(const Execution & ex, const std::vector<std::string> & words)
{
using namespace std;
using dictionary = std :: map<string,int>;
return grppi :: map_reduce(ex, words, dictionary{},
[]( string w) −> dictionary { return {w,1}; }
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 49/80
83. An introduction to GrPPI
Patterns in GrPPI
Data patterns
Cannonical map/reduce
Given a sequence of words, produce a container where:
The key is the word.
The value is the number of occurrences of that word.
Word frequencies
template <typename Execution>
auto word_freq(const Execution & ex, const std::vector<std::string> & words)
{
using namespace std;
using dictionary = std :: map<string,int>;
return grppi :: map_reduce(ex, words, dictionary{},
[]( string w) −> dictionary { return {w,1}; }
[]( dictionary & lhs, const dictionary & rhs) −> dictionary {
for (auto & entry : rhs) { lhs[entry. first ] += entry.second; }
return lhs;
}) ;
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 49/80
84. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
3 Patterns in GrPPI
Controlling execution
Patterns overview
Data patterns
Streaming patterns
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 50/80
85. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Pipeline pattern
A pipeline pattern allows processing a data stream where
the computation may be divided in multiple stages.
Each stage processes the data item generated in the
previous stage and passes the produced result to the next
stage.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 51/80
86. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Standalone pipeline
A standalone pipeline is a top-level pipeline.
Invoking the pipeline translates into its execution.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 52/80
87. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Standalone pipeline
A standalone pipeline is a top-level pipeline.
Invoking the pipeline translates into its execution.
Given:
A generator g : ∅ → T1 ∪ ∅
A sequence of transformers ti : Ti → Ti+1
For every non-empty value generated by g, it evaluates:
tn(tn−1(. . . t1(g())))
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 52/80
88. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Generators
A generator g is any callable C++ entity that:
Takes no argument.
Returns a value of type T that may hold (or not) a value.
Null value signals end of stream.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 53/80
89. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Generators
A generator g is any callable C++ entity that:
Takes no argument.
Returns a value of type T that may hold (or not) a value.
Null value signals end of stream.
The return value must be any type that:
Is copy-constructible or move-constructible.
T x = g();
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 53/80
90. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Generators
A generator g is any callable C++ entity that:
Takes no argument.
Returns a value of type T that may hold (or not) a value.
Null value signals end of stream.
The return value must be any type that:
Is copy-constructible or move-constructible.
T x = g();
Is contextually convertible to bool
if (x) { /∗ ... ∗/ }
if (!x) { /∗ ... ∗/ }
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 53/80
91. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Generators
A generator g is any callable C++ entity that:
Takes no argument.
Returns a value of type T that may hold (or not) a value.
Null value signals end of stream.
The return value must be any type that:
Is copy-constructible or move-constructible.
T x = g();
Is contextually convertible to bool
if (x) { /∗ ... ∗/ }
if (!x) { /∗ ... ∗/ }
Can be derreferenced
auto val = ∗x;
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 53/80
92. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Generators
A generator g is any callable C++ entity that:
Takes no argument.
Returns a value of type T that may hold (or not) a value.
Null value signals end of stream.
The return value must be any type that:
Is copy-constructible or move-constructible.
T x = g();
Is contextually convertible to bool
if (x) { /∗ ... ∗/ }
if (!x) { /∗ ... ∗/ }
Can be derreferenced
auto val = ∗x;
The standard library offers an excellent candidate
std::experimental::optional<T>.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 53/80
93. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Simple pipeline
x -> x*x -> 1/x -> print
template <typename Execution>
void run_pipe(const Execution & ex, int n)
{
grppi :: pipeline(ex,
[ i=0,max=n] () mutable −> optional<int> {
if ( i<max) return i++;
else return {};
},
[]( int x) −> double { return x∗x; },
[]( double x) { return 1/x; },
[]( double x) { cout << x << "n"; }
);
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 54/80
94. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Nested pipelines
Pipelines may be nested.
An inner pipeline:
Does not take an execution policy.
All stages are transformers (no generator).
The last stage must also produce values.
The inner pipeline uses the same execution policy than
the outer pipeline.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 55/80
95. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Nested pipelines
Image processing
void process(std::istream & in_file , std :: ostream & out_file) {
grppi:parallel_execution_native ex;
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 56/80
96. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Nested pipelines
Image processing
void process(std::istream & in_file , std :: ostream & out_file) {
grppi:parallel_execution_native ex;
grppi :: pipeline(ex,
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 56/80
97. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Nested pipelines
Image processing
void process(std::istream & in_file , std :: ostream & out_file) {
grppi:parallel_execution_native ex;
grppi :: pipeline(ex,
[& in_file ]() −> optional<frame> {
frame f = read_frame(file);
if (! file ) return {};
return f ;
},
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 56/80
100. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Piecewise pipelines
Image processing
void process(std::istream & in_file , std :: ostream & out_file) {
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 57/80
101. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Piecewise pipelines
Image processing
void process(std::istream & in_file , std :: ostream & out_file) {
auto reader = [& in_file ]() −> optional<frame> {
frame f = read_frame(file);
if (! file ) return {};
return f ;
};
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 57/80
102. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Piecewise pipelines
Image processing
void process(std::istream & in_file , std :: ostream & out_file) {
auto reader = [& in_file ]() −> optional<frame> {
frame f = read_frame(file);
if (! file ) return {};
return f ;
};
auto transformer = pipeline(
[]( const frame & f) { return filter (f); },
[]( const frame & f) { return gray_scale(f); },
};
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 57/80
103. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Piecewise pipelines
Image processing
void process(std::istream & in_file , std :: ostream & out_file) {
auto reader = [& in_file ]() −> optional<frame> {
frame f = read_frame(file);
if (! file ) return {};
return f ;
};
auto transformer = pipeline(
[]( const frame & f) { return filter (f); },
[]( const frame & f) { return gray_scale(f); },
};
auto writer = [& out_file ](const frame & f) { write_frame(out_file , f); }
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 57/80
104. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Piecewise pipelines
Image processing
void process(std::istream & in_file , std :: ostream & out_file) {
auto reader = [& in_file ]() −> optional<frame> {
frame f = read_frame(file);
if (! file ) return {};
return f ;
};
auto transformer = pipeline(
[]( const frame & f) { return filter (f); },
[]( const frame & f) { return gray_scale(f); },
};
auto writer = [& out_file ](const frame & f) { write_frame(out_file , f); }
grppi:parallel_execution_native ex;
grppi :: pipeline(ex, reader, transformer, writer );
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 57/80
105. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Farm pattern
A farm is a streaming pattern applicable to a stage in a
pipeline, providing multiple tasks to process data items
from a data stream.
A farm has an associated cardinality which is the number
of parallel tasks used to serve the stage.
Each task in a farm runs a transformer for each data item
it receives.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 58/80
106. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Farms in pipelines
Improving a video
template <typename Execution>
void run_pipe(const Execution & ex, std::ifstream & filein , std :: ofstream & fileout )
{
grppi :: pipeline(ex,
[& filein ] () −> optional<frame> {
frame f = read_frame(filein);
if (! filein ) retrun {};
return f ;
},
farm(4, []( const frame & f) { return improve(f); },
[& fileout ] (const frame & f) { write_frame(f); }
);
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 59/80
107. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Piecewise farms
Improving a video
template <typename Execution>
void run_pipe(const Execution & ex, std::ifstream & filein , std :: ofstream & fileout )
{
auto improver = farm(4, []( const frame & f) { return improve(f); }) ;
grppi :: pipeline(ex,
[& filein ] () −> optional<frame> {
frame f = read_frame(filein);
if (! filein ) retrun {};
return f ;
},
improver,
[& fileout ] (const frame & f) { write_frame(f); }
);
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 60/80
108. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Ordering
Signals if pipeline items must be consumed in the same
order they were produced.
Do they need to be time-stamped?
Default is ordered.
API
ex.enable_ordering()
ex.disable_ordering()
bool o = ex.is_ordered()
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 61/80
109. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Queueing properties
Some policies (native and omp) use queues to
communicate pipeline stages.
Properties:
Queue size: Buffer size of the queue.
Mode: blocking versus lock-free.
API
ex.set_queue_attributes(100, mode::blocking)
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 62/80
110. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Filter pattern
A filter pattern discards (or keeps) the data items from a
data stream based on the outcome of a predicate.
This pattern can be used only as a stage of a pipeline.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 63/80
111. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Filter pattern
A filter pattern discards (or keeps) the data items from a
data stream based on the outcome of a predicate.
This pattern can be used only as a stage of a pipeline.
Alternatives:
Keep: Only data items satisfying the predicate are sent to
the next stage.
Discard: Only data items not satisfying the predicate are
sent to the next stage.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 63/80
112. An introduction to GrPPI
Patterns in GrPPI
Streaming patterns
Filtering in
Print primes
bool is_prime(int n);
template <typename Execution>
void print_primes(const Execution & ex, int n)
{
grppi :: pipeline(exec,
[ i=0,max=n]() mutable −> optional<int> {
if ( i<=n) return i++;
else return {};
},
grppi :: keep(is_prime),
[]( int x) { cout << x << "n"; }
);
}
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 64/80
114. An introduction to GrPPI
Evaluation
1 Introduction
2 Simple use
3 Patterns in GrPPI
4 Evaluation
5 Conclusions
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 66/80
115. An introduction to GrPPI
Evaluation
Example
Video frame processing for border detection with two
filters.
Gaussian blur.
Sobel.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 67/80
116. An introduction to GrPPI
Evaluation
Example
Video frame processing for border detection with two
filters.
Gaussian blur.
Sobel.
Using pipeline pattern:
S1: Frame reading.
S2: Gaussian blur (may apply farm).
S3: Sobel filter (may apply farm).
S4: Frame writing.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 67/80
117. An introduction to GrPPI
Evaluation
Example
Video frame processing for border detection with two
filters.
Gaussian blur.
Sobel.
Using pipeline pattern:
S1: Frame reading.
S2: Gaussian blur (may apply farm).
S3: Sobel filter (may apply farm).
S4: Frame writing.
Approaches:
Manual.
GrPPI.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 67/80
118. An introduction to GrPPI
Evaluation
Parallelization effort
Pipeline % LOC increase
Composition C++ Threads OpenMP Intel TBB GrPPI
( p | p | p | p ) +8.8 % +13.0 % +25.9 % +1.8 %
( p | f | p | p ) +59.4 % +62.6 % +25.9 % +3.1 %
( p | p | f | p ) +60.0 % +63.9 % +25.9 % +3.1 %
( p | f | f | p ) +106.9 % +109.4 % +25.9 % +4.4 %
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 68/80
119. An introduction to GrPPI
Evaluation
Performance: frames per second
1
4
16
64
480p 720p 1080p 1440p 2160p
Fotogramasporsegundo
Pipeline (p|p|p|p)
1
4
16
64
480p 720p 1080p 1440p 2160p
Pipeline (p|f|p|p)
1
4
16
64
480p 720p 1080p 1440p 2160p
Fotogramasporsegundo
Resolución de video
Pipeline (p|p|f|p)
1
4
16
64
256
480p 720p 1080p 1440p 2160p
Pipeline (p|f|f|p)
C++11
GrPPI C++11
OpenMP
GrPPI OpenMP
TBB
GrPPI TBB
Resolución de video
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 69/80
120. An introduction to GrPPI
Evaluation
Brain MRI (Magnetic Resonance Imaging)
Non intrusive method for getting
internal anatomy.
Huge amount of data generated.
Applied to neuro-sciences.
Bipolar disorder.
Paranoia.
Schizophrenia.
Identification of fibers and connectivity between areas in
brain.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 70/80
121. An introduction to GrPPI
Evaluation
Fibbers in brain
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 71/80
122. An introduction to GrPPI
Evaluation
MRI Evaluation
0
100
200
300
400
500
600
700
800
900
1000
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Time(seconds)
Number of threads
GRPPI+SEQ GRPPI+TBB GRPPI+OMP GRPPI+THR GRPPI+FastFlow
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 72/80
123. An introduction to GrPPI
Conclusions
1 Introduction
2 Simple use
3 Patterns in GrPPI
4 Evaluation
5 Conclusions
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 73/80
124. An introduction to GrPPI
Conclusions
Summary
A unified programming model for sequential and
parallel modes.
Multiple back-ends available: sequential, OpenMP, TBB,
native, FastFlow.
Current pattern set:
Data: map, reduce, map/reduce, stencil.
Task: divide/conquer.
Streaming: pipeline with nesting of farm, filter,
reduction, iteration.
Low programming effort.
Very low performance overhead.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 74/80
125. An introduction to GrPPI
Conclusions
Further reading
A Generic Parallel Pattern Interface for Stream and Data Processing. D. del
Rio, M. F. Dolz, J. Fernández, J. D. García. Concurrency and Computation:
Practice and Experience. 2017.
Exploring stream parallel patterns in distributed MPI environments. Javier
López-Gómez, Javier Fernández Muñoz, David del Rio Astorga, Manuel F. Dolz,
J. Daniel Garcia. Parallel Computing. 84:24–36, 2019.
Challenging the abstraction penalty in parallel patterns libraries. J. D.
Garcia, D. del Rio, M. Aldinucci, F. Tordini, M. Danelutto, G. Mencagli, M.
Torquati. Journal of Supercomputing, 21pp. 2019.
Paving the way towards high-level parallel pattern interfaces for data
stream processing. D. del Rio, M. F. Dolz, J. Fernandez, and J. D. Garcia.
Future Generation Computer Systems 87:228-241. 2018
Supporting Advanced Patterns in GrPPI: a Generic Parallel Pattern
Interface. D. del rio, M. F. Dolz, J. Fernandez, and J. D. Garcia, Auto-DaSP 2017
(Euro-Par 2017).
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 75/80
126. An introduction to GrPPI
Conclusions
Further reading
Towards Automatic Parallelization of Stream Processing Applications. M. F.
Dolz, D. del Rio, J. Fernandez, J. D. Garcia, and J. Carretero, IEEE Access,
6:39944-39961. 2018.
Finding parallel patterns through static analysis in C++ applications. D. R.
del Astorga, M. F. Dolz, L. M. Sanchez, J. D. Garcia, M. Danelutto, and M.
Torquati, International Journal of High Performance Computing Applications,
2017.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 76/80
127. An introduction to GrPPI
Conclusions
GrPPI
https://github.com/arcosuc3m/grppi
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 77/80
128. An introduction to GrPPI
Conclusions
Future work
Additional backends:
Accelerator integration, distributed computing, . . .
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 78/80
129. An introduction to GrPPI
Conclusions
Future work
Additional backends:
Accelerator integration, distributed computing, . . .
Next generation C++:
Concepts (reduce metaprogramming), modules.
Executors: parallel versus reactive models.
Ranges.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 78/80
130. An introduction to GrPPI
Conclusions
Future work
Additional backends:
Accelerator integration, distributed computing, . . .
Next generation C++:
Concepts (reduce metaprogramming), modules.
Executors: parallel versus reactive models.
Ranges.
More patterns and less patterns:
Simplified interface.
Pattern catalog extension.
Minimal building blocks identification.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 78/80
131. An introduction to GrPPI
Conclusions
Future work
Additional backends:
Accelerator integration, distributed computing, . . .
Next generation C++:
Concepts (reduce metaprogramming), modules.
Executors: parallel versus reactive models.
Ranges.
More patterns and less patterns:
Simplified interface.
Pattern catalog extension.
Minimal building blocks identification.
Pattern composition:
Pattern fusion.
Pattern transformation rules.
Multi-context approach for heterogenous computing.
Better support of NUMA and affinity.
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 78/80
132. An introduction to GrPPI
Conclusions
The future
grppi :: omp_execution omp;
grppi :: cuda_execution cuda;
std :: vector<int> v = get() ;
auto comp = v |
grppi :: map([](auto x) { return x∗x; }) |
grppi :: reduce([](auto x, auto y) { return x+y; }) ;
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 79/80
133. An introduction to GrPPI
Conclusions
The future
grppi :: omp_execution omp;
grppi :: cuda_execution cuda;
std :: vector<int> v = get() ;
auto comp = v |
grppi :: map([](auto x) { return x∗x; }) |
grppi :: reduce([](auto x, auto y) { return x+y; }) ;
comp >> grppi::on(omp);
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 79/80
135. An introduction to GrPPI
Conclusions
An introduction to GrPPI
Generic Reusable Parallel Patterns Interface
J. Daniel Garcia
ARCOS Group
University Carlos III of Madrid
Spain
November 2019
cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 80/80