Multicore computers are now part of our everyday life. Most desktop and laptop machines ship with a dual-core, quad-core or even 8-core processor by default. Putting more cores on the same chip is set to become the main way for manufacturers to remain competitive. Developers, however, were somewhat left out of this process: after writing sequential programs for ages, they are now required to parallelize their programs to make them efficient, which isn't an easy step to take.
That's why we now see the emergence of frameworks designed to help programmers take advantage of this new processor architecture by hiding the difficulty of parallelism behind primitives they are already used to. ParallelFx is one such framework for the Mono and .NET world. By providing several new parallel constructs and concurrent data structures, it allows Mono applications to enter this new multicore era painlessly.
ParallelFx, bringing Mono applications in the multicore era
1. ParallelFx
Bringing Mono applications in the multicore era
Jérémie Laval
@ jeremie.laval@gmail.com
http://blog.neteril.org
http://twitter.com/jeremie_laval
IRC Garuma on #mono @ GIMPNet.org
2. Outline
Why bother?
The free lunch
An awesome idea
Remaining performant
ParallelFx
The big picture
Tasks and co
PLinq
State of things
A note for the future
5. Why bother? ParallelFx A note for the future Questions
Why are we bothering with parallelization?
Everyone loves his single thread
“The ideal number of threads you should use is 1”
– Alan McGovern
Mono hacker
8. Why bother? ParallelFx A note for the future Questions
Why are we bothering with parallelization?
Because the free lunch is over!
9. Why bother? ParallelFx A note for the future Questions
Free lunch?
70’s - 2005: Moore’s law in action
[Figure: Intel processor clock speed (MHz, logarithmic axis from 0.1 to 10000) against year, 1968 to 2006. Labelled points: 8080, 80286, 80386, 80486, Pentium, Celeron, Pentium III, Pentium 4 (Prescott), Core 2 Extreme. Annotation: "Multicore crisis is here!"]
Source: Smoothspan blog
11. Why bother? ParallelFx A note for the future Questions
The awesome idea
Can’t scale vertically? Scale horizontally!
12. Why bother? ParallelFx A note for the future Questions
Trend of things
Number of cores per processor:
Pentium: 1
Core Duo: 2
Core 2 Quad: 4
Core i7-980X: 12ish
Intel prototype: 80
23. Why bother? ParallelFx A note for the future Questions
Parallelization is difficult
Hard: is that stuff really thread safe?
Tedious: whose turn is it to debug deadlocks?
Inefficient: how many threads to use?
Too few: it’s not scaling
Too many: context-switching hurts performance
What if the number of cores changes?
28. Why bother? ParallelFx A note for the future Questions
KISS time (Keep it simple, stupid)
We need something different:
Automagically regulate thread usage at runtime
As straightforward as possible
Simulate familiar constructs
Reuse existing code with only slight modifications
29. Why bother? ParallelFx A note for the future Questions
Enter ParallelFx
ParallelFx at a glance
30. Why bother? ParallelFx A note for the future Questions
At the heart
Work-stealing scheduler
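[Diagram: the Mono application hands work to the ParallelFX library, which holds a shared work pool (the scheduler). The scheduler manages worker threads, each backed by an OS thread and each owning a local work pool. A worker retrieves work from the shared pool and can steal work from another worker's local pool.]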
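The work-stealing idea on this slide can be sketched in a few dozen lines. This is not how ParallelFx or Mono implements its scheduler; it is a minimal, lock-based illustration, and the names `WorkerQueue`, `TryPopLocal`, `TrySteal` and `WorkStealingSketch` are invented for the example. Each worker owns a double-ended queue: the owner pops from one end, and idle workers steal from the other end.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical per-worker queue; a production scheduler would use a lock-free deque.
class WorkerQueue
{
    readonly LinkedList<Action> items = new LinkedList<Action> ();

    public void Push (Action work)
    {
        lock (items) items.AddLast (work);
    }

    // The owner takes the most recently pushed item.
    public bool TryPopLocal (out Action work)
    {
        lock (items) {
            if (items.Count == 0) { work = null; return false; }
            work = items.Last.Value;
            items.RemoveLast ();
            return true;
        }
    }

    // A thief takes the oldest item, from the opposite end.
    public bool TrySteal (out Action work)
    {
        lock (items) {
            if (items.Count == 0) { work = null; return false; }
            work = items.First.Value;
            items.RemoveFirst ();
            return true;
        }
    }
}

static class WorkStealingSketch
{
    // Tries every other worker's queue in turn.
    static bool TryStealFromOthers (WorkerQueue[] queues, int thief, out Action work)
    {
        for (int k = 1; k < queues.Length; k++)
            if (queues[(thief + k) % queues.Length].TrySteal (out work))
                return true;
        work = null;
        return false;
    }

    // Runs every item on 'workerCount' threads and returns how many were executed.
    // Workers stop when no queue holds work, which is only valid here because
    // the items never push new work.
    public static int RunAll (IList<Action> work, int workerCount)
    {
        var queues = new WorkerQueue[workerCount];
        for (int i = 0; i < workerCount; i++)
            queues[i] = new WorkerQueue ();
        // Deliberately unbalanced: everything starts on worker 0,
        // so the other workers can only make progress by stealing.
        foreach (Action a in work)
            queues[0].Push (a);

        int executed = 0;
        var threads = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++) {
            int me = i;
            threads[i] = new Thread (() => {
                Action item;
                while (queues[me].TryPopLocal (out item)
                       || TryStealFromOthers (queues, me, out item)) {
                    item ();
                    Interlocked.Increment (ref executed);
                }
            });
            threads[i].Start ();
        }
        foreach (Thread t in threads) t.Join ();
        return executed;
    }

    static void Main ()
    {
        var work = new List<Action> ();
        int sum = 0;
        for (int i = 1; i <= 100; i++) {
            int n = i;
            work.Add (() => Interlocked.Add (ref sum, n));
        }
        int executed = RunAll (work, 4);
        Console.WriteLine ("{0} items, sum {1}", executed, sum); // 100 items, sum 5050
    }
}
```

Popping locally from the newest end and stealing from the oldest end keeps the owner and the thieves working on opposite ends of the queue, which is the usual motivation for this layout.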
32. Why bother? ParallelFx A note for the future Questions
Future (Task<T>)
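static int SumParallel (Tree<int> tree, int curDepth)
{
    // Past this depth, spawning more tasks costs more than it saves
    const int SequentialThreshold = 3;
    if (tree == null) return 0;
    if (curDepth > SequentialThreshold)
        return SumSequentialInternal (tree);
    // Start the left subtree as a task first, so that it can run on another
    // worker while this thread sums the right subtree
    Task<int> left =
        Task.Factory.StartNew (() => SumParallel (tree.Left, curDepth + 1));
    int right = SumParallel (tree.Right, curDepth + 1);
    // Result blocks until the left task has completed
    return tree.Data + left.Result + right;
}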
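The snippet on this slide relies on a `Tree<int>` type and a `SumSequentialInternal` helper that the deck does not show. The following is a minimal self-contained sketch of the same pattern, assuming a plain binary node class and a recursive sequential sum; `Tree<T>`, `BuildOnes` and the `TreeSum` class are hypothetical names chosen for the example, and `Task<T>.Result` is the property used to read the value in the released .NET 4 API.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical binary tree node; the slide does not show the Tree<T> type.
class Tree<T>
{
    public T Data;
    public Tree<T> Left, Right;
}

static class TreeSum
{
    const int SequentialThreshold = 3;

    // Assumed shape of the sequential fallback named on the slide.
    public static int SumSequentialInternal (Tree<int> tree)
    {
        if (tree == null) return 0;
        return tree.Data
            + SumSequentialInternal (tree.Left)
            + SumSequentialInternal (tree.Right);
    }

    public static int SumParallel (Tree<int> tree, int curDepth)
    {
        if (tree == null) return 0;
        if (curDepth > SequentialThreshold)
            return SumSequentialInternal (tree);
        // The left subtree runs as a task while this thread handles the right one.
        Task<int> left =
            Task.Factory.StartNew (() => SumParallel (tree.Left, curDepth + 1));
        int right = SumParallel (tree.Right, curDepth + 1);
        return tree.Data + left.Result + right;
    }

    // Builds a complete tree of the given height in which every node holds 1,
    // so the expected sum is the node count, 2^height - 1.
    public static Tree<int> BuildOnes (int height)
    {
        if (height == 0) return null;
        return new Tree<int> {
            Data = 1,
            Left = BuildOnes (height - 1),
            Right = BuildOnes (height - 1)
        };
    }

    static void Main ()
    {
        Tree<int> tree = BuildOnes (10);
        Console.WriteLine (SumSequentialInternal (tree)); // 1023
        Console.WriteLine (SumParallel (tree, 0));        // 1023
    }
}
```

The depth threshold limits how many tasks are created: only the top levels of the tree fork, and everything below is summed sequentially inside whichever task reached it.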
33. Why bother? ParallelFx A note for the future Questions
Parallel.For
Fractal fractal = new Fractal (width, height);
ColorChooser colorChooser = new ColorChooser ();
Parallel.For (0, width, (i) => {
for (int j = 0; j < height; j++) {
ProcessPixel (i, j, fractal, colorChooser);
}
});
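The `Fractal`, `ColorChooser` and `ProcessPixel` names above belong to the demo and are not shown in the deck. As a rough stand-in, the sketch below assumes a Mandelbrot escape-time computation (the `Escape` helper and `ParallelForSketch` class are invented for illustration) and shows the "slight modification" between a sequential loop and `Parallel.For`: the loop body is unchanged and only the outer loop is swapped.

```csharp
using System;
using System.Threading.Tasks;

static class ParallelForSketch
{
    // Hypothetical stand-in for ProcessPixel: Mandelbrot escape count for pixel (i, j).
    static int Escape (int i, int j, int width, int height, int maxIter)
    {
        double cr = -2.0 + 3.0 * i / width;
        double ci = -1.0 + 2.0 * j / height;
        double zr = 0, zi = 0;
        int n = 0;
        while (n < maxIter && zr * zr + zi * zi <= 4.0) {
            double t = zr * zr - zi * zi + cr;
            zi = 2 * zr * zi + ci;
            zr = t;
            n++;
        }
        return n;
    }

    public static int[,] Render (int width, int height, int maxIter, bool parallel)
    {
        var image = new int[width, height];
        // Each call writes only to column i, so iterations share no mutable state.
        Action<int> column = i => {
            for (int j = 0; j < height; j++)
                image[i, j] = Escape (i, j, width, height, maxIter);
        };
        if (parallel)
            Parallel.For (0, width, column);
        else
            for (int i = 0; i < width; i++)
                column (i);
        return image;
    }

    public static bool SameImage (int[,] a, int[,] b)
    {
        for (int i = 0; i < a.GetLength (0); i++)
            for (int j = 0; j < a.GetLength (1); j++)
                if (a[i, j] != b[i, j]) return false;
        return true;
    }

    static void Main ()
    {
        int[,] seq = Render (200, 100, 200, false);
        int[,] par = Render (200, 100, 200, true);
        Console.WriteLine (SameImage (seq, par)); // True
    }
}
```

The parallel version is safe without locks only because every iteration writes to its own column; a body that updated a shared counter or collection would need synchronization or a concurrent collection.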
34. Why bother? ParallelFx A note for the future Questions
Demo: Parallel For
Demo: image processing with parallel loops
35. Why bother? ParallelFx A note for the future Questions
PLinq
ParallelEnumerable.Range (1, 1000)./* ... */
ParallelEnumerable.Repeat ("Rupert", 1000)./* ... */
enumerable.AsParallel ()./* ... */
var query = from x in Directory.GetFiles ("/etc/")
.AsParallel ()
where x.EndsWith (".conf")
select x;
query.ForAll ((e) => {
Console.WriteLine ("{0} from {1}", e,
Thread.CurrentThread.ManagedThreadId);
});
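The query above depends on the contents of `/etc/`, so here is a small sketch that runs anywhere. It assumes a naive `IsPrime` helper (invented for the example) and uses two of the entry points listed on the slide, `ParallelEnumerable.Range` and `AsParallel`, plus `AsOrdered` for the case where the output order must match the input order.

```csharp
using System;
using System.Linq;

static class PlinqSketch
{
    // Naive trial-division test, only meant to give each element some work to do.
    public static bool IsPrime (int n)
    {
        if (n < 2) return false;
        for (int d = 2; d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    // Counting does not depend on ordering, so no AsOrdered is needed.
    public static int CountPrimes (int upTo)
    {
        return ParallelEnumerable.Range (1, upTo)
            .Where (n => IsPrime (n))
            .Count ();
    }

    // AsOrdered asks PLinq to preserve the source order in the results.
    public static int[] PrimesInOrder (int upTo)
    {
        return Enumerable.Range (1, upTo)
            .AsParallel ()
            .AsOrdered ()
            .Where (n => IsPrime (n))
            .ToArray ();
    }

    static void Main ()
    {
        Console.WriteLine (CountPrimes (100)); // 25
        string[] primes = PrimesInOrder (20).Select (p => p.ToString ()).ToArray ();
        Console.WriteLine (string.Join (", ", primes)); // 2, 3, 5, 7, 11, 13, 17, 19
    }
}
```

Without `AsOrdered`, a PLinq query is free to return elements in whatever order the workers produce them, which is why the `ForAll` output on the slide can interleave differently on each run.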
36. Why bother? ParallelFx A note for the future Questions
Demo: PLinq
Demo: raytracing the PLinq way
37. Why bother? ParallelFx A note for the future Questions
State of things
Mono 2.6 (December 2009): .NET 4 beta 1
Task, Future, Parallel loops
Concurrent collections
Coordination data structures
Mono 2.8 (or git master): .NET 4 compliant
Enabled by default
Tons of improvements
With PLinq!
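The deck lists concurrent collections and coordination data structures without showing them. As a hedged illustration, the sketch below uses three types from the .NET 4 API, `ConcurrentDictionary`, `ConcurrentQueue` and `CountdownEvent`; the `CollectionsSketch` class and its method names are invented for the example, and whether a given Mono release ships each type should be checked against that release.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class CollectionsSketch
{
    // Counts how many numbers in [0, count) fall into each remainder class
    // modulo 'buckets', updating a shared dictionary from parallel iterations.
    public static ConcurrentDictionary<int, int> Histogram (int count, int buckets)
    {
        var histogram = new ConcurrentDictionary<int, int> ();
        Parallel.For (0, count, i => {
            // Adds 1 for a new key, or atomically replaces the old count.
            histogram.AddOrUpdate (i % buckets, 1, (key, old) => old + 1);
        });
        return histogram;
    }

    // Several producers fill one queue; the CountdownEvent is signalled once
    // per producer, and Wait returns when all of them have finished.
    public static int ProduceAll (int producers, int itemsEach)
    {
        var queue = new ConcurrentQueue<int> ();
        // Left undisposed to keep the sketch short.
        var done = new CountdownEvent (producers);
        for (int p = 0; p < producers; p++) {
            int id = p;
            Task.Factory.StartNew (() => {
                for (int k = 0; k < itemsEach; k++)
                    queue.Enqueue (id * itemsEach + k);
                done.Signal ();
            });
        }
        done.Wait ();
        return queue.Count;
    }

    static void Main ()
    {
        var histogram = Histogram (1000, 4);
        Console.WriteLine (histogram[0]);        // 250
        Console.WriteLine (ProduceAll (4, 250)); // 1000
    }
}
```

In both methods the shared structure does the synchronization, so the calling code contains no explicit locks.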
43. Why bother? ParallelFx A note for the future Questions
A note for the future
GLinq: GPGPU-powered Linq
Re-use PLinq pipeline
Same design guidelines as PLinq: simplicity, transparency
Rewrite C# code in OpenCL (Expression trees FTW!)
Transparent mapping of OpenCL idioms to .NET