In this video from the Stanford HPC Conference, Elliott Slaughter, Wonchan Lee, Todd Warszawski, and Karthik Murthy from Stanford and SLAC present: Advances in the Legion Programming Model.
"Legion is an exascale-ready parallel programming model that simplifies the mapping of a complex, large-scale simulation code on a modern heterogeneous supercomputer. Legion relieves scientists and engineers of several burdens: they no longer need to determine which tasks depend on other tasks, specify where calculations will occur, or manage the transmission of data to and from the processors. In this talk, we will focus on three aspects of the Legion programming system, namely, dynamic tracing, projection functions, and vectorization."
Watch the video: https://wp.me/p3RLHQ-icf
Learn more: http://legion.stanford.edu/ and http://hpcadvisorycouncil.com
Advances in the Legion Programming Model
1. Stanford Conference, February 20-21, 2018
http://legion.stanford.edu
Elliott Slaughter (2), Wonchan Lee (1), Todd Warszawski (1), Karthik Murthy (1)
(1) Stanford University
(2) SLAC
Best Practices: Advances in the Legion Programming Model
2. Why You Should Care: Performance
1.9x–2.8x better performance (Titan supercomputer, S3D combustion simulation)
3. Why You Should Care: Ease of Use
Simulated “Primary Reference Fuel” mechanism
Too computationally intensive until now
Switched machines halfway through
Performance-tuned for new machine in hours
[Figure: speedup bars for Titan and Piz Daint, labeled 7.2x, 3.9x, 4.0x, 4.0x]
4. Task-Based Programming
task foo(x,y,z: region(…))
where reads writes(x,y,z) do
  bar(y,x)
  bar(x,y)
  bar(x,z)
  bar(z,y)
end
task bar(r,s: region(…)) where reads(r), writes(s)
[Figure: the Legion Runtime turns these calls into a task graph. bar(y,x) comes first, bar(x,y) and bar(x,z) sit side by side below it, and bar(z,y) comes last.]
5. Can We Get Performance Today?
Yes … at great cost:
Task graph for one time step on one node … of a mini-app
Who will schedule the graph? (High Performance)
Who will re-schedule the graph for every new machine? (Performance Portability)
Who is responsible for generating the graph? (Programmability)
Today: programmer’s responsibility
Tomorrow: programming system’s responsibility
6. Legion: Tasks & Regions
A task is the unit of parallel execution
I.e. a function
Task arguments are regions
Collections
Rows are an index space
Columns are fields
Tasks declare how they use their regions
task saxpy(is : ispace(int1d), x,y: region(is, float), a: float )
where reads(x, y), writes(y)
[Figure: example region. Index column: 0, 1, 2, 3, 4. Field column: 2.72, 3.14, 42.0, 12.7, 0.0.]
7. Example Task
task saxpy(is: ispace(int1d), x: region(is, float),
y: region(is, float), a: float)
where
reads(x, y), writes(y)
do
for i in is do
y[i] += a*x[i]
end
end
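The task above can be mimicked in plain Python to see what it computes. This is only a sketch of the semantics, not Regent or the Legion API: the index space is modeled as a `range`, each region as a `dict` from index to value, and the input values are made up for illustration.

```python
# Hypothetical model: a region is a dict mapping every index of the
# index space to a float. Nothing here calls into Legion or Regent.

def saxpy(index_space, x, y, a):
    """Reads x and y, writes y: y[i] += a * x[i] for each i in the index space."""
    for i in index_space:
        y[i] += a * x[i]


index_space = range(5)
x = {i: float(i) for i in index_space}  # illustrative values 0.0 .. 4.0
y = {i: 1.0 for i in index_space}       # illustrative initial value

saxpy(index_space, x, y, 2.0)
print([y[i] for i in index_space])  # [1.0, 3.0, 5.0, 7.0, 9.0]
```

The Python version runs sequentially. In Legion, the declared privileges (reads on x and y, writes on y) are what let the runtime decide which other tasks may run at the same time.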
8. Tasks
Tasks can call subtasks
Sequential semantics, implicit parallelism
If tasks do not interfere, they can be executed in parallel
task foo(x,y,z: region(…))
where reads writes(x,y,z) do
bar(y,x)
bar(x,y)
bar(x,z)
bar(z,y)
end
task bar(r,s: region(…)) where reads(r), writes(s)
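How the four calls to bar in foo become a dependence graph can be sketched in Python. This is a simplified model under stated assumptions, not the Legion runtime's algorithm: regions are compared by name only (no subregions or fields), and the helper names `bar`, `interferes`, `calls`, and `edges` are hypothetical.

```python
from itertools import combinations


def bar(r, s):
    # Mirrors `task bar(r,s: region(...)) where reads(r), writes(s)`.
    return {"name": f"bar({r},{s})", "reads": {r}, "writes": {s}}


def interferes(a, b):
    """Two tasks interfere if one writes a region the other reads or writes."""
    return bool(
        a["writes"] & (b["reads"] | b["writes"])
        or b["writes"] & (a["reads"] | a["writes"])
    )


# Calls in the order they appear in the body of foo (program order).
calls = [bar("y", "x"), bar("x", "y"), bar("x", "z"), bar("z", "y")]

# An edge (i, j) with i < j means call j must wait for call i.
edges = [
    (i, j)
    for i, j in combinations(range(len(calls)), 2)
    if interferes(calls[i], calls[j])
]

for i, j in edges:
    print(calls[i]["name"], "->", calls[j]["name"])
```

In this model bar(x,y) and bar(x,z) share no edge, since both only read x and they write different regions, so those two calls could run in parallel while the program still behaves as if the calls ran in order.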
9. More on Permissions
Tasks declare permissions on regions
task bar(r: region(…)) where reads(r)
task bar(r: region(…)) where writes(r)
task bar(r: region(…)) where reduces +(r)
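Why a reduction permission is worth declaring separately from a write can be sketched as follows. This is a simplified Python model, not the Legion API: privileges are plain strings, `compatible` is a hypothetical helper, and integer contributions are used so that the sums are exact.

```python
from itertools import permutations


def compatible(p, q):
    """True if two privileges on the same region need no ordering (simplified model)."""
    if p == "reads" and q == "reads":
        return True
    # Reductions with the same operator commute, so either order is fine.
    if p.startswith("reduces") and p == q:
        return True
    # Anything involving a write, or mixing kinds, must be ordered.
    return False


# Three tasks each reduce one contribution into the same location with +.
contributions = [3, 5, 7]
totals = {sum(order) for order in permutations(contributions)}

print(compatible("reads", "reads"))          # True
print(compatible("reduces +", "reduces +"))  # True
print(compatible("reads", "writes"))         # False
print(totals)                                # {15}
```

Every ordering of the contributions gives the same total, which is the property that lets tasks holding the same reduction permission on a region proceed without waiting on each other.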
10. Regions
Regions can be partitioned into subregions
Partitioning is a primitive operation
Supports describing arbitrary subsets of a region
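Partitioning into arbitrary subsets can be illustrated with a small Python model. It is a sketch under assumptions, not Legion's partitioning API: a region is just a set of indices, and `partition`, `is_disjoint`, and `halo_colors` are hypothetical helpers invented for this example.

```python
def partition(region, coloring):
    """Split a region (a set of indices) into subregions keyed by color.

    `coloring` maps an index to an iterable of colors, so an index may land
    in several subregions (overlap) or in none.
    """
    subregions = {}
    for i in sorted(region):
        for color in coloring(i):
            subregions.setdefault(color, set()).add(i)
    return subregions


def is_disjoint(subregions):
    """True if no index appears in more than one subregion."""
    seen = set()
    for sub in subregions.values():
        if seen & sub:
            return False
        seen |= sub
    return True


def halo_colors(i):
    # The block of i plus the blocks of its left and right neighbours.
    return {c for c in ((i - 1) // 4, i // 4, (i + 1) // 4) if c in (0, 1)}


region = set(range(8))
blocks = partition(region, lambda i: [i // 4])  # two equal, disjoint blocks
halo = partition(region, halo_colors)           # blocks that overlap at the seam

print(blocks, is_disjoint(blocks))
print(halo, is_disjoint(halo))
```

The same region is partitioned twice here, once into disjoint blocks and once into overlapping ones, which is the kind of flexibility the slide refers to when it says partitions can describe arbitrary subsets.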
11. Partitioning
[Figure: region tree. Region N with labels P and S and subregions p1, p2, p3, s1, s2, s3, g1, g2, g3; region W with subregions w1, w2, w3.]
12. Regent: A Language for Legion
Easy to use and significantly less code
Type checker for Legion semantics
Compiler matches performance of hand-written Legion (including kernels: vectorization, GPU, etc.)
task saxpy(is : ispace(int1d), x: region(is, float),
y: region(is, float), a: float)
where reads(x, y), writes(y) do
for i in is do
y[i] += a*x[i]
end
end
13. Legion Summary
The programmer
Describes the structure of the program’s data
Regions
The tasks that operate on that data
The Legion runtime
Guarantees tasks appear to execute in sequential order
Ensures tasks have the correct versions of their regions
The Regent language
Type system checks correctness of programs
Significantly easier to use, less code
Compiler matches performance of hand-written Legion
15. Legion Architecture
[Figure: Legion software stack. Labels: applications; mappers; DSL compilers; Regent (compiler); Bishop (compiler); Legion (runtime); Realm; Isometry (DMA); data model/partitioning; type system; func/perf verif tools; POSIX, CUDA, GASNet, libnuma, pthreads.]
Editor's Notes
hiding latency
Callback to first graph
This is a graph of operations and dependencies in an application. Most of today’s systems (e.g., MPI) do not express it explicitly, but it is there, implicitly, in the programmer’s head. For example, two operations without a common dependence can be scheduled in parallel onto different processors.
“Collections, which you can think of as being like arrays of structs”
Borrow task names from S3D
Unique to Legion: multiple partitions, hierarchy
Solves the three problems: much easier syntax to learn; enforces all semantic requirements through compile-time checking; supports generating efficient kernel code (e.g., vectorization) since it is a compiler. Now the right way to learn Legion even if you plan to use the C++ API.