Advances in the Legion Programming Model

In this video from the Stanford HPC Conference, Elliott Slaughter, Wonchan Lee, Todd Warszawski, and Karthik Murthy from Stanford present "Advances in the Legion Programming Model":

"Legion is an exascale-ready parallel programming model that simplifies the mapping of a complex, large-scale simulation code on a modern heterogeneous supercomputer. Legion relieves scientists and engineers of several burdens: they no longer need to determine which tasks depend on other tasks, specify where calculations will occur, or manage the transmission of data to and from the processors. In this talk, we will focus on three aspects of the Legion programming system, namely, dynamic tracing, projection functions, and vectorization."

Watch the video: https://wp.me/p3RLHQ-icf
Learn more: http://legion.stanford.edu/ and http://hpcadvisorycouncil.com

Stanford Conference, February 20-21, 2018
1
http://legion.stanford.edu
Best Practices: Advances in the Legion Programming Model
Elliott Slaughter (2), Wonchan Lee (1), Todd Warszawski (1), Karthik Murthy (1)
(1) Stanford University
(2) SLAC

Why You Should Care: Performance
1.9x–2.8x better performance
(Titan supercomputer, S3D combustion simulation)

Why You Should Care: Ease of Use
Simulated “Primary Reference Fuel” mechanism
Too computationally intensive until now
Switched machines halfway through
Performance-tuned for new machine in hours
[Chart: results on Titan and Piz Daint; bar labels 7.2x, 3.9x, 4.0x, 4.0x]

Legion Runtime
Task-Based Programming
task foo(x,y,z: region(…))
where reads writes(x,y,z) do
  bar(y,x)
  bar(x,y)
  bar(x,z)
  bar(z,y)
end
task bar(r,s: region(…)) where reads(r), writes(s)
[Figure: task graph in three levels: bar(y,x); then bar(x,y) and bar(x,z); then bar(z,y)]

Can We Get Performance Today?
Yes … at great cost:
Task graph for one time step on one node … of a mini-app
Who will schedule the graph? (High Performance)
Who will re-schedule the graph for every new machine? (Performance Portability)
Who is responsible for generating the graph? (Programmability)
Today: programmer’s responsibility
Tomorrow: programming system’s responsibility

Legion: Tasks & Regions
A task is the unit of parallel execution
  I.e. a function
Task arguments are regions
  Collections
  Rows are an index space
  Columns are fields
Tasks declare how they use their regions
task saxpy(is: ispace(int1d), x, y: region(is, float), a: float)
where reads(x, y), writes(y)
[Figure: a region with rows indexed 0-4 holding the values 2.72, 3.14, 42.0, 12.7, 0.0]

Example Task
task saxpy(is: ispace(int1d), x: region(is, float),
           y: region(is, float), a: float)
where
  reads(x, y), writes(y)
do
  for i in is do
    y[i] += a*x[i]
  end
end
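
For readers who do not know Regent, the following is a minimal Python sketch of what the saxpy task computes. It is an illustrative model, not Regent and not the Legion API: it assumes a region with one field can be stood in for by a dict from index-space point to value, and the size of the index space and the input values are made up for the example.

```python
# Illustrative model only: a single-field region is a dict from point to value.
def saxpy(index_space, x, y, a):
    """Reads x and y, writes y: y[i] += a * x[i] for every point i."""
    for i in index_space:
        y[i] += a * x[i]

index_space = range(5)                   # assumed: a 1-D index space of 5 points
x = {i: float(i) for i in index_space}   # hypothetical input values
y = {i: 1.0 for i in index_space}        # hypothetical input values
saxpy(index_space, x, y, 2.0)
print(y)
```

Both regions share one index space, which is what lets the loop use the same point `i` to address `x` and `y`.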

Tasks
Tasks can call subtasks
Sequential semantics, implicit parallelism
If tasks do not interfere, they can be executed in parallel
task foo(x,y,z: region(…))
where reads writes(x,y,z) do
  bar(y,x)
  bar(x,y)
  bar(x,z)
  bar(z,y)
end
task bar(r,s: region(…)) where reads(r), writes(s)
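
How sequential semantics can yield implicit parallelism is easier to see with a small model. The Python sketch below is not the Legion runtime's algorithm; it assumes whole-region granularity and a simple rule (two tasks interfere when one writes a region the other reads or writes), and the helper names `interferes`, `bar`, and `schedule` are hypothetical. Each task is placed in the earliest wave after every earlier task it interferes with.

```python
def interferes(a, b):
    """Two tasks interfere if one writes a region the other reads or writes."""
    return bool(a["writes"] & (b["reads"] | b["writes"])
                or b["writes"] & a["reads"])

def bar(r, s):
    # Mirrors the declared privileges: bar reads r and writes s.
    return {"name": f"bar({r},{s})", "reads": {r}, "writes": {s}}

def schedule(tasks):
    """Group tasks into waves, preserving program order between interfering tasks."""
    levels = []
    for i, task in enumerate(tasks):
        earlier = [levels[j] for j in range(i) if interferes(tasks[j], task)]
        levels.append(1 + max(earlier, default=-1))
    waves = {}
    for task, level in zip(tasks, levels):
        waves.setdefault(level, []).append(task["name"])
    return [waves[level] for level in sorted(waves)]

# The body of foo, in program order.
program = [bar("y", "x"), bar("x", "y"), bar("x", "z"), bar("z", "y")]
waves = schedule(program)
print(waves)
```

Under these assumptions, `bar(x,y)` and `bar(x,z)` land in the same wave because both only read `x` and write different regions, while `bar(z,y)` must wait for both.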

More on Permissions
Tasks declare permissions on regions
task bar(r: region(…)) where reads(r)
task bar(r: region(…)) where writes(r)
task bar(r: region(…)) where reduces +(r)
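
The practical effect of these three permissions can be sketched as a compatibility check between two tasks that name the same region. This Python model is an assumption-laden simplification, not Legion's implementation: privileges are plain strings, `compatible` is a hypothetical helper, and the rule encoded is that two readers may run in parallel, two reductions with the same operator may run in parallel, and every other pairing is treated as interfering.

```python
def compatible(p, q):
    """True if two privileges on the same region permit parallel execution."""
    if p == "reads" and q == "reads":
        return True
    # Reductions with the same operator can be applied in any order.
    if p.startswith("reduces") and p == q:
        return True
    return False

pairs = [("reads", "reads"), ("reads", "writes"),
         ("reduces +", "reduces +"), ("reduces +", "reduces *"),
         ("writes", "writes")]
results = {pair: compatible(*pair) for pair in pairs}
for pair, ok in results.items():
    print(pair, "parallel" if ok else "ordered")
```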

Regions
Regions can be partitioned into subregions
Partitioning is a primitive operation
Supports describing arbitrary subsets of a region
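
As a rough illustration of partitioning, the Python sketch below models subregions as sets of index-space points. It is not the Legion or Regent partitioning API; `equal_partition` and `is_disjoint` are hypothetical helpers, and the overlapping `halo` subsets are invented to show that a partition may describe arbitrary, even overlapping, subsets of a region.

```python
def equal_partition(index_space, n):
    """Split an index space into n contiguous blocks of near-equal size."""
    points = list(index_space)
    size, extra = divmod(len(points), n)
    blocks, start = [], 0
    for color in range(n):
        stop = start + size + (1 if color < extra else 0)
        blocks.append(set(points[start:stop]))
        start = stop
    return blocks

def is_disjoint(subregions):
    """A partition is disjoint if no point belongs to two subregions."""
    return sum(len(s) for s in subregions) == len(set().union(*subregions))

blocks = equal_partition(range(10), 3)
# Arbitrary subsets chosen by hand; neighbours share points.
halo = [{0, 1, 2, 3, 4}, {3, 4, 5, 6, 7}, {6, 7, 8, 9}]
print(blocks, is_disjoint(blocks), is_disjoint(halo))
```

Tasks that write different subregions of a disjoint partition touch no common points, which is the property a dependence analysis would need in order to run them in parallel.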

Partitioning
[Figure: partitioning diagram with labels N, P, S, W and subregions s1 s2 s3, g1 g2 g3, p1 p2 p3, w1 w2 w3]

Regent: A Language for Legion
Easy to use and significantly less code
Type checker for Legion semantics
Compiler matches performance of hand-written Legion (including kernels: vectorization, GPU, etc.)
task saxpy(is: ispace(int1d), x: region(is, float),
           y: region(is, float), a: float)
where reads(x, y), writes(y) do
  for i in is do
    y[i] += a*x[i]
  end
end

Legion Summary
The programmer
  Describes the structure of the program’s data
    Regions
  The tasks that operate on that data
The Legion runtime
  Guarantees tasks appear to execute in sequential order
  Ensures tasks have the correct versions of their regions
The Regent language
  Type system checks correctness of programs
  Significantly easier to use, less code
  Compiler matches performance of hand-written Legion

Questions?

Legion Architecture
[Diagram: Legion software stack. Components: applications; mappers; Regent (compiler); Bishop (compiler); DSL compilers; Legion (runtime); Realm; Isometry (DMA). Underlying layers: POSIX, CUDA, GASNet, libnuma, pthreads. Other labels: func/perf verif tools; data model/partitioning; type system]

Editor's Notes

hiding latency
Callback to first graph
This is a graph of operations and dependencies in an application. Most of today’s systems (e.g., MPI) do not express it explicitly, but it is there, implicitly, in the programmer’s head. For example, two operations without a common dependence can be scheduled in parallel onto different processors.
“Collections, which you can think of as being like arrays of structs”
Borrow task names from S3D
Unique to Legion: multiple partitions, hierarchy
Solves the three problems: much easier syntax to learn; enforces all semantic requirements through compile-time checking; supports generating efficient kernel code (e.g., vectorization), since it is a compiler. Now the right way to learn Legion, even if you plan to use the C++ API.
