SlideShare ist ein Scribd-Unternehmen logo
1 von 62
Downloaden Sie, um offline zu lesen
Brad Richardson
Framework for Extensible,
Asynchronous Task
Scheduling (FEATS) in Fortran
Co-contributors: Damian Rouson,
Harris Snyder and Robert Singleterry
Agenda
• Motivations
• Example/Demo Application
• Implementation Details
• Performance Studies
• Conclusions
Motivations
https://portal.nersc.gov/project/m888/nersc10/workload/N10_Workload_Analysis.latest.pdf
Motivations
● Target an under-served user base: Fortran
○ Enable scientists and engineers to develop efficient applications for HPC beyond the
“embarrassingly parallel” problems
● Explore the native parallel features of Fortran
○ Don’t force “reformulating” the problem to be able to interoperate with C/C++ or other
external libraries
Example Applications
if (this_image() == 1) then
write(output_unit, "(A)") &
"Enter values for a, b and c in `a*x**2 + b*x + c`:"
flush(output_unit)
read(input_unit, *) a, b, c
end if
call co_broadcast(a, 1)
call co_broadcast(b, 1)
call co_broadcast(c, 1)
solver = dag_t( &
[ vertex_t([integer::], a_t(a)) &
, vertex_t([integer::], b_t(b)) &
, vertex_t([integer::], c_t(c)) &
, vertex_t([2], b_squared_t()) &
, vertex_t([1, 3], four_ac_t()) &
, vertex_t([4, 5], square_root_t()) &
, vertex_t([2, 6], minus_b_pm_square_root_t()) &
, vertex_t([1], two_a_t()) &
, vertex_t([8, 7], division_t()) &
, vertex_t([9], printer_t()) &
])
Quadratic Solver
-b±√(b2
-4*a*c)
2*a
Enter values for a, b and c in `a*x**2 + b*x + c`:
1 1 -12
c = -12.0000000
b = 1.0000000
a = 1.0000000
2*a = 2.0000000
b**2 = 1.0000000
4*a*c = -48.0000000
sqrt(b**2 - 4*a*c) = 7.0000000 -7.0000000
-b +- sqrt(b**2 - 4*a*c) = 6.0000000 -8.0000000
(-b +- sqrt(b**2 - 4*a*c)) / (2*a) = 3.0000000 -4.0000000
The roots are 3.0000000 -4.0000000
Enter values for a, b and c in `a*x**2 + b*x + c`:
1 -1 -30
b = -1.0000000
c = -30.0000000
a = 1.0000000
b**2 = 1.0000000
4*a*c = -1.2000000E+02
2*a = 2.0000000
sqrt(b**2 - 4*a*c) = 11.0000000 -11.0000000
-b +- sqrt(b**2 - 4*a*c) = 12.0000000 -10.0000000
(-b +- sqrt(b**2 - 4*a*c)) / (2*a) = 6.0000000 -5.0000000
The roots are 6.0000000 -5.0000000
Quadratic Solver
LU Decomposition
In numerical analysis and linear algebra, lower–upper (LU) decomposition or factorization factors a matrix
as the product of a lower triangular matrix and an upper triangular matrix (see matrix decomposition). The
product sometimes includes a permutation matrix as well. LU decomposition can be viewed as the matrix form
of Gaussian elimination.
- Wikipedia
I.e. A = LU =
call read_matrix_in(matrix, matrix_size)
num_tasks = 4 + (matrix_size-1)*2 + sum([((matrix_size-step)*3, step=1,matrix_size-1)])
allocate(vertices(num_tasks))
vertices(1) = vertex_t([integer::], initial_t(matrix))
vertices(2) = vertex_t([1], print_matrix_t(0))
do step = 1, matrix_size-1
do row = step+1, matrix_size
latest_matrix = 1 + sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) ! reconstructed matrix from last step
task_base = sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) + 3*(row-(step+1))
vertices(3+task_base) = vertex_t([latest_matrix], calc_factor_t(row=row, step=step))
vertices(4+task_base) = vertex_t([latest_matrix, 3+task_base], row_multiply_t(step=step))
vertices(5+task_base) = vertex_t([latest_matrix, 4+task_base], row_subtract_t(row=row))
end do
reconstruction_step = 3 + sum([(3*(matrix_size-i), i = 1, step)]) + 2*(step-1)
vertices(reconstruction_step) = vertex_t( &
[ 1 + sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) &
, [(5 + sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) + 3*(row-(step+1)) &
, row=step+1, matrix_size)] &
], &
reconstruct_t(step=step)) ! depends on previous reconstructed matrix and just subtracted rows
! print the just reconstructed matrix
vertices(reconstruction_step+1) = vertex_t([reconstruction_step], print_matrix_t(step))
end do
vertices(num_tasks-1) = vertex_t( &
[([(3 + sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) + 3*(row-(step+1)) &
, row=step+1, matrix_size)] &
, step=1, matrix_size-1)] &
, back_substitute_t(n_rows=matrix_size)) ! depends on all "factors"
vertices(num_tasks) = vertex_t([num_tasks-1], print_matrix_t(matrix_size))
dag = dag_t(vertices)
LU Decomposition
LU Decomposition - 1x1
LU Decomposition - 2x2
LU Decomposition - 3x3
LU Decomposition - 4x4
LU Decomposition - 5x5
LU Decomposition - 3x3
LU Decomposition - 3x3
Step: 0
11.0516795999999999 4.1176995799999997E+02 5.7346350099999995E+02
9.7038173699999994 9.7847460899999999E+02 7.5112048300000004E+02
5.6446673599999997E+02 8.9235162400000002E+02 8.0637854000000004E+02
Step: 1
11.0516795999999999 4.1176995799999997E+02 5.7346350099999995E+02
0.0000000000000000 6.1692409220031186E+02 2.4759655872112285E+02
0.0000000000000000 -2.0138880965761025E+04 -2.8483383952264314E+04
Step: 3
1.0000000000000000 0.0000000000000000 0.0000000000000000
0.8780400555586139 1.0000000000000000 0.0000000000000000
51.0751991036728938 -32.6440176682580301 1.0000000000000000
Step: 2
11.0516795999999999 4.1176995799999997E+02 5.7346350099999995E+02
0.0000000000000000 6.1692409220031186E+02 2.4759655872112285E+02
0.0000000000000000 0.0000000000000000 -2.0400837514772094E+04
Implementation
Derived Types
Run Implementation (#1)
• One scheduler image and multiple executor images
• Mailbox, and task assignment coarrays
• Events to signal ready for work and task completed
• Directed acyclic graph (DAG) to define task dependencies
Coarrays Needed
• type(payload_t), allocatable :: mailbox(:)[:]
• type(event_type), allocatable :: ready_for_next_task(:)[:]
• type(event_type) :: task_assigned[*]
• integer :: task_identifier[*]
• integer, allocatable :: task_assignment_history(:)[:]
Startup Procedure
● User:
○ Define DAG
■ Define tasks and their dependencies
○ call run(dag)
NOTE: All images must have the same dag to start
● Framework:
○ Allocate coarrays needed for communication
■ If we’re executing on more than one image
Data Layout
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Scheduler Steps
• Find an executor that has posted it is ready
• While we do this, we keep track of what tasks have been completed
• Find an uncompleted task with all dependencies completed
• “Wait” for the ready executor (balances posts/waits)
• Assign the task to the executor
• Post that the executor has been assigned a task
• Repeat
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Find Executor
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Find Next Task
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Wait for ready image
event wait(...)
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Assign task
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Post task assigned
event post(...)
Executor Steps
• Post ready for a task
• Wait until it has been assigned a task
• Collect payloads from executors that ran dependent tasks
• We access the history kept by the scheduler to determine this
• Execute task and store result in mailbox
• Repeat
Executor Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Post ready for task
event post(...)
Executor Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Wait for task
assignment
event wait(...)
Executor Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Collect inputs
Executor Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Execute task and
store result
Performance
Performance - shared memory (nagfor)
Compilation flags= -O4 -Onoteams -Orounding -coarray
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - shared memory (nagfor)
Compilation flags= -O4 -Onoteams -Orounding -coarray
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - single node (OpenCoarrays)
Compilation flags= -O3
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
• Every image attempts to claim a task as soon as possible
• Mailbox, task claimed and num tasks completed coarrays
• Events to signal task completed
• Directed acyclic graph (DAG) to define task dependencies
Run Implementation (#2)
Coarrays Needed
• integer(atomic_int_kind), allocatable :: taken_on(:)[:]
• integer(atomic_int_kind), allocatable :: num_tasks_completed[:]
• type(event_type), allocatable :: task_completed(:)[:]
• type(payload_t), allocatable :: mailbox(:)[:]
Startup Procedure
● User:
○ Define DAG
■ Define tasks and their dependencies
○ call run(dag)
NOTE: All images must have the same dag to start
● Framework:
○ Allocate coarrays needed for communication
Data Layout
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Execution Steps
• Find task that hasn’t been claimed, and all of its dependencies have been completed
• Attempt to claim that task
• another image may have claimed it by now
• Collect payloads from images that ran dependent tasks
• “Wait” on event that it was completed
• Execute task and store result in mailbox
• Post to all images that the task has been completed and increment task completed counter
• Repeat
Find Ready Task
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Claim Task
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Collect Payloads
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Execute Task
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Notify Task Completion
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Performance
Performance - shared memory (nagfor)
Compilation flags= -O4 -Onoteams -Orounding -coarray
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - shared memory (nagfor)
Compilation flags= -O4 -Onoteams -Orounding -coarray
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - single node (OpenCoarrays)
Compilation flags= -O3
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
Performance - OpenCoarrays on Perlmutter
Compilation flags= -O3
On: Perlmutter
Conclusions
Fortran’s Advantages
• Coarrays and Events a great match
• Coarray to communicate task inputs and outputs
• Events to signal task start and completion
• Teams should allow for scalable implementation
• Partition DAG and have multiple schedulers work on independent regions with separate
teams of executors
• Polymorphism
• Different kinds of task can exist that capture different kinds of “input” data at startup
• Fortan’s History
• Likely lots of applications that could be easily adapted
Fortran’s Disadvantages
• Can’t use polymorphic objects in coarrays
• A strategic change to the standard could enable this
• Coindexed objects with polymorphic components not allowed in variable definition
context, but could be allowed in expressions
• See https://github.com/j3-fortran/fortran_proposals/issues/251
• No introspection
• Automatic task detection, fusion or splitting not possible
• Fortran’s History
• Many existing applications have shared global state
• Presents data races in task based execution
Conclusions
• It “works”
• Scaling/performance will depend not only on algorithm, but also compiler and platform
• There are limitations
• Future Work
• Propose changes to Fortran standard to improve utility/flexibility
• Investigate performance issues
• Explore alternative scheduling algorithms
• Find “beta” testers, i.e. target applications
• Questions?
• https://github.com/sourceryinstitute/feats
Thank You

Weitere ähnliche Inhalte

Ähnlich wie Framework for Extensible, Asynchronous Task Scheduling (FEATS) in Fortran

06 Recursion in C.pptx
06 Recursion in C.pptx06 Recursion in C.pptx
06 Recursion in C.pptxMouDhara1
 
Functional Programming inside OOP? It’s possible with Python
Functional Programming inside OOP? It’s possible with PythonFunctional Programming inside OOP? It’s possible with Python
Functional Programming inside OOP? It’s possible with PythonCarlos V.
 
C++11: Feel the New Language
C++11: Feel the New LanguageC++11: Feel the New Language
C++11: Feel the New Languagemspline
 
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015Codemotion
 
The Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
The Nuts and Bolts of Kafka Streams---An Architectural Deep DiveThe Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
The Nuts and Bolts of Kafka Streams---An Architectural Deep DiveHostedbyConfluent
 
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docxINFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docxcarliotwaycave
 
C++ process new
C++ process newC++ process new
C++ process new敬倫 林
 
Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)Brendan Eich
 
“Tasks” in NetLogo 5.0beta1
“Tasks” in NetLogo 5.0beta1“Tasks” in NetLogo 5.0beta1
“Tasks” in NetLogo 5.0beta1SethTisue
 
Time Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal RecoveryTime Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal RecoveryDaniel Cuneo
 
Functions in python
Functions in pythonFunctions in python
Functions in pythonIlian Iliev
 
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiA Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiDatabricks
 
Python High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.pptPython High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.pptAnishaJ7
 
Day 4a iteration and functions.pptx
Day 4a   iteration and functions.pptxDay 4a   iteration and functions.pptx
Day 4a iteration and functions.pptxAdrien Melquiond
 
dynamic programming complete by Mumtaz Ali (03154103173)
dynamic programming complete by Mumtaz Ali (03154103173)dynamic programming complete by Mumtaz Ali (03154103173)
dynamic programming complete by Mumtaz Ali (03154103173)Mumtaz Ali
 

Ähnlich wie Framework for Extensible, Asynchronous Task Scheduling (FEATS) in Fortran (20)

06 Recursion in C.pptx
06 Recursion in C.pptx06 Recursion in C.pptx
06 Recursion in C.pptx
 
Dafunctor
DafunctorDafunctor
Dafunctor
 
Functional Programming inside OOP? It’s possible with Python
Functional Programming inside OOP? It’s possible with PythonFunctional Programming inside OOP? It’s possible with Python
Functional Programming inside OOP? It’s possible with Python
 
C++11: Feel the New Language
C++11: Feel the New LanguageC++11: Feel the New Language
C++11: Feel the New Language
 
Metaprogramming
MetaprogrammingMetaprogramming
Metaprogramming
 
Learn Matlab
Learn MatlabLearn Matlab
Learn Matlab
 
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
 
Matlab-1.pptx
Matlab-1.pptxMatlab-1.pptx
Matlab-1.pptx
 
The Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
The Nuts and Bolts of Kafka Streams---An Architectural Deep DiveThe Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
The Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
 
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docxINFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
 
6-Python-Recursion PPT.pptx
6-Python-Recursion PPT.pptx6-Python-Recursion PPT.pptx
6-Python-Recursion PPT.pptx
 
C++ process new
C++ process newC++ process new
C++ process new
 
Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)
 
“Tasks” in NetLogo 5.0beta1
“Tasks” in NetLogo 5.0beta1“Tasks” in NetLogo 5.0beta1
“Tasks” in NetLogo 5.0beta1
 
Time Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal RecoveryTime Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal Recovery
 
Functions in python
Functions in pythonFunctions in python
Functions in python
 
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiA Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
 
Python High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.pptPython High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.ppt
 
Day 4a iteration and functions.pptx
Day 4a   iteration and functions.pptxDay 4a   iteration and functions.pptx
Day 4a iteration and functions.pptx
 
dynamic programming complete by Mumtaz Ali (03154103173)
dynamic programming complete by Mumtaz Ali (03154103173)dynamic programming complete by Mumtaz Ali (03154103173)
dynamic programming complete by Mumtaz Ali (03154103173)
 

Mehr von Patrick Diehl

Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-TigerEvaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-TigerPatrick Diehl
 
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and ToolsD-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and ToolsPatrick Diehl
 
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...Patrick Diehl
 
Subtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondSubtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondPatrick Diehl
 
JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...Patrick Diehl
 
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer FugakuSimulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer FugakuPatrick Diehl
 
A tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local modelsA tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local modelsPatrick Diehl
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerPatrick Diehl
 
Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...Patrick Diehl
 
Quantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task BenchQuantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task BenchPatrick Diehl
 
Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...Patrick Diehl
 
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...Patrick Diehl
 
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...Patrick Diehl
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerPatrick Diehl
 
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...Patrick Diehl
 
A review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics modelsA review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics modelsPatrick Diehl
 
Deploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi ClustersDeploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi ClustersPatrick Diehl
 
On the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic modelsOn the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic modelsPatrick Diehl
 
EMI 2021 - A comparative review of peridynamics and phase-field models for en...
EMI 2021 - A comparative review of peridynamics and phase-field models for en...EMI 2021 - A comparative review of peridynamics and phase-field models for en...
EMI 2021 - A comparative review of peridynamics and phase-field models for en...Patrick Diehl
 
Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...
Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...
Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...Patrick Diehl
 

Mehr von Patrick Diehl (20)

Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-TigerEvaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
 
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and ToolsD-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
 
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
 
Subtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondSubtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff Hammond
 
JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...
 
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer FugakuSimulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
 
A tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local modelsA tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local models
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-Tiger
 
Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...
 
Quantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task BenchQuantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task Bench
 
Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...
 
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
 
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-Tiger
 
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
 
A review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics modelsA review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics models
 
Deploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi ClustersDeploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi Clusters
 
On the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic modelsOn the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic models
 
EMI 2021 - A comparative review of peridynamics and phase-field models for en...
EMI 2021 - A comparative review of peridynamics and phase-field models for en...EMI 2021 - A comparative review of peridynamics and phase-field models for en...
EMI 2021 - A comparative review of peridynamics and phase-field models for en...
 
Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...
Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...
Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...
 

Kürzlich hochgeladen

GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024Jene van der Heide
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxEran Akiva Sinbar
 
Servosystem Theory / Cybernetic Theory by Petrovic
Servosystem Theory / Cybernetic Theory by PetrovicServosystem Theory / Cybernetic Theory by Petrovic
Servosystem Theory / Cybernetic Theory by PetrovicAditi Jain
 
Citronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayCitronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayupadhyaymani499
 
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...Christina Parmionova
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》rnrncn29
 
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfKDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfGABYFIORELAMALPARTID1
 
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Sérgio Sacani
 
DIENES _ Organic Chemistry _ B. Pharm..pptx
DIENES _ Organic Chemistry _  B. Pharm..pptxDIENES _ Organic Chemistry _  B. Pharm..pptx
DIENES _ Organic Chemistry _ B. Pharm..pptxAZCPh
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringPrajakta Shinde
 
Quarter 4_Grade 8_Digestive System Structure and Functions
Quarter 4_Grade 8_Digestive System Structure and FunctionsQuarter 4_Grade 8_Digestive System Structure and Functions
Quarter 4_Grade 8_Digestive System Structure and FunctionsCharlene Llagas
 
Introduction of Human Body & Structure of cell.pptx
Introduction of Human Body & Structure of cell.pptxIntroduction of Human Body & Structure of cell.pptx
Introduction of Human Body & Structure of cell.pptxMedical College
 
Immunoblott technique for protein detection.ppt
Immunoblott technique for protein detection.pptImmunoblott technique for protein detection.ppt
Immunoblott technique for protein detection.pptAmirRaziq1
 
PROJECTILE MOTION-Horizontal and Vertical
PROJECTILE MOTION-Horizontal and VerticalPROJECTILE MOTION-Horizontal and Vertical
PROJECTILE MOTION-Horizontal and VerticalMAESTRELLAMesa2
 
Pests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPRPests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPRPirithiRaju
 
Loudspeaker- direct radiating type and horn type.pptx
Loudspeaker- direct radiating type and horn type.pptxLoudspeaker- direct radiating type and horn type.pptx
Loudspeaker- direct radiating type and horn type.pptxpriyankatabhane
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPirithiRaju
 
LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2
LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2
LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2AuEnriquezLontok
 

Kürzlich hochgeladen (20)

GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptx
 
Servosystem Theory / Cybernetic Theory by Petrovic
Servosystem Theory / Cybernetic Theory by PetrovicServosystem Theory / Cybernetic Theory by Petrovic
Servosystem Theory / Cybernetic Theory by Petrovic
 
Citronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayCitronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyay
 
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
 
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfKDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
 
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
 
DIENES _ Organic Chemistry _ B. Pharm..pptx
DIENES _ Organic Chemistry _  B. Pharm..pptxDIENES _ Organic Chemistry _  B. Pharm..pptx
DIENES _ Organic Chemistry _ B. Pharm..pptx
 
Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical Engineering
 
Quarter 4_Grade 8_Digestive System Structure and Functions
Quarter 4_Grade 8_Digestive System Structure and FunctionsQuarter 4_Grade 8_Digestive System Structure and Functions
Quarter 4_Grade 8_Digestive System Structure and Functions
 
Introduction of Human Body & Structure of cell.pptx
Introduction of Human Body & Structure of cell.pptxIntroduction of Human Body & Structure of cell.pptx
Introduction of Human Body & Structure of cell.pptx
 
Immunoblott technique for protein detection.ppt
Immunoblott technique for protein detection.pptImmunoblott technique for protein detection.ppt
Immunoblott technique for protein detection.ppt
 
PROJECTILE MOTION-Horizontal and Vertical
PROJECTILE MOTION-Horizontal and VerticalPROJECTILE MOTION-Horizontal and Vertical
PROJECTILE MOTION-Horizontal and Vertical
 
PLASMODIUM. PPTX
PLASMODIUM. PPTXPLASMODIUM. PPTX
PLASMODIUM. PPTX
 
Pests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPRPests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPR
 
Loudspeaker- direct radiating type and horn type.pptx
Loudspeaker- direct radiating type and horn type.pptxLoudspeaker- direct radiating type and horn type.pptx
Loudspeaker- direct radiating type and horn type.pptx
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
 
LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2
LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2
LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2
 

Framework for Extensible, Asynchronous Task Scheduling (FEATS) in Fortran