SlideShare ist ein Scribd-Unternehmen logo
1 von 62
Downloaden Sie, um offline zu lesen
Brad Richardson
Framework for Extensible,
Asynchronous Task
Scheduling (FEATS) in Fortran
Co-contributors: Damian Rouson,
Harris Snyder and Robert Singleterry
Agenda
• Motivations
• Example/Demo Application
• Implementation Details
• Performance Studies
• Conclusions
Motivations
https://portal.nersc.gov/project/m888/nersc10/workload/N10_Workload_Analysis.latest.pdf
Motivations
● Target an under-served user base: Fortran
○ Enable scientists and engineers to develop efficient applications for HPC beyond the
“embarrassingly parallel” problems
● Explore the native parallel features of Fortran
○ Don’t force “reformulating” the problem to be able to interoperate with C/C++ or other
external libraries
Example Applications
if (this_image() == 1) then
write(output_unit, "(A)") &
"Enter values for a, b and c in `a*x**2 + b*x + c`:"
flush(output_unit)
read(input_unit, *) a, b, c
end if
call co_broadcast(a, 1)
call co_broadcast(b, 1)
call co_broadcast(c, 1)
solver = dag_t( &
[ vertex_t([integer::], a_t(a)) &
, vertex_t([integer::], b_t(b)) &
, vertex_t([integer::], c_t(c)) &
, vertex_t([2], b_squared_t()) &
, vertex_t([1, 3], four_ac_t()) &
, vertex_t([4, 5], square_root_t()) &
, vertex_t([2, 6], minus_b_pm_square_root_t()) &
, vertex_t([1], two_a_t()) &
, vertex_t([8, 7], division_t()) &
, vertex_t([9], printer_t()) &
])
Quadratic Solver
-b±√(b2
-4*a*c)
2*a
Enter values for a, b and c in `a*x**2 + b*x + c`:
1 1 -12
c = -12.0000000
b = 1.0000000
a = 1.0000000
2*a = 2.0000000
b**2 = 1.0000000
4*a*c = -48.0000000
sqrt(b**2 - 4*a*c) = 7.0000000 -7.0000000
-b +- sqrt(b**2 - 4*a*c) = 6.0000000 -8.0000000
(-b +- sqrt(b**2 - 4*a*c)) / (2*a) = 3.0000000 -4.0000000
The roots are 3.0000000 -4.0000000
Enter values for a, b and c in `a*x**2 + b*x + c`:
1 -1 -30
b = -1.0000000
c = -30.0000000
a = 1.0000000
b**2 = 1.0000000
4*a*c = -1.2000000E+02
2*a = 2.0000000
sqrt(b**2 - 4*a*c) = 11.0000000 -11.0000000
-b +- sqrt(b**2 - 4*a*c) = 12.0000000 -10.0000000
(-b +- sqrt(b**2 - 4*a*c)) / (2*a) = 6.0000000 -5.0000000
The roots are 6.0000000 -5.0000000
Quadratic Solver
LU Decomposition
In numerical analysis and linear algebra, lower–upper (LU) decomposition or factorization factors a matrix
as the product of a lower triangular matrix and an upper triangular matrix (see matrix decomposition). The
product sometimes includes a permutation matrix as well. LU decomposition can be viewed as the matrix form
of Gaussian elimination.
- Wikipedia
I.e. A = LU =
call read_matrix_in(matrix, matrix_size)
num_tasks = 4 + (matrix_size-1)*2 + sum([((matrix_size-step)*3, step=1,matrix_size-1)])
allocate(vertices(num_tasks))
vertices(1) = vertex_t([integer::], initial_t(matrix))
vertices(2) = vertex_t([1], print_matrix_t(0))
do step = 1, matrix_size-1
do row = step+1, matrix_size
latest_matrix = 1 + sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) ! reconstructed matrix from last step
task_base = sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) + 3*(row-(step+1))
vertices(3+task_base) = vertex_t([latest_matrix], calc_factor_t(row=row, step=step))
vertices(4+task_base) = vertex_t([latest_matrix, 3+task_base], row_multiply_t(step=step))
vertices(5+task_base) = vertex_t([latest_matrix, 4+task_base], row_subtract_t(row=row))
end do
reconstruction_step = 3 + sum([(3*(matrix_size-i), i = 1, step)]) + 2*(step-1)
vertices(reconstruction_step) = vertex_t( &
[ 1 + sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) &
, [(5 + sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) + 3*(row-(step+1)) &
, row=step+1, matrix_size)] &
], &
reconstruct_t(step=step)) ! depends on previous reconstructed matrix and just subtracted rows
! print the just reconstructed matrix
vertices(reconstruction_step+1) = vertex_t([reconstruction_step], print_matrix_t(step))
end do
vertices(num_tasks-1) = vertex_t( &
[([(3 + sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) + 3*(row-(step+1)) &
, row=step+1, matrix_size)] &
, step=1, matrix_size-1)] &
, back_substitute_t(n_rows=matrix_size)) ! depends on all "factors"
vertices(num_tasks) = vertex_t([num_tasks-1], print_matrix_t(matrix_size))
dag = dag_t(vertices)
LU Decomposition
LU Decomposition - 1x1
LU Decomposition - 2x2
LU Decomposition - 3x3
LU Decomposition - 4x4
LU Decomposition - 5x5
LU Decomposition - 3x3
LU Decomposition - 3x3
Step: 0
11.0516795999999999 4.1176995799999997E+02 5.7346350099999995E+02
9.7038173699999994 9.7847460899999999E+02 7.5112048300000004E+02
5.6446673599999997E+02 8.9235162400000002E+02 8.0637854000000004E+02
Step: 1
11.0516795999999999 4.1176995799999997E+02 5.7346350099999995E+02
0.0000000000000000 6.1692409220031186E+02 2.4759655872112285E+02
0.0000000000000000 -2.0138880965761025E+04 -2.8483383952264314E+04
Step: 3
1.0000000000000000 0.0000000000000000 0.0000000000000000
0.8780400555586139 1.0000000000000000 0.0000000000000000
51.0751991036728938 -32.6440176682580301 1.0000000000000000
Step: 2
11.0516795999999999 4.1176995799999997E+02 5.7346350099999995E+02
0.0000000000000000 6.1692409220031186E+02 2.4759655872112285E+02
0.0000000000000000 0.0000000000000000 -2.0400837514772094E+04
Implementation
Derived Types
Run Implementation (#1)
• One scheduler image and multiple executor images
• Mailbox, and task assignment coarrays
• Events to signal ready for work and task completed
• Directed acyclic graph (DAG) to define task dependencies
Coarrays Needed
• type(payload_t), allocatable :: mailbox(:)[:]
• type(event_type), allocatable :: ready_for_next_task(:)[:]
• type(event_type) :: task_assigned[*]
• integer :: task_identifier[*]
• integer, allocatable :: task_assignment_history(:)[:]
Startup Procedure
● User:
○ Define DAG
■ Define tasks and their dependencies
○ call run(dag)
NOTE: All images must have the same dag to start
● Framework:
○ Allocate coarrays needed for communication
■ If we’re executing on more than one image
Data Layout
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Scheduler Steps
• Find an executor that has posted it is ready
• While we do this, we keep track of what tasks have been completed
• Find an uncompleted task with all dependencies completed
• “Wait” for the ready executor (balances posts/waits)
• Assign the task to the executor
• Post that the executor has been assigned a task
• Repeat
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Find Executor
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Find Next Task
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Wait for ready image
event wait(...)
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Assign task
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Post task assigned
event post(...)
Executor Steps
• Post ready for a task
• Wait until it has been assigned a task
• Collect payloads from executors that ran dependent tasks
• We access the history kept by the scheduler to determine this
• Execute task and store result in mailbox
• Repeat
Executor Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Post ready for task
event post(...)
Executor Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Wait for task
assignment
event wait(...)
Executor Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Collect inputs
Executor Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Execute task and
store result
Performance
Performance - shared memory (nagfor)
Compilation flags= -O4 -Onoteams -Orounding -coarray
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - shared memory (nagfor)
Compilation flags= -O4 -Onoteams -Orounding -coarray
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - single node (OpenCoarrays)
Compilation flags= -O3
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
• Every image attempts to claim a task as soon as possible
• Mailbox, task claimed and num tasks completed coarrays
• Events to signal task completed
• Directed acyclic graph (DAG) to define task dependencies
Run Implementation (#2)
Coarrays Needed
• integer(atomic_int_kind), allocatable :: taken_on(:)[:]
• integer(atomic_int_kind), allocatable :: num_tasks_completed[:]
• type(event_type), allocatable :: task_completed(:)[:]
• type(payload_t), allocatable :: mailbox(:)[:]
Startup Procedure
● User:
○ Define DAG
■ Define tasks and their dependencies
○ call run(dag)
NOTE: All images must have the same dag to start
● Framework:
○ Allocate coarrays needed for communication
Data Layout
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Execution Steps
• Find task that hasn’t been claimed, and all of its dependencies have been completed
• Attempt to claim that task
• another image may have claimed it by now
• Collect payloads from images that ran dependent tasks
• “Wait” on event that it was completed
• Execute task and store result in mailbox
• Post to all images that the task has been completed and increment task completed counter
• Repeat
Find Ready Task
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Claim Task
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Collect Payloads
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Execute Task
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Notify Task Completion
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Performance
Performance - shared memory (nagfor)
Compilation flags= -O4 -Onoteams -Orounding -coarray
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - shared memory (nagfor)
Compilation flags= -O4 -Onoteams -Orounding -coarray
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - single node (OpenCoarrays)
Compilation flags= -O3
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
Performance - OpenCoarrays on Perlmutter
Compilation flags= -O3
On: Perlmutter
Conclusions
Fortran’s Advantages
• Coarrays and Events a great match
• Coarray to communicate task inputs and outputs
• Events to signal task start and completion
• Teams should allow for scalable implementation
• Partition DAG and have multiple schedulers work on independent regions with separate
teams of executors
• Polymorphism
• Different kinds of task can exist that capture different kinds of “input” data at startup
• Fortan’s History
• Likely lots of applications that could be easily adapted
Fortran’s Disadvantages
• Can’t use polymorphic objects in coarrays
• A strategic change to the standard could enable this
• Coindexed objects with polymorphic components not allowed in variable definition
context, but could be allowed in expressions
• See https://github.com/j3-fortran/fortran_proposals/issues/251
• No introspection
• Automatic task detection, fusion or splitting not possible
• Fortran’s History
• Many existing applications have shared global state
• Presents data races in task based execution
Conclusions
• It “works”
• Scaling/performance will depend not only on algorithm, but also compiler and platform
• There are limitations
• Future Work
• Propose changes to Fortran standard to improve utility/flexibility
• Investigate performance issues
• Explore alternative scheduling algorithms
• Find “beta” testers, i.e. target applications
• Questions?
• https://github.com/sourceryinstitute/feats
Thank You

Weitere ähnliche Inhalte

Ähnlich wie Framework for Extensible, Asynchronous Task Scheduling (FEATS) in Fortran

06 Recursion in C.pptx
06 Recursion in C.pptx06 Recursion in C.pptx
06 Recursion in C.pptxMouDhara1
 
Functional Programming inside OOP? It’s possible with Python
Functional Programming inside OOP? It’s possible with PythonFunctional Programming inside OOP? It’s possible with Python
Functional Programming inside OOP? It’s possible with PythonCarlos V.
 
C++11: Feel the New Language
C++11: Feel the New LanguageC++11: Feel the New Language
C++11: Feel the New Languagemspline
 
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015Codemotion
 
The Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
The Nuts and Bolts of Kafka Streams---An Architectural Deep DiveThe Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
The Nuts and Bolts of Kafka Streams---An Architectural Deep DiveHostedbyConfluent
 
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docxINFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docxcarliotwaycave
 
C++ process new
C++ process newC++ process new
C++ process new敬倫 林
 
Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)Brendan Eich
 
“Tasks” in NetLogo 5.0beta1
“Tasks” in NetLogo 5.0beta1“Tasks” in NetLogo 5.0beta1
“Tasks” in NetLogo 5.0beta1SethTisue
 
Time Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal RecoveryTime Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal RecoveryDaniel Cuneo
 
Functions in python
Functions in pythonFunctions in python
Functions in pythonIlian Iliev
 
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiA Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiDatabricks
 
Python High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.pptPython High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.pptAnishaJ7
 
Day 4a iteration and functions.pptx
Day 4a   iteration and functions.pptxDay 4a   iteration and functions.pptx
Day 4a iteration and functions.pptxAdrien Melquiond
 
dynamic programming complete by Mumtaz Ali (03154103173)
dynamic programming complete by Mumtaz Ali (03154103173)dynamic programming complete by Mumtaz Ali (03154103173)
dynamic programming complete by Mumtaz Ali (03154103173)Mumtaz Ali
 

Ähnlich wie Framework for Extensible, Asynchronous Task Scheduling (FEATS) in Fortran (20)

06 Recursion in C.pptx
06 Recursion in C.pptx06 Recursion in C.pptx
06 Recursion in C.pptx
 
Dafunctor
DafunctorDafunctor
Dafunctor
 
Functional Programming inside OOP? It’s possible with Python
Functional Programming inside OOP? It’s possible with PythonFunctional Programming inside OOP? It’s possible with Python
Functional Programming inside OOP? It’s possible with Python
 
C++11: Feel the New Language
C++11: Feel the New LanguageC++11: Feel the New Language
C++11: Feel the New Language
 
Metaprogramming
MetaprogrammingMetaprogramming
Metaprogramming
 
Learn Matlab
Learn MatlabLearn Matlab
Learn Matlab
 
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
 
Matlab-1.pptx
Matlab-1.pptxMatlab-1.pptx
Matlab-1.pptx
 
The Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
The Nuts and Bolts of Kafka Streams---An Architectural Deep DiveThe Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
The Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
 
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docxINFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
 
6-Python-Recursion PPT.pptx
6-Python-Recursion PPT.pptx6-Python-Recursion PPT.pptx
6-Python-Recursion PPT.pptx
 
C++ process new
C++ process newC++ process new
C++ process new
 
Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)
 
“Tasks” in NetLogo 5.0beta1
“Tasks” in NetLogo 5.0beta1“Tasks” in NetLogo 5.0beta1
“Tasks” in NetLogo 5.0beta1
 
Time Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal RecoveryTime Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal Recovery
 
Functions in python
Functions in pythonFunctions in python
Functions in python
 
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiA Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
 
Python High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.pptPython High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.ppt
 
Day 4a iteration and functions.pptx
Day 4a   iteration and functions.pptxDay 4a   iteration and functions.pptx
Day 4a iteration and functions.pptx
 
dynamic programming complete by Mumtaz Ali (03154103173)
dynamic programming complete by Mumtaz Ali (03154103173)dynamic programming complete by Mumtaz Ali (03154103173)
dynamic programming complete by Mumtaz Ali (03154103173)
 

Mehr von Patrick Diehl

Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-TigerEvaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-TigerPatrick Diehl
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-TigerEvaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-TigerPatrick Diehl
 
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and ToolsD-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and ToolsPatrick Diehl
 
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...Patrick Diehl
 
Subtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondSubtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondPatrick Diehl
 
JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...Patrick Diehl
 
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer FugakuSimulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer FugakuPatrick Diehl
 
A tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local modelsA tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local modelsPatrick Diehl
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerPatrick Diehl
 
Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...Patrick Diehl
 
Quantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task BenchQuantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task BenchPatrick Diehl
 
Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...Patrick Diehl
 
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...Patrick Diehl
 
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...Patrick Diehl
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerPatrick Diehl
 
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...Patrick Diehl
 
A review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics modelsA review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics modelsPatrick Diehl
 
Deploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi ClustersDeploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi ClustersPatrick Diehl
 
On the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic modelsOn the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic modelsPatrick Diehl
 

Mehr von Patrick Diehl (20)

Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-TigerEvaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-TigerEvaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
 
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and ToolsD-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
 
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
 
Subtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondSubtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff Hammond
 
JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...
 
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer FugakuSimulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
 
A tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local modelsA tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local models
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-Tiger
 
Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...
 
Quantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task BenchQuantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task Bench
 
Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...
 
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
 
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-Tiger
 
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
 
A review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics modelsA review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics models
 
Deploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi ClustersDeploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi Clusters
 
On the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic modelsOn the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic models
 

Kürzlich hochgeladen

PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptxRajatChauhan518211
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxpradhanghanshyam7136
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡anilsa9823
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 

Kürzlich hochgeladen (20)

PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptx
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 

Framework for Extensible, Asynchronous Task Scheduling (FEATS) in Fortran