Benchmarking the Parallel 1D Heat Equation Solver in
Chapel, Charm++, C++, HPX, Go, Julia, Python, Rust,
Swift, and Java
Patrick Diehl, Max Morris, Steven R. Brandt, Nikunj Gupta and
Hartmut Kaiser
Center of Computation and Technology
Department of Physics and Astronomy
Louisiana State University
patrickdiehl@lsu.edu
August 28, 2023
P. Diehl and et al. (LSU) August 28, 2023 1 / 26
Motivation
Rank  Language  Rating  Change
1     Python    13.33%  -2.30%
3     C++       11.41%  +0.49%
4     Java      10.33%  -2.24%
12    Go         1.16%  +0.20%
18    Swift      0.90%  -0.35%
19    Rust       0.89%  +0.32%
20    Julia      0.85%  +0.41%
Table: TIOBE Index for August 2023
Chapel is not listed in the index.
Charm++ and HPX are based on C++.
How do these languages compare?
Overview
1 Model problem
2 Features of the approaches
3 Productivity
4 Performance measurements
5 Conclusion and Outlook
Model problem
Model problem I
The one-dimensional heat equation on a 1-D loop (e.g., a limp noodle)
of length L, for all times t > 0, is described by
\[
\frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2}, \quad 0 \le x < L, \; t > 0, \tag{1}
\]
with α as the material's diffusivity. For the discretization in space, we use
the N grid points x = {x_i = i · h ∈ R | i = 0, ..., N − 1}, with the grid
spacing h, and 2nd-order finite differencing. For the discretization in
time, we use the Euler method, i.e.
\[
u(t + \delta t, x_i) = u(t, x_i) + \delta t \cdot \alpha \, \frac{u(t, x_{i-1}) - 2 \cdot u(t, x_i) + u(t, x_{i+1})}{h^2}, \tag{2}
\]
with the initial condition u(0, x_i) = x_i. To model a loop, we use periodic
boundary conditions, i.e. u(t, x) = u(t, L + x).
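As a minimal sketch of the update rule in Eq. 2, the following Python code advances the grid by one explicit Euler step with periodic boundaries. It is not taken from the benchmark codes: the names heat_step and solve and the default parameter values are hypothetical, and the sketch assumes the standard second-order central difference, which divides by the square of the grid spacing h.

```python
def heat_step(u, alpha, dt, h):
    """Advance u by one explicit Euler step on a periodic grid."""
    n = len(u)
    # Assumption: the central difference for the second derivative
    # divides by h**2.
    c = alpha * dt / (h * h)
    # u[i - 1] wraps around to u[-1] for i == 0, and (i + 1) % n wraps
    # the right end, which together implement the periodic boundary.
    return [
        u[i] + c * (u[i - 1] - 2.0 * u[i] + u[(i + 1) % n])
        for i in range(n)
    ]


def solve(n, n_steps, alpha=1.0, dt=0.1, h=1.0):
    """Run n_steps Euler steps starting from u(0, x_i) = x_i."""
    u = [i * h for i in range(n)]
    for _ in range(n_steps):
        u = heat_step(u, alpha, dt, h)
    return u
```

On a periodic grid the sum of all grid values stays constant from step to step (up to rounding), which gives a quick sanity check for any implementation.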
Model problem II
The parallel algorithm was implemented by having multiple threads of
execution each sequentially applying Eq. 2 on a local segment of the grid.
We used queues to communicate ghost zones between the segments. We
note that for this problem, the queues are single-producer, single-consumer
and, therefore, in principle, don’t need synchronization (although
synchronization to suspend/resume threads seemed to help in some cases).
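The ghost-zone exchange described above can be sketched in Python with one thread per segment and one queue per direction. This is an illustration under stated assumptions, not the benchmark's implementation: the names step_segment and solve_parallel are hypothetical, the blocking queue.Queue stands in for the hand-written queues, and the grid is assumed to have at least as many points as workers. In CPython the threads show the communication pattern rather than a speed-up.

```python
import queue
import threading


def step_segment(seg, left_ghost, right_ghost, alpha, dt, h):
    """Apply one Euler step to a segment padded with two ghost cells."""
    ext = [left_ghost] + seg + [right_ghost]
    c = alpha * dt / (h * h)
    return [
        ext[i] + c * (ext[i - 1] - 2.0 * ext[i] + ext[i + 1])
        for i in range(1, len(ext) - 1)
    ]


def solve_parallel(u0, n_workers, n_steps, alpha, dt, h):
    n = len(u0)
    bounds = [(w * n // n_workers, (w + 1) * n // n_workers)
              for w in range(n_workers)]
    segments = [list(u0[lo:hi]) for lo, hi in bounds]
    # from_left[w] is written only by the left neighbour of worker w and
    # read only by worker w (single producer, single consumer);
    # from_right[w] is the mirror image.
    from_left = [queue.Queue() for _ in range(n_workers)]
    from_right = [queue.Queue() for _ in range(n_workers)]

    def worker(w):
        seg = segments[w]
        left = (w - 1) % n_workers    # periodic neighbours
        right = (w + 1) % n_workers
        for _ in range(n_steps):
            # Send boundary cells: my first cell is the left neighbour's
            # right ghost, my last cell the right neighbour's left ghost.
            from_right[left].put(seg[0])
            from_left[right].put(seg[-1])
            # Block until both ghost values for this step have arrived.
            left_ghost = from_left[w].get()
            right_ghost = from_right[w].get()
            seg = step_segment(seg, left_ghost, right_ghost, alpha, dt, h)
        segments[w] = seg

    threads = [threading.Thread(target=worker, args=(w,))
               for w in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [value for seg in segments for value in seg]
```

Because each queue is first-in, first-out and has exactly one writer, a worker that runs one step ahead of its neighbour cannot mix up ghost values from different time steps.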
Features of the approaches
Overview
Approach  Async  Coroutine  ParAlg  Win  Linux  Mac  License
C++ 17    X      X          X       X    X      X    GNU
Java      X      X          X       X    X      X    GNU
Swift     X      X          X       X    X      X    Apache
Chapel    X      X          ∼       X    X      X    Apache
Charm++   X      ∼          X       X    X      X    Own
HPX       X      X          X       X    X      X    Boost
Go        X      X          X       X    X      X    BSD
Python    X      X          X       X    X      X    BSD
Julia     X      X          X       X    X      X    MIT
Rust      X      X          X       X    X      X    MIT
Table: Overview of the programming languages: (1) the parallelism approaches
they provide, (2) the supported operating systems, and (3) the license. The
C++ 17 standard was used as a base. The symbol ∼ indicates partial support.
Chapel
We had to write our own queue; the full/empty bit synchronization
mechanism was helpful for this.
The coforall loop, which assigns a different thread to each iteration,
provided a convenient mechanism for launching the outer loop.
Chapel also lacked a built-in way to append to a file. However,
opening a file, seeking to the end, and writing is possible.
We also add that the support we received from questions asked in the
Chapel Gitter was exceptional.
We found Chapel among the higher performing codes, comparable to Rust
or C++.
Go
We use go func to launch worker threads (goroutines) and buffered
channels created with make() to facilitate the exchange of ghost zones.
For synchronization of the goroutines, we use sync.WaitGroup: we register
the goroutines by calling waitGroup.Add() and wait for them to finish by
calling waitGroup.Wait().
At the time of this writing, only biogo, an HPC bioinformatics toolkit
[1], is available.
Reference
1. Köster, J.: Rust-bio: a fast and safe bioinformatics library. Bioinformatics 32(3), 444–446 (2016)
Julia
Both Python and Fortran clearly inspire Julia. It is a good choice for
Fortran programmers who want to get into scripting, as it will offer
some familiarity in using one as the default start for array indexes
(instead of zero) and its use of end to mark the end of a block.
In our Julia code, we implemented our own queue. Since Julia does
not support classes directly (though it has structs), we found it
convenient to use arrays. For parallelism, we used Julia's
Threads.@threads for-loop macro.
Julia’s community contacted us and provided some optimized code.
However, you need to be confident in Julia and know the internals for
these optimizations.
Rust
We use std::thread::scope to launch worker threads, and
non-blocking channels from std::sync::mpsc to facilitate the
exchange of ghost zones.
We avoided using unsafe, working only in the safe subset of Rust.
Only two scientific codes (molecular dynamics and bioinformatics) are
using Rust.
Because of its guarantees concerning data race conditions and memory
access, as well as its high performance, Rust is a potentially good choice
for new scientific programming projects.
However, Rust has vastly different syntax and semantics than more
traditional languages like C++, Java, and Python, which may make
for a steep learning curve.
Swift
Swift claims to be safe by design and to produce lightning-fast software.
Unfortunately, we had to disable the safety features to get
performant code.
We used UnsafeMutableBufferPointer<Double> to avoid unnecessary calls of
await when accessing the elements of arrays. These buffers allow
explicit vectorization on newer x86 and Apple Silicon; see, for
example, addingProduct. However, we could not measure a
significant improvement using these functions.
For concurrency, we use await withTaskGroup(..., body: { group in ... })
to launch chunks of work on each thread and
for await _ in group {} to wait for them to finish.
We found that Swift is designed for application development for iOS or
macOS, but not for numerical applications.
Productivity
Lines of code
Figure: Lines of code (LOC), on an axis from 0 to 200, for Python, Swift,
HPX, Julia, Go, Rust, Chapel, Charm++, C++ 17, and Java.
The numbers were determined with the Linux tool cloc.
Productivity metric
Average of the computation time
\[
T_{\text{average}}(\text{approach}) := \frac{T_{2}(\text{approach}) + T_{20}(\text{approach}) + T_{40}(\text{approach})}{3}
\]
Constructive Cost Model (COCOMO)
COCOMO does not reflect parallel features
However, the HPC community has never proposed its own cost model
We map both metrics to the interval [−1, 1] using the labels
Easy and Difficult for the costs
Slow and Fast for the computation time
References
1. Barry, B., et al.: Software engineering economics. New York 197 (1981)
2. Stutzke, R.D., Crosstalk, M.: Software estimating technology: A survey. Los. Alamitos, CA: IEEE Computer Society
Press (1997)
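The productivity metric can be sketched in Python as follows. The slide does not give the formula used to map both metrics to [−1, 1] or the COCOMO variant, so this sketch assumes a min-max normalization and the basic COCOMO effort estimate for organic projects (effort = 2.4 · KLOC^1.05 person-months). It also assumes that T2, T20, and T40 are the run times on 2, 20, and 40 cores. All function names are hypothetical.

```python
def t_average(t2, t20, t40):
    """Average computation time over the three measured runs."""
    return (t2 + t20 + t40) / 3.0


def cocomo_effort(lines_of_code, a=2.4, b=1.05):
    """Basic COCOMO effort in person-months.

    Assumption: organic-mode coefficients a=2.4 and b=1.05; the slides
    do not state which variant or coefficients were used.
    """
    kloc = lines_of_code / 1000.0
    return a * kloc ** b


def map_to_interval(values):
    """Min-max map a list of values linearly onto [-1, 1].

    Assumption: the slides only state the target interval, not the
    mapping itself.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: place them in the middle of the interval.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]
```

With such a mapping, each approach becomes a point in the square [−1, 1] × [−1, 1], with one coordinate derived from the cost estimate and the other from the averaged computation time.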
Productivity
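Figure: 2D classification using the computational time and the COCOMO model.
The axes run from Easy to Difficult (cost) and from Slow to Fast (computation
time); the plotted approaches are Python, Go, Julia, Rust, Chapel, C++ 17,
HPX, Charm++, Swift, and Java.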
Performance measurements
AMD EPYC 7H12
Figure: Time [s] (logarithmic axis) versus #cores (0 to 40) for nx=1000000
and nt=1000, with one curve each for go, python, swift, rust, chapel, cxx,
hpx, julia, charm++, and java.
Intel® Xeon® Gold 6148 Skylake
Figure: Time [s] (logarithmic axis) versus #cores (0 to 40) for nx=1000000
and nt=1000, with one curve each for go, python, swift, rust, chapel, cxx,
hpx, julia, charm++, and java.
A64FX
Figure: Time [s] (logarithmic axis) versus #cores (0 to 40) for nx=1000000
and nt=1000, with one curve each for go, python, rust, chapel, cxx, hpx,
julia, charm++, and java.
Swift is missing, since no package was available for Rocky Linux.
Summary of performance measurements
Table: R² correlation of the fit of the measured data points for all
approaches and architectures, computed using Python NumPy.
Arch   C++   Charm++  Chapel  Rust  Go    Julia  HPX   Swift  Python  Java
Intel  0.49  0.36     0.45    0.52  0.28  0.41   0.52  0.56   0.43    0.03
AMD    0.48  0.45     0.53    0.49  0.75  0.12   0.42  0.02   0.46    0.12
A64FX  0.49  0.52     0.08    0.40  0.52  0.42   0.73  –      0.90    0.32
Python was the slowest approach.
Swift and Julia are comparable.
For more than 10 threads, Go performs slightly better than Swift and Julia.
For smaller core counts, up to eight cores, the remaining approaches behave
similarly.
However, Chapel gets slower for higher core counts.
For Rust, Charm++, and HPX, the performance is comparable. HPX is the fastest
for larger core counts, but has a high variance; see R² in Table 3.
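The R² values were computed with Python NumPy, but the slides do not say which fit was used. The sketch below therefore assumes a straight-line least-squares fit of time against core count, and it uses plain Python so that it runs without dependencies. The function names and the sample data are hypothetical and not taken from the benchmark.

```python
def linear_fit(xs, ys):
    """Least-squares straight line through (xs, ys): slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("all x values are identical; no line can be fitted")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Coefficient of determination of the straight-line fit."""
    slope, intercept = linear_fit(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0.0:
        raise ValueError("all y values are identical; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    cores = [1, 2, 4, 8]
    times = [8.0, 4.5, 2.5, 1.5]  # made-up values, not benchmark data
    print(r_squared(cores, times))
```

A value close to 1 means the measured points lie close to the fitted line, while a low value indicates that they scatter widely around it.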
Conclusion and Outlook
Conclusion and Outlook
Conclusion
We will not name a winner concerning speed.
The higher performing platforms were mostly similar in what they
achieved.
The tests in this paper depend on the hardware, the version of the
interpreters and compilers, the particular problem chosen, the amount of
effort applied, and our level of expertise (which varied by platform).
Outlook
More numerical applications for a more comprehensive comparison
Distributed runs and GPU support
I am happy to answer any of your questions.

 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 

Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HPX, Go, Julia, Python, Rust, Swift, and Java

  • 1. Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HPX, Go, Julia, Python, Rust, Swift, and Java. Patrick Diehl, Max Morris, Steven R. Brandt, Nikunj Gupta, and Hartmut Kaiser. Center of Computation and Technology, Department of Physics and Astronomy, Louisiana State University. patrickdiehl@lsu.edu. August 28, 2023. P. Diehl and et al. (LSU) August 28, 2023 1 / 26
  • 2. Motivation
      Rank  Language  Rating   Change
      1     Python    13.33%   -2.30%
      3     C++       11.41%   +0.49%
      4     Java      10.33%   -2.24%
      12    Go         1.16%   +0.20%
      18    Swift      0.90%   -0.35%
      19    Rust       0.89%   +0.32%
      20    Julia      0.85%   +0.41%
    Table: TIOBE Index for August 2023. Chapel is not listed in the index. Charm++ and HPX are using C++. How do these languages compare?
  • 3. Overview: (1) Model problem, (2) Features of the approaches, (3) Productivity, (4) Performance measurements, (5) Conclusion and Outlook
  • 4. Model problem
  • 5. Model problem I: The one-dimensional heat equation on a 1-D loop (e.g., a limp noodle) of length $L$ is described for all times $t > 0$ by
    $$\frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2}, \quad 0 \le x < L, \; t > 0, \tag{1}$$
    with $\alpha$ as the material's diffusivity. For the discretization in space, we use the $N$ grid points $x = \{x_i = i \cdot h \in \mathbb{R} \mid i = 0, \ldots, N - 1\}$ with the grid spacing $h$, and we use second-order finite differencing. For the discretization in time, we use the Euler method, i.e.
    $$u(t + \delta t, x_i) = u(t, x_i) + \delta t \cdot \alpha \, \frac{u(t, x_{i-1}) - 2 \cdot u(t, x_i) + u(t, x_{i+1})}{h^2}, \tag{2}$$
    with the initial condition $u(0, x_i) = x_i$. To model a loop, we use periodic boundary conditions, i.e. $u(t, x) = u(t, L + x)$.
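The update rule of Model problem I can be illustrated with a minimal serial sketch in pure Python. It assumes the standard second-order central difference (division by the squared grid spacing); the function names `heat_step` and `solve` and the default values for `alpha`, `dt`, and `h` are illustrative assumptions, not taken from the benchmark codes.

```python
def heat_step(u, alpha, dt, h):
    """Apply one explicit Euler step with periodic boundary conditions."""
    n = len(u)
    c = alpha * dt / (h * h)  # assumed small enough (c <= 0.5) for stability
    # u[i - 1] wraps to u[n - 1] when i == 0 (Python negative indexing);
    # (i + 1) % n wraps to u[0] when i == n - 1.
    return [u[i] + c * (u[i - 1] - 2.0 * u[i] + u[(i + 1) % n])
            for i in range(n)]


def solve(n, nt, alpha=1.0, dt=0.1, h=1.0):
    """Run nt steps starting from the initial condition u(0, x_i) = x_i."""
    u = [i * h for i in range(n)]
    for _ in range(nt):
        u = heat_step(u, alpha, dt, h)
    return u
```

With periodic boundaries the sum of all grid values is conserved from step to step, which gives a cheap sanity check for any implementation of the stencil.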
  • 6. Model problem II: The parallel algorithm was implemented by having multiple threads of execution, each sequentially applying Eq. 2 to a local segment of the grid. We used queues to communicate ghost zones between the segments. We note that, for this problem, the queues are single-producer, single-consumer and therefore, in principle, need no synchronization (although synchronization to suspend/resume threads seemed to help in some cases).
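The segment-plus-queue scheme described in Model problem II might look roughly like the following sketch. It is not any of the benchmarked implementations: the function name `parallel_heat`, the queue layout (`from_left`, `from_right`), and the default parameters are assumptions made for illustration, and it uses the synchronized `queue.Queue` from the Python standard library rather than a hand-written lock-free queue. It requires `num_threads <= n` so that every segment is non-empty.

```python
import queue
import threading


def parallel_heat(n, nt, num_threads, alpha=1.0, dt=0.1, h=1.0):
    """Run nt explicit Euler steps on n grid points split across threads."""
    c = alpha * dt / (h * h)
    u0 = [i * h for i in range(n)]  # initial condition u(0, x_i) = x_i
    bounds = [k * n // num_threads for k in range(num_threads + 1)]
    # from_left[k] holds ghost values sent to segment k by its left neighbor,
    # from_right[k] those sent by its right neighbor. Every queue has exactly
    # one producer and one consumer.
    from_left = [queue.Queue() for _ in range(num_threads)]
    from_right = [queue.Queue() for _ in range(num_threads)]
    result = [None] * num_threads

    def worker(k):
        seg = u0[bounds[k]:bounds[k + 1]]
        left = (k - 1) % num_threads   # periodic neighbors
        right = (k + 1) % num_threads
        for _ in range(nt):
            # Our first value is the right ghost of the left neighbor,
            # our last value is the left ghost of the right neighbor.
            from_right[left].put(seg[0])
            from_left[right].put(seg[-1])
            padded = [from_left[k].get()] + seg + [from_right[k].get()]
            seg = [padded[i] + c * (padded[i - 1] - 2.0 * padded[i] + padded[i + 1])
                   for i in range(1, len(padded) - 1)]
        result[k] = seg

    threads = [threading.Thread(target=worker, args=(k,))
               for k in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [value for seg in result for value in seg]
```

Because the queues are FIFO and each worker needs its neighbors' values before it can advance, no worker can run more than one time step ahead of its neighbors. In standard CPython the global interpreter lock prevents the threads from executing the stencil simultaneously, so the sketch shows the communication pattern rather than a speedup.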
  • 7. Features of the approaches
  • 8. Overview
      Approach  Async  Coroutine  ParAlg  Win  Linux  Mac  Licence
      C++ 17    X      X          X       X    X      X    GNU
      Java      X      X          X       X    X      X    GNU
      Swift     X      X          X       X    X      X    Apache
      Chapel    X      X          ∼       X    X      X    Apache
      Charm++   X      ∼          X       X    X      X    Own
      HPX       X      X          X       X    X      X    Boost
      Go        X      X          X       X    X      X    BSD
      Python    X      X          X       X    X      X    BSD
      Julia     X      X          X       X    X      X    MIT
      Rust      X      X          X       X    X      X    MIT
    Table: Overview of the programming languages: (1) the parallelism approaches they provide, (2) the supported operating systems, and (3) the license. The C++ 17 standard was used as a base. The symbol ∼ indicates partial support.
  • 9. Chapel: We had to write our own queue, and the full/empty bit synchronization mechanism was helpful. The coforall loop, which assigns a different thread to each iteration, provided a convenient mechanism for launching the outer loop. Chapel also lacked a built-in way to append to a file. However, opening a file, seeking to the end, and writing is possible. We also add that the support we received for questions asked in the Chapel Gitter was exceptional. We found Chapel among the higher-performing codes, comparable to Rust or C++.
  • 10. Go: We use go func to launch worker threads (goroutines) and buffered channels created with make() to facilitate the exchange of ghost zones. To synchronize the goroutines, we use sync.WaitGroup: we add threads by calling waitGroup.Add() and synchronize them by calling waitGroup.Wait(). At the time of this writing, only biogo, an HPC bioinformatics toolkit [1], is available. Reference: 1. Köster, J.: Rust-bio: a fast and safe bioinformatics library. Bioinformatics 32(3), 444–446 (2016)
  • 11. Julia: Julia is clearly inspired by both Python and Fortran. It is a good choice for Fortran programmers who want to get into scripting, as it offers some familiarity: array indexes start at one by default (instead of zero), and end marks the end of a block. In our Julia code, we implemented our own queue. Since Julia does not support classes directly (though it has structs), we found it convenient to use arrays. For parallelism, we used Julia's Threads.@threads for-loop macro. Julia's community contacted us and provided some optimized code. However, you need to be confident in Julia and know its internals to apply these optimizations.
  • 12. Rust: We use std::thread::scope to launch worker threads and non-blocking channels from std::sync::mpsc to facilitate the exchange of ghost zones. We avoided using unsafe, working only in the safe subset of Rust. Only two scientific codes (molecular dynamics and bioinformatics) use Rust. Because of its guarantees concerning data races and memory access, as well as its high performance, Rust is a potentially good choice for new scientific programming projects. However, Rust has vastly different syntax and semantics than more traditional languages like C++, Java, and Python, which may make for a steep learning curve.
  • 13. Swift: Swift claims to be safe by design and to produce lightning-fast software. Unfortunately, we had to disable the safety feature to get performant code. We use UnsafeMutableBufferPointer<Double> to avoid unnecessary calls of await for accessing the elements of arrays. These buffers allow explicit vectorization on newer x86 and Apple Silicon; see, for example, addingProduct. However, we could not measure a significant improvement using these functions. For concurrency, we use await with TaskGroup{ body: { group in}} to launch chunks of work on each thread and for await _ in group{} to wait for them. We found that Swift is designed for application development for iOS or macOS, but not for numerical applications.
  • 14. Productivity
  • 15. Lines of code: [Figure: bar chart of lines of code (LOC), axis from 0 to 200, for Python, Swift, HPX, Julia, Go, Rust, Chapel, Charm++, C++ 17, and Java.] The numbers were determined with the Linux tool cloc.
  • 16. Productivity metric: The average of the computation time is defined as $T_\text{average}(\text{approach}) := (T_2(\text{approach}) + T_{20}(\text{approach}) + T_{40}(\text{approach}))/3$. For the costs, we use the Constructive Cost Model (COCOMO). COCOMO does not reflect parallel features. However, the HPX community never proposed their cost model. We map both metrics to the interval $[-1, 1]$, using Easy and Difficult for the costs and Slow and Fast for the computation time. References: 1. Barry, B., et al.: Software engineering economics. New York 197 (1981). 2. Stutzke, R.D., Crosstalk, M.: Software estimating technology: A survey. Los Alamitos, CA: IEEE Computer Society Press (1997).
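As a rough illustration of the productivity metric, the sketch below averages the three timings and maps a set of values to the interval [-1, 1]. The slides do not say how the mapping is done, so the linear min-max rescaling in `rescale` is an assumption, as are the function names; the subscripts 2, 20, and 40 are taken here to be the core counts of the three runs.

```python
def average_time(t2, t20, t40):
    """Average computation time over the runs labeled 2, 20, and 40."""
    return (t2 + t20 + t40) / 3.0


def rescale(values):
    """Linearly map a dict of values to [-1, 1] (assumed min-max mapping)."""
    lo = min(values.values())
    hi = max(values.values())
    if hi == lo:
        raise ValueError("cannot rescale identical values")
    # The smallest value maps to -1, the largest to +1.
    return {name: 2.0 * (v - lo) / (hi - lo) - 1.0
            for name, v in values.items()}
```

Applied once to the averaged times and once to the COCOMO cost estimates, such a mapping would place every approach at one point in a two-dimensional Easy/Difficult versus Slow/Fast plane.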
  • 17. Productivity: [Figure: 2D classification using the computational time and the COCOMO model; the axes run from Easy to Difficult and from Slow to Fast; approaches shown: Python, Go, Julia, Rust, Chapel, C++ 17, HPX, Charm++, Swift, and Java.]
  • 18. Performance measurements
  • 19. AMD EPYC 7H12: [Figure: time [s] on a logarithmic axis versus #cores (0 to 40) for nx=1000000 and nt=1000; series: go, python, swift, rust, chapel, cxx, hpx, julia, charm++, java.]
  • 20. Intel® Xeon® Gold 6148 Skylake: [Figure: time [s] on a logarithmic axis versus #cores (0 to 40) for nx=1000000 and nt=1000; series: go, python, swift, rust, chapel, cxx, hpx, julia, charm++, java.]
  • 21. A64FX: [Figure: time [s] on a logarithmic axis versus #cores (0 to 40) for nx=1000000 and nt=1000; series: go, python, rust, chapel, cxx, hpx, julia, charm++, java.] Swift is missing, since no package was available for Rocky Linux.
  • 22. Summary of performance measurements
    Table: $R^2$ correlation of the fit of the measured data points for all approaches and architectures, computed using Python NumPy.
      Arch   C++   Charm++  Chapel  Rust  Go    Julia  HPX   Swift  Python  Java
      Intel  0.49  0.36     0.45    0.52  0.28  0.41   0.52  0.56   0.43    0.03
      AMD    0.48  0.45     0.53    0.49  0.75  0.12   0.42  0.02   0.46    0.12
      A64FX  0.49  0.52     0.08    0.40  0.52  0.42   0.73  –      0.90    0.32
    Python was the slowest approach. Swift and Julia are comparable. For more than 10 threads, Go behaves slightly better than Swift and Julia. For smaller core counts, up to eight cores, the remaining approaches behave similarly. However, Chapel gets slower for higher node counts. For Rust, Charm++, and HPX, the performance is comparable. HPX is the fastest for larger node counts but has a high variance, see $R^2$ in Table 3.
  • 23. Conclusion and Outlook
  • 24. Conclusion and Outlook. Conclusion: We will not name a winner concerning speed. The higher-performing platforms were mostly similar in what they achieved. The tests in this paper depend on the hardware, the version of the interpreters and compilers, the particular problem chosen, the amount of effort applied, and our level of expertise (which varied by platform). Outlook: More numerical applications for a more comprehensive comparison; distributed runs and GPU support. I am happy to answer any of your questions.
  • 25. Special issue
  • 26. Advertisement