This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344. Lawrence Livermore National Security, LLC.
Sierra: Science Unleashed
HPC Advisory Council
Stanford University
Presenter: Rob Neely
Weapons Simulation and Computing Program Coordinator for Computing Environments
Feb 14, 2019
Outline
• LLNL’s Mission for Stockpile Stewardship
• Sierra system overview
• Motivation for Sierra
• Application preparation with the “Sierra Center of Excellence”
• Exemplar unclassified NNSA application areas
• Open Science Period Results
• Earthquake modeling with SW4
• Biological modeling of Cancer
• Neutron lifetime
• Extreme-scale RANS modeling
• Future directions: AI/ML in scientific computing
LLNL’s Mission Requirements for
Extreme-scale Computing
Lawrence Livermore Nat’l Lab (LLNL) is a
multidisciplinary national security laboratory
§ Established in 1952
§ Approximately 7,000 employees
§ 1 square mile, 684 facilities
Located in the SF Bay Area
Science and Technology areas: Nuclear Security; Other National Security; Bio-Science and Bio-Engineering; Earth and Atmospheric Science; High-Energy-Density Science; Lasers and Optical S&T; Advanced Materials and Manufacturing; Nuclear, Chemical, and Isotopic S&T; High-Performance Computing, Simulation, and Data Science.
Nuclear security and the Stockpile Stewardship Program
§ Assess annually the safety, security, reliability, and effectiveness of the stockpile
§ Extend the life of stockpile warheads, adapting safety and security features to evolving requirements
§ Strengthen underpinning science,
technology, and engineering
§ Nuclear nonproliferation and
counterterrorism
The Stockpile Stewardship Program has successfully maintained
the nuclear stockpile without explosive nuclear testing since 1992
Life extension programs can improve safety and security while providing no new military
capabilities (consistent with policy)
[Diagram: a legacy system can be refreshed by two approaches: a remanufacture approach (remanufacture as close as possible to the original) or an improved safety and security approach. Even with improved safety and security, you "still cannot take it on the track": the system gains no new capability.]
Nuclear security is more than nuclear weapons
§ Non-proliferation: verification monitoring; sampling and analysis; safety and security of special nuclear material
§ Treaty monitoring: monitoring and detection for nuclear test ban treaty agreements; analysis and characterization of nuclear tests
§ Counterterrorism: forensics; detection and diagnostics; incident response
Providing simulation capabilities to the end user involves a large,
integrated ecosphere
The supporting elements: development and debugging tools; computers; facilities; filesystems and networks; security access controls; visualization; codes.
[Diagram: the classified (SCF) computing environment of the ASCI White era as an example: ASCI platforms (White, SKY), Linux clusters, visualization servers, login nodes, NFS file servers, GigE and ATM networks with encryptors, and DisCom2 WAN links to SNL and LANL.]
High-performance computing (HPC) is central to sustaining the
nuclear stockpile in the absence of additional nuclear tests
What drives super-exascale requirements is 3D high resolution,
enhanced physics calculations done in an uncertainty quantification
regime
• Advanced physics and algorithms (new codes?)
• Higher fidelity and resolution in 3D (new platforms, scalable codes)
• Predictive simulation through quantification of uncertainties (1000s of managed simulations and data analytics)
Our mission demands require
progress along all three axes to
advance our predictive capabilities
LLNL has a long history of fielding Top500 systems
Courtesy of Erich Strohmaier (LBL) SC’17 Invited talk
https://www.top500.org/static/media/uploads/presentations/top500_ppt_invited_201711.pdf
Normalized HPL performance
of deployed systems over the
first 50 Top500 lists (25 years)
What is not captured here is the derivative: like a moving average, this picture will change over the next decade as non-US/DOE sites come to dominate the top 3 positions.
Sierra System Overview
Sierra is a 125-petaflop (peak) system at the Department of Energy's Lawrence Livermore National Laboratory, supporting the national security mission and advancing science in the public interest.
System Overview
System performance: peak performance of 125 petaflops for modeling and simulation.
Each node has:
• 2 IBM POWER9 processors
• 4 NVIDIA Tesla V100 GPUs
• 320 GiB of fast memory (256 GiB DDR4 + 64 GiB HBM2)
• 1.6 TB of NVMe memory
The system includes:
• 4,320 nodes
• A 2:1 tapered Mellanox EDR InfiniBand tree topology (50% global bandwidth) with a dual-port HCA per node
• A 154 PB IBM Spectrum Scale file system with 1.54 TB/s R/W bandwidth
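As a quick cross-check (not on the original slide), the per-node values above reproduce the system-level figures quoted later in this talk. A minimal sketch in Python, assuming only the listed specs:

```python
# Sanity-check Sierra's aggregate numbers from the per-node figures above.
# Inputs are the listed specs; everything derived is simple arithmetic,
# not an official specification.
nodes = 4320
gpus_per_node = 4
node_fast_memory_gib = 320          # 256 GiB DDR4 + 64 GiB HBM2

total_gpus = nodes * gpus_per_node
print(total_gpus)                   # 17280 -> the "more than 17,000 GPUs" cited later

total_memory_pib = nodes * node_fast_memory_gib / 1024**2
print(f"{total_memory_pib:.2f} PiB of node memory")   # ~1.32 PiB
```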
What makes Sierra so effective?
GPUs: Sierra links more
than 17,000 deep-learning optimized
NVIDIA GPUs with the potential to
deliver exascale-level performance (a
billion-billion calculations per second)
for AI applications.
High-Speed Data Movement: the high-speed Mellanox interconnect and the NVLink high-bandwidth technology built into all of Sierra's processors supply the next-generation information superhighways.
Memory: Sierra’s sizable memory gives
researchers a convenient launching point for
data-intensive tasks, an asset that allows for
greatly improved application performance and
algorithmic accuracy as well as AI training.
CPUs: IBM Power9 processors to rapidly
execute serial code, run storage and I/O
services, and manage data so the compute is
done in the right place.
IBM Power9 Processor
• Up to 24 cores
• Sierra’s P9s have 22 cores for yield optimization on
first processors
• PCI-Express 4.0
• Twice as fast as PCIe 3.0
• NVLink 2.0
• Coherent, high-bandwidth links to GPUs
• 14nm FinFET SOI technology
• 8 billion transistors
• Cache
• L1I: 32 KiB per core, 8-way set associative
• L1D: 32 KiB per core, 8-way
• L2: 256 KiB per core
• L3: 120 MiB eDRAM, 20-way
NVIDIA Volta Details
Tesla V100 for NVLink (performance with NVIDIA GPU Boost):
• Double-precision: 7.8 teraFLOPS
• Single-precision: 15.7 teraFLOPS
• Deep learning: 125 teraFLOPS
• Interconnect bandwidth (bi-directional): NVLink, 300 GB/s
Tesla V100 for PCIe (performance with NVIDIA GPU Boost):
• Double-precision: 7 teraFLOPS
• Single-precision: 14 teraFLOPS
• Deep learning: 112 teraFLOPS
• Interconnect bandwidth (bi-directional): PCIe, 32 GB/s
Memory (both variants): 16 GB CoWoS stacked HBM2, 900 GB/s bandwidth
Lassen is an unclassified system similar to Sierra, but smaller: its peak performance is 20.9 petaflops vs. Sierra's 125 petaflops. Multiple programs and institutional users will use Lassen for leading-edge science, such as predictive biology, computer-aided drug discovery, or exploring novel materials, on a world-class system.
Lassen System Details (subject to change on final install):
• Zone: CZ (Unclassified)
• Nodes: 720
• POWER9 processors per node: 2
• GV100 (Volta) GPUs per node: 4
• Node peak: 29.1 tFLOP/s
• System peak: 20.9 pFLOP/s
• Node memory: 320 GiB
• System memory: 0.225 PiB
• Interconnect: 2x IB EDR
• Off-node aggregate bandwidth: 45.5 GB/s
• Compute racks: 40
• Network and infrastructure racks: 4
• Storage racks: 4
• Total racks: 48
• Peak power: ~1.8 MW
LLNL's Sierra System Installation
The Need to Pivot Toward Exascale
Advancements in HPC have happened over three major
epochs, and we’re settling on a fourth (heterogeneity)
Each of these eras defines a new common programming model and
transitions are taking longer; DOE/NNSA is entering a fourth era.
[Chart: peak speed (flops) of systems from 1952 (kiloscale) through roughly 2025 (exascale), spanning the Serial, Vector, Distributed Memory, and Many Core eras, with code transition periods at the era boundaries. Systems marked include ASCI Red, Blue Pacific, White, Red Storm, Purple, BlueGene/L, Roadrunner, Cielo, Sequoia, Trinity, Sierra, Crossroads, ATS-4, and ATS-5.]
Transitions are essential for progress: between 1952 and 2012 (Sequoia), NNSA saw a factor of ~1 trillion improvement in peak speed.
After a decade or so of uncertainty, history will most likely refer to this next era as the heterogeneous era.
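For scale, a factor of $10^{12}$ over 60 years works out to

$$\left(10^{12}\right)^{1/60} = 10^{0.2} \approx 1.58\ \text{per year}, \qquad t_{\mathrm{double}} = \frac{\ln 2}{\ln 1.58} \approx 1.5\ \text{years},$$

i.e., roughly a doubling every year and a half, sustained across four architectural eras.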
The drive for more capable high-end computing is causing disruption in HPC application development:
• Power: unreasonable operating costs; with today's technology, roughly $100M/year, exceeding capital investments
• Chip/processor efficiency: GPUs/accelerators; simpler cores; unreliability at near-threshold voltages, compounded by the scale of the systems
• On-node "inscaling" challenges: complex hardware; massive on-node concurrency requirements; new programming and memory models
• Disruptive changes: code evolution; algorithm rewrites; platform uncertainty
The trickle-down (assuming continued increases in computational power are desired): external constraints shape hardware designs; software supports the hardware; applications react.
Meanwhile, "Big Data" and AI are driving industry trends and investments:
• Social networking, machine/deep learning, cloud computing, ...
• Software innovation in data-centric HPC is thriving
• Traditional HPC faces strong headwinds if it can't adapt
Processor trends tell the parallelism story: transistor counts (Moore's Law) are alive and well; single-thread performance is barely hanging on (via dynamic clock management and IPC tricks); clock frequency is flat, or even slightly down; power is mostly flat; and the remaining exponential growth is in parallelism.
The LLNL/ASC procurement strategy relies on
long-lived partnerships with HPC vendors
The lifecycle, from concept through 9-10 years to retirement:
1. Pre-competitive discussions with potential bidders
2. Establish representative evaluation benchmarks
3. Request for Proposals with target requirements
4. Contract selection
5. Non-recurring engineering contracts (make the system better, not bigger)
6. Early delivery system(s); establish a Center of Excellence for application development
7. Nail down target requirements as final deliverables
8. Machine delivery and machine acceptance
9. ~5 years of effective use; begin the next procurement cycle
10. Machine retirement
• Long lead times in procurements allow vendor roadmaps to be clear enough to bid, but still fungible
• Use of target requirements reduces risk to the vendor, and thus cost to the Program
• Testbed architectures make up the unofficial third tier of the ASC platform strategy: evaluation of competing and potential future technologies at smaller scale
Predicting the life cycle behavior of nuclear warheads requires simulation across a wide range of spatial and temporal scales.
[Figure: spatial scales from 10⁻¹⁰ m to 1 m, spanning nuclear physics, atomic physics, the molecular level, turbulence, continuum models, and the full system.]
Large, integrated multi-physics codes provide the basic simulation capabilities for a broad range of application domains:
§ Long-lifetime projects: 15+ years of development by large teams (10-20+ people, roughly 50% computer scientists and 50% physicists)
§ ~900 kloc of source code, growing by an average of 60 kloc/year in a representative integrated code (IC)
§ Algorithms tuned for minimal turnaround time based on today's hardware!
Application examples: HE cookoff, Navy railguns, fracture and failure, blast on structures, inertial confinement fusion, additive manufacturing.
Sierra represents a major shift for our mission-critical applications on advanced architectures:
• 2004-present (BG/L, BG/P, BG/Q): IBM BlueGene, along with copious amounts of Linux-based cluster technology, dominates the HPC landscape at LLNL. Apps scale to an unprecedented 1.6M MPI ranks, but are still largely MPI-everywhere on "lean" cores.
• 2014: Heterogeneous GPGPU-based computing begins to look attractive with NVLink and Unified Memory (P9/Volta). Apps must make a 180-degree turn from BlueGene toward heterogeneity and large-memory nodes.
• 2015-20??: Heterogeneous computing is here to stay, and Sierra gives us a solid leg up in preparing for exascale and beyond.
Application enablement through the Center of Excellence – the original (and enduring) concept:
Inputs: vendor application/platform expertise; staff (code developers); applications and key libraries.
Activities: hands-on collaborations; on-site vendor expertise; customized training; early delivery platform access; application bottleneck analysis; both portable and vendor-specific solutions. Technical areas: algorithms, compilers, programming models, resilience, tools, ...
Outcomes: codes tuned for next-gen platforms; rapid feedback to vendors on early issues; publications and early science; improved software tools and environment; performance portability.
A steering committee guides the effort; the LLNL/Sierra CoE is jointly funded by the ASC Program and the LLNL institution.
High level goals of the Center of Excellence:
• To train ASC and LLNL institutional application teams in the art and science of developing high performance software on the Sierra architecture
• To have our flagship applications utilizing advanced features of Sierra as soon as the machine is generally available, and fully optimized (4-6x improvement over Sequoia) for the architecture within 1-2 years of deployment
• To develop performance-portable solutions that allow application teams to focus on maintaining a single source code that will be effective on platforms deployed at our sister laboratories
• To build strong ties with our vendor partners as a key element of our co-design strategy: leveraging their deep knowledge of the hardware and of algorithmic approaches specific to their platform, and informing them of our long-term application requirements as they will impact future machine designs
Specific strategies for each of these goals were captured in an "Execution Strategy" document and used to guide our implementation.
High resolution ICF simulations help us understand yield degradation mechanisms.
[Figure: the basics of inertial confinement fusion: lasers heat the inside surface of the hohlraum; radiation drives the capsule inward; the DT fuel ignites, producing fusion energy.]
Taking Inertial Confinement Fusion (ICF) simulations to a higher level (Chris Clouse, LLNL)
Until recently, 3D simulations were often run as a double check on routinely employed 2D approximations. These 3D simulations were computationally expensive and, as a result, were run as sparingly as possible. Design and analysis in LLNL's National Ignition Facility (NIF) Program has until now relied primarily on 2D approximations because 3D simulations could not be turned around quickly enough to make them a useful routine design tool.
Sierra's architecture, which is expected to bring speed-ups on the order of 10X for many of our 3D applications, will be able to process these crucial simulations efficiently, changing the way users compute by making the use of 3D routine.
"Sierra provided an unprecedented (98 billion cell) simulation of two-fluid mixing in a spherical geometry. Understanding hydrodynamic instability and the transition to turbulence is important in inertial confinement fusion and High Energy Density (HED) Physics."
—Chris Clouse
Developing the next generation of algorithms and codes (Rob Rieben, LLNL)
To realize the full potential of new computer architectures like Sierra's, we are developing new algorithms that play to the strengths of the machine, specifically the high intensity of compute operations available to data that remain local.
We are also developing next-generation simulation codes for inertial confinement fusion and nuclear weapons analysis that employ high-order, compute-intensive algorithms that maximize the amount of computing done for each piece of data retrieved from memory. These schemes are very robust and should significantly improve the overall analysis workflow for users. These advanced simulation tools enabled by Sierra will improve throughput along two axes: faster turnaround and less user intervention.
"The combination of our next generation, compute intensive, high-order algorithms with the processing power of Sierra will deliver an exciting new era in multi-physics simulations."
—Rob Rieben
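The "computing per byte retrieved" argument can be made concrete with a roofline-style estimate. A minimal sketch in Python, using the V100 numbers from the Volta slide earlier in this deck; the sample intensities are illustrative, not measurements from any LLNL code:

```python
# Back-of-envelope machine balance for one Sierra V100 (7.8 DP TFLOPS,
# 900 GB/s HBM2, from the Volta slide). A kernel needs roughly this many
# flops per byte moved from memory before it becomes compute-bound, which
# is why high-order methods aim to raise arithmetic intensity.
peak_dp_tflops = 7.8
hbm2_bw_gbs = 900.0

balance = peak_dp_tflops * 1e12 / (hbm2_bw_gbs * 1e9)
print(f"machine balance: {balance:.1f} flops per byte")   # ~8.7

def roofline_tflops(intensity_flops_per_byte):
    # Simple roofline: attainable performance is the lesser of the
    # bandwidth-limited rate and the compute peak.
    return min(peak_dp_tflops, intensity_flops_per_byte * hbm2_bw_gbs / 1000.0)

for intensity in (1.0, 8.7, 30.0):   # low-order vs. high-order regimes (illustrative)
    print(intensity, roofline_tflops(intensity))
```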
Early Science Applications
Early Science: the time between when the system is first booted (~3/1/18) and when it is transitioned to the classified network for programmatic use (~1/20/19), during which friendly users help "shake down" the system.
Characteristics: a buggy or incomplete software stack, random bad hardware, reboots on a whim, nursing problems along through the night, etc.
Benefits: when it's up, you can sometimes get the whole system to yourself for as long as you can keep things running. Invaluable feedback to the facility and vendors, and a one-time shot for open science apps.
Arthur Rodgers
LLNL
The character of strong earthquake shaking near a fault is
highly variable and poorly constrained by empirical data.
Supercomputers enable simulation of earthquake motions
to investigate the hazard and risk to buildings and
infrastructure before damaging events occur. The large
scale (~100 km) and fine detail (~10 m) of high frequency
waves (>5 Hz) from damaging earthquakes require today’s
most powerful computers.
Using SW4-RAJA, a 3D seismic simulation code ported to
GPU hardware, LLNL researchers are able to increase the
resolution of earthquake simulations to span most
frequencies of engineering interest on regional domains
with rapid throughput to enable sampling of various
rupture scenarios and sub-surface models.
Sierra advances resolution of
earthquake simulations
“Sierra’s thousands of GPU-accelerated nodes allow
SW4-RAJA to compute earthquake ground motions
with 100’s of billions of grid points in shorter run
times so we can resolve high-frequency waves and
investigate different rupture scenarios or earth
models.”
—Arthur Rodgers
Modeling a 7.0 quake on the Hayward Fault (SF Bay Area)
Courtesy: Artie Rodgers, LLNL
Computational Domain and Rupture
• Domain: 120 km x 80 km x 35 km, centered on San Leandro; hypocenter shown as a green star
• Rupture: MW 7.0, dimensions 77 km x 13 km, draped onto the 3D geometry of the Hayward Fault
National lab HPC resources we run on: Cori-II at LBNL/NERSC and Sierra at LC/LLNL.
• Cori Phase-II at LBNL/NERSC: #10 on the Top500; 27 petaFLOPS (peak); one 68-core Intel Xeon Phi (CPU) per node
• Sierra at LLNL: #3 on the Top500; 119 petaFLOPS (peak); 2 IBM POWER CPUs and 4 NVIDIA GPUs per node
The SW4 port to GPU platforms was supported by a 3-year LLNL internal project. Future platforms will rely heavily on GPUs.
Summer 2018 Hayward Fault MW 7.0 runs: SW4 (CPU) on Cori-II vs. SW4-RAJA (GPU) on Sierra.
Common parameters: fmax = 5.0 Hz; vSmin = 500 m/s; PPW = 8; T = 90 s; hmin = 12.5 m; Npoints = 25.8 billion; Ntime-steps = 63,063.
• SW4 on Cori-II: checkpointing every 4000 time-steps; ran on 8192 nodes (85% of Cori-II; 524,288 CPU cores); solver time 10.3 hours
• SW4-RAJA on Sierra: no checkpointing; ran on 256 nodes (6% of Sierra; 1024 GPUs); solver time 10.3 hours
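As a consistency note (not on the original slide), the listed grid spacing follows from the other parameters under the usual points-per-wavelength resolution rule:

$$h_{\min} \;=\; \frac{v_{S,\min}}{\mathrm{PPW}\cdot f_{\max}} \;=\; \frac{500\ \mathrm{m/s}}{8 \times 5.0\ \mathrm{Hz}} \;=\; 12.5\ \mathrm{m}.$$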
Verification of SW4 and SW4-RAJA: Peak Ground Velocity maps (HF MW7.0 3DTOPO run; fmax = 5 Hz, hmin = 12.5 m).
Two codes on different platforms produce PGV maps that agree to machine precision:
• Cori-II: PGV = 2.1797349453 m/s
• Sierra: PGV = 2.1797349453 m/s
Verification of SW4 and SW4-RAJA: waveforms agree to single precision (SAC files).
SW4 on Cori-II (5 Hz) and SW4-RAJA on Sierra (5 Hz) overplotted; the difference is ~1e-7. Two codes on different platforms produce waveforms that agree to single precision.
Comparison of near-fault accelerations for a range of resolutions (fmax = maximum frequency); Sierra run, Aug. 2018, fmax = 10.0 Hz (Livermore station, 37.6 km).
As frequency content increases, amplitudes increase and are more consistent with observations. Doubling fmax requires 16x more computational effort.
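The 16x figure follows from how the grid and time step scale with frequency: $h_{\min} \propto 1/f_{\max}$ in all three spatial dimensions, and the CFL-limited time step shrinks with $h_{\min}$, so for a fixed simulated time $T$

$$\text{cost} \;\propto\; N_{\text{points}} \times N_{\text{steps}} \;\propto\; f_{\max}^{3} \cdot f_{\max} \;=\; f_{\max}^{4}, \qquad 2^{4} = 16.$$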
GMIMs (ground motion intensity measures) versus distance, compared with the ASK (2014) GMM: HF MW 7.0, 10.0 Hz, PGV.
Perspective on previous regional-scale Hayward Fault runs on
various machines
Joint Design of Advanced Computing Solutions for Cancer: a collaboration between DOE and NCI.
• Pilot 1, Pre-clinical Model Development: Yvonne Evrard (FNLCR), Rick Stevens (ANL)
• Pilot 2, RAS Biology in Membranes: Dwight Nissley (FNLCR), Fred Streitz (LLNL)
• Pilot 3, Precision Oncology Surveillance: Lynn Penberthy (NCI), Gina Tourassi (ORNL)
Courtesy: Fred Streitz, Jim Glosli, LLNL
Oncogenic RAS is responsible for many human cancers: 93% of all pancreatic, 42% of all colorectal, and 33% of all lung cancers; 1 million deaths/year worldwide; no effective inhibitors.
RAS biology: the pathway transmits signals; RAS is a switch, and oncogenic RAS is "on"; RAS localizes to the plasma membrane; RAS binds effectors (RAF) to activate growth. (Simanshu, Cell 170, 2017)
Essential strategy: use the appropriate-scale methodology for each component. It is a problem of length and time scales: the membrane evolves on a ms time frame across µm, while protein interactions involve µs across nm.
The approach: model the membrane with RAS at the micron (continuum) scale, and model protein behavior at the molecular scale.
From continuum to molecular modeling, step 1: identify regions of interest using AI techniques.
• Use a variational autoencoder to learn a reduced order model (ROM) of the system
• Map every region of the simulation into this ROM
• Rank every patch in the simulation by its distance from the others ("uniqueness") in the ROM
• Identify the most unique patches as the most interesting (see the sketch below)
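A minimal sketch of the ranking step in Python, assuming a variational autoencoder has already been trained; here the encoder output is replaced by random stand-in latent vectors, and all names and dimensions are illustrative placeholders:

```python
import numpy as np

# Stand-in for encode(patches): latent (ROM) coordinates of each membrane
# patch, as a trained VAE encoder would produce them.
rng = np.random.default_rng(0)
n_patches, latent_dim = 1000, 16
latent = rng.normal(size=(n_patches, latent_dim))

# Score each patch by its mean distance to all other patches in latent
# space: the farther from the crowd, the more "unique" the local
# configuration, and the better a candidate for molecular-scale simulation.
diffs = latent[:, None, :] - latent[None, :, :]
dists = np.linalg.norm(diffs, axis=-1)
uniqueness = dists.mean(axis=1)

most_interesting = np.argsort(uniqueness)[::-1][:10]   # top-10 most unique patches
print(most_interesting)
```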
From continuum to molecular modeling, step 2: generate particle positions.
• Lipid positions are randomly generated consistent with the composition
• RAS proteins, if present, are inserted with the appropriate conformation
• The system is equilibrated to "smooth edges"
From continuum to molecular modeling, step 3: simulate at molecular resolution.
From continuum to molecular modeling, step 4: feed molecular information back to the continuum model.
Preliminary Results
• 50 nm x 50 nm high-resolution study with 2 RAS proteins (40 μs)
• Investigates phenomena witnessed originally in the μm x μm scale simulation
• Aggregation/repulsion of charged lipids (PAP6, PAPS, DIPE, CHOL) following a "collision" of RAS
• Results demonstrate the importance of time and length scales for simulation
The Neutron Lifetime on Sierra Early Science
Pavlos Vranas (LLNL), André Walker-Loud (LBNL), et al. (CalLat Collaboration)
❑ The lifetime of a neutron is ~15 min (~881 s), and its value has a profound effect on the mass composition of the universe.
❑ In the known theory of physics (QCD), neutrons decay to protons. Beam experiments count how many protons emerge from a beam of neutrons: τ = 887.3 ± 3.1 s.
❑ Bottle experiments trap neutrons in a "bottle" and measure how many are left after a period of time: τ = 879.4 ± 0.6 s. If neutrons also decay to other, unknown particles (dark matter?), the results will not agree with the known theory of physics (QCD).
❑ These two methods result in measurements of the lifetime that have a 99% probability of being incompatible. Why? Is there an unaccounted-for systematic? Or, more exciting, is there new physics causing them to disagree? Which method is consistent with the prediction from the Standard Model? Answering this question requires access to leadership-class supercomputers.
❑ Using Sierra, the team simulated the fundamental theory of Quantum Chromodynamics (QCD) and calculated the lifetime to an unprecedented theoretical accuracy (blue bar). This preliminary result has an uncertainty 4 times smaller than the next best result in the field (equivalent to a factor of 16 times more data).
❑ This early science time allowed the team to form a concrete plan to further reduce the uncertainty and achieve a discriminating level of precision in the theoretical prediction.
Courtesy: Pavlos Vranas, LLNL
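The quoted 99% incompatibility follows directly from the two measurements (a reasoning step added here):

$$\Delta\tau = 887.3 - 879.4 = 7.9\ \mathrm{s}, \qquad \sigma_{\Delta} = \sqrt{3.1^{2} + 0.6^{2}} \approx 3.2\ \mathrm{s}, \qquad \frac{\Delta\tau}{\sigma_{\Delta}} \approx 2.5,$$

a ~2.5σ discrepancy, i.e., roughly a 99% probability that the two results are inconsistent.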
The Neutron Lifetime on Sierra Early Science (continued)
Pavlos Vranas (LLNL), André Walker-Loud (LBNL), et al. (CalLat Collaboration)
❑ The vertical gray band denotes the physical value of the pion mass; these points are significantly more expensive to compute than the rest, but the most valuable for the final predictions.
❑ The green point in our publication cost as much computing time as all the other points combined; the green point from Sierra has 10x more statistics than our publication.
❑ The red point from our publication was not useful; the red point from Sierra came from an entirely new calculation and is now very useful.
❑ The blue point from Sierra was entirely unattainable on previous computers (it still needs more statistics to be useful).
[Plot, two panels (Nature 558 (2018) no. 7708, 91-94; Sierra Early Science, preliminary): the nucleon axial coupling g_A versus ε_π = m_π/(4πF_π) at lattice spacings a ≈ 0.15, 0.12, and 0.09 fm, with the continuum extrapolation g_A^LQCD(ε_π, a = 0), the model average, and the PDG value g_A = 1.2723(23).]
Stanford PSAAP Center Results: Extreme-Scale HPC with the Soleil-X Multiphysics Solver
M. Papadakis, L. Jofre, G. Iaccarino & A. Aiken (Stanford)
▪ Research objectives: performance and scalability at extreme scales; demonstrate the potential of Soleil-X
▪ Physics modeling: compressible flow formulation; Lagrangian particle tracking; algebraic & DOM radiation models
▪ Computational approach: written in Regent; targets the Legion runtime; GASNet library for communication; Realm for OpenMP; CUDA toolkit for GPU
▪ Performance & scalability analysis: strong and weak scaling; cost distribution across the different physics; time-to-solution; task-mapping (CPU/GPU) studies
[Plot: weak-scaling efficiency of the total solver from 1 to 2048 nodes; annotations mark the maximum communication pattern and a runtime bottleneck under investigation.]
Conclusions & ongoing work:
- At present, Soleil-X is (on average) 1.7 times faster than Soleil-MPI in terms of time-to-solution
- Soleil-MPI shows very good weak-scaling efficiency on Titan up to 256 nodes
- Soleil-X shows very good weak-scaling efficiency on both Titan and Sierra up to 256 nodes
- Ongoing investigation of the runtime bottleneck observed at 512 nodes
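For reference, the efficiency plotted above is the standard weak-scaling measure: the problem size grows with the node count, so ideal time-to-solution is flat and efficiency is the 1-node time over the N-node time. A minimal sketch with made-up placeholder timings:

```python
# Weak-scaling efficiency: E(N) = T(1) / T(N) when work per node is fixed.
def weak_scaling_efficiency(times_by_nodes):
    """times_by_nodes: {node_count: wall_time}; must include the 1-node run."""
    t1 = times_by_nodes[1]
    return {n: t1 / t for n, t in sorted(times_by_nodes.items())}

# Illustrative numbers only: flat to 256 nodes, then a drop at 512,
# mirroring the behavior described on this slide.
example = {1: 100.0, 256: 104.0, 512: 160.0}
print(weak_scaling_efficiency(example))   # {1: 1.0, 256: ~0.96, 512: ~0.63}
```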
We ran a massive turbulent fluid mixing simulation on Sierra in October 2018.
§ A high-resolution simulation of two-fluid mixing in a spherical geometry was run to gain insight into the growth of a Rayleigh-Taylor instability.
§ High-resolution simulations of instability growth are not practical for routine use, so simulations like this help guide the development of sub-grid models that capture instability effects at much less computational cost, which are used in Inertial Confinement Fusion (ICF) calculations.
In-situ visualization of the mixing layer. Courtesy: Cyrus Harrison, LLNL
Highlights:
§ The 97.8 billion element simulation ran across 16,384 GPUs on 4,096 Sierra Compute
Nodes
§ The simulation application used CUDA via RAJA to run on the GPUs
§ Time-varying evolution of the mixing was visualized in-situ using Ascent, also
leveraging 16,384 GPUs
§ Ascent leveraged VTK-m to run visualization algorithms on the GPUs
§ The last time step was exported to the parallel file system for detailed post-hoc
visualization using VisIt
This simulation was the first to utilize over 16,000 GPUs on
Sierra
This successful simulation is the result of many years of preparation!
Porting has unlocked impressive performance on Sierra and positions us well for future architectures.
§ Running physics algorithms on the GPUs allows us to scale and turn around massive simulations in a power-efficient way
§ For this simulation, the physics calculations alone (excluding setup, I/O, etc.) took 60 hours of compute time on Sierra's V100 GPUs
§ We estimate that on the entire Sequoia system the same problem would take 30 days of physics compute time
Compared node to node to other HPC architectures, we expect Sierra will provide 6x to 12x faster simulation turnaround time.
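For reference, the whole-machine comparison above works out as follows, consistent with the top of the quoted node-to-node range:

$$\frac{30\ \text{days (Sequoia)}}{60\ \text{hours (Sierra)}} = \frac{720\ \mathrm{h}}{60\ \mathrm{h}} = 12\times.$$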
Similarly, DOE's visualization community has invested over many years to create tools for in-situ and GPU-enabled visualization.
[Timeline, 2010-2018: data-parallel visualization research (EAVL, DAX, PISTON) in the early 2010s; the Strawman in-situ prototype (ISAV15) and performance modeling with Strawman (SC16); the ASCAC subcommittee report on exascale computing and the start of ECP; Ascent (ISAV17); and, in 2018, in-situ visualization of the 97.8 billion element simulation on Sierra using Ascent.]
On Sierra, computation far outpaces our ability to save data to the parallel file system for detailed post-hoc visualization. Knowing this, we used a mix of in-situ (in-memory) and post-hoc (file-based) visualization:
§ In-situ: we used Ascent (https://github.com/alpine-dav/ascent) to visualize the time-varying evolution of the mixing layer and to export the final mesh data to compressed HDF5 files for post-hoc exploration. Ascent used VTK-m (http://m.vtk.org/) to run visualization algorithms on the GPUs.
§ Post-hoc: we used VisIt (https://visit.llnl.gov) to explore details of the final simulation state.
In-situ visualization of the time-varying details of the simulation avoided terabytes of I/O compared with post-hoc visualization.
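A rough sense of the I/O avoided; the element count is from this deck, while bytes per value, variable count, and dump cadence are illustrative assumptions, not measured values:

```python
# Estimate the file-system traffic that saving every visualization step
# would have generated for the 97.8-billion-element mesh.
elements = 97.8e9
bytes_per_value = 8      # double precision (assumed)
variables = 4            # e.g., density + 3 velocity components (assumed)
saved_steps = 100        # hypothetical number of time steps written out

per_step_tb = elements * bytes_per_value * variables / 1e12
print(f"~{per_step_tb:.1f} TB per saved step")                          # ~3.1 TB
print(f"~{per_step_tb * saved_steps:.0f} TB for {saved_steps} steps")   # ~313 TB
```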
Post-hoc visualization of the mixing layer (from the same October 2018 Sierra run).
Conclusion: this successful simulation is the result of many years of preparation in both our simulation and visualization tools (highlights as listed above).
Links:
§ LLNL's Sierra supercomputer: https://computation.llnl.gov/computers/sierra
§ RAJA: https://github.com/llnl/raja
§ Ascent: https://github.com/alpine-dav/ascent
§ VTK-m: http://m.vtk.org/
§ VisIt: https://visit.llnl.gov
§ Conduit: https://github.com/llnl/conduit
Future simulations for complex systems will be guided by machine learning. Cognitive simulation learns an approximate model for simulation-state dynamics and output as the simulation runs.
[Diagram: a complex design model feeds a simulation-ensemble pipeline; on-line machine learning combines model predictions and uncertainties with experimental data to suggest the next simulation and the next experiment.]
We're developing an integrated roadmap for NNSA and LLNL in cognitive simulation.
The Virtuous Circle of HPC Simulation and Machine Learning: AI and machine learning are driven by large amounts of training data, which gets funneled down to a small amount of data useful for decision analysis. HPC simulation starts with a small amount of data as initial conditions and generates huge amounts of data suitable for AI training.
Cognitive simulation integrates machine learning with simulation at multiple levels:
• Learning in the simulation loop: molecular dynamics systems that decide when and where to switch resolution, giving 100X longer simulated time spans
• Learning on the simulation loop: simulation systems that can predict and correct mesh tangling, greatly reducing time spent on mesh tuning
• Steering the simulation loop: learned models integrating simulation and experiment, giving improved prediction, efficient UQ, and new designs
Scientific computing challenges differ from industry challenges:
• A different type of scaling
• Data augmentation techniques
• Image classification defined by fine features
• The laws of physics
The goal is to leverage industry advancements and specialize where needed. Validation needs to be a part of this!
Open Research Questions for Machine Learning in Science
• Physics constraints: results must honor conservation laws (e.g., energy, mass)
• Sparse data: it is sometimes intractable to generate sufficient training data
• Explainability: interpretability of results is necessary for predictive science
• Data collection: lots of data, with no standards for specification or sharing
These are all critical research areas to pursue if we are to demonstrate the value of AI techniques in the pursuit of predictive science.
Intelligence built into HPC simulations will change how we tackle problems that have overwhelming amounts of data:
• Exascale approach: higher-fidelity subgrid models run inline with multi-physics integrated codes. AI-based approach: AI mimics the high-fidelity models at a fraction of the computational cost. Expected benefit: applications run faster, so we can increase the number of simulations for UQ and validation.
• Exascale approach: analysis is performed by knowing where to look for interesting features in the data. AI-based approach: smart integrated AI identifies features of interest in throngs of data. Expected benefit: insights from exascale simulations aren't lost in a data deluge.
• Exascale approach: performance is optimized in applications to be best for the average case. AI-based approach: on-the-fly recompilation of code with optimal settings for the specific run. Expected benefit: more effective use of complex exascale architectures.
Intelligent simulation complements our exascale investments. Sierra will allow us to research and explore the possibilities.
Surrogate models to increase simulation coupling for multi-scale problems (see the sketch after this list):
§ Multi-scale problems are difficult to model: they are often solved using low-fidelity table lookups, and more accurate models are user-intensive and/or computationally expensive
§ Surrogate models can be trained on high-fidelity physics models, reducing expensive calculations to function calls
Scale hierarchy: ab-initio (inter-atomic forces, EOS, excited states) → atoms (defects and interfaces, nucleation) → long-time (defects and defect structures) → microstructure (meso-scale multi-phase, multi-grain evolution) → dislocation (meso-scale strength) → crystal (meso-scale material response) → continuum (macro-scale material response).
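A minimal sketch of the surrogate idea in Python, assuming scikit-learn is available: sample an expensive high-fidelity model offline, fit a cheap regressor, then call the regressor inline in place of the table lookup. The toy "EOS" below and its input ranges are illustrative placeholders, not an LLNL model:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def expensive_high_fidelity(rho, temp):
    # Stand-in for a calculation too costly to run inside the integrated code.
    return rho * temp + 0.1 * np.sin(rho) * np.sqrt(temp)

# Sample the high-fidelity model offline over the regime of interest.
rng = np.random.default_rng(1)
X = rng.uniform([0.1, 300.0], [10.0, 3000.0], size=(5000, 2))  # (density, temperature)
y = expensive_high_fidelity(X[:, 0], X[:, 1])

# Train the surrogate once, up front.
surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=1000).fit(X, y)

# Inside the multi-physics code, the expensive calculation collapses to a
# cheap function call on the trained model.
print(surrogate.predict(np.array([[3.0, 1200.0]])))
```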
The ATOM partnership is exploring new active-learning approaches to accelerate drug design.
• Approach: an open public-private partnership; lead with computation, supported and validated by targeted experiments; data-sharing to build models using everyone's data
• Partners: LLNL, GSK, NCI, UCSF
• Product: an open-source framework of tools and capabilities
• 25 FTEs in shared Mission Bay (SF) space; R&D started March 2018
ADAPD: Advanced Data Analytics for Proliferation Detection, driving innovation in low-profile nuclear proliferation detection.
• Focused data science R&D on indirect data, combining SME process and physics models with NNSA R&D testbeds
• New and responsive analytics capabilities at the labs: predictive models; multimodal detection; crossing facilities and scenarios
• Establishing the next generation of proliferation analytics expertise through DOE/NNSA partnerships and student and university programs
• Focus on hard problems while leveraging growing industry investments
LLNL's Data Science Institute (established 2017): Big Machines. Big Data. Big Ideas.
The Data Science Institute acts as the central hub for all data science activity at LLNL, working to help lead, build, and strengthen the data science workforce, research, and outreach to advance the state of the art of the nation's data science capabilities.
https://data-science.llnl.gov/
Strengthening the LLNL Data Science Workforce, fostering a sense of community:
§ Strengthen and sustain LLNL's data science workforce
§ Enhance internal coordination and communication of the data science workforce across disciplines and programs
§ Promote and expand the visibility of data science work at LLNL
§ Evolve a strategic data science vision and guide S&T investments
Activity areas: Data Science Council (inform S&T investments; communication); workforce development (reading groups, courses, summer program); web presence (external/internal website, mailing list, Slack); outreach (academic engagement, info sessions, recruiting); seminar series and workshops (invited speakers; a multi-day UC/LLNL workshop).
Data Science Summer Institute: The Focus is the Future.
§ A 12-week flexible internship focused on data science; 1000+ applicants in FY18
§ Students are funded 50% to work on solutions to challenge problems, attend courses and seminars, and strengthen their skills
§ 50 total students in FY17+FY18 (26 in FY18, 24 in FY17)
§ Visiting faculty: James Flegal (UC Riverside), Robert Gramacy (Virginia Tech)
§ Challenge problems: topology optimization, machine vision, multimodal data exploration, cyber
Conclusions
§ The 125 PF Sierra system is currently being stood up for use in the Stockpile Stewardship mission of the NNSA
§ Preparing for the pivot to heterogeneous computing was difficult, but is demonstrating significant benefits in computational results
§ The Open Science period for Sierra gave open applications significant allocations to demonstrate cutting-edge science results
§ LLNL is pursuing an application strategy that encompasses both traditional exascale computing needs and the burgeoning AI/ML trends
Disclaimer
This document was prepared as an account of work sponsored by an agency of the United States government. Neither the United
States government nor Lawrence Livermore National Security, LLC, nor any of their employees makes any warranty, expressed or
implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus,
product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific
commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or
imply its endorsement, recommendation, or favoring by the United States government or Lawrence Livermore National Security,
LLC. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States government
or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes.
Weitere ähnliche Inhalte

Was ist angesagt?

3D V-Cache
3D V-Cache 3D V-Cache
3D V-Cache AMD
 
Chapt 02 ia-32 processer architecture
Chapt 02   ia-32 processer architectureChapt 02   ia-32 processer architecture
Chapt 02 ia-32 processer architecturebushrakainat214
 
Cache Performance Evaluation under Multi-parameters Using SMPCache simulator
Cache Performance Evaluation under Multi-parameters Using SMPCache simulatorCache Performance Evaluation under Multi-parameters Using SMPCache simulator
Cache Performance Evaluation under Multi-parameters Using SMPCache simulatorالمهندسة عائشة بني صخر
 
Chapter 1 - An Introduction to Programming
Chapter 1 - An Introduction to ProgrammingChapter 1 - An Introduction to Programming
Chapter 1 - An Introduction to Programmingmshellman
 
I.t fundamentals
I.t fundamentalsI.t fundamentals
I.t fundamentalsgwandabwa
 
Introduction to Operating System (Important Notes)
Introduction to Operating System (Important Notes)Introduction to Operating System (Important Notes)
Introduction to Operating System (Important Notes)Gaurav Kakade
 
Operating system presentation
Operating system presentationOperating system presentation
Operating system presentationSonu Vishwakarma
 
Jedi course notes intro to programming 1
Jedi course notes intro to programming 1Jedi course notes intro to programming 1
Jedi course notes intro to programming 1aehj02
 
User account (Windows)
User account (Windows)User account (Windows)
User account (Windows)Dev Dorse
 
Introduction to High Performance Computing
Introduction to High Performance ComputingIntroduction to High Performance Computing
Introduction to High Performance ComputingUmarudin Zaenuri
 
2 ecdl modulo2-online-essential explorer+gmail
2 ecdl modulo2-online-essential explorer+gmail2 ecdl modulo2-online-essential explorer+gmail
2 ecdl modulo2-online-essential explorer+gmailPietro Latino
 
AMD Chiplet Architecture for High-Performance Server and Desktop Products
AMD Chiplet Architecture for High-Performance Server and Desktop ProductsAMD Chiplet Architecture for High-Performance Server and Desktop Products
AMD Chiplet Architecture for High-Performance Server and Desktop ProductsAMD
 
Open source technology
Open source technologyOpen source technology
Open source technologyaparnaz1
 
High Performance Computing
High Performance ComputingHigh Performance Computing
High Performance ComputingDell World
 

Was ist angesagt? (20)

Programming Language
Programming LanguageProgramming Language
Programming Language
 
Programming languages.pptx
Programming languages.pptxProgramming languages.pptx
Programming languages.pptx
 
3D V-Cache
3D V-Cache 3D V-Cache
3D V-Cache
 
Chapt 02 ia-32 processer architecture
Chapt 02   ia-32 processer architectureChapt 02   ia-32 processer architecture
Chapt 02 ia-32 processer architecture
 
Cache Performance Evaluation under Multi-parameters Using SMPCache simulator
Cache Performance Evaluation under Multi-parameters Using SMPCache simulatorCache Performance Evaluation under Multi-parameters Using SMPCache simulator
Cache Performance Evaluation under Multi-parameters Using SMPCache simulator
 
Chapter 1 - An Introduction to Programming
Chapter 1 - An Introduction to ProgrammingChapter 1 - An Introduction to Programming
Chapter 1 - An Introduction to Programming
 
8085 instruction-set part 1
8085 instruction-set part 18085 instruction-set part 1
8085 instruction-set part 1
 
Processor Basics
Processor BasicsProcessor Basics
Processor Basics
 
I.t fundamentals
I.t fundamentalsI.t fundamentals
I.t fundamentals
 
Introduction to Operating System (Important Notes)
Introduction to Operating System (Important Notes)Introduction to Operating System (Important Notes)
Introduction to Operating System (Important Notes)
 
Lecture 4 process cpu scheduling
Lecture 4   process cpu schedulingLecture 4   process cpu scheduling
Lecture 4 process cpu scheduling
 
Operating system presentation
Operating system presentationOperating system presentation
Operating system presentation
 
Jedi course notes intro to programming 1
Jedi course notes intro to programming 1Jedi course notes intro to programming 1
Jedi course notes intro to programming 1
 
User account (Windows)
User account (Windows)User account (Windows)
User account (Windows)
 
Introduction to High Performance Computing
Introduction to High Performance ComputingIntroduction to High Performance Computing
Introduction to High Performance Computing
 
OpenVINO introduction
OpenVINO introductionOpenVINO introduction
OpenVINO introduction
 
2 ecdl modulo2-online-essential explorer+gmail
2 ecdl modulo2-online-essential explorer+gmail2 ecdl modulo2-online-essential explorer+gmail
2 ecdl modulo2-online-essential explorer+gmail
 
AMD Chiplet Architecture for High-Performance Server and Desktop Products
AMD Chiplet Architecture for High-Performance Server and Desktop ProductsAMD Chiplet Architecture for High-Performance Server and Desktop Products
AMD Chiplet Architecture for High-Performance Server and Desktop Products
 
Open source technology
Open source technologyOpen source technology
Open source technology
 
High Performance Computing
High Performance ComputingHigh Performance Computing
High Performance Computing
 

Ähnlich wie Sierra Supercomputer: Science Unleashed

MVAPICH: How a Bunch of Buckeyes Crack Tough Nuts
MVAPICH: How a Bunch of Buckeyes Crack Tough NutsMVAPICH: How a Bunch of Buckeyes Crack Tough Nuts
MVAPICH: How a Bunch of Buckeyes Crack Tough Nutsinside-BigData.com
 
Hardware architecture of Summit Supercomputer
 Hardware architecture of Summit Supercomputer Hardware architecture of Summit Supercomputer
Hardware architecture of Summit SupercomputerVigneshwarRamaswamy
 
Secure lustre on openstack
Secure lustre on openstackSecure lustre on openstack
Secure lustre on openstackJames Beal
 
NAVGEM on the Cloud: Computational Evaluation of Cloud HPC with a Global Atmo...
NAVGEM on the Cloud: Computational Evaluation of Cloud HPC with a Global Atmo...NAVGEM on the Cloud: Computational Evaluation of Cloud HPC with a Global Atmo...
NAVGEM on the Cloud: Computational Evaluation of Cloud HPC with a Global Atmo...inside-BigData.com
 
Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...
Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...
Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...Larry Smarr
 
IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...
IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...
IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...Christian Esteve Rothenberg
 
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...Larry Smarr
 
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceinside-BigData.com
 
Accelerating TensorFlow with RDMA for high-performance deep learning
Accelerating TensorFlow with RDMA for high-performance deep learningAccelerating TensorFlow with RDMA for high-performance deep learning
Accelerating TensorFlow with RDMA for high-performance deep learningDataWorks Summit
 
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facilityinside-BigData.com
 
Ceph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph Community
 
OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...
OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...
OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...Ganesan Narayanasamy
 
OptIPuter Overview
OptIPuter OverviewOptIPuter Overview
OptIPuter OverviewLarry Smarr
 
High Performance Cyberinfrastructure Enables Data-Driven Science in the Globa...
High Performance Cyberinfrastructure Enables Data-Driven Science in the Globa...High Performance Cyberinfrastructure Enables Data-Driven Science in the Globa...
High Performance Cyberinfrastructure Enables Data-Driven Science in the Globa...Larry Smarr
 
PowerDRC/LVS 2.0.1 released by POLYTEDA
PowerDRC/LVS 2.0.1 released by POLYTEDAPowerDRC/LVS 2.0.1 released by POLYTEDA
PowerDRC/LVS 2.0.1 released by POLYTEDAAlexander Grudanov
 
Computing Outside The Box September 2009
Computing Outside The Box September 2009Computing Outside The Box September 2009
Computing Outside The Box September 2009Ian Foster
 

Ähnlich wie Sierra Supercomputer: Science Unleashed (20)

MVAPICH: How a Bunch of Buckeyes Crack Tough Nuts
MVAPICH: How a Bunch of Buckeyes Crack Tough NutsMVAPICH: How a Bunch of Buckeyes Crack Tough Nuts
MVAPICH: How a Bunch of Buckeyes Crack Tough Nuts
 
Sierra overview
Sierra overviewSierra overview
Sierra overview
 
Hardware architecture of Summit Supercomputer
 Hardware architecture of Summit Supercomputer Hardware architecture of Summit Supercomputer
Hardware architecture of Summit Supercomputer
 
Secure lustre on openstack
Secure lustre on openstackSecure lustre on openstack
Secure lustre on openstack
 
NAVGEM on the Cloud: Computational Evaluation of Cloud HPC with a Global Atmo...
NAVGEM on the Cloud: Computational Evaluation of Cloud HPC with a Global Atmo...NAVGEM on the Cloud: Computational Evaluation of Cloud HPC with a Global Atmo...
NAVGEM on the Cloud: Computational Evaluation of Cloud HPC with a Global Atmo...
 
Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...
Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...
Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...
 
IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...
IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...
IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...
 
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...
 
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental science
 
TransPAC3/ACE Measurement & PerfSONAR Update
TransPAC3/ACE Measurement & PerfSONAR UpdateTransPAC3/ACE Measurement & PerfSONAR Update
TransPAC3/ACE Measurement & PerfSONAR Update
 
Accelerating TensorFlow with RDMA for high-performance deep learning
Accelerating TensorFlow with RDMA for high-performance deep learningAccelerating TensorFlow with RDMA for high-performance deep learning
Accelerating TensorFlow with RDMA for high-performance deep learning
 
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
 
PowerDRC/LVS 2.0 Overview
PowerDRC/LVS 2.0 OverviewPowerDRC/LVS 2.0 Overview
PowerDRC/LVS 2.0 Overview
 
Ceph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-Gene
 
OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...
OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...
OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...
 
NSCC Training Introductory Class
NSCC Training Introductory Class NSCC Training Introductory Class
NSCC Training Introductory Class
 
OptIPuter Overview
OptIPuter OverviewOptIPuter Overview
OptIPuter Overview
 
High Performance Cyberinfrastructure Enables Data-Driven Science in the Globa...
Sierra Supercomputer: Science Unleashed

  • 10. High-performance computing (HPC) is central to sustaining the nuclear stockpile in the absence of additional nuclear tests.
  • 11. What drives super-exascale requirements is 3D, high-resolution, enhanced-physics calculations done in an uncertainty quantification regime. Three axes of progress:
  - Advanced physics and algorithms (new codes?)
  - Higher fidelity and resolution in 3D (new platforms, scalable codes)
  - Predictive simulation through quantification of uncertainties (1000's of managed simulations and data analytics)
  Our mission demands require progress along all three axes to advance our predictive capabilities.
  • 12. LLNL has a long history of fielding Top500 systems. [Chart: normalized HPL performance of deployed systems over the first 50 Top500 lists (25 years); courtesy of Erich Strohmaier (LBL), SC'17 invited talk: https://www.top500.org/static/media/uploads/presentations/top500_ppt_invited_201711.pdf] What is not captured here is the derivative: like a "moving average", this will change over the next decade as non-US/DOE sites dominate the top 3 positions.
  • 14. Sierra is a 125-petaflop (peak) system at the Department of Energy's Lawrence Livermore National Laboratory, supporting the national security mission and advancing science in the public interest.
  • 15. System Overview. System performance: peak of 125 petaflops for modeling and simulation.
  Each node has:
  - 2 IBM POWER9 processors
  - 4 NVIDIA Tesla V100 GPUs
  - 320 GiB of fast memory (256 GiB DDR4 + 64 GiB HBM2)
  - 1.6 TB of NVMe memory
  The system includes:
  - 4,320 nodes
  - 2:1 tapered Mellanox EDR InfiniBand tree topology (50% global bandwidth) with a dual-port HCA per node
  - 154 PB IBM Spectrum Scale file system with 1.54 TB/s R/W bandwidth
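  As a quick cross-check of these counts (our arithmetic, not stated on the slide), the node and per-node GPU figures reproduce the "more than 17,000 GPUs" cited on the next slide:

  $$4{,}320 \text{ nodes} \times 4 \text{ GPUs/node} = 17{,}280 \text{ GPUs}$$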
  • 16. What makes Sierra so effective?
  - GPUs: Sierra links more than 17,000 deep-learning-optimized NVIDIA GPUs with the potential to deliver exascale-level performance (a billion-billion calculations per second) for AI applications.
  - High-speed data movement: the high-speed Mellanox interconnect and the NVLink high-bandwidth technology built into all of Sierra's processors supply the next-generation information superhighways.
  - Memory: Sierra's sizable memory gives researchers a convenient launching point for data-intensive tasks, allowing greatly improved application performance and algorithmic accuracy as well as AI training.
  - CPUs: IBM Power9 processors rapidly execute serial code, run storage and I/O services, and manage data so the compute is done in the right place.
  • 17. IBM Power9 Processor
  - Up to 24 cores; Sierra's P9s have 22 cores for yield optimization on first processors
  - PCI-Express 4.0: twice as fast as PCIe 3.0
  - NVLink 2.0: coherent, high-bandwidth links to GPUs
  - 14nm FinFET SOI technology; 8 billion transistors
  - Cache: L1I 32 KiB per core (8-way set associative); L1D 32 KiB per core (8-way); L2 256 KiB per core; L3 120 MiB eDRAM (20-way)
  • 18. NVIDIA Volta Details
  Performance with NVIDIA GPU Boost:
  - Tesla V100 for NVLink: double precision 7.8 TFLOPS; single precision 15.7 TFLOPS; deep learning 125 TFLOPS
  - Tesla V100 for PCIe: double precision 7 TFLOPS; single precision 14 TFLOPS; deep learning 112 TFLOPS
  Interconnect bandwidth (bi-directional): NVLink 300 GB/s; PCIe 32 GB/s
  Memory (CoWoS stacked HBM2): capacity 16 GB; bandwidth 900 GB/s
  • 19. Lassen is an unclassified system similar to Sierra, but smaller in size: its peak performance is 20.9 petaflops vs. Sierra's 125 petaflops. Multiple programs and institutional users will use Lassen for leading-edge science, such as predictive biology, computer-aided drug discovery, or exploring novel materials, on a world-class system.
  Lassen system details (subject to change on final install):
  - Zone: CZ (unclassified)
  - Nodes: 720; POWER9 processors per node: 2; GV100 (Volta) GPUs per node: 4
  - Node peak: 29.1 TFLOP/s; system peak: 20.9 PFLOP/s
  - Node memory: 320 GiB; system memory: 0.225 PiB
  - Interconnect: 2x IB EDR; off-node aggregate bandwidth: 45.5 GB/s
  - Racks: 40 compute + 4 network/infrastructure + 4 storage = 48 total
  - Peak power: ~1.8 MW
  • 21. The Need to Pivot Toward Exascale
  • 22. Advancements in HPC have happened over three major epochs, and we're settling on a fourth (heterogeneity). Each of these eras defines a new common programming model, and transitions are taking longer; DOE/NNSA is entering a fourth era. [Chart: peak flops of systems from 1952 through ~2025, kiloscale to exascale, spanning the Serial Era, Vector Era, Distributed Memory Era, and Many Core Era, with systems such as ASCI Red, Blue Pacific, White, Red Storm, Purple, BlueGene/L, Roadrunner, Cielo, Sequoia, Trinity, Sierra, Crossroads, ATS-4, and ATS-5; code transition periods marked between eras.] Transitions are essential for progress: between 1952 and 2012 (Sequoia), NNSA saw a factor of ~1 trillion improvement in peak speed. After a decade or so of uncertainty, history will most likely refer to this next era as the heterogeneous era.
  • 23. The drive for more capable high-end computing is driving disruption in HPC application development (assuming continued increases in computational power are desired):
  - Power: unreasonable operating costs; with today's tech, $100M/year, exceeding capital investments
  - Chip/processor efficiency: GPUs/accelerators; simpler cores; unreliability at near-threshold voltages, compounded by the scale of systems
  - On-node "inscaling" challenges: complex hardware; massive on-node concurrency requirements; new programming and memory models
  - Disruptive changes: code evolution; algorithm rewrites; platform uncertainty
  The trickle-down: external constraints shape hardware designs, software supports, and applications react.
  Meanwhile, "Big Data" and AI are driving industry trends and investments (social networking, machine/deep learning, cloud computing, ...). Software innovation in data-centric HPC is thriving; traditional HPC faces strong headwinds if it can't adapt.
  • 24. Processor trends tell the parallelism story. [Chart annotations on the standard processor-trend curves:] transistor counts (Moore's Law): alive and well! Single-thread performance: barely hanging on (dynamic clock management, IPC tricks). Clock frequency: flat (or even slightly down). Power: mostly flat. Core counts: exponential growth?
  • 25. The LLNL/ASC procurement strategy relies on long-lived partnerships with HPC vendors. [Timeline: from concept through machine delivery, ~5 years of effective use, and retirement after 9-10 years, with the next procurement cycle beginning mid-life. Milestones: pre-competitive discussions with potential bidders; request for proposals with target requirements; contract selection; non-recurring engineering contracts (make the system better, not bigger); establishing representative evaluation benchmarks; early delivery system(s); nailing down target requirements as final deliverables; establishing a Center of Excellence for application development; machine acceptance.]
  - Long lead times in procurements allow vendor roadmaps to be clear enough to bid, but still fungible
  - Use of target requirements reduces risk to the vendor, and thus cost to the Program
  - Testbed architectures make up the unofficial 3rd tier of the ASC platform strategy: evaluation of competing and potential future technologies at smaller scale
  • 26. Predicting the life-cycle behavior of nuclear warheads requires simulation at a wide range of spatial and temporal scales. [Scale axis: distances from 10^-10 m to 1 m, spanning nuclear physics, atomic physics, molecular-level models, turbulence, continuum models, and the full system.]
  • 27. Large, integrated multi-physics codes provide the basic simulation capabilities for a broad range of application domains: HE cookoff, Navy railguns, fracture and failure, blast on structures, inertial confinement fusion, and additive manufacturing.
  - Long-lifetime projects: 15+ years of development by large teams (10-20+ people, ~50% computer scientists and ~50% physicists)
  - Algorithms tuned for minimal turnaround time based on today's hardware!
  - Source code size over time reaches ~900 kloc, with an average of 60 kloc/year of additional code to support in a representative IC
  • 28. Sierra represents a major shift for our mission-critical applications on advanced architectures.
  - 2004-2014: IBM BlueGene (BG/L, BG/P, BG/Q), along with copious amounts of Linux-based cluster technology, dominated the HPC landscape at LLNL. Apps scaled to an unprecedented 1.6M MPI ranks, but were still largely MPI-everywhere on "lean" cores.
  - 2014: heterogeneous GPGPU-based computing began to look attractive with NVLink and Unified Memory (P9/Volta). Apps had to undertake a 180-degree turn from BlueGene toward heterogeneity and large-memory nodes.
  - 2015-20??: heterogeneous computing is here to stay, and Sierra gives us a solid leg up in preparing for exascale and beyond.
  • 29. Application enablement through the Center of Excellence, the original (and enduring) concept: vendor application/platform expertise paired with our staff (code developers) and our applications and key libraries.
  - Mechanisms: hands-on collaborations; on-site vendor expertise; customized training; early-delivery platform access; application bottleneck analysis; both portable and vendor-specific solutions; a steering committee
  - Technical areas: algorithms, compilers, programming models, resilience, tools, ...
  - Outcomes: codes tuned for next-gen platforms; rapid feedback to vendors on early issues; publications and early science; improved software tools and environment; performance portability
  The LLNL/Sierra CoE is jointly funded by the ASC Program and the LLNL institution.
  • 30. High-level goals of the Center of Excellence:
  - Train ASC and LLNL institutional application teams in the art and science of developing high-performance software on the Sierra architecture
  - Have our flagship applications utilizing advanced features of Sierra as soon as the machine is generally available, and fully optimized (4-6x improvement over Sequoia) within 1-2 years of deployment
  - Develop performance-portable solutions that let application teams maintain a single source code that is effective on platforms deployed at our sister laboratories
  - Build strong ties with our vendor partners as a key element of our co-design strategy: leveraging their deep knowledge of the hardware and of algorithmic approaches specific to their platform, and informing them of our long-term application requirements as they will impact future machine designs
  Specific strategies for each of these goals were captured in an "Execution Strategy" document and used to guide our implementation.
  • 31. High-resolution ICF simulations help us understand yield-degradation mechanisms. Basics of inertial confinement fusion: lasers heat the inside surface of the hohlraum; radiation drives the capsule inward; DT ignites, producing fusion energy.
  • 32. Taking inertial confinement fusion (ICF) simulations to a higher level (Chris Clouse, LLNL). Until recently, 3D simulations were often run as a double-check on routinely employed 2D approximations. These 3D simulations were computationally expensive and, as a result, were run as sparingly as possible. Design and analysis in LLNL's National Ignition Facility (NIF) Program has until now relied primarily on 2D approximations because 3D simulations could not be turned around quickly enough to be a useful routine design tool. Sierra's architecture, which is expected to bring speed-ups on the order of 10x for many of our 3D applications, will be able to process these crucial simulations efficiently, changing the way users compute by making 3D routine. "Sierra provided an unprecedented (98 billion cell) simulation of two-fluid mixing in a spherical geometry. Understanding hydrodynamic instability and the transition-to-turbulence process is important in inertial confinement fusion and High Energy Density (HED) Physics." —Chris Clouse
  • 33. Developing the next generation of algorithms and codes (Rob Rieben, LLNL). To maximize the full potential of new computer architectures like Sierra's, we are developing new algorithms that play to the strengths of the machine, specifically the high intensity of compute operations available to data that remain local. We are also developing next-generation simulation codes for inertial confinement fusion and nuclear weapons analysis that employ high-order, compute-intensive algorithms maximizing the amount of computing done for each piece of data retrieved from memory. These schemes are very robust and should significantly improve the overall analysis workflow for users. These advanced simulation tools, enabled by Sierra, will improve throughput along two axes: faster turnaround and less user intervention. "The combination of our next generation, compute intensive, high-order algorithms with the processing power of Sierra will deliver an exciting new era in multi-physics simulations." —Rob Rieben
  • 34. Early Science Applications. Early science: the time between when the system is first booted up (~3/1/18) and when it is transitioned to the classified network for programmatic use (~1/20/19), when friendly users help "shake down" the system. Characteristics: a buggy or incomplete software stack, random bad hardware, reboots on a whim, nursing problems along through the night, etc. Benefits: when it's up, you can sometimes get the whole system to yourself for as long as you can keep things running; invaluable feedback to the facility and vendors; and a one-time shot for open-science apps.
  • 35. Sierra advances the resolution of earthquake simulations (Arthur Rodgers, LLNL). The character of strong earthquake shaking near a fault is highly variable and poorly constrained by empirical data. Supercomputers enable simulation of earthquake motions to investigate the hazard and risk to buildings and infrastructure before damaging events occur. The large scale (~100 km) and fine detail (~10 m) of high-frequency waves (>5 Hz) from damaging earthquakes require today's most powerful computers. Using SW4-RAJA, a 3D seismic simulation code ported to GPU hardware, LLNL researchers are able to increase the resolution of earthquake simulations to span most frequencies of engineering interest on regional domains, with rapid throughput enabling sampling of various rupture scenarios and sub-surface models. "Sierra's thousands of GPU-accelerated nodes allow SW4-RAJA to compute earthquake ground motions with hundreds of billions of grid points in shorter run times, so we can resolve high-frequency waves and investigate different rupture scenarios or earth models." —Arthur Rodgers
  • 36. Modeling a 7.0 quake on the Hayward Fault (SF Bay Area). Courtesy: Artie Rodgers, LLNL.
  • 37. Computational domain and rupture:
  - Domain: 120 km x 80 km x 35 km, centered on San Leandro; hypocenter (green star)
  - Rupture: MW 7.0; dimensions 77 x 13 km; draped onto the 3D geometry of the Hayward Fault
  • 38. National lab HPC resources we run on: Cori-II at LBNL/NERSC and Sierra at LC/LLNL.
  - Cori Phase-II at LBNL/NERSC: #10 on the Top500; 27 petaflops (peak); one 68-core Intel Xeon Phi CPU per node
  - Sierra at LLNL: #3 on the Top500; 119 petaflops (peak); 2 IBM POWER CPUs and 4 NVIDIA GPUs per node
  The SW4 port to GPU platforms was supported by a 3-year LLNL internal project. Future platforms will rely heavily on GPUs.
  • 39. Summer 2018 Hayward Fault MW 7.0 runs: SW4 (CPU) on Cori-II vs. SW4-RAJA (GPU) on Sierra.
  - Identical setup on both: fmax = 5.0 Hz; vSmin = 500 m/s; PPW = 8; T = 90 s; hmin = 12.5 m; Npoints = 25.8 billion; Ntime-steps = 63,063
  - Checkpointing: every 4,000 time steps on Cori-II; none on Sierra
  - Resources: 8,192 nodes (85% of Cori-II; 524,288 CPU cores) vs. 256 nodes (6% of Sierra; 1,024 GPUs)
  - Solver time: 10.3 hours on both
  • 40. Verification of SW4 and SW4-RAJA: peak ground velocity (PGV) maps. HF MW 7.0, 3DTOPO, fmax = 5 Hz, hmin = 12.5 m. Two codes, different platforms; the PGV maps agree to machine precision: Cori-II PGV = 2.1797349453 m/s; Sierra PGV = 2.1797349453 m/s.
  • 41. Verification of SW4 and SW4-RAJA: waveforms agree to single precision (SAC files). SW4 on Cori-II at 5 Hz overplotted with SW4-RAJA on Sierra at 5 Hz; the differences are ~1e-7. Two codes, different platforms; the waveforms agree to single precision.
  • 42. Comparison of near-fault accelerations for a range of resolutions (fmax = maximum frequency); Sierra run, Aug. 2018, fmax = 10.0 Hz (Livermore, 37.6 km from the fault). As frequency content increases, amplitudes increase and are more consistent with observations. Doubling fmax requires 16x more computational effort, as the arithmetic below shows.
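  The 16x figure follows from standard grid and time-step scaling (a sketch of the arithmetic, assuming a fixed points-per-wavelength criterion and a CFL-limited explicit time step): doubling fmax halves the required grid spacing h, giving 8x more grid points in 3D and 2x more time steps,

  $$\text{cost} \propto N_{\text{points}} \times N_{\text{steps}} \propto \left(\tfrac{1}{h}\right)^{3} \times \tfrac{1}{h} = \left(\tfrac{1}{h}\right)^{4}, \qquad h \to \tfrac{h}{2} \;\Rightarrow\; \text{cost} \times 2^{4} = 16\times$$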
  • 43. Ground-motion intensity measures (GMIMs) versus distance, compared with the ASK (2014) ground-motion model (GMM): HF MW 7.0, 10.0 Hz, PGV.
  • 44. Perspective on previous regional-scale Hayward Fault runs on various machines.
  • 45. Joint Design of Advanced Computing Solutions for Cancer (JDACS4C): a collaboration between DOE and NCI. Courtesy: Fred Streitz, Jim Glosli, LLNL.
  - Pilot 1: pre-clinical model development (Yvonne Evrard, FNLCR; Rick Stevens, ANL)
  - Pilot 2: RAS biology in membranes (Dwight Nissley, FNLCR; Fred Streitz, LLNL)
  - Pilot 3: precision oncology surveillance (Lynn Penberthy, NCI; Gina Tourassi, ORNL)
  • 46. Oncogenic RAS is responsible for many human cancers: 93% of all pancreatic, 42% of all colorectal, and 33% of all lung cancers; ~1 million deaths/year worldwide; no effective inhibitors. The pathway transmits signals; RAS is a switch, and oncogenic RAS is "on". RAS localizes to the plasma membrane and binds effectors (RAF) to activate growth. (Simanshu, Cell 170, 2017)
  • 47. Essential strategy: utilize the appropriate-scale methodology for each component. A problem of length and time scale: the membrane evolves on a ms time frame across µm, while protein interactions involve µs across nm.
  • 48. Essential strategy: utilize the appropriate-scale methodology for each component. Model the membrane with RAS at the micron (continuum) scale; model protein behavior at the molecular scale.
  • 49. From continuum to molecular modeling. Step 1: identify regions of interest using AI techniques (a minimal sketch follows below).
  - Use a variational autoencoder to learn a reduced-order model (ROM) of the system
  - Map every region of the simulation into this ROM
  - Rank every patch in the simulation by its distance from the others ("uniqueness") in the ROM
  - Identify the most unique patches as the most interesting
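  To make the idea concrete, here is a minimal sketch of the VAE-plus-uniqueness-ranking step in PyTorch. Every name, layer size, and the flattened-patch representation are illustrative assumptions, not the production workflow:

```python
# Hypothetical sketch: encode membrane patches with a small VAE, then rank
# patches by how far their latent codes sit from all the others.
import torch
import torch.nn as nn

class PatchVAE(nn.Module):
    def __init__(self, n_in=1024, n_latent=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_in, 128), nn.ReLU())
        self.mu = nn.Linear(128, n_latent)       # latent mean
        self.logvar = nn.Linear(128, n_latent)   # latent log-variance
        self.dec = nn.Sequential(nn.Linear(n_latent, 128), nn.ReLU(),
                                 nn.Linear(128, n_in))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.dec(z), mu, logvar

def rank_unique_patches(model, patches, k=10):
    """Return indices of the k most 'unique' patches in latent space."""
    with torch.no_grad():
        mu = model.mu(model.enc(patches))     # (N, n_latent) latent means
    dist = torch.cdist(mu, mu).mean(dim=1)    # mean distance to all others
    return torch.topk(dist, k).indices        # farthest-out = most unique

# Usage (illustrative): patches = torch.randn(5000, 1024) stands in for
# flattened continuum-simulation patches after VAE training.
```

  The ranking step treats "uniqueness" as mean pairwise latent distance; other distance or density measures would serve the same role.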
  • 50. From continuum to molecular modeling. Step 2: generate particle positions.
  - Lipid positions are randomly generated consistent with the composition
  - RAS proteins, if present, are inserted with the appropriate conformation
  - The system is equilibrated to "smooth edges"
  • 51. From continuum to molecular modeling. Step 3: simulate at molecular resolution.
  • 52. From continuum to molecular modeling. Step 4: feed molecular information back to the continuum model.
  • 53. Preliminary results:
  - 50 nm x 50 nm high-resolution study with 2 RAS proteins (40 μs)
  - Investigates phenomena witnessed originally in a μm x μm scale simulation
  - Aggregation/repulsion of charged lipids (PAP6, PAPS, DIPE, CHOL) following a "collision" of RAS
  - Results demonstrate the importance of time and length scale for simulation
  • 54. The Neutron Lifetime on Sierra Early Science. Pavlos Vranas (LLNL), André Walker-Loud (LBNL), et al. (CalLat Collaboration). Courtesy: Pavlos Vranas, LLNL.
  - The lifetime of a neutron is ~15 min (~881 s), and its value has a profound effect on the mass composition of the universe.
  - In the known theory of physics (QCD), neutrons decay to protons. Beam experiments count how many protons emerge from a beam of neutrons (τ_beam = 887.3 ± 3.1 s). Bottle experiments trap neutrons in a "bottle" and measure how many are left after a period of time (τ_bottle = 879.4 ± 0.6 s). If neutrons also decay to other, unknown particles (dark matter?), the results will not agree with QCD.
  - These two methods give lifetime measurements with a 99% probability of being incompatible. Why? Is there an unaccounted-for systematic? Or, more exciting, is there new physics causing them to disagree? Which method is consistent with the prediction from the Standard Model? Answering this question requires access to leadership-class supercomputers.
  - Using Sierra, the team simulated the fundamental theory of quantum chromodynamics (QCD) and calculated the lifetime to unprecedented theoretical accuracy. This preliminary result has an uncertainty 4 times smaller than the next-best result in the field (equivalent to a factor of 16 times more data).
  - This early-science time allowed us to form a concrete plan to further reduce the uncertainty and achieve a discriminating level of precision in the theoretical prediction.
  • 55. The Neutron Lifetime on Sierra Early Science (continued). Pavlos Vranas (LLNL), André Walker-Loud (LBNL), et al. (CalLat Collaboration). [Plot: the nucleon axial coupling g_A versus ε_π = m_π/(4πF_π) at lattice spacings a ≈ 0.09, 0.12, and 0.15 fm, with the model-average continuum extrapolation g_A^LQCD(ε_π, a = 0) and the PDG value g_A = 1.2723(23); published points from Nature 558 (2018) no. 7708, 91-94, alongside preliminary Sierra Early Science points.]
  - The vertical gray band denotes the physical value of the pion mass; these points are significantly more expensive to compute than the rest, but the most valuable for the final predictions
  - The green point in our publication cost as much computing time as all the other points combined; the green point from Sierra has 10x more statistics than our publication
  - The red point from our publication was not useful; the red point from Sierra came from an entirely new calculation and is now very useful
  - The blue point from Sierra was entirely unattainable on previous computers (it still needs more statistics to be useful)
  • 56. Stanford PSAAP Center results: extreme-scale HPC with the Soleil-X multiphysics solver. M. Papadakis, L. Jofre, G. Iaccarino & A. Aiken (Stanford).
  - Research objectives: performance and scalability at extreme scales; demonstrate the potential of Soleil-X
  - Physics modeling: compressible flow formulation; Lagrangian particle tracking; algebraic & DOM radiation models
  - Computational approach: written in Regent; targets the Legion runtime; GASNet library for communication; Realm for OpenMP; CUDA toolkit for GPUs
  - Performance and scalability analysis: strong and weak scaling; cost distribution across the different physics; time-to-solution; task-mapping (CPU/GPU) studies. [Plot: weak-scaling efficiency of the total solver from 1 to 2,048 nodes, with a runtime bottleneck at large node counts under investigation.]
  - Conclusions and ongoing work: at present, Soleil-X is (on average) 1.7 times faster than Soleil-MPI in time-to-solution; Soleil-MPI shows very good weak-scaling efficiency on Titan up to 256 nodes; Soleil-X shows very good weak-scaling efficiency on both Titan and Sierra up to 256 nodes; the runtime bottleneck observed at 512 nodes is under ongoing investigation
  • 57. We ran a massive turbulent fluid mixing simulation on Sierra in October 2018 (in-situ visualization of the mixing layer; courtesy: Cyrus Harrison, LLNL).
  - A high-resolution simulation of two-fluid mixing in a spherical geometry was run to gain insight into the growth of a Rayleigh-Taylor instability.
  - High-resolution simulations of instability growth are not practical for routine use, so runs like this help guide the development of sub-grid models that capture instability effects at much lower computational cost, which are used in Inertial Confinement Fusion (ICF) calculations.
  • 58. This simulation was the first to utilize over 16,000 GPUs on Sierra. Highlights:
  - The 97.8-billion-element simulation ran across 16,384 GPUs on 4,096 Sierra compute nodes
  - The simulation application used CUDA via RAJA to run on the GPUs
  - The time-varying evolution of the mixing was visualized in situ using Ascent, also leveraging 16,384 GPUs
  - Ascent leveraged VTK-m to run visualization algorithms on the GPUs
  - The last time step was exported to the parallel file system for detailed post-hoc visualization using VisIt
  This successful simulation is the result of many years of preparation!
  • 59. Porting has unlocked impressive performance on Sierra and positions us well for future architectures.
  - Running physics algorithms on the GPUs allows us to scale and turn around massive simulations in a power-efficient way
  - For this simulation, the physics calculations alone (excluding setup, I/O, etc.) took 60 hours of compute time on Sierra's V100 GPUs
  - We estimate the same problem would take 30 days of physics compute time using the entire Sequoia system
  Compared node to node with other HPC architectures, we expect Sierra to provide 6x to 12x faster simulation turnaround.
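  For scale, the whole-machine Sequoia comparison above works out to roughly a 12x speedup (simple arithmetic on the quoted figures, not a node-to-node measurement):

  $$\frac{30 \text{ days} \times 24 \text{ h/day}}{60 \text{ h}} = \frac{720}{60} = 12\times$$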
  • 60. Similarly, DOE's visualization community has invested over many years to create tools for in-situ and GPU-enabled visualization. [Timeline, 2010-2018: data-parallel visualization research (EAVL, DAX, PISTON) converging into VTK-m with distributed-memory support; Strawman (ISAV15) and SC16 performance modeling with Strawman evolving into Ascent (ISAV17); the ASCAC subcommittee on exascale computing; ECP begins; culminating in in-situ visualization of the 97.8-billion-element simulation on Sierra using Ascent.]
  • 61. On Sierra, computation far outpaces our ability to save data to the parallel file system for detailed post-hoc visualization. Knowing this, we used a mix of in-situ (in-memory) and post-hoc (file-based) visualization:
  - In situ: we used Ascent (https://github.com/alpine-dav/ascent) to visualize the time-varying evolution of the mixing layer and to export the final mesh data to compressed HDF5 files for post-hoc exploration. Ascent used VTK-m (http://m.vtk.org/) to run visualization algorithms on the GPUs.
  - Post hoc: we used VisIt (https://visit.llnl.gov) to explore details of the final simulation state.
  In-situ visualization of the time-varying details of the simulation avoided terabytes of I/O versus post-hoc visualization. (A minimal sketch of the in-situ pattern follows below.)
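  For readers unfamiliar with Ascent, the in-situ pattern looks roughly like the following minimal Python sketch, based on Ascent's documented actions API. The example mesh and field name are illustrative stand-ins, not the production simulation's data:

```python
# Minimal in-situ rendering sketch with Ascent (illustrative only).
import conduit
import conduit.blueprint
import ascent

# Stand-in for the simulation's mesh: a small Blueprint example mesh
# with a scalar field named "braid".
mesh = conduit.Node()
conduit.blueprint.mesh.examples.braid("hexs", 25, 25, 25, mesh)

a = ascent.Ascent()
a.open()
a.publish(mesh)  # hand Ascent a zero-copy view of the mesh

# Describe what to render: one pseudocolor plot, written to an image file.
actions = conduit.Node()
add_scenes = actions.append()
add_scenes["action"] = "add_scenes"
add_scenes["scenes/s1/plots/p1/type"] = "pseudocolor"
add_scenes["scenes/s1/plots/p1/field"] = "braid"
add_scenes["scenes/s1/image_prefix"] = "mixing_layer"

a.execute(actions)  # renders in situ; no mesh I/O to the file system
a.close()
```

  In a real run, publish/execute would be called every N time steps from inside the simulation loop, which is how the time-varying mixing evolution was captured without writing the mesh to disk.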
  • 62. We ran a massive turbulent fluid mixing simulation on Sierra in October 2018 (post-hoc visualization of the mixing layer).
  • 63. Conclusion: this successful simulation is the result of many years of preparation in both our simulation and visualization tools. Highlights:
  - The 97.8-billion-element simulation ran across 16,384 GPUs on 4,096 Sierra compute nodes
  - The simulation application used CUDA via RAJA to run on the GPUs
  - The time-varying evolution of the mixing was visualized in situ using Ascent, also leveraging 16,384 GPUs
  - Ascent leveraged VTK-m to run visualization algorithms on the GPUs
  - The last time step was exported to the parallel file system for detailed post-hoc visualization using VisIt
  Links:
  - LLNL's Sierra supercomputer: https://computation.llnl.gov/computers/sierra
  - RAJA: https://github.com/llnl/raja
  - Ascent: https://github.com/alpine-dav/ascent
  - VTK-m: http://m.vtk.org/
  - VisIt: https://visit.llnl.gov
  - Conduit: https://github.com/llnl/conduit
  • 64. Future simulations for complex systems will be guided by machine learning. Cognitive simulation learns an approximate model of the simulation state dynamics and output as it runs: a simulation ensemble pipeline feeds on-line machine learning, which combines experimental data with model predictions and uncertainties to suggest the next simulation, and the next experiment, for a complex design model. We're developing an integrated roadmap for NNSA and LLNL in cognitive simulation.
  • 65. The virtuous circle of HPC simulation and machine learning: HPC simulation starts with a small amount of data as initial conditions and generates huge amounts of data suitable for AI training; AI and machine learning are driven by large amounts of training data, which gets funneled down to a small amount of data useful for decision analysis.
  • 66. Cognitive simulation integrates machine learning with simulation at multiple levels:
  - Learning in the simulation loop: molecular dynamics systems that decide when and where to switch resolution (100x longer simulated time spans)
  - Learning on the simulation loop: simulation systems that can predict and correct mesh tangling (greatly reduced time on mesh tuning)
  - Steering the simulation loop: learned models integrating simulation and experiment (improved prediction, efficient UQ, new designs)
  • 67. Scientific computing challenges differ from industry challenges: a different type of scaling, different data augmentation techniques, image classification defined by fine features, and the laws of physics. The goal is to leverage industry advancements and specialize where needed. Validation needs to be a part of this!
  • 68. Open research questions for machine learning in science:
  - Physics constraints: results must honor conservation laws (e.g., energy, mass)
  - Sparse data: it is sometimes intractable to generate sufficient training data
  - Explainability: interpretability of results is necessary for predictive science
  - Data collection: lots of data with no standards for specification or sharing
  These are all critical research areas to pursue if we are to demonstrate the value of AI techniques in the pursuit of predictive science. (A sketch of the first item appears below.)
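  As a concrete illustration of the physics-constraints item, one common approach (a hedged sketch, not a specific LLNL method) is to add a conservation-violation penalty to a learned model's training loss:

```python
# Hypothetical sketch: soft mass-conservation penalty on a surrogate's loss.
import torch

def physics_constrained_loss(pred, target, cell_volume, lam=10.0):
    """MSE plus a penalty on violating total-mass conservation.

    pred, target: (batch, n_cells) predicted / reference density fields.
    cell_volume:  per-cell volumes (scalar or (n_cells,) tensor).
    lam:          penalty weight; illustrative choice.
    """
    data_term = torch.mean((pred - target) ** 2)
    # Total mass of the prediction vs. the reference field.
    mass_pred = (pred * cell_volume).sum(dim=1)
    mass_ref = (target * cell_volume).sum(dim=1)
    constraint = torch.mean((mass_pred - mass_ref) ** 2)
    return data_term + lam * constraint
```

  Harder alternatives exist (e.g., architectures that conserve quantities exactly by construction); the soft-penalty form above is simply the easiest to bolt onto an existing training loop.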
  • 69. Intelligence built into HPC simulations will change how we tackle problems with overwhelming amounts of data.
  - Exascale approach: higher-fidelity subgrid models run inline with multi-physics integrated codes. AI-based approach: AI mimics high-fidelity models at a fraction of the computational cost. Expected benefit: applications run faster, so we can increase the number of simulations for UQ and validation.
  - Exascale approach: analysis is performed by knowing where to look for interesting features in the data. AI-based approach: smart integrated AI identifies features of interest in throngs of data. Expected benefit: insights from exascale simulations aren't lost in a data deluge.
  - Exascale approach: performance is optimized in applications to be best for the average case. AI-based approach: on-the-fly recompilation of code with optimal settings for a specific run. Expected benefit: more effective use of complex exascale architectures.
  Intelligent simulation complements our exascale investments; Sierra will allow us to research and explore the possibilities.
  • 70. Surrogate models to increase simulation coupling for multi-scale problems (a minimal sketch follows below).
  - Multi-scale problems are difficult to model: they are often solved using low-fidelity table lookups, and more accurate models are user-intensive and/or computationally expensive
  - Surrogate models can be trained on high-fidelity physics models, so expensive calculations may be reduced to function calls
  [Scale ladder: ab initio (inter-atomic forces, EOS, excited states) -> atoms (defects and interfaces, nucleation) -> long-time (defects and defect structures) -> microstructure (meso-scale multi-phase, multi-grain evolution) -> dislocation (meso-scale strength) -> crystal (meso-scale material response) -> continuum (macro-scale material response)]
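  Here is a minimal sketch of the "expensive calculations reduced to function calls" idea. The EOS example, network shape, and training loop are illustrative assumptions, not a production model:

```python
# Hypothetical sketch: fit a small network to high-fidelity EOS samples,
# then call it inline in the coarse-scale code like a table lookup.
import torch
import torch.nn as nn

surrogate = nn.Sequential(           # maps (density, temperature) -> pressure
    nn.Linear(2, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1))

opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)

def train_step(inputs, hi_fi_pressure):
    """One optimizer step fitting the surrogate to fine-scale samples."""
    opt.zero_grad()
    loss = torch.mean((surrogate(inputs) - hi_fi_pressure) ** 2)
    loss.backward()
    opt.step()
    return loss.item()

# At run time, the multi-physics code calls the trained surrogate instead
# of interpolating a table, e.g.:
#   p = surrogate(torch.tensor([[rho, temp]]))
```

  The payoff the slide describes is that a call like this can stand in for a full fine-scale calculation at each coarse-scale evaluation point, tightening the coupling between scales at a fraction of the cost.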
  • 71. The ATOM partnership is exploring new active-learning approaches to accelerate drug design.
  - Approach: an open public-private partnership; lead with computation, supported and validated by targeted experiments; data-sharing to build models using everyone's data
  - Partners: LLNL, GSK, NCI, UCSF
  - Product: an open-source framework of tools and capabilities
  - 25 FTEs in shared Mission Bay (SF) space; R&D started March 2018
  • 72. ADAPD: Advanced Data Analytics for Proliferation Detection, innovation in low-profile nuclear proliferation detection. Focused data-science R&D combines indirect data, SME process and physics models, and NNSA R&D testbeds to deliver new and responsive analytics capabilities at the labs: predictive models, multi-modal detection, and crossing facilities and scenarios. The effort leverages growing industry investments, DOE/NNSA partnerships, and student and university programs, focusing on hard problems and establishing the next generation of proliferation analytics expertise.
  • 73. LLNL's Data Science Institute (established 2017): big machines, big data, big ideas. The Data Science Institute acts as the central hub for all data science activity at LLNL, working to help lead, build, and strengthen the data science workforce, research, and outreach to advance the state of the art of LLNL's data science capabilities. https://data-science.llnl.gov/
  • 74. Strengthening the LLNL data science workforce and fostering a sense of community. Goals:
  - Strengthen and sustain LLNL's data science workforce
  - Enhance internal coordination and communication of the data science workforce across disciplines and programs
  - Promote and expand the visibility of data science work at LLNL
  - Evolve a strategic data science vision and guide S&T investments
  Activities: a Data Science Council (informing S&T investments; communication); workforce development (reading groups, courses, summer program); web presence (external/internal website, mailing list, Slack); outreach (academic engagement, info sessions, recruiting); and a seminar series/workshop (invited speakers; a multi-day UC/LLNL workshop).
  • 75. Data Science Summer Institute: the focus is the future. A 12-week flexible internship focused on data science; students are funded 50% to work on solutions to challenge problems, attend courses and seminars, and strengthen skills. 1000+ applicants in FY18; 50 total students in FY17+FY18 (24 in FY17, 26 in FY18). Visiting faculty: James Flegal (UC Riverside) and Robert Gramacy (Virginia Tech). Challenge problems: topology optimization, machine vision, multimodal data exploration, cyber.
  • 76. Conclusions:
  - The 125 PF Sierra system is currently being stood up for use in the Stockpile Stewardship mission of the NNSA
  - Preparation for the pivot to heterogeneous computing was difficult, but is demonstrating significant benefits in computational results
  - The Open Science period for Sierra gave open applications significant allocations to demonstrate cutting-edge science results
  - LLNL is pursuing an application strategy that encompasses both traditional exascale computing needs and the burgeoning AI/ML trends
  • 77. Disclaimer This document was prepared as an account of work sponsored by an agency of the United States government. Neither the United States government nor Lawrence Livermore National Security, LLC, nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States government or Lawrence Livermore National Security, LLC. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States government or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes.