In this deck from the Blue Waters Symposium, Dr. Rommie E. Amaro from UC San Diego presents: Computational Biophysics in the Petascale Computing Era.
"Advances in structural, chemical, and biophysical data acquisition (e.g., protein structures via X-ray crystallography and near atomic cryoEM, isothermal calorimetry, etc.), coupled with the continued exponential growth in computing power and advances in the underlying algorithms, are opening a new era for the simulation of biological systems at the molecular level, and at scales never before reached. In this talk I will discuss how the Blue Waters petascale computing architecture forever altered the landscape and potential of computational biophysics. In particular, new and emerging capabilities for multiscale dynamic simulations that cross spatial scales from the molecular (angstrom) to cellular ultrastructure (near micron), and temporal scales from the picoseconds of macromolecular dynamics to the physiologically important time scales of organelles and cells (milliseconds to seconds) are now possible. These efforts are driven by the outstanding and persistent advances in peta- and exascale computing and availability of multimodal biological datasets, as well as by gaps in current abilities to connect across scales where it is already clear that new approaches will result in novel fundamental understanding of biological phenomena or open new therapeutic avenues."
Watch the video: https://wp.me/p3RLHQ-j1u
Learn more: https://bluewaters.ncsa.illinois.edu/blue-waters-symposium-2018
Computational Biophysics in the Petascale Computing Era
1. Computational Biophysics in the Petascale Computing Era
Rommie E. Amaro, UC San Diego. Blue Waters Symposium, June 2018
2. Convergence of HPC, data science, & data enabling transformative advances at the intersection of observational and simulation sciences
Figure: compute power over time, with the largest biomolecular system simulated at each stage
1993: HP 735, 12 CPUs; protein, 10k atoms, 100s ps
1997: SGI Origin, 128 CPUs; ion channel, 100k atoms, 1 ns
2002: LeMieux, 3k CPUs; ATPase, 500k atoms, 10s ns
2007: Ranger, 60k CPUs; ribosome, 2 mil atoms, 100s ns
2013: 360,000 cores + GPU acceleration; enveloped virus, 160 mil+ atoms, 1-100 μs
Also labeled on the figure: Anton, Exascale
BW is a key component of the cyberinfrastructure ecosystem
4. Biophysics on Blue Waters
Actual Usage by Discipline, 4/2013-5/2018
Biophysics: 15%
Elementary Particle Physics: 13%
Stellar Astronomy and Astrophysics: 13%
Physics: 8%
Earth Sciences: 7%
Chemistry: 6%
Astronomical Sciences: 5%
Atmospheric Sciences: 4%
Molecular Biosciences: 4%
Engineering: 4%
Materials Research: 3%
Fluid, Particulate, and Hydraulic Systems: 3%
Extragalactic Astronomy and Cosmology: 2%
Magnetospheric Physics: 2%
Biological Sciences: 2%
Galactic Astronomy: 1%
Nuclear Physics: 1%
Climate Dynamics: 1%
Planetary Astronomy: 1%
Computer and Computation Research: 1%
Biochemistry and Molecular Structure and Function: 1%
Geophysics: 0%
Social, Behavioral, and Economic Sciences: 0%
Computer and Information Science and Engineering: 0%
Neuroscience Biology: 0%
Chemical, Thermal Systems: 0%
Other: 5%
5. Biophysics on Blue Waters
DI Data Intensive: uses large numbers of files, e.g. large disk space/bandwidth, or automated workflows/off-site transfers
GA GPU-Accelerated: written to run faster on XK nodes than on XE nodes
TN Thousand Node: scales to at least 1,000 nodes for production science
MI Memory Intensive: uses at least 50 percent of available memory on 1,000-node runs
BW Blue Waters: research only possible on Blue Waters
MP Multi-Physics/multi-scale: job spans multiple length/timescales or physical/chemical processes
ML Machine Learning: employs deep learning or other techniques, includes “big data”
CI Communication-Intensive: requires high-bandwidth/low-latency interconnect for frequent, tightly coupled messaging
IA Industry Applicable: researcher has private sector collaborators or results directly applicable to industry
6. Computational biophysics bridges gaps across scales
e.g., Can we understand the drug target in its real environment?
Can we understand the molecular and chemical mechanisms underlying disease?
Blue Waters took us into and across these key “capability gaps”;
engaging all-atom & coarse-grained MD to give unseen views into the inner workings of cells at the molecular level
7. / OL15 is 0.44 Å from NMR average of Dickerson DNA dodecamer
Tom Cheatham
University of Utah
Reproducibility and convergence (ensembles, replica exchange) – we can overcome the sampling problem for modest systems (tetraloops and other RNA motifs)
Force field assessment, validation, and optimization
8. Inner gate opening shifts the SF conformational preference from conductive to constricted conformation
Figure panel labels: Pinched, Conductive
Allosteric Dynamics of C-type Inactivation in the KcsA Potassium Channel
Benoît Roux
Eduardo Perozo
Jing Li
9. These waters are now visible in a new high-resolution structure of
the open-inactivated KcsA (Perozo & Cuello, private communication)
10. The hepatitis B virus (HBV) capsid is a semi-permeable container with charge selectivity.
Sodium (+) translocates five times faster than chloride (-) in HBV capsids.
Analysis of 6 M solvent particles in parallel on Blue Waters.
0 Blue Waters XK nodes for 6 months
13. LARGE-SCALE COARSE-GRAINED MOLECULAR
SIMULATIONS OF THE VIRAL LIFECYCLE OF HIV-1
Gregory A. Voth
University of Chicago
The immature HIV-1 assembly process is catalyzed by scaffolds
RNA co-localizes protein & promotes assembly
Pak, Grime, … and Voth. PNAS 114:E10056 (2017)
Membrane deformation co-localizes & promotes assembly
15. HIV capsid: 4.2 million atoms, 1300+ proteins
HIV capsid contains 186 hexamers, 12 pentamers
16. Acoustic analysis reveals allostery between distant sites
Contact points control curvature
Ions permeate through capsid
Residue labels in figure: A204, E213, K203, I201
Perilla and Schulten. Nat. Commun. (2017), 15959
17. How membrane organization controls
influenza infection: simulation & experiment
Simulations yield a new molecular organizing principle for cholesterol
that controls influenza binding and infection.
Zawada…Kasson, 2016; Goronzy…Kasson, 2018.
18. 3D structural data to build visible virtual cells
• Routine dataset is 1.2 trillion pixels
• 100,000s of structures in a single dataset
Serial Section EM: resin-embedded samples
Serial Block EM: resin-embedded tissues
22. Alasdair Steven, NIH
PyMolecule
LipidWrapper
CellPACK
Fully Atomic
Reconstructions
Moving from single protein to whole virus
Johnson et al., Nature Methods (2014); Durrant & Amaro, PLOS Comp Bio (2014).
• Improved sense of the physical arrangement of biological entities in a complex biological milieu
• Enables simultaneous study of multiple components
• Mesoscale molecular models as a platform for other simulation approaches (e.g., Brownian dynamics, MCell, lattice Boltzmann MD)
… leads us to new avenues of investigation, not possible on the single protein scale
23. 160 million atoms, with explicit solvent
114,688 processors (16,384 Blue Waters nodes)
25.6 steps/s or ~4.5 ns/day
120 ns total, 12 TB
Timeline labels (2013 to 2016): Equilibration, membrane; System & tool building, waiting; Prod; Prop.; Rev.
Durrant and Amaro, unpublished (2017)
24. Petascale MD of Fully Enveloped Flu Virus
• Largest biological system ever simulated
(~165 million atoms)
• 4.5 ns/day using 114,688 CPUs
• 158 ns total simulation
• Saving every 20 ps → ~25 TB of data
• Collaboration with TCBG P41
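The throughput figures on these two slides can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the 2 fs integration timestep is an assumption (the slides do not state it), and the function names are hypothetical. It converts the quoted 25.6 steps/s into ns/day, then estimates the wall-clock time and the number of saved frames for the 158 ns run at a 20 ps save interval.

```python
SECONDS_PER_DAY = 86_400


def ns_per_day(steps_per_second, timestep_fs):
    """Simulated nanoseconds per wall-clock day for a given MD step rate."""
    fs_per_day = steps_per_second * SECONDS_PER_DAY * timestep_fs
    return fs_per_day * 1e-6  # 1 ns = 1e6 fs


def wall_clock_days(total_ns, rate_ns_per_day):
    """Days of continuous running needed to accumulate total_ns."""
    return total_ns / rate_ns_per_day


def saved_frames(total_ns, save_interval_ps):
    """Number of trajectory frames written at a fixed save interval."""
    return round(total_ns * 1000 / save_interval_ps)


# 25.6 steps/s is from the slides; the 2 fs timestep is an ASSUMPTION.
rate = ns_per_day(25.6, timestep_fs=2.0)

# 158 ns total, ~4.5 ns/day, and 20 ps save interval are from the slides.
days = wall_clock_days(158, 4.5)
frames = saved_frames(158, 20)

print(f"rate: {rate:.2f} ns/day")
print(f"wall-clock time: {days:.1f} days")
print(f"frames saved: {frames}")
```

Under the assumed 2 fs timestep the step rate works out to roughly 4.4 ns/day, consistent with the rounded ~4.5 ns/day on the slide, and the production run corresponds to about five weeks of uninterrupted execution on 16,384 nodes.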
25. Cell-scale Markov state models of protein dynamics
Markov state models define metastable states and transitions between states
Allows one to extract long timescale dynamics from many short timescale simulations
Figure state labels: Active, Inactive
Swope, Pande, Schütte, Noé…
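To make the idea concrete, here is a minimal sketch of how a Markov state model can be estimated from several short, already-discretized trajectories. It is not the speaker's workflow: the trajectories are made-up toy data, the function names are hypothetical, and the estimator is the simplest possible one (row-normalized transition counts at a fixed lag), whereas production MSM tools add reversibility constraints, lag-time validation, and error estimates.

```python
def count_transitions(trajectories, n_states, lag=1):
    """Count i -> j transitions at the given lag across all trajectories."""
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        for t in range(len(traj) - lag):
            counts[traj[t]][traj[t + lag]] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize the count matrix into transition probabilities."""
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("state was never visited as a starting point")
        matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(matrix, n_iter=200):
    """Approximate equilibrium populations by repeated propagation."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return pi


# Toy data (invented for illustration): 0 = active, 1 = inactive.
short_trajectories = [
    [0, 0, 0, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 1, 1, 1, 1],
]

counts = count_transitions(short_trajectories, n_states=2)
T = transition_matrix(counts)
pi = stationary_distribution(T)

print("counts:", counts)
print("transition matrix:", T)
print("equilibrium populations:", pi)
```

No single toy trajectory is long enough to characterize the equilibrium on its own, but pooling transition counts across all of them yields a transition matrix whose stationary distribution describes the long-time behavior.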
26. MSMs characterize loop dynamics & druggable pockets
Virion has 30 neuraminidases (NAs), 236 hemagglutinins (HAs)
Enough sampling to make a Markov state model (MSM) of NA loop dynamics
2-state macrostate model: open/closed
Mean first-passage time (MFPT) for the 150-loop:
• open to closed 52.9 ns
• closed to open 198.4 ns
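The two passage times above imply rates and equilibrium populations if one assumes simple two-state, first-order kinetics. That assumption is an idealization of the macrostate model rather than something stated on the slide, and the function name below is hypothetical. Under it, each rate is the inverse of the corresponding mean first-passage time, and the populations follow from detailed balance.

```python
def two_state_summary(mfpt_open_to_closed_ns, mfpt_closed_to_open_ns):
    """Rates, populations, and relaxation time for an ideal two-state model."""
    k_oc = 1.0 / mfpt_open_to_closed_ns  # open -> closed rate, per ns
    k_co = 1.0 / mfpt_closed_to_open_ns  # closed -> open rate, per ns
    total = k_oc + k_co
    return {
        "k_open_to_closed_per_ns": k_oc,
        "k_closed_to_open_per_ns": k_co,
        # Detailed balance: p_open * k_oc == p_closed * k_co
        "p_open": k_co / total,
        "p_closed": k_oc / total,
        "relaxation_time_ns": 1.0 / total,
    }


# Passage times are taken from the slide.
summary = two_state_summary(52.9, 198.4)

for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

With the slide's numbers this gives the 150-loop as open about 21% of the time and closed about 79% of the time, since the loop closes roughly four times faster than it opens.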
27. Molecular simulation at the mesoscale
Scale bars: 10 nm, 100 nm, 1000 nm (1 μm)
Biophysics is ready for exascale!