SlideShare ist ein Scribd-Unternehmen logo
1 von 27
Downloaden Sie, um offline zu lesen
GENOMES TO STRUCTURE TO 
FUNCTION (AND MOVIES): ROLE OF 
HPC 
Jack R. Collins, Ph.D. 
Frederick National Laboratory for Cancer Research 
HPC User Forum 
September 16, 2014
A Guiding View 
Probably held by most HPC folks in this room 
• “The more advanced the sciences have become, the 
more they have tended to enter the domain of 
mathematics, which is a sort of center towards which they 
converge. We can judge of the perfection to which a 
science has come by the facility, more or less great, with 
which it may be approached by calculation.” - Adolphe 
Quetelet 
• Edward Mailly, Essai sur la vie et les ouv rages de Quetelet in the Annuaire de Vacadimie royale 
des sciences des lettres et des beaux-arts de Belgique (1875) Vol. xli pp. 109-297 found also in 
"Conclusions" of Instructions populaires sur le calcul des probabilités p. 230 
• Wikiquote.org
Outline 
• Changes in Biology / Life Sciences 
• Data – Beyond Databases 
• Examples merging simulation and experimental data 
• Ultra-high resolution structures 
• Xray 
• Electron Microscopy 
• Nanoparticles 
• Geno Nano-toxicity
Simple Biology 
Genome 
• DNA 
Transcriptome 
• RNA messages 
Protein 
• Structure, 
Function
Not So Simple Biology 
Genome 
• DNA 
• Epigenome 
• Histone Marks 
Transcriptome 
• mRNA, tRNA, 
alternate tx 
• miRNA, siRNA, 
lncRNA, … 
• RNA structure 
Protein 
• Structure, Function, 
Mutations, PTM 
• Localization, 
quaternary 
structure, multiple 
conformations, … 
Small 
Molecules, 
Nanoparticles
Closer to Reality 
“The fact that cancerous 
cells can be inserted into 
an animal and not develop 
into a tumor, reinforces the 
theory that it is not the 
char- acteristics of the 
cells themselves, that 
result in cancer, but the 
properties emerging from 
the interaction between 
the cell and other 
response systems.” Knox 
Cancer Cell International 
2010, 10:11 
Figure taken from: http://en.wikipedia.org/wiki/Complex_systems_biology
Beyond Databases: Application of cancer systems biology to 
decipher complex interactions in multiple dimensions 
Cancer Systems Biology: a 
peek into the future of patient 
care? 
Henrica M. J. Werner, 
Gordon B. Mills & 
Prahlad T. Ram 
Nature Reviews Clinical 
Oncology 11, 167–176 (2014) 
doi:10.1038/nrclinonc.2014.6
Beyond Databases: 
Example from the microbiome 
Vaginal microbe yields novel antibiotic 
Nature, Erika Check Hayden, 11 September 2014 
• Drug is one of thousands that may be produced by the 
human microbiome. 
• "This is a great example of the power of bioinformatics 
to not merely identify genes of interest from 'big data' 
'omics, but to connect together cassettes of genes to 
increase our fundamental understanding of how 
commensal bacteria maintain a healthy human 
microbiome," says microbial genomicist Derrick Fouts of 
the J. Craig Venter Institute in Rockville, Maryland: 
Quoted in Nature.
Developing Therapeutics: Functional understanding 
and drug development require 3D structures 
Examples with a common theme: 
• BioXFEL: CXI detectors capture millions of high resolution 
images before merging at ~400Mb/image with current 
technologies. (Currently, 40TB/day : Next generation 
~150TB/day) 
• EM: Very large data sets (1.5Tb / set with current 
technologies) 
• LVEM: Highly redundant sets (> 20,000 images per stack 
per view)
What’s technologies are fueling the 
advances in structural biology?
Brighter Light: x-ray lasers can generate ultra-high 
resolution structures
Better Detectors / Cameras 
Science benefits from 
consumer and 
astronomy applications 
driving increased 
sensitivity and speed 
(Just like GPU 
advances are helping 
push HPC.)
X-ray imaging of biomolecules 
• “With the new bioimaging technique developed in the BioXFEL center, we 
will be able to analyze crystals 1,000 times smaller than the ones we 
can use now,” Lattman said. “These are crystals we could never use 
before and, in fact, may not have known existed. A whole new universe of 
drug targets will become accessible for study as a result.” 
• “The techniques the BioXFEL center will develop could shorten the 
process of determining protein structure from years to days,” said 
Ourmazd of the University of Wisconsin-Milwaukee. “This will rely 
heavily on mathematical algorithms we and others are developing to 
deduce structure from millions of ultralow-signal snapshots.” 
• A key advantage is that it will let scientists see the motions of 
molecules for the first time. “Most biological processes require 
movements within the molecules involved,” Lattman said. 
• http://www.buffalo.edu/ubreporter/featured-stories.host.html/content/shared/university/news/ub-reporter-articles/ 
stories/2013/lattman_bioxfel.detail.html#sthash.E3SHsnSy.dpuf
Merging QM with Experiment to explore “chemical 
resolution” 
• Ultra-high resolution data contains finer details including 
the positions of protons - important to function of proteins 
and to aid in NMR refinement. 
• Some of these processes can not be described using 
stationary models but can be revealed by refining the 
structure using a combination of quantum mechanical 
tools and careful matching of the electron density data.
QMRx: Small motion dynamics within the electron 
density envelope reveal distribution related to catalytic 
mechanism required for function.
Structural Analysis integrating EM and 
QM tools for large molecular aggregates 
Use of high contrast (low 
voltage) EM for the 3D 
reconstruction of a complex 
nanomaterial without 
preprocessing of the sample 
or the use of staining agents. 
Clockwise from bottom left: 1) Electron microscopy image of self assembled nanoparticle 
with a hydrodynamic radii ~ 22.5nm. 2) Intermediate model. 3) The final model (right) 
contains 670,000 atoms.
Nanoparticle simulation and FDA approval 
Incorporate the relevant risk 
Risk Characterization Targeted research in FDA-regulated 
characterization information, 
hazard identification, exposure 
science, and risk modeling and 
methods into the safety evaluation 
of nanomaterials. 
product areas of potential 
nanotechnology applications where risk 
characterization information would 
help to enhance the understanding of 
hazard identification, exposure 
science, and risk modeling. 
Evaluate risk assessment 
approaches for risk management. 
Risk Assessment Enhance state of knowledge and 
scientific evidence to support 
potential development of generalized 
class-based approaches to risk 
assessment of FDA-regulated products 
containing nanomaterials. 
Integrate and standardize risk 
communication within the risk 
management framework 
Risk Communication Improve risk communication associated 
with FDA regulated product areas that 
either contain nanomaterials or product 
areas otherwise relevant to 
nanotechnology
Geno-Nano-Toxicity 
• Recent studies show the in vitro micronucleus assay to be a 
powerful tool in the study of nanoparticle-induced 
genotoxicity. ABCC developed procedures that facilitate the 
use of high contrast images to improve the quantitative 
annotation of micronucleus assay images. 
Automated workflow with 
feature extraction can facilitate 
the access to archived data 
providing an extra benefit to re-evaluate 
results and to facilitate 
the compilation of training sets.
Role/Challenges for HPC 
• Challenge of integrating “Big Data” into the Enterprise HPC 
infrastructure to enable workflows using heterogeneous 
technologies. (NoSQL, Hadoop, Graph Analytics, Literature, 
etc.) 
• System may need to be “tuned/balanced” differently 
• Challenge of integrating heterogeneous computational 
technologies (CPU, Big Memory, Accelerators - GPGPU, Phi, 
FPGA) to work together efficiently. 
• System may need to be designed differently 
• Challenge of efficient software to effectively make use of the 
heterogeneous HPC infrastructure. 
• Software may need to be redesigned and rewritten 
• Challenge of integrating skilled HPC people to catalyze 
adoption/innovation using HPC computational resources.
Acknowledgements 
• Raul Cachau, Ph.D. 
• Yanling Liu, Ph.D. 
• Joe Ivanic, Ph.D. 
• Brian Luke, Ph.D. 
• Uma Mudunuri
Just for Fun: A Life Science User/Researcher* 
*Not representative of all, but not uncommon 
• Preferred programming model? 
• R, Matlab, or Python (maybe Java) 
• Algorithm/Code may not be “FLOPs” dependent 
• Often involves integer or character or mixed 
• Uses a Mac because “it works” 
• Generally doesn’t want to be bothered with the details of “how” it works but wants it 
to work and solve their problem when they need it. 
• Will spend money on generating lots of data, students, postdocs 
• And often worry about what to do with the data afterward 
• Generally prefers Open Source software 
• Many of the applications change rapidly with little or no support 
• Wants to play with it on their laptop before production 
• Hear that GPUs (Phi, etc.) can make my application run faster 
• Can you port my script? 
• Is willing to use “Cloud” because you don’t have to wait for IT to provision 
a system (and there are community scripts and AMIs that make it 
relatively easy) 
• Wants to stay up to date with the latest cutting edge science 
• This was published yesterday and I want to use it on the HPC system
Merging Enterprise and HPC: 
Optimizing People and Workflows 
High-Res (up to 70k x 70k) Aperio Image 
Registration 
Common Imaging Tool Development for SAIP 
Automatic Visualization 
on 3D Biological 
Datasets 
New Image Segmentation Module in Open 
Source Imaging Software 3D Slicer
Optimizing People and Workflows 
Pathologist Annotates Image
Optimizing People and Workflows 
HPC Problem: Processing and Analysis
Optimizing People and Workflow 
HPC Problem: Processing and Analysis 
l 300 Aperio Images 
l Up to 70k by 70k pixel resolution 
l 5~20 GB for each uncompressed image 
l Insight Tool Kit Multi-resolution Image Registration 
Pipeline 
l NCSA system 
l Brute Force full resolution registration requires 720 cores 1.8TB 
mem for 240 hours 
l Modification to use low resolution + full resolution refinement 
requires 180 cores 450GB mem for 40 hours 
l Result: What would have been an impractical / 
impossible problem for the pathologists with the 
tools they had was solved fairly easily by a 
practical application of HPC and domain expertise.
What would my HPC computer look like? 
• Lots of memory bandwidth. 
• Many lookups, compares, and branches per clock tick. Not 
just Flops. 
• Ingest data from LARGE databases (I/O) 
• Scale as I need to reduce time to solution or grow model or 
model complexity evolves 
• Software libraries that efficiently use the hardware 
• Lots of capacity to run ensemble simulations in parallel so 
results can be aggregated to calculate distributions in timely 
manner 
From a presentation given at HPC 
User forum 2 1/2 years ago.
Software? 
• Programming model? 
• Skilled programmers? 
• More efficient utilization of the CPU 
• Beyond a few % of theoretical peak

Weitere ähnliche Inhalte

Was ist angesagt?

Plant leaf identification system using convolutional neural network
Plant leaf identification system using convolutional neural networkPlant leaf identification system using convolutional neural network
Plant leaf identification system using convolutional neural networkjournalBEEI
 
A survey of deep learning approaches to medical applications
A survey of deep learning approaches to medical applicationsA survey of deep learning approaches to medical applications
A survey of deep learning approaches to medical applicationsJoseph Paul Cohen PhD
 
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...Amit Sheth
 
Chest X-ray Pneumonia Classification with Deep Learning
Chest X-ray Pneumonia Classification with Deep LearningChest X-ray Pneumonia Classification with Deep Learning
Chest X-ray Pneumonia Classification with Deep LearningBaoTramDuong2
 
University at Buffalo’s Center for Computational Research
University at Buffalo’s Center for Computational ResearchUniversity at Buffalo’s Center for Computational Research
University at Buffalo’s Center for Computational ResearchAllineaSoftware
 
Agile large-scale machine-learning pipelines in drug discovery
Agile large-scale machine-learning pipelines in drug discoveryAgile large-scale machine-learning pipelines in drug discovery
Agile large-scale machine-learning pipelines in drug discoveryOla Spjuth
 
Big Data and AI for Covid-19
Big Data and AI for Covid-19Big Data and AI for Covid-19
Big Data and AI for Covid-19Andrew Zhang
 
wolstencroft-ogf20-astro
wolstencroft-ogf20-astrowolstencroft-ogf20-astro
wolstencroft-ogf20-astrowebuploader
 
Pneumonia detection using cnn
Pneumonia detection using cnnPneumonia detection using cnn
Pneumonia detection using cnnTushar Dalvi
 
A Semantics-based Approach to Machine Perception
A Semantics-based Approach to Machine PerceptionA Semantics-based Approach to Machine Perception
A Semantics-based Approach to Machine PerceptionCory Andrew Henson
 
Changing the World with CUDA
Changing the World with CUDAChanging the World with CUDA
Changing the World with CUDACalisa Cole
 
lehigh_resolve_pulse
lehigh_resolve_pulselehigh_resolve_pulse
lehigh_resolve_pulsechrisquirk
 

Was ist angesagt? (20)

Plant leaf identification system using convolutional neural network
Plant leaf identification system using convolutional neural networkPlant leaf identification system using convolutional neural network
Plant leaf identification system using convolutional neural network
 
A survey of deep learning approaches to medical applications
A survey of deep learning approaches to medical applicationsA survey of deep learning approaches to medical applications
A survey of deep learning approaches to medical applications
 
323462348
323462348323462348
323462348
 
Cyberinfrastructure for Einstein's Equations and Beyond
Cyberinfrastructure for Einstein's Equations and BeyondCyberinfrastructure for Einstein's Equations and Beyond
Cyberinfrastructure for Einstein's Equations and Beyond
 
Cyberistructure
CyberistructureCyberistructure
Cyberistructure
 
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
 
Cri big data
Cri big dataCri big data
Cri big data
 
Big Data
Big Data Big Data
Big Data
 
Chest X-ray Pneumonia Classification with Deep Learning
Chest X-ray Pneumonia Classification with Deep LearningChest X-ray Pneumonia Classification with Deep Learning
Chest X-ray Pneumonia Classification with Deep Learning
 
University at Buffalo’s Center for Computational Research
University at Buffalo’s Center for Computational ResearchUniversity at Buffalo’s Center for Computational Research
University at Buffalo’s Center for Computational Research
 
Agile large-scale machine-learning pipelines in drug discovery
Agile large-scale machine-learning pipelines in drug discoveryAgile large-scale machine-learning pipelines in drug discovery
Agile large-scale machine-learning pipelines in drug discovery
 
CV_12242015
CV_12242015CV_12242015
CV_12242015
 
Big Data and AI for Covid-19
Big Data and AI for Covid-19Big Data and AI for Covid-19
Big Data and AI for Covid-19
 
wolstencroft-ogf20-astro
wolstencroft-ogf20-astrowolstencroft-ogf20-astro
wolstencroft-ogf20-astro
 
NRNB EAC Meeting 2012
NRNB EAC Meeting 2012NRNB EAC Meeting 2012
NRNB EAC Meeting 2012
 
Pneumonia detection using cnn
Pneumonia detection using cnnPneumonia detection using cnn
Pneumonia detection using cnn
 
A Semantics-based Approach to Machine Perception
A Semantics-based Approach to Machine PerceptionA Semantics-based Approach to Machine Perception
A Semantics-based Approach to Machine Perception
 
Changing the World with CUDA
Changing the World with CUDAChanging the World with CUDA
Changing the World with CUDA
 
lehigh_resolve_pulse
lehigh_resolve_pulselehigh_resolve_pulse
lehigh_resolve_pulse
 
Ai covers space
Ai covers spaceAi covers space
Ai covers space
 

Ähnlich wie Collins seattle-2014-final

The Impact of Information Technology on Chemistry and Related Sciences
The Impact of Information Technology on Chemistry and Related SciencesThe Impact of Information Technology on Chemistry and Related Sciences
The Impact of Information Technology on Chemistry and Related SciencesAshutosh Jogalekar
 
AI-powered Medical Imaging Analysis for Precision Medicine
AI-powered Medical Imaging Analysis for Precision MedicineAI-powered Medical Imaging Analysis for Precision Medicine
AI-powered Medical Imaging Analysis for Precision MedicineSean Yu
 
How to Scale from Workstation through Cloud to HPC in Cryo-EM Processing
How to Scale from Workstation through Cloud to HPC in Cryo-EM ProcessingHow to Scale from Workstation through Cloud to HPC in Cryo-EM Processing
How to Scale from Workstation through Cloud to HPC in Cryo-EM Processinginside-BigData.com
 
High Performance Cyberinfrastructure to Support Data-Intensive Biomedical Res...
High Performance Cyberinfrastructure to Support Data-Intensive Biomedical Res...High Performance Cyberinfrastructure to Support Data-Intensive Biomedical Res...
High Performance Cyberinfrastructure to Support Data-Intensive Biomedical Res...Larry Smarr
 
A National Big Data Cyberinfrastructure Supporting Computational Biomedical R...
A National Big Data Cyberinfrastructure Supporting Computational Biomedical R...A National Big Data Cyberinfrastructure Supporting Computational Biomedical R...
A National Big Data Cyberinfrastructure Supporting Computational Biomedical R...Larry Smarr
 
Foundations for the Future of Science
Foundations for the Future of ScienceFoundations for the Future of Science
Foundations for the Future of ScienceGlobus
 
Big Data, The Community and The Commons (May 12, 2014)
Big Data, The Community and The Commons (May 12, 2014)Big Data, The Community and The Commons (May 12, 2014)
Big Data, The Community and The Commons (May 12, 2014)Robert Grossman
 
2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)Michael Atkins
 
The Science of Data Science
The Science of Data Science The Science of Data Science
The Science of Data Science James Hendler
 
Continuous modeling - automating model building on high-performance e-Infrast...
Continuous modeling - automating model building on high-performance e-Infrast...Continuous modeling - automating model building on high-performance e-Infrast...
Continuous modeling - automating model building on high-performance e-Infrast...Ola Spjuth
 
informatics_future.pdf
informatics_future.pdfinformatics_future.pdf
informatics_future.pdfAdhySugara2
 
Aaas Data Intensive Science And Grid
Aaas Data Intensive Science And GridAaas Data Intensive Science And Grid
Aaas Data Intensive Science And GridIan Foster
 
Opportunities for HPC in pharma R&D - main deck
Opportunities for HPC in pharma R&D - main deckOpportunities for HPC in pharma R&D - main deck
Opportunities for HPC in pharma R&D - main deckPistoia Alliance
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for ScienceIan Foster
 
The pulse of cloud computing with bioinformatics as an example
The pulse of cloud computing with bioinformatics as an exampleThe pulse of cloud computing with bioinformatics as an example
The pulse of cloud computing with bioinformatics as an exampleEnis Afgan
 
Future Directions in Chemical Engineering and Bioengineering
Future Directions in Chemical Engineering and BioengineeringFuture Directions in Chemical Engineering and Bioengineering
Future Directions in Chemical Engineering and BioengineeringIlya Klabukov
 
OpenPOWER Academia and Research team's webinar - Presentations from Oak Ridg...
OpenPOWER Academia and Research team's webinar  - Presentations from Oak Ridg...OpenPOWER Academia and Research team's webinar  - Presentations from Oak Ridg...
OpenPOWER Academia and Research team's webinar - Presentations from Oak Ridg...Ganesan Narayanasamy
 
Cao report 2007-2012
Cao report 2007-2012Cao report 2007-2012
Cao report 2007-2012Elif Ceylan
 
"The Reverse Factory: Embedded Vision in High-Volume Laboratory Applications,...
"The Reverse Factory: Embedded Vision in High-Volume Laboratory Applications,..."The Reverse Factory: Embedded Vision in High-Volume Laboratory Applications,...
"The Reverse Factory: Embedded Vision in High-Volume Laboratory Applications,...Edge AI and Vision Alliance
 

Ähnlich wie Collins seattle-2014-final (20)

AI for Science
AI for ScienceAI for Science
AI for Science
 
The Impact of Information Technology on Chemistry and Related Sciences
The Impact of Information Technology on Chemistry and Related SciencesThe Impact of Information Technology on Chemistry and Related Sciences
The Impact of Information Technology on Chemistry and Related Sciences
 
AI-powered Medical Imaging Analysis for Precision Medicine
AI-powered Medical Imaging Analysis for Precision MedicineAI-powered Medical Imaging Analysis for Precision Medicine
AI-powered Medical Imaging Analysis for Precision Medicine
 
How to Scale from Workstation through Cloud to HPC in Cryo-EM Processing
How to Scale from Workstation through Cloud to HPC in Cryo-EM ProcessingHow to Scale from Workstation through Cloud to HPC in Cryo-EM Processing
How to Scale from Workstation through Cloud to HPC in Cryo-EM Processing
 
High Performance Cyberinfrastructure to Support Data-Intensive Biomedical Res...
High Performance Cyberinfrastructure to Support Data-Intensive Biomedical Res...High Performance Cyberinfrastructure to Support Data-Intensive Biomedical Res...
High Performance Cyberinfrastructure to Support Data-Intensive Biomedical Res...
 
A National Big Data Cyberinfrastructure Supporting Computational Biomedical R...
A National Big Data Cyberinfrastructure Supporting Computational Biomedical R...A National Big Data Cyberinfrastructure Supporting Computational Biomedical R...
A National Big Data Cyberinfrastructure Supporting Computational Biomedical R...
 
Foundations for the Future of Science
Foundations for the Future of ScienceFoundations for the Future of Science
Foundations for the Future of Science
 
Big Data, The Community and The Commons (May 12, 2014)
Big Data, The Community and The Commons (May 12, 2014)Big Data, The Community and The Commons (May 12, 2014)
Big Data, The Community and The Commons (May 12, 2014)
 
2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)
 
The Science of Data Science
The Science of Data Science The Science of Data Science
The Science of Data Science
 
Continuous modeling - automating model building on high-performance e-Infrast...
Continuous modeling - automating model building on high-performance e-Infrast...Continuous modeling - automating model building on high-performance e-Infrast...
Continuous modeling - automating model building on high-performance e-Infrast...
 
informatics_future.pdf
informatics_future.pdfinformatics_future.pdf
informatics_future.pdf
 
Aaas Data Intensive Science And Grid
Aaas Data Intensive Science And GridAaas Data Intensive Science And Grid
Aaas Data Intensive Science And Grid
 
Opportunities for HPC in pharma R&D - main deck
Opportunities for HPC in pharma R&D - main deckOpportunities for HPC in pharma R&D - main deck
Opportunities for HPC in pharma R&D - main deck
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for Science
 
The pulse of cloud computing with bioinformatics as an example
The pulse of cloud computing with bioinformatics as an exampleThe pulse of cloud computing with bioinformatics as an example
The pulse of cloud computing with bioinformatics as an example
 
Future Directions in Chemical Engineering and Bioengineering
Future Directions in Chemical Engineering and BioengineeringFuture Directions in Chemical Engineering and Bioengineering
Future Directions in Chemical Engineering and Bioengineering
 
OpenPOWER Academia and Research team's webinar - Presentations from Oak Ridg...
OpenPOWER Academia and Research team's webinar  - Presentations from Oak Ridg...OpenPOWER Academia and Research team's webinar  - Presentations from Oak Ridg...
OpenPOWER Academia and Research team's webinar - Presentations from Oak Ridg...
 
Cao report 2007-2012
Cao report 2007-2012Cao report 2007-2012
Cao report 2007-2012
 
"The Reverse Factory: Embedded Vision in High-Volume Laboratory Applications,...
"The Reverse Factory: Embedded Vision in High-Volume Laboratory Applications,..."The Reverse Factory: Embedded Vision in High-Volume Laboratory Applications,...
"The Reverse Factory: Embedded Vision in High-Volume Laboratory Applications,...
 

Mehr von inside-BigData.com

Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...inside-BigData.com
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networksinside-BigData.com
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...inside-BigData.com
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...inside-BigData.com
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...inside-BigData.com
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networksinside-BigData.com
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoringinside-BigData.com
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecastsinside-BigData.com
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Updateinside-BigData.com
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19inside-BigData.com
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuninginside-BigData.com
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODinside-BigData.com
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Accelerationinside-BigData.com
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficientlyinside-BigData.com
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Erainside-BigData.com
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computinginside-BigData.com
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Clusterinside-BigData.com
 

Mehr von inside-BigData.com (20)

Major Market Shifts in IT
Major Market Shifts in ITMajor Market Shifts in IT
Major Market Shifts in IT
 
Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networks
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networks
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecasts
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Update
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuning
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
 
State of ARM-based HPC
State of ARM-based HPCState of ARM-based HPC
State of ARM-based HPC
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Acceleration
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Era
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computing
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Cluster
 
Overview of HPC Interconnects
Overview of HPC InterconnectsOverview of HPC Interconnects
Overview of HPC Interconnects
 

Kürzlich hochgeladen

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 

Kürzlich hochgeladen (20)

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 

Collins seattle-2014-final

  • 1. GENOMES TO STRUCTURE TO FUNCTION (AND MOVIES): ROLE OF HPC Jack R. Collins, Ph.D. Frederick National Laboratory for Cancer Research HPC User Forum September 16, 2014
  • 2. A Guiding View Probably held by most HPC folks in this room • “The more advanced the sciences have become, the more they have tended to enter the domain of mathematics, which is a sort of center towards which they converge. We can judge of the perfection to which a science has come by the facility, more or less great, with which it may be approached by calculation.” - Adolphe Quetelet • Edward Mailly, Essai sur la vie et les ouv rages de Quetelet in the Annuaire de Vacadimie royale des sciences des lettres et des beaux-arts de Belgique (1875) Vol. xli pp. 109-297 found also in "Conclusions" of Instructions populaires sur le calcul des probabilités p. 230 • Wikiquote.org
  • 3. Outline • Changes in Biology / Life Sciences • Data – Beyond Databases • Examples merging simulation and experimental data • Ultra-high resolution structures • Xray • Electron Microscopy • Nanoparticles • Geno Nano-toxicity
  • 4. Simple Biology Genome • DNA Transcriptome • RNA messages Protein • Structure, Function
  • 5. Not So Simple Biology Genome • DNA • Epigenome • Histone Marks Transcriptome • mRNA, tRNA, alternate tx • miRNA, siRNA, lncRNA, … • RNA structure Protein • Structure, Function, Mutations, PTM • Localization, quaternary structure, multiple conformations, … Small Molecules, Nanoparticles
  • 6. Closer to Reality “The fact that cancerous cells can be inserted into an animal and not develop into a tumor, reinforces the theory that it is not the char- acteristics of the cells themselves, that result in cancer, but the properties emerging from the interaction between the cell and other response systems.” Knox Cancer Cell International 2010, 10:11 Figure taken from: http://en.wikipedia.org/wiki/Complex_systems_biology
  • 7. Beyond Databases: Application of cancer systems biology to decipher complex interactions in multiple dimensions Cancer Systems Biology: a peek into the future of patient care? Henrica M. J. Werner, Gordon B. Mills & Prahlad T. Ram Nature Reviews Clinical Oncology 11, 167–176 (2014) doi:10.1038/nrclinonc.2014.6
  • 8. Beyond Databases: Example from the microbiome Vaginal microbe yields novel antibiotic Nature, Erika Check Hayden, 11 September 2014 • Drug is one of thousands that may be produced by the human microbiome. • "This is a great example of the power of bioinformatics to not merely identify genes of interest from 'big data' 'omics, but to connect together cassettes of genes to increase our fundamental understanding of how commensal bacteria maintain a healthy human microbiome," says microbial genomicist Derrick Fouts of the J. Craig Venter Institute in Rockville, Maryland: Quoted in Nature.
  • 9. Developing Therapeutics: Functional understanding and drug development require 3D structures Examples with a common theme: • BioXFEL: CXI detectors capture millions of high resolution images before merging at ~400Mb/image with current technologies. (Currently, 40TB/day : Next generation ~150TB/day) • EM: Very large data sets (1.5Tb / set with current technologies) • LVEM: Highly redundant sets (> 20,000 images per stack per view)
  • 10. What’s technologies are fueling the advances in structural biology?
  • 11. Brighter Light: x-ray lasers can generate ultra-high resolution structures
  • 12. Better Detectors / Cameras Science benefits from consumer and astronomy applications driving increased sensitivity and speed (Just like GPU advances are helping push HPC.)
  • 13. X-ray imaging of biomolecules • “With the new bioimaging technique developed in the BioXFEL center, we will be able to analyze crystals 1,000 times smaller than the ones we can use now,” Lattman said. “These are crystals we could never use before and, in fact, may not have known existed. A whole new universe of drug targets will become accessible for study as a result.” • “The techniques the BioXFEL center will develop could shorten the process of determining protein structure from years to days,” said Ourmazd of the University of Wisconsin-Milwaukee. “This will rely heavily on mathematical algorithms we and others are developing to deduce structure from millions of ultralow-signal snapshots.” • A key advantage is that it will let scientists see the motions of molecules for the first time. “Most biological processes require movements within the molecules involved,” Lattman said. • http://www.buffalo.edu/ubreporter/featured-stories.host.html/content/shared/university/news/ub-reporter-articles/ stories/2013/lattman_bioxfel.detail.html#sthash.E3SHsnSy.dpuf
  • 14. Merging QM with Experiment to explore “chemical resolution” • Ultra-high resolution data contains finer details including the positions of protons - important to function of proteins and to aid in NMR refinement. • Some of these processes can not be described using stationary models but can be revealed by refining the structure using a combination of quantum mechanical tools and careful matching of the electron density data.
  • 15. QMRx: Small motion dynamics within the electron density envelope reveal distribution related to catalytic mechanism required for function.
  • 16. Structural Analysis integrating EM and QM tools for large molecular aggregates Use of high contrast (low voltage) EM for the 3D reconstruction of a complex nanomaterial without preprocessing of the sample or the use of staining agents. Clockwise from bottom left: 1) Electron microscopy image of self assembled nanoparticle with a hydrodynamic radii ~ 22.5nm. 2) Intermediate model. 3) The final model (right) contains 670,000 atoms.
  • 17. Nanoparticle simulation and FDA approval Incorporate the relevant risk Risk Characterization Targeted research in FDA-regulated characterization information, hazard identification, exposure science, and risk modeling and methods into the safety evaluation of nanomaterials. product areas of potential nanotechnology applications where risk characterization information would help to enhance the understanding of hazard identification, exposure science, and risk modeling. Evaluate risk assessment approaches for risk management. Risk Assessment Enhance state of knowledge and scientific evidence to support potential development of generalized class-based approaches to risk assessment of FDA-regulated products containing nanomaterials. Integrate and standardize risk communication within the risk management framework Risk Communication Improve risk communication associated with FDA regulated product areas that either contain nanomaterials or product areas otherwise relevant to nanotechnology
  • 18. Geno-Nano-Toxicity • Recent studies show the in vitro micronucleus assay to be a powerful tool in the study of nanoparticle-induced genotoxicity. ABCC developed procedures that facilitate the use of high contrast images to improve the quantitative annotation of micronucleus assay images. Automated workflow with feature extraction can facilitate the access to archived data providing an extra benefit to re-evaluate results and to facilitate the compilation of training sets.
  • 19. Role/Challenges for HPC • Challenge of integrating “Big Data” into the Enterprise HPC infrastructure to enable workflows using heterogeneous technologies. (NoSQL, Hadoop, Graph Analytics, Literature, etc.) • System may need to be “tuned/balanced” differently • Challenge of integrating heterogeneous computational technologies (CPU, Big Memory, Accelerators - GPGPU, Phi, FPGA) to work together efficiently. • System may need to be designed differently • Challenge of efficient software to effectively make use of the heterogeneous HPC infrastructure. • Software may need to be redesigned and rewritten • Challenge of integrating skilled HPC people to catalyze adoption/innovation using HPC computational resources.
  • 20. Acknowledgements • Raul Cachau, Ph.D. • Yanling Liu, Ph.D. • Joe Ivanic, Ph.D. • Brian Luke, Ph.D. • Uma Mudunuri
  • 21. Just for Fun: A Life Science User/Researcher* *Not representative of all, but not uncommon • Preferred programming model? • R, Matlab, or Python (maybe Java) • Algorithm/Code may not be “FLOPs” dependent • Often involves integer or character or mixed • Uses a Mac because “it works” • Generally doesn’t want to be bothered with the details of “how” it works but wants it to work and solve their problem when they need it. • Will spend money on generating lots of data, students, postdocs • And often worry about what to do with the data afterward • Generally prefers Open Source software • Many of the applications change rapidly with little or no support • Wants to play with it on their laptop before production • Hear that GPUs (Phi, etc.) can make my application run faster • Can you port my script? • Is willing to use “Cloud” because you don’t have to wait for IT to provision a system (and there are community scripts and AMIs that make it relatively easy) • Wants to stay up to date with the latest cutting edge science • This was published yesterday and I want to use it on the HPC system
  • 22. Merging Enterprise and HPC: Optimizing People and Workflows High-Res (up to 70k x 70k) Aperio Image Registration Common Imaging Tool Development for SAIP Automatic Visualization on 3D Biological Datasets New Image Segmentation Module in Open Source Imaging Software 3D Slicer
  • 23. Optimizing People and Workflows Pathologist Annotates Image
  • 24. Optimizing People and Workflows HPC Problem: Processing and Analysis
  • 25. Optimizing People and Workflow HPC Problem: Processing and Analysis l 300 Aperio Images l Up to 70k by 70k pixel resolution l 5~20 GB for each uncompressed image l Insight Tool Kit Multi-resolution Image Registration Pipeline l NCSA system l Brute Force full resolution registration requires 720 cores 1.8TB mem for 240 hours l Modification to use low resolution + full resolution refinement requires 180 cores 450GB mem for 40 hours l Result: What would have been an impractical / impossible problem for the pathologists with the tools they had was solved fairly easily by a practical application of HPC and domain expertise.
  • 26. What would my HPC computer look like? • Lots of memory bandwidth. • Many lookups, compares, and branches per clock tick. Not just Flops. • Ingest data from LARGE databases (I/O) • Scale as I need to reduce time to solution or grow model or model complexity evolves • Software libraries that efficiently use the hardware • Lots of capacity to run ensemble simulations in parallel so results can be aggregated to calculate distributions in timely manner From a presentation given at HPC User forum 2 1/2 years ago.
  • 27. Software? • Programming model? • Skilled programmers? • More efficient utilization of the CPU • Beyond a few % of theoretical peak