SlideShare ist ein Scribd-Unternehmen logo
1 von 23
Downloaden Sie, um offline zu lesen
SUSTAINABLE SOFTWARE FOR
COMPUTATIONAL CHEMISTRY
AND MATERIALS MODELING
Beverly Sanders, University of Florida
Outline
   Overview
   Challenges and Current State
   Overcoming the Barriers—technical and cultural
Computational Chemistry
  Long history – over 50 years
  Underpins broad array of scientific applications—Grand
   Challenges
      Efficient combustion systems
      Drug design
      Understanding biological systems
      Semiconductor design
      Water sustainability
      CO2 Sequestration
      Efficient lighting (quantum dots)
     …
    Full partner with experiments—”computational experiments”
     may be more reliable than lab measurements
Scientific Software Innovation Institute for Computational
Chemistry and Materials Modeling -- S2I2C2M2

   Collaboration between computational chemists,
    computer scientists, applied mathematicians, and
    computer engineers
   Goal: overcome obstacles of algorithms and culture
    and change the nature of computational chemistry
    software development.
   Year long conceptualization phase has been funded by
    NSF
     First   meeting scheduled for January 2013
People

Daniel Crawford (Virginia Tech)     Vijay Pande (Stanford)
Robert Harrison (Tennessee, ORNL)   Manish Parashar (Rutgers)
Anna I. Krylov (U.S.C.)             Ram Ramanujam (LSU)
Theresa Windus (Iowa State)         Beverly Sanders (Florida)
Emily Carter (Princeton)            Bernhard Schlegel (Wayne State)
Edmund Chow (Georgia Tech)          David Sherrill (Georgia Tech)
Erik Deumens (Florida)              Lyudmila Slipchenko (Purdue)
Mark Gordon (Iowa State)            Masha Sosonkina (Iowa State)
Martin Head-Gordon (Berkeley)       Edward Valeev (Virginia Tech)
Todd Martinez (Stanford),           Ross Walker (San Diego
David McDowell (Georgia Tech)       Supercomputing Center)

                      + others
Current State of Computational
Chemistry
   Long history--legacies of modern molecular
    dynamics and quantum chemistry packages span
    decades
   Both open source and commercial
   Amalgam of programming languages
   Domain specific methods
     Multi-dimensional   integral engines
   General purpose
     Davidson  method for computing eigenvalues of large
      matrices (ranks in tens of billions)
Software is extremely complex
   Example: Modern ab initio quantum chemistry
    simulations
       Computations scale as O(N7) or higher
           Where N represents size of molecular system (number of atoms,
            electrons, or basis functions)
   Code complexity arises naturally from problems, but
     is an obstacle to long-term sustainability
     is an obstacle to exploitation of (ever changing) HPC
      hardware
     hinders education of next generation of scientists
           only a handful of very senior students can make a contribution
   Much recent software development focused on
    exploiting parallel architectures
     Varying   degrees of success
     With a few exceptions, still not fully exploiting
      available systems
     Utilizing exascale will require rethinking of approach

   Desperate need for tools to generate high
    performance massively parallel code from high
    level specifications
Developers
   Mostly grad students and post-docs
   Training in software engineering left to individual
    research groups
     Extent to which this is done varies
     Large burden for small groups: community approach
      has potential benefits for both software and the
      students
     Education tends to be narrow: students learn about
      software their advisors are involved with
Science Drivers
   Catalysis
       Catalysts facilitate control of chemical reactions by raising rates
        that chemical bonds are formed or broken
           Improve selectivity and control over unwanted byproducts
           Decreased energy consumption
           Reduction of waste stream
       Rational design of catalysts for a specific application is one of
        the Holy Grails of of modern chemistry and chemical engineering
       Requires quantitative information about transition states
           Intermediates low concentration and short lifetimes—thwart
            experimental evaluation
       Will require state-of-the-art computation combined with
        experiements
Science Drivers
   Organic photovoltaic cells
     Potential  applications: thin-film transistors, LEDs, solar
      cells, optical switches
     Advantages
       Devicescan be flexible
       Inexpensive to produce

     Limitations
       Reduced     power conversion energy
     But,process leading to current generation not well
      understood
Overcoming the Barriers
   1 year conceptualization phase funded by NSF
     First
          meeting Jan 2013
     3 working groups

   Highest priority
     Portable  parallel infrastructure
     General-purpose tensor algebra algorithms

     Protocols for information exchange and code
      interoperability
     Education and training
Portable parallel infrastructure
   Technology trends
     Massive concurrency on a chip
     Massive number of sockets in largest supercomputers
     Heterogeneity (CPU + GPU)
     Deep, complex memory hierarchies
     Memory and communication bandwidth limited

   Bleeding edge applications may need to
     Coordinate over 109 threads
     Tolerate faults
     Explicitly manage energy consumption
Sustainability of large and widely distributed
chemistry codes

   Enable most computations in chemistry
   Likely will run on leadership class machines
   Working group will include computational chemists,
    parallel programming experts, and reps from major
    tech providers (NVIDIA, Intel, IBM)
Sustainability of software developed by smaller
research groups

   Need to understand programming models and tools
   Need to understand how both community and
    software can be better organized
     Accelerate  testing of new ideas at sufficient scale to
      determine their worth
     Key: being able to write code and integrate into
      existing software.
   Currently, new developments take months or years
    to migrate from developers software to other
    packages
General-purpose tensor algebra
algorithms
   Tensor algebra ubiquitous in science and
    engineering
   Need new approaches for computing with high
    dimensional tensors
     Currentsoftware—8 or fewer
     Emerging methods require 3N dimensions where N (the
      number of electrons) may be O(100) or more.
   Need common framework of reusable software
    elements.
Challenges for high-rank tensors
   Challenges
     Develop robust implementations of algorithms
     Standardize data structures and algorithms, APIs,
      software elements, and frameworks
     Automate the derivation, transformation, and
      implementation of tensor expressions.
   Will require cross-disciplinary collaborations
   Infrastructure will include DSL, runtimes, compilers as
    well as static and dynamic algorithm analyses
Protocols for Information Exchange and
Code Interoperability
   Historically culture has been competitive rather than
    collaborative
     Sharing   of code and data limited
   Theoretical methods driving code towards greater
    size and complexity
   Currently, progress may require substantial
    duplication of effort—code that could be reused is
    not due to lack of standards
   Impedes new science, wastes human labor
Information Sharing
   Standards for data shared between codes
   Standards (or methods to convert between
    standards) for stored data to facilitate mining.
   Data provenance
   Cannot expect all code to use the same format, but
    transformation leads to errors, computational
    inefficiencies, complex interfaces
Code Sharing
   Need to establish level of interoperability
     Coarse-grained
       Hartree-Fock   code from one app, MP2 code from another
     Fine-grained
       Calculate most of one electron contributions to a Fock matrix
        in one program, relativistic and solvent terms in another
   Need architecture
Education and Training
   Mastering existing codes is daunting task for grad students
   Chemical education culture worse than many STEM fields
       PhD only requires modest coursework
           OK for most fields of chemistry where undergrad training is
            adequate preparation for hands-on lab research
           Most students unprepared for research in computational chemistry
                 Ad hoc training by individual research groups is innefficient
   How should students be prepared for 2020 and beyond?
       Programming models and tools
       Multidisciplinary foundation to computational science
           Reasoning about software
           Manipulating software with confidence and facility
Summer school
   Summer school for grad students supported by
    community
     Fundamental algorithms of computational chemistry
     Software best practices
If    S 2I2C2M2         is successful
    Open access to new software tools and infrastructure
    Training and educational opportunities for grad
     students and post-docs in
      Algorithms
      Code standards
      Software best practices
    NOT the goal to produce monolithic computational
     chemistry package to replace existing ones
        Healthy competition is good
    Sets of robust and properly validated software
     components that can be shared will benefit the entire
     community

Weitere ähnliche Inhalte

Was ist angesagt?

Data Management for Citizen Science
Data Management for Citizen ScienceData Management for Citizen Science
Data Management for Citizen ScienceAndrea Wiggins
 
Lecture 2 Teaching Digital Technologies 2016
Lecture 2 Teaching Digital Technologies 2016Lecture 2 Teaching Digital Technologies 2016
Lecture 2 Teaching Digital Technologies 2016Jason Zagami
 
Epistemic Encounters: Interdisciplinary collaboration in developing virtual r...
Epistemic Encounters: Interdisciplinary collaboration in developing virtual r...Epistemic Encounters: Interdisciplinary collaboration in developing virtual r...
Epistemic Encounters: Interdisciplinary collaboration in developing virtual r...Smiljana Antonijevic
 
Data Intensive Collaboration in Science and Engineering: CSCW workshop themes
Data Intensive Collaboration in Science and Engineering: CSCW workshop themesData Intensive Collaboration in Science and Engineering: CSCW workshop themes
Data Intensive Collaboration in Science and Engineering: CSCW workshop themesAndrea Wiggins
 

Was ist angesagt? (7)

Data Management for Citizen Science
Data Management for Citizen ScienceData Management for Citizen Science
Data Management for Citizen Science
 
Ucsd research-it-09-11-18
Ucsd research-it-09-11-18Ucsd research-it-09-11-18
Ucsd research-it-09-11-18
 
Sgci xsede-gateways-07-08-16
Sgci xsede-gateways-07-08-16Sgci xsede-gateways-07-08-16
Sgci xsede-gateways-07-08-16
 
Lecture 2 Teaching Digital Technologies 2016
Lecture 2 Teaching Digital Technologies 2016Lecture 2 Teaching Digital Technologies 2016
Lecture 2 Teaching Digital Technologies 2016
 
Epistemic Encounters: Interdisciplinary collaboration in developing virtual r...
Epistemic Encounters: Interdisciplinary collaboration in developing virtual r...Epistemic Encounters: Interdisciplinary collaboration in developing virtual r...
Epistemic Encounters: Interdisciplinary collaboration in developing virtual r...
 
Innovation Ecosystems Summit Brochure
Innovation Ecosystems Summit BrochureInnovation Ecosystems Summit Brochure
Innovation Ecosystems Summit Brochure
 
Data Intensive Collaboration in Science and Engineering: CSCW workshop themes
Data Intensive Collaboration in Science and Engineering: CSCW workshop themesData Intensive Collaboration in Science and Engineering: CSCW workshop themes
Data Intensive Collaboration in Science and Engineering: CSCW workshop themes
 

Ähnlich wie Sustainable Software for Computational Chemistry and Materials Modeling

Redes de sensores sem fio autonômicas: abordagens, aplicações e desafios
 Redes de sensores sem fio autonômicas: abordagens, aplicações e desafios Redes de sensores sem fio autonômicas: abordagens, aplicações e desafios
Redes de sensores sem fio autonômicas: abordagens, aplicações e desafiosPET Computação
 
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...San Diego Supercomputer Center
 
10.1.1.104.5038
10.1.1.104.503810.1.1.104.5038
10.1.1.104.503896565
 
06 styles and_greenfield_design
06 styles and_greenfield_design06 styles and_greenfield_design
06 styles and_greenfield_designMajong DevJfu
 
PNNL April 2011 ogce
PNNL April 2011 ogcePNNL April 2011 ogce
PNNL April 2011 ogcemarpierc
 
Teaching Computational Physics
Teaching Computational PhysicsTeaching Computational Physics
Teaching Computational PhysicsAmdeselassie Amde
 
Hedstrom Infrastructure
Hedstrom InfrastructureHedstrom Infrastructure
Hedstrom Infrastructureguest2c9ba28e
 
Disambiguating Advanced Computing for Humanities Researchers
Disambiguating Advanced Computing for Humanities ResearchersDisambiguating Advanced Computing for Humanities Researchers
Disambiguating Advanced Computing for Humanities ResearchersBaden Hughes
 
Linking data, models and tools an overview
Linking data, models and tools an overviewLinking data, models and tools an overview
Linking data, models and tools an overviewGennadii Donchyts
 
International Journal of Advanced Smart Sensor Network Systems ( IJASSN )
International Journal of Advanced Smart Sensor Network Systems  ( IJASSN )International Journal of Advanced Smart Sensor Network Systems  ( IJASSN )
International Journal of Advanced Smart Sensor Network Systems ( IJASSN )ijassn
 
AHM 2014: Enterprise Architecture for Transformative Research and Collaborati...
AHM 2014: Enterprise Architecture for Transformative Research and Collaborati...AHM 2014: Enterprise Architecture for Transformative Research and Collaborati...
AHM 2014: Enterprise Architecture for Transformative Research and Collaborati...EarthCube
 
Finding Bad Code Smells with Neural Network Models
Finding Bad Code Smells with Neural Network Models Finding Bad Code Smells with Neural Network Models
Finding Bad Code Smells with Neural Network Models IJECEIAES
 
kantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.pptkantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.pptbutest
 
NO PUBLICATION CHARGES - International Journal of Advanced Smart Sensor Netwo...
NO PUBLICATION CHARGES - International Journal of Advanced Smart Sensor Netwo...NO PUBLICATION CHARGES - International Journal of Advanced Smart Sensor Netwo...
NO PUBLICATION CHARGES - International Journal of Advanced Smart Sensor Netwo...ijassn
 
Digital Security by Design: Formal Verification with Broad-Spectrum ANSI-C Re...
Digital Security by Design: Formal Verification with Broad-Spectrum ANSI-C Re...Digital Security by Design: Formal Verification with Broad-Spectrum ANSI-C Re...
Digital Security by Design: Formal Verification with Broad-Spectrum ANSI-C Re...KTN
 
Final teit syllabus_2012_course_04.06.2014
Final teit syllabus_2012_course_04.06.2014Final teit syllabus_2012_course_04.06.2014
Final teit syllabus_2012_course_04.06.2014deepti112233
 
2020.04.07 automated molecular design and the bradshaw platform webinar
2020.04.07 automated molecular design and the bradshaw platform webinar2020.04.07 automated molecular design and the bradshaw platform webinar
2020.04.07 automated molecular design and the bradshaw platform webinarPistoia Alliance
 

Ähnlich wie Sustainable Software for Computational Chemistry and Materials Modeling (20)

Cyberistructure
CyberistructureCyberistructure
Cyberistructure
 
Redes de sensores sem fio autonômicas: abordagens, aplicações e desafios
 Redes de sensores sem fio autonômicas: abordagens, aplicações e desafios Redes de sensores sem fio autonômicas: abordagens, aplicações e desafios
Redes de sensores sem fio autonômicas: abordagens, aplicações e desafios
 
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
 
10.1.1.104.5038
10.1.1.104.503810.1.1.104.5038
10.1.1.104.5038
 
06 styles and_greenfield_design
06 styles and_greenfield_design06 styles and_greenfield_design
06 styles and_greenfield_design
 
PNNL April 2011 ogce
PNNL April 2011 ogcePNNL April 2011 ogce
PNNL April 2011 ogce
 
Teaching Computational Physics
Teaching Computational PhysicsTeaching Computational Physics
Teaching Computational Physics
 
Hedstrom Infrastructure
Hedstrom InfrastructureHedstrom Infrastructure
Hedstrom Infrastructure
 
Disambiguating Advanced Computing for Humanities Researchers
Disambiguating Advanced Computing for Humanities ResearchersDisambiguating Advanced Computing for Humanities Researchers
Disambiguating Advanced Computing for Humanities Researchers
 
Linking data, models and tools an overview
Linking data, models and tools an overviewLinking data, models and tools an overview
Linking data, models and tools an overview
 
International Journal of Advanced Smart Sensor Network Systems ( IJASSN )
International Journal of Advanced Smart Sensor Network Systems  ( IJASSN )International Journal of Advanced Smart Sensor Network Systems  ( IJASSN )
International Journal of Advanced Smart Sensor Network Systems ( IJASSN )
 
AHM 2014: Enterprise Architecture for Transformative Research and Collaborati...
AHM 2014: Enterprise Architecture for Transformative Research and Collaborati...AHM 2014: Enterprise Architecture for Transformative Research and Collaborati...
AHM 2014: Enterprise Architecture for Transformative Research and Collaborati...
 
Finding Bad Code Smells with Neural Network Models
Finding Bad Code Smells with Neural Network Models Finding Bad Code Smells with Neural Network Models
Finding Bad Code Smells with Neural Network Models
 
kantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.pptkantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.ppt
 
NO PUBLICATION CHARGES - International Journal of Advanced Smart Sensor Netwo...
NO PUBLICATION CHARGES - International Journal of Advanced Smart Sensor Netwo...NO PUBLICATION CHARGES - International Journal of Advanced Smart Sensor Netwo...
NO PUBLICATION CHARGES - International Journal of Advanced Smart Sensor Netwo...
 
Digital Security by Design: Formal Verification with Broad-Spectrum ANSI-C Re...
Digital Security by Design: Formal Verification with Broad-Spectrum ANSI-C Re...Digital Security by Design: Formal Verification with Broad-Spectrum ANSI-C Re...
Digital Security by Design: Formal Verification with Broad-Spectrum ANSI-C Re...
 
AI for Science
AI for ScienceAI for Science
AI for Science
 
Final teit syllabus_2012_course_04.06.2014
Final teit syllabus_2012_course_04.06.2014Final teit syllabus_2012_course_04.06.2014
Final teit syllabus_2012_course_04.06.2014
 
Hdllab1
Hdllab1Hdllab1
Hdllab1
 
2020.04.07 automated molecular design and the bradshaw platform webinar
2020.04.07 automated molecular design and the bradshaw platform webinar2020.04.07 automated molecular design and the bradshaw platform webinar
2020.04.07 automated molecular design and the bradshaw platform webinar
 

Mehr von SoftwarePractice

Software Practice 12 breakout - Tracking usage and impact of software
Software Practice 12 breakout - Tracking usage and impact of softwareSoftware Practice 12 breakout - Tracking usage and impact of software
Software Practice 12 breakout - Tracking usage and impact of softwareSoftwarePractice
 
Software Practice 12 breakout - Life for Software Beyond Public Funding
Software Practice 12 breakout - Life for Software Beyond Public FundingSoftware Practice 12 breakout - Life for Software Beyond Public Funding
Software Practice 12 breakout - Life for Software Beyond Public FundingSoftwarePractice
 
Overview of the TriBITS Lifecycle Model
Overview of the TriBITS Lifecycle ModelOverview of the TriBITS Lifecycle Model
Overview of the TriBITS Lifecycle ModelSoftwarePractice
 
ScienceSoft: Open Software for Open Science
ScienceSoft: Open Software for Open ScienceScienceSoft: Open Software for Open Science
ScienceSoft: Open Software for Open ScienceSoftwarePractice
 
libHPC: Software sustainability and reuse through metadata preservation
libHPC: Software sustainability and reuse through metadata preservationlibHPC: Software sustainability and reuse through metadata preservation
libHPC: Software sustainability and reuse through metadata preservationSoftwarePractice
 
Adoption of Software By A User Community: The Montage Image Mosaic Engine Exa...
Adoption of Software By A User Community: The Montage Image Mosaic Engine Exa...Adoption of Software By A User Community: The Montage Image Mosaic Engine Exa...
Adoption of Software By A User Community: The Montage Image Mosaic Engine Exa...SoftwarePractice
 
The Relationship Between Development Problems and Use of Software Engineering...
The Relationship Between Development Problems and Use of Software Engineering...The Relationship Between Development Problems and Use of Software Engineering...
The Relationship Between Development Problems and Use of Software Engineering...SoftwarePractice
 

Mehr von SoftwarePractice (7)

Software Practice 12 breakout - Tracking usage and impact of software
Software Practice 12 breakout - Tracking usage and impact of softwareSoftware Practice 12 breakout - Tracking usage and impact of software
Software Practice 12 breakout - Tracking usage and impact of software
 
Software Practice 12 breakout - Life for Software Beyond Public Funding
Software Practice 12 breakout - Life for Software Beyond Public FundingSoftware Practice 12 breakout - Life for Software Beyond Public Funding
Software Practice 12 breakout - Life for Software Beyond Public Funding
 
Overview of the TriBITS Lifecycle Model
Overview of the TriBITS Lifecycle ModelOverview of the TriBITS Lifecycle Model
Overview of the TriBITS Lifecycle Model
 
ScienceSoft: Open Software for Open Science
ScienceSoft: Open Software for Open ScienceScienceSoft: Open Software for Open Science
ScienceSoft: Open Software for Open Science
 
libHPC: Software sustainability and reuse through metadata preservation
libHPC: Software sustainability and reuse through metadata preservationlibHPC: Software sustainability and reuse through metadata preservation
libHPC: Software sustainability and reuse through metadata preservation
 
Adoption of Software By A User Community: The Montage Image Mosaic Engine Exa...
Adoption of Software By A User Community: The Montage Image Mosaic Engine Exa...Adoption of Software By A User Community: The Montage Image Mosaic Engine Exa...
Adoption of Software By A User Community: The Montage Image Mosaic Engine Exa...
 
The Relationship Between Development Problems and Use of Software Engineering...
The Relationship Between Development Problems and Use of Software Engineering...The Relationship Between Development Problems and Use of Software Engineering...
The Relationship Between Development Problems and Use of Software Engineering...
 

Sustainable Software for Computational Chemistry and Materials Modeling

  • 1. SUSTAINABLE SOFTWARE FOR COMPUTATIONAL CHEMISTRY AND MATERIALS MODELING Beverly Sanders, University of Florida
  • 2. Outline  Overview  Challenges and Current State  Overcoming the Barriers—technical and cultural
  • 3. Computational Chemistry  Long history – over 50 years  Underpins broad array of scientific applications—Grand Challenges  Efficient combustion systems  Drug design  Understanding biological systems  Semiconductor design  Water sustainability  CO2 Sequestration  Efficient lighting (quantum dots) …  Full partner with experiments—”computational experiments” may be more reliable than lab measurements
  • 4. Scientific Software Innovation Institute for Computational Chemistry and Materials Modeling -- S2I2C2M2  Collaboration between computational chemists, computer scientists, applied mathematicians, and computer engineers  Goal: overcome obstacles of algorithms and culture and change the nature of computational chemistry software development.  Year long conceptualization phase has been funded by NSF  First meeting scheduled for January 2013
  • 5. People Daniel Crawford (Virginia Tech) Vijay Pande (Stanford) Robert Harrison (Tennessee, ORNL) Manish Parashar (Rutgers) Anna I. Krylov (U.S.C.) Ram Ramanujam (LSU) Theresa Windus (Iowa State) Beverly Sanders (Florida) Emily Carter (Princeton) Bernhard Schlegel (Wayne State) Edmund Chow (Georgia Tech) David Sherrill (Georgia Tech) Erik Deumens (Florida) Lyudmila Slipchenko (Purdue) Mark Gordon (Iowa State) Masha Sosonkina (Iowa State) Martin Head-Gordon (Berkeley) Edward Valeev (Virginia Tech) Todd Martinez (Stanford), Ross Walker (San Diego David McDowell (Georgia Tech) Supercomputing Center) + others
  • 6. Current State of Computational Chemistry  Long history--legacies of modern molecular dynamics and quantum chemistry packages span decades  Both open source and commercial  Amalgam of programming languages  Domain specific methods  Multi-dimensional integral engines  General purpose  Davidson method for computing eigenvalues of large matrices (ranks in tens of billions)
  • 7. Software is extremely complex  Example: Modern ab initio quantum chemistry simulations  Computations scale as O(N7) or higher  Where N represents size of molecular system (number of atoms, electrons, or basis functions)  Code complexity arises naturally from problems, but  is an obstacle to long-term sustainability  is an obstacle to exploitation of (ever changing) HPC hardware  hinders education of next generation of scientists  only a handful of very senior students can make a contribution
  • 8. Much recent software development focused on exploiting parallel architectures  Varying degrees of success  With a few exceptions, still not fully exploiting available systems  Utilizing exascale will require rethinking of approach  Desperate need for tools to generate high performance massively parallel code from high level specifications
  • 9. Developers  Mostly grad students and post-docs  Training in software engineering left to individual research groups  Extent to which this is done varies  Large burden for small groups: community approach has potential benefits for both software and the students  Education tends to be narrow: students learn about software their advisors are involved with
  • 10. Science Drivers  Catalysis  Catalysts facilitate control of chemical reactions by raising rates that chemical bonds are formed or broken  Improve selectivity and control over unwanted byproducts  Decreased energy consumption  Reduction of waste stream  Rational design of catalysts for a specific application is one of the Holy Grails of of modern chemistry and chemical engineering  Requires quantitative information about transition states  Intermediates low concentration and short lifetimes—thwart experimental evaluation  Will require state-of-the-art computation combined with experiements
  • 11. Science Drivers  Organic photovoltaic cells  Potential applications: thin-film transistors, LEDs, solar cells, optical switches  Advantages  Devicescan be flexible  Inexpensive to produce  Limitations  Reduced power conversion energy  But,process leading to current generation not well understood
  • 12. Overcoming the Barriers  1 year conceptualization phase funded by NSF  First meeting Jan 2013  3 working groups  Highest priority  Portable parallel infrastructure  General-purpose tensor algebra algorithms  Protocols for information exchange and code interoperability  Education and training
  • 13. Portable parallel infrastructure  Technology trends  Massive concurrency on a chip  Massive number of sockets in largest supercomputers  Heterogeneity (CPU + GPU)  Deep, complex memory hierarchies  Memory and communication bandwidth limited  Bleeding edge applications may need to  Coordinate over 109 threads  Tolerate faults  Explicitly manage energy consumption
  • 14. Sustainability of large and widely distributed chemistry codes  Enable most computations in chemistry  Likely will run on leadership class machines  Working group will include computational chemists, parallel programming experts, and reps from major tech providers (NVIDIA, Intel, IBM)
  • 15. Sustainability of software developed by smaller research groups  Need to understand programming models and tools  Need to understand how both community and software can be better organized  Accelerate testing of new ideas at sufficient scale to determine their worth  Key: being able to write code and integrate into existing software.  Currently, new developments take months or years to migrate from developers software to other packages
  • 16. General-purpose tensor algebra algorithms  Tensor algebra ubiquitous in science and engineering  Need new approaches for computing with high dimensional tensors  Currentsoftware—8 or fewer  Emerging methods require 3N dimensions where N (the number of electrons) may be O(100) or more.  Need common framework of reusable software elements.
  • 17. Challenges for high-rank tensors  Challenges  Develop robust implementations of algorithms  Standardize data structures and algorithms, APIs, software elements, and frameworks  Automate the derivation, transformation, and implementation of tensor expressions.  Will require cross-disciplinary collaborations  Infrastructure will include DSL, runtimes, compilers as well as static and dynamic algorithm analyses
  • 18. Protocols for Information Exchange and Code Interoperability  Historically culture has been competitive rather than collaborative  Sharing of code and data limited  Theoretical methods driving code towards greater size and complexity  Currently, progress may require substantial duplication of effort—code that could be reused is not due to lack of standards  Impedes new science, wastes human labor
  • 19. Information Sharing  Standards for data shared between codes  Standards (or methods to convert between standards) for stored data to facilitate mining.  Data provenance  Cannot expect all code to use the same format, but transformation leads to errors, computational inefficiencies, complex interfaces
  • 20. Code Sharing  Need to establish level of interoperability  Coarse-grained  Hartree-Fock code from one app, MP2 code from another  Fine-grained  Calculate most of one electron contributions to a Fock matrix in one program, relativistic and solvent terms in another  Need architecture
  • 21. Education and Training  Mastering existing codes is daunting task for grad students  Chemical education culture worse than many STEM fields  PhD only requires modest coursework  OK for most fields of chemistry where undergrad training is adequate preparation for hands-on lab research  Most students unprepared for research in computational chemistry  Ad hoc training by individual research groups is innefficient  How should students be prepared for 2020 and beyond?  Programming models and tools  Multidisciplinary foundation to computational science  Reasoning about software  Manipulating software with confidence and facility
  • 22. Summer school  Summer school for grad students supported by community  Fundamental algorithms of computational chemistry  Software best practices
  • 23. If S 2I2C2M2 is successful  Open access to new software tools and infrastructure  Training and educational opportunities for grad students and post-docs in  Algorithms  Code standards  Software best practices  NOT the goal to produce monolithic computational chemistry package to replace existing ones  Healthy competition is good  Sets of robust and properly validated software components that can be shared will benefit the entire community