1. Early Successes Debugging with TotalView on the Intel Xeon Phi Coprocessor
Chris Gottbrath
Principal Product Manager
June 19th, 2013
2. Rogue Wave Today
History
• Founded: 1989
• Acquired by Audax Group: 2012
• Acquired:
– TotalView Technologies: 2009
• 40 years of experience in HPC
Customers
• 3,000+ customers in 36 countries
• Multiple sectors:
– Financial services
– Telecom
– Oil and gas
– Government and aerospace
– Research and academic
The largest independent provider of cross-platform
software development tools and embedded components
for the next generation of HPC applications.
Highlights
• Pioneers in C++/object-oriented development
• Leading the way in cross-platform, parallel development
| Copyright © 2013 Rogue Wave Software | All Rights Reserved
3. What is TotalView®?
• Application Analysis and Debugging Tool: Code Confidently
– Debug and analyse C/C++ and Fortran on Linux, Unix, or Mac OS X
– Runs on everything from laptops to supercomputers (such as Cray® XC)
– Makes developing, maintaining, and supporting critical apps easier and less risky
• Major Features
– Easy to learn graphical user interface with data visualization
– Parallel Debugging
• MPI, Pthreads, OpenMP™
• Intel® Xeon Phi™ coprocessor
– Includes a Remote Display Client which frees you to work from anywhere
– Memory Debugging with MemoryScape™
– Deterministic Replay Capability Included on Linux™/x86-64
– Non-interactive Batch Debugging with TVScript and the CLI
– Type Transformation Facility (TTF) and C++View to transform how user-defined objects are displayed
4. TotalView for the Intel Xeon Phi coprocessor
• Supports Multiple Intel Xeon Phi coprocessor configurations
– Native Mode
• With MPI
– Offload Directives
• Similar to GPU
– Multi-device
– Multi-node
• Certain configurations
– CS300-AC, Future XC30
• User Interface
– MPI Debugging Features
• Process Control
• View Across
• Shared Breakpoints
– Heterogeneous Debugging
• Debug Both Xeon and Intel Xeon Phi Processes
5. The Beacon Project
Beacon – Phase 1
Cray CS300-AC Cluster Supercomputer
Nodes: 2 service, 16 compute
Interconnect: FDR IB Fat Tree
CPU model: Intel Xeon E5-2670
CPUs per node: 2 x 8-core, 2.6 GHz
RAM per node: 64 GB
SSD per node: 80 GB
Intel® Xeon Phi™ coprocessors per node: 2 x pre-production (50+ cores, 8 GB GDDR5 RAM)

Beacon – Phase 2
Cray CS300-AC Cluster Supercomputer
Nodes: 4 service, 6 I/O, 48 compute
Interconnect: FDR IB Fat Tree
CPU model: Intel Xeon E5-2670
CPUs per node: 2 x 8-core, 2.6 GHz
RAM per node: 256 GB
SSD per node: 2 x 480 GB (compute), 16 x 300 GB (I/O)
Intel® Xeon Phi™ coprocessors per node: 4 x 5110P (60-core, 1.053 GHz, 8 GB GDDR5 RAM)

• Funded by NSF to port and optimize scientific codes to the Intel® Xeon Phi™ coprocessor
• State-funded expansion focuses on energy efficiency, big data applications, and industry
• Example Codes: PSC, H3D, OMEN, ENZO, MADNESS, NWCHEM, Amber, MILC, and MAGMA
This material is based upon work supported by the National Science Foundation under Grant Number 1137097.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily
reflect the views of the National Science Foundation.
6. Beacon Project
• 210 Tflops, 11,520 Xeon Phi Coprocessor Cores
• #1 on the Green 500
• 9 scientific teams optimizing apps including: MHD, Plasma, Cosmology, Chemistry, QCD, Bio-informatics…
• Porting & optimization
– Hybrid MPI + OpenMP
– Many more threads than previous paradigms
• Subtle issues might present themselves
– And did
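The headline figures on this slide can be cross-checked against the Phase 2 configuration. The sketch below recomputes the coprocessor core count and an estimated peak double-precision rate. The flops-per-cycle values (16 for a coprocessor core, 8 for a host core) are assumptions that do not appear in the slides, and the slides do not state whether 210 Tflops is a theoretical peak, so treat the second number as a plausibility check only.

```python
# Beacon Phase 2 figures as listed on the configuration slide.
COMPUTE_NODES = 48
PHI_PER_NODE = 4
PHI_CORES = 60
PHI_GHZ = 1.053
CPUS_PER_NODE = 2
CPU_CORES = 8
CPU_GHZ = 2.6

# ASSUMPTIONS (not from the slides): peak double-precision flops
# per core per cycle for the coprocessor and for the host CPU.
PHI_FLOPS_PER_CYCLE = 16
CPU_FLOPS_PER_CYCLE = 8


def phi_core_count():
    """Total coprocessor cores across all compute nodes."""
    return COMPUTE_NODES * PHI_PER_NODE * PHI_CORES


def estimated_peak_tflops():
    """Estimated peak rate of coprocessors plus host CPUs, in Tflops."""
    phi_gflops = phi_core_count() * PHI_GHZ * PHI_FLOPS_PER_CYCLE
    cpu_cores = COMPUTE_NODES * CPUS_PER_NODE * CPU_CORES
    cpu_gflops = cpu_cores * CPU_GHZ * CPU_FLOPS_PER_CYCLE
    return (phi_gflops + cpu_gflops) / 1000.0


if __name__ == "__main__":
    print(phi_core_count())
    print(round(estimated_peak_tflops(), 2))
```

Under these assumptions the core count matches the 11,520 quoted above, and the estimate lands close to the quoted 210 Tflops.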
7. Debugging on Beacon with TotalView
• OpenMP Hybridization of Boltzmann BGK
– Correctness issues came up with the OpenMP code
– Troubleshooting with TotalView
• Native mode debugging on the Xeon Phi
• Thread level examination of the OpenMP region
• Comparison of data between threads
• … Clarified otherwise puzzling results
– Developers were able to resolve the correctness issue
– Ultimately obtained performance gains on the Xeon Phi
• And correct results
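The slides do not say what the actual defect in the BGK code was. As an illustration of why comparing data between threads can clarify puzzling results, the sketch below replays one fixed interleaving of two threads that each write a scratch value and then read it back. It is a deterministic simulation in plain Python, not OpenMP code, and the function and variable names are hypothetical. When the scratch value is shared, one thread's write clobbers the other's; when each thread has its own copy, the results are correct.

```python
def replay(inputs, schedule, shared_scratch):
    """Replay a fixed interleaving of two-step 'threads'.

    schedule is a list of (thread_id, step) pairs. A "write" step stores
    2 * input in the scratch slot; a "read" step stores scratch + 1 as
    that thread's output.
    """
    n = len(inputs)
    # One slot in total if the scratch is shared, one slot per thread if private.
    scratch = [0] * (1 if shared_scratch else n)
    outputs = [None] * n
    for tid, step in schedule:
        slot = 0 if shared_scratch else tid
        if step == "write":
            scratch[slot] = 2 * inputs[tid]
        else:
            outputs[tid] = scratch[slot] + 1
    return outputs


inputs = [10, 20]
# Both threads write before either one reads.
interleaved = [(0, "write"), (1, "write"), (0, "read"), (1, "read")]

if __name__ == "__main__":
    print(replay(inputs, interleaved, shared_scratch=True))   # thread 0 sees thread 1's value
    print(replay(inputs, interleaved, shared_scratch=False))  # each thread sees its own value
```

In a debugger, this kind of problem shows up as a variable that holds the same value in every thread even though each thread should have computed a different one.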
8. Debugging on Beacon with TotalView
• Porting Gyro Tokamak Plasma Simulation to the Xeon Phi
– Intermittent crash due to Out Of Memory (OOM) Condition
– Troubleshooting with TotalView
• TotalView was used to diagnose the issue across MPI processes
• Work was not being distributed evenly
• … the routine had an invalid assumption
– Developers were able to resolve the OOM error
– Better load balancing also improved the performance of the code
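The slides do not describe the invalid assumption in the Gyro routine. As a hypothetical illustration of how uneven work distribution can push a single process toward an out-of-memory condition, the sketch below compares two ways of splitting work items across ranks: a naive scheme in which the last rank absorbs the whole remainder, and a balanced scheme that spreads the remainder one item at a time. Both functions are illustrative and are not taken from Gyro or from any MPI library.

```python
def naive_counts(n_items, n_ranks):
    """Hypothetical scheme: equal shares, last rank takes all leftovers."""
    base = n_items // n_ranks
    counts = [base] * n_ranks
    counts[-1] += n_items - base * n_ranks
    return counts


def balanced_counts(n_items, n_ranks):
    """Spread the remainder so that no rank holds more than one extra item."""
    base, extra = divmod(n_items, n_ranks)
    return [base + 1 if rank < extra else base for rank in range(n_ranks)]


if __name__ == "__main__":
    n_items, n_ranks = 1000, 240
    print(max(naive_counts(n_items, n_ranks)))     # largest share, naive scheme
    print(max(balanced_counts(n_items, n_ranks)))  # largest share, balanced scheme
```

With 1000 items over 240 ranks, the naive scheme leaves one rank holding 44 items while the others hold 4. If memory use scales with the item count, that rank is the first to run out, which matters on a coprocessor with only 8 GB of memory shared by all of its processes.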
9. TotalView for Xeon Phi
NICS had a goal to port and optimize scientific codes for the many-core Xeon Phi coprocessor.
TotalView helped the Beacon project developers troubleshoot issues that came up during the
process.
There are many other scientific codes that benefit from the power of Intel Xeon Phi by adopting a
hybrid MPI + OpenMP architecture and tuning for the right number of threads per process.
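To give a sense of the tuning space, the sketch below lists every way to split one coprocessor's hardware threads into MPI processes times OpenMP threads per process. It assumes 4 hardware threads per core, which is not stated in the slides, and it assumes every hardware thread is used. The function name is hypothetical, and which split performs best depends on the application.

```python
PHI_CORES = 60        # from the Phase 2 configuration (5110P)
THREADS_PER_CORE = 4  # ASSUMPTION: hardware threads per core, not from the slides


def decompositions(hw_threads):
    """Return all (processes, threads_per_process) pairs that use every hardware thread."""
    return [
        (procs, hw_threads // procs)
        for procs in range(1, hw_threads + 1)
        if hw_threads % procs == 0
    ]


if __name__ == "__main__":
    for procs, threads in decompositions(PHI_CORES * THREADS_PER_CORE):
        print(f"{procs:3d} processes x {threads:3d} threads")
```

Each pair trades per-process memory and MPI overhead against OpenMP scaling inside a process, so in practice only a few of them are worth benchmarking.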
TotalView is now generally available with support for the Intel Xeon Phi and can help other scientists
take advantage of the power of Intel many-core technology.
Please visit us here at ISC Booth 550 to learn more!