Get Performance on Intel® Xeon Phi™ with
Allinea MAP and Allinea DDT
Discovering bottlenecks without pain
In my Parallel Universe…

… we develop new antibiotics faster than
bacteria develop resistance
… every household can prototype and evolve
their own 3D-printed designs
… accurate simulation of the natural world is
taken for granted
So I decided to…
… create parallel development tools for scientists:

We’re accelerating the pace of scientific progress
HPC on the critical path to progress
Figure: performance against time (years) across the Single Core, Multi-Core and Many-Core eras. Constraints named on the chart: power, complexity of algorithms, parallel software availability, scalability, programming models.
Allinea MAP
Increase application performance
• Parallel profiler designed for:
‒ C/C++, Fortran
‒ MPI code
  ▪ Interdependent or independent processes
‒ Multithreaded code
  ▪ Monitor the main threads for each process
‒ Accelerated codes
  ▪ GPUs, Intel® Xeon Phi™
• Improve productivity:
‒ Helps you detect performance issues quickly and easily
‒ Tells you immediately where your time is spent in your source code
‒ Helps you to optimize your application efficiently
Allinea MAP 4.2
New features in 2013
• Support for I/O metrics
‒ I/O can be a major bottleneck in HPC systems
‒ Find the optimal configuration for your file system.
Benefit : Broader profiling and analysis capabilities to solve
even more performance issues.

• Support for Intel® Xeon Phi™
‒ Already supported on Allinea DDT
‒ Officially extended to profiling
Benefit : Ensure you are getting the best performance from
new technology.
Optimizing for Intel® Xeon Phi™
Where do you start?

“Code that’s well-optimized for the host
usually performs pretty well on the cards”
- Almost everybody
Optimizing for Intel® Xeon Phi™
But what matters?

Figure: performance, split between vectorization and other stuff.
Optimizing for Intel® Xeon Phi™
Is my code well-vectorized?

… maybe?
Allinea Performance Reports
Is my code well-vectorized?
Optimizing for Intel® Xeon Phi™
Is my code well-vectorized?

… maybe?
Optimizing for Intel® Xeon Phi™
Is my code well-vectorized?

Not in this loop
(16.5% of total time)

… maybe?
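
As a rough illustration of why a single unvectorized loop at 16.5% of total time is worth attention, the sketch below applies Amdahl's law to that figure. The function name and the loop speedups tried (2x, 8x, 16x) are assumptions chosen for illustration; they are not measurements from the talk or output from Allinea MAP.

```python
def overall_speedup(fraction: float, loop_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when `fraction` of the runtime
    is accelerated by a factor of `loop_speedup`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if loop_speedup <= 0.0:
        raise ValueError("loop_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / loop_speedup)


LOOP_FRACTION = 0.165  # from the slide: the loop takes 16.5% of total time

# Hypothetical loop speedups, picked only to show the trend.
for s in (2.0, 8.0, 16.0):
    gain = overall_speedup(LOOP_FRACTION, s)
    print(f"loop {s:4.0f}x faster -> program {gain:.3f}x faster")

# Upper bound: the gain if the loop's cost disappeared entirely.
print(f"upper bound: {1.0 / (1.0 - LOOP_FRACTION):.3f}x")
```

The upper bound shows the most this one loop can contribute, which is a quick way to decide whether it deserves the effort before asking the compiler for a report.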
Allinea DDT
Unified interface for debugging
• Full, graphical debugger designed for :
‒ C/C++, Fortran, Intel® Xeon Phi™, UPC, …

‒ MPI, OpenMP and mixed-mode code

• Unified interface with Allinea MAP :
‒ Just what you need when you’ve added OpenMP and now everything segfaults!
‒ One interface eliminates learning curve
‒ Spend more time on your results

• Slash your time to develop :
‒ Reproduces and triggers your bugs instantly
‒ Helps you quickly understand where issues come from
‒ Helps you to fix them as swiftly as possible
Allinea at the forefront of science
with COSMOS and Intel® Xeon Phi™

“While I was porting CAMB to offload certain parts of it to Intel®
Xeon Phi™, I wasted weeks debugging it because the offloads
were basically opaque. I only had print statements to help me.”
Allinea at the forefront of science
with COSMOS and Intel® Xeon Phi™

“Using DDT's new offload debugging I can now look at the offload
code and look at the state of the array on the Intel® Xeon Phi™
side before it is manipulated”
Allinea at the forefront of science
with COSMOS and Intel® Xeon Phi™

“Fix is easy - either set NOCOPY->IN or just set the thing
to zero on the MIC side which is probably cheaper.”
Allinea at the forefront of science
with COSMOS and Intel® Xeon Phi™

“I’m now using MAP – it shows that the code is fairly well vectorised at 70%.
This will have to be improved a bit to get the most out of the coprocessors.”
Allinea Software
• Ten years of high-quality development tools
‒ Leading in HPC software tools market worldwide
‒ Global customer base
• Making parallel programming accessible to the widest range of
scientists and programmers

‒ Design an unrivaled productive and easy-to-use development
environment…
‒ … To help you reach the highest level of performance and scalability

‒ Define a new standard of customer support
Summary
The premier Intel® Xeon Phi™ development environment from Allinea
– Is your code ready for Intel® Xeon Phi™? Run a Performance Report!
– See which loops are important to vectorize with Allinea MAP
– Stay productive with full profiling and debugging on both host and
coprocessor

– Powerful unified interface with industry-leading technical support to help
you get the job finished faster

Visit us at our booth #1719 to see this in action!
Enter our Performance Reports competition to
win a Kindle Fire every day!

HPC Performance & Development Tuning tools for scientists to go parallel faster with allinea

Editor's notes

  1. It’s not often that marketing lives up to its hype, but something we’ve consistently heard from users around the world porting their codes to Xeon Phi is that – once they’ve done a good job of optimizing for the host – the performance on the Phi is normally pretty good right away.
  2. The reason is that even on a standard Xeon these days, you need to take advantage of vectorized instructions to get good performance. With 512-bit registers, vectorization is absolutely critical to achieving good performance on the Xeon Phi. There’s no point in sending all the cars down one lane of the highway!
  3. The Intel compilers can give very detailed reports about what they’re doing to each loop using the -vec-report flags, but even on a small program you need to know which loops are worth spending your time on and which you can ignore.
  4. Allinea MAP shows you the behavior of your code at a single glance – let me briefly walk you through the interface here. <Talk about how to interpret the metric graphs and the sparkline graphs next to the code viewer. Finish by pointing out that the CPU floating-point vector graph is at 0 for the selected region of time!>
  5. This is Allinea MAP’s answer to our question – there’s an important loop taking 16.5% of the total program time that isn’t vectorizing at all! Now that we know which lines of code are affected, we can ask the compiler for a report and investigate further.
  6. It’s not just profiling that works the same – our unified interface is shared with Allinea DDT, a full-featured debugger supporting a huge range of platforms and codes including the Xeon Phi.
  7. You can’t achieve full performance by looking through a microscope all the time – you have to be able to step back from the quest to vectorize the next loop, and the next, and ask “is this worth it? Is there a library I can use here? Can I refactor my code here?” MAP gives you the oversight and insight you need to answer these questions.
  8. And when you come to run your code on the card, Allinea MAP gathers exactly the same information and displays it in exactly the same way.
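
To make the 512-bit register figure in the notes above concrete, here is a minimal sketch of how many elements fit in one vector register. The function name is hypothetical, and the element widths assume standard IEEE 754 single precision (32 bits) and double precision (64 bits).

```python
def vector_lanes(register_bits: int, element_bits: int) -> int:
    """Number of elements of `element_bits` that fit in one vector register."""
    if register_bits <= 0 or element_bits <= 0:
        raise ValueError("bit widths must be positive")
    if register_bits % element_bits != 0:
        raise ValueError("register width must be a multiple of element width")
    return register_bits // element_bits


REGISTER_BITS = 512  # register width quoted in the notes

# Assumed element widths: IEEE 754 single and double precision.
for name, bits in (("single precision", 32), ("double precision", 64)):
    lanes = vector_lanes(REGISTER_BITS, bits)
    print(f"{name}: {lanes} elements per {REGISTER_BITS}-bit register")
```

A scalar loop uses one of those lanes per instruction, which is the "one lane of the highway" picture from the notes. The lane count is an upper bound on per-instruction throughput gain, not a predicted speedup.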