A review of power & energy
consumption optimization in HPC

                   Rishi Pathak
                 riship@cdac.in
 National PARAM Supercomputing Facility, C-DAC, Pune


    Symposium on HPC Applications – IIT Kanpur
                 March 12 - 14, 2012
Top 10 – Top500
Top 10 – Green 500
[Figure: bar chart of performance per watt (GF/Watt, y-axis 0 to 2.5) for ranks 1-10 of the Green 500 and of the Top 500 lists; GPU-accelerated systems are marked "GPU"]
Exascale system
• Likely to be feasible by 2017±2
• 10-100 Million processing elements (cores or mini-cores)
• Chips perhaps as dense as 1,000 cores per socket
• Clock rates will grow more slowly
• Large-scale optics based interconnects
• 10-100 PB of aggregate memory
• Performance per watt ~ 100 GF/watt sustained
  performance
• 10 – 100 MW Exascale system
Power & Energy
   E = P * T
   Energy (E) consumed over time (T) at average power (P)
   Minimizing the run time limits the energy consumed
   Each application has a minimum achievable T, determined by
       Mapping of the application to the cluster system
       Scalability & system bottlenecks
   Beyond that minimum: power management approaches
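A small worked example of E = P * T may help here. The node power figures and run times below are hypothetical, chosen only to show that a lower average power does not guarantee lower energy if the run takes longer.

```python
def energy_joules(avg_power_watts, seconds):
    """E = P * T, with P in watts and T in seconds."""
    return avg_power_watts * seconds


def joules_to_kwh(joules):
    """Convert joules to kilowatt-hours (1 kWh = 3.6e6 J)."""
    return joules / 3.6e6


# Hypothetical node: a faster run at higher power vs. a slower run at lower power.
fast = energy_joules(avg_power_watts=300, seconds=2 * 3600)
slow = energy_joules(avg_power_watts=250, seconds=3 * 3600)
print(joules_to_kwh(fast), joules_to_kwh(slow))
```

With these assumed numbers the slower, lower-power run consumes more energy, which is why shortening T comes before power management.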
Power management techniques
   Static Power Management(SPM)
       Low power CPUs
       Local flash storage
       Suitable for data centric applications
   Dynamic Power Management(DPM)
       Software & power scalable components
       Dynamically adjust power consumption
       Frequency & Voltage scaling for CPU & memory
DVFS
   Dynamic Voltage & Frequency Scaling
   P = C * V^2 * f
   (P: dynamic power, C: switched capacitance, V: supply voltage, f: clock frequency)
   Throttle the CPU when
       The workload is not CPU bound
       The workload is not very CPU intensive
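The dynamic power relation can be evaluated directly. This is a minimal sketch; the capacitance, voltage and frequency values are illustrative assumptions, not measurements of any real processor.

```python
def dynamic_power(c_farads, volts, hertz):
    """Dynamic CPU power: P = C * V^2 * f."""
    return c_farads * volts ** 2 * hertz


# Assumed operating points (hypothetical values).
p_high = dynamic_power(c_farads=1e-9, volts=1.2, hertz=2.4e9)
p_low = dynamic_power(c_farads=1e-9, volts=1.0, hertz=1.6e9)
print(p_high, p_low, p_low / p_high)
```

Because voltage enters squared, lowering voltage together with frequency cuts power by more than the frequency reduction alone.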
DVFS Scheduling
   Off-line, trace-based scheduling
       Source code instrumentation for performance profiling
       Execution with profiling
       Determination of appropriate processor frequencies for
        each phase
       Source code instrumentation for DVFS scheduling


S. Huang & W. Feng – Proc. Cluster computing[IEEE/ACM](2009)
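The frequency-selection step can be sketched as follows. This is an assumed selection rule for illustration, not the method of the cited paper: for each profiled phase, keep the frequencies whose run time stays within an allowed slowdown of the fastest setting, then pick the one with the lowest energy. The profile data and the name `pick_frequencies` are hypothetical.

```python
def pick_frequencies(profile, max_slowdown):
    """Choose one frequency per phase.

    profile: {phase: {freq_ghz: (seconds, avg_watts)}} from profiled runs.
    max_slowdown: allowed fractional time increase per phase (e.g. 0.05).
    """
    schedule = {}
    for phase, runs in profile.items():
        f_max = max(runs)                  # highest available frequency
        t_max = runs[f_max][0]             # run time at full speed
        # Energy (E = P * T) of every setting that meets the time limit.
        energy = {
            f: t * p
            for f, (t, p) in runs.items()
            if t <= t_max * (1 + max_slowdown)
        }
        schedule[phase] = min(energy, key=energy.get)
    return schedule


# Hypothetical trace: a CPU-bound phase and a communication-bound phase.
profile = {
    "compute": {2.4: (10.0, 100.0), 1.6: (14.5, 70.0)},
    "communication": {2.4: (5.0, 90.0), 1.6: (5.1, 60.0)},
}
schedule = pick_frequencies(profile, max_slowdown=0.05)
print(schedule)
```

In this made-up trace only the communication phase tolerates the lower frequency, so only that phase is slowed down.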
DVFS Scheduling
   Run-time, profiling-based scheduling
       Time-window based performance prediction model
       No a priori information of application phases
       A wrong prediction can hurt performance or energy efficiency
       Metrics
            MIPS & CPU utilization
            Interception of MPI communication calls
            File I/O calls
            MPI receive wait cycles
       Shown to reduce energy within a pre-specified performance-loss
        constraint
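A time-window predictor of this kind might look like the sketch below. It assumes a simple moving average of CPU utilization as the prediction model, which is an illustrative choice and not the model of any tool named in this talk; the class name `WindowGovernor` and the frequency list are hypothetical.

```python
from collections import deque


class WindowGovernor:
    """Pick the next CPU frequency from utilization seen in recent windows."""

    def __init__(self, freqs_ghz, window=4):
        self.freqs = sorted(freqs_ghz)
        self.samples = deque(maxlen=window)   # keeps only the last `window` samples

    def observe(self, cpu_utilization):
        """Record utilization (0.0 to 1.0) measured over the last time window."""
        self.samples.append(cpu_utilization)

    def next_frequency(self):
        if not self.samples:
            return self.freqs[-1]             # no history yet: stay at full speed
        predicted = sum(self.samples) / len(self.samples)
        target = predicted * self.freqs[-1]   # frequency needed for predicted load
        for f in self.freqs:
            if f >= target:
                return f                      # lowest frequency that covers the load
        return self.freqs[-1]


gov = WindowGovernor([1.6, 2.0, 2.4])
gov.observe(0.5)
gov.observe(0.5)
print(gov.next_frequency())
```

If the next window turns out to be CPU bound, the governor has already slowed the processor, which is the misprediction cost noted above.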
DVFS Implementations
    Memory MISER (Management Infra-Structure for Energy
     Reduction)
    CPU MISER
    Linux CPUSPEED
    Ecod
    Beta-Algorithm
    M. E. Tolentino, J. Turner & K. W. Cameron – Proc. of the 4th international conference
    on Computing frontiers(2007)
    S. Huang & W. Feng – Proc. Cluster computing[IEEE/ACM](2009)
    C. Hsu & W. Feng - Proc. of the 2005 ACM/IEEE conference on Supercomputing
Enhancements in DVFS
   Dynamic Frequency Scaling per Core
       Each core runs at its own clock
       Power is linear with frequency
       Power savings are relatively small
   Separate power planes for the core and "uncore" part
    of the CPU
       Cores can go to sleep (C-state)
       Memory controller is still operational for external device
        (e.g. via DMA)
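Why frequency-only scaling saves little can be seen from P = C * V^2 * f. The sketch below assumes a CPU-bound job whose run time grows as 1/f and ignores static power; the scaling ratios are hypothetical.

```python
def relative_power(freq_ratio, volt_ratio=1.0):
    """Dynamic power relative to the nominal point, from P = C * V^2 * f."""
    return volt_ratio ** 2 * freq_ratio


def relative_energy(freq_ratio, volt_ratio=1.0):
    """Dynamic energy for a fixed amount of CPU-bound work (time ~ 1 / f)."""
    return relative_power(freq_ratio, volt_ratio) / freq_ratio


# Frequency scaled to 75%, with and without an assumed 15% voltage reduction.
print(relative_power(0.75), relative_energy(0.75))
print(relative_power(0.75, 0.85), relative_energy(0.75, 0.85))
```

Under these assumptions, scaling frequency alone lowers power linearly but leaves the dynamic energy of a CPU-bound job unchanged; the energy drops only when the voltage drops as well.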
Enhancements in DVFS
   Clock gating
       Clock disabled sleep state (AMD-C1,E1, Intel-
        C[0,1,3,6])
       At the CPU block level
       At the core level
       Reduces dynamic power
   Power Gating
       Power to CPU/core cut off (~0V)
       Reduces both dynamic and static(leakage) power
Nehalem core sleep states
AMD's and Intel's techniques
Power optimization at NPSF
   Scheduler capable of:
       Powering off a node after a pre-specified period of idleness
        (no job)
       Power optimization with QoS (turnaround time)
       Node power-on time (2-3 min) is an additional delay
   Targeted power policies
       Aggressive optimization w/o regard to QOS
       Power capping
       Power budget
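The idle power-off rule can be sketched as below. The node names and idle times are hypothetical, and the function is an illustration rather than the NPSF scheduler's code; the threshold plays the role of a parameter such as NODEIDLEPOWERTHRESHOLD.

```python
def nodes_to_power_off(idle_minutes, threshold_minutes):
    """Return the nodes that have had no job for at least the threshold.

    idle_minutes: {node_name: minutes since the node last ran a job}
    """
    return sorted(
        node for node, idle in idle_minutes.items() if idle >= threshold_minutes
    )


# Hypothetical cluster snapshot.
snapshot = {"node01": 10, "node02": 3, "node03": 6}
print(nodes_to_power_off(snapshot, threshold_minutes=6))
```

A lower threshold powers nodes off sooner, at the price of more frequent 2-3 minute power-on delays for incoming jobs.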
Power optimization at NPSF
   Node packing via checkpointing, migration & restart
       MPI with BLCR – one approach
       Use of virtualization – another approach
       Considerations –
            Remaining walltime of job being migrated
            Remaining walltime of jobs on node in consideration
            Associated cost of migration against power savings expected to
             be achieved
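The considerations above can be combined into a rough cost-benefit check. This is a toy model under stated assumptions: the source node is powered off once the job has moved, the destination node may have to stay on longer than its own jobs need, and migration draws a fixed extra power. All parameter names and values are hypothetical.

```python
def migration_pays_off(job_remaining_s, dest_remaining_s, migration_s,
                       node_power_w, migration_overhead_w):
    """Return True if the expected energy saved exceeds the migration cost."""
    # Time the source node can stay off: the job's remaining walltime
    # minus the time spent checkpointing, moving and restarting it.
    off_time = job_remaining_s - migration_s
    if off_time <= 0:
        return False
    saved = node_power_w * off_time
    # Extra time the destination stays on beyond its own jobs' walltime.
    extra_on = max(0, job_remaining_s - dest_remaining_s)
    cost = migration_overhead_w * migration_s + node_power_w * extra_on
    return saved > cost


print(migration_pays_off(3600, 7200, 120, 300, 50))
print(migration_pays_off(100, 7200, 120, 300, 50))
```

A job about to finish is not worth moving, since the migration takes longer than the time the source node could be switched off.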
Saving Potential
Simulation Result - Plot
Simulation Results - Table
 Parameter                              Case I   Case II   Case III
 Power saving (%)                        4.05     4.22      9.29
 NODEIDLEPOWERTHRESHOLD (minutes)        8        6         4
Power optimization at NPSF
   Feedback driven policy engine
       Speculative power on/off of nodes at any given time
       Metrics/deciding factors
            Job arrival times & resource requirements
            Number of nodes needed at a given time
            Current and probable cluster utilization at a given time
       Expected starttime of jobs in queue
       Minimize impact on turnaround time of job
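Speculative power-on from expected start times can be sketched as below. It assumes the scheduler already provides an expected start time and node count for each queued job, and uses a 180-second boot time taken from the 2-3 minute power-on figure; the function name and job data are hypothetical.

```python
def power_on_schedule(queued_jobs, boot_seconds=180):
    """Plan when to power nodes on so they are ready at each job's start.

    queued_jobs: list of (expected_start_s, nodes_needed), times in seconds from now.
    Returns a time-ordered list of (power_on_time_s, nodes_needed).
    """
    return sorted(
        (max(0, start - boot_seconds), nodes) for start, nodes in queued_jobs
    )


# Hypothetical queue: one job expected in 10 minutes, one in under 2 minutes.
print(power_on_schedule([(600, 4), (100, 2)]))
```

Starting the boot ahead of the expected start time keeps the power-on delay out of the job's turnaround time, provided the start-time estimate holds.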
Job Arrival Time
PARAM Yuva – Access & Account
https://yuva.cdac.in/
Technical Affiliation Scheme
Thank You
npsfhelp@cdac.in

Powerful Google developer tools for immediate impact! (2023-24 C)
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 

Symposium on HPC Applications – IIT Kanpur

  • 1. A review of power & energy consumption optimization in HPC Rishi Pathak riship@cdac.in National PARAM Supercomputing Facility, C-DAC, Pune Symposium on HPC Applications – IIT Kanpur March 12 - 14, 2012
  • 2. Top 10 – Top500
  • 3. Top 10 – Green 500
  • 4. [Chart: GF/Watt for Green 500, Rank 1-10, compared with Top 500, Rank 1-10; GPU-based systems are marked "GPU"]
  • 5. Exascale system • Likely to be feasible by 2017±2 • 10-100 million processing elements (cores or mini-cores) • Chips perhaps as dense as 1,000 cores per socket • Clock rates will grow more slowly • Large-scale optics-based interconnects • 10-100 PB of aggregate memory • Performance per watt ~ 100 GF/watt sustained performance • 10 – 100 MW Exascale system
  • 6. Power & Energy • E = P * T • Energy (E) consumed in time (T) with average power (P) • Minimizing time interval will limit energy • A minimum value of T for an application • Mapping of application to cluster system • Scalability & system bottlenecks • Beyond that – Power management approaches
  • 7. Power management techniques • Static Power Management (SPM) • Low power CPUs • Local flash storage • Suitable for data centric applications • Dynamic Power Management (DPM) • Software & power scalable components • Dynamically adjust power consumption • Frequency & voltage scaling for CPU & memory
  • 8. DVFS • Dynamic Voltage & Frequency Scaling • P = C * V^2 * f (C: switched capacitance, V: supply voltage, f: clock frequency) • Throttle when the workload is not CPU bound or is not very CPU intensive
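
The relations on slides 6 and 8, E = P * T and P = C * V^2 * f, can be combined in a small worked sketch. The capacitance, voltages, frequencies and run time below are illustrative assumptions, not figures from the talk. The sketch also assumes that a CPU-bound phase takes time proportional to 1/f, that a phase which is not CPU bound takes the same time at either frequency, and that static (leakage) power can be ignored.

```python
def dynamic_power(c, v, f):
    """Dynamic CPU power P = C * V^2 * f (watts, for C in farads, V in volts, f in hertz)."""
    return c * v * v * f


def energy(power_w, seconds):
    """Energy E = P * T (joules)."""
    return power_w * seconds


# Hypothetical operating points: (voltage in volts, frequency in hertz).
C = 1.0e-9
HIGH = (1.2, 2.0e9)
LOW = (1.0, 1.0e9)

p_high = dynamic_power(C, *HIGH)
p_low = dynamic_power(C, *LOW)

t_high = 100.0  # assumed run time in seconds at the high operating point

# CPU-bound phase: halving the frequency doubles the run time.
e_cpu_high = energy(p_high, t_high)
e_cpu_low = energy(p_low, t_high * HIGH[1] / LOW[1])

# Phase that is not CPU bound: run time is assumed unchanged.
e_other_low = energy(p_low, t_high)

print(f"power: {p_high:.2f} W at high, {p_low:.2f} W at low")
print(f"CPU-bound energy: {e_cpu_high:.0f} J -> {e_cpu_low:.0f} J")
print(f"non-CPU-bound energy: {e_cpu_high:.0f} J -> {e_other_low:.0f} J")
```

Under these assumptions the saving is much larger for the phase that is not CPU bound, because the lower power is not offset by a longer run time.
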
  • 9. DVFS Scheduling • Off-line, trace-based scheduling • Source code instrumentation for performance profiling • Execution with profiling • Determination of appropriate processor frequencies for each phase • Source code instrumentation for DVFS scheduling • Reference: S. Huang & W. Feng – Proc. Cluster computing [IEEE/ACM] (2009)
  • 10. DVFS Scheduling • Run-time, profiling-based scheduling • Time-window based performance prediction model • No a priori information about application phases • A false prediction will have dire consequences for performance or energy efficiency • Metrics • MIPS & CPU utilization • Interception of MPI communication calls • File I/O calls • MPI receive wait cycles • Shown to reduce energy within a pre-specified performance loss constraint
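
A minimal sketch of how a run-time scheduler might pick a frequency for the next time window under a pre-specified performance loss constraint. The linear slowdown model, the `beta` parameter (a hypothetical per-window estimate of how CPU bound the code is, from 0 to 1) and the frequency list are assumptions of this sketch. It is not a description of any of the schedulers cited in the talk, and how `beta` would be estimated from the metrics above is left out.

```python
def predicted_slowdown(beta, f, f_max):
    """Assumed model: relative time increase when running at f instead of f_max.

    beta = 1 means run time scales fully with 1/f (CPU bound);
    beta = 0 means run time does not depend on CPU frequency.
    """
    return beta * (f_max / f - 1.0)


def pick_frequency(beta, freqs, max_loss):
    """Return the lowest frequency whose predicted slowdown stays within max_loss."""
    f_max = max(freqs)
    allowed = [f for f in freqs if predicted_slowdown(beta, f, f_max) <= max_loss]
    return min(allowed)  # f_max always qualifies when max_loss >= 0


FREQS_HZ = [1.0e9, 1.5e9, 2.0e9]  # hypothetical available frequencies

for beta in (1.0, 0.1, 0.0):
    chosen = pick_frequency(beta, FREQS_HZ, max_loss=0.05)
    print(f"beta={beta}: run next window at {chosen / 1e9:.1f} GHz")
```

A wrong `beta` estimate shows the risk named on the slide: too low and the performance constraint is violated, too high and the energy saving is lost.
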
  • 11. DVFS Implementations • Memory MISER (Management Infra-Structure for Energy Reduction) • CPU MISER • Linux CPUSPEED • Ecod • Beta-Algorithm • References: M. E. Tolentino, J. Turner & K. W. Cameron – Proc. of the 4th international conference on Computing frontiers (2007); S. Huang & W. Feng – Proc. Cluster computing [IEEE/ACM] (2009); C. Hsu & W. Feng – Proc. of the 2005 ACM/IEEE conference on Supercomputing
  • 12. Enhancements in DVFS • Dynamic Frequency Scaling per Core • Each core runs at its own clock • Power is linear with frequency • Power savings are relatively small • Separate power planes for the core and "uncore" part of the CPU • Cores can go to sleep (C-state) • Memory controller is still operational for external devices (e.g. via DMA)
  • 13. Enhancements in DVFS • Clock gating • Clock disabled sleep state (AMD-C1,E1, Intel-C[0,1,3,6]) • At the CPU block level • At the core level • Reduces dynamic power • Power gating • Power to CPU/core cut off (~0V) • Reduces both dynamic and static (leakage) power
  • 15. AMD's and Intel's techniques
  • 16. Power optimization at NPSF • Scheduler capable of: • Powering off a node after a pre-specified period of idleness (no job) • Power optimization with QoS (turnaround time) • Node power-on time (2-3 min) is additional • Targeted power policies • Aggressive optimization without regard to QoS • Power capping • Power budget
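
The idle power-off rule on slide 16 can be sketched as follows. This is a minimal illustration, not the NPSF scheduler: the `Node` record, the `min_nodes_on` reserve (a stand-in for protecting turnaround time, given the 2-3 minute power-on delay) and all values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    idle_minutes: float  # time since the node last ran a job; 0 if busy
    powered_on: bool = True


def nodes_to_power_off(nodes, idle_threshold_minutes, min_nodes_on=0):
    """Pick powered-on nodes idle for at least the threshold, longest idle first.

    Never selects so many nodes that fewer than min_nodes_on stay powered on.
    """
    on = [n for n in nodes if n.powered_on]
    candidates = sorted(
        (n for n in on if n.idle_minutes >= idle_threshold_minutes),
        key=lambda n: n.idle_minutes,
        reverse=True,
    )
    allowed = max(0, len(on) - min_nodes_on)
    return [n.name for n in candidates[:allowed]]


cluster = [
    Node("n1", idle_minutes=10),
    Node("n2", idle_minutes=3),
    Node("n3", idle_minutes=7),
    Node("n4", idle_minutes=50, powered_on=False),  # already off, ignored
]

print(nodes_to_power_off(cluster, idle_threshold_minutes=6))
print(nodes_to_power_off(cluster, idle_threshold_minutes=6, min_nodes_on=2))
```

Lowering the threshold powers nodes off sooner, at the price of more jobs waiting for a node to boot.
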
  • 17. Power optimization at NPSF • Node packing via checkpointing, migration & restart • MPI with BLCR – one approach • Use of virtualization – another approach • Considerations – • Remaining walltime of job being migrated • Remaining walltime of jobs on node in consideration • Associated cost of migration against power savings expected to be achieved
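
The three considerations on slide 17 can be weighed in a simple cost model. The sketch below is an assumption-laden illustration, not the method used at NPSF: it assumes one job moves from a source node to a target node, that both nodes draw the same base power while on, that a node is powered off as soon as it has no job, and that the energy cost of checkpoint, transfer and restart is known. All parameter names and numbers are hypothetical.

```python
def net_energy_saving_j(job_remaining_s, target_busy_s, migration_s,
                        node_base_power_w, migration_energy_j):
    """Estimated energy saved (joules) by migrating a job so its node can power off.

    job_remaining_s: remaining walltime of the job being migrated
    target_busy_s:   remaining walltime of jobs already on the target node
    migration_s:     time to checkpoint, move and restart the job
    """
    # Without migration: both nodes stay on until their own jobs finish.
    on_without = job_remaining_s + target_busy_s
    # With migration: the source stays on during migration; the target stays on
    # until both its own jobs and the migrated job have finished.
    on_with = migration_s + max(target_busy_s, migration_s + job_remaining_s)
    return node_base_power_w * (on_without - on_with) - migration_energy_j


def should_migrate(**kwargs):
    """Migrate only if the expected saving outweighs the migration cost."""
    return net_energy_saving_j(**kwargs) > 0


long_job = dict(job_remaining_s=3600, target_busy_s=7200, migration_s=120,
                node_base_power_w=200, migration_energy_j=50_000)
short_job = dict(job_remaining_s=300, target_busy_s=100, migration_s=120,
                 node_base_power_w=200, migration_energy_j=50_000)

print(should_migrate(**long_job), net_energy_saving_j(**long_job))
print(should_migrate(**short_job), net_energy_saving_j(**short_job))
```

In this model, migration pays off when the migrated job has a long remaining walltime and the target node would be running its own jobs for at least as long.
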
  • 20. Simulation Results - Table
      Parameter                             | Case I | Case II | Case III
      Power saving (in percentage)          | 4.05   | 4.22    | 9.29
      NODEIDLEPOWERTHRESHOLD (in minutes)   | 8      | 6       | 4
  • 21. Power optimization at NPSF • Feedback driven policy engine • Speculative power on/off of nodes at any given time • Metrics/deciding factors • Function of job arrival times & resource requirements • How many nodes at what time • Current and probable cluster utilization at a given time – another metric • Expected start time of jobs in queue • Minimize impact on turnaround time of jobs
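
One way to read "how many nodes at what time" on slide 21 is as a look-ahead over the expected start times of queued jobs. The sketch below is a hypothetical illustration of that idea only: the queue format, the boot-time look-ahead and the function name are assumptions, and the feedback part of the policy engine is not modelled.

```python
def nodes_to_power_on(queued_jobs, now_min, boot_min, idle_nodes_on):
    """Number of nodes to start booting now so queued jobs do not wait for a boot.

    queued_jobs:   list of (expected_start_min, nodes_needed) tuples
    boot_min:      node power-on time in minutes
    idle_nodes_on: nodes already powered on and free
    """
    horizon = now_min + boot_min
    # Jobs expected to start before a node booted now would be ready.
    needed = sum(nodes for start, nodes in queued_jobs if start <= horizon)
    return max(0, needed - idle_nodes_on)


queue = [(5, 4), (2, 2), (30, 8)]  # hypothetical (expected start, nodes) pairs

print(nodes_to_power_on(queue, now_min=0, boot_min=3, idle_nodes_on=1))
print(nodes_to_power_on(queue, now_min=0, boot_min=5, idle_nodes_on=1))
```

A longer boot time widens the look-ahead window, so more nodes have to be powered on speculatively to keep the impact on turnaround time small.
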
  • 23. PARAM Yuva – Access & Account