INTRODUCTION TO
HETEROGENEOUS SYSTEM
ARCHITECTURE
Presenter: BingRu Wu
Outline
◻ Introduction
◻ Goal
◻ Concept
◻ Memory Model
◻ System Components
Introduction
◻ HSA: Heterogeneous System Architecture
◻ Promising future:
◻ Arm processor producers
◻ GPU vendors: AMD, Imagination Technologies
◻ Fully utilize computation resources
◻ By supporting HSA, our system may connect to a
major application base
Goal of HSA
◻ Remove programmability barrier
◻ Memory space barrier
◻ Access latency among devices
◻ Backward compatible
◻ Utilize existing programming models
Concept of HSA
Abstract
◻ Two kinds of compute unit
◻ LCU: Latency Compute Unit (e.g. CPU)
◻ TCU: Throughput Compute Unit (e.g. GPU)
◻ Merged memory space
Memory Management (1/2)
◻ Shared page table
◻ Memory is shared by all devices
◻ No more host-to-device copies, or vice versa
◻ Supports pointer-based data structures (e.g. linked lists)
◻ Page faulting
◻ Virtual memory space for all devices
◻ e.g. the GPU can now use memory as if it owned the
whole memory space
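The effect of a shared page table on pointer-based data can be sketched with a toy model. Everything here (the `SharedMemory` class, the integer addresses, the `cpu_build_list` and `gpu_sum_list` helpers) is a hypothetical illustration, not an HSA API: the point is only that both compute units dereference the same addresses, so no copy step appears anywhere.

```python
# Toy model: one address space shared by every device (illustrative only,
# not an HSA API). Addresses are plain integers; 0 plays the role of NULL.

class SharedMemory:
    """A single virtual address space visible to all compute units."""

    def __init__(self):
        self._cells = {}      # virtual address -> stored object
        self._next = 0x1000   # next free virtual address

    def alloc(self, value):
        addr = self._next
        self._next += 0x10
        self._cells[addr] = value
        return addr

    def load(self, addr):
        return self._cells[addr]


def cpu_build_list(mem, values):
    """The LCU builds a linked list; each node is (value, next_address)."""
    head = 0
    for v in reversed(values):
        head = mem.alloc((v, head))
    return head


def gpu_sum_list(mem, head):
    """The TCU follows the same pointers; no host-to-device copy is needed."""
    total = 0
    while head != 0:
        value, head = mem.load(head)
        total += value
    return total


mem = SharedMemory()
head = cpu_build_list(mem, [1, 2, 3, 4])
print(gpu_sum_list(mem, head))  # prints 10
```

With separate address spaces, the list would first have to be flattened and its pointers rewritten for the device, which is the step the shared page table removes.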
Memory Management (2/2)
◻ Coherent memory regions
◻ The memory is coherent
◻ Shared among all devices (CUs)
◻ Unified address space
◻ Memory type separated by address
◻ Private / local / global memory decided by
memory region
◻ No special instruction is required
User-Level Command Queue
◻ Queues for communication
◻ User to device
◻ Device to device
◻ HSA runtime handles the queue
◻ Allocation & destruction
◻ One per application
◻ Vendor dependent implementation
◻ Direct access to devices
◻ No OS syscall
◻ No task management
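A user-level queue of this kind is often pictured as a ring buffer with a write index, a read index, and a doorbell. The sketch below is a simplified single-threaded model under that assumption; the class name `UserQueue` and its methods are hypothetical, and a real implementation is vendor dependent, as the slide notes.

```python
class UserQueue:
    """Fixed-size ring buffer shared by a producer (user code) and a consumer (device)."""

    def __init__(self, size):
        self.slots = [None] * size
        self.size = size
        self.write_index = 0   # advanced by the producer
        self.read_index = 0    # advanced by the consumer
        self.doorbell = -1     # last write index announced to the device

    def enqueue(self, packet):
        """Producer side: plain memory writes, no OS system call."""
        if self.write_index - self.read_index == self.size:
            return False       # queue full; the caller retries later
        self.slots[self.write_index % self.size] = packet
        self.doorbell = self.write_index   # "ring" the doorbell
        self.write_index += 1
        return True

    def drain(self):
        """Consumer side: process every packet announced by the doorbell."""
        done = []
        while self.read_index <= self.doorbell:
            done.append(self.slots[self.read_index % self.size])
            self.read_index += 1
        return done


queue = UserQueue(size=2)
queue.enqueue("kernel A")
queue.enqueue("kernel B")
full = queue.enqueue("kernel C")   # rejected: both slots are occupied
processed = queue.drain()
print(full, processed)  # prints: False ['kernel A', 'kernel B']
```

Because both sides only read and write shared memory, dispatching work needs neither a system call nor a central task manager.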
Hardware Scheduler (1/3)
◻ No real scheduling on TCU (GPU)
◻ Task scheduling
◻ Task preemption
◻ Current implementation
◻ Execute without lock:
◻ All threads execute
◻ Multiple concurrent tasks may produce erroneous results
Hardware Scheduler (2/3)
◻ Current implementation
◻ Execute with lock:
◻ A code exception may leave the resource
locked up
◻ Long-running tasks prevent others from
executing
◻ We may fail to finish critical jobs
Hardware Scheduler (3/3)
HSA runtime guarantees:
◻ Bounded execution time
◻ Any process ceases in a reasonable time
◻ Fast switch among applications
◻ Use hardware to save time
◻ Application level parallelism
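Why bounded execution time matters for critical jobs can be seen in a small simulation. The round-robin policy, the `quantum` parameter, and the function name `run_with_preemption` are assumptions made for illustration; the slides only state that execution time is bounded and that switching is fast.

```python
from collections import deque


def run_with_preemption(tasks, quantum):
    """Round-robin over tasks; `tasks` maps a name to its remaining work units.

    Returns the order in which the tasks finish.
    """
    ready = deque(tasks.items())
    finished = []
    while ready:
        name, remaining = ready.popleft()
        remaining -= quantum                 # run for at most one time slice
        if remaining > 0:
            ready.append((name, remaining))  # preempt and requeue
        else:
            finished.append(name)
    return finished


order = run_with_preemption({"long": 10, "critical": 2}, quantum=2)
print(order)  # prints ['critical', 'long']
```

Without preemption the critical task would wait for all 10 units of the long task; with a bounded slice it finishes after the long task's first slice.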
HSAIL (1/2)
◻ HSA Intermediate Language
◻ The language for TCU
◻ Similar to “PTX” code
◻ No graphics-specific instructions
◻ Further translated to HW ISA (by Finalizer)
◻ The abstract platform is similar to OpenCL
◻ Work item (thread)
◻ Work group (block)
◻ NDRange (grid)
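The work item / work group / NDRange hierarchy can be made concrete with the usual index arithmetic. This is a minimal one-dimensional sketch; the function name `work_items` and the requirement that the grid divide evenly into groups are assumptions of the example.

```python
def work_items(grid_size, group_size):
    """Enumerate (group_id, local_id, global_id) for a 1-D NDRange."""
    assert grid_size % group_size == 0, "sketch assumes whole work groups"
    items = []
    for group_id in range(grid_size // group_size):
        for local_id in range(group_size):
            # Same relation as blockIdx * blockDim + threadIdx in CUDA terms.
            global_id = group_id * group_size + local_id
            items.append((group_id, local_id, global_id))
    return items


items = work_items(grid_size=8, group_size=4)
print(items[5])  # prints (1, 1, 5)
```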
HSAIL (2/2)
Memory Model
Virtual Memory Address
◻ All types of memory use the same address space
◻ Memory access behavior
◻ Not all regions are accessible by all devices
◻ OS kernel memory should not be accessible
◻ Mapping to a region in the kernel is still possible
◻ Accessing an identical address may give
different values
◻ Work item private memory
◻ Work group local memory
◻ Accessing another item's / group's memory is not valid
Memory Region
◻ Global
◻ Memory shared by all LCUs & TCUs
◻ Accessible from any work item / group
◻ Group
◻ Memory shared by all work items in the
same group
◻ Private
◻ Memory visible only to a single work item
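The three visibility scopes (global, group, private) can be modelled as separate stores selected by who is asking. The `Segments` class and its method names are hypothetical; the sketch only shows how the same address can hold different values depending on the work group or work item that accesses it.

```python
class Segments:
    """Toy model of the three visibility scopes."""

    def __init__(self):
        self.global_mem = {}    # one copy for every LCU and TCU
        self.group_mem = {}     # group_id -> {address: value}
        self.private_mem = {}   # (group_id, item_id) -> {address: value}

    def store(self, segment, group_id, item_id, addr, value):
        self._scope(segment, group_id, item_id)[addr] = value

    def load(self, segment, group_id, item_id, addr):
        return self._scope(segment, group_id, item_id).get(addr)

    def _scope(self, segment, group_id, item_id):
        if segment == "global":
            return self.global_mem
        if segment == "group":
            return self.group_mem.setdefault(group_id, {})
        if segment == "private":
            return self.private_mem.setdefault((group_id, item_id), {})
        raise ValueError(f"unknown segment: {segment}")


mem = Segments()
mem.store("group", 0, 0, 0x10, "from group 0")
mem.store("group", 1, 0, 0x10, "from group 1")
mem.store("private", 0, 0, 0x10, "item 0 only")
mem.store("global", 0, 0, 0x10, "everyone")

print(mem.load("group", 0, 3, 0x10))    # prints from group 0
print(mem.load("group", 1, 2, 0x10))    # prints from group 1
print(mem.load("private", 0, 1, 0x10))  # prints None
```

Address `0x10` yields a different value in each group, and another work item's private data is simply not reachable.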
Memory Region
◻ Kernarg
◻ Memory for kernel arguments
◻ A kernel is the code fragment we ask a device
to run
◻ Readonly
◻ Read-only type of global memory
◻ Spill
◻ Memory for register spills
◻ Arg
◻ Memory for function call arguments
Memory Consistency
◻ LCU
◻ LCU maintains its own consistency
◻ Shares global memory
◻ Work item
◻ Memory operations to the same address by a single
work item are performed in order
◻ Memory operations to different addresses may
be reordered
◻ Other than that, nothing is guaranteed
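What "may be reordered" permits can be explored with a small message-passing litmus test: one work item stores `data` and then `flag`, another loads `flag` and then `data`. The sketch below enumerates every interleaving under a deliberately simplified model (a single shared memory, plus optional per-item reordering of operations on different addresses); it is an illustration, not the actual HSA memory model, and all names in it are hypothetical.

```python
from itertools import permutations

WRITER = [("store", "data", 1), ("store", "flag", 1)]
READER = [("load", "flag"), ("load", "data")]


def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def outcomes(allow_reordering):
    """Return the set of (flag_seen, data_seen) pairs the reader can observe."""
    if allow_reordering:
        writer_orders = list(permutations(WRITER))
        reader_orders = list(permutations(READER))
    else:
        writer_orders = [tuple(WRITER)]
        reader_orders = [tuple(READER)]
    seen = set()
    for w in writer_orders:
        for r in reader_orders:
            for trace in interleavings(list(w), list(r)):
                memory = {"data": 0, "flag": 0}
                loaded = {}
                for op in trace:
                    if op[0] == "store":
                        memory[op[1]] = op[2]
                    else:
                        loaded[op[1]] = memory[op[1]]
                seen.add((loaded["flag"], loaded["data"]))
    return seen


print(sorted(outcomes(allow_reordering=False)))  # prints [(0, 0), (0, 1), (1, 1)]
print(sorted(outcomes(allow_reordering=True)))   # prints [(0, 0), (0, 1), (1, 0), (1, 1)]
```

The extra outcome `(1, 0)` is the reader seeing the flag set while the data is still stale, which is why code that relies on such ordering needs explicit synchronization.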
System Components
HSA System
Compilation
◻ Frontend
◻ LLVM IR
◻ No data dependency
◻ Backend
◻ Convert IR to HSAIL
◻ Optimization happens
here
◻ Binary format
◻ ELF format
◻ Embedded container for
HSAIL (BRIG)
Runtime
◻ HSA runtime
◻ Issue tasks to device
protocol
◻ Device
◻ Convert HSAIL to ISA with
Finalizer
HSAIL Program Features
◻ Backward Compatible
◻ A system without HSA support should still
run the executable
◻ Function Invocation
◻ LCU functions may call LCU ones
◻ TCU functions may call TCU ones with
Finalizer support
◻ LCU-to-TCU / TCU-to-LCU calls are supported
through queues
◻ C++ compatible
Conclusion
◻ HSA is an open, standard layer
between software and hardware
◻ The cardinal feature of HSA is the unified
virtual memory space
◻ It does not replace current programming
frameworks; no new language is required
Reference
◻ Heterogeneous System Architecture: A
Technical Review
◻ HSA Programmer’s Reference Manual
◻ HSAIL: Write-Once-Run-Everywhere for
Heterogeneous Systems

