SlideShare a Scribd company logo
1 of 16
COMPUTE
UNIFIED DEVICE
ARCHITECTURE
MANNU MALHOTRA
ENROLL. NO. 08313303110
IT 7TH SEM.
CONTENTS
• What is GPU ?
• What is CUDA ?
• Why CUDA ?
• GPU VS CPU.
• DATA FLOW IN CUDA
• Programming model
• CUDA Memory Hierarchy
• A simple program in CUDA
• CUDA Limitations
WHAT IS GPU ?
• A GPU is a processor with thousands of cores , ALUs and cache.
GPU ARE SO MUCH POWERFUL THAN WHY COMPILER
RUNS OUR PROGRAM ON CPU?
The programming paradigm is not meant for GPU.
CUDA
WHAT IS CUDA?
• CUDA is a set of developing tools to create applications that will
perform execution on GPU (Graphics Processing Unit).
• CUDA compiler uses variation of C with future support of C++.
• CUDA was developed by NVidia and can only run on NVidia GPUs of
tesla and Geforce series.
• CUDA provides Heterogeneous serial-parallel computing Between
CPU and GPU
WHY CUDA ?
• CUDA provides ability to use high-level languages such as C to
develop application that can take advantage of high level performance
and scalability that GPUs architecture offer.
• GPUs allow creation of very large number of concurrently executed
threads at very low system resource cost.
• CUDA also exposes fast shared memory (16KB) that can be shared
between threads.
• Full support for integer and bitwise operations.
• Compiled code will run directly on GPU.
GPU VS. CPU
• GPUs contain much larger number
of dedicated ALUs.
• GPUs also contain extensive
support of Stream Processing
paradigm. It is related to SIMD (
Single Instruction Multiple Data)
processing.
• Each processing unit on GPU
contains local memory that
improves data manipulation and
reduces fetch time.
COMPUTATION TIME (CPU VS GPU)
FLOW OF DATA
PCIE
1. Data moved from CPU to GPU.
2. Launch Program on GPU.
3. Copy final values to CPU.
PROGRAMMING MODEL
CUDA MEMORY HIERARCHY
• The CUDA platform has three primary memory types
Local Memory – per thread memory for automatic variables and
register spilling.
Shared Memory – per block memory to allow for intra-block data
sharing and synchronization. Threads can safely share data through
this memory and can perform barrier synchronization through _
_syncthreads()
Global Memory – device level memory that may be shared between
blocks or grids
A VECTOR ADD EXAMPLE IN C
VECTOR ADD IN CUDA C
CUDA LIMITATIONS
• No support of recursive function. Any recursive function must be
converted into loops.
• Bus bandwidth and latency between GPU and CPU is a bottleneck
for many applications.
• Threads should only be run in groups of 32 and up for best
performance.
• Only supported on NVidia GPUs
REFERENCES
• Video tutorials by-
• Udacity.com by john owens , Professor of electric and computer at UC Davis.
• Plural site by Dimitri Nesteruk
• Web resources –
• Developer.nvidia.com
• Nividia.com/cuda
THANK YOU

More Related Content

What's hot

What's hot (20)

Cuda Architecture
Cuda ArchitectureCuda Architecture
Cuda Architecture
 
Introduction to GPU Programming
Introduction to GPU ProgrammingIntroduction to GPU Programming
Introduction to GPU Programming
 
CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentation
 
Parallel computing persentation
Parallel computing persentationParallel computing persentation
Parallel computing persentation
 
Introduction to parallel computing using CUDA
Introduction to parallel computing using CUDAIntroduction to parallel computing using CUDA
Introduction to parallel computing using CUDA
 
Gpu with cuda architecture
Gpu with cuda architectureGpu with cuda architecture
Gpu with cuda architecture
 
Distributed computing
Distributed computingDistributed computing
Distributed computing
 
GPU Computing
GPU ComputingGPU Computing
GPU Computing
 
Tech Talk NVIDIA CUDA
Tech Talk NVIDIA CUDATech Talk NVIDIA CUDA
Tech Talk NVIDIA CUDA
 
Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.
 
Gpu databases
Gpu databasesGpu databases
Gpu databases
 
Parallel Computing on the GPU
Parallel Computing on the GPUParallel Computing on the GPU
Parallel Computing on the GPU
 
NVIDIA CUDA
NVIDIA CUDANVIDIA CUDA
NVIDIA CUDA
 
GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)
 
GPU: Understanding CUDA
GPU: Understanding CUDAGPU: Understanding CUDA
GPU: Understanding CUDA
 
Computer architecture multi processor
Computer architecture multi processorComputer architecture multi processor
Computer architecture multi processor
 
parallel computing.ppt
parallel computing.pptparallel computing.ppt
parallel computing.ppt
 
parallel programming models
 parallel programming models parallel programming models
parallel programming models
 
Computer architecture multi core processor
Computer architecture multi core processorComputer architecture multi core processor
Computer architecture multi core processor
 
Parallel processing
Parallel processingParallel processing
Parallel processing
 

Similar to Cuda

C for Cuda - Small Introduction to GPU computing
C for Cuda - Small Introduction to GPU computingC for Cuda - Small Introduction to GPU computing
C for Cuda - Small Introduction to GPU computingIPALab
 
Gpu application in cuda memory
Gpu application in cuda memoryGpu application in cuda memory
Gpu application in cuda memoryjournalacij
 
GPU - An Introduction
GPU - An IntroductionGPU - An Introduction
GPU - An IntroductionDhan V Sagar
 
lecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxlecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxssuser413a98
 
gpuprogram_lecture,architecture_designsn
gpuprogram_lecture,architecture_designsngpuprogram_lecture,architecture_designsn
gpuprogram_lecture,architecture_designsnARUNACHALAM468781
 
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONSA SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONScseij
 
Stream Processing
Stream ProcessingStream Processing
Stream Processingarnamoy10
 
Computing using GPUs
Computing using GPUsComputing using GPUs
Computing using GPUsShree Kumar
 
Modern processor art
Modern processor artModern processor art
Modern processor artwaqasjadoon11
 
Throughput oriented aarchitectures
Throughput oriented aarchitecturesThroughput oriented aarchitectures
Throughput oriented aarchitecturesNomy059
 
Modern processor art
Modern processor artModern processor art
Modern processor artwaqasjadoon11
 
qCUDA-ARM : Virtualization for Embedded GPU Architectures
 qCUDA-ARM : Virtualization for Embedded GPU Architectures  qCUDA-ARM : Virtualization for Embedded GPU Architectures
qCUDA-ARM : Virtualization for Embedded GPU Architectures 柏瑀 黃
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 

Similar to Cuda (20)

CUDA
CUDACUDA
CUDA
 
GPU Programming with Java
GPU Programming with JavaGPU Programming with Java
GPU Programming with Java
 
C for Cuda - Small Introduction to GPU computing
C for Cuda - Small Introduction to GPU computingC for Cuda - Small Introduction to GPU computing
C for Cuda - Small Introduction to GPU computing
 
Gpu application in cuda memory
Gpu application in cuda memoryGpu application in cuda memory
Gpu application in cuda memory
 
GPU - An Introduction
GPU - An IntroductionGPU - An Introduction
GPU - An Introduction
 
lecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxlecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptx
 
gpuprogram_lecture,architecture_designsn
gpuprogram_lecture,architecture_designsngpuprogram_lecture,architecture_designsn
gpuprogram_lecture,architecture_designsn
 
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONSA SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
 
Stream Processing
Stream ProcessingStream Processing
Stream Processing
 
cuda.ppt
cuda.pptcuda.ppt
cuda.ppt
 
Computing using GPUs
Computing using GPUsComputing using GPUs
Computing using GPUs
 
Modern processor art
Modern processor artModern processor art
Modern processor art
 
Danish presentation
Danish presentationDanish presentation
Danish presentation
 
Throughput oriented aarchitectures
Throughput oriented aarchitecturesThroughput oriented aarchitectures
Throughput oriented aarchitectures
 
Modern processor art
Modern processor artModern processor art
Modern processor art
 
processor struct
processor structprocessor struct
processor struct
 
Cuda
CudaCuda
Cuda
 
qCUDA-ARM : Virtualization for Embedded GPU Architectures
 qCUDA-ARM : Virtualization for Embedded GPU Architectures  qCUDA-ARM : Virtualization for Embedded GPU Architectures
qCUDA-ARM : Virtualization for Embedded GPU Architectures
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 

Recently uploaded

Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting DataJhengPantaleon
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 

Recently uploaded (20)

Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 

Cuda

  • 2. CONTENTS • What is GPU ? • What is CUDA ? • Why CUDA ? • GPU VS CPU. • DATA FLOW IN CUDA • Programming model • CUDA Memory Hierarchy • A simple program in CUDA • CUDA Limitations
  • 3. WHAT IS GPU ? • A GPU is a processor with thousands of cores , ALUs and cache.
  • 4. GPU ARE SO MUCH POWERFUL THAN WHY COMPILER RUNS OUR PROGRAM ON CPU? The programming paradigm is not meant for GPU. CUDA
  • 5. WHAT IS CUDA? • CUDA is a set of developing tools to create applications that will perform execution on GPU (Graphics Processing Unit). • CUDA compiler uses variation of C with future support of C++. • CUDA was developed by NVidia and can only run on NVidia GPUs of tesla and Geforce series. • CUDA provides Heterogeneous serial-parallel computing Between CPU and GPU
  • 6. WHY CUDA ? • CUDA provides ability to use high-level languages such as C to develop application that can take advantage of high level performance and scalability that GPUs architecture offer. • GPUs allow creation of very large number of concurrently executed threads at very low system resource cost. • CUDA also exposes fast shared memory (16KB) that can be shared between threads. • Full support for integer and bitwise operations. • Compiled code will run directly on GPU.
  • 7. GPU VS. CPU • GPUs contain much larger number of dedicated ALUs. • GPUs also contain extensive support of Stream Processing paradigm. It is related to SIMD ( Single Instruction Multiple Data) processing. • Each processing unit on GPU contains local memory that improves data manipulation and reduces fetch time.
  • 9. FLOW OF DATA PCIE 1. Data moved from CPU to GPU. 2. Launch Program on GPU. 3. Copy final values to CPU.
  • 11. CUDA MEMORY HIERARCHY • The CUDA platform has three primary memory types Local Memory – per thread memory for automatic variables and register spilling. Shared Memory – per block memory to allow for intra-block data sharing and synchronization. Threads can safely share data through this memory and can perform barrier synchronization through _ _syncthreads() Global Memory – device level memory that may be shared between blocks or grids
  • 12. A VECTOR ADD EXAMPLE IN C
  • 13. VECTOR ADD IN CUDA C
  • 14. CUDA LIMITATIONS • No support of recursive function. Any recursive function must be converted into loops. • Bus bandwidth and latency between GPU and CPU is a bottleneck for many applications. • Threads should only be run in groups of 32 and up for best performance. • Only supported on NVidia GPUs
  • 15. REFERENCES • Video tutorials by- • Udacity.com by john owens , Professor of electric and computer at UC Davis. • Plural site by Dimitri Nesteruk • Web resources – • Developer.nvidia.com • Nividia.com/cuda