VECTOR
COMPUTING
VECTOR PROCESSOR
• Vector processors are special-purpose computers
suited to a range of (scientific) computing tasks.
• Vector processors provide vector instructions, which
operate in a pipeline.
OBJECTIVE
• Small program size
• No wastage
• Keeping the functional units (FUs) and the register
buses fed
HOW IT WORKS?
OPERATIONS
• Add two vectors to produce a third.
• Subtract two vectors to produce a third.
• Multiply two vectors to produce a third.
• Divide two vectors to produce a third.
• Load a vector from memory.
• Store a vector to memory.
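
The operations listed above can be sketched in Python as a toy model. This is only an illustration under stated assumptions: memory is a flat list, a vector register is a list, and the helper names `vload`, `vstore` and `vop` are hypothetical rather than instructions of any real vector ISA.

```python
from operator import add, sub, mul, truediv

# Hypothetical model: "memory" is a flat list, a "vector register" is a list.

def vload(memory, base, length):
    """Load `length` consecutive elements starting at address `base`."""
    return memory[base:base + length]

def vstore(memory, base, vreg):
    """Store a vector register to consecutive addresses starting at `base`."""
    memory[base:base + len(vreg)] = vreg

def vop(op, va, vb):
    """Combine two vectors element by element to produce a third."""
    if len(va) != len(vb):
        raise ValueError("vector lengths must match")
    return [op(a, b) for a, b in zip(va, vb)]

# Two 4-element vectors followed by 4 free slots for the result.
memory = [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0] + [0.0] * 4
v1 = vload(memory, 0, 4)
v2 = vload(memory, 4, 4)
v3 = vop(add, v1, v2)      # one "vector add" covers all four elements
vstore(memory, 8, v3)
```

Passing `sub`, `mul` or `truediv` instead of `add` gives the other three arithmetic operations in the same model.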
ARCHITECTURE
PROPERTIES
• Vector processors reduce the required instruction
fetch and decode bandwidth, because fewer
instructions are fetched.
• They also exploit data parallelism in large scientific
and multimedia applications.
• Many performance optimization schemes are used
in vector processors.
• Strip mining generates code so that vector
operations are possible on vector operands whose
size is smaller or larger than the size of the vector
registers.
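
Strip mining can be sketched as follows. This is a minimal Python model, assuming a maximum vector length `MVL` of 64 elements; the function name `strip_mined_add` and the value 64 are illustrative choices, not taken from the slides. The odd-sized piece is handled first and every remaining strip is a full register.

```python
MVL = 64  # assumed maximum vector length (elements per vector register)

def strip_mined_add(a, b, mvl=MVL):
    """Add two arbitrary-length vectors in strips of at most `mvl` elements.

    Returns the result and the list of strip lengths that were used.
    """
    if len(a) != len(b):
        raise ValueError("vector lengths must match")
    n = len(a)
    result = [0] * n
    strips = []
    start = 0
    # First strip covers the remainder (n mod mvl); if n is an exact
    # multiple of mvl, the first strip is simply a full one.
    vl = n % mvl or mvl
    while start < n:
        end = start + vl
        # Stands in for one vector instruction operating on `vl` elements.
        result[start:end] = [x + y for x, y in zip(a[start:end], b[start:end])]
        strips.append(vl)
        start = end
        vl = mvl  # all later strips use the full register length
    return result, strips

total, strip_lengths = strip_mined_add(list(range(200)), list(range(200)))
```

With 200 elements and `MVL = 64`, the loop issues one short strip of 8 elements followed by three full strips of 64.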
PROPERTIES
• Vector chaining, the vector-processor equivalent of
forwarding, is used when there are data
dependencies among vector instructions.
• Special scatter and gather instructions are provided
to operate efficiently on sparse matrices.
• Instructions are designed so that all vector
arithmetic instructions allow element N of one
vector register to take part in operations only with
element N of other vector registers.
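
Gather and scatter can be illustrated with a small Python sketch. It assumes an index vector that lists the positions of the nonzero elements; the helpers `gather` and `scatter` are hypothetical names that model the behaviour in software and do not correspond to a specific instruction set.

```python
def gather(memory, indices):
    """Collect memory[i] for each index into a dense vector register."""
    return [memory[i] for i in indices]

def scatter(memory, indices, values):
    """Write each value of a dense vector register back to memory[i]."""
    for i, v in zip(indices, values):
        memory[i] = v

# A sparse vector stored densely in memory; only three entries are nonzero.
sparse = [0.0, 5.0, 0.0, 0.0, 7.0, 0.0, 9.0, 0.0]
nonzero_idx = [1, 4, 6]

packed = gather(sparse, nonzero_idx)      # dense register holding the nonzeros
scaled = [2.0 * v for v in packed]        # element N is combined only with element N
scatter(sparse, nonzero_idx, scaled)      # results return to their original slots
```

Only three elements pass through the arithmetic step, instead of all eight positions of the sparse vector.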
PROPERTIES
• Based on how operands are fetched, vector
processors fall into two categories. In a
memory-memory architecture, operands are streamed
directly from memory to the functional units, and
results are written back to memory as the vector
operation proceeds. In a vector-register architecture,
operands are read into vector registers, from which
they are fed to the functional units, and results of
operations are written to vector registers.
ADVANTAGES
• Mature, well-developed compiler technology
• Compact: one short instruction describes N
operations
SOME VECTOR PROCESSORS
NEW TERMS FOR VECTOR PROCESSORS
• Initiation rate: the rate at which a vector unit
consumes new operands and produces new results.
• Chime: an approximate timing measure for a vector
sequence; it ignores the start-up overhead of a vector
operation.
• Convoy: the set of vector instructions that could
potentially begin execution together in one clock
period. A convoy must complete before new
instructions can begin.
• Vector start-up time: the overhead to start execution,
which is related to the pipeline depth.
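
The terms above combine into a simple timing estimate, sketched here in Python. The model assumes one element per clock cycle per convoy and start-up overheads that do not overlap between convoys; the function names and the start-up values are illustrative assumptions, not measurements of any machine.

```python
def chime_estimate(num_convoys, vector_length):
    """Cycles ignoring start-up: each convoy takes one chime,
    and one chime is about `vector_length` clock cycles."""
    return num_convoys * vector_length

def estimate_with_startup(startup_per_convoy, vector_length):
    """Add an assumed, non-overlapped start-up overhead for each convoy."""
    m = len(startup_per_convoy)
    return chime_estimate(m, vector_length) + sum(startup_per_convoy)

# Illustrative numbers only: four convoys on 64-element vectors.
startups = [12, 12, 6, 12]
cycles_ideal = chime_estimate(len(startups), 64)
cycles_real = estimate_with_startup(startups, 64)
```

For long vectors the chime count dominates, which is why the chime approximation is useful even though it leaves out start-up time.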
PROPOSED VECTOR PROCESSOR
• CODE (Clustered Organization for Decoupled
Execution) is a proposed vector architecture that
aims to overcome some limitations of conventional
vector processors.
REASONS
• Complexity of the central vector register file (VRF):
in a processor with N vector functional units (VFUs),
the register file needs approximately 3N access
ports. VRF area, power consumption, and latency
are proportional to O(N^2), O(log N), and O(N),
respectively.
• Difficulty of implementing precise exceptions: to
implement in-order commit, a large reorder buffer
(ROB) is needed, with at least one vector register
per VFU.
• To support virtual memory, a large TLB is needed so
that it has enough entries to translate all the virtual
addresses generated by a vector instruction.
• Vector processors need expensive on-chip memory
for low latency.
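
The TLB pressure mentioned above can be made concrete with a short Python sketch that counts how many distinct pages a single strided vector access touches. It assumes 4096-byte pages and elements that do not straddle a page boundary; `pages_touched` is a hypothetical helper written for this illustration.

```python
def pages_touched(base, stride_bytes, num_elements, page_size=4096):
    """Count the distinct virtual pages referenced by one strided vector access.

    Assumes each element lies entirely within a single page.
    """
    return len({(base + i * stride_bytes) // page_size
                for i in range(num_elements)})

unit_stride = pages_touched(base=0, stride_bytes=8, num_elements=64)
page_stride = pages_touched(base=0, stride_bytes=4096, num_elements=64)
crossing = pages_touched(base=4032, stride_bytes=8, num_elements=64)
```

A unit-stride access stays within one or two pages, while a stride equal to the page size needs a separate translation for every element.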
SOME FEATURES OF CODE
• In the CODE architecture, vector registers are
organized into clusters.
• CODE can hide communication latency by having
the output interface look ahead into the
instruction queue and start executing register move
instructions.
• CODE supports precise exceptions using a history
buffer.
• To reduce the size of the TLB, CODE proposes an
ISA-level change.
The Effect of Cache Design on Vector Computers
• Numerical programs often work on data sets that are
too large for current cache sizes. Sweep accesses over
a large vector result in complete reloading of the cache.
• High memory bandwidth is achieved with register
files and highly interleaved memories.
• Address sequentiation
Proposed Cache Schemes
• Cache schemes such as the prime-mapped cache
have been proposed and studied. This cache
organization minimizes cache misses caused by
cache line interference, which has been shown to be
critical in numerical applications.
• The cache lookup time of the new mapping scheme
remains the same as in conventional caches. Cache
addresses for the prime-mapped cache can be
generated in parallel with normal address
calculations.
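
The interference effect can be sketched with a simplified Python model. It assumes a direct-mapped cache whose set index is the line address modulo the number of sets, and compares 128 sets with the prime 127; the helper `sets_used` and these sizes are illustrative assumptions, and the hardware address generation of the actual proposal is not modelled.

```python
def sets_used(stride_lines, num_accesses, num_sets, base=0):
    """Count the distinct cache sets hit by a strided sweep.

    Addresses are expressed in units of cache lines.
    """
    return len({(base + i * stride_lines) % num_sets
                for i in range(num_accesses)})

# Sweep 64 lines with a power-of-two stride of 128 lines.
conventional = sets_used(stride_lines=128, num_accesses=64, num_sets=128)
prime_mapped = sets_used(stride_lines=128, num_accesses=64, num_sets=127)
```

With 128 sets every access lands in the same set and evicts the previous line, whereas with 127 sets the same sweep spreads over 64 different sets.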
Conclusion
• Vector supercomputers
• Vector instruction
• Commodity technology like SMT
• Superscalar microprocessor
• Embedded and multimedia applications