A VLIW processor implements instruction-level parallelism by grouping multiple operations into a single very long instruction word. The compiler statically schedules independent operations to execute in parallel on separate functional units. This avoids the need for complex hardware to dynamically schedule instructions at runtime. VLIW moves the complexity to the compiler, allowing for simpler hardware that can be lower cost and lower power while potentially achieving higher performance than comparable RISC and CISC chips.
2. VLIW PROCESSORS: A METHOD TO EXPLOIT
INSTRUCTION LEVEL PARALLELISM
• A VLIW processor is based on an architecture that implements Instruction Level
Parallelism (ILP), which means the execution of multiple instructions at the same time.
• A Very Long Instruction Word (VLIW) specifies multiple operations that are
grouped together into one very long instruction.
• In a VLIW processor, multiple operations inside the long instruction are issued in
parallel to an equal number of functional units.
• Operands are read from a register file, and the operations in the instruction are
executed by the functional units provided as part of the hardware.
3. • The compiler (not the processor) checks that only independent operations are
grouped for parallel execution, and it tries to extract as much parallelism as possible.
• One program counter points to one long instruction.
• Since multiple operations are packed into one instruction word, the instruction words are
much larger than those of CISC and RISC processors.
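The points above (one program counter, one long word, several operations issued together) can be sketched as a toy model. This is only an illustration under stated assumptions: the three-slot word, the `Op` fields, the tiny `add`/`sub`/`mul` opcode set, and the eight-register file are hypothetical choices, not the encoding of any real VLIW machine.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    """One operation slot inside a long instruction word (hypothetical format)."""
    opcode: str          # "add", "sub", "mul" or "nop"
    dst: int = 0         # destination register index
    src1: int = 0        # first source register index
    src2: int = 0        # second source register index

NOP = Op("nop")          # fills a slot the compiler could not use

ALU = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

def execute_word(word, regs):
    """Execute all slots of one long instruction word as a single step."""
    # Read phase: every slot reads its operands from the register file
    # before any slot writes, which models the slots running in parallel.
    results = []
    for op in word:
        if op.opcode == "nop":
            continue
        results.append((op.dst, ALU[op.opcode](regs[op.src1], regs[op.src2])))
    # Write phase: commit all results.
    for dst, value in results:
        regs[dst] = value

def run(program, regs):
    """A single program counter steps through the long instruction words."""
    pc = 0
    while pc < len(program):
        execute_word(program[pc], regs)
        pc += 1
    return regs

# Two long words, three slots each. The compiler has already made sure the
# operations inside each word are independent of one another.
program = [
    (Op("add", 4, 0, 1), Op("mul", 5, 2, 3), NOP),   # r4 = r0 + r1 ; r5 = r2 * r3
    (Op("sub", 6, 5, 4), NOP, NOP),                  # r6 = r5 - r4
]
regs = run(program, [1, 2, 3, 4, 0, 0, 0, 0])
print(regs)
```

Note how the unused slots still occupy space as `NOP` entries, which is one reason the instruction words end up much wider than a single RISC or CISC instruction.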
5. Implementation of VLIW by static scheduling
• The VLIW architecture is implemented through static scheduling. This means that scheduling
decisions are not made at runtime by the processor but are handled by the compiler.
• The compiler takes the operations of the program and compiles them into
object code.
• The object code is then fetched by the processor and issued to the functional units, which read and write the register file.
• It is this object code that is referred to as the Very Long Instruction Word (VLIW).
• The compiler must guarantee that the operations it groups together are
independent so that they can be executed in parallel.
• The compiler prearranges the object code so the VLIW chip can quickly execute the instructions
in parallel.
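The static scheduling described above can be sketched as a small greedy packer. This is a minimal illustration, not how any particular VLIW compiler works: the tuple format `(opcode, dst, src1, src2)`, the function names `conflicts` and `schedule`, and the conservative rule that any register conflict forces a later word are all assumptions made for the example.

```python
def conflicts(later, earlier):
    """True if `later` may not share a word with, or move ahead of, `earlier`."""
    _, l_dst, l_s1, l_s2 = later
    _, e_dst, e_s1, e_s2 = earlier
    raw = e_dst in (l_s1, l_s2)   # later reads what earlier writes
    war = l_dst in (e_s1, e_s2)   # later overwrites what earlier reads
    waw = l_dst == e_dst          # both write the same register
    return raw or war or waw

def schedule(ops, width):
    """Pack ops (given in program order) into words of at most `width` slots."""
    words = []    # each word is a list of mutually independent ops
    placed = []   # (op, word_index) for every op scheduled so far
    for op in ops:
        # The op must go strictly after every earlier op it conflicts with.
        earliest = 0
        for prev, idx in placed:
            if conflicts(op, prev):
                earliest = max(earliest, idx + 1)
        # Take the first word from `earliest` onward that has a free slot.
        slot = earliest
        while slot < len(words) and len(words[slot]) >= width:
            slot += 1
        if slot == len(words):
            words.append([])
        words[slot].append(op)
        placed.append((op, slot))
    return words

ops = [
    ("add", 4, 0, 1),   # r4 = r0 + r1
    ("mul", 5, 2, 3),   # r5 = r2 * r3
    ("sub", 6, 4, 5),   # r6 = r4 - r5, needs both results above
    ("add", 7, 0, 2),   # r7 = r0 + r2, independent of everything else
]
words = schedule(ops, width=2)
for i, word in enumerate(words):
    print(i, word)
```

All of the dependence checking happens here, before the program ever runs; the hardware only has to issue each word as it is.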
6. VLIW vs Superscalar
• Superscalar architectures, in contrast, use dynamic scheduling that transfers all ILP
complexity to the processor hardware.
• This leads to greater hardware complexity that is not seen in VLIW hardware.
• VLIW chips don’t need most of the complex circuitry that superscalar chips must use to
coordinate parallel execution at runtime.
• Thus, in VLIW, hardware complexity is greatly reduced:
• the executable instructions are generated directly by the compiler
• they are then executed as “native code” by the functional units present in the hardware
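The contrast above can be made concrete with a rough sketch of where the dependence checks happen. This is a deliberately simplified model: the in-order issue policy, the tuple format `(opcode, dst, src1, src2)`, and the names `superscalar_cycles` and `vliw_cycles` are assumptions for illustration, and real superscalar hardware (out-of-order issue, register renaming) is considerably more involved.

```python
def independent(op, group):
    """Runtime check: `op` has no register conflict with any op in `group`."""
    _, dst, s1, s2 = op
    for _, g_dst, g_s1, g_s2 in group:
        if g_dst in (s1, s2) or dst in (g_s1, g_s2) or dst == g_dst:
            return False
    return True

def superscalar_cycles(stream, width):
    """In-order issue: the hardware checks dependences every time the code runs."""
    cycles, checks, i = 0, 0, 0
    while i < len(stream):
        group = []
        while i < len(stream) and len(group) < width:
            # Up to one pairwise comparison per op already in the group.
            checks += len(group)
            if not independent(stream[i], group):
                break
            group.append(stream[i])
            i += 1
        cycles += 1
    return cycles, checks

def vliw_cycles(words):
    """Words were packed by the compiler, so issue needs no dependence checks."""
    return len(words), 0

stream = [
    ("add", 4, 0, 1),
    ("mul", 5, 2, 3),
    ("sub", 6, 4, 5),
    ("add", 7, 0, 2),
]
# The same operations, already grouped into two-slot words by a compiler.
packed = [[stream[0], stream[1]], [stream[2], stream[3]]]

print(superscalar_cycles(stream, 2))
print(vliw_cycles(packed))
```

In this toy case both models take the same number of cycles; the difference is that the superscalar model spends checks at runtime, while the VLIW model relies on work the compiler did once.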
7. Because of the simpler hardware, VLIW chips can potentially
• cost less
• burn less power
• achieve higher performance than comparable RISC and CISC chips.