SOC Processors Used in SOC

Assistant Professor at P V P Institute of Technology, Budhgaon, Sangli
1. Mar 2017


  1. SOC: Processors Used • Mr. A. B. Shinde, Assistant Professor, Electronics Engineering, PVPIT, Budhgaon, Sangli
  2. Unit-III Contents • Unit-III: PROCESSORS • Introduction • Processor Selection for SOC • Basic Concepts in Processor Architecture • Study of IBM's PowerPC • Study of the PicoBlaze processor • Study of the MicroBlaze processor • Datasheets: PowerPC processor, PicoBlaze processor, MicroBlaze processor
  3. Introduction • Processors come in many types and with many intended uses. • Much attention is focused on high-performance processors used in servers and workstations. • The figure shows the processor production profile by annual production count. (Figure: Worldwide production of microprocessors and controllers)
  4. Introduction • Market growth shows that the demand for SOCs and larger microcontrollers is growing at almost three times the rate of demand for microprocessor units. • In SOC-type applications, the processor itself is a small component occupying just a few percent of the die. • SOC designs often use several different types of processors to suit the application. (Figure: Annual growth in demand for microprocessors and controllers)
  5. Processor Selection For SOC • Overview: • For SOC designs, the selection of the processor is the most obvious task and the most restricted. • The processor must run specific system software, so at least a core processor (usually a general-purpose processor, GPP) must be selected for this function. • In computation-limited applications, the system also includes a processor configured and parameterized to meet the compute requirements.
  6. Processor Selection For SOC • Overview: • In some cases, it may be possible to merge these processors (the system-software core and the compute core), but that is usually an optimization consideration. • Memory and interconnect components are treated as simple delay elements when calculating processor performance. • These are referred to here as idealized components.
  7. Processor Selection For SOC • The figure shows the processor model used in the initial design process. (Figure: Processors in the SOC model)
  8. Processor Selection For SOC • The selection process is different in the compute-limited case, as there can be a real-time requirement that must be met by one of the selected processors. • The processor selection and parameterization should result in an initial SOC design that appears to fully satisfy all functional and performance requirements set out in the specifications. (Figure: Process of processor core selection)
  9. Processor Selection For SOC • Soft Processors: • A soft processor is an Intellectual Property (IP) core that is implemented using the logic primitives of the FPGA. • Being soft, it has a high degree of flexibility and configurability. • A soft processor is a microprocessor core that can be implemented entirely using logic synthesis. • It can be implemented in different semiconductor devices, including those containing programmable logic (e.g., ASIC, FPGA, CPLD).
  10. Processor Selection For SOC • Soft Processors: • Most systems use a single soft processor on an FPGA. • A sufficiently large FPGA can hold two or more soft microprocessors, resulting in a multi-core processor. • The number of soft processors on a single FPGA is limited only by the size of the FPGA.
  11. Processor Selection For SOC • Soft Processors: • The term "soft core" refers to an instruction processor design in bitstream format that can be used to program an FPGA device. • The four main reasons for using such designs, despite their large area-power-time cost, are: 1. Cost reduction in terms of system-level integration, 2. Design reuse in cases where multiple designs are really just variations on one, 3. Creating an exact fit for a microcontroller/peripheral combination, and 4. Providing future protection against discontinued microcontroller variants.
  12. Processor Selection For SOC • Soft Processors: (Table: Some features of soft-core processors)
  13. Processor Selection For SOC • rbe: register bit equivalent • The register bit equivalent (rbe) is the unit of area measurement. • It is defined to be a six-transistor register cell. • This is significantly more than six times the area of a single transistor, since it includes larger transistors, their interconnections, and the necessary inter-bit isolating spaces. • Example:
      Item                                                        Area (rbe)
      1 register bit                                              1.0
      1 static RAM bit in an on-chip cache                        0.6
      1 DRAM bit                                                  0.1
      Xilinx FPGA: a slice (2 LUTs + 2 FFs + MUX)                 700
      Xilinx FPGA: a configurable logic block (4 slices), Virtex-4   2,800
      Xilinx FPGA: an 18-Kb block RAM                             12,600
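The rbe figures above can be used for quick area estimates. The following is a minimal sketch only: the dictionary keys and the two example designs (a 32 x 32-bit register file and a design with 200 slices and 2 block RAMs) are hypothetical and chosen purely for illustration.

```python
# Area costs in rbe, taken from the table on this slide.
RBE = {
    "register_bit": 1.0,
    "sram_bit": 0.6,
    "dram_bit": 0.1,
    "fpga_slice": 700.0,
    "fpga_clb": 2800.0,
    "block_ram": 12_600.0,
}

def area_rbe(counts):
    """Sum the rbe cost of a design given a {component: count} mapping."""
    return sum(RBE[name] * n for name, n in counts.items())

# Hypothetical examples (not from the slides).
register_file = area_rbe({"register_bit": 32 * 32})
fpga_design = area_rbe({"fpga_slice": 200, "block_ram": 2})
print(register_file, fpga_design)
```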
  14. Processor Selection For SOC • Processor Core Selection (General Core Path): • Assume that an initial design had a performance of 1 using 100K rbe of area, and we would like to have additional speed and functionality. • So we double the performance (halving the time T for the processor). • This increases the area to 400K rbe and the power by a factor of 8. • Each rbe now dissipates twice the power it did before. • Doubling the performance (instruction execution rate) doubles the number of cache misses per unit time.
  15. Processor Selection For SOC • Processor Core Selection (General Core Path): • Cache misses significantly reduce the realized performance; to recover this performance, we need to increase the cache size. • As a general rule, to halve the miss rate we need to double the cache size. • If the initial cache size was also 100K rbe, the new design has a total area of 600K rbe (400K rbe for the processor plus 200K rbe for the doubled cache) and probably dissipates about 10 times the power of the initial design. • The faster processor-cache combination may provide important functionality, such as additional security checking or input/output (I/O) capability.
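The arithmetic of the general core path can be sketched as follows. This is a rough illustration, assuming that area scales with the square of the speedup and power with its cube; those scaling rules are inferred from the 4x area and 8x power figures quoted in the slides and are not stated there explicitly. Function names are hypothetical.

```python
def scale_core(area_rbe, power, speedup):
    """Scale a core, assuming area ~ speedup^2 and power ~ speedup^3."""
    return area_rbe * speedup ** 2, power * speedup ** 3

def scale_cache(cache_rbe, miss_rate_factor):
    """Rule of thumb from the slides: halving the miss rate doubles the cache."""
    return cache_rbe / miss_rate_factor

base_area, base_power = 100_000, 1.0
core_area, core_power = scale_core(base_area, base_power, speedup=2)
cache_area = scale_cache(100_000, miss_rate_factor=0.5)

# Power dissipated per rbe, relative to the initial design.
power_per_rbe = (core_power / base_power) / (core_area / base_area)
total_area = core_area + cache_area
print(core_area, core_power, power_per_rbe, total_area)
```

The result reproduces the slide figures: a 400K rbe core at 8 times the power, twice the power per rbe, and 600K rbe once the doubled cache is included.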
  16. Processor Selection For SOC • Processor Core Selection (Compute Core Path): • Consider some trade-offs for the compute-limited path. • Suppose the application is generally parallelizable, and we have several different design approaches. • One is a 10-stage pipelined vector processor; the other is multiple simpler processors. • The application has a performance of 1 with the vector processor (area 300K rbe) and half of that performance with a single simpler processor (area 100K rbe). • To satisfy the real-time compute requirements, we need to increase the performance to 1.5.
  17. Processor Selection For SOC • Processor Core Selection (Compute Core Path): • Now we must evaluate the various ways of achieving the target performance. • Approach 1 is to increase the pipeline depth and double the number of vector pipelines; this satisfies the performance target. • This increases the area to 600K rbe and doubles the power, while the clock rate remains unchanged.
  18. Processor Selection For SOC • Processor Core Selection (Compute Core Path): • Approach 2 is to use an "array" of simpler interconnected processors. • To achieve the target performance, we need at least four processors: three for the basic target and one to account for the overhead.
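The two approaches can be compared with a small calculation. This is a sketch under the figures given in the slides (performance 0.5 and 100K rbe per simple processor, 300K rbe for the vector processor, one extra processor for overhead); it ignores interconnect and memory-sharing area, and the function name is hypothetical.

```python
import math

def array_processors(target_perf, perf_per_core, overhead_cores=1):
    """Cores needed to hit the target, plus cores lost to overhead."""
    return math.ceil(target_perf / perf_per_core) + overhead_cores

# Approach 2: array of simple processors.
n_cores = array_processors(target_perf=1.5, perf_per_core=0.5)
array_area = n_cores * 100_000

# Approach 1: doubled vector pipelines (area doubles, per the slides).
vector_area = 2 * 300_000
print(n_cores, array_area, vector_area)
```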
  19. Basic Concepts In Processor Architecture • Before studying the basic concepts in processor architecture, we look at how a general-purpose processor (GPP) is designed. • Example: the EC-1 microprocessor
  20. Steps for Designing CPU • Design of the instruction set (IS) – Define the instruction set and the number of instructions – Define each instruction – Instruction encoding (opcode) • Design of the datapath – Define the number of functional units required for the IS – Decide the number of registers needed – Use separate registers or a register file – Define flow-control registers, status registers, etc. – Connect the functional units and registers to implement the IS • Design of the control unit – Program Counter (PC) – Instruction Register (IR) – Instruction cycle: Step 1 fetches an instruction, Step 2 decodes the instruction, Step 3 executes the instruction
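The fetch, decode, and execute steps can be sketched with a toy simulator. The four-instruction accumulator machine below is hypothetical and is not the EC-1 instruction set (which appears only as a figure in the slides); the 2-bit opcode and 6-bit operand encoding is an assumption made for illustration.

```python
# Hypothetical opcodes: upper 2 bits of an 8-bit instruction word.
LOAD, ADD, JNZ, HALT = range(4)

def encode(opcode, operand=0):
    """Pack a 2-bit opcode and a 6-bit operand into one instruction byte."""
    return (opcode << 6) | (operand & 0x3F)

def run(program, max_steps=1000):
    pc = 0    # Program Counter
    acc = 0   # 8-bit accumulator
    for _ in range(max_steps):
        ir = program[pc]                       # Step 1: fetch into the IR
        pc += 1
        opcode, operand = ir >> 6, ir & 0x3F   # Step 2: decode
        if opcode == LOAD:                     # Step 3: execute
            acc = operand
        elif opcode == ADD:
            acc = (acc + operand) & 0xFF
        elif opcode == JNZ:
            if acc != 0:
                pc = operand
        elif opcode == HALT:
            return acc
    raise RuntimeError("program did not halt")

result = run([encode(LOAD, 5), encode(ADD, 3), encode(HALT)])
print(result)
```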
  21. EC-1 General-Purpose µP design • Instruction Set (figure)
  22. EC-1 General-Purpose µP design • Control Unit Design (figure)
  23. EC-1 General-Purpose µP design (Figures: Next State Table, Next State Equations)
  24. EC-1 General-Purpose µP design (Figure: Logic Diagram)
  25. Basic Concepts In Processor Architecture • The processor architecture consists of the instruction set of the processor. • While the instruction set implies many implementation (microarchitecture) details, the implementation itself is the synthesis of the physical device limitations with area-time-power trade-offs to optimize specified user requirements.
  26. Basic Concepts In Processor Architecture • Instruction Set: • The instruction set for most processors is based upon a register set that holds operands and addresses. • The register set size can vary from 8 to 64 words or more, where each word consists of 32 to 64 bits. • An additional set of floating-point registers (32 to 128 bits) can also be used. • A typical instruction set specifies a program status word, which consists of various types of control and status information, including condition codes (CCs) set by the instruction.
  27. Basic Concepts In Processor Architecture • Instruction Set: • Common instruction sets can be classified into two basic types: – load-store (L/S) architecture and – register-memory (R/M) architecture.
  28. Basic Concepts In Processor Architecture • Instruction Set: • The L/S instruction sets include the RISC microprocessors. • Arguments must be in registers before execution. • An ALU instruction has both source operands and the result specified as registers. • The advantages of the L/S architecture are: – regularity of execution and – ease of instruction decode.
  29. Basic Concepts In Processor Architecture • Instruction Set: • The R/M architectures include instructions that operate on operands in registers or with one of the operands in memory. • In the R/M architecture, an ADD instruction might sum a register value and a value contained in memory, with the result going to a register.
  30. Basic Concepts In Processor Architecture • Instruction Set: • The trade-off in instruction sets is an area-time compromise. • The R/M approach offers a program representation using fewer instructions of variable size compared with L/S. • The variable instruction size makes decoding more difficult. • Decoding multiple instructions requires predicting the starting point of each, so R/M processors require more circuitry (and area) to be devoted to instruction fetch and decode.
  31. Basic Concepts In Processor Architecture (Figure: Instruction size and format for typical processors)
  32. Basic Concepts In Processor Architecture (Table: Instruction set mnemonic operations)
  33. Basic Concepts In Processor Architecture • Some Instruction Set Conventions: • To indicate the data type that the operation specifies, the operation mnemonic is extended by a data-type indicator: • OP.W might indicate an OP for integers, while • OP.F indicates a floating-point operation. • Typical data-type modifiers are shown in the table. • A typical instruction has the form OP.M destination, source 1, source 2. • Each source and destination specification is either a register or a memory location.
  34. Basic Concepts In Processor Architecture • Branches: • Branches (or jumps) manage program control flow. • They typically consist of unconditional (BR), conditional (BC), and subroutine call and return (link) instructions. • Typically, the CC is set by an ALU instruction to record one of several results, for example, specifying whether the instruction has generated 1. a positive result, 2. a negative result, 3. a zero result, or 4. an overflow.
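How a condition code might be derived and then tested by a conditional branch can be sketched as follows. This is an illustrative model only, assuming a 32-bit two's-complement ADD and a CC that records exactly one of the four outcomes listed above; the function names are hypothetical and do not correspond to any particular processor.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def condition_code(a, b):
    """Return the CC an ALU might record for a 32-bit signed ADD of a and b."""
    total = a + b
    if not INT_MIN <= total <= INT_MAX:
        return "overflow"
    if total == 0:
        return "zero"
    return "positive" if total > 0 else "negative"

def branch_taken(cc, condition):
    """A conditional branch (BC) is taken when the CC matches its condition."""
    return cc == condition

cc = condition_code(3, -3)
print(cc, branch_taken(cc, "zero"))
```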
  35. Basic Concepts In Processor Architecture • Interrupts and Exceptions: • Many embedded SOC controllers have external interrupts and internal exceptions. • These facilities can be managed and supported in various ways: 1. User Requested versus Coerced: The former often covers executions like divide by zero, while the latter is usually triggered by external events. 2. Maskable versus Nonmaskable: The former type of event can be ignored, while the latter cannot. 3. Terminate versus Resume: An event such as divide by zero terminates ordinary processing, while after other events the processor resumes operation. 4. Asynchronous versus Synchronous: Interrupt events can occur asynchronously with respect to the processor clock, when caused by an external agent, or synchronously, when caused by a program's execution. 5. Between versus Within Instructions: Interrupt events can be recognized only between instructions or within an instruction's execution.
  36. Basic Concepts In Processor Architecture • Interrupts and Exceptions: • In general, the first alternative of most of these pairs is easier to implement and may be handled after the completion of the current instruction. • Once the exception is handled, the subsequent instructions are restarted from scratch. • Some of these events may occur simultaneously and may even be nested.
  37. IBM's PowerPC • PowerPC stands for Performance Optimization With Enhanced RISC – Performance Computing. • PowerPC is sometimes abbreviated as PPC.
  38. PPC405Fx Embedded Processor • The IBM 405Fx 32-bit reduced instruction set computer (RISC) processor core, referred to as the PPC405Fx core, implements the PowerPC Architecture with extensions for embedded applications. • PPC405Fx Features – The PPC405Fx core provides high performance and low power consumption. – The PPC405Fx RISC CPU executes at sustained speeds approaching one cycle per instruction. – On-chip instruction and data cache arrays can be implemented to reduce chip count and design complexity in systems.
  39. PPC405Fx Embedded Processor • PPC405Fx Features • The PowerPC RISC fixed-point CPU features: – Thirty-two 32-bit general purpose registers (GPRs) – Five-stage pipeline with single-cycle execution of most instructions – Unaligned load/store support to cache arrays, main memory, and on-chip memory (OCM) – Hardware multiply/divide for faster integer arithmetic (4-cycle multiply, 35-cycle divide) – True little endian operation – Parity detection and reporting for the instruction cache and data cache – Programmable Interval Timer (PIT), Fixed Interval Timer (FIT), and watchdog timer
  40. PPC405Fx Embedded Processor • PPC405Fx Features • Storage control: – Separate, configurable, two-way set-associative instruction and data cache units – Instruction cache array is 16KB and data cache array is 16KB – 32 bytes per cache line – Read and write line buffers – Programmable ICU pre-fetching of the next sequential line into the line buffer – Programmable allocation on loads and stores – Operand forwarding during cache line fills
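The cache parameters above (16KB, two-way set-associative, 32 bytes per line) fix the number of sets and the way a 32-bit address would be split into tag, index, and offset. The sketch below works this out; it assumes power-of-two sizes and a conventional tag/index/offset split, which the slides do not spell out, and the function names are hypothetical.

```python
def cache_geometry(size_bytes, ways, line_bytes, addr_bits=32):
    """Return (sets, offset_bits, index_bits, tag_bits) for a set-associative cache."""
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1   # log2 for powers of two
    index_bits = sets.bit_length() - 1
    tag_bits = addr_bits - index_bits - offset_bits
    return sets, offset_bits, index_bits, tag_bits

def split_address(addr, offset_bits, index_bits):
    """Split an address into (tag, set index, byte offset within the line)."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

sets, offset_bits, index_bits, tag_bits = cache_geometry(16 * 1024, 2, 32)
print(sets, offset_bits, index_bits, tag_bits)
print(split_address(0x12345678, offset_bits, index_bits))
```

With these parameters each cache has 256 sets, so an address uses 5 offset bits, 8 index bits, and 19 tag bits.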
  41. PPC405Fx Embedded Processor • PPC405Fx Features • Memory Management – Translation of the 4GB logical address space into physical addresses – Page-level access control using the translation mechanism – Software control of page replacement strategy – WIU0GE (write-through, cachability, compressed user-defined 0, guarded, endian) storage attribute control for each virtual memory region – Full floating-point unit (FPU) support using the auxiliary processor unit (APU) interface (the PPC405Fx does not include an FPU)
  42. PPC405Fx Embedded Processor • PPC405Fx Features • PowerPC timer facilities – 64-bit time base – PIT, FIT, and watchdog timers • Debug Support – Enhanced debug support with logical operators – Four instruction address compares (IACs) – Two data address compares (DACs) – Two data value compares (DVCs) • Advanced power management support
  43. PPC405Fx Embedded Processor • PowerPC Architecture • The PowerPC Architecture comprises three levels of standards: • PowerPC User Instruction Set Architecture (UISA): includes the base user-level instruction set, user-level registers, programming model, data types, and addressing modes. • PowerPC Virtual Environment Architecture (VEA): describes the memory model, cache model, cache-control instructions, address aliasing, and related issues. • PowerPC Operating Environment Architecture (OEA): includes the memory management model, supervisor-level registers, and the exception model. These features are not accessible from the user level.
  44. PPC405Fx Embedded Processor • Processor Core Organization (figure)
  45. PPC405Fx Embedded Processor • Processor Core Organization • The processor core consists of a 5-stage pipeline, separate instruction and data cache units, a virtual memory management unit (MMU), three timers, debug, and interfaces to other functions. • Instruction and Data Cache Controllers – The instruction cache unit (ICU) and data cache unit (DCU) enable concurrent accesses and minimize pipeline stalls. – The storage capacity of the cache units, which can range from 0KB to 32KB, depends upon the implementation. – The instruction set provides cache control instructions, including instructions to read tag information and data arrays.
  46. PPC405Fx Embedded Processor • Processor Core Organization • Instruction Cache Unit – The ICU provides one or two instructions per cycle to the execution unit (EXU) over a 64-bit bus. – A line buffer enables the ICU to be accessed only once for every four instructions, to reduce power consumption by the array. – The ICU can forward any or all of the words of a line fill to the EXU to minimize pipeline stalls caused by cache misses.
  47. PPC405Fx Embedded Processor • Processor Core Organization • Data Cache Unit – The DCU transfers 1, 2, 3, 4, or 8 bytes per cycle, depending on the CPU request. – The DCU contains a single-element command and store data queue to reduce pipeline stalls; this queue enables the DCU to independently process load/store and cache control instructions. – When the DCU is busy with a low-priority request while a subsequent storage operation requested by the CPU is stalled, the DCU automatically increases the priority of the current request to the PLB.
  48. PPC405Fx Embedded Processor • Processor Core Organization • Data Cache Unit – The DCU uses a two-line flush queue to minimize pipeline stalls caused by cache misses. – Single queued flushes are non-blocking. When a flush operation is pending, the DCU can continue to access the array to determine subsequent load or store hits. – The DCU can function in write-back or write-through mode, as controlled by the Data Cache Write-through Register (DCWR) or the translation look-aside buffer (TLB).
  49. PPC405Fx Embedded Processor • Processor Core Organization • Memory Management Unit – The 4GB address space of the PPC405Fx is a flat address space. – The MMU provides address translation, protection functions, and storage attribute control for embedded applications. – The MMU provides the following functions: • Translation of the 4GB logical address space into physical addresses • Page-level access control using the translation mechanism • Software control of page replacement strategy – The MMU can be disabled under software control.
  50. PicoBlaze • The PicoBlaze microcontroller is a compact, cost-effective, fully embedded 8-bit RISC microcontroller core optimized for the Spartan-3 family. • It also supports the Virtex-5, Spartan-6, and Virtex-6 FPGA families. • It occupies just 96 FPGA slices (only 12.5% of an XC3S50 FPGA). • A single FPGA block RAM stores up to 1024 program instructions, which are automatically loaded during FPGA configuration. • The PicoBlaze microcontroller performs a respectable 44 to 100 million instructions per second (MIPS), depending on the target FPGA family and speed grade.
  51. PicoBlaze • The PicoBlaze microcontroller core is totally embedded within the target FPGA and requires no external resources. • The PicoBlaze peripheral set can be customized to meet the specific feature, function, and cost requirements of the target application. • Because the PicoBlaze microcontroller is delivered as synthesizable VHDL source code, the core is future-proof and can be migrated to future FPGA architectures. • Being integrated within the FPGA, the PicoBlaze microcontroller reduces board space, design cost, and inventory.
  52. Why the PicoBlaze Microcontroller • The PicoBlaze microcontroller is specifically designed and optimized for the Spartan-3 family, and also supports the Spartan-6 and Virtex-6 FPGA architectures. • It is compact and consumes considerably fewer FPGA resources than comparable 8-bit microcontroller architectures within an FPGA. • Because it is delivered as VHDL source, the PicoBlaze microcontroller is immune to product obsolescence.
  53. Why the PicoBlaze Microcontroller • Before the advent of the PicoBlaze and MicroBlaze embedded processors, the microcontroller resided outside the FPGA, limiting connectivity to other FPGA functions and restricting overall interface performance. • By contrast, the PicoBlaze microcontroller is fully embedded in the FPGA, with flexible, extensive on-chip connectivity to other FPGA resources. • The PicoBlaze microcontroller reduces system cost because it is a single-chip solution, integrated within the FPGA.
  54. Why Use a Microcontroller within an FPGA? • Microcontrollers and FPGAs can both successfully implement practically any digital logic function. Each has unique advantages in cost, performance, and ease of use. • Microcontrollers are well suited to control applications, especially those with widely changing requirements. • The same FPGA logic is re-used by the various microcontroller instructions, conserving resources. • Programming control sequences or state machines in assembly code is often easier than creating similar structures in FPGA logic. • As an application increases in complexity, the number of instructions required to implement it grows, and system performance decreases accordingly.
  55. Why Use a Microcontroller within an FPGA? • An FPGA is more flexible than a microcontroller. For example, an algorithm can be implemented sequentially or completely in parallel, depending on the performance requirements. A completely parallel implementation is faster but consumes more FPGA resources. • A microcontroller embedded within the FPGA provides the best of both. • The microcontroller implements complex control functions that are not timing-critical, while timing-critical or datapath functions are best implemented using FPGA logic. For example, a microcontroller cannot respond to events much faster than a few microseconds, whereas the FPGA logic can respond to multiple, simultaneous events in just a few to tens of nanoseconds.
  56. Why Use a Microcontroller within an FPGA?
      PicoBlaze Microcontroller
        Strengths: • Easy to program, excellent for control and state machine applications • Resource requirements remain constant with increasing complexity • Re-uses logic resources, excellent for lower-performance functions
        Weaknesses: • Executes sequentially • Performance degrades with increasing complexity • Program memory requirements increase with increasing complexity • Slower response to simultaneous inputs
      FPGA Logic
        Strengths: • Significantly higher performance • Excellent at parallel operations • Sequential vs. parallel implementation • Fast response to multiple, simultaneous inputs
        Weaknesses: • Control and state machine applications more difficult to program • Logic resources grow with increasing complexity
  57. PicoBlaze Microcontroller Features • 16 byte-wide general-purpose data registers • 1K instructions of programmable on-chip program store, automatically loaded during FPGA configuration • Byte-wide ALU with CARRY and ZERO indicator flags • 64-byte internal scratchpad RAM • 256 input and 256 output ports • Automatic 31-location CALL/RETURN stack • Predictable performance, always two clock cycles per instruction, up to 200 MHz or 100 MIPS in a Virtex-II Pro FPGA • Fast interrupt response (worst-case 5 clock cycles) • Optimized for the Xilinx Spartan-3 architecture: just 96 slices and 0.5 to 1 block RAM
  58. PicoBlaze Microcontroller (Block Diagram)
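Because every instruction takes two clock cycles, the throughput and worst-case interrupt latency follow directly from the clock frequency. A minimal sketch of that arithmetic is shown below; the function names are hypothetical.

```python
CYCLES_PER_INSTRUCTION = 2      # fixed for every PicoBlaze instruction
WORST_CASE_INTERRUPT_CYCLES = 5

def mips(clock_mhz):
    """Instruction throughput in millions of instructions per second."""
    return clock_mhz / CYCLES_PER_INSTRUCTION

def interrupt_latency_ns(clock_mhz):
    """Worst-case interrupt response time in nanoseconds."""
    return WORST_CASE_INTERRUPT_CYCLES * 1000.0 / clock_mhz

print(mips(200), mips(88), interrupt_latency_ns(200))
```

The 44 to 100 MIPS range quoted for PicoBlaze therefore corresponds to clock rates of 88 to 200 MHz.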
  59. PicoBlaze Microcontroller Functional Blocks • General-Purpose Registers – The PicoBlaze microcontroller includes 16 byte-wide general-purpose registers, designated as registers s0 through sF. – All register operations are completely interchangeable. – There is no dedicated accumulator; each result is computed in a specified register. • 1,024-Instruction Program Store – The PicoBlaze microcontroller executes up to 1,024 instructions from memory within the FPGA. Each PicoBlaze instruction is 18 bits wide. – Other memory organizations are possible to accommodate more PicoBlaze controllers within a single FPGA.
  60. PicoBlaze Microcontroller Functional Blocks • Arithmetic Logic Unit (ALU) – The byte-wide ALU performs all microcontroller calculations, including: • Basic arithmetic operations such as addition and subtraction • Bitwise logic operations such as AND, OR, and XOR • Arithmetic compare and bitwise test operations • Comprehensive shift and rotate operations – All operations are performed using an operand provided by any specified register (sX). The result is returned to the same specified register (sX). – If an instruction requires a second operand, the second operand is either a second register (sY) or an 8-bit immediate constant (kk). • Flags – ALU operations affect the ZERO and CARRY flags. – The INTERRUPT_ENABLE flag enables the INTERRUPT input.
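The effect of byte-wide ALU operations on the ZERO and CARRY flags can be sketched as follows. This is an illustrative model rather than the PicoBlaze implementation: it assumes CARRY is set on overflow past 0xFF for addition and on borrow for subtraction, and the function names are hypothetical.

```python
def alu_add(sx, operand):
    """8-bit ADD: result returns to sX; flags reflect the result."""
    total = sx + operand
    result = total & 0xFF
    return result, {"ZERO": result == 0, "CARRY": total > 0xFF}

def alu_sub(sx, operand):
    """8-bit SUB: CARRY is assumed to indicate a borrow."""
    diff = sx - operand
    result = diff & 0xFF
    return result, {"ZERO": result == 0, "CARRY": diff < 0}

# The second operand may be a register value (sY) or an 8-bit constant (kk).
print(alu_add(0xFF, 0x01))
print(alu_sub(0x00, 0x01))
```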
  61. PicoBlaze Microcontroller Functional Blocks • 64-Byte Scratchpad RAM – The PicoBlaze microcontroller provides an internal general-purpose 64-byte scratchpad RAM, directly or indirectly addressable from the register file using the STORE and FETCH instructions. – The STORE instruction writes the contents of any of the 16 registers to any of the 64 RAM locations. – The complementary FETCH instruction reads any of the 64 memory locations into any of the 16 registers.
  62. PicoBlaze Microcontroller Functional Blocks • Input/Output – The input/output ports extend the PicoBlaze microcontroller's capabilities and allow the microcontroller to connect to a custom peripheral set or to other FPGA logic. – The PicoBlaze microcontroller supports up to 256 input ports and 256 output ports, or a combination of input/output ports. – The PORT_ID output provides the port address. • During an INPUT operation, the PicoBlaze microcontroller reads data from the IN_PORT port into a specified register, sX. • During an OUTPUT operation, the PicoBlaze microcontroller writes the contents of a specified register, sX, to the OUT_PORT port.
  63. PicoBlaze Microcontroller Functional Blocks • Program Counter (PC) – The Program Counter (PC) points to the next instruction to be executed. – By default the PC advances to the next memory location; only the JUMP, CALL, and RETURN instructions and the Interrupt and Reset events modify this default behavior. – If the PC reaches the top of the memory at 3FF hex, it rolls over to location 000. • Program Flow Control – The default execution sequence of the program can be modified using conditional and unconditional program flow control instructions. – CALL and RETURN instructions provide subroutine facilities for commonly used sections of code.
  64. PicoBlaze Microcontroller Functional Blocks • CALL/RETURN Stack – The CALL/RETURN hardware stack stores up to 31 instruction addresses. – When the stack is full, it overwrites the oldest value. – No program memory is required for the stack. • Interrupts – The optional INTERRUPT input allows the PicoBlaze microcontroller to handle asynchronous external events. – The PicoBlaze microcontroller responds to interrupts quickly, in at most five clock cycles. • Reset – The PicoBlaze microcontroller is automatically reset immediately after the FPGA configuration process completes. – The PC is reset to address 0, the flags are cleared, interrupts are disabled, and the CALL/RETURN stack is reset.
  65. PicoBlaze Architecture (figure)
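The program counter rollover and the 31-entry CALL/RETURN stack can be modelled in a few lines. This is a behavioural sketch only: the class name is hypothetical, and it assumes that CALL saves the address of the CALL instruction and RETURN resumes at the saved address plus one.

```python
from collections import deque

class FlowControl:
    """Toy model of a 10-bit PC with a 31-entry hardware call stack."""
    PC_MASK = 0x3FF   # 1,024 instruction locations

    def __init__(self):
        self.pc = 0
        # A bounded deque drops the oldest entry once 31 are stored.
        self.stack = deque(maxlen=31)

    def step(self):
        """Default behaviour: advance, rolling over from 3FF to 000."""
        self.pc = (self.pc + 1) & self.PC_MASK

    def call(self, target):
        self.stack.append(self.pc)
        self.pc = target & self.PC_MASK

    def ret(self):
        # An empty stack raises IndexError here; real hardware would not.
        self.pc = (self.stack.pop() + 1) & self.PC_MASK

fc = FlowControl()
fc.pc = 0x3FF
fc.step()
print(hex(fc.pc))
```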
  66. MicroBlaze Processor • The MicroBlaze embedded processor soft core is a RISC processor optimized for implementation in Xilinx FPGAs. • With few exceptions, the MicroBlaze can issue a new instruction every cycle, maintaining single-cycle execution under most circumstances. • MicroBlaze's primary I/O bus, the CoreConnect PLB bus, is used for system-memory data transactions. • For accessing local memory, MicroBlaze uses a dedicated LMB bus, which reduces loading on the other buses. • User-defined coprocessors are supported through a dedicated FIFO-style connection called FSL (Fast Simplex Link).
  67. MicroBlaze Processor • Many aspects of the MicroBlaze can be user configured: – Cache size, – Pipeline depth (3-stage or 5-stage), – Embedded peripherals, – Memory management unit, and – Bus interfaces. • The area-optimized version of MicroBlaze uses a 3-stage pipeline. • The performance-optimized version expands the execution pipeline to 5 stages.
  68. MicroBlaze • Features • The MicroBlaze soft core processor is highly configurable, allowing you to select a specific set of features. • The fixed feature set of the processor includes: – Thirty-two 32-bit general purpose registers – 32-bit instruction word with three operands and two addressing modes – 32-bit address bus – Single-issue pipeline
  69. MicroBlaze Architecture (figure)
  70. MicroBlaze • Data Types and Endianness – MicroBlaze uses Big-Endian, bit-reversed format to represent data. – The hardware-supported data types for MicroBlaze are word, half word, and byte. (Tables: Word Data Type, Half Word Data Type, Byte Data Type)
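Big-endian byte order and bit-reversed numbering can be illustrated with a short sketch. It assumes that "bit-reversed" means bit 0 names the most significant bit of a word, so that bit 31 is the least significant; the helper names are hypothetical.

```python
import struct

def to_big_endian_bytes(word):
    """Lay out a 32-bit word in memory, most significant byte first."""
    return struct.pack(">I", word)

def bit_mask(bit_number, width=32):
    """Mask for a bit when bit 0 is the MSB and bit width-1 is the LSB."""
    return 1 << (width - 1 - bit_number)

print(to_big_endian_bytes(0x11223344).hex())
print(hex(bit_mask(0)), hex(bit_mask(29)), hex(bit_mask(31)))
```

Under this numbering, bit 0 of a 32-bit register corresponds to the mask 0x80000000 and bit 29 to the mask 0x4.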
  71. MicroBlaze • Instructions • All MicroBlaze instructions are 32 bits wide and are defined as either Type A or Type B. • Type A instructions have up to two source register operands and one destination register operand. • Type B instructions have one source register and a 16-bit immediate operand. • Instructions are provided in the following functional categories: – arithmetic, – logical, – branch, – load/store, and – special.
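The two instruction types can be pictured as two ways of slicing a 32-bit word. The decoder below is a sketch only: the field layout (a 6-bit opcode, then 5-bit destination and source register fields, then either a third 5-bit register field or a 16-bit immediate) is an assumption that fits the sizes given in the slides, and the opcode values used in the example are placeholders, not real MicroBlaze opcodes.

```python
def decode_type_a(word):
    """Assumed layout: opcode[6] rD[5] rA[5] rB[5] unused[11]."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rd": (word >> 21) & 0x1F,
        "ra": (word >> 16) & 0x1F,
        "rb": (word >> 11) & 0x1F,
    }

def decode_type_b(word):
    """Assumed layout: opcode[6] rD[5] rA[5] imm[16], immediate sign-extended."""
    imm = word & 0xFFFF
    if imm & 0x8000:
        imm -= 0x10000
    return {
        "opcode": (word >> 26) & 0x3F,
        "rd": (word >> 21) & 0x1F,
        "ra": (word >> 16) & 0x1F,
        "imm": imm,
    }

# Placeholder opcodes 0x01 and 0x02, chosen only for this example.
word_a = (0x01 << 26) | (1 << 21) | (2 << 16) | (3 << 11)
word_b = (0x02 << 26) | (3 << 21) | (4 << 16) | 0xFFFE
print(decode_type_a(word_a))
print(decode_type_b(word_b))
```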
  72. MicroBlaze • Registers – MicroBlaze has thirty-two 32-bit general purpose registers and up to eighteen 32-bit special purpose registers. 1. General Purpose Registers: The thirty-two 32-bit general purpose registers are numbered R0 through R31.
  73. MicroBlaze 2. Special Purpose Registers • Program Counter (PC): The Program Counter is the 32-bit address of the instruction being executed.
  74. MicroBlaze 2. Special Purpose Registers • Machine Status Register (MSR): The Machine Status Register contains control and status bits for the processor. When reading, bit 29 is replicated in bit 0 as the carry copy. When writing, the carry bit takes effect immediately and the remaining bits take effect one clock cycle later. The MSR is specified by setting Sx = 0x0001.
  75. MicroBlaze 2. Special Purpose Registers • Exception Status Register (ESR): The Exception Status Register contains status bits for the processor. The ESR is specified by setting Sa = 0x0005. • Branch Target Register (BTR): The Branch Target Register only exists if the MicroBlaze processor is configured to use exceptions. The BTR is specified by setting Sa = 0x000B.
  76. MicroBlaze 2. Special Purpose Registers • Floating Point Status Register (FSR): The Floating Point Status Register contains status bits for the floating point unit. The FSR is specified by setting Sa = 0x0007. • Exception Data Register (EDR): The Exception Data Register holds data associated with one specific type of exception; its contents are undefined for all other exceptions. The EDR is specified by setting Sa = 0x000D.
  77. MicroBlaze 2. Special Purpose Registers • Zone Protection Register (ZPR): The Zone Protection Register is used to override the MMU memory protection defined in TLB entries.
  78. MicroBlaze • Pipeline Architecture • MicroBlaze instruction execution is pipelined. For most instructions, each stage takes one clock cycle to complete. • Consequently, the number of clock cycles necessary for a specific instruction to complete is equal to the number of pipeline stages. • A few instructions require multiple clock cycles in the execute stage to complete.
  79. MicroBlaze • Pipeline Architecture • Three-stage pipeline: Fetch, Decode, and Execute. • Five-stage pipeline: Fetch (IF), Decode (OF), Execute (EX), Access Memory (MEM), and Writeback (WB).
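The relationship between pipeline depth and cycle count can be sketched with an idealized model. It assumes one cycle per stage, one instruction issued per cycle, and no hazards or stalls other than an optional number of extra execute-stage cycles; the function name is hypothetical.

```python
def total_cycles(n_instructions, stages, extra_execute_cycles=0):
    """Cycles to complete a straight-line sequence on an ideal pipeline."""
    if n_instructions == 0:
        return 0
    # The first instruction needs `stages` cycles; each later one adds a cycle.
    return stages + (n_instructions - 1) + extra_execute_cycles

print(total_cycles(1, 3), total_cycles(1, 5))
print(total_cycles(10, 3), total_cycles(10, 5))
```

A single instruction takes as many cycles as there are stages, while a long sequence approaches one instruction per cycle for either pipeline depth.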
  80. Thank You… This presentation is published only for educational purposes.