POWER-AS is a 64-bit RISC architecture implemented by POWER processor chips. It is backward compatible with 32-bit PowerPC, allowing 32-bit applications to run on 32-bit or 64-bit operating systems. The architecture has general purpose and special purpose registers that are 64 bits wide, as well as floating point, decimal floating point, and vector/scalar instruction sets. Branches, conditionals, and other operations work between registers in a RISC fashion.
3. RISC & Endianness
Reduced Instruction Set
Little Endian
Big Endian
AIX [Advanced Interactive eXecutive]
IBM i
Linux
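The difference between the two byte orders listed on this slide can be sketched in a few lines of Python. The 32-bit value and the `to_hex` helper are illustrative assumptions, not part of the original slides.

```python
import struct

value = 0x0A0B0C0D  # arbitrary 32-bit example value (assumption)

# Big endian: the most significant byte is stored at the lowest address.
big = struct.pack(">I", value)
# Little endian: the least significant byte is stored at the lowest address.
little = struct.pack("<I", value)


def to_hex(data: bytes) -> str:
    """Format bytes in memory order, lowest address first."""
    return " ".join(f"{b:02x}" for b in data)


print("big endian:   ", to_hex(big))
print("little endian:", to_hex(little))
```

Both encodings hold the same number; only the order of the bytes in memory differs.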
4. 64-bit Power AS Architecture
Backward compatible with 32-bit PowerPC architecture (e.g. 603, 604, 750) to a point:
◦ Allows 32-bit applications to run under either a 32-bit or 64-bit operating system (OS).
◦ Allows 64-bit applications to run under a 64-bit OS
•Most architected registers are 64 bits wide (GPRs, FPRs, etc.)
•Major changes to the address translation from 32-bit designs (Book III)
•Allows backward compatibility with IBM i (Tags Active Mode / Architecture)
•P6 (POWER6) implements Power ISA 2.05
•P7 (POWER7) implements Power ISA 2.06
•P8 (POWER8) implements Power ISA 2.07(B)
•P9 (POWER9) implements Power ISA 3.0(B)
5. Book 1
Contains basic information needed by application programmers
Instruction Categories
◦ Branches
◦ Loads / Stores
◦ Arithmetic ops are all Register-to-Register
◦ FXU (fixed point)
◦ BFP (binary floating point)
◦ DFP (decimal floating point)
◦ Vector (VMX)
◦ Vector/Scalar instructions and registers
6. Book 1 (continued)
Special Purpose Registers (SPRs)
Condition Register (CR)
Count Register (CTR)
Target Address Register (TAR)
Link Register (LR)
Fixed Point Exception Register (XER)
Floating Point Status and Control Register (FPSCR)
Vector Status and Control Register (VSCR)
7. Branches
Both conditional and unconditional branches
◦ Relative branches (b*)
◦ Absolute branches (ba*)
◦ Branch instructions ending in an “l” update the Link Register for call/return operations
Conditional branches (bc*) branch based on the “BO” field in the instruction, which selects whether a CR bit is tested and whether the CTR is decremented and tested.
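How the “BO” field steers a conditional branch can be sketched as below. This is a simplified model under stated assumptions: the 5-bit BO layout and the BI field (which picks the CR bit) follow the Power ISA branch description as I understand it, and `cr_bit` and `bc_taken` are hypothetical helper names, not real APIs.

```python
MASK64 = (1 << 64) - 1


def cr_bit(cr: int, bi: int) -> int:
    """Return CR bit `bi` (0..31); bit 0 is the most significant bit."""
    return (cr >> (31 - bi)) & 1


def bc_taken(bo: int, bi: int, cr: int, ctr: int):
    """Evaluate a bc-style branch; returns (taken, new_ctr)."""
    bo0 = (bo >> 4) & 1  # 1 = ignore the CR bit
    bo1 = (bo >> 3) & 1  # CR bit value that counts as "condition met"
    bo2 = (bo >> 2) & 1  # 1 = do not decrement or test the CTR
    bo3 = (bo >> 1) & 1  # 1 = branch when CTR == 0, 0 = when CTR != 0
    if not bo2:
        ctr = (ctr - 1) & MASK64          # CTR is decremented first
    ctr_ok = bool(bo2) or ((ctr != 0) != bool(bo3))
    cond_ok = bool(bo0) or (cr_bit(cr, bi) == bo1)
    return ctr_ok and cond_ok, ctr


# BO = 0b10000 behaves like "bdnz": decrement CTR, branch while CTR != 0.
print(bc_taken(0b10000, 0, 0, 2))
# BO = 0b01100 behaves like "branch if CR bit is set" (here: CR0 EQ, bit 2).
print(bc_taken(0b01100, 2, 0x20000000, 5))
```

The branch is taken only when both the CTR test and the CR test pass; either test can be switched off through BO.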
8. CR Instructions
◦ Condition Register (CR)
◦ 32 bits wide
◦ Architecturally: eight 4-bit fields (CR0 through CR7)
◦ CR0 implicitly updated by
◦ Fixed point “dot” instructions
◦ Arithmetic recording ops
◦ stcx., icswx. and pbt.
◦ Transactional Memory instructions except tcheck
◦ Transactional Memory Failure Handling
◦ CR1 implicitly updated by FP “dot” instructions
◦ BFP, DFP and VSX
◦ CR6 implicitly updated by VMX “dot” instructions
◦ Remaining CR fields updated by compare type instructions, tcheck and mtcrf instructions
◦ CR logical instructions for manipulating the CR
◦ crand, crnand, etc.
◦ cror, crnor, etc.
◦ Move to and move from CR instructions
◦ mtcrf, mtocrf, mfcr, mfocrf
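The CR layout described on this slide (a 32-bit register viewed as eight 4-bit fields) can be modeled with plain integers. This is an illustrative sketch: the LT/GT/EQ/SO bit order within a field follows the Power ISA as I understand it, and the function names are hypothetical rather than real instructions or APIs.

```python
def compare_signed(a: int, b: int, so: int = 0) -> int:
    """Build a 4-bit CR field: LT, GT, EQ, SO (LT is the high bit)."""
    lt, gt, eq = int(a < b), int(a > b), int(a == b)
    return (lt << 3) | (gt << 2) | (eq << 1) | (so & 1)


def set_cr_field(cr: int, n: int, field: int) -> int:
    """Write 4-bit `field` into CR field n (CR0 is the top nibble)."""
    shift = 4 * (7 - n)
    return (cr & ~(0xF << shift) & 0xFFFFFFFF) | ((field & 0xF) << shift)


def get_cr_field(cr: int, n: int) -> int:
    """Read CR field n as a 4-bit value."""
    return (cr >> (4 * (7 - n))) & 0xF


def crand(cr: int, bt: int, ba: int, bb: int) -> int:
    """Model a CR logical AND: CR[bt] = CR[ba] & CR[bb] (bit 0 = MSB)."""
    a = (cr >> (31 - ba)) & 1
    b = (cr >> (31 - bb)) & 1
    mask = 1 << (31 - bt)
    return (cr | mask) if (a & b) else (cr & ~mask)


cr = set_cr_field(0, 0, compare_signed(1, 2))   # CR0 records "less than"
cr = set_cr_field(cr, 7, compare_signed(5, 5))  # CR7 records "equal"
print(hex(cr), hex(crand(cr, 4, 0, 30)))
```

Keeping results in separate fields lets several compares stay live at once, with the CR logical operations combining them before a single branch.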
9. Branches (continued)
◦ Count Register (CTR)
◦ 2 purposes:
◦ Counter that is decremented by conditional branches whose BO field requests it; the branch can then test whether the CTR is zero or nonzero
◦ Holds branch target address for “branch to count” instructions
◦ COUNT CACHE: Our designs have an architecturally invisible “Count Cache” which contains a history of previously used branch target addresses used by “branch to count” and “branch to TAR” instructions
◦ Target Address Register (TAR)
◦ 1 purpose
◦ Similar to the CTR, but only use is to hold the branch target address for “branch to TAR” instructions
◦ Previously used values saved in “Count Cache”
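The two CTR roles above (loop counter and branch target holder) can be mimicked in Python as a rough analogy. The function names and the `handlers` table are hypothetical, and the comments mention `mtctr`, `bdnz` and `bctrl` only to show which step each line stands in for.

```python
def sum_with_ctr(values):
    """Sum a list using a countdown, the way a CTR-driven loop would."""
    total = 0
    ctr = len(values)          # like mtctr: load the trip count into CTR
    index = 0
    while ctr != 0:            # guard so a zero trip count skips the loop
        total += values[index]
        index += 1
        ctr -= 1               # like bdnz: decrement CTR, loop while nonzero
    return total


def call_via_ctr(handlers, name, argument):
    """Indirect call: the 'CTR' holds the target chosen at run time."""
    ctr = handlers[name]       # like mtctr: CTR holds the branch target
    return ctr(argument)       # like bctrl: branch to the address in CTR


print(sum_with_ctr([1, 2, 3, 4]))
print(call_via_ctr({"double": lambda x: 2 * x}, "double", 21))
```

In the first role the loop needs no separate compare instruction; in the second the target is only known at run time, which is why a history of previous targets helps prediction.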
10. Branches (continued)
◦ Link Register (LR)
◦ 1 purpose
◦ Similar to the CTR/TAR in that it contains a branch target address
◦ Difference is that branch instructions ending in an “l” (“branch and link”) implicitly update LR with the address of the instruction that follows the branch (the fall-through address), whether or not the branch is taken.
◦ Used for subroutine CALLS / RETURNS
◦ “RETURN” is accomplished by “branch to link” instruction
◦ LINK STACK: Our designs have an architecturally invisible “Link Stack” which contains previously used LR values. Each “branch and link” pushes an LR value to the stack. Each “branch to link” pops an LR value from the stack which can be used by branch prediction to prefetch that branch target address.
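The push/pop behavior of the Link Stack described above can be sketched as a small class. This is a toy model, not the hardware design: the depth of 16, the overflow policy (drop the oldest entry) and the fixed 4-byte instruction length are all assumptions made for illustration.

```python
class LinkStack:
    """Toy return-address predictor (all parameters are hypothetical)."""

    def __init__(self, depth=16):
        self.depth = depth
        self.entries = []

    def on_branch_and_link(self, branch_addr):
        """A 'branch and link' writes LR and pushes the same value."""
        return_addr = branch_addr + 4      # assumes 4-byte instructions
        if len(self.entries) == self.depth:
            self.entries.pop(0)            # assumed policy: drop the oldest
        self.entries.append(return_addr)
        return return_addr                 # the value LR would receive

    def predict_branch_to_link(self):
        """A 'branch to link' pops the predicted return address, if any."""
        return self.entries.pop() if self.entries else None


stack = LinkStack()
stack.on_branch_and_link(0x1000)           # outer call
stack.on_branch_and_link(0x2000)           # nested call
print(hex(stack.predict_branch_to_link())) # nested return predicted first
print(hex(stack.predict_branch_to_link()))
```

Because calls and returns nest, the most recently pushed address is normally the next return target, so a stack predicts well as long as call depth stays within its capacity.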
11. Branches (continued)
Branch History Rolling Buffer (BHRB)
◦ Architecturally visible structure used to hold previously taken branch instruction addresses (NOT the target addresses)
◦ Can be used by programmers to trace code execution involving branches
◦ Depth of the BHRB can vary by design and SMT mode
◦ Software responsibility to manage what is read from BHRB accordingly
◦ “clrbhrb” instruction for clearing contents of BHRB
◦ “mfbhrbe” instruction for reading an entry of the BHRB
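A rolling buffer of taken-branch addresses, with read and clear operations like the two instructions listed above, can be sketched with a bounded deque. The depth of 32, the class and method names, and the convention that an empty slot reads as 0 are assumptions for illustration only.

```python
from collections import deque


class BranchHistoryRollingBuffer:
    """Toy BHRB: keeps the most recent taken-branch addresses."""

    def __init__(self, depth=32):          # depth is a hypothetical value
        self.entries = deque(maxlen=depth)

    def record_taken_branch(self, branch_addr):
        # Newest entry goes to index 0; the oldest falls off the far end.
        self.entries.appendleft(branch_addr)

    def read_entry(self, n):
        """Read entry n (0 = most recent); 0 if the slot is empty."""
        return self.entries[n] if n < len(self.entries) else 0

    def clear(self):
        """Discard all recorded history."""
        self.entries.clear()


bhrb = BranchHistoryRollingBuffer(depth=2)
for addr in (0x100, 0x200, 0x300):
    bhrb.record_taken_branch(addr)
print(hex(bhrb.read_entry(0)), hex(bhrb.read_entry(1)), bhrb.read_entry(2))
```

Since the depth can vary by design and SMT mode, software reading the buffer should stop at the first empty slot rather than assume a fixed length.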
12. Fixed Point Overview
◦ Fixed Point (aka Integer) Instructions
◦ Register-to-register instructions
◦ Add, Subtract, Multiply, Divide
◦ And, Or, Xor, etc.
◦ Shift, rotate
◦ Compare, Select
◦ Trap
◦ BCD
◦ DARN (Deliver A Random Number)
◦ Move to / from GPR/VSR
13. Fixed Point Overview (continued)
32 General Purpose Registers (GPRs)
◦ Each 64 bits wide
•Fixed Point Exception Register (XER)
◦ 64 bits wide
◦ OV and CA bits
◦ OV32 and CA32 bits (new in P9)
◦ SO (sticky)
◦ TAG bit (tags active only)
◦ Various other tags active bits
◦ length field for indexed form load / store string instructions
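The meaning of the XER carry and overflow bits listed above can be illustrated with a 64-bit add. This sketch assumes that CA32 and OV32 report what CA and OV would be for the low 32 bits, and that SO is set whenever OV is set and then stays set; `add_flags` and `to_signed` are hypothetical helpers.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as two's complement."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def add_flags(a: int, b: int, so: int = 0):
    """64-bit add returning (result, flags) in the style of the XER bits."""
    a &= MASK64
    b &= MASK64
    result = (a + b) & MASK64
    ca = int(a + b > MASK64)                          # carry out of 64 bits
    ca32 = int((a & MASK32) + (b & MASK32) > MASK32)  # carry out of low 32
    s64 = to_signed(a, 64) + to_signed(b, 64)
    ov = int(not (-(1 << 63) <= s64 < (1 << 63)))     # signed 64-bit overflow
    s32 = to_signed(a, 32) + to_signed(b, 32)
    ov32 = int(not (-(1 << 31) <= s32 < (1 << 31)))   # signed 32-bit overflow
    so = so | ov                                      # sticky summary overflow
    return result, {"CA": ca, "CA32": ca32, "OV": ov, "OV32": ov32, "SO": so}


print(add_flags(MASK64, 1))
print(add_flags(0x7FFFFFFFFFFFFFFF, 1))
```

Carry describes the unsigned view of the add and overflow the signed view, which is why the two examples set different bits.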
14. Floating Point Overview
P7 and later server designs support:
◦ Legacy Binary Floating Point (BFP) SISD
◦ Single Precision (SP): 32 bit operands x 1 per FPR
◦ Double Precision (DP): 64 bit operands x 1 per FPR
◦ All BFP instructions begin with letter “f”
◦ Address 32 FPRs
◦ VMX SIMD
◦ Single Precision Only: 32 bit operands x 4 per VR
◦ All VMX instructions begin with letter “v”
◦ Address 32 VRs
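The “32 bit operands x 4 per VR” layout can be pictured as four single-precision lanes packed into 16 bytes, with one operation applied to every lane. This is a sketch only: the big-endian lane order and the helper names are assumptions, and real VMX arithmetic may differ from this model in corner cases such as denormal handling.

```python
import struct

LANES = ">4f"  # four 32-bit floats, big-endian lane order (assumption)


def pack_vr(lanes):
    """Pack four single-precision values into a 16-byte 'vector register'."""
    if len(lanes) != 4:
        raise ValueError("a 128-bit VR holds exactly four SP lanes")
    return struct.pack(LANES, *lanes)


def add_sp_lanes(vra: bytes, vrb: bytes) -> bytes:
    """Lane-wise add of two packed vectors (one operation, four results)."""
    a = struct.unpack(LANES, vra)
    b = struct.unpack(LANES, vrb)
    return struct.pack(LANES, *(x + y for x, y in zip(a, b)))


va = pack_vr([1.0, 2.0, 3.0, 4.0])
vb = pack_vr([0.5, 0.5, 0.5, 0.5])
print(len(va) * 8, struct.unpack(LANES, add_sp_lanes(va, vb)))
```

The scalar BFP case on the same slide corresponds to a single lane per register, which is the SISD versus SIMD distinction the slide draws.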
15. Floating Point Overview (continued)
◦ Decimal Floating Point (DFP) SISD
◦ Short: 32 bit operands x 1 per FPR
◦ Long: 64 bit operands x 1 per FPR
◦ Extended: 128 bit operands x 1 split over even-odd FPR-pair
◦ All DFP instructions begin with “d”; All quadword (QW) DFP instructions end in “q”
◦ Address 32 FPRs (or 16 pairs for QW)
◦ Vector-Scalar Extension (VSX) SIMD
◦ Single Precision (sp): 32 bit operands x 4 per VSR
◦ Yield same result as equivalent SP BFP instruction
◦ Double Precision (dp): 64 bit operands x 2 per VSR
◦ Yield same result as equivalent DP BFP instruction
◦ All VSX instructions begin with “x”
◦ Address 64 VSRs
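The even-odd FPR pairing used for 128-bit extended DFP operands, mentioned earlier on this slide, can be modeled as splitting one value across two 64-bit slots. This sketch assumes the even register holds the high-order half and treats an odd register number as an error; the helper names and the use of `ValueError` are illustrative choices, not hardware behavior.

```python
MASK64 = (1 << 64) - 1


def store_extended(fprs, frt, value128):
    """Split a 128-bit value across the even-odd pair (frt, frt + 1)."""
    if frt % 2 != 0:
        raise ValueError("extended operands need an even-numbered FPR")
    fprs[frt] = (value128 >> 64) & MASK64   # assumed: high half in even FPR
    fprs[frt + 1] = value128 & MASK64       # low half in the odd FPR


def load_extended(fprs, frt):
    """Rebuild the 128-bit value from the even-odd pair."""
    if frt % 2 != 0:
        raise ValueError("extended operands need an even-numbered FPR")
    return (fprs[frt] << 64) | fprs[frt + 1]


fprs = [0] * 32                             # 32 FPRs, i.e. 16 usable pairs
value = (0x1111 << 64) | 0x2222
store_extended(fprs, 4, value)
print(hex(fprs[4]), hex(fprs[5]), load_extended(fprs, 4) == value)
```

With 32 FPRs this pairing gives 16 addressable extended operands, matching the “16 pairs for QW” note.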
16. ISA/A2I
Power ISA is open source
https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0
A2I POWER Processor Core
https://openpowerfoundation.org/a2i-power-processor-core-contributed-to-openpower-community-to-advance-open-hardware-collaboration/