2. The CPU Bus The bus is the mechanism by which the CPU communicates with memory and devices. A bus is, at a minimum, a collection of wires, but the bus also defines a protocol by which the CPU, memory, and devices communicate. One of the major roles of the bus is to provide an interface to memory.
3. Bus Protocols A bus protocol determines how devices communicate. Devices on the bus go through sequences of states. Protocols are specified by state machines, one state machine per actor in the protocol. A protocol may contain asynchronous logic behavior.
4. Four-cycle handshake Device 1 raises its output to signal an enquiry, which tells device 2 that it should get ready to listen for data. When device 2 is ready to receive, it raises its output to signal an acknowledgement. At this point, devices 1 and 2 can transmit or receive. Once the data transfer is complete, device 2 lowers its output, signaling that it has received the data. After seeing that the acknowledgement has been released, device 1 lowers its output.
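The four steps of the handshake can be traced with a small simulation. This is a minimal sketch, not a definitive model: the function name `four_cycle_handshake` and the signal names `enq` and `ack` are assumptions chosen for illustration, and the two devices are modeled as plain variables rather than hardware.

```python
def four_cycle_handshake(data):
    """Trace one four-cycle handshake and return (events, received_data)."""
    enq = 0          # device 1's output (enquiry), name is an assumption
    ack = 0          # device 2's output (acknowledgement), name is an assumption
    events = []
    received = None

    enq = 1                                   # 1. device 1 signals an enquiry
    events.append(("device 1 raises enq", enq, ack))
    ack = 1                                   # 2. device 2 is ready to receive
    events.append(("device 2 raises ack", enq, ack))
    received = data                           #    data moves while both are high
    events.append(("data transferred", enq, ack))
    ack = 0                                   # 3. device 2 has received the data
    events.append(("device 2 lowers ack", enq, ack))
    enq = 0                                   # 4. device 1 sees ack released
    events.append(("device 1 lowers enq", enq, ack))
    return events, received
```

Calling `four_cycle_handshake(0x5A)` returns the list of signal changes in order together with the transferred value, which makes the ordering of the four edges easy to inspect.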
6. Microprocessor busses Clock provides synchronization to the bus components. R/W’ is true when the bus is reading and false when the bus is writing. Address is an a-bit bundle of signals that transmits the address for an access. Data is an n-bit bundle of signals that can carry data to or from the CPU. Data ready’ signals when the values on the data bundle are valid.
8. Timing diagrams A timing diagram shows how the signals on a bus vary over time. Since values like the address and data can take on many values, some standard notation is used to describe signals. A signal can be drawn at a 0 or 1 level, or as stable or changing. To be sure that signals go to their proper values at the proper time, timing diagrams sometimes show timing constraints.
10. Timing diagram for the example bus The timing diagram is shown with timing constraints for the example bus. The diagram shows a read and a write. Timing constraints are shown only for the read operation, but similar constraints apply to the write operation. The bus is normally in read mode, since that does not change any state. During a read the external device or memory is sending a value on the data lines, while during a write the CPU is controlling the data lines.
12. Read Operation on timing diagram A read or write is initiated by setting address enable high after the clock starts to rise. We set R/W’=1 to indicate a read, and the address lines are set to the desired address. One clock cycle later, the memory or device is expected to assert the data value at that address on the data lines. Simultaneously, the external device specifies that the data are valid by pulling down the data ready’ line. This line is active low, meaning that a logically true value is indicated by a low voltage, in order to provide increased immunity to electrical noise. The CPU is free to remove the address at the end of the clock cycle and must do so before the beginning of the next cycle. The external device has a similar requirement for removing the data value from the data lines.
14. Burst read The handshake that tells the CPU and devices when data are to be transferred is formed by data ready for the acknowledge side, but is implicit for the enquiry side. The data ready signal allows the bus to be connected to devices that are slower than the bus. The cycles between the minimum time at which data can be asserted and the time at which it is actually asserted are known as wait states. In a burst read transaction the CPU sends one address but receives a sequence of data values.
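Wait states and burst reads can be illustrated with a short cycle-counting sketch. The function `bus_read` and its parameters are hypothetical. It assumes that a burst returns values from consecutive addresses and that the device needs the same number of cycles for every transfer, neither of which is stated on the slide.

```python
def bus_read(memory, address, burst_length=1, device_latency=1):
    """Return (values, wait_states) for a read of burst_length words.

    device_latency is the number of cycles the device needs before it
    pulls data ready' low; 1 is the fastest the bus allows.
    """
    values = []
    wait_states = 0
    for offset in range(burst_length):
        cycle = 1                      # earliest cycle at which data may appear
        while cycle < device_latency:  # data ready' is still high: bus waits
            wait_states += 1
            cycle += 1
        # data ready' goes low here, so the CPU samples the data lines
        values.append(memory[address + offset])
    return values, wait_states
```

A device that matches the bus speed (`device_latency=1`) produces no wait states, while a slower device adds `device_latency - 1` wait states per transfer in this model.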
16. State diagrams for bus read [Figure: state diagrams for the CPU and the device, with states labeled start, Adrs, Wait, Get data, Send data, See ack, Ack, Release ack, and Done.]
17. State diagram The state machine view of the bus transaction is a useful complement to the timing diagram. It shows the transitions of the control signals. When the CPU decides to perform a read transaction, it moves to a new state, sending bus signals that cause the device to behave appropriately. The device’s state transition graph captures its side of the protocol.
19. Bus multiplexing Some buses use multiplexed address and data. Additional control lines are provided to tell whether the value on the address/data lines is an address or data. Typically, the address comes first followed by the data. The address can be held in a register until the data arrive so that both can be presented to the device at the same time.
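The address register described above can be sketched as a small latch model. The class name `AddressLatch` and the `is_address` control flag are assumptions for illustration; a real bus would use a dedicated control line for this purpose.

```python
class AddressLatch:
    """Holds the address from a multiplexed bus until the data arrive."""

    def __init__(self):
        self.latched_address = None

    def cycle(self, ad_lines, is_address):
        """Process one bus cycle on the shared address/data lines.

        Returns None during the address phase and an (address, data)
        pair during the data phase, so both reach the device together.
        """
        if is_address:
            self.latched_address = ad_lines   # address phase: capture it
            return None
        return (self.latched_address, ad_lines)  # data phase: present both
```

For example, a cycle carrying `0x2000` with `is_address=True` followed by a cycle carrying `0xAB` with `is_address=False` yields the pair `(0x2000, 0xAB)`.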
20. DMA Direct memory access (DMA) performs data transfers without executing instructions. CPU sets up transfer. DMA engine fetches, writes. DMA controller is a separate unit.
21. Bus mastership By default, CPU is bus master and initiates transfers. DMA must become bus master to perform its work. CPU can’t use bus while DMA operates. Bus mastership protocol: Bus request. Bus grant.
22. DMA operation CPU sets DMA registers for start address, length. DMA status register controls the unit. Once DMA is bus master, it transfers automatically. May run continuously until complete. May use every nth bus cycle.
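The setup, bus mastership, and transfer steps can be put together in a toy model. This is a sketch under stated assumptions: the class `DmaController`, its method names, and the single `busy` status bit are hypothetical, and memory is modeled as a Python list of words.

```python
class DmaController:
    """Toy DMA engine that copies device words into memory."""

    def __init__(self, memory):
        self.memory = memory          # shared memory, a list of words
        self.start_address = 0        # register written by the CPU
        self.length = 0               # register written by the CPU
        self.busy = False             # one bit of a hypothetical status register
        self.is_bus_master = False

    def configure(self, start_address, length):
        """CPU sets up the transfer; this also acts as the bus request."""
        self.start_address = start_address
        self.length = length
        self.busy = True

    def grant_bus(self):
        """CPU grants the bus; it cannot use the bus until the DMA is done."""
        if self.busy:
            self.is_bus_master = True

    def run(self, device_words):
        """Transfer all words without any CPU instructions, then release."""
        if not self.is_bus_master:
            raise RuntimeError("DMA controller does not own the bus")
        for i in range(self.length):
            self.memory[self.start_address + i] = device_words[i]
        self.busy = False
        self.is_bus_master = False    # mastership returns to the CPU
```

This version runs continuously until the transfer completes. Using every nth bus cycle instead would mean interleaving the loop body with CPU accesses.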
24. System bus configurations Multiple busses allow parallelism: Slow devices on one bus. Fast devices on separate bus. A bridge connects two busses. [Figure labels: CPU, memory, high-speed device, bridge, slow devices.]
26. ARM AMBA bus Two varieties: AHB is high-performance. APB is lower-speed, lower cost. AHB supports pipelining, burst transfers, split transactions, multiple bus masters. All devices are slaves on APB.
27. Memory Devices Several different types of memory: Read-only memories: flash. Read/write memories: DRAM, SRAM. Each type of memory comes in varying capacities and widths.
29. The data are stored in a 2-D array of memory cells.
33. Random-access memory Dynamic RAM is dense, requires refresh. Synchronous DRAM is dominant type. SDRAM uses clock to improve performance, pipeline memory accesses. Static RAM is faster, less dense, consumes more power.
34. Static RAM and its operation CE is the chip enable input. It is active low. When CE=1 the SRAM’s data pins are disabled, and when CE=0, the data pins are enabled. R/W controls whether the current operation is a read (R/W=1) or a write (R/W=0). Read and write are normally specified relative to the CPU, so read means reading from RAM and write means writing to RAM. Adrs specifies the address for the read or write. Data is a bidirectional bundle of signals for data transfer. When R/W=1, the data pins are outputs, and when R/W=0, the data pins are inputs.
36. A read operation on the SRAM occurs as follows: CE is set to 0 to enable the chip, and R/W is set to 1. An address is presented on the address lines. After some delay, data appear on the data lines. A write operation is similar: CE is set to 0. R/W is set to 0 for writing. An address is set on the address lines and data are set on the data lines.
38. Timing diagram for Read First, RAS is set to 0 and the row part of the address is set on the address lines. Next, CAS is set to 0 and the column part of the address is put on the address lines.
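Splitting an address into its row and column parts is a matter of shifting and masking. The sketch below assumes the row is taken from the high-order bits and the column from the low-order bits; the function name and the `column_bits` parameter are illustrative, and real parts may map bits differently.

```python
def split_dram_address(address, column_bits):
    """Split a flat address into (row, column) for RAS/CAS addressing."""
    row = address >> column_bits                 # sent first, with RAS = 0
    column = address & ((1 << column_bits) - 1)  # sent second, with CAS = 0
    return row, column
```

With 8 column bits, the address `0x1234` splits into row `0x12` and column `0x34`.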
39. Read-only memory ROM may be programmed at factory. Flash is dominant form of field-programmable ROM. Electrically erasable, must be block erased. Random access, but write/erase is much slower than read. NOR flash is more flexible. NAND flash is more dense.
40. Flash memory Non-volatile memory. Flash can be programmed in-circuit. Random access for read. To write: Erase a block to 1. Write bits to 0.
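The erase-then-write rule means a write can only clear bits, never set them. A minimal model of this, assuming byte-wide cells and a hypothetical `FlashBlock` class, is:

```python
class FlashBlock:
    """One erase block of flash, modeled as a list of byte-wide cells."""

    def __init__(self, size):
        self.cells = [0xFF] * size      # erased state: every bit is 1

    def erase(self):
        """Erase the whole block back to all 1s."""
        self.cells = [0xFF] * len(self.cells)

    def program(self, offset, value):
        """Write a byte: bits can go from 1 to 0 but not from 0 to 1."""
        self.cells[offset] &= value
```

Programming `0xF0` and then `0x0F` into the same cell leaves `0x00`, because the second write cannot restore the bits cleared by the first. Only an erase of the whole block brings them back to 1.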
41. Flash writing Write is much slower than read. 1.6 ms write, 70 ns read. Blocks are large (approx. 1 Mb). Writing causes wear that eventually destroys the device. Modern lifetime approx. 1 million writes.
42. Types of flash NOR: Word-accessible read. Erase by blocks. NAND: Read by pages (512-4K bytes). Erase by blocks. NAND is cheaper, has faster erase, sequential access times.
44. Some devices are often found as on-chip devices in microcontrollers.
47. Watchdog timer Watchdog timer is periodically reset by system timer. If watchdog is not reset, it generates an interrupt to reset the host. [Figure labels: host CPU, interrupt, reset, watchdog timer.]
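The watchdog behavior can be sketched as a counter that must be cleared before it reaches a timeout. The class `Watchdog`, the method names `kick` and `tick`, and the tick-based timeout are assumptions made for illustration.

```python
class Watchdog:
    """Counts ticks and requests a host reset if it is not cleared in time."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.count = 0
        self.reset_requested = False

    def kick(self):
        """Periodic reset of the watchdog by a healthy system."""
        self.count = 0

    def tick(self):
        """Advance one clock tick; raise the reset request on timeout."""
        self.count += 1
        if self.count >= self.timeout_ticks:
            self.reset_requested = True   # models the interrupt to the host
```

As long as `kick` is called more often than every `timeout_ticks` ticks, `reset_requested` stays false. If the software hangs and stops calling it, the request is raised.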
56. Types of high-resolution display Liquid crystal display (LCD) is dominant form. Plasma, OLED, etc. Frame buffer holds current display contents. Written by processor. Read by video.
59. Component Interfacing: Memory interfacing Static RAM is simpler to interface to a bus than is DRAM, due to both the DRAM’s RAS/CAS multiplexing and the need for refresh. The R/W on the bus can often be directly connected to the SRAM. The main issue in interfacing SRAM is decoding the address. The chip enable pin is used in RAMs to simplify the interfacing of large memories. If the required number of memory words fits within the height of an available memory, then the interface is simple: the CE signal is permanently wired to ground so that the chip is always enabled.
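When the memory is taller than one chip, the upper address bits can be decoded to drive each chip's enable. The sketch below is one possible arrangement, not taken from the slides: it assumes identical chips stacked in address order and an active-low CE, and the function name and parameters are hypothetical.

```python
def decode_chip_enables(address, chip_address_bits, num_chips):
    """Return the active-low CE value for each chip (0 = enabled).

    The low chip_address_bits go straight to every chip's address pins;
    the remaining high bits select which chip is enabled.
    """
    selected = address >> chip_address_bits
    return [0 if chip == selected else 1 for chip in range(num_chips)]
```

With four chips of 1024 words each (`chip_address_bits=10`), address `0x0400` enables only the second chip. An address beyond the last chip leaves every CE high.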
60. DRAM interfacing The bus address can be split into row and column addresses with a small amount of logic: a register captures the address, a multiplexer selects the row or column portion of the address, and a state machine generates RAS and CAS. The refresh signal can be generated with a counter and a state machine as shown. The counter times the wait between successive refresh actions, and the controller generates the required signals. In the idle state, the bus signals are passed through to the DRAM to enable reads and writes. When the counter rolls over, the controller generates CAS and then RAS to induce the next refresh cycle.
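The refresh counter and controller can be modeled as a counter that emits the CAS-then-RAS sequence on rollover. This is a simplified sketch: the class `RefreshController`, the `interval` parameter, and the representation of signals as strings are all assumptions.

```python
class RefreshController:
    """Counter plus controller that triggers periodic DRAM refresh."""

    def __init__(self, interval):
        self.interval = interval   # ticks between successive refreshes
        self.counter = 0

    def tick(self):
        """Advance one tick and return the signals asserted, in order.

        An empty list means the controller is idle and bus signals
        pass through to the DRAM.
        """
        self.counter += 1
        if self.counter == self.interval:   # counter rolls over
            self.counter = 0
            return ["CAS", "RAS"]           # CAS first, then RAS
        return []
```

With `interval=3`, every third call to `tick` returns `["CAS", "RAS"]` and the other calls return an empty list.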
61. Device interfacing Some I/O devices are designed to interface directly to a particular bus, forming glueless interfaces. But glue logic is required when a device is connected to a bus for which it is not designed. An I/O device typically requires a much smaller range of addresses than a memory, so addresses must be decoded much more finely. Some additional logic is required to cause the bus to read and write the device’s registers.
63. The architecture of an embedded computing system is the blueprint for implementing that system.
68. Software architecture Functional description must be broken into pieces: division among people; conceptual organization; performance; testability; maintenance.
69. Hardware and software architectures Hardware and software are intimately related: software doesn’t run without hardware; how much hardware you need is determined by the software requirements: speed; memory.
70. Evaluation boards Designed by CPU manufacturer or others. Includes CPU, memory, some I/O devices. May include prototyping section. CPU manufacturer often gives out evaluation board netlist, which can be used as starting point for your custom board design.
71. Adding logic to a board Programmable logic devices (PLDs) provide low/medium density logic. Field-programmable gate arrays (FPGAs) provide more logic and multi-level logic. Application-specific integrated circuits (ASICs) are manufactured for a single purpose.
72. The PC as a platform Advantages: cheap and easy to get; rich and familiar software environment. Disadvantages: requires a lot of hardware resources; not well-adapted to real-time.
73. Typical PC hardware platform [Figure: block diagram showing the CPU, memory, CPU bus, DMA controller, timers, interrupt controller, and bus interfaces to a high-speed bus and a low-speed bus with their devices.]
74. Typical PC hardware platform The CPU provides basic computational facilities. RAM is used for program storage. ROM holds the boot program. A DMA controller provides DMA capabilities. Timers are used by the operating system for a variety of purposes. A high-speed bus, connected to the CPU bus through a bridge, allows fast devices to communicate efficiently with the rest of the system. A low-speed bus provides an inexpensive way to connect simpler devices and may be necessary for backward compatibility as well.
75. Typical busses PCI: standard for high-speed interfacing, 33 or 66 MHz. PCI Express. USB (Universal Serial Bus), FireWire (IEEE 1394): relatively low-cost serial interfaces with high speed.
76. Software elements IBM PC uses BIOS (Basic I/O System) to implement low-level functions: boot-up; minimal device drivers. BIOS has become a generic term for the lowest-level system software.
77. Example: StrongARM StrongARM system includes: CPU chip (3.686 MHz clock); system control module (32.768 kHz clock): real-time clock; operating system timer; general-purpose I/O; interrupt controller; power manager controller; reset controller.
79. Peripheral devices of system control module: A real-time clock. An operating system timer. 28 general-purpose I/Os (GPIOs). An interrupt controller. A power manager controller. A reset controller that handles resetting the processor.
80. Debugging embedded systems Challenges: target system may be hard to observe; target may be hard to control; may be hard to generate realistic inputs; setup sequence may be complex.
81. Host/target design Use a host system to prepare software for target system. [Figure: host system connected to the target system by a serial line.]
85. Software debuggers A monitor program residing on the target provides basic debugger functions. Debugger should have a minimal footprint in memory. User program must be careful not to destroy debugger program, but the debugger should be able to recover from some damage caused by user code.
86. Breakpoints A breakpoint allows the user to stop execution, examine system state, and change state. Replace the breakpointed instruction with a subroutine call to the monitor program.
87. ARM breakpoints
Uninstrumented code:
  0x400 MUL r4,r6,r6
  0x404 ADD r2,r2,r4
  0x408 ADD r0,r0,#1
  0x40c B loop
Code with breakpoint:
  0x400 MUL r4,r6,r6
  0x404 ADD r2,r2,r4
  0x408 ADD r0,r0,#1
  0x40c BL bkpoint
88. Breakpoint handler actions Save registers. Allow user to examine machine. Before returning, restore system state. Safest way to execute the instruction is to replace it and execute in place. Put another breakpoint after the replaced breakpoint to allow restoring the original breakpoint.
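The replace-and-restore mechanism can be sketched on a program held as a table of instruction strings. This is only an illustration of the bookkeeping: the helper names `set_breakpoint` and `clear_breakpoint` are hypothetical, and a real monitor would patch encoded instruction words in target memory rather than text.

```python
BREAKPOINT_CALL = "BL bkpoint"   # subroutine call into the monitor

# Program from the ARM example, keyed by instruction address.
program = {
    0x400: "MUL r4,r6,r6",
    0x404: "ADD r2,r2,r4",
    0x408: "ADD r0,r0,#1",
    0x40C: "B loop",
}

def set_breakpoint(code, address, saved):
    """Save the original instruction and replace it with the monitor call."""
    saved[address] = code[address]
    code[address] = BREAKPOINT_CALL

def clear_breakpoint(code, address, saved):
    """Put the original instruction back so it can execute in place."""
    code[address] = saved.pop(address)
```

Setting a breakpoint at `0x40C` turns `B loop` into `BL bkpoint`, and clearing it restores `B loop`. The handler would do this restore before letting the instruction run, then set the breakpoint again afterwards.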
89. In-circuit emulators A microprocessor in-circuit emulator is a specially-instrumented microprocessor. Allows you to stop execution, examine CPU state, modify registers.
90. Logic analyzers A logic analyzer is an array of low-grade oscilloscopes.
91. Logic analyzer architecture [Figure labels: UUT, system data samples, sample memory, microprocessor, vector address, system clock, controller, state or timing mode, clock gen, keypad, display.]
92. Hardware/software co-verification An instruction-level simulation may be used to debug code running on the CPU. A cycle-level simulation tool may be used for faster simulation of parts of the system. A hardware/software co-simulator may be used to simulate various parts of the system at different levels of detail.
93. Bus-Based Computer Systems Designing with microprocessors. Development and debugging. System-level performance analysis. Example: alarm clock
94. Design Example: Alarm clock [Figure: alarm clock face with PM and alarm-ready indicator lights and buttons labeled alarm on, alarm off, set time, set alarm, hour, and minute.]
95. Operations Set time: hold set time, depress hour, minute. Set alarm time: hold set alarm, depress hour, minute. Turn alarm on/off: depress alarm on/off.
101. Update-time behavior [Flowchart: update seconds with rollover. If rollover, update hh:mm with rollover; AM->PM sets PM=true and PM->AM sets PM=false. Then display.set-time(current time). If time >= alarm and alarm-on, alarm.buzzer(true).]
102. Scan-keyboard behavior [Flowchart: compute button activations. If set-time and not set-alarm and hours, increment time tens with rollover and AM/PM. If set-time and not set-alarm and minutes, increment time ones with rollover and AM/PM. If alarm-on, alarm-ready = true. If alarm-off, alarm-ready = false and alarm.buzzer(false). Finally, save button states.]
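The seconds, minutes, and hours rollover with the AM/PM flag can be written as a single update function. This sketch assumes a 12-hour clock in which the PM flag flips when the hour becomes 12. The function name and tuple layout are illustrative, and the display update and alarm comparison are left out.

```python
def update_time(hours, minutes, seconds, pm):
    """Advance a 12-hour clock by one second and return the new time."""
    seconds += 1
    if seconds == 60:                 # seconds roll over into minutes
        seconds = 0
        minutes += 1
        if minutes == 60:             # minutes roll over into hours
            minutes = 0
            hours += 1
            if hours == 12:           # 11:59 -> 12:00 flips AM/PM
                pm = not pm
            elif hours == 13:         # 12:59 -> 1:00 keeps AM/PM
                hours = 1
    return hours, minutes, seconds, pm
```

For instance, one second after 11:59:59 AM the function returns 12:00:00 with the PM flag set.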
106. The buttons will remain depressed for many sample periods since the sample rate is much faster than any person can push and release buttons.
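Because a held button is seen as pressed on many consecutive samples, an activation has to be detected as a change from released to pressed between two samples. A minimal sketch of that step, with a hypothetical function name and button states held in dictionaries, is:

```python
def button_activations(previous, current):
    """Return which buttons went from released to pressed since last scan.

    previous and current map button names to True (pressed) or False.
    """
    return {
        name: current[name] and not previous[name]
        for name in current
    }
```

After each scan the caller saves `current` as the next `previous`, so a button held down across many sample periods counts as one activation.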