Embedded Systems
A computer system encompasses much more than the CPU; it also includes memory and I/O devices. The bus is the mechanism by which the CPU communicates with memory and devices. A bus is, at a minimum, a collection of wires, but the bus also defines a protocol by which the CPU, memory, and devices communicate. One of the major roles of the bus is to provide an interface to memory.

3.1.1 Bus Protocols

The basic building block of most bus protocols is the four-cycle handshake, illustrated in Figure 4.1. The handshake ensures that when two devices want to communicate, one is ready to transmit and the other is ready to receive. The handshake uses a pair of wires dedicated to the handshake: enq (meaning enquiry) and ack (meaning acknowledge). Extra wires are used for the data transmitted during the handshake. The four cycles are described below.
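As a concrete illustration, the sketch below models the two sides of a four-cycle handshake driven over dedicated enq and ack lines. It is only a minimal sketch: the gpio_read/gpio_write, put_data, and get_data helpers are assumed names, and the exact order in which the two signals are released may differ from Figure 4.1.

#include <stdint.h>
#include <stdbool.h>

/* Assumed helpers for driving/reading single handshake wires.
   On real hardware these would touch memory-mapped GPIO registers. */
extern void gpio_write(int line, bool value);   /* drive enq or ack */
extern bool gpio_read(int line);                /* sample enq or ack */
extern void put_data(uint8_t value);            /* drive the extra data wires */
extern uint8_t get_data(void);                  /* sample the extra data wires */

#define ENQ_LINE 0
#define ACK_LINE 1

/* Transmitter side of the four-cycle handshake. */
void handshake_send(uint8_t value)
{
    put_data(value);                 /* place data on the extra wires            */
    gpio_write(ENQ_LINE, true);      /* 1: raise enq to start the handshake      */
    while (!gpio_read(ACK_LINE))     /* 2: wait for the receiver to raise ack    */
        ;
    gpio_write(ENQ_LINE, false);     /* 3: drop enq once the data has been taken */
    while (gpio_read(ACK_LINE))      /* 4: wait for ack to drop; both wires are  */
        ;                            /*    low again, ready for another transfer */
}

/* Receiver side of the four-cycle handshake. */
uint8_t handshake_receive(void)
{
    while (!gpio_read(ENQ_LINE))     /* wait for enq                    */
        ;
    uint8_t value = get_data();      /* capture the data wires          */
    gpio_write(ACK_LINE, true);      /* acknowledge the transfer        */
    while (gpio_read(ENQ_LINE))      /* wait for enq to be released     */
        ;
    gpio_write(ACK_LINE, false);     /* return to the idle (low) state  */
    return value;
}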
At the end of the handshake, both handshaking signals are low, just as they were at the start of the handshake. The system has thus returned to its original state in readiness for another data transfer. Microprocessor buses build on the handshake for communication between the CPU and other system components. The term bus is used in two ways. The most basic use is as a set of related wires, such as address wires. However, the term may also mean a protocol for communicating between components. To avoid confusion, we will use the term bundle to refer to a set of related signals. The fundamental bus operations are reading and writing. Figure 4.2 shows the structure of a typical bus that supports reads and writes. The major components follow:
■ Clock provides synchronization to the bus components,
■ R/W is true when the bus is reading and false when the bus is writing,
■ Address is an a-bit bundle of signals that transmits the address for an access,
■ Data is an n-bit bundle of signals that can carry data to or from the CPU, and
■ Data ready signals when the values on the data bundle are valid.
All transfers on this basic bus are controlled by the CPU—the CPU can read or write a device or memory, but devices or memory cannot initiate a transfer. This is reflected by the fact that R/W and address are unidirectional signals, since only the CPU can determine the address and direction of the transfer.
The timing diagram shows both a read and a write. Timing constraints are shown only for the read operation, but similar constraints apply to the write operation. The bus is normally in the read mode since that does not change the state of any of the devices or memories. The CPU can then ignore the bus data lines until it wants to use the results of a read. Notice also that the direction of data transfer on bidirectional lines is not specified in the timing diagram. The sequence of operations for a read on the timing diagram is as follows:
■ A read or write is initiated by setting address enable high after the clock starts to rise. We set R/W = 1 to indicate a read, and the address lines are set to the desired address.
■ One clock cycle later, the memory or device is expected to assert the data value at that address on the data lines. Simultaneously, the external device specifies that the data are valid by pulling down the data ready line. This line is active low, meaning that a logically true value is indicated by a low voltage, in order to provide increased immunity to electrical noise.
■ The CPU is free to remove the address at the end of the clock cycle and must do so before the beginning of the next cycle. The external device has a similar requirement for removing the data value from the data lines.
The write operation has a similar timing structure. The read/write sequence does illustrate that timing constraints are required on the transition of the R/W signal between read and write states. The signal must, of course, remain stable within a read or write. As a result there is a restricted time window in which the CPU can change between read and write modes.
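The read sequence above can also be mirrored in software when a bus master (or a test bench) drives the bus signals directly. The sketch below is a minimal polled version; the bus_set_*/bus_get_* helper names and the behavior of the active-low data ready line are assumptions for illustration, not a real API.

#include <stdint.h>
#include <stdbool.h>

/* Assumed low-level helpers that drive or sample the individual bus bundles. */
extern void bus_set_rw(bool read);              /* R/W: true = read                    */
extern void bus_set_address_enable(bool on);    /* address enable line                 */
extern void bus_set_address(uint32_t addr);     /* a-bit address bundle                */
extern bool bus_data_ready_low(void);           /* true when data ready is pulled low  */
extern uint32_t bus_get_data(void);             /* sample the n-bit data bundle        */
extern void bus_wait_clock_edge(void);          /* block until the next rising edge    */

/* One read transaction on the simple bus described above. */
uint32_t bus_read(uint32_t addr)
{
    bus_wait_clock_edge();
    bus_set_address_enable(true);     /* initiate the transaction               */
    bus_set_rw(true);                 /* R/W = 1: this is a read                */
    bus_set_address(addr);            /* drive the desired address              */

    bus_wait_clock_edge();            /* the device has one cycle to respond    */
    while (!bus_data_ready_low())     /* wait states: data ready is active low  */
        bus_wait_clock_edge();

    uint32_t value = bus_get_data();  /* the data lines are now valid           */
    bus_set_address_enable(false);    /* remove address before the next cycle   */
    return value;
}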
The handshake that tells the CPU and devices when data are to be transferred is formed by data ready for the acknowledge side, but is implicit for the enquiry side. Since the bus is normally in read mode, enq does not need to be asserted, but the acknowledge must be provided by data ready. The data ready signal allows the bus to be connected to devices that are slower than the bus. As shown in Figure 4.5, the external device need not immediately assert data ready. The cycles between the minimum time at which data can be asserted and when it is actually asserted are known as wait states. Wait states are commonly used to connect slow, inexpensive memories to buses.
We can also use the bus handshaking signals to perform burst transfers, as illustrated in Figure 4.6. In this burst read transaction, the CPU sends one address but receives a sequence of data values. We add an extra line to the bus, called burst′ here, which signals when a transaction is actually a burst. Releasing the burst′ signal tells the device that enough data has been transmitted. To stop receiving data after the end of data 4, the CPU releases the burst′ signal at the end of data 3, since the device requires some time to recognize the end of the burst.
Some buses provide disconnected transfers. In these buses, the request and response are separate. A first operation requests the transfer. The bus can then be used for other operations. The transfer is completed later, when the data are ready.
The state machine view of the bus transaction is also helpful and a useful complement to the timing diagram. Figure 4.7 shows the CPU and device state machines for the bus transaction.
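A CPU-side burst read can be sketched in the same style; note how the burst′ line is released one data word before the last transfer, as described above. All helper names are placeholders, and a real burst is normally handled by the bus interface hardware rather than software.

#include <stdint.h>
#include <stdbool.h>

/* Assumed helpers, in the same style as the single read above. */
extern void bus_set_rw(bool read);
extern void bus_set_address(uint32_t addr);
extern void bus_set_burst(bool asserted);       /* drive the burst' line              */
extern void bus_wait_data_ready(void);          /* wait for active-low data ready     */
extern uint32_t bus_get_data(void);

/* Burst read of 'count' sequential words starting at 'addr'. */
void bus_burst_read(uint32_t addr, uint32_t *dest, unsigned count)
{
    bus_set_rw(true);
    bus_set_address(addr);           /* a single address starts the whole burst */
    bus_set_burst(true);             /* mark the transaction as a burst         */

    for (unsigned i = 0; i < count; i++) {
        /* Release burst' one word early so the device has time to recognize
           the end of the burst (e.g. at data 3 for a four-word burst). */
        if (i == count - 1)
            bus_set_burst(false);
        bus_wait_data_ready();       /* the device signals each word with data ready */
        dest[i] = bus_get_data();
    }
}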
On some buses, the address is transmitted first, followed by the data. The address can be held in a register until the data arrive so that both can be presented to the device (such as a RAM) at the same time.

3.1.2 DMA

Standard bus transactions require the CPU to be in the middle of every read and write transaction. However, there are certain types of data transfers in which the CPU does not need to be involved. For example, a high-speed I/O device may want to transfer a block of data into memory. While it is possible to write a program that alternately reads the device and writes to memory, it would be faster to eliminate the CPU’s involvement and let the device and memory communicate directly. Direct memory access (DMA) is a bus operation that allows reads and writes not controlled by the CPU. A DMA transfer is controlled by a DMA controller, which requests control of the bus from the CPU. After gaining control, the DMA controller performs read and write operations directly between devices and memory. Figure 4.9 shows the configuration of a bus with a DMA controller. The DMA requires the CPU to provide two additional bus signals:
■ The bus request is an input to the CPU through which DMA controllers ask for ownership of the bus.
■ The bus grant signals that the bus has been granted to the DMA controller.
A device that can initiate its own bus transfer is known as a bus master. Devices that do not have the capability to be bus masters do not need to connect to a bus request and bus grant.
The DMA controller uses these two signals to gain control of the bus using a classic four-cycle handshake. The bus request is asserted by the DMA controller when it wants to control the bus, and the bus grant is asserted by the CPU when the bus is ready. The CPU will finish all pending bus transactions before granting control of the bus to the DMA controller. When it does grant control, it stops driving the other bus signals: R/W, address, and so on. Upon becoming bus master, the DMA controller has control of all bus signals (except, of course, for bus request and bus grant). Once the DMA controller is bus master, it can perform reads and writes using the same bus protocol as with any CPU-driven bus transaction. Memory and devices do not know whether a read or write is performed by the CPU or by a DMA controller. After the transaction is finished, the DMA controller returns the bus to the CPU by deasserting the bus request, causing the CPU to deassert the bus grant.
The CPU controls the DMA operation through registers in the DMA controller. A typical DMA controller includes the following three registers:
■ A starting address register specifies where the transfer is to begin.
■ A length register specifies the number of words to be transferred.
■ A status register allows the DMA controller to be operated by the CPU.
The CPU initiates a DMA transfer by setting the starting address and length registers appropriately and then writing the status register to set its start transfer bit. After the DMA operation is complete, the DMA controller interrupts the CPU to tell it that the transfer is done.
What is the CPU doing during a DMA transfer? It cannot use the bus. As illustrated in Figure 4.10, if the CPU has enough instructions and data in the cache and registers, it may be able to continue doing useful work for quite some time and may not notice the DMA transfer. But once the CPU needs the bus, it stalls until the DMA controller returns bus mastership to the CPU.
To prevent the CPU from idling for too long, most DMA controllers implement modes that occupy the bus for only a few cycles at a time. For example, the transfer may be made 4, 8, or 16 words at a time. As illustrated in Figure 4.11, after each block, the DMA controller returns control of the bus to the CPU and goes to sleep for a preset period, after which it requests the bus again for the next block transfer.
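The register-level view of starting such a transfer might look like the sketch below. The register layout (DMA_BASE, the offsets, and the START/DONE bits) is entirely invented for illustration; a real controller's data sheet defines its own map.

#include <stdint.h>

/* Hypothetical memory-mapped register layout of a simple DMA controller. */
#define DMA_BASE        0x40001000u
#define DMA_START_ADDR  (*(volatile uint32_t *)(DMA_BASE + 0x00)) /* starting address  */
#define DMA_LENGTH      (*(volatile uint32_t *)(DMA_BASE + 0x04)) /* words to transfer */
#define DMA_STATUS      (*(volatile uint32_t *)(DMA_BASE + 0x08)) /* control/status    */

#define DMA_STATUS_START (1u << 0)   /* start transfer bit (assumed)     */
#define DMA_STATUS_DONE  (1u << 1)   /* transfer complete bit (assumed)  */

/* Program the three registers and kick off a transfer. */
void dma_start(uint32_t start_addr, uint32_t word_count)
{
    DMA_START_ADDR = start_addr;         /* where the transfer begins      */
    DMA_LENGTH     = word_count;         /* number of words to move        */
    DMA_STATUS     = DMA_STATUS_START;   /* set the start transfer bit     */
}

/* In practice the controller raises an interrupt when it finishes;
   this handler simply records the completion. */
volatile int dma_done;

void dma_irq_handler(void)
{
    if (DMA_STATUS & DMA_STATUS_DONE)
        dma_done = 1;                    /* the CPU learns the transfer is done */
}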
3.1.3 System Bus Configurations

A microprocessor system often has more than one bus, with a bridge connecting a faster bus to a slower one. There are several reasons to use multiple buses:
■ Higher-speed buses may provide wider data connections.
■ A high-speed bus usually requires more expensive circuits and connectors. The cost of low-speed devices can be held down by using a lower-speed, lower-cost bus.
The bridge may allow the buses to operate independently, thereby providing some parallelism in I/O operations. Let’s consider the operation of a bus bridge between what we will call a fast bus and a slow bus as illustrated in Figure 4.13. The bridge is a slave on the fast bus and the master of the slow bus. The bridge takes commands from the fast bus on which it is a slave and issues those commands on the slow bus. It also returns the results from the slow bus to the fast bus—for example, it returns the results of a read on the slow bus to the fast bus. The upper sequence of states handles a write from the fast bus to the slow bus. These states must read the data from the fast bus and set up the handshake for the slow bus. Operations on the fast and slow sides of the bus bridge should be overlapped as much as possible to reduce the latency of bus-to-bus transfers. Similarly, the bottom sequence of states reads from the slow bus and writes the data to the fast bus. The bridge serves as a protocol translator between the two buses as well. If the buses are very close in protocol operation and speed, a simple state machine may be enough. If there are larger differences in the protocol and timing between the two buses, the bridge may need to use registers to hold some data values temporarily.
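One way to picture the simple state machine mentioned above is the C sketch below, which steps a write from the fast side to the slow side. It is only a schematic model under assumed names (the fast_*/slow_* helpers and the state labels are invented), not a description of any particular bridge; the read path would mirror it.

#include <stdint.h>
#include <stdbool.h>

/* Assumed helpers for the two sides of the bridge. */
extern bool     fast_write_pending(void);    /* fast-bus master wants a write        */
extern uint32_t fast_get_address(void);
extern uint32_t fast_get_data(void);
extern void     fast_ack(void);              /* complete the fast-bus transaction    */
extern void     slow_start_write(uint32_t addr, uint32_t data);
extern bool     slow_write_done(void);

/* States for the write path (upper sequence of states in the text). */
enum bridge_state { IDLE, LATCH_FAST, DRIVE_SLOW, WAIT_SLOW };

void bridge_step(void)
{
    static enum bridge_state state = IDLE;
    static uint32_t addr, data;              /* registers holding values temporarily */

    switch (state) {
    case IDLE:
        if (fast_write_pending())
            state = LATCH_FAST;
        break;
    case LATCH_FAST:                         /* read address and data from the fast bus */
        addr = fast_get_address();
        data = fast_get_data();
        state = DRIVE_SLOW;
        break;
    case DRIVE_SLOW:                         /* set up the handshake on the slow bus    */
        slow_start_write(addr, data);
        state = WAIT_SLOW;
        break;
    case WAIT_SLOW:                          /* wait for the slow bus to finish         */
        if (slow_write_done()) {
            fast_ack();                      /* report completion back to the fast bus  */
            state = IDLE;
        }
        break;
    }
}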
3.1.4 AMBA Bus

Since the ARM CPU is manufactured by many different vendors, the bus provided off-chip can vary from chip to chip. ARM has created a separate bus specification for single-chip systems. The AMBA bus [ARM99A] supports CPUs, memories, and peripherals integrated in a system-on-silicon. As shown in Figure 4.14, the AMBA specification includes two buses. The AMBA high-performance bus (AHB) is optimized for high-speed transfers and is directly connected to the CPU. It supports several high-performance features: pipelining, burst transfers, split transactions, and multiple bus masters. A bridge can be used to connect the AHB to an AMBA peripherals bus (APB). This bus is designed to be simple and easy to implement; it also consumes relatively little power. The APB assumes that all peripherals act as slaves, simplifying the logic required in both the peripherals and the bus controller. It also does not perform pipelined operations, which simplifies the bus logic.

3.2 MEMORY DEVICES

In this section, we introduce the basic types of memory components that are commonly used in embedded systems. There are several varieties of both read-only and read/write memories, each with its own advantages.

3.2.1 Memory Device Organization

The most basic way to characterize a memory is by its capacity, such as 256 Mbit. However, manufacturers usually make several versions of a memory of a given size, each with a different data width. For example, a 256-Mbit memory may be available in two versions:
■ As a 64 M × 4-bit array, a single memory access obtains a 4-bit data item, with a maximum of 2^26 different addresses.
■ As a 32 M × 8-bit array, a single memory access obtains an 8-bit data item, with a maximum of 2^25 different addresses.
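To see why both organizations hold the same 256 Mbit, the arithmetic can be checked directly: capacity equals number of addresses times data width. The small program below is purely illustrative.

#include <stdio.h>

/* Check that both aspect ratios of the same part hold the same number of bits. */
int main(void)
{
    unsigned long long narrow = (64ULL << 20) * 4;  /* 64 M addresses x 4 bits */
    unsigned long long wide   = (32ULL << 20) * 8;  /* 32 M addresses x 8 bits */

    printf("64 M x 4 = %llu bits\n", narrow);       /* 268435456 = 256 Mbit */
    printf("32 M x 8 = %llu bits\n", wide);         /* 268435456 = 256 Mbit */
    printf("address lines needed: 26 and 25 (2^26 = 64 M, 2^25 = 32 M)\n");
    return 0;
}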
The row and column select a particular memory cell. If the memory’s external width is 1 bit, the column address selects a single bit; for wider data widths, the column address can be used to select a subset of the columns. Most memories include an enable signal that controls the tri-stating of data onto the memory’s pins. We will see in Section 3.4.1 how the enable pin can be used to easily build large memories from multiple banks of memory chips. A read/write signal (R/W in the figure) on read/write memories controls the direction of data transfer; memory chips do not typically have separate read and write data pins.

3.2.2 Random-Access Memories

Random-access memories can be both read and written. They are called random access because, unlike magnetic disks, addresses can be read in any order. Most bulk memory in modern systems is dynamic RAM (DRAM). DRAM is very dense; it does, however, require that its values be refreshed periodically since the values inside the memory cells decay over time. The dominant form of dynamic RAM today is the synchronous DRAM (SDRAM), which uses a clock to improve DRAM performance. SDRAMs use Row Address Select (RAS) and Column Address Select (CAS) signals to break the address into two parts, which select the proper row and column in the RAM array. Signal transitions are relative to the SDRAM clock, which allows the internal SDRAM operations to be pipelined. As shown in Figure 4.16, transitions on the control signals are related to a clock; RAS and CAS can therefore become valid at the same time.
DRAM has to be refreshed roughly once per millisecond. Rather than refresh the entire memory at once, DRAMs refresh part of the memory at a time. When a section of memory is being refreshed, it cannot be accessed until the refresh is complete. Each refresh operation itself takes only a short time, so any given section is unavailable for only a few microseconds at a time.
SDRAMs include registers that control the mode in which the SDRAM operates. SDRAMs support burst modes that allow several sequential addresses to be accessed by sending only one address. SDRAMs generally also support an interleaved mode that exchanges pairs of bytes.
Even faster synchronous DRAMs, known as double-data rate (DDR) SDRAMs or DDR2 and DDR3 SDRAMs, are now in use. The details of DDR operation are beyond the scope of this book, but the basic capabilities of DDR memories are similar to those of single-rate SDRAMs; DDRs simply use sophisticated circuit techniques to perform more operations per clock cycle.

SIMMs and DIMMs

Memory for PCs is generally purchased as single in-line memory modules (SIMMs) or double in-line memory modules (DIMMs). A SIMM or DIMM is a small circuit board that fits into a standard memory socket. A DIMM has two sets of leads compared to the SIMM’s one.

3.2.3 Read-Only Memories

Read-only memories (ROMs) are preprogrammed with fixed data. They are very useful in embedded systems since a great deal of the code, and perhaps some data, does not change over time. Read-only memories are also less sensitive to radiation-induced errors. There are several varieties of ROM available. The first-level distinction to be made is between factory-programmed ROM (sometimes called mask-programmed ROM) and field-programmable ROM. Factory-programmed ROMs are ordered from the factory with particular programming. ROMs can typically be ordered in lots of a few thousand, but clearly factory programming is useful only when the ROMs are to be installed in large quantities.
3.3.1 Timers and Counters

Figure 4.17 shows enough of the internals of a counter/timer to illustrate its operation. An n-bit counter/timer uses an n-bit register to store the current state of the count and an array of half subtractors to decrement the count when the count signal is asserted. Combinational logic checks when the count equals zero; the done output signals the zero count. It is often useful to be able to control the time-out, rather than require exactly 2^n events to occur. For this purpose, a reset register provides the value with which the count register is to be loaded. The counter/timer provides logic to load the reset register. Most counters provide both cyclic and acyclic modes of operation. In the cyclic mode, once the counter reaches the done state, it is automatically reloaded and the counting process continues. In acyclic mode, the counter/timer waits for an explicit signal from the microprocessor to resume counting.
A watchdog timer is an I/O device that is used for internal operation of a system. As shown in Figure 4.18, the watchdog timer is connected into the CPU bus and also to the CPU’s reset line. The CPU’s software is designed to periodically reset the watchdog timer before the timer ever reaches its time-out limit. If the watchdog timer ever does reach that limit, its time-out action is to reset the processor. In that case, the presumption is that either a software flaw or a hardware problem has caused the CPU to misbehave.

3.3.2 A/D and D/A Converters

Analog/digital (A/D) and digital/analog (D/A) converters (typically known as ADCs and DACs, respectively) are often used to interface nondigital devices to embedded systems. The design of A/D and D/A converters themselves is beyond the scope of this book; we concentrate instead on the interface to the microprocessor bus. Because A/D conversion requires more complex circuitry, it requires a somewhat more complex interface.
Analog/digital conversion requires sampling the analog input before converting it to digital form. A control signal causes the A/D converter to take a sample and digitize it. There are several different types of A/D converter circuits, some of which take a constant amount of time, while the conversion time of others depends on the sampled value. Variable-time converters provide a done signal so that the microprocessor knows when the value is ready.
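As an illustration of the watchdog idea, the main loop below periodically reloads a hypothetical watchdog count register; if the loop ever hangs, the reloads stop and the timer eventually resets the processor. The register address and reload value are invented for the sketch.

#include <stdint.h>

/* Hypothetical watchdog timer register: writing a value reloads the count. */
#define WDT_RELOAD (*(volatile uint32_t *)0x40002000u)
#define WDT_PERIOD 0x0000FFFFu          /* counts until time-out (assumed) */

extern void read_sensors(void);         /* application work, stand-ins     */
extern void update_outputs(void);

int main(void)
{
    for (;;) {
        WDT_RELOAD = WDT_PERIOD;        /* "kick" the watchdog each iteration     */
        read_sensors();                 /* if this work ever hangs, the reloads   */
        update_outputs();               /* stop and the watchdog resets the CPU   */
    }
}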
A typical A/D interface has, in addition to its analog inputs, two major digital inputs. A data port allows A/D registers to be read and written, and a clock input tells when to start the next conversion. D/A conversion is relatively simple, so the D/A converter interface generally includes only the data value. The input value is continuously converted to analog form.

3.3.3 Keyboards

A keyboard is basically an array of switches, but it may include some internal logic to help simplify the interface to the microprocessor. In this chapter, we build our understanding from a single switch to a microprocessor-controlled keyboard. Mechanical switch contacts bounce when pressed or released, producing spurious transitions; a hardware debouncing circuit can be built using a one-shot timer. Software can also be used to debounce switch inputs.
A raw keyboard can be assembled from several switches. Each switch in a raw keyboard has its own pair of terminals, making raw keyboards impractical when a large number of keys is required.
The microprocessor can provide debouncing, but it also provides other functions as well. An encoded keyboard uses some code to represent which switch is currently being depressed. At the heart of the encoded keyboard is the scanned array of switches shown in Figure 4.20. Unlike a raw keyboard, the scanned keyboard array reads only one row of switches at a time. The demultiplexer at the left side of the array selects the row to be read. When the scan input is 1, that value is transmitted to one terminal of each key in the row. If the switch is depressed, the 1 is sensed at that switch’s column. Since only one switch in the column is activated, that value uniquely identifies a key. The row address and column output can be used for encoding, or circuitry can be used to give a different encoding.
A consequence of encoding the keyboard is that combinations of keys may not be represented. For example, on a PC keyboard, the encoding must be chosen so that combinations such as control-Q can be recognized and sent to the PC. Another consequence is that rollover may not be allowed. For example, if you press “a,” and then press “b” before releasing “a,” in most applications you want the keyboard to send an “a” followed by a “b.” Rollover is very common in typing at even modest rates. A naive implementation of the encoder circuitry will simply throw away any character depressed after the first one until all the keys are released. The keyboard microcontroller can be programmed to provide n-key rollover, so that rollover keys are sensed, put on a stack, and transmitted in sequence as keys are released.
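A software view of the scanned array might look like the sketch below: drive one row at a time, read the column lines, and report the first pressed key. The kbd_select_row/kbd_read_columns helpers, the 4 x 4 matrix size, and the debounce delay are all assumptions for illustration.

#include <stdint.h>

#define ROWS 4
#define COLS 4

/* Assumed helpers around the row demultiplexer and the column sense lines. */
extern void    kbd_select_row(unsigned row);   /* drive the scan value onto one row */
extern uint8_t kbd_read_columns(void);         /* one bit per column, 1 = pressed   */
extern void    delay_ms(unsigned ms);

/* Scan the matrix once; return a key code (row*COLS + col) or -1 if none. */
int kbd_scan(void)
{
    for (unsigned row = 0; row < ROWS; row++) {
        kbd_select_row(row);                   /* only this row sees the scan value */
        uint8_t cols = kbd_read_columns();
        if (cols == 0)
            continue;
        delay_ms(5);                           /* crude software debounce           */
        if (kbd_read_columns() != cols)        /* reading changed: treat as bounce  */
            continue;
        for (unsigned col = 0; col < COLS; col++)
            if (cols & (1u << col))
                return (int)(row * COLS + col); /* row and column identify the key  */
    }
    return -1;                                  /* no key currently pressed */
}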
3.3.5 Displays

Many large displays are built using LCDs. Each pixel in the display is formed by a single liquid crystal. LCD displays present a very different interface to the system because the array of pixel LCDs can be randomly accessed. Early LCD panels were called passive matrix because they relied on a two-dimensional grid of wires to address the pixels. Modern LCD panels use an active matrix system that puts a transistor at each pixel to control access to the LCD. Active matrix displays provide higher contrast and a higher-quality display.

3.3.6 Touchscreens

A touchscreen is an input device overlaid on an output device. The touchscreen registers the position of a touch to its surface. By overlaying this on a display, the user can react to information shown on the display. The two most common types of touchscreens are resistive and capacitive. A resistive touchscreen uses a two-dimensional voltmeter to sense position. As shown in Figure 4.23, the touchscreen consists of two conductive sheets separated by spacer balls. The top conductive sheet is flexible so that it can be pressed to touch the bottom sheet. A voltage is applied across the sheet; its resistance causes a voltage gradient to appear across the sheet. The top sheet samples the conductive sheet’s applied voltage at the contact point. An analog/digital converter is used to measure the voltage and resulting position. The touchscreen alternates between x and y position sensing by alternately applying horizontal and vertical voltage gradients.

3.4 COMPONENT INTERFACING

Building the logic to interface a device to a bus is not too difficult but does take some attention to detail. We first consider interfacing memory components to the bus, since that is relatively simple, and then use those concepts to interface to other types of devices.

3.4.1 Memory Interfacing

If we can buy a memory of the exact size we need, then the memory structure is simple. If we need more memory than we can buy in a single chip, then we must construct the memory out of several chips. We may also want to build a memory that is wider than we can buy on a single chip; for example, we cannot generally buy a 32-bit-wide memory chip. We can easily construct a memory of a given width (32 bits, 64 bits, etc.) by placing RAMs in parallel.

3.4.2 Device Interfacing

Some I/O devices are designed to interface directly to a particular bus, forming glueless interfaces. But glue logic is required when a device is connected to a bus for which it is not designed. An I/O device typically requires a much smaller range of addresses than a memory, so addresses must be decoded much more finely. Some additional logic is required to cause the bus to read and write the device’s registers.
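From the software side, a device interfaced in this way appears as a small set of memory-mapped registers. The sketch below shows the usual C idiom; the device, its base address, and its register layout are invented for illustration.

#include <stdint.h>

/* Hypothetical UART-like device occupying a small block of addresses.
   The bus glue logic decodes this narrow range and routes reads and
   writes to the device's registers. */
#define DEV_BASE   0x80000000u
#define DEV_DATA   (*(volatile uint32_t *)(DEV_BASE + 0x0))  /* data register   */
#define DEV_STATUS (*(volatile uint32_t *)(DEV_BASE + 0x4))  /* status register */

#define DEV_STATUS_TX_READY (1u << 0)   /* assumed: set when the device can accept data */

/* Write one byte to the device, waiting until it is ready.
   'volatile' keeps the compiler from caching or reordering the accesses,
   since each read and write here is a real bus transaction. */
void dev_send(uint8_t byte)
{
    while (!(DEV_STATUS & DEV_STATUS_TX_READY))
        ;                               /* poll the status register */
    DEV_DATA = byte;                    /* bus write to the device  */
}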
3.5 DESIGNING WITH MICROPROCESSORS

3.5.1 System Architecture

We know that an architecture is a set of elements and the relationships between them that together form a single unit. The architecture of an embedded computing system is the blueprint for implementing that system—it tells you what components you need and how you put them together. The architecture of an embedded computing system includes both hardware and software elements. Let’s consider each in turn.
■ CPU An embedded computing system clearly contains a microprocessor. The choice of the CPU is one of the most important, but it cannot be made without considering the software that will execute on the machine.
■ Bus The choice of a bus is closely tied to that of a CPU, since the bus is an integral part of the microprocessor.
■ Memory Once again, the question is not whether the system will have memory but the characteristics of that memory. The most obvious characteristic is total size, which depends on both the required data volume and the size of the program instructions. The ratio of ROM to RAM and selection of DRAM versus SRAM can have a significant influence on the cost of the system. The speed of the memory will play a large part in determining system performance.
■ Input and output devices The user’s view of the input and output mechanisms may not correspond to the devices connected to the microprocessor. For example, a set of switches and knobs on a front panel may all be controlled by a single microcontroller, which is in turn connected to the main CPU.