CS计算机代考程序代写 flex mips AI cache chain algorithm scheme C

C
APPENDIX
I always loved that word, Boolean.
Claude Shannon
IEEE Spectrum, April 1992
(Shannon’s master’s thesis showed that the algebra invented by George Boole in the 1800s could represent the workings of electrical switches.)
The Basics of Logic Design
C.1 Introduction C-3
C.2 Gates, Truth Tables, and Logic
Equations C-4
C.3 CombinationalLogic C-9 C.4 Using a Hardware Description
Language C-20
C.5 Constructing a Basic Arithmetic Logic
Unit C-26
C.6 Faster Addition: Carry Lookahead C-38 C.7 Clocks C-48

C.8 Memory Elements: Flip-Flops, Latches, and Registers C-50
C.9 Memory Elements: SRAMs and DRAMs C-58
C.10 Finite-StateMachines C-67
C.11 TimingMethodologies C-72
C.12 Field Programmable Devices C-78
C.13 ConcludingRemarks C-79
C.14 Exercises C-80
C.1 Introduction
This appendix provides a brief discussion of the basics of logic design. It does not replace a course in logic design, nor will it enable you to design significant working logic systems. If you have little or no exposure to logic design, however, this appendix will provide sufficient background to understand all the material in this book. In addition, if you are looking to understand some of the motivation behind how computers are implemented, this material will serve as a useful intro- duction. If your curiosity is aroused but not sated by this appendix, the references at the end provide several additional sources of information.
Section C.2 introduces the basic building blocks of logic, namely, gates. Section C.3 uses these building blocks to construct simple combinational logic systems, which contain no memory. If you have had some exposure to logic or digital systems, you will probably be familiar with the material in these first two sections. Section C.5 shows how to use the concepts of Sections C.2 and C.3 to design an ALU for the MIPS processor. Section C.6 shows how to make a fast adder,

C-4
Appendix C The Basics of Logic Design
asserted signal A signal that is (logically) true, or 1.
deasserted signal
A signal that is (logically) false, or 0.
and may be safely skipped if you are not interested in this topic. Section C.7 is a short introduction to the topic of clocking, which is necessary to discuss how memory elements work. Section C.8 introduces memory elements, and Section C.9 extends it to focus on random access memories; it describes both the characteris- tics that are important to understanding how they are used in Chapter 4, and the background that motivates many of the aspects of memory hierarchy design in Chapter 5. Section C.10 describes the design and use of finite-state machines, which are sequential logic blocks. If you intend to read Appendix D, you should thoroughly understand the material in Sections C.2 through C.10. If you intend to read only the material on control in Chapter 4, you can skim the appendices; however, you should have some familiarity with all the material except Section C.11. Section C.11 is intended for those who want a deeper understanding of clocking methodologies and timing. It explains the basics of how edge-triggered clocking works, introduces another clocking scheme, and briefly describes the problem of synchronizing asynchronous inputs.
Throughout this appendix, where it is appropriate, we also include segments to demonstrate how logic can be represented in Verilog, which we introduce in Section C.4. A more extensive and complete Verilog tutorial appears elsewhere on the CD.
C.2 Gates, Truth Tables, and Logic Equations
The electronics inside a modern computer are digital. Digital electronics operate with only two voltage levels of interest: a high voltage and a low voltage. All other voltage values are temporary and occur while transitioning between the values. (As we discuss later in this section, a possible pitfall in digital design is sampling a signal when it not clearly either high or low.) The fact that computers are digital is also a key reason they use binary numbers, since a binary system matches the underlying abstraction inherent in the electronics. In various logic families, the values and relationships between the two voltage values differ. Thus, rather than refer to the voltage levels, we talk about signals that are (logically) true, or 1, or are asserted; or signals that are (logically) false, or 0, or are deasserted. The values 0 and 1 are called complements or inverses of one another.
Logic blocks are categorized as one of two types, depending on whether they contain memory. Blocks without memory are called combinational; the output of a combinational block depends only on the current input. In blocks with memory, the outputs can depend on both the inputs and the value stored in memory, which is called the state of the logic block. In this section and the next, we will focus

C.2 Gates, Truth Tables, and Logic Equations
C-5
only on combinational logic. After introducing different memory elements in Section C.8, we will describe how sequential logic, which is logic including state, is designed.
Truth Tables
Because a combinational logic block contains no memory, it can be completely specified by defining the values of the outputs for each possible set of input values. Such a description is normally given as a truth table. For a logic block with n inputs, there are 2n entries in the truth table, since there are that many possible combinations of input values. Each entry specifies the value of all the outputs for that particular input combination.
Truth Tables
Consider a logic function with three inputs, A, B, and C, and three outputs, D, E, and F. The function is defined as follows: D is true if at least one input is true, E is true if exactly two inputs are true, and F is true only if all three inputs are true. Show the truth table for this function.
The truth table will contain 23 = 8 entries. Here it is:
combinational logic
A logic system whose blocks do not contain memory and hence compute the same output given the same input.
sequential logic
A group of logic elements that contain memory
and hence whose value depends on the inputs
as well as the current contents of the memory.
EXAMPLE
ANSWER
Inputs
Outputs
A
B
C
D
E
F
0
0
0
0
0
0
0
0
1
1
0
0
0
1
0
1
0
0
0
1
1
1
1
0
1
0
0
1
0
0
1
0
1
1
1
0
1
1
0
1
1
0
1
1
1
1
0
1
Truth tables can completely describe any combinational logic function; however, they grow in size quickly and may not be easy to understand. Sometimes we want to construct a logic function that will be 0 for many input combinations, and we use a shorthand of specifying only the truth table entries for the nonzero outputs. This approach is used in Chapter 4 and Appendix D.

C-6 Appendix C The Basics of Logic Design
Boolean Algebra
Another approach is to express the logic function with logic equations. This is done with the use of Boolean algebra (named after Boole, a 19th-century mathe- matician). In Boolean algebra, all the variables have the values 0 or 1 and, in typical formulations, there are three operators:
■ The OR operator is written as +, as in A + B. The result of an OR operator is 1 if either of the variables is 1. The OR operation is also called a logical sum, since its result is 1 if either operand is 1.
■ The AND operator is written as · , as in A · B. The result of an AND operator is 1 only if both inputs are 1. The AND operator is also called logical product, since its result is 1 only if both operands are 1.
__
■ The unary operator NOT is written as A. The result of a NOT operator is 1
only if the input is 0. Applying the operator NOT to a logical value results in an inversion or negation of the value (i.e., if the input is 0 the output is 1, and vice versa).
There are several laws of Boolean algebra that are helpful in manipulating logic equations.
■ Identity law: A + 0 = A and A · 1 = A.
■ Zero and One laws: A + 1 = 1 and A · 0 = 0.
__ __
■ Inverse laws: A + A = 1 and A · A = 1.
■ Commutativelaws:A+B=B+AandA·B=B·A.
■ Associative laws: A + (B + C) = (A + B) + C and A · (B · C) = (A · B) · C.
■ Distributive laws: A · (B + C) = (A · B) + (A · C) and A + (B · C) = (A + B) · (A + C).
In addition, there are two other useful theorems, called DeMorgan’s laws, that are discussed in more depth in the exercises.
Any set of logic functions can be written as a series of equations with an output on the left-hand side of each equation and a formula consisting of variables and the three operators above on the right-hand side.

C.2 Gates, Truth Tables, and Logic Equations
C-7
Logic Equations
Show the logic equations for the logic functions, D, E, and F, described in the previous example.
EXAMPLE
ANSWER
Here’s the equation for D:
F is equally simple:
D=A+B+C
F=A·B·C
E is a little tricky. Think of it in two parts: what must be true for E to be true (two of the three inputs must be true), and what cannot be true (all three cannot be true). Thus we can write E as
_______ E = ((A · B) + (A · C) + (B · C)) · (A · B · C)
We can also derive E by realizing that E is true only if exactly two of the inputs are true. Then we can write E as an OR of the three possible terms that have two true inputs and one false input:
__ __ __ E = (A · B · C) + (A · C · B) + (B · C · A)
Proving that these two expressions are equivalent is explored in the exercises.
In Verilog, we describe combinational logic whenever possible using the assign statement, which is described beginning on page C-23. We can write a definition for E using the Verilog exclusive-OR operator as assign E = (A ^ B ^ C) * (A + B + C) * (A * B * C), which is yet another way to describe this function. D and F have even simpler representations, which are just like the corresponding C code: D = A | B | C and F = A & B & C.

C-8
Appendix C The Basics of Logic Design
gate A device that implements basic logic functions, such as AND or OR.
Gates
Logic blocks are built from gates that implement basic logic functions. For exam- ple, an AND gate implements the AND function, and an OR gate implements the OR function. Since both AND and OR are commutative and associative, an AND or an OR gate can have multiple inputs, with the output equal to the AND or OR of all the inputs. The logical function NOT is implemented with an inverter that always has a single input. The standard representation of these three logic building blocks is shown in Figure C.2.1.
Rather than draw inverters explicitly, a common practice is to add “bubbles”
to the inputs or outputs of a gate to cause the logic value on that input line or
output line to_b_e__in_verted. For example, Figure C.2.2 shows the logic diagram for __
the function A + B, using explicit inverters on the left and bubbled inputs and outputs on the right.
Any logical function can be constructed using AND gates, OR gates, and inversion; several of the exercises give you the opportunity to try implementing some common logic functions with gates. In the next section, we’ll see how an implementation of any logic function can be constructed using this knowledge.
In fact, all logic functions can be constructed with only a single gate type, if that gate is inverting. The two common inverting gates are called NOR and NAND and correspond to inverted OR and AND gates, respectively. NOR and NAND gates are called universal, since any logic function can be built using this one gate type. The exercises explore this concept further.
Are the following two logical expressions equivalent? If not, find a setting of the
variables to show they are not:
__ __ __
■ (A·B·C)+(A·C·B)+(B·C·A) __ __
■ B · (A · C + C · A)
FIGURE C.2.1 Standard drawing for an AND gate, OR gate, and an inverter, shown from left to right. The signals to the left of each symbol are the inputs, while the output appears on the right. The AND and OR gates both have two inputs. Inverters have a single input.
AA BB
_______ __
NOR gate An inverted OR gate.
NAND gate An inverted AND gate.
Check Yourself
FIGURE C.2.2 Logic gate implementation of A + B using explicit inverts on the left __
and bubbled inputs and outputs on the right. This logic function can be simplified to A · B or in Verilog,A & ~ B.

C.3 Combinational Logic
C-9
C.3 Combinational Logic
In this section, we look at a couple of larger logic building blocks that we use heavily, and we discuss the design of structured logic that can be automatically implemented from a logic equation or truth table by a translation program. Last, we discuss the notion of an array of logic blocks.
Decoders
One logic block that we will use in building larger components is a decoder. The most common type of decoder has an n-bit input and 2n outputs, where only one output is asserted for each input combination. This decoder translates the n-bit input into a signal that corresponds to the binary value of the n-bit input. The outputs are thus usually numbered, say, Out0, Out1, . . . , Out2n − 1. If the value of the input is i, then Outi will be true and all other outputs will be false. Figure C.3.1 shows a 3-bit decoder and the truth table. This decoder is called a 3-to-8 decoder since there are 3 inputs and 8 (23) outputs. There is also a logic element called an encoder that performs the inverse function of a decoder, taking 2n inputs and pro- ducing an n-bit output.
decoder A logic block that has an n-bit input and 2n outputs, where only one output is asserted for each input combination.
Inputs
Outputs
12
0
11
0
10
0
Out7
0
Out6
0
Out5
0
Out4
0
Out3
0
Out2
0
Out1
0
Out0
1
0
0
1
0
0
0
0
0
0
1
0
0
1
0
0
0
0
0
0
1
0
0
0
1
1
0
0
0
0
1
0
0
0
1
0
0
0
0
0
1
0
0
0
0
1
0
1
0
0
1
0
0
0
0
0
1
1
0
0
1
0
0
0
0
0
0
1
1
1
1
0
0
0
0
0
0
0
b. The truth table for a 3-bit decoder
3
Out0 Out1 Out2 Out3 Out4 Out5 Out6 Out7
Decoder
a. A 3-bit decoder
FIGURE C.3.1 A 3-bit decoder has 3 inputs, called 12, 11, and 10, and 23 = 8 outputs, called Out0 to Out7. Only the output corresponding to the binary value of the input is true, as shown in the truth table. The label 3 on the input to the decoder says that the input signal is 3 bits wide.

C-10
Appendix C
The Basics of Logic Design
A0 M
u
x
B1
S
A
C
C
B
S
selector value Also called control value. The control signal that is used to select one of the input values of a multiplexor
as the output of the multiplexor.
FIGURE C.3.2 A two-input multiplexor on the left and its implementation with gates on the right. The multiplexor has two data inputs (A and B), which are labeled 0 and 1, and one selector input (S), as well as an output C. Implementing multiplexors in Verilog requires a little more work, especially when they are wider than two inputs. We show how to do this beginning on page C-23.
Multiplexors
One basic logic function that we use quite often in Chapter 4 is the multiplexor.
A multiplexor might more properly be called a selector, since its output is one
of the inputs that is selected by a control. Consider the two-input multiplexor.
The left side of Figure C.3.2 shows this multiplexor has three inputs: two data
values and a selector (or control) value. The selector value determines which of
the inputs becomes the output. We can represent the logic function computed by
a two-input multiplexor, shown in gate form on the right side of Figure C.3.2, as _
C = (A · S) + (B · S).
Multiplexors can be created with an arbitrary number of data inputs. When
there are only two inputs, the selector is a single signal that selects one of the inputs if it is true (1) and the other if it is false (0). If there are n data inputs, there will need to be ⎡log2n⎤ selector inputs. In this case, the multiplexor basically consists of three parts:
1. A decoder that generates n signals, each indicating a different input value
2. An array of n AND gates, each combining one of the inputs with a signal
from the decoder
3. A single large OR gate that incorporates the outputs of the AND gates
To associate the inputs with selector values, we often label the data inputs numeri- cally (i.e., 0, 1, 2, 3, . . . , n − 1) and interpret the data selector inputs as a binary number. Sometimes, we make use of a multiplexor with undecoded selector signals.
Multiplexors are easily represented combinationally in Verilog by using if expressions. For larger multiplexors, case statements are more convenient, but care must be taken to synthesize combinational logic.

Two-Level Logic and PLAs
As pointed out in the previous section, any logic function can be implemented with only AND, OR, and NOT functions. In fact, a much stronger result is true. Any logic function can be written in a canonical form, where every input is either a true or complemented variable and there are only two levels of gates—one being AND and the other OR—with a possible inversion on the final output. Such a representation is called a two-level representation, and there are two forms, called sum of products and product of sums. A sum-of-products representation is a logical sum (OR) of products (terms using the AND operator); a product of sums is just the opposite. In our earlier example, we had two equations for the output E:
sum of products A form of logical representation that employs a logical sum (OR) of products (terms joined using the AND operator).
and
________
E = ((A · B) + (A · C) + (B · C)) · (A · B · C) __ __ __
E = ( A · B · C ) + (A · C · B ) + ( B · C · A )
This second equation is in a sum-of-products form: it has two levels of logic and the
only inversions are on individual variables. The first equation has three levels of logic.
Elaboration: We can also write E as a product of sums: _______________________________
__ __ __ __ __
E = (A + B + C) · (A + C + B) · (B + C + A)
To derive this form, you need to use DeMorgan’s theorems, which are discussed in the exercises.
In this text, we use the sum-of-products form. It is easy to see that any logic function can be represented as a sum of products by constructing such a represen- tation from the truth table for the function. Each truth table entry for which the function is true corresponds to a product term. The product term consists of a logical product of all the inputs or the complements of the inputs, depending on whether the entry in the truth table has a 0 or 1 corresponding to this variable. The logic function is the logical sum of the product terms where the function is true. This is more easily seen with an example.
C.3 Combinational Logic
C-11

C-12
Appendix C The Basics of Logic Design
EXAMPLE
Sum of Products
Show the sum-of-products representation for the following truth table for D.
Inputs
Output
A
B
C
D
0
0
0
0
0
0
1
1
0
1
0
1
0
1
1
0
1
0
0
1
1
0
1
0
1
1
0
0
1
1
1
1
ANSWER
programmable logic array (PLA) A structured-logic element composed of a set of inputs and corresponding input complements and two stages of logic: the first generating product terms of the inputs and input complements, and the second generating sum terms of the product terms. Hence, PLAs implement logic functions as a sum of products.
minterms Also called product terms. A set
of logic inputs joined
by conjunction (AND operations); the product terms form the first logic stage of the programmable logic array (PLA).
There are four product terms, since the function is true (1) for four different input combinations. These are:
__ __ A·B·C __ __ A·B·C
__ __
A·B·C
A·B·C
Thus, we can write the function for D as the sum of these terms: __ __ __ __ __ __
D = (A · B · C) + (A · B · C) + (A · B · C) + (A · B · C)
Note that only those truth table entries for which the function is true generate
terms in the equation.
We can use this relationship between a truth table and a two-level representa- tion to generate a gate-level implementation of any set of logic functions. A set of logic functions corresponds to a truth table with multiple output columns, as we saw in the example on page C-5. Each output column represents a different logic function, which may be directly constructed from the truth table.
The sum-of-products representation corresponds to a common structured- logic implementation called a programmable logic array (PLA). A PLA has a set of inputs and corresponding input complements (which can be implemented with a set of inverters), and two stages of logic. The first stage is an array of AND gates that form a set of product terms (sometimes called minterms); each product term can consist of any of the inputs or their complements. The second stage is an array of OR gates, each of which forms a logical sum of any number of the product terms. Figure C.3.3 shows the basic form of a PLA.

C.3 Combinational Logic C-13
AND gates
OR gates
Inputs
Product terms
Outputs
FIGURE C.3.3 The basic form of a PLA consists of an array of AND gates followed by an array of OR gates. Each entry in the AND gate array is a product term consisting of any number of inputs or inverted inputs. Each entry in the OR gate array is a sum term consisting of any number of these product terms.
A PLA can directly implement the truth table of a set of logic functions with multiple inputs and outputs. Since each entry where the output is true requires a product term, there will be a corresponding row in the PLA. Each output corresponds to a potential row of OR gates in the second stage. The number of OR gates corresponds to the number of truth table entries for which the output is true. The total size of a PLA, such as that shown in Figure C.3.3, is equal to the sum of the size of the AND gate array (called the AND plane) and the size of the OR gate array (called the OR plane). Looking at Figure C.3.3, we can see that the size of the AND gate array is equal to the number of inputs times the number of different product terms, and the size of the OR gate array is the number of outputs times the number of product terms.
A PLA has two characteristics that help make it an efficient way to implement a set of logic functions. First, only the truth table entries that produce a true value for at least one output have any logic gates associated with them. Second, each dif- ferent product term will have only one entry in the PLA, even if the product term is used in multiple outputs. Let’s look at an example.
PLAs
Consider the set of logic functions defined in the example on page C-5. Show a PLA implementation of this example for D, E, and F.
EXAMPLE

C-14
Appendix C The Basics of Logic Design
ANSWER
Here is the truth table we constructed earlier:
Inputs
Outputs
A
B
C
D
E
F
0
0
0
0
0
0
0
0
1
1
0
0
0
1
0
1
0
0
0
1
1
1
1
0
1
0
0
1
0
0
1
0
1
1
1
0
1
1
0
1
1
0
1
1
1
1
0
1
read-only memory (ROM) A memory whose contents are designated at creation time, after which the contents can only be read. ROM is used as structured logic to implement a
set of logic functions by using the terms in the logic functions as address inputs and the outputs as bits in each word of the memory.
programmable ROM (PROM) A form of read-only memory that can be programmed when a designer knows its contents.
Since there are seven unique product terms with at least one true value in the output section, there will be seven columns in the AND plane. The number of rows in the AND plane is three (since there are three inputs), and there are also three rows in the OR plane (since there are three outputs). Figure C.3.4 shows the resulting PLA, with the product terms corresponding to the truth table entries from top to bottom.
Rather than drawing all the gates, as we do in Figure C.3.4, designers often show just the position of AND gates and OR gates. Dots are used on the intersection of a product term signal line and an input line or an output line when a corresponding AND gate or OR gate is required. Figure C.3.5 shows how the PLA of Figure C.3.4 would look when drawn in this way. The contents of a PLA are fixed when the PLA is created, although there are also forms of PLA-like structures, called PALs, that can be programmed electronically when a designer is ready to use them.
ROMs
Another form of structured logic that can be used to implement a set of logic func- tions is a read-only memory (ROM). A ROM is called a memory because it has a set of locations that can be read; however, the contents of these locations are fixed, usually at the time the ROM is manufactured. There are also programmable ROMs (PROMs) that can be programmed electronically, when a designer knows their contents. There are also erasable PROMs; these devices require a slow erasure process using ultraviolet light, and thus are used as read-only memories, except during the design and debugging process.
A ROM has a set of input address lines and a set of outputs. The number of addressable entries in the ROM determines the number of address lines: if the

ROM contains 2m addressable entries, called the height, then there are m input lines. The number of bits in each addressable entry is equal to the number of output bits and is sometimes called the width of the ROM. The total number of bits in the ROM is equal to the height times the width. The height and width are sometimes collectively referred to as the shape of the ROM.
Inputs
A B C
C.3 Combinational Logic C-15
Outputs D
E F
FIGURE C.3.4
The PLA for implementing the logic function described in the example.
A ROM can encode a collection of logic functions directly from the truth table. For example, if there are n functions with m inputs, we need a ROM with m address lines (and 2m entries), with each entry being n bits wide. The entries in the input portion of the truth table represent the addresses of the entries in the ROM, while the contents of the output portion of the truth table constitute the contents of the ROM. If the truth table is organized so that the sequence of entries in the input portion constitutes a sequence of binary numbers (as have all the truth tables we have shown so far), then the output portion gives the ROM contents in order as well. In the example starting on page C-13, there were three inputs and three outputs. This leads to a ROM with 23 = 8 entries, each 3 bits wide. The contents of those entries in increasing order by address are directly given by the output portion of the truth table that appears on page C-14.
ROMs and PLAs are closely related. A ROM is fully decoded: it contains a full output word for every possible input combination. A PLA is only partially decoded. This means that a ROM will always contain more entries. For the earlier truth table on page C-14, the ROM contains entries for all eight possible inputs, whereas the PLA contains only the seven active product terms. As the number of inputs grows,

C-16 Appendix C
The Basics of Logic Design
Inputs A
B
C
AND plane
Outputs D
OR plane E F
FIGURE C.3.5
and sum terms in the array. Rather than use inverters on the gates, usually all the inputs are run the width of the AND plane in both true and complement forms. A dot in the AND plane indicates that the input, or its inverse, occurs in the product term. A dot in the OR plane indicates that the corresponding product term appears in the corresponding output.
the number of entries in the ROM grows exponentially. In contrast, for most real logic functions, the number of product terms grows much more slowly (see the examples in Appendix D). This difference makes PLAs generally more efficient for implementing combinational logic functions. ROMs have the advantage of being able to implement any logic function with the matching number of inputs and outputs. This advantage makes it easier to change the ROM contents if the logic function changes, since the size of the ROM need not change.
In addition to ROMs and PLAs, modern logic synthesis systems will also trans- late small blocks of combinational logic into a collection of gates that can be placed and wired automatically. Although some small collections of gates are usually not area efficient, for small logic functions they have less overhead than the rigid structure of a ROM and PLA and so are preferred.
For designing logic outside of a custom or semicustom integrated circuit, a common choice is a field programming device; we describe these devices in Section C.12.
A PLA drawn using dots to indicate the components of the product terms

Don’t Cares
Often in implementing some combinational logic, there are situations where we do not care what the value of some output is, either because another output is true or because a subset of the input combinations determines the values of the outputs. Such situations are referred to as don’t cares. Don’t cares are important because they make it easier to optimize the implementation of a logic function.
There are two types of don’t cares: output don’t cares and input don’t cares, both of which can be represented in a truth table. Output don’t cares arise when we don’t care about the value of an output for some input combination. They appear as Xs in the output portion of a truth table. When an output is a don’t care for some input combination, the designer or logic optimization program is free to make the output true or false for that input combination. Input don’t cares arise when an output depends on only some of the inputs, and they are also shown as Xs, though in the input portion of the truth table.
Don’t Cares
Consider a logic function with inputs A, B, and C defined as follows:
■ If A or C is true, then output D is true, whatever the value of B.
■ If A or B is true, then output E is true, whatever the value of C.
■ Output F is true if exactly one of the inputs is true, although we don’t care about the value of F, whenever D and E are both true.
Show the full truth table for this function and the truth table using don’t cares. How many product terms are required in a PLA for each of these?
Here’s the full truth table, without don’t cares:
EXAMPLE
C.3 Combinational Logic C-17
ANSWER
Inputs
Outputs
A
B
C
D
E
F
0
0
0
0
0
0
0
0
1
1
0
1
0
1
0
0
1
1
0
1
1
1
1
0
1
0
0
1
1
1
1
0
1
1
1
0
1
1
0
1
1
0
1
1
1
1
1
0

C-18 Appendix C The Basics of Logic Design
This requires seven product terms without optimization. The truth table written with output don’t cares looks like this:
Inputs
Outputs
A
B
C
D
E
F
0
0
0
0
0
0
0
0
1
1
0
1
0
1
0
0
1
1
0
1
1
1
1
X
1
0
0
1
1
X
1
0
1
1
1
X
1
1
0
1
1
X
1
1
1
1
1
X
If we also use the input don’t cares, this truth table can be further simplified to yield the following:
Inputs
Outputs
A
B
C
D
E
F
0
0
0
0
0
0
0
0
1
1
0
1
0
1
0
0
1
1
X
1
1
1
1
X
1
X
X
1
1
X
This simplified truth table requires a PLA with four minterms, or it can be implemented in discrete gates with one two-input AND gate and three OR gates (two with three inputs and one with two inputs). This compares to the original truth table that had seven minterms and would have required four AND gates.
Logic minimization is critical to achieving efficient implementations. One tool useful for hand minimization of random logic is Karnaugh maps. Karnaugh maps represent the truth table graphically, so that product terms that may be combined are easily seen. Nevertheless, hand optimization of significant logic functions using Karnaugh maps is impractical, both because of the size of the maps and their complexity. Fortunately, the process of logic minimization is highly mechanical and can be performed by design tools. In the process of minimization, the tools take advantage of the don’t cares, so specifying them is important. The textbook references at the end of this Appendix provide further discussion on logic minimization, Karnaugh maps, and the theory behind such minimization algorithms.
Arrays of Logic Elements
Many of the combinational operations to be performed on data have to be done to an entire word (32 bits) of data. Thus we often want to build an array of logic elements,

C.3 Combinational Logic
C-19
which we can represent simply by showing that a given operation will happen to an entire collection of inputs. For example, we saw on page C-9 what a 1-bit multiplexor looked like, but inside a machine, much of the time we want to select between a pair of buses. A bus is a collection of data lines that is treated together as a single logical signal. (The term bus is also used to indicate a shared collection of lines with multiple sources and uses, especially in Chapter 6, where I/O buses were discussed.)
For example, in the MIPS instruction set, the result of an instruction that is written into a register can come from one of two sources. A multiplexor is used to choose which of the two buses (each 32 bits wide) will be written into the Result register. The 1-bit multiplexor, which we showed earlier, will need to be replicated 32 times.
We indicate that a signal is a bus rather than a single 1-bit line by showing it with a thicker line in a figure. Most buses are 32 bits wide; those that are not are explicitly labeled with their width. When we show a logic unit whose inputs and outputs are buses, this means that the unit must be replicated a sufficient number of times to accommodate the width of the input. Figure C.3.6 shows how we draw a multiplexor that selects between a pair of 32-bit buses and how this expands in terms of 1-bit-wide multiplexors. Sometimes we need to construct an array of logic elements where the inputs for some elements in the array are outputs from earlier elements. For example, this is how a multibit-wide ALU is constructed. In such cases, we must explicitly show how to create wider arrays, since the individual elements of the array are no longer independent, as they are in the case of a 32-bit-wide multiplexor.
Select Select
A 32 A31 MM
u32C uC31 B32x B31x
bus In logic design, a collection of data lines that is treated together as a single logical signal; also, a shared collection of lines with multiple sources and uses.
A30
B30 x .
A0
M
u
x B0
M u
C30 .
C0
a. A 32-bit wide 2-to-1 multiplexor
b. The 32-bit wide multiplexor is actually an array of 32 1-bit multiplexors
FIGURE C.3.6 A multiplexor is arrayed 32 times to perform a selection between two 32-bit inputs. Note that there is still only one data selection signal used for all 32 1-bit multiplexors.

C-20
Appendix C The Basics of Logic Design
Check Yourself
Parity is a function in which the output depends on the number of 1s in the input. For an even parity function, the output is 1 if the input has an even number of ones. Suppose a ROM is used to implement an even parity function with a 4-bit input. Which of A, B, C, or D represents the contents of the ROM?
Address A B C D
00101
10110
20101
30110
40101
50110
60101
70110
81001
91010
10 1 0 0 1
11 1 0 1 0
12 1 0 0 1
13 1 0 1 0
14 1 0 0 1
15 1 0 1 0
hardware description language Aprogramming language for describing hardware, used for generating simulations of a hardware design and also as input to synthesis tools that can generate actual hardware.
Verilog One of the two most common hardware description languages.
VHDL One of the two most common hardware description languages.
C.4 Using a Hardware Description Language Today most digital design of processors and related hardware systems is done
using a hardware description language. Such a language serves two purposes. First, it provides an abstract description of the hardware to simulate and debug the design. Second, with the use of logic synthesis and hardware compilation tools, this description can be compiled into the hardware implementation.
In this section, we introduce the hardware description language Verilog and show how it can be used for combinational design. In the rest of the appendix, we expand the use of Verilog to include design of sequential logic. In the optional sections of Chapter 4 that appear on the CD, we use Verilog to describe processor implementations. In the optional section from Chapter 5 that appears on the CD, we use system Verilog to describe cache controller implementations. System Verilog adds structures and some other useful features to Verilog.
Verilog is one of the two primary hardware description languages; the other is VHDL. Verilog is somewhat more heavily used in industry and is based on C, as opposed to VHDL, which is based on Ada. The reader generally familiar with C will

C.4 Using a Hardware Description Language
C-21
find the basics of Verilog, which we use in this appendix, easy to follow. Readers already familiar with VHDL should find the concepts simple, provided they have been exposed to the syntax of C.
Verilog can specify both a behavioral and a structural definition of a digital sys- tem. A behavioral specification describes how a digital system functionally oper- ates. A structural specification describes the detailed organization of a digital system, usually using a hierarchical description. A structural specification can be used to describe a hardware system in terms of a hierarchy of basic elements such as gates and switches. Thus, we could use Verilog to describe the exact contents of the truth tables and datapath of the last section.
With the arrival of hardware synthesis tools, most designers now use Verilog or VHDL to structurally describe only the datapath, relying on logic synthesis to generate the control from a behavioral description. In addition, most CAD sys- tems provide extensive libraries of standardized parts, such as ALUs, multiplexors, register files, memories, and programmable logic blocks, as well as basic gates.
Obtaining an acceptable result using libraries and logic synthesis requires that the specification be written with an eye toward the eventual synthesis and the desired outcome. For our simple designs, this primarily means making clear what we expect to be implemented in combinational logic and what we expect to require sequential logic. In most of the examples we use in this section and the remainder of this appendix, we have written the Verilog with the eventual synthesis in mind.
Datatypes and Operators in Verilog
There are two primary datatypes in Verilog:
1. A wire specifies a combinational signal.
2. A reg (register) holds a value, which can vary with time. A reg need not necessarily correspond to an actual register in an implementation, although it often will.
A register or wire, named X, that is 32 bits wide is declared as an array: reg [31:0] X or wire [31:0] X, which also sets the index of 0 to designate the least significant bit of the register. Because we often want to access a subfield of a register or wire, we can refer to a contiguous set of bits of a register or wire with the notation [starting bit: ending bit], where both indices must be constant values.
An array of registers is used for a structure like a register file or memory. Thus, the declaration
reg [31:0] registerfile[0:31]
specifies a variable registerfile that is equivalent to a MIPS registerfile, where register 0 is the first. When accessing an array, we can refer to a single element, as in C, using the notation registerfile[regnum].
behavioral specification
Describes how a digital system operates functionally.
structural specification
Describes how a digital system is organized in terms of a hierarchical connection of elements.
hardware synthesis tools Computer-aided design software that
can generate a gate-
level design based on behavioral descriptions of a digital system.
wire In Verilog, specifies a combinational signal.
reg InVerilog,aregister.

C-22
Appendix C The Basics of Logic Design
Check Yourself
The possible values for a register or wire in Verilog are
■ 0 or 1, representing logical false or true
■ X, representing unknown, the initial value given to all registers and to any wire not connected to something
■ Z, representing the high-impedance state for tristate gates, which we will not discuss in this appendix
Constant values can be specified as decimal numbers as well as binary, octal,
or hexadecimal. We often want to say exactly how large a constant field is in bits. This is done by prefixing the value with a decimal number specifying its size in bits. For example:
■ 4’b0100 specifies a 4-bit binary constant with the value 4, as does 4’d4.
■ – 8 ‘h4 specifies an 8-bit constant with the value −4 (in two’s complement
representation)
Values can also be concatenated by placing them within { } separated by com-
mas. The notation {x {bit field}} replicates bit field x times. For example:
■ {16{2’b01}} creates a 32-bit value with the pattern 0101 . . . 01.
■ {A[31:16],B[15:0]} creates a value whose upper 16 bits come from A and whose lower 16 bits come from B.
Verilog provides the full set of unary and binary operators from C, including the arithmetic operators (+, −, *. /), the logical operators (&, |, ~), the comparison operators (= =, !=, >, <, <=, >=), the shift operators (<<, >>), and C’s conditional operator (?, which is used in the form condition ? expr1 :expr2 and returns expr1 if the condition is true and expr2 if it is false). Verilog adds a set of unary logic reduction operators (&, |, ^) that yield a single bit by applying the logical operator to all the bits of an operand. For example, &A returns the value obtained by ANDing all the bits of A together, and ^A returns the reduction obtained by using exclusive OR on all the bits of A.
Which of the following define exactly the same value?
1. 8’b11110000
2. 8’hF0
3. 8’d240
4. {{4{1’b1}},{4{1’b0}}} 5. {4’b1,4’b0}

C.4 Using a Hardware Description Language C-23
Structure of a Verilog Program
A Verilog program is structured as a set of modules, which may represent anything from a collection of logic gates to a complete system. Modules are similar to classes in C++, although not nearly as powerful. A module specifies its input and output ports, which describe the incoming and outgoing connections of a module. A module may also declare additional variables. The body of a module consists of:
■ initialconstructs,whichcaninitializeregvariables
■ Continuous assignments, which define only combinational logic
■ always constructs, which can define either sequential or combinational logic
■ Instances of other modules, which are used to implement the module being defined
Representing Complex Combinational Logic in Verilog
A continuous assignment, which is indicated with the keyword assign, acts like a combinational logic function: the output is continuously assigned the value, and a change in the input values is reflected immediately in the output value. Wires may only be assigned values with continuous assignments. Using continuous assignments, we can define a module that implements a half-adder, as Figure C.4.1 shows.
Assign statements are one sure way to write Verilog that generates combinational logic. For more complex structures, however, assign statements may be awkward or tedious to use. It is also possible to use the always block of a module to describe a combinational logic element, although care must be taken. Using an always block allows the inclusion of Verilog control constructs, such as if-then–else, case statements, for statements, and repeat statements, to be used. These statements are similar to those in C with small changes.
An always block specifies an optional list of signals on which the block is sen- sitive (in a list starting with @). The always block is re-evaluated if any of the listed
module half_adder (A,B,Sum,Carry);
input A,B; //two 1-bit inputs
output Sum, Carry; //two 1-bit outputs
assign Sum = A ^ B; //sum is A xor B
assign Carry = A & B; //Carry is A and B
endmodule
FIGURE C.4.1 A Verilog module that defines a half-adder using continuous assignments.

C-24
Appendix C The Basics of Logic Design
sensitivity list The list of signals that specifies when an always block should be re-evaluated.
signals changes value; if the list is omitted, the always block is constantly re-evaluated. When an always block is specifying combinational logic, the sensitivity list should include all the input signals. If there are multiple Verilog state- ments to be executed in an always block, they are surrounded by the keywords begin and end, which take the place of the { and } in C. An always block thus looks like this:
always @(list of signals that cause reevaluation) begin
Verilog statements including assignments and other
control statements end
Reg variables may only be assigned inside an always block, using a procedural assignment statement (as distinguished from continuous assignment we saw earlier). There are, however, two different types of procedural assignments. The assignment operator = executes as it does in C; the right-hand side is evaluated, and the left-hand side is assigned the value. Furthermore, it executes like the nor- mal C assignment statement: that is, it is completed before the next statement is executed. Hence, the assignment operator = has the name blocking assignment. This blocking can be useful in the generation of sequential logic, and we will return to it shortly. The other form of assignment (nonblocking) is indicated by <=. In nonblocking assignment, all right-hand sides of the assignments in an always group are evaluated and the assignments are done simultaneously. As a first example of combinational logic implemented using an always block, Figure C.4.2 shows the implementation of a 4-to-1 multiplexor, which uses a case construct to make it easy to write. The case construct looks like a C switch statement. Figure C.4.3 shows a definition of a MIPS ALU, which also uses a case statement. Since only reg variables may be assigned inside always blocks, when we want to describe combinational logic using an always block, care must be taken to ensure that the reg does not synthesize into a register. A variety of pitfalls are described in the elaboration below. Elaboration: Continuous assignment statements always yield combinational logic, but other Verilog structures, even when in always blocks, can yield unexpected results during logic synthesis. The most common problem is creating sequential logic by imply- ing the existence of a latch or register, which results in an implementation that is both slower and more costly than perhaps intended. To ensure that the logic that you intend to be combinational is synthesized that way, make sure you do the following: 1. Place all combinational logic in a continuous assignment or an always block. 2. Make sure that all the signals used as inputs appear in the sensitivity list of an always block. 3. Ensure that every path through an always block assigns a value to the exact same set of bits. The last of these is the easiest to overlook; read through the example in Figure C.5.15 to convince yourself that this property is adhered to. blocking assignment In Verilog, an assignment that completes before the execution of the next statement. nonblocking assignment An assignment that continues after evaluating the right-hand side, assigning the left-hand side the value only after all right-hand sides are evaluated. C.4 Using a Hardware Description Language C-25 module Mult4to1 (In1,In2,In3,In4,Sel,Out); input [31:0] In1, In2, In3, In4; /four 32-bit inputs input [1:0] Sel; //selector signal output reg [31:0] Out;// 32-bit output always @(In1, In2, In3, In4, Sel) case (Sel) //a 4->1 multiplexor
0: Out <= In1; 1: Out <= In2; 2: Out <= In3; default: Out <= In4; endcase endmodule FIGURE C.4.2 A Verilog definition of a 4-to-1 multiplexor with 32-bit inputs, using a case statement. The case statement acts like a C switch statement, except that in Verilog only the code associ- ated with the selected case is executed (as if each case state had a break at the end) and there is no fall-through to the next statement. module MIPSALU (ALUctl, A, B, ALUOut, Zero); input [3:0] ALUctl; input [31:0] A,B; output reg [31:0] ALUOut; output Zero; assign Zero = (ALUOut==0); //Zero is true if ALUOut is 0; goes anywhere always @(ALUctl, A, B) //reevaluate if these change case (ALUctl) 0: ALUOut <= A & B; 1: ALUOut <= A | B; 2: ALUOut <= A + B; 6: ALUOut <= A – B; 7: ALUOut <= A < B ? 1:0; 12: ALUOut <= ~(A | B); // result is nor default: ALUOut <= 0; //default to 0, should not happen; endcase endmodule FIGURE C.4.3 A Verilog behavioral definition of a MIPS ALU. This could be synthesized using a module library containing basic arithmetic and logical operations. C-26 Appendix C The Basics of Logic Design Check Yourself ALU n. [Arthritic Logic Unit or (rare) Arithmetic Logic Unit] A random-number generator supplied as standard with all computer systems. Stan Kelly-Bootle, The Devil’s DP Dictionary, 1981 Assuming all values are initially zero, what are the values of A and B after executing this Verilog code inside an always block? C.5 C=1; A <= C; B = C; Constructing a Basic Arithmetic Logic Unit The arithmetic logic unit (ALU ) is the brawn of the computer, the device that performs the arithmetic operations like addition and subtraction or logical opera- tions like AND and OR. This section constructs an ALU from four hardware building blocks (AND and OR gates, inverters, and multiplexors) and illustrates how combinational logic works. In the next section, we will see how addition can be sped up through more clever designs. Because the MIPS word is 32 bits wide, we need a 32-bit-wide ALU. Let’s assume that we will connect 32 1-bit ALUs to create the desired ALU. We’ll therefore start by constructing a 1-bit ALU. A 1-Bit ALU The logical operations are easiest, because they map directly onto the hardware components in Figure C.2.1. The 1-bit logical unit for AND and OR looks like Figure C.5.1. The multiplexor on the right then selects a AND b or a OR b, depending on whether the value of Operation is 0 or 1. The line that controls the multiplexor is shown in color to dis- tinguish it from the lines containing data. Notice that we have renamed the control and output lines of the multiplexor to give them names that reflect the function of the ALU. The next function to include is addition. An adder must have two inputs for the operands and a single-bit output for the sum. There must be a second output to pass on the carry, called CarryOut. Since the CarryOut from the neighbor adder must be included as an input, we need a third input. This input is called CarryIn. Figure C.5.2 shows the inputs and the outputs of a 1-bit adder. Since we know what addition is supposed to do, we can specify the outputs of this “black box” based on its inputs, as Figure C.5.3 demonstrates. We can express the output functions CarryOut and Sum as logical equations, and these equations can in turn be implemented with logic gates. Let’s do Carry- Out. Figure C.5.4 shows the values of the inputs when CarryOut is a 1. We can turn this truth table into a logical equation: CarryOut = (b · CarryIn) + (a · CarryIn) + (a · b) + (a · b · CarryIn) C.5 Constructing a Basic Arithmetic Logic Unit C-27 a b Operation 0 1 Result FIGURE C.5.1 The 1-bit logical unit for AND and OR. CarryIn a b Sum + CarryOut A 1-bit adder. This adder is called a full adder; it is also called a (3,2) adder because it has 3 inputs and 2 outputs. An adder with only the a and b inputs is called a (2,2) adder or half-adder. FIGURE C.5.2 Inputs Outputs Comments a b CarryIn CarryOut Sum 0 0 0 0 0 0 + 0 + 0 = 00two 0 0 1 0 1 0 + 0 + 1 = 01two 0 1 0 0 1 0 + 1 + 0 = 01two 0 1 1 1 0 0 + 1 + 1 = 10two 1 0 0 0 1 1 + 0 + 0 = 01two 1 0 1 1 0 1 + 0 + 1 = 10two 1 1 0 1 0 1 + 1 + 0 = 10two 1 1 1 1 1 1 + 1 + 1 = 11two FIGURE C.5.3 Input and output specification for a 1-bit adder. C-28 Appendix C The Basics of Logic Design If a · b · CarryIn is true, then all of the other three terms must also be true, so we can leave out this last term corresponding to the fourth line of the table. We can thus simplify the equation to CarryOut = (b · CarryIn) + (a · CarryIn) + (a · b) Figure C.5.5 shows that the hardware within the adder black box for CarryOut consists of three AND gates and one OR gate. The three AND gates correspond exactly to the three parenthesized terms of the formula above for CarryOut, and the OR gate sums the three terms. Inputs a b CarryIn 0 1 1 1 0 1 1 1 0 1 1 1 FIGURE C.5.4 Values of the inputs when CarryOut is a 1. CarryIn a b FIGURE C.5.5 CarryOut Adder hardware for the CarryOut signal. The rest of the adder hardware is the logic for the Sum output given in the equation on this page. The Sum bit is set when exactly one input is 1 or when all three inputs are 1. The Sum results in a complex Boolean equation (recall that _a means NOT a): __ _______ _ _______ _ __ Sum = (a · b · CarryIn) + (a · b · CarryIn) + (a · b · CarryIn) + (a · b · CarryIn) The drawing of the logic for the Sum bit in the adder black box is left as an exercise for the reader. C.5 Constructing a Basic Arithmetic Logic Unit C-29 Figure C.5.6 shows a 1-bit ALU derived by combining the adder with the earlier components. Sometimes designers also want the ALU to perform a few more sim- ple operations, such as generating 0. The easiest way to add an operation is to expand the multiplexor controlled by the Operation line and, for this example, to connect 0 directly to the new input of that expanded multiplexor. Operation CarryIn 0 1 2 1 a b Result FIGURE C.5.6 A 1-bit ALU that performs AND, OR, and addition (see Figure C.5.5). CarryOut A 32-Bit ALU Now that we have completed the 1-bit ALU, the full 32-bit ALU is created by con- necting adjacent “black boxes.” Using xi to mean the ith bit of x, Figure C.5.7 shows a 32-bit ALU. Just as a single stone can cause ripples to radiate to the shores of a quiet lake, a single carry out of the least significant bit (Result0) can ripple all the way through the adder, causing a carry out of the most significant bit (Result31). Hence, the adder created by directly linking the carries of 1-bit adders is called a ripple carry adder. We’ll see a faster way to connect the 1-bit adders starting on page C-38. Subtraction is the same as adding the negative version of an operand, and this is how adders perform subtraction. Recall that the shortcut for negating a two’s complement number is to invert each bit (sometimes called the one’s complement) and then add 1. To invert each bit, we simply add a 2:1 multiplexor that chooses __ between b and b, as Figure C.5.8 shows. Suppose we connect 32 of these 1-bit ALUs, as we did in Figure C.5.7. The added multiplexor gives the option of b or its inverted value, depending on Binvert, but C-30 Appendix C The Basics of Logic Design a0 b0 a1 b1 a2 b2 a31 b31 . . . Result0 Result1 Result2 Result31 CarryIn Operation CarryIn ALU0 CarryOut CarryIn ALU1 CarryOut CarryIn ALU2 CarryOut A 32-bit ALU constructed from 32 1-bit ALUs. CarryOut of the less significant bit is connected to the CarryIn of the more significant bit. This organization is called ripple carry. this is only one step in negating a two’s complement number. Notice that the least significant bit still has a CarryIn signal, even though it’s unnecessary for addition. What happens if we set this CarryIn to 1 instead of 0? The adder will then calculate a + b + 1. By selecting the inverted version of b, we get exactly what we want: __ __ a+b +1=a+(b +1)=a+(−b)=a−b The simplicity of the hardware design of a two’s complement adder helps explain why two’s complement representation has become the universal standard for integer computer arithmetic. FIGURE C.5.7 CarryIn ALU31 C.5 Constructing a Basic Arithmetic Logic Unit C-31 Binvert Operation CarryIn 0 1 2 0 1 1 a b FIGURE C.5.8 Result CarryOut __ A 1-bit ALU that performs AND, OR, and addition on a and b or a and b. By __ selecting b (Binvert = 1) and setting CarryIn to 1 in the least significant bit of the ALU, we get two’s complement subtraction of b from a instead of addition of b to a. A MIPS ALU also needs a NOR function. Instead of adding a separate gate for NOR, we can reuse much of the hardware already in the ALU, like we did for sub- tract. The insight comes from the following truth about NOR: _____ _ __ (a+b)=a·b That is, NOT (a OR b) is equivalent to NOT a AND NOT b. This fact is called DeMorgan’s theorem and is explored in the exercises in more depth. Since we have AND and NOT b, we only need to add NOT a to the ALU. Figure C.5.9 shows that change. Tailoring the 32-Bit ALU to MIPS These four operations—add, subtract, AND, OR—are found in the ALU of almost every computer, and the operations of most MIPS instructions can be performed by this ALU. But the design of the ALU is incomplete. One instruction that still needs support is the set on less than instruction (slt). Recall that the operation produces 1 if rs < rt, and 0 otherwise. Consequently, slt will set all but the least significant bit to 0, with the least significant bit set according to the comparison. For the ALU to perform slt, we first need to expand the three-input C-32 Appendix C The Basics of Logic Design Ainvert Binvert Operation CarryIn 0 1 0 1 2 0 1 􏰀 a b Result FIGURE C.5.9 __ __ selecting a (Ainvert = 1) and b (Binvert = 1), we get a NOR b instead of a AND b. _ multiplexor in Figure C.5.8 to add an input for the slt result. We call that new input Less and use it only for slt. The top drawing of Figure C.5.10 shows the new 1-bit ALU with the expanded multiplexor. From the description of slt above, we must connect 0 to the Less input for the upper 31 bits of the ALU, since those bits are always set to 0. What remains to consider is how to compare and set the least significant bit for set on less than instructions. What happens if we subtract b from a? If the difference is negative, then a < b since (a − b) < 0 ⇒ ((a − b) + b) < (0 + b) ⇒a In addition to the basic laws we discussed in this section, there
are two important theorems, called DeMorgan’s theorems:
_____ __ _ ____ __ _
A + B = A · B and A · B = A + B
Prove DeMorgan’s theorems with a truth table of the form
A
B
__ A
__ B
______ A+ B
__ __ A·B
______ A ·B
__ __ A+B
0
0
1
1
1
1
1
1
0
1
1
0
0
0
1
1
1
0
0
1
0
0
1
1
1
1
0
0
0
0
0
0
C.2 [15] <§C.2> Prove that the two equations for E in the example starting on page C-7 are equivalent by using DeMorgan’s theorems and the axioms shown on page C-7.
C.3 [10]<§C.2>Showthatthereare2nentriesinatruthtableforafunctionwith n inputs.
C.4 [10]<§C.2>Onelogicfunctionthatisusedforavarietyofpurposes(includ- ing within adders and to compute parity) is exclusive OR. The output of a two- input exclusive OR function is true only if exactly one of the inputs is true. Show the truth table for a two-input exclusive OR function and implement this function using AND gates, OR gates, and inverters.
C.5 [15] <§C.2> Prove that the NOR gate is universal by showing how to build the AND, OR, and NOT functions using a two-input NOR gate.
C.6 [15] <§C.2> Prove that the NAND gate is universal by showing how to build the AND, OR, and NOT functions using a two-input NAND gate.
C.7 [10]<§§C.2,C.3>Constructthetruthtableforafour-inputodd-parityfunc- tion (see page C-65 for a description of parity).
C.8 [10] <§§C.2, C.3> Implement the four-input odd-parity function with AND and OR gates using bubbled inputs and outputs.
C.9 [10] <§§C.2, C.3> Implement the four-input odd-parity function with a PLA.

C.14 Exercises C-81
C.10 [15] <§§C.2, C.3> Prove that a two-input multiplexor is also universal by showing how to build the NAND (or NOR) gate using a multiplexor.
C.11 [5] <§§4.2, C.2, C.3> Assume that X consists of 3 bits, x2 x1 x0. Write four logic functions that are true if and only if
■ X contains only one 0
■ X contains an even number of 0s
■ X when interpreted as an unsigned binary number is less than 4
■ X when interpreted as a signed (two’s complement) number is negative
C.12 [5] <§§4.2, C.2, C.3> Implement the four functions described in Exercise C.11 using a PLA.
C.13 [5] <§§4.2, C.2, C.3> Assume that X consists of 3 bits, x2 x1 x0, and Y consists of 3 bits, y2 y1 y0. Write logic functions that are true if and only if
■ X < Y, where X and Y are thought of as unsigned binary numbers ■ X < Y, where X and Y are thought of as signed (two’s complement) numbers ■ X=Y Use a hierarchical approach that can be extended to larger numbers of bits. Show how can you extend it to 6-bit comparison. C.14 [5] <§§C.2, C.3> Implement a switching network that has two data inputs (A and B), two data outputs (C and D), and a control input (S). If S equals 1, the network is in pass-through mode, and C should equal A, and D should equal B. If S equals 0, the network is in crossing mode, and C should equal B, and D should equal A.
C.15 [15] <§§C.2, C.3> Derive the product-of-sums representation for E shown on page C-11 starting with the sum-of-products representation. You will need to use DeMorgan’s theorems.
C.16 [30] <§§C.2, C.3> Give an algorithm for constructing the sum-of-products representation for an arbitrary logic equation consisting of AND, OR, and NOT. The algorithm should be recursive and should not construct the truth table in the process.
C.17 [5] <§§C.2, C.3> Show a truth table for a multiplexor (inputs A, B, and S; output C ), using don’t cares to simplify the table where possible.

C-82 Appendix C The Basics of Logic Design
C.18 [5] <§C.3> What is the function implemented by the following Verilog modules:
module FUNC1 (I0, I1, S, out);
input I0, I1;
input S;
output out;
out = S? I1: I0;
endmodule
module FUNC2 (out,ctl,clk,reset);
output [7:0] out;
input ctl, clk, reset;
reg [7:0] out;
always @(posedge clk)
if (reset) begin
out <= 8'b0 ; end else if (ctl) begin out <= out + 1; end else begin out <= out - 1; end endmodule C.19 [5] <§C.4> The Verilog code on page C-53 is for a D flip-flop. Show the Verilog code for a D latch.
C.20 [10]<§§C.3,C.4>WritedownaVerilogmoduleimplementationofa2-to-4 decoder (and/or encoder).
C.21 [10] <§§C.3, C.4> Given the following logic diagram for an accumulator, write down the Verilog module implementation of it. Assume a positive edge- triggered register and asynchronous Rst.

C.14 Exercises C-83
In
Load Clk
􏰀
16
16
Adder
Rst Load
Register
C.22 [20] <§§C.3, C.4, C.5> Section 3.3 presents basic operation and possible implementations of multipliers. A basic unit of such implementations is a shift- and-add unit. Show a Verilog implementation for this unit. Show how can you use this unit to build a 32-bit multiplier.
C.23 [20] <§§C.3, C.4, C.5> Repeat Exercise C.22, but for an unsigned divider rather than a multiplier.
C.24 [15] <§C.5> The ALU supported set on less than (slt) using just the sign bit of the adder. Let’s try a set on less than operation using the values −7ten and 6ten. To make it simpler to follow the example, let’s limit the binary representations to 4 bits: 1001two and 0110two.
1001two – 0110two = 1001two + 1010two = 0011two
This result would suggest that −7 > 6, which is clearly wrong. Hence, we must factor in overflow in the decision. Modify the 1-bit ALU in Figure C.5.10 on page C-33 to handle slt correctly. Make your changes on a photocopy of this figure to save time.
C.25 [20] <§C.6> A simple check for overflow during addition is to see if the CarryIn to the most significant bit is not the same as the CarryOut of the most significant bit. Prove that this check is the same as in Figure 3.5 on page 232.
C.26 [5] <§C.6> Rewrite the equations on page C-44 for a carry-lookahead logic for a 16-bit adder using a new notation. First, use the names for the CarryIn signals of the individual bits of the adder. That is, use c4, c8, c12, . . . instead of C1, C2, C3, . . . . In addition, let Pi,j mean a propagate signal for bits i to j, and Gi,j mean a generate signal for bits i to j. For example, the equation
C2 = G1 + (P1 · G0) + (P1 · P0 · c0)
Out

C-84 Appendix C The Basics of Logic Design
can be rewritten as
This more general notation is useful in creating wider adders.
C.27 [15] <§C.6> Write the equations for the carry-lookahead logic for a 64-bit adder using the new notation from Exercise C.26 and using 16-bit adders as build- ing blocks. Include a drawing similar to Figure C.6.3 in your solution.
C.28 [10] <§C.6> Now calculate the relative performance of adders. Assume that hardware corresponding to any equation containing only OR or AND terms, such as the equations for pi and gi on page C-40, takes one time unit T. Equations that consist of the OR of several AND terms, such as the equations for c1, c2, c3, and c4 on page C-40, would thus take two time units, 2T. The reason is it would take T to produce the AND terms and then an additional T to produce the result of the OR. Calculate the numbers and performance ratio for 4-bit adders for both ripple carry and carry lookahead. If the terms in equations are further defined by other equations, then add the appropriate delays for those intermediate equations, and continue recursively until the actual input bits of the adder are used in an equation. Include a drawing of each adder labeled with the calculated delays and the path of the worst-case delay highlighted.
C.29 [15] <§C.6> This exercise is similar to Exercise C.28, but this time calculate the relative speeds of a 16-bit adder using ripple carry only, ripple carry of 4-bit groups that use carry lookahead, and the carry-lookahead scheme on page C-39.
C.30 [15]<§C.6>ThisexerciseissimilartoExercisesC.28andC.29,butthistime calculate the relative speeds of a 64-bit adder using ripple carry only, ripple carry of 4-bit groups that use carry lookahead, ripple carry of 16-bit groups that use carry lookahead, and the carry-lookahead scheme from Exercise C.27.
C.31 [10] <§C.6> Instead of thinking of an adder as a device that adds two num- bers and then links the carries together, we can think of the adder as a hardware device that can add three inputs together (ai, bi, ci) and produce two outputs (s, ci + 1). When adding two numbers together, there is little we can do with this observation. When we are adding more than two operands, it is possible to reduce the cost of the carry. The idea is to form two independent sums, called S′ (sum bits) and C′ (carry bits). At the end of the process, we need to add C′ and S′ together using a normal adder. This technique of delaying carry propagation until the end of a sum of numbers is called carry save addition. The block drawing on the lower right of Figure C.14.1 shows the organization, with two levels of carry save adders connected by a single normal adder.
Calculate the delays to add four 16-bit numbers using full carry-lookahead adders versus carry save with a carry-lookahead adder forming the final sum. (The time unit T in Exercise C.28 is the same.)
c8=G7,4 + (P7, 4 · G3, 0) + (P7, 4 · P3, 0 · c 0)

C.14 Exercises C-85
a3 b3
a2 b2
a1 b1
a0 b0
e0
f0
ABEF
1
e3
1
e2
1
e1
1
Traditional adder
1
1
f3
1
f2
1
f1
Traditional adder
1
1
1
1
1
Traditional adder
s5 s4 s3 s2 s1 s0
b3 e3 f3 b2 e2 f2 b1 e1 f1 b0 e0 f0 A
B E F
S
1
a
s’4 c’3 s c’0 s’0
s5 s4 s3 s2 s1 s0
C’
S
3 a2
1
a1
1
1
1
1
a0
‘3 c’2 s’2
c’1 s’1
1
1
Traditional adder
Carry save adder
Carry save adder
S’
1
1
1
1
FIGURE C.14.1 Traditional ripple carry and carry save addition of four 4-bit numbers. The details are shown on the left, with the individual signals in lowercase, and the corresponding higher-level blocks are on the right, with collective signals in uppercase. Note that the sum of four n-bit numbers can take n + 2 bits.
C.32 [20] <§C.6> Perhaps the most likely case of adding many numbers at once in a computer would be when trying to multiply more quickly by using many adders to add many numbers in a single clock cycle. Compared to the multiply algorithm in Chapter 3, a carry save scheme with many adders could multiply more than 10 times faster. This exercise estimates the cost and speed of a combinational multiplier to multiply two positive 16-bit numbers. Assume that you have 16 inter- mediate terms M15, M14, . . . , M0, called partial products, that contain the multi- plicand ANDed with multiplier bits m15, m14, . . ., m0. The idea is to use carry save adders to reduce the n operands into 2n/3 in parallel groups of three, and do this repeatedly until you get two large numbers to add together with a traditional adder.

C-86 Appendix C The Basics of Logic Design
First, show the block organization of the 16-bit carry save adders to add these 16 terms, as shown on the right in Figure C.14.1. Then calculate the delays to add these 16 numbers. Compare this time to the iterative multiplication scheme in Chapter 3 but only assume 16 iterations using a 16-bit adder that has full carry lookahead whose speed was calculated in Exercise C.29.
C.33 [10] <§C.6> There are times when we want to add a collection of numbers together. Suppose you wanted to add four 4-bit numbers (A, B, E, F) using 1-bit full adders. Let’s ignore carry lookahead for now. You would likely connect the 1-bit adders in the organization at the top of Figure C.14.1. Below the traditional orga- nization is a novel organization of full adders. Try adding four numbers using both organizations to convince yourself that you get the same answer.
C.34 [5]<§C.6>First,showtheblockorganizationofthe16-bitcarrysaveadders to add these 16 terms, as shown in Figure C.14.1. Assume that the time delay through each 1-bit adder is 2T. Calculate the time of adding four 4-bit numbers to the organization at the top versus the organization at the bottom of Figure C.14.1.
C.35 [5] <§C.8> Quite often, you would expect that given a timing diagram con- taining a description of changes that take place on a data input D and a clock input C (as in Figures C.8.3 and C.8.6 on pages C-52 and C-54, respectively), there would be differences between the output waveforms (Q) for a D latch and a D flip-flop. In a sentence or two, describe the circumstances (e.g., the nature of the inputs) for which there would not be any difference between the two output waveforms.
C.36 [5] <§C.8> Figure C.8.8 on page C-55 illustrates the implementation of the register file for the MIPS datapath. Pretend that a new register file is to be built, but that there are only two registers and only one read port, and that each register has only 2 bits of data. Redraw Figure C.8.8 so that every wire in your diagram corre- sponds to only 1 bit of data (unlike the diagram in Figure C.8.8, in which some wires are 5 bits and some wires are 32 bits). Redraw the registers using D flip-flops. You do not need to show how to implement a D flip-flop or a multiplexor.
C.37 [10] <§C.10> A friend would like you to build an “electronic eye” for use as a fake security device. The device consists of three lights lined up in a row, con- trolled by the outputs Left, Middle, and Right, which, if asserted, indicate that a light should be on. Only one light is on at a time, and the light “moves” from left to right and then from right to left, thus scaring away thieves who believe that the device is monitoring their activity. Draw the graphical representation for the finite-state machine used to specify the electronic eye. Note that the rate of the eye’s movement will be controlled by the clock speed (which should not be too great) and that there are essentially no inputs.
C.38 [10]<§C.10>{Ex.C.37}Assignstatenumberstothestatesofthefinite-state machine you constructed for Exercise C.37 and write a set of logic equations for each of the outputs, including the next-state bits.

C.14 Exercises
C-87
C.39 [15] <§§C.2, C.8, C.10> Construct a 3-bit counter using three D flip-flops and a selection of gates. The inputs should consist of a signal that resets the counter to 0, called reset, and a signal to increment the counter, called inc. The outputs should be the value of the counter. When the counter has value 7 and is incre- mented, it should wrap around and become 0.
C.40 [20] <§C.10> A Gray code is a sequence of binary numbers with the property that no more than 1 bit changes in going from one element of the sequence to another. For example, here is a 3-bit binary Gray code: 000, 001, 011, 010, 110, 111, 101, and 100. Using three D flip-flops and a PLA, construct a 3-bit Gray code counter that has two inputs: reset, which sets the counter to 000, and inc, which makes the counter go to the next value in the sequence. Note that the code is cyclic, so that the value after 100 in the sequence is 000.
C.41 [25] <§C.10> We wish to add a yellow light to our traffic light example on page C-68. We will do this by changing the clock to run at 0.25 Hz (a 4-second clock cycle time), which is the duration of a yellow light. To prevent the green and red lights from cycling too fast, we add a 30-second timer. The timer has a single input, called TimerReset, which restarts the timer, and a single output, called TimerSignal, which indicates that the 30-second period has expired. Also, we must redefine the traffic signals to include yellow. We do this by defining two output signals for each light: green and yellow. If the output NS green is asserted, the green light is on; if the output NSyellow is asserted, the yellow light is on. If both signals are off, the red light is on. Do not assert both the green and yellow signals at the same time, since American drivers will certainly be confused, even if European drivers understand what this means! Draw the graphical representation for the finite-state machine for this improved controller. Choose names for the states that are different from the names of the outputs.
C.42 [15] <§C.10> {Ex. C.41} Write down the next-state and output-function tables for the traffic light controller described in Exercise C.41.
C.43 [15]<§§C.2,C.10>{Ex.C.42}Assignstatenumberstothestatesinthetraf- fic light example of Exercise C.41 and use the tables of Exercise C.42 to write a set of logic equations for each of the outputs, including the next-state outputs.
C.44 [15] <§§C.3, C.10> {Ex. C.43} Implement the logic equations of Exercise C.43 as a PLA.
§C.2, page C-8: No. If A = 1, C = 1, B = 0, the first is true, but the second is false. §C.3, page C-20: C.
§C.4, page C-22: They are all exactly the same.
§C.4, page C-26:A= 0,B= 1.
§C.5, page C-38: 2. §C.6, page C-47: 1. §C.8, page C-58: c. §C.10, page C-72: b. §C.11, page C-77: b.
Answers to Check Yourself