Computer Architecture: Operand Addressing and Instruction Representation


1. Introduction

The previous sections discuss types of processors and consider processor instruction sets. This section focuses on two details related to instructions: the ways instructions are represented in memory and the ways that operands can be specified. We will see that the form of operands is especially relevant to programmers. We will also understand how the representation of instructions determines the possible operand forms.

The next section continues the discussion of processors by explaining how a Central Processing Unit (CPU) operates. We will see how a CPU combines many features we have discussed into a large, unified system.

2. Zero, One, Two, Or Three Address Designs

We said that an instruction is usually stored as an opcode followed by zero or more operands. How many operands are needed? The discussion in Section 5 assumes that the number of operands is determined by the operation being performed. Thus, an add instruction needs at least two operands because addition involves at least two quantities.

Similarly, a Boolean not instruction needs one operand because logical inversion only involves one quantity. However, the example MIPS instruction set in Section 5 employs an additional operand on each instruction that specifies the location for the result.

Thus, in the example instruction set, an add instruction requires three operands: two that specify values to be added and a third that specifies a location for the result.

Despite the intuitive appeal of a processor in which each instruction can have an arbitrary number of operands, many processors do not permit such a scheme. To understand why, we must consider the underlying hardware. First, because an arbitrary number of operands implies variable-length instructions, fetching and decoding instructions is less efficient than using fixed-length instructions. Second, because fetching an arbitrary number of operands takes time, the processor will run slower than a processor with a fixed number of operands.

It may seem that parallel hardware can solve some of the inefficiency. Imagine, for example, parallel hardware units that each fetch one operand of an instruction. If an instruction has two operands, two units operate simultaneously; if an instruction has four operands, four units operate simultaneously. However, parallel hardware uses more space on a chip and requires additional power. In addition, the number of pins on a chip limits the amount of data from outside the chip that can be accessed in parallel.

Thus, parallel hardware is not an attractive option in many cases (e.g., a processor in a portable phone that operates on battery power).

Can an instruction set be designed without allowing arbitrary operands? If so, what is the smallest number of operands that can be useful for general computation? Early computers answered the question by using a scheme in which each instruction only has one operand. Later computers introduced instruction sets that limited each instruction to two operands. Surprisingly, computers also exist in which instructions have no operands in the instruction itself. Finally, as we have seen in the previous section, some processors limit instructions to three operands.

3. Zero Operands Per Instruction

An architecture in which instructions have no operands is known as a 0-address architecture. How can an architecture allow instructions that do not specify any operands? The answer is that operands must be implicit. That is, the location of the operands is already known. A 0-address architecture is also called a stack architecture because operands are kept on a run-time stack. For example, an add instruction takes two values from the top of the stack, adds them together, and places the result back on the stack. Of course, there are a few exceptions, and some of the instructions in a stack computer allow a programmer to specify an operand. For example, most zero-address architectures include a push instruction that inserts a new value on the top of the stack, and a pop instruction that removes the top value from the stack and places the value in memory. Thus, on a stack machine, to add seven to variable X, one might use a sequence of instructions similar to the example in FIG. 1.

The chief disadvantage of a stack architecture arises from the use of memory - it takes much longer to fetch operands from memory than from registers in the processor.

A later section discusses the concept; for now, it is sufficient to understand why the computer industry has moved away from stack architectures.

push X

push 7

add

pop X

FIG. 1 An example of instructions used on a stack computer to add seven to a variable X. The architecture is known as a zero address architecture because the operands for an instruction such as add are found on the stack.
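The sequence in FIG. 1 can be traced with a small simulator. The following is a minimal sketch in Python, not a model of any real processor. The function name run_stack_program, the tuple-based instruction format, and the rule that a push operand is either a variable name or an integer constant are assumptions made for illustration.

```python
def run_stack_program(program, memory):
    """Execute (opcode, operand) tuples on a hypothetical stack machine."""
    stack = []
    for opcode, operand in program:
        if opcode == "push":
            # Assumption: a string names a variable in memory, an int is a constant.
            value = memory[operand] if isinstance(operand, str) else operand
            stack.append(value)
        elif opcode == "add":
            # The operands are implicit: the top two entries on the stack.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "pop":
            # Remove the top value and place it in memory.
            memory[operand] = stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


memory = {"X": 35}
program = [("push", "X"), ("push", 7), ("add", None), ("pop", "X")]
run_stack_program(program, memory)
print(memory["X"])  # 42
```

Only push and pop carry an operand in this sketch; add finds both of its inputs on the stack, which is what makes the design a 0-address architecture.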

4. One Operand Per Instruction

An architecture that limits each instruction to a single operand is classified as a 1-address design. In essence, a 1-address design relies on an implicit operand for each instruction: a special register known as an accumulator†. One operand is in the instruction and the processor uses the value of the accumulator as a second operand. Once the operation has been performed, the processor places the result back in the accumulator.

We think of an instruction as operating on the value in the accumulator. For example, consider arithmetic operations. Suppose an addition instruction has operand X:

add X

When it encounters the instruction, the processor performs the following operation:

accumulator ← accumulator + X

Of course, the instruction set for a 1-address processor includes instructions that allow a programmer to load a constant or the value from a memory location into the accumulator or store the current value of the accumulator into a memory location.
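To make the role of the accumulator concrete, the sketch below simulates a hypothetical 1-address machine in Python. The opcode names load, load_constant, add, and store are illustrative assumptions, as is the function name run_accumulator_program; the text only states that such load and store instructions exist.

```python
def run_accumulator_program(program, memory):
    """Execute (opcode, operand) tuples on a hypothetical 1-address machine."""
    accumulator = 0
    for opcode, operand in program:
        if opcode == "load":
            accumulator = memory[operand]        # accumulator <- value in memory
        elif opcode == "load_constant":
            accumulator = operand                # accumulator <- constant
        elif opcode == "add":
            accumulator += memory[operand]       # accumulator <- accumulator + X
        elif opcode == "store":
            memory[operand] = accumulator        # memory <- accumulator
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return accumulator


memory = {"X": 35}
final = run_accumulator_program(
    [("load_constant", 7), ("add", "X"), ("store", "X")], memory
)
print(memory["X"])  # 42
```

Each instruction names at most one operand; the accumulator serves as both the second input and the destination.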

5. Two Operands Per Instruction

Although it works well for arithmetic or logical operations, a 1-address design does not allow instructions to specify two values. For example, consider copying a value from one memory location to another. A 1-address design requires two instructions that load the value into the accumulator and then store the value in the new location. The design is especially inefficient for a system that moves graphics objects in display memory.

To overcome the limitations of 1-address systems, designers invented processors that allow each instruction to have two addresses. The approach is known as a 2-address architecture. With a 2-address processor, an operation can be applied to a specified value instead of merely to the accumulator. Thus, in a 2-address processor,

†The general-purpose registers discussed in Section 5 can be considered an extension of the original accumulator concept.

add X Y

specifies that the value of X is to be added to the current value of Y:

Y ← Y + X

Because it allows an instruction to specify two operands, a 2-address processor can offer data movement instructions that treat the operands as a source and destination.

For example, a 2-address instruction can copy data directly from location Q to location R†:

move Q R
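The two instructions above can be sketched in Python as follows. This is an illustrative model only: the function name execute_two_address and the use of a dictionary to stand in for memory are assumptions, and the operand order (source first, destination second) follows the examples in this section.

```python
def execute_two_address(instruction, memory):
    """Execute one (opcode, source, destination) instruction."""
    opcode, source, destination = instruction
    if opcode == "add":
        # The destination also serves as the second input.
        memory[destination] = memory[destination] + memory[source]
    elif opcode == "move":
        # Copy directly from one location to another, with no accumulator.
        memory[destination] = memory[source]
    else:
        raise ValueError(f"unknown opcode: {opcode}")


memory = {"X": 3, "Y": 10, "Q": 99, "R": 0}
execute_two_address(("add", "X", "Y"), memory)    # Y <- Y + X
execute_two_address(("move", "Q", "R"), memory)   # R <- Q
print(memory["Y"], memory["R"])  # 13 99
```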

6. Three Operands Per Instruction

Although a 2-address design handles data movement, further optimization is possible, especially for processors that have multiple general-purpose registers: allow each instruction to specify three operands. Unlike a 2-address design, the key motivation for a 3-address architecture does not arise from operations that require three input values.

Instead, the point is that the third operand can specify a destination. For example, an addition operation can specify two values to be added as well as a destination for the result:

add X Y Z

specifies an assignment of:

Z ← X + Y
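A short Python sketch shows the effect of the explicit destination. The function name execute_three_address and the dictionary used for storage are assumptions for illustration; the operand order matches the add X Y Z example.

```python
def execute_three_address(instruction, memory):
    """Execute one (opcode, first, second, destination) instruction."""
    opcode, first, second, destination = instruction
    if opcode == "add":
        # Both inputs are left unchanged; the result goes to a third location.
        memory[destination] = memory[first] + memory[second]
    else:
        raise ValueError(f"unknown opcode: {opcode}")


memory = {"X": 3, "Y": 10, "Z": 0}
execute_three_address(("add", "X", "Y", "Z"), memory)   # Z <- X + Y
print(memory)  # {'X': 3, 'Y': 10, 'Z': 13}
```

On a 2-address machine the same computation would overwrite one of the inputs unless a separate move instruction first copied it to Z.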

7. Operand Sources and Immediate Values

The discussion above focuses on the number of operands that each instruction can have without specifying the exact details of an operand. We know that an instruction has a bit field for each operand, but questions arise about how the bits are interpreted.

How is each type of operand represented in an instruction? Do all operands use the same representation? What semantic meaning is given to a representation? To understand the issue, observe that the data value used as an operand can be obtained in many ways. FIG. 2 lists some of the possibilities for operands in a 3-address processor‡.

†Some architects reserve the term 2-address for instructions in which both operands specify a memory location, and use the term 1 1/2-address for situations where one operand is in memory and the other operand is in a register.

‡To increase performance, modern 3-address architectures often limit operands so that at most one of the operands in a given instruction refers to a location in memory; the other two operands must specify registers.

Operand used as a source (item used in the operation)

-- A signed constant in the instruction

-- An unsigned constant in the instruction

-- The contents of a general-purpose register

-- The contents of a memory location

Operand used as a destination (location to hold the result)

-- A general-purpose register

-- A contiguous pair of general-purpose registers

-- A memory location

FIG. 2 Examples of items an operand can reference in a 3-address processor. A source operand specifies a value and a destination operand specifies a location.

As the figure indicates, most architectures allow an operand to be a constant.

Although the operand field is small, having an explicit constant is important because programs use small constants frequently (e.g., to increment a loop index by 1); encoding a constant in the instruction is faster and requires fewer registers.

We use the term immediate value to refer to an operand that is a constant. Some architectures interpret immediate values as signed, some interpret them as unsigned, and others allow a programmer to specify whether the value is signed or unsigned.
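The difference between the signed and unsigned interpretations can be seen by decoding the same bit pattern both ways. The sketch below is illustrative: the function names are hypothetical, and two's complement is assumed for the signed case because the text does not name a particular signed representation.

```python
def as_unsigned(field, width):
    """Interpret the low `width` bits of `field` as an unsigned integer."""
    return field & ((1 << width) - 1)


def as_signed(field, width):
    """Interpret the low `width` bits of `field` as a two's complement integer."""
    value = field & ((1 << width) - 1)
    if value & (1 << (width - 1)):   # sign bit set, so the value is negative
        value -= 1 << width
    return value


bits = 0b11101111                    # one 8-bit immediate field
print(as_unsigned(bits, 8))          # 239
print(as_signed(bits, 8))            # -17
```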

8. The Von Neumann Bottleneck

Recall that conventional computers that store both programs and data in memory are classified as following a Von Neumann Architecture. Operand addressing exposes the central weakness of a Von Neumann Architecture: memory access can become a bottleneck. That is, because instructions are stored in memory, a processor must make at least one memory reference per instruction. If one or more operands specify items in memory, the processor must make additional memory references to fetch or store values. To optimize performance and avoid the bottleneck, operands must be taken from registers instead of memory.

The point is:

On a computer that follows the Von Neumann Architecture, the time spent accessing memory can limit the overall performance. Architects use the term Von Neumann bottleneck to characterize the situation, and avoid the bottleneck by choosing designs in which operands are found in registers.

9. Explicit And Implicit Operand Encoding

How should an operand be represented in an instruction? The instruction contains a bit field for each operand, but an architect must specify exactly what the bits mean (e.g., whether they contain an immediate value, the number of a register, or a memory address). Computer architects have used two interpretations of operands: implicit and explicit. The next sections describe each of the approaches.

9.1 Implicit Operand Encoding

An implicit operand encoding is easiest to understand: the opcode specifies the types of operands. That is, a processor that uses implicit encoding contains multiple opcodes for a given operation - each opcode corresponds to one possible combination of operands. For example, FIG. 3 lists three instructions for addition that might be offered by a processor that uses implicit operand encoding.

FIG. 3 An example of addition instructions for a 2-address processor that uses implicit operand encoding. A separate opcode is used for each possible combination of operands.

As the figure illustrates, not all operands need to have the same interpretation. For example, consider the add immediate signed instruction. The instruction takes two operands: the first operand is interpreted to be a register number, and the second is interpreted to be a signed integer.
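With implicit encoding, decoding amounts to a table lookup keyed by the opcode. The Python sketch below uses a hypothetical opcode table; the numeric opcode values, the instruction names other than add immediate signed, and the function name describe_instruction are assumptions and are not taken from FIG. 3.

```python
# Hypothetical table: each opcode fixes how both operand fields are interpreted.
OPCODES = {
    0x01: ("add register", ("register", "register")),
    0x02: ("add immediate signed", ("register", "signed immediate")),
    0x03: ("add immediate unsigned", ("register", "unsigned immediate")),
}


def describe_instruction(opcode, field1, field2):
    """Return the instruction name and each operand field paired with its type."""
    name, operand_types = OPCODES[opcode]
    return name, list(zip(operand_types, (field1, field2)))


print(describe_instruction(0x02, 5, 0xEF))
```

The operand fields carry no type information of their own; adding a new operand combination means adding a new row, and therefore a new opcode.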

9.2 Explicit Operand Encoding

The chief disadvantage of implicit encoding is apparent from FIG. 3: multiple opcodes are needed for a given operation. In fact, a separate opcode is needed for each combination of operands. If the processor uses many types of operands, the set of opcodes can be extremely large. As an alternative, an explicit operand encoding associates type information with each operand. FIG. 4 illustrates the format of two add instructions for an architecture that uses explicit operand encoding.

As the figure shows, the operand field is divided into two subfields: one specifies the type of the operand and the other specifies a value. For example, an operand that references a register begins with a type field that specifies the remaining bits are to be interpreted as a register number.

FIG. 4 Examples of operands on an architecture that uses explicit encoding. Each operand specifies a type as well as a value.
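Splitting an operand into its type and value subfields can be sketched with a few bit operations. The layout below is an assumption chosen for illustration (an 8-bit operand with a 2-bit type subfield followed by a 6-bit value subfield), as are the type codes and the function name decode_operand; the text does not specify field widths.

```python
TYPE_BITS = 2     # assumed width of the type subfield
VALUE_BITS = 6    # assumed width of the value subfield

# Hypothetical assignment of type codes.
OPERAND_TYPES = {
    0: "register",
    1: "signed immediate",
    2: "unsigned immediate",
    3: "memory address",
}


def decode_operand(field):
    """Split an operand field into (type name, raw value)."""
    type_code = (field >> VALUE_BITS) & ((1 << TYPE_BITS) - 1)
    value = field & ((1 << VALUE_BITS) - 1)
    return OPERAND_TYPES[type_code], value


print(decode_operand(0b00000101))   # ('register', 5)
print(decode_operand(0b10001101))   # ('unsigned immediate', 13)
```

A single add opcode suffices here because each operand describes itself, at the cost of the bits spent on the type subfield.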

10. Operands That Combine Multiple Values

The discussion above implies that each operand consists of a single value extracted from a register, memory, or the instruction itself. Some processors do indeed restrict each operand to a single value. However, other processors provide hardware that can compute an operand value by extracting and combining values from multiple sources.

Typically, the hardware computes a sum of several values.

An example will help clarify how hardware handles operands composed of multiple values. One approach is known as a register-offset mechanism. The idea is straightforward: instead of two subfields that specify a type and value, each operand consists of three fields that specify a register-offset type, a register, and an offset.

When it fetches an operand, the processor adds the contents of the offset field to the contents of the specified register to obtain a value that is then used as the operand. FIG. 5 shows an example add instruction with register-offset operands.

FIG. 5 An example of an add instruction in which each operand consists of a register plus an offset. During operand fetch, the hardware adds the offset to the specified register to obtain the value of the operand.

In the figure, the first operand specifies the contents of register 2 minus the constant 17, and the second operand specifies the contents of register 4 plus the constant 76. When we discuss memory, we will see that allowing an operand to specify a register plus an offset is especially useful when referencing a data aggregate such as a C language struct because a pointer to the structure can be left in a register and offsets used to reference individual items.
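The operand computation described for FIG. 5 can be sketched in Python. The register contents (100 and 24) are made-up values for illustration, and the function name register_offset_operand is hypothetical; only the register numbers and the offsets -17 and 76 come from the text.

```python
def register_offset_operand(registers, register_number, offset):
    """Compute an operand value as register contents plus a constant offset."""
    return registers[register_number] + offset


registers = {2: 100, 4: 24}                               # assumed contents
first = register_offset_operand(registers, 2, -17)        # register 2 minus 17
second = register_offset_operand(registers, 4, 76)        # register 4 plus 76
print(first, second)  # 83 100
```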

11. Tradeoffs in the Choice of Operands

The discussion above is unsatisfying - it seems that we have listed many design possibilities but have not focused on which approach has been adopted. In fact, there is no best choice, and each operand style we discussed has been used in practice. Why hasn't one particular style emerged as optimal? The answer is simple: each style represents a tradeoff between ease of programming, size of the code, speed of processing, and complexity of the hardware. The next paragraphs discuss several potential design goals, and explain how each relates to the choice of operands.

Ease Of Programming. Complex forms of operands make programming easier.

For example, we said that allowing an operand to specify a register plus an offset makes data aggregate references straightforward. Similarly, a 3-address approach that provides an explicit target means a programmer does not need to code separate instructions to copy results into their final destination. Of course, to optimize ease of programming, an architect needs to trade off other aspects.

Fewer Instructions. Increasing the expressive power of operands reduces the number of instructions in a program. For example, allowing an operand to specify both a register and an offset means that the program does not need to use an extra instruction to add an offset to a register. Increasing the number of addresses per instruction also lowers the count of instructions (e.g., a 3-address processor requires fewer instructions than a 2-address processor). Unfortunately, fewer instructions produce a tradeoff in which each instruction is larger.

Smaller Instruction Size. Limiting the number of operands, the set of operand types, or the maximum size of an operand keeps each instruction small because fewer bits are needed to identify the operand type or represent an operand value. In particular, an operand that specifies only a register will be smaller than an operand that specifies a register and an offset. As a result, some of the smallest, least powerful processors limit operands to registers - except for load and store operations, each value used in a program must come from a register. Unfortunately, making each instruction smaller decreases the expressive power, and therefore increases the number of instructions needed.

Larger Range Of Immediate Values. Recall from Section 3 that a string of k bits can hold 2^k possible values. Thus, the number of bits allocated to an operand determines the numeric range of immediate values that can be specified. Increasing the range of immediate values results in larger instructions.

Faster Operand Fetch And Decode. Limiting the number of operands and the possible types of each operand allows hardware to operate faster. To maximize speed, for example, an architect avoids register-offset designs because hardware can fetch an operand from a register much faster than it can compute the value from a register plus an offset.

Decreased Hardware Size And Complexity. The amount of space on an integrated circuit is limited, and an architect must decide how to use the space. Decoding complex forms of operands requires more hardware than decoding simpler forms. Thus, limiting the types and complexity of operands reduces the size of the circuitry required. Of course, the choice represents a tradeoff: programs are larger.

The point is:

Processor architects have created a variety of operand styles. No single form is optimal for all processors because the choice represents a compromise among functionality, program size, complexity of the hardware required to fetch values, performance, and ease of programming.
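The tradeoff between immediate range and instruction size can be quantified: a k-bit field holds 2^k distinct values, so each extra bit doubles the range. The sketch below computes the range for a given field width. The function name immediate_range is hypothetical, and two's complement is assumed for signed fields.

```python
def immediate_range(k, signed):
    """Return (smallest, largest) value representable in a k-bit immediate field."""
    if signed:
        # Assumption: signed immediates use two's complement.
        return -(1 << (k - 1)), (1 << (k - 1)) - 1
    return 0, (1 << k) - 1


for k in (8, 16):
    print(k, immediate_range(k, signed=False), immediate_range(k, signed=True))
```

For example, an 8-bit signed field covers -128 through 127, which matches the limit of 127 used in Exercise 6 below.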

12. Values In Memory And Indirect Reference

A processor must provide a way to access values in memory. That is, at least one instruction must have an operand which the hardware interprets as a memory address†.

Accessing a value in memory is significantly more expensive than accessing a value in a register. Although it may make programming easier, a design in which each instruction references memory usually results in lower performance. Thus, programmers usually structure code to keep values that will be used often in registers and only reference memory when needed.

Some processors extend memory references by permitting various forms of indirection. For example, an operand that specifies indirection through register 6 causes a processor to perform two steps:

-- Obtain A, the current value from register 6.

-- Interpret A as a memory address, and fetch the operand from memory.

One extreme form of operand involves double indirection, or indirection through a memory location. That is, the processor interprets the operand as memory address M.

However, instead of loading or storing a value to address M, the processor assumes M contains the memory address of the value. In such cases, a processor performs the following steps:

-- Obtain M, the value in the operand itself.

-- Interpret M as a memory address, and fetch the value A from memory.

-- Interpret A as another memory address, and fetch the operand from memory.

Double indirection that goes through one memory location to another can be useful when a program has to follow a linked list in memory. However, the overhead is extremely high (execution of a single instruction entails multiple memory references).
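The two sequences of steps above can be sketched in Python, with dictionaries standing in for the register file and memory. The function names and the specific addresses and values are assumptions for illustration.

```python
def fetch_register_indirect(registers, memory, register_number):
    """Indirection through a register: one memory reference."""
    address = registers[register_number]   # obtain A from the register
    return memory[address]                 # interpret A as an address and fetch


def fetch_double_indirect(memory, operand_address):
    """Indirection through a memory location: two memory references."""
    pointer = memory[operand_address]      # fetch A from address M
    return memory[pointer]                 # interpret A as an address and fetch


registers = {6: 0x20}
memory = {0x10: 0x20, 0x20: 555}
print(fetch_register_indirect(registers, memory, 6))   # 555
print(fetch_double_indirect(memory, 0x10))             # 555
```

Both calls reach the same value, but the second indexes memory twice, which is the source of the extra overhead.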

13. Illustration Of Operand Addressing Modes

A processor usually contains a special internal register, called an instruction register, that is used to hold an instruction while the instruction is being decoded. The possible types of operand addresses and the cost of each can be envisioned by considering the location of the operand and the references needed to fetch the value. An immediate value is the least expensive because the value is located in the instruction register (i.e., in the instruction itself). A general-purpose register reference is slightly more expensive than an immediate value. A reference to memory is more expensive than a reference to a register. Finally, double indirection, which requires two memory references, is the most expensive. FIG. 6 lists the possibilities, and illustrates the hardware units involved in resolving each.

1 Immediate value (in the instruction)

2 Direct register reference

3 Direct memory reference

4 Indirect through a register

5 Indirect memory reference

FIG. 6 Illustration of the hardware units accessed when fetching an operand in various addressing modes. Indirect references take longer than direct references.

In the figure, modes 3 and 5 each require the instruction to contain a memory address. Although they were available on earlier computers, such modes have become unpopular because they require an instruction to be quite large.
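The relative costs described in this section can be summarized as the number of memory references needed to obtain one operand in each mode. The table below is a sketch derived from the text; it excludes the reference that fetches the instruction itself, and the names MEMORY_REFERENCES and cheapest_first are hypothetical.

```python
# Memory references needed to fetch one operand, by addressing mode.
MEMORY_REFERENCES = {
    "immediate value": 0,               # value is in the instruction register
    "direct register reference": 0,     # value is in a general-purpose register
    "direct memory reference": 1,       # instruction holds the address
    "indirect through a register": 1,   # register holds the address
    "indirect memory reference": 2,     # memory holds the address of the value
}


def cheapest_first(modes):
    """Order addressing modes by the number of memory references they need."""
    return sorted(modes, key=MEMORY_REFERENCES.get)


for mode, count in MEMORY_REFERENCES.items():
    print(f"{mode}: {count}")
```

The count does not capture the smaller difference between an immediate value and a register reference, which the text describes as slight.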

14. Summary

When designing a processor, an architect chooses the number and possible types of operands for each instruction. To make operand handling efficient, many processors limit the number of operands for a given instruction to three or fewer.

An immediate operand specifies a constant value; other possibilities include an operand that specifies using the contents of a register or a value in memory. Indirection allows a register to contain the memory address of the operand. Double indirection means the operand specifies a memory address and the value at the address is a pointer to another memory location that holds the value. The type of the operand can be encoded implicitly (i.e., in the opcode) or explicitly.

Many variations exist because the choice of operand number and type represents a tradeoff among functionality, ease of programming, and engineering details such as the speed of processing.

EXERCISES

1. Suppose a computer architect is designing a processor for a computer that has an extremely slow memory. Would the architect choose a zero-address architecture? Why or why not?

2. Consider the size of instructions in memory. If an architecture allows immediate operands to have large numeric values, an instruction takes more space in memory. Why?

3. Assume a stack machine keeps the stack in memory. Also assume variable p is stored in memory. How many memory references will be needed to increment p by seven?

4. Assume two integers, x and y, are stored in memory, and consider an instruction that sets z to the sum x+y. How many memory references will be needed on a two-address architecture? Hint: remember to include instruction fetch.

5. How many memory operations are required to perform an add operation on a 3-address architecture if each operand specifies an indirect memory reference?

6. If a programmer increments a variable by a value that is greater than the maximum immediate operand, an optimizing compiler may generate two instructions. For example, on a computer that only allows immediate values of 127 or less, incrementing variable x by 140 results in the sequence:

load r7, x

add_immediate r7, 127

add_immediate r7, 13

store r7, x

Why doesn't the compiler store 140 in memory and add the value to register 7?

7. Assume a memory reference takes twelve times as long as a register reference, and assume a program executes N instructions on a 2-address architecture. Compare the running time of the program if all operands are in registers to the running time if all operands are in memory. Hint: instruction fetch requires a memory operation.

8. Consider each type of operand that FIG. 6 illustrates, and make a table that contains an expression for the number of bits required to represent the operand. Hint: the number of bits required to represent values from zero through N is:

⌈log2(N + 1)⌉

9. Name one advantage of using a higher number of addresses per instruction.

10. Consider a two-address computer that uses implicit operands. Suppose one of the two operands can be any of the five operand types in FIG. 6, and the other can be any except an immediate value. List all the add instructions the computer needs.

11. Most compilers contain optimization modules that choose to keep frequently used variables in registers rather than writing them back to memory. What term characterizes the problem that such an optimization module is attempting to overcome?
