Computer Architecture / Organization -- Machine Language (part 1)





1. Nature

1.1 Translation from programming language

Machine language is used to describe a system process in a form which may be presented to the raw machine. A programming language, on the other hand, describes the same thing in a manner understandable to a human. The programming language should also render the modularity of the software clear to a human. In other words, the boundaries and interfaces of procedures and modules should be evident in the programming language, even though they all but vanish in the machine language.

The language understood varies greatly from machine to machine. Portability of programs between computers is achieved through the use of common programming languages; a machine language program is not portable, except between machines of identical design.

The problem of informing a machine of the procedure you wish it to follow is much like that of informing a person who speaks another language how to perform a task. The language of the speaker (programmer) must be translated into that of the listener (machine). If the translation is carried out word by word, as it is spoken, it is referred to as interpretation. If it takes place only after everything has been said, it is referred to as compilation. Fig. 1 illustrates the problem.

The difference between the programming language and the machine language is called the semantic gap. It is possible to argue that hardware architecture should evolve so as to narrow this gap. This raises the question of which programming language to choose. Since it is now comparatively cheap to develop a new processor, a number of new designs are appearing which are optimized to close the semantic gap for a variety of programming languages.


Fig. 1: Translation of natural language to machine language

An alternative view is that the semantic gap is an inevitable consequence of the conflict between the requirements of the machine (performance) and those of humans (maintainability and ease of development). In this case the machine language design may take more account of the requirements of the compiler code generator, which must generate machine language "code" automatically. The design should then make it easy for the code generator to choose logically between alternatives where they exist.

1.2 Structure

A machine language program takes the form of a stream of independent instructions. They are conventionally encoded as binary numbers and are executed sequentially in the order in which they are found. Each is made up of two parts…

• Operation

• Operands (0, 1, 2 or 3)

When the instruction is executed, the operation, whose encoding is known as an opcode, is performed upon the operand(s).
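As an illustration, an instruction may be pictured as a record of bit fields. The sketch below packs an opcode and two register operands into a 16-bit word; the field widths and the opcode value are invented for illustration, not taken from any real machine.

```python
# Hypothetical 16-bit instruction format (invented for illustration):
# bits 12-15 opcode, bits 8-11 first operand, bits 4-7 second operand.
def encode(opcode, op1, op2):
    """Pack an opcode and two 4-bit register numbers into one word."""
    assert 0 <= opcode < 16 and 0 <= op1 < 16 and 0 <= op2 < 16
    return (opcode << 12) | (op1 << 8) | (op2 << 4)

def decode(word):
    """Recover the three fields from an encoded instruction word."""
    return (word >> 12) & 0xF, (word >> 8) & 0xF, (word >> 4) & 0xF

ADD = 0x3                        # hypothetical opcode for "add"
word = encode(ADD, 0, 1)         # encodes "add r0, r1"
assert decode(word) == (ADD, 0, 1)
```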

Behavior of some instructions must be conditional or conditional behavior of the process as a whole would not be possible. Some memory of the outcome of the execution of previous instructions is always provided and is referred to as the processor state. For example, a single latch will recall whether the result of the previous operation (e.g. a subtraction) was zero or not. Hence, for example, two numbers may be compared and subsequent action rendered dependent upon whether or not they share a common value. All processor state is stored collectively in the processor state register (PSR).

Machine code for the running process is stored in a large linear memory (Fig. 2) which may be referenced randomly as an array. The array index is called an address and each memory cell a location. The array bounds are simply zero and the memory size minus one. The address of the next instruction to execute is stored in another register, wide enough to accommodate any address, called the program counter (PC) (Fig. 3). Sequencing is obtained automatically by incrementing the program counter after each instruction is executed.

Conditional instructions may be used to modify the control flow by conditionally updating the program counter. All selection and iteration constructs may be implemented using a single instruction, the conditional branch, which adds its single operand to the program counter if the condition succeeds but otherwise does nothing.
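The scheme above can be sketched in a few lines of simulation. The two-instruction set (dec, cb), the single zero flag and the branch condition chosen here are all illustrative assumptions, not part of any real machine.

```python
# Minimal sketch of program counter sequencing with a conditional branch.
def run(program, r0):
    pc, zero = 0, False              # program counter and zero flag (PSR)
    while pc < len(program):
        op, arg = program[pc]
        pc += 1                      # sequencing: PC incremented after fetch
        if op == "dec":              # decrement r0, record zero outcome
            r0 -= 1
            zero = (r0 == 0)
        elif op == "cb":             # conditional branch: add offset to PC
            if not zero:             # here: branch while zero flag is clear
                pc += arg
    return r0

# A WHILE loop: decrement r0 until it reaches zero.
assert run([("dec", 0), ("cb", -2)], 5) == 0
```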

Almost all modern computers make use of the idea that the machine code should reside in the same memory device as data. Obviously care must be taken that the two occupy distinct areas of memory. However shared memory simplifies the architecture and, hence lowers the cost, of the whole computer.

Computers using this principle are often referred to as von Neumann machines, after the person credited with the innovation. Those which employ separate memories for code and data are referred to as Harvard machines.


Fig. 2: Linear memory

1.3 Interpretation

It is the function of the processor control unit to interpret machine language. In other words, it translates each instruction, one at a time, into a sequence of physical micro-operations. A micro-operation may have two parallel components…

• Register transfer

• Control of functional unit

As an example, consider a two-operand instruction add r0, r1 which adds together the contents of two registers (r0 and r1) and places the result in the second (r1).

This will be translated into the following micro-operation sequence…

(1) alu.in.0 ← r0

(2) alu.in.1 ← r1, alu(add)

(3) r1 ← alu.out

alu.in.0, alu.in.1 and alu.out denote the two inputs and one output of an arithmetic logic unit (ALU), a device capable of evaluating a number of arithmetic functions. The means used to express a micro-operation sequence is called a register transfer language (RTL). "A ← B" denotes the transfer of the contents of register B into register A. Two items separated by a comma are understood to take place in parallel. If only a single transfer may take place at a time, the second item will always be the control of a functional unit. This is indicated as a function bearing the name of the unit, whose argument is the "switch" to be set.

Thus alu(add) means switch the ALU to "add". There is no universally accepted standard RTL. The one used throughout this text is somewhat different to that found elsewhere. It has been designed to render clear each individual transfer rather than to be concise.
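The three micro-operations above can be mimicked directly, with the ALU's input latches and output modeled as explicit storage; the names follow the RTL used in the text and the register contents are example values.

```python
# Simulation of the micro-operation sequence for "add r0, r1".
regs = {"r0": 2, "r1": 3}                # example register contents
alu = {"in0": 0, "in1": 0, "out": 0}     # ALU input latches and output

alu["in0"] = regs["r0"]                  # (1) alu.in.0 <- r0
alu["in1"] = regs["r1"]                  # (2) alu.in.1 <- r1, alu(add):
alu["out"] = alu["in0"] + alu["in1"]     #     the ALU is switched to "add"
regs["r1"] = alu["out"]                  # (3) r1 <- alu.out

assert regs["r1"] == 5                   # add r0, r1 overwrote r1
```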


Fig. 3: Program counter sequencing of instruction execution

1.4 Instructions

Assignment and expression evaluation

The first group of instructions is that which implements assignment of an expression value to a variable. Actual assignment is achieved with the store instruction which typically takes two operands, the value and the variable, usually referred to as source and destination.

Expressions are evaluated by iteratively evaluating sub-expressions, usually from right to left or according to level of parenthesis. The term "expression" simply means a description of a function in terms of variables, constants and primitive functions called operators.

So far, two special registers have been mentioned, the program counter and the processor state register. In addition to these, a number of general purpose registers (GPRs) will be provided which are used to hold the value of both an expression and its sub-expressions as they are evaluated. Since local variables are implemented as memory locations, there must be two kinds of data transfer…

• Register to memory (store)

• Memory to register (load)

A store performs an assignment. A load of each required variable prepares the register file for expression evaluation.

Instructions which implement functions at the binary operator level are essential for expression evaluation. It may be shown that any computable function may be computed using just a few binary logical operators; in fact the set {and, or, not} is sufficient.
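As a small check of this claim, exclusive-or, which is not itself in the set, can be built from {and, or, not} alone:

```python
# Exclusive-or expressed using only the operators and, or, not.
def xor(a, b):
    return (a and not b) or (not a and b)

assert xor(False, False) == False
assert xor(False, True) == True
assert xor(True, False) == True
assert xor(True, True) == False
```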

In reality any operator requires a sequence of operations (and hence some time) to be evaluated. For example, the function plus(a, b) will usually require transferring the values of a and b to the inputs of a physical adding device and then transferring the result back to a register. In addition the adder must be activated at the right moment: it must run some process to generate a sum. We postpone these problems until Part II. It is vital to understand that the machine language represents the operation of the hardware at its highest physical level, not its lowest. Operator instructions may give the effect of "instant" evaluation when executed, but in fact many distinct register transfer operations may be required [1].

Control flow

The use of a conditional branch instruction to modify the contents of the program counter depending on processor state is discussed above. We now turn to how it may be used to implement selection and iteration constructs.

Shown below are code segments for the machine language implementation of both the WHILE and the IF…THEN…ELSE constructs. In order to avoid binary notation, mnemonics are used for all instructions.

WHILE…

; start
    cb <exit>
    …
    br <start>
; exit

IF…THEN…ELSE…

    cb <else>
    …
    br <exit>
; else
    …
; exit

The meaning of the mnemonics used is as follows…

• cb … Branch if condition fails

• br … Branch always

<exit> denotes the offset to the instruction following the construct.

<else> denotes the offset to the code to be executed should the condition fail. Note how much more convenient it is to have an instruction which branches only if the condition fails.

Of course the actual condition needed may be the negated version of the one available.
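The offsets themselves can be computed mechanically. The toy label resolver below, a sketch rather than a real assembler, converts the symbolic WHILE segment (loop body omitted for brevity) into PC-relative offsets, assuming, as in the text, that the offset is added to a program counter already incremented past the branch instruction.

```python
# Toy two-pass label resolver for the symbolic branch segments above.
def resolve(listing):
    labels, code = {}, []
    for item in listing:                 # pass 1: record label addresses
        if item.startswith(";"):
            labels[item[2:]] = len(code)
        else:
            code.append(item)
    out = []
    for addr, instr in enumerate(code):  # pass 2: offset = target - (addr + 1)
        op, target = instr.split()
        out.append((op, labels[target.strip("<>")] - (addr + 1)))
    return out

while_segment = ["; start", "cb <exit>", "br <start>", "; exit"]
assert resolve(while_segment) == [("cb", 1), ("br", -2)]
```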

[1. It is possible to devise a functional architecture where the need for assignment is eliminated. In this case the result of each operator is used as an argument to a function. The value of that function is used as an argument to another, and so on, until the system function is evaluated.]

The conditional branch may be regarded as the programming atom, or primitive, from which all constructs are made. Together with a sufficient set of logical operators and {load, store}, anything may be computed. Additional instructions are required only to support the partitioning of software, for the benefit of the software engineer. They add nothing to the capacity to compute and, if anything, reduce performance. Arithmetic operators are always included since they are almost universally required; however, it is quite possible, though laborious, to compute them using just logical and shift operators.

Linkage

In order to ease the engineering of software, it is necessary to provide support for procedure invocation. Procedure code, at the level of the machine, is referred to as a subroutine. Invocation requires the following…

• Branching to subroutine

• Returning from subroutine

• Passing parameters

The first is directly implemented with another branch instruction which we shall give the mnemonic bsr. It is beyond the scope of this section to discuss support for nested procedure invocation. The method adopted here is simply to save the incremented program counter value in a register as the return address. As well as doing this, bsr adds its operand to the program counter register in order to enter the subroutine.

Returning from subroutine is achieved by the ret instruction which simply copies the return address back into the program counter and must be the last in the subroutine. The thread of control is shown in Fig. 4.

Parameters may be passed by placing them in general purpose registers prior to bsr.
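The linkage scheme can be sketched as follows; the instruction set (including halt and inc), the link register and the offsets are illustrative inventions, and nested calls are deliberately unsupported, exactly as in the scheme described above.

```python
# Sketch of subroutine linkage: bsr saves the incremented PC in a link
# register and adds its operand to the PC; ret copies the link back.
def run(program):
    pc, link, r0 = 0, None, 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1                      # incremented PC = the return address
        if op == "bsr":
            link = pc                # save return address in link register
            pc += arg                # enter the subroutine
        elif op == "ret":
            pc = link                # resume just after the bsr
        elif op == "inc":
            r0 += arg
        elif op == "halt":
            break
    return r0

# Main calls the subroutine at offset +2 (which adds 10), then adds 1.
program = [("bsr", 2), ("inc", 1), ("halt", 0), ("inc", 10), ("ret", 0)]
assert run(program) == 11
```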

Application support

The fourth group of instructions is that which supports a given set of applications. For example, graphical applications require block move operations.

Although not strictly essential, such a group is necessary if the design is to be competitive as a product. Many manufacturers now offer a range of coprocessors which extend the instruction set, or enhance the performance of a given subset, for a specified application area.

1.5 Operands

Number of operands

The number of operands depends both on the design and on the instruction itself. It is possible for a design to require zero operands. This assumes that the computer organization is such that the operand(s) have a predetermined source (e.g. the top of a stack) and the result a predictable destination. At the other extreme, three operands may be required for each operator (two arguments and one result).

There is disagreement among computer architects as to which is the better number. Fewer operands generally mean shorter code but can mean more micro-operations per instruction.
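The zero-operand extreme can be sketched as a stack machine: add names no operands because its arguments come from a predetermined source (the top of the stack) and its result has a predictable destination (the same place). The instruction names are illustrative.

```python
# Zero-operand (stack machine) evaluation sketch.
def run_stack(program):
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":             # "add" carries no operand fields
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack[-1]

# 2 + 3 takes three instructions, but "add" itself is operand-free.
assert run_stack([("push", 2), ("push", 3), ("add",)]) == 5
```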

Storage class

The arguments to the instruction describe where to find the operands. Memory devices are usually grouped into storage classes. The concept of storage class represents the programmer's view of where the operand resides. A hardware engineer usually only perceives devices.

For example, the only complete computer we have so far met, the Turing Machine, has just two storage classes, processor state and the linear memory. We have met only two storage classes for a real modern computer, the register file and linear "main" memory.


Fig. 4: Thread of control through subroutine

It is also necessary to distinguish between two areas of memory, program memory and workspace. Constant data is often located within the program memory. The area of memory reserved for local variables is referred to as workspace. Each local variable is said to be bound to an offset from the value of yet another register, the workspace pointer. To summarize, the minimum set of storage classes usually found is…

• Program memory

• Register

• Workspace

The actual set found depends strongly upon the architecture of the particular design, but the above is typical.

Because no single current memory technology can meet all the requirements of typical applications, real computers have two, three or more distinct memory devices. At least some of these will be available as distinct operand storage classes. Others will require special software to communicate with external processors which can gain direct access.
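Workspace binding can be sketched as follows; the memory size, workspace pointer values and variable offsets are illustrative.

```python
# Each local variable is bound to an offset from the workspace pointer (wp);
# relocating the workspace requires changing only wp, not the code.
memory = [0] * 32
X, Y = 0, 1                           # locals bound to offsets 0 and 1

wp = 8                                # workspace pointer
memory[wp + X] = 7                    # x := 7
memory[wp + Y] = memory[wp + X] + 1   # y := x + 1

wp = 16                               # relocate the workspace...
memory[wp + X] = 7                    # ...the same code works unchanged
memory[wp + Y] = memory[wp + X] + 1

assert memory[9] == 8 and memory[17] == 8
```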

Access class

The access class of an operand is the manner in which it is referenced. It describes what happens to it and whether it is updated. There follows a summary…

• Read (R)

• Write (W)

• Read-Modify-Write (RMW)

Read access implies that the operand remains unaltered by the reference and simply has its value used, for example as an operand of an operator. A two-operand addition instruction may be defined to overwrite the second operand with the result; the first operand is then of access class read and the second read-modify-write. An example of write access is the destination of a store.

The access class of each operand must be specified in the definition of each instruction since it depends almost totally on the nature of the operation performed.

Addressing modes

Each instruction encodes an operation. Operations act on operands. Also encoded within the instruction is how to find the operands. For instance, if just one operand is required and it resides in memory then a further instruction field must communicate this fact and an instruction extension must indicate its address. An instruction is therefore a record with the following fields…

• Opcode

• Addressing mode(s)

• Address(es)

The addressing mode defines the storage class of the operand. When the instruction specifies the memory address of the operand explicitly, this is called absolute or direct addressing. Absolute addressing has become progressively less common, since the whole machine code program will need editing if the data area is moved. Relative addressing, where data is referenced via offsets from a workspace pointer, removes this difficulty: only the pointer need be changed. Similarly, code (or data) in program memory may be addressed relative to the program counter. Everything may then be accessed via an offset from one register or another, be it data, a subroutine or a construct code segment. Position independent code is said to result from such an architecture.

Immediate mode indicates to the processor that constant data follows in an instruction extension instead of an offset or an address. It may be thought of as a special case of program counter relative addressing. However, the data should be regarded as contained within the instruction.

Some addressing modes do not require qualification with an address. For example register addressing may be encoded as a distinct mode for each register. Hence no further qualification is required.

There follows a summary of common addressing modes…

• Immediate

• Register

• Workspace relative

• Program counter relative

• Absolute (direct)
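Operand fetch under these modes can be sketched as below; the mode names, machine state and memory layout are illustrative, not taken from a real instruction set.

```python
# Sketch of operand fetch for the five common addressing modes listed above.
def fetch_operand(mode, field, memory, regs, wp, pc):
    if mode == "immediate":            # constant held within the instruction
        return field
    if mode == "register":             # field names a register directly
        return regs[field]
    if mode == "workspace":            # offset from the workspace pointer
        return memory[wp + field]
    if mode == "pc-relative":          # offset from the program counter
        return memory[pc + field]
    if mode == "absolute":             # field is the full memory address
        return memory[field]
    raise ValueError(mode)

memory = [10, 20, 30, 40]
regs = {"r0": 99}
assert fetch_operand("immediate", 7, memory, regs, wp=2, pc=1) == 7
assert fetch_operand("register", "r0", memory, regs, wp=2, pc=1) == 99
assert fetch_operand("workspace", 1, memory, regs, wp=2, pc=1) == 40
assert fetch_operand("pc-relative", 2, memory, regs, wp=2, pc=1) == 40
assert fetch_operand("absolute", 0, memory, regs, wp=2, pc=1) == 10
```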


Fig. 5: Indexing as an address modifier


Fig. 6: Indirection as an address modifier

Addressing mode modifiers

Addressing modes may be used to reference either scalar data or the base of a vector. Vector elements may be sequence associated or pointer associated. An addressing mode may be modified to locate an operand by either of the following…

• Indexing

• Indirection

Fig. 5 illustrates indexing and Fig. 6 indirection.

Indexing allows referencing of an element within an array. An index must be specified, which should first be checked to see that it lies within the array bounds. The array shown has only one dimension. Multi-dimensional arrays require one index per dimension, which are then used together to calculate the element address. (Memory has just one dimension, so some mapping is necessary.)

Indirect addressing means specifying, instead of the operand address, the address of the address. Indirection is like crossing a pond via stepping stones: each stone represents one level of indirection.
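Both modifiers can be sketched in a few lines; the memory layout, base address and bounds are illustrative.

```python
# Indexing (base plus bounds-checked index) and one level of indirection.
memory = [0] * 16
base, bounds = 4, 3
memory[4:7] = [11, 22, 33]           # three-element vector at address 4

def indexed(memory, base, index, bounds):
    if not 0 <= index < bounds:      # check bounds before forming address
        raise IndexError(index)
    return memory[base + index]

def indirect(memory, address):
    return memory[memory[address]]   # one stepping stone: address of address

memory[10] = 5                       # location 10 holds a pointer to 5
assert indexed(memory, base, 2, bounds) == 33
assert indirect(memory, 10) == 22
```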


Fig. 7: Programmer's architecture for purely sequential processing


Updated: Wednesday, March 1, 2017 18:37 PST