Computer Architecture / Organization -- Software Engineering (part 1)





1. Projects

1.1 Engineering design process

The following universal design process is employed in all fields of engineering…

1. Analyze

2. Design

3. Implement

4. Verify

Systems analysis should result in a requirements specification.

The designer then uses this to produce a design specification. A solution is then implemented, in a form sufficiently self-documenting for any other engineer to understand.

Fig. 1 shows a flow diagram for the design process, which terminates only when the implementation has been verified against the requirements.

Failure to specify a requirement properly before designing a solution is clearly foolish, yet it is remarkable how often it occurs. Specification means first breaking the requirement up into smaller, more manageable, blocks and identifying the interfaces between them. Systems analysis is most vitally concerned with specifying interfaces, particularly the human/machine interface.

Systems design requires a means of specifying how the requirement may be met. It proceeds by some means of reducing the requirement into manageable modules, which may in turn be broken down by…

• Procedure

• Process

• Object

• Relation

• Function

In any one design, all modules must be of the same kind to make interfaces between them possible.

Some means of dividing the implementation into corresponding software partitions must be available which permits separate development and separate verification.

Verification is usually the greater problem. One can rarely be certain that an implementation is totally correct. Often the best that can be done is to test that the requirements specification is met over as large a set of inputs as practicable. Exhaustive verification means testing the system output for every legal input and is usually prohibitively expensive. It is often possible to classify input. If so, a test set may be selected with members from each class, as sketched below.
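
By way of illustration, the following sketch (in Python, using a hypothetical routine classify_triangle invented for the example) selects one representative member from each class of input rather than testing exhaustively.

    # A minimal sketch of class-based (equivalence-class) testing.
    # The routine under test, classify_triangle, is hypothetical.

    def classify_triangle(a, b, c):
        """Return 'equilateral', 'isosceles', 'scalene' or 'invalid'."""
        if a <= 0 or b <= 0 or c <= 0 or a + b <= c or b + c <= a or a + c <= b:
            return "invalid"
        if a == b == c:
            return "equilateral"
        if a == b or b == c or a == c:
            return "isosceles"
        return "scalene"

    # One representative input is drawn from each class of legal (and
    # illegal) input instead of attempting exhaustive verification.
    test_set = {
        (3, 3, 3): "equilateral",
        (3, 3, 2): "isosceles",
        (3, 4, 5): "scalene",
        (1, 1, 9): "invalid",
    }

    for args, expected in test_set.items():
        assert classify_triangle(*args) == expected, args
    print("all representative cases pass")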


Fig. 1: How to solve a problem

Formal verification implies the use of mathematical transformations to establish analytically that the algorithm used will give correct output for any valid input.

1.2 Organization

A project may itself be thought of as a process whose input is a requirement and whose output is an implementation of a system which meets it. This time, however, the process is carried out by people. The implementation for a particular requirement is known as an application.

Projects make use of the design process described above. Each sub-process is usually carried out by a separate team. The reader should think of a line leading from requirement to verified solution and a sequence of sub-processes along it. At the design stage the requirement meets prior knowledge and a very poorly understood process known as human thought produces a solution.

The processes of analysis, design, implementation and verification, when applied to computer systems, are collectively known as software engineering. A readable text on software engineering is [Sommerville 85].

Few systems are simple to engineer; most are complex. Projects are usually constrained to a tight time scale because of the need for a product to reach the market before its competition. As a result it is essential to distribute effort across a team.

Three problems arise here…

• Distribution: How may the project be broken down between team members?

• Interface: How may team members work together effectively?

• Co-ordination: How may the project now be managed?

The problem may be summarized as how to get a ten man-year job done in a single year! There is no single clear answer to any of these questions. They are still the subject of much research. The ideas in modular software engineering go some way towards a solution.


Fig. 2: Top-down task decomposition diagram for the composition of this guide

1.3 Languages

Requirements

The concept of language is central to both computer science and software engineering. There is need for a language to communicate each of the following things…

• Requirements specification

• Design specification

• Implementation (programmer readable)

• Implementation (machine readable)

The language used to specify requirements is usually the local natural language (e.g. English). Design and implementation languages should be compatible since the programmer must translate the design specification into an implementation.

A semantic gap is said to separate the two implementation languages, the programmer's and the machine's. Its width depends on the design philosophy of the machine, and it is crossed by translation. The machine language is expressed in terms of the capabilities of the machine itself, which reflect both its design philosophy and the model of computation chosen.

2. Modular systems design

2.1 Tasks

Top-down task diagram

We commence by thinking of the requirement to be met as a task to be performed. Now we repeatedly subdivide it into a number of others which are simpler, or easier, to perform. Each task is considered done if all its offspring have been completed. Reduction proceeds until we reach a point at which a single engineer can easily specify software to perform each task at any level.

Fig. 2 illustrates a top-down task diagram for the task of writing a textbook such as this.

Balanced loading

The principle of balanced loading may be roughly stated as follows… All tasks, regardless of level, should require roughly equal effort to perform.

It does not help at all if we decompose a task without making its offspring significantly easier to perform. Ideally all tasks at all levels should be of equal difficulty to minimize the overall effort required and maximize its distribution.

These ideas can be usefully illustrated by the analysis of management structure. Fig. 3 (lower) represents the better structure. It may not necessarily be very good in practice. It is easy to imagine that one middle manager may have less to do than colleagues at the same level. The result would be an imbalance of load between them. There is no easy way to assess relative task loading. An initial design may have to be revised after it has been tested.

Although it may appear easy, obtaining a decomposition of a task which is balanced in both dimensions is far from it. It requires skill on the part of the designer which takes experience to gain.

Population growth

Following balanced loading, a second principle of top-down task decomposition may be stated… Each partition of a task should be into between two and five offspring.

The size of the "box" population should not grow too quickly or too slowly. Between two and five children per parent is recommended.

No single analysis should consider more than a few levels. A sensible number is three. Each terminal (lowest level) task should subsequently be the subject of a further analysis if necessary.

To continue with the management analogy, compare the two alternative structures in Fig. 3. It is hard to imagine how balanced loading could ever be achieved with the upper structure. Still worse is that extra, unnecessary, interaction must occur. The path of command is twice as long as in the lower structure.

Interaction

Once a satisfactory top-down diagram is obtained, the interface between each task and its parent must be rendered clear within the specification of both. A third principle is necessary…

No interaction should occur except between parent and child.

The precise description of the interaction between parent and offspring tasks is called the interface.

The ideas detailed here are based on those of Edward Yourdon. For a full account of top-down design see [Yourdon & Constantine 78].

Modules

The top-down diagram is progressively developed until each task may be delegated to a single engineer (or team of engineers) who may be expected to produce a verified system prior to a declared deadline. The software to perform each task delegated in this way is called a module. Each module will eventually exist in the following guises…

• Definition module

• Implementation source module

• Implementation object module


Fig. 3: Two alternative management structures

The definition module defines the module interface, whilst the implementation source module contains the software itself in humanly readable form using a programming language. The implementation object module contains the same but in machine readable form using a machine language.

To summarize, in order to render a large requirement manageable we break it down into tasks until a point is reached where individuals (or small teams) are able to implement software to perform each one. The system design is then a number of definition modules, corresponding to each task, which each specify the interface to both parent and offspring. It must be emphasized that all tasks at all levels are represented by modules. The first definition module to be written is the topmost.

Top-down decomposition is applicable with all programming models.

However, tasks are most easily related to procedures (which perform them). In the procedural model the communication of…

• Procedures

• Variables

• Constants

• Data types

…is undertaken. A parent is said to import any of these from its offspring. An offspring is said to export them to its parent.
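
The sketch below illustrates import and export in this sense. Python, used for the sketches in this guide, has no separate definition module, so the offspring's definition is approximated by its public names; the module and identifier names (buffer_child, Buffer, insert, MAX_DEPTH) are purely illustrative.

    # Offspring module (hypothetical file: buffer_child.py).
    # Its "definition" is the set of names it exports below.

    MAX_DEPTH = 64                      # exported constant

    class Buffer:                       # exported data type
        def __init__(self):
            self._items = []

    def insert(buf, item):              # exported procedure
        if len(buf._items) < MAX_DEPTH:
            buf._items.append(item)

    # Parent module (hypothetical file: controller_parent.py) imports
    # exactly what the offspring exports:
    #
    #     from buffer_child import Buffer, insert, MAX_DEPTH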

The reader must be wary of confusion here. Modules are purely descriptive. They offer a hierarchical method of describing the required system for human benefit only. For the sake of performance (i.e. fast execution) the machine should be as unhindered as possible by inter-module boundaries. It is not possible to have modular software without some performance diminution, so modularity may need to be traded off against performance.

In order to reduce development cost it is vital that software be reused whenever possible. Library modules of previously written software should be built up and used later. To summarize…

• Low development cost requires software reusability

• Reusability requires software modularity

• Modularity may require some trade-off against performance


Fig. 4: Data flow diagram for a vision system

The benefits of modules are twofold. Firstly, modules make the development of large systems manageable. Secondly, reusable software drastically reduces development cost. Modules should be thought of as hardware "chips". They prevent "reinvention of the wheel" and unnecessary duplication of effort. They form a software resource which need only be developed once and then used in conjunction with any (correctly interfaced) higher level modules. The interface, as detailed in the definition module, should be all that it is necessary to know in order to use a library module. It should not be necessary to even have available the corresponding implementation source module.

2.2 Processes

Data flow graph

A data flow graph is a graph in which processes form the nodes and channels (or buffers) form the edges through which data flow. It is a directed graph and may or may not be fully connected. A data flow diagram is a pictorial representation of the data flow graph. Fig. 4 shows an example.

No information is contained in the data flow graph about when the processes run, i.e. which must run concurrently and which must run successively. This information must be drawn separately out of the requirement specification.

The analysis of data flow renders clear the communication inherent in the requirement. Now we are able to identify precisely what we require for a process interface specification. It is simply a formal or informal specification of the protocol of both incoming and outgoing data.
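
As a rough illustration, the following Python sketch builds a tiny data flow network of three processes connected by channels. The process and channel names (capture, threshold, report) are invented for the example and do not correspond to Fig. 4.

    # Processes are nodes; channels (here, thread-safe queues) are the
    # edges along which data flow.
    import queue
    import threading

    raw = queue.Queue()        # channel: capture -> threshold
    clean = queue.Queue()      # channel: threshold -> report

    def capture(out_ch):
        for sample in [12, 200, 7, 155, 90, None]:   # None marks end of stream
            out_ch.put(sample)

    def threshold(in_ch, out_ch, level=100):
        while (x := in_ch.get()) is not None:
            out_ch.put(x > level)
        out_ch.put(None)

    def report(in_ch):
        while (flag := in_ch.get()) is not None:
            print("above threshold" if flag else "below threshold")

    # Each process may run concurrently; the graph itself says nothing
    # about when they run.
    threads = [threading.Thread(target=capture, args=(raw,)),
               threading.Thread(target=threshold, args=(raw, clean)),
               threading.Thread(target=report, args=(clean,))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()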

Partitions

There is one other benefit gained from data flow analysis. A means of process oriented design is afforded which conforms to the notion of stepwise refinement.

Each node on the data flow graph may be reduced to a new subgraph. This continues until processes are reached which may be implemented without further reduction. In other words the system is partitioned into a network of processes.

Each process may then be separately developed, maintained, verified and reused as a software module.

2.3 Objects

Nature

The reader is expected to be familiar with the procedural programming model where software specifies system behavior in terms of procedures which act upon data structures which may be either static or dynamic. The object oriented programming model unifies data and procedure with the concept of an object.

The idea is rather like an extension of a record which allows procedure fields, and these alone may act upon the associated data fields. These procedures are termed methods and are invoked only by sending the object an appropriate message.

State is said to be encapsulated with methods and can only be updated by the arrival of a message. For instance, a graphics system may require the ability to draw a circle. A circle object is defined which responds to the message "draw" by invoking a suitable method to update the screen (state). Polymorphism is the name given to the ability of different objects to respond differently to the same message. A line object may also respond to "draw" with a completely different result.
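
The following Python sketch models this behaviour, with message passing approximated by method calls; the classes and the "draw" message follow the example above.

    # Encapsulation, message passing and polymorphism in miniature.
    class Circle:
        def __init__(self, radius):
            self._radius = radius          # encapsulated state

        def draw(self):                    # method invoked by the "draw" message
            print(f"circle of radius {self._radius}")

    class Line:
        def __init__(self, length):
            self._length = length

        def draw(self):                    # same message, different response
            print(f"line of length {self._length}")

    for shape in (Circle(2), Line(5)):     # polymorphism: one message, many behaviours
        shape.draw()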

An object in the real world may be represented by an abstract data type which captures not just its state but also the operations which may be performed on it. A system is represented as a network of communicating objects very similar to one composed of processes. The only truly significant difference is that objects possess a property known as inheritance. It is possible to declare the…

• State

• Methods

• Messages

…of an entire class of objects. When an object is required it is declared to be an instance of a given class and inherits its state, methods and messages. This is not all, for it is then possible to modify the instance to possess further properties.

Even whole new classes may be declared as modifications of existing ones. This property promotes the reusability of software to an unprecedented degree. A number of classes turn out to be very common natural classes and hence are usually provided within an object-oriented programming system. It is possible to implement a very wide range of applications very efficiently using these as a starting point. A few of them are discussed below.
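
A minimal sketch of inheritance, again in Python and with illustrative class names only, might read as follows.

    # A new class declared as a modification of an existing one.
    class Shape:
        def __init__(self, x, y):
            self.x, self.y = x, y          # state inherited by every instance

        def move(self, dx, dy):            # method inherited by every instance
            self.x += dx
            self.y += dy

    class Circle(Shape):                   # Circle inherits state and methods...
        def __init__(self, x, y, radius):
            super().__init__(x, y)
            self.radius = radius           # ...and adds further properties

    c = Circle(0, 0, 3)
    c.move(1, 2)                           # inherited behaviour
    print(c.x, c.y, c.radius)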

In short the most important characteristics of the object model are…

• Encapsulation

• Message passing

• Polymorphism

• Inheritance

See [Thomas 89] for a more thorough and very readable introduction. The archetypal object-oriented language is Smalltalk [Goldberg & Robson 83]. The use of objects in system design is thoroughly treated in [Meyer 88].

The implementation of objects using a procedural language is limited since message passing and inheritance are not explicitly supported. However [Stubbs & Webre 87] is an excellent introduction to the implementation of abstract data types given such limitations.

Lists

A list is a dynamic abstract data structure, i.e. it may change in size while the process to which it belongs is running. List elements may be physically either sequence associated or pointer associated depending on implementation.

Arguably the list is the simplest object of all. The minimum set of messages it should recognize is…

• Insert

• Remove

Both are followed by a key which identifies the element affected.

Typically, other messages would allow an element to be inspected without removal and a check to be made for the presence of a given key.

A list is referred to as a generic type of data structure. In other words it defines a class of object. Lists are linear and thus are ordered. Each element has precisely one successor and one predecessor except those at each end. Extra messages may be defined which exploit the structure best for a given application.
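
A minimal Python sketch of such a list object, assuming a sequence-associated implementation and a hypothetical extra message member, might be…

    # A list object with the minimum message set (insert, remove) plus
    # one typical extra message (member). Keys identify the elements.
    class List:
        def __init__(self):
            self._items = []               # sequence-associated implementation

        def insert(self, key, value):
            self._items.append((key, value))

        def remove(self, key):
            for i, (k, v) in enumerate(self._items):
                if k == key:
                    del self._items[i]
                    return v
            return None                    # response when the key is absent

        def member(self, key):             # inspect without removal
            return any(k == key for k, _ in self._items)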

Stack

At the purely data level a stack is identical to a list. It is also a linear, ordered structure. It is its message protocol that differs (see Fig. 5). The minimum set of stack messages is…

• Push

• Pop

Push is followed by an element which is to be placed on top of the stack. Pop causes a responding message containing the element removed from there. Care should be taken, when defining the pop protocol, to specify the response should the stack be empty. Similarly, the push protocol must be carefully defined since, in practice, no machine has infinite memory and so any stack can become full.

Stack protocol is referred to as last in first out (LIFO), or first in last out (FILO), since the last element in is the first out and the first in is the last out.

Stacks are particularly important in computer organization and will be returned to later in this guide. Their chief use is in expression evaluation.
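
The following Python sketch defines a stack object with defined full and empty responses, and then uses it to evaluate the postfix expression 3 4 + 2 * as a small example of expression evaluation. The capacity chosen is arbitrary.

    # A stack object with explicit full and empty responses.
    class Stack:
        def __init__(self, capacity=8):
            self._items = []
            self._capacity = capacity

        def push(self, element):
            if len(self._items) == self._capacity:
                raise OverflowError("stack full")     # defined full response
            self._items.append(element)

        def pop(self):
            if not self._items:
                raise IndexError("stack empty")       # defined empty response
            return self._items.pop()

    # Usage: evaluating the postfix expression 3 4 + 2 * (LIFO protocol).
    s = Stack()
    for token in ["3", "4", "+", "2", "*"]:
        if token.isdigit():
            s.push(int(token))
        else:
            b, a = s.pop(), s.pop()
            s.push(a + b if token == "+" else a * b)
    print(s.pop())                                    # prints 14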

Queue

The last object class to be introduced has the following minimum message set…

• Enqueue

• Serve

…and is also linear and ordered like a list.


Fig. 5: Internal operations of a stack object

Enqueue is followed by the element to be placed at the back of the queue.

Serve causes a responding message containing the element found at the front.

Care must be taken to ensure that the protocol covers the possibility of the queue being either full or empty. Fig. 6 depicts the operation of the methods on receipt of each message.

The queue protocol is referred to as first in first out (FIFO) since the first element in is always the first out.

Like the stack, the queue is of great importance in computer organization and is used for buffering data in asynchronous communication.

An extension is the priority queue object, where elements are enqueued internally according to an associated priority. Hence, when an item arrives with a priority higher than that of the element at the back, it is placed further forward as appropriate.
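
A minimal Python sketch of both object classes follows; the empty-queue response (returning None) and the priority representation are illustrative choices only.

    # A queue object (FIFO) and a priority queue variant in which each
    # element is enqueued according to an associated priority.
    class Queue:
        def __init__(self):
            self._items = []

        def enqueue(self, element):
            self._items.append(element)        # join at the back

        def serve(self):
            if not self._items:
                return None                    # defined response when empty
            return self._items.pop(0)          # leave from the front

    class PriorityQueue(Queue):
        def enqueue(self, element, priority=0):
            # Place the new element ahead of any lower-priority elements,
            # preserving FIFO order among equal priorities.
            i = len(self._items)
            while i > 0 and self._items[i - 1][0] < priority:
                i -= 1
            self._items.insert(i, (priority, element))

        def serve(self):
            item = super().serve()
            return None if item is None else item[1]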


Fig. 6: Internal operations of a queue object
