Programming paradigms

Programming Paradigms

4.5.0 Introduction

This Chapter uses a number of different programming languages to illustrate the points
being explained. You will not be asked to write program code in the examination. However,
you may well find it helpful, in answering examination questions, to include code in your
answers. This is perfectly satisfactory; indeed it will often help you to clarify what you are
saying. If you do write code, the accuracy of the syntax will not be marked. Examiners will
use your code to see if you understand the question and can explain your answer.

Thus, treat the code in this Chapter simply as an explanation of the different facilities
available in different programming paradigms. Do not think that you have to be able to
program in all the languages used in the following Sections.
4.5.1 Programming Paradigms

Programming paradigms are simply methods of programming. Initially, computers were
programmed using binary. This was difficult and led to many errors that were difficult to
find. Programs written in binary are said to be written in machine code, this is a very
low-level programming paradigm.

To make programming easier, assembly languages were developed. These replaced
machine code functions with mnemonics and addresses with labels. Assembly language
programming is also a low-level paradigm although it is a second generation paradigm.
Figure 4.5.1.1 shows an assembly language program that adds together two numbers and
stores the result.

Label Function Address Comments
LDA X Load the accumulator with the value of X
ADD Y Add the value of Y to the accumulator
STA Z Store the result in Z
STOP Stop the program
X: 20 Value of X = 20
Y: 35 Value of Y = 35
Z: Location for result

Figure 4.5.1.1

Although this assembly language is an improvement over machine code, it is still prone to
errors and code is difficult to debug, correct and maintain.

The next advance was the development of procedural languages. These are third generation
languages and are also known as high-level languages. These languages are problem
oriented as they use terms appropriate to the type of problem being solved. For example,
COBOL (Common Business Oriented Language) uses the language of business. It uses
terms like file, move and copy.

FORTRAN (FORmula TRANslation) and ALGOL (ALGOrithmic Language) were developed
mainly for scientific and engineering problems. Although one of the ideas behind the
development of ALGOL was that it was an appropriate language to define algorithms. BASIC
(Beginners All purpose Symbolic Instruction Code) was developed to enable more people to
write programs. All these languages follow the procedural paradigm. That is, they describe,
step by step, exactly the procedure that should be followed to solve a problem.

The problem with procedural languages is that it can be difficult to reuse code and to modify
solutions when better methods of solution are developed. In order to address these
problems, object-oriented languages (like Eiffel, Smalltalk and Java) were developed. In
these languages data, and methods of manipulating the data, are kept as a single unit called
an object. The only way that a user can access the data is via the object's methods. This
means that, once an object is fully working, it cannot be corrupted by the user. It also
means that the internal workings of an object may be changed without affecting any code
that uses the object.

A further advance was made when declarative programming paradigms were developed. In
these languages the computer is told what the problem is, not how to solve the problem.
Given a database the computer searches for a solution. The computer is not given a
procedure to follow as in the languages discussed so far.

Another programming paradigm is functional programming. Programs written using this
paradigm use functions, which may call other functions (including themselves). These
functions have inputs and outputs. Variables, as used in procedural languages, are not used
in functional languages. Functional languages make a great deal of use of recursion (see
Section 1.3.4).

4.5.2 Programming Paradigms – Examples and Uses

Procedural languages specify, exactly, the steps required to solve a problem. These
languages use the constructs sequence, selection and repetition (see Section 1.3.3). For
example, to find the area of a rectangle the steps are

Read the length
Read the breadth
Multiply the length by the breadth
Output the result

In C++ this can be coded as

cout << "Enter the length: ";
cin >> Length;
cout << "Enter the breadth: ";
cin >> Breadth;
Area = Length * Breadth;
cout << "The area is "
<< Area << endl;

Here each line of code is executed one after the other in sequence.

Most procedural languages have two methods of selection. These are the IF … THEN … ELSE
statement and the SWITCH or CASE statement. For example, in C++, we have

IF (Number > 0)
cout << "The number is positive.";
ELSE
{
IF (Number = = 0)
cout << "The number is zero.";
ELSE
cout << "The number is negative.";
}
In C++ multiple selections can be programmed using the SWITCH statement. For example,
suppose a user enters a single letter and the output depends on that letter, a typical piece
of code could be

switch (UserChoice)
{
case 'A':
cout << "A is for Apple.";
break;
case 'B':
cout << "B is for Banana.";
break;
case 'C':
cout << "C is for Cat.";
break;
default:
cout << "I don't recognise that letter.";
}

Repetition (or iteration) is another standard construct. Most procedural languages have
many forms of this construct such as

FOR … NEXT
REPEAT … UNTIL …
WHILE … DO …

A typical use of a loop is to add a series of numbers. The following pieces of C++ code add
the first ten positive integers.

//Using a FOR loop
Sum = 0;
FOR (int i = 1; i <= 10; i++)
{
Sum = Sum + i;
}
cout << "The sum is "
<< Sum;

//Using a WHILE loop
Sum = 0;
i = 1;
while (i <= 10)
{
Sum = Sum + i;
i++;
}
cout << "The sum is "
<< Sum;

The point to note with these procedural languages is that the programmer has to specify
exactly what the computer is to do.

Procedural languages are used to solve a wide variety of problems. Some of these
languages are more robust than others. This means that the compiler will not let the
programmer write statements that may lead to problems in certain circumstances. As stated
earlier, there are procedural languages designed to solve scientific and engineering
problems while others are more suitable for solving business problems. There are some that
are particularly designed for solving problems of control that need real time solutions.

Procedural languages may use functions and procedures but they always specify the order
in which instructions must be used to solve a problem. The use of functions and procedures
help programmers to reuse code but there is always the danger of variables being altered
inadvertently.

In the 1970s it was realised that code was not easily reused and there was little security of
data in a program. Also, the real world consists of objects not individual values. My car,
registration number W123ARB, is an object. Kay's car, registration number S123KAY, is
another object. Both of these objects are cars and cars have similar attributes such as
registration number, engine capacity, colour, and so on. That is, my car and Kay's car are
instances of a class called cars. In order to model the real world, the Object-oriented
Programming (OOP) paradigm was developed. Unfortunately, OOP requires a large amount
of memory and, in the 1970s, memory was expensive and CPUs still lacked power. This
slowed the development of OOP. However, as memory became cheaper and CPUs more
powerful, OOP became more popular. By the 1980s Smalltalk, and later Eiffel, had become
well established. These were true object-oriented languages. C++ also includes classes
although the programmer does not have to use them. This means that C++ can be used as
a standard procedural language or an object-oriented language or a mixture of both! Java,
with a syntax similar to C++, is a fully object-oriented language. Although OOP languages
are procedural in nature, OOP is considered to be a new programming paradigm.

The following gives an example, but may be omitted.

The following is an example, using Java, of a class that specifies a rectangle and the
methods that can be used to access and manipulate the data. A more detailed description is
given in Section 4.5.6.

class Shapes {

/* Shapes Version 1 by Roger Blackford July 2001
----------------------------------------------------------
This illustrates the basic ideas of OOP
*/

// Declare three object variables of type Rectangle
Rectangle small, medium, large;

// Create a constructor where the initial work is done
Shapes ( ) {
// Create the three rectangles
small = new Rectangle(2, 5);
medium = new Rectangle(10, 25);
large = new Rectangle(50, 100);

//Print out a header
System.out.println("The areas of the rectangles are:\n");

//Print the details of the rectangles
small.write( );
medium.write( );
large.write( );
}//end of constructor Shapes.

//All programs have to have a main method
public static void main(String [ ] args) {
//Start the programm from its constructor
new Shapes ( );
}//end of main method.

}//end of class Shapes.

class Rectangle {
//Declare the variables related to a rectangle
int length;
int width;
int area;

//Create a constructor that copies the initial values into the object's variables
Rectangle (int w, int l) {
width = w;
length = l;

//Calculate the area
area = width * length;
}//end of constructor Rectangle

//Create a method to output the details of the rectangle
void write ( ) {
System.out.println("The area of a rectangle " + width + " by " + length + " is " + area);
}//end of write method.
}//end of constructor

This example contains two classes. The first is called Shapes and is the main part of the
program. It is from here that the program will run. The second class is called Rectangle and
it is a template for the description of a rectangle.

The class Shapes has a constructor called Shapes, which declares two objects of type
Rectangle. This is a declaration and does not assign any values to these objects. In fact,
Java simply says that, at this stage, they have null values. Later, the new statement creates
actual rectangles. Here small is given a width of 2 and a length of 5, medium is given a
width of 10 and a length of 25 and large is given a width of 50 and a length of 100. When a
new object is created from a class, the class constructor, which has the same name as the
class, is called.

The class Rectangle has a constructor that assigns values to width and length and then
calculates the area of the rectangle.

The class Rectangle also has a method called write( ). This method has to be used to output
the details of the rectangles.

In the class Shapes, its constructor then prints a heading and the details of the rectangles.
The latter is achieved by calling the write method. Remember, small, medium and large are
objects of the Rectangle class. This means that, for example, small.write( ) will cause Java
to look in the class called Rectangle for a write method and will then use it.

More details of Object-oriented Programming will be given in Section 4.5.6.

The functional programming paradigm provides a very high level view of programming. All
programs consist of a series of functions that use parameters for input and pass values to
other functions. There are no variables like the ones in procedural languages. However, like
procedural languages, the programmer has to tell the computer the precise steps to be
taken to solve a problem. For example, in the language "Haskell", the following returns the
square of a number.

square :: Int ® Int
square n = n * n

is a function that squares a number.

square :: Int ® Int

says that we have a function called square that takes an integer as input and outputs an
integer.

square n = n * n

says that the function requires the value of n as input and outputs n * n.

Another example is

different :: Int ® Int ® Int ® Bool
different a b c = not (( a = = b) && (b = = c))

Here, different takes three integers as input and outputs a Boolean value True or False. The
output is True if a, b and c are not all the same.

Suppose a = 2, b = 5 and c = 7. Then (a = = b) and (b = = c) are both False. Therefore (a
= = b) && (b = = c) is False. Hence not (( a = = b) && (b = = c)) is True.

See what happens if

(i) a = 2, b = 2 and c = 3;
(ii) a = 2, b = 3 and c = 3;
(iii) a = 5, b = 5 and c = 5.

You should find that (i) and (ii) give an output of True and (iii) gives an output of False.

Most functions use guards to determine the output. Consider the following example.

min2 :: Int ® Int ® Int
min2 x y
| x <= y = x
| otherwise = y

Here | x <= y is a guard. The function first checks to see if x <= y is True. If it is, the
function outputs x and ends. If x <= y is False, Haskell moves to the next line and checks
the guard. In this case there is no guard. So the function outputs y.

Now consider this function

min3 Int ® Int ® Int ® Int
min3 x y z
| x <= y && x <= z = x
| y <= z = y
| otherwise = z

Here the function is expecting three integers as input and outputs an integer. It in fact
outputs the minimum of three integers. If x <= y and x <= z then x must be the minimum
and the function outputs x and stops. However, if this is not true, x is not the minimum
therefore y or z must be the minimum. The second guard checks this. If this returns a True
value, y is output. If the guard returns a False value, the function continues and outputs z.

Functional programming can be very powerful. The trick is to keep breaking a problem
down into sub-problems until the sub-problems can be solved by using simple functions.
Suppose we wish to find the minimum of four integers. There are many ways of doing this
but they all consist of sub-problems. One algorithm is

min of a b c d = min of (a and (min b c d))

This uses two functions namely the minimum of three integers and the minimum of two
integers. But we already have solutions to these problems so why not use them and we
have

min4 :: Int ® Int ® Int ® Int ® Int
min4 a b c d = min2 a (min3 b c d)

This is simply the use of step-wise refinement/top-down design as explained in Section
1.3.2.

Another facility of functional programming that makes it a powerful programming paradigm
is the use of recursion. Consider the problem of finding the sum of the first n integers.
Traditionally this could be written (in Visual Basic) as

sum = 0
For count = 1 to n
sum = sum + count
Next count

This may also be written in Haskell as

sum :: Int ® Int
sum n
|n = 1 = 1
| otherwise = n + sum (n-1)

This is simply saying that the function sum expects an integer as input and outputs an
integer. The first guard says that if the input integer is 1, output 1. If this guard is False,
then output n plus the value of sum (n – 1). Then sum (n – 1) calls the same function but
with the input being 1 less than the last call. This is repeated until the input is reduced to 1
when a value of 1 is output.

Now consider the call sum 3. Diagrammatically we have

sum 3
| 3 =1 False
| otherwise = 3 + sum 2 ® sum 2
|2 = 1 False
| otherwise = 2 + sum 1 ® sum 1
| 1 = 1 True
= 1
= 2 + 1
=3
= 3 + 3
= 6

The combination of step-wise refinement and recursion means that functional programming
is a very powerful tool. More powerful features of functional programming will be addressed
in Section 4.5.8.

N.B www.cs.bell-labs.com/who/wadler/realworld/ is worth a visit as it describes real world
problems that have been solved using functional programming. Because a program written
in a functional language consists only of functions it means that the program can be
mathematically proved to be correct. This means that very reliable programs can be
written. Common problems that have been solved using this paradigm are telecomm
switching, control of robots, airline scheduling and compiler writing.

Another programming paradigm is the declarative one. Declarative languages tell the
computer what is wanted but do not provide the details of how to do it. These languages are
particularly useful when solving problems in artificial intelligence such as medical diagnosis,
fault finding in equipment and oil exploration. The method is also used in robot control. An
example of a declarative language is Prolog. The idea behind declarative languages is
shown in Fig. 4.5.2.1.

Fig. 4.5.2.1

Here the user inputs a query to the search engine, which then searches the database for the
answers and returns them to the user. For example, using Prolog, suppose the database is

female(jane).
female(anne).
female(sandip).
male(charnjit).
male(jaz).
male(tom).

Note that in Prolog values start with a lowercase letter and variables start with an uppercase
letter. A user may want to know the names of all the males. The query

male(X).

will return

X = charnjit
X = jaz
X = tom

Notice that the user does not have to tell Prolog how to search for the values of X that
satisfy the query. In a procedural language the database may be held in a two-dimensional
array Gender as shown below.

Array Gender

1 2
1 female jane
2 female anne
3 female sandip
4 male charnjit
5 male jaz
6 male tom

In Visual Basic we could write

For count = 1 To 6
If Gender[count, 1] = "male" Then
picResults.Print Gender[count, 2]
End If

This is fairly straightforward. However, suppose we now add to the Prolog database the
following data.

parent(jane,mary).
parent(jane, rajinder).
parent(charnjit, mary).
parent(charnjit, rajinder).
parent(sandip, atif).
parent(jaz, atif).

and suppose we wish to know the name of the mother of Atif. In Prolog we use the query

parent(X, atif), female(X).

Prolog will output

X = sandip

Try writing a Visual Basic program to do this.

To get a list of all the fathers, we can simply write

parent(X, Y), male(X).

The result is

X = charnjit Y = mary
X = charnjit Y = rajinder
X = jaz Y = atif

If we only want a list of fathers we use the underscore and create the query

parent(X, _ ), male(X).
and the result is

X = charnjit
X = charnjit
X = jaz

Further examples are given in Section 4.5.7. At this stage the important point is that the
programmer does not have to tell the computer how to answer the query. There are no FOR
… NEXT, WHILE … DO … or REPEAT … UNTIL … loops as such. There is no IF … THEN …
statement. The system simply consists of a search engine and a database of facts and rules.
Examples of facts are given above. Examples of rules will be given in Section 4.5.7.

4.5.3 Structured Design

A complex problem needs to be broken down into smaller and smaller sub-problems until all
the sub-problems can be solved easily. This process is called step-wise refinement or
top-down design.

Consider the problem of calculating the wages for an hourly paid worker. The worker is paid
£6.50 per hour for up to 40 hours and time-and-a-half for all hours over 40. Tax and
National Insurance contributions have to be deducted. This can be represented by Fig.
4.5.3.1.

Fig. 4.5.3.1

An alternative way of writing this is to use numbered statements. This can be easier if there
are many sub-problems to be solved.

1. Wages
1.1 Get number of hours
1.2 Calculate gross pay
1.2.1 Calculate normal wages
1.2.2 Calculate overtime
1.3 Calculate deductions
1.3.1 Calculate tax
1.3.2 Calculate National Insurance
1.4 Calculate net pay
1.5 Output wages slip

Either of these designs can be turned into a series of functions and procedures. The
program could be called Wages and consist of the following functions and procedure.

Wages
GetHours( ) returns an integer in range 0 to 60
CalculateWages(Hours) returns gross wage
CalculateNormalWages(Hours) returns wage for up to 40 hours
CalculateOvertime(Hours) returns pay for any hours over 40
CalculateDeductions(GrossWage) returns total deductions
CalculateTax(GrossWage) returns tax due
CalculateNI(GrossWage) returns N.I. due
CalculateNetWage(GrossWage, Deductions) returns net wage after deductions
Procedure OutputResults(Hours, GrossWage, Tax, NI, Deductions, NetWage)
Procedure to print the wage slip

Here we can see that if a single value is to be returned, the simplest way to do this is to use
a function. If there are no values to be returned, then a procedure should be used. If more
than one value is to be returned, a procedure should be used.

For example, in the above, we may have

Int GetHours( ) or GetHours( ) As Integer

These statements state that the function will return an integer value and it does not expect
any values to be fed into it.

Double CalculateNormalWages(Int Hours)

or
CalculateNormalWages(Hours As Integer) As Double

expects to be given an integer value as input and returns a value of type Double.

Note: If you are programming in C, C++ or Java, there are no procedures. These
languages only use functions. A function has to be typed. That is, the programmer must
specify the type of value to be returned. This is true in all languages. In C, C++ and Java, if
a function is not going to return a value, its return type is void. That is, no value is actually
returned.

Parameters will be discussed further in Section 4.5.4.

Another type of diagram is used with the Jackson Structured Programming (JSP) technique.
Fig. 4.5.3.1 shows the sequence of steps.

Get number of hours
Calculate gross pay
Calculate deductions
Calculate net pay
Output wage slip

Calculate gross pay consists of the sequence

Calculate normal wages
Calculate overtime

Calculate deductions consists of the sequence

Calculate tax
Calculate National Insurance

The diagram does not illustrate selection (IF … THEN … ELSE …) nor does it show repetition
(FOR … DO … , WHILE … DO … , etc.). Fig. 4.5.3.2 shows selection in JSP design. Note the
use of a circle inside the boxes to indicate that the operations B and C are conditional.

Fig. 4.5.3.2

Repetition (also known as iteration) is shown in Fig. 4.5.3.3. The asterisk is used to indicate
that B is an iterative process; that is, B has to be repeated.

Fig. 4.5.3.3

Initially we shall consider diagrams that represent data. Consider a pack of ordinary playing
cards which has been divided into red and black suits. Fig. 4.5.3.4 shows this. The top level
shows we are using a pack; the second layer shows that the pack is divided into red and
black components. The third level shows that the red component consists of many (which
may be zero) cards. Similarly, black consists of many cards.

Fig. 4.5.3.4

Fig, 4.5.3.5 shows a pack divided into suits.

Fig, 4.5.3.5

Now suppose we deal a hand of cards from a shuffled pack until a spade is dealt. The data
structure is shown in Fig. 4.5.3.6.

Fig. 4.5.3.6
Finally consider a sequential file that contains daily takings of a shop in date order. At the
end of each month's takings is a grand total for the month. At the end of the file is the total
for all the months recorded. These totals act as validation checks. At the start of the file is a
header record describing the contents of the file. Fig. 4.5.3.7 shows this data structure.
Present and absent are processes that are performed according to the presence or absence
of the respective totals.

Fig. 4.5.3.7

The JSP diagrams we have seen so far have shown physical data structures. Logical data
structures describe the data with respect to a given application.

Suppose we wish to extract the takings for all Mondays from the file shown in Fig. 4.5.3.7.
The logical diagram is shown in Fig. 4.5.3.8. Notice that the diagram does not violate the
data structure shown in the previous diagram. Also, notice the use of null ( ? ) in the
decision at the bottom of the diagram; this shows that the ELSE part of the decision does
nothing. Further there are no decisions below the totals in this diagram because they are not
being used.

Fig. 4.5.3.8

Now suppose the application is to extract February's data. In this case we need to read past
January totals so these must be shown in the logical data structure. However, once we have
dealt with February, we do not wish to read the remaining months' data. Fig. 4.5.3.9 shows
the logical structure for this problem.

Fig. 4.5.3.9

The next step is to enter, on the diagram, the constraints. This is done by numbering the
constraints and then listing their meanings. Placing the constraints on Fig. 4.5.3.8 produced
Fig. 4.5.3.10 where

C1 is until end of file data
C2 is until end of month's body
C3 is day = Monday

Fig. 4.5.3.10

In this example, separate procedures can be written for each box in the logical structure.
That is, each section in the diagram can be a separate procedure or function. Each
procedure and function can use parameters to pass data to and from the calling procedure
or function. This is further explained in the next Section.

4.5.4 Standard Programming Techniques

The previous Section discussed top-down design and JSP diagrams. Both systems lead to
modules that can be programmed easily. Each module is a solution to an individual problem
and each module has to interface with other modules. As long as the interfaces are clearly
specified, each module can be given to a different programmer to code. All that the
programmers need to know is the problem and how its solution must communicate with the
solutions to other modules. This means that two programmers may happen to use the same
name for a variable. Also, the programmers will need to pass values to other modules and
be able to accept values from other modules. This was briefly discussed in Section 1.3.5.

Let us first consider how data can be input to a function or procedure. This is done by
means of parameters. The function below, written in Visual Basic, finds the perimeter of a
rectangle given its length and breadth. This is not the only way of finding the perimeter and
it probably is not the best way. However, it has been written like this in order to illustrate
certain programming points.

Public Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer

X = 2 * X
Y = 2 * Y
PerimeterOfRectangle = X + Y

End Function

In this function X and Y are integers the values of which must be passed to the function
before it can find the area of the rectangle. These variables are called formal parameters.
To use this function, another program will have to call it and provide the values for X and Y.
This can be done by means of a statement of the form

Perimeter = PerimeterOfRectangle(4, 6)

or we can use

A = 3
B = 4
Perimeter = PerimeterOfRectangle(A, B)

In both of these statements the variables inside the parentheses ( 4 and 6 in the first
example and A and B in the second) are called actual parameters. How the values are
passed to the function or procedure depends on the programming language. In the first
example the values 4 and 6 are stored in the variables X and Y. In the second example, in
Visual Basic, the addresses of A and B are passed to the function so that X and Y have the
same address as A and B respectively. In C++, in both cases the actual values are passed
to the function which stores them in its own variable space. Thus we have two different
ways of passing parameters. Fig. 4.5.4.1 shows how Visual Basic normally passes
parameters.

Calling Program Memory Location Function
A 3 X
B 4 Y

Fig. 4.5.4.1

Fig 4.5.4.2 shows what normally happens when C++ passes parameters. notice that two
extra memory locations are used and that C++ makes a copy of the values of A and B and
stores them in separate locations X and Y.

Calling Program Memory Location Function
A 3
B 4
3 X
4 Y

Fig 4.5.4.2

Visual Basic is said to pass parameters by reference (or address) and C++ passes them by
value. It is interesting to see the effect of passing values by address. Here is the function
described above and a copy of the calling function in Visual Basic.

Public Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer

X = 2 * X
Y = 2 * Y
PerimeterOfRectangle = X + Y

End Function

Private Sub cmdShow_Click()

Dim A As Integer
Dim B As Integer
Dim Perimeter As Integer

A = 3
B = 4

picResults.Print "Before call to Sub A = "; A; " and B = "; B
Perimeter = PerimeterOfRectangle(A, B)
picResults.Print "Perimeter = "; Perimeter
picResults.Print "After call to Sub A = "; A; " and B = "; B;

End Sub

Fig.4.5.4.3 shows the output when this program is run.

Fig.4.5.4.3

Notice that after the function has been run the values of A and B have changed. This is
because the addresses of A and B were passed not their actual values.

Visual Basic can pass parameters by value and C++ can pass parameters by reference. In
Visual Basic we have to use the ByVal key word if we want values to be passed by value.
Here is a modified form of the Visual Basic function together with the output from running
the modified program.

Public Function PerimeterOfRectangle(ByVal X As Integer, ByVal Y As Integer) As Integer

X = 2 * X
Y = 2 * Y
PerimeterOfRectangle = X + Y

End Function

Fig. 4.5.4.4

In order to pass parameters by reference in C++, the formal parameter must be preceded
by an ampersand (&).
Variables can have different values in different parts of the program. Look at the following
Visual Basic code and its output, shown in Fig. 4.5.4.5.

Private Sub cmdShow_Click()

Dim A As Integer
Dim B As Integer
Dim C As Integer
Dim Perimeter As Integer

A = 3
B = 4
C = 5

picResults.Print "Before call to Sub A = "; A; " and B = "; B
Perimeter = PerimeterOfRectangle(A, B)
picResults.Print "Perimeter = "; Perimeter
picResults.Print "After call to Sub A ="; A; " and B = "; B; " and C = "; C

End Sub

Public Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer

Dim C As Integer

C = 10

picResults.Print "In Sub C = "; C

X = 2 * X
Y = 2 * Y
PerimeterOfRectangle = X + Y

End Function

Fig. 4.5.4.5
This shows that C has a different value in the function PerimeterOfRectangle to in the calling
function cmdShow_Click. C is said to be a local variable and the C in PerimeterOfRectangle
is stored in a different address to the C in cmdShow_Click. Local variables only exist in the
block in which they are declared. This is very helpful as it means that different
programmers, writing different routines, do not have to worry about the names of variables
used by other programmers. However, it is sometimes useful to be able to use the same
variable in many parts of a program. To do this, the variable has to be declared as global.
In Visual Basic this is done by means of the statement

Public C As Integer

which is placed in a module. If we do this with the previous example the code becomes

Public C As Integer 'This is in the module

Private Sub cmdShow_Click()

Dim A As Integer
Dim B As Integer
Dim Perimeter As Integer

A = 3
B = 4
C = 5

picResults.Print "Before call to Sub A = "; A; " and B = "; B; " and C = "; C
Perimeter = PerimeterOfRectangle(A, B)
picResults.Print "Perimeter = "; Perimeter
picResults.Print "After call to Sub A ="; A; " and B = "; B; " and C = "; C

End Sub

Public Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer

C = 10

picResults.Print "In Sub C = "; C

X = 2 * X
Y = 2 * Y
PerimeterOfRectangle = X + Y

End Function

Fig. 4.5.4.6 shows that the value of C, when changed in the function PerimeterOfRectangle,
is changed in the calling routine also. In fact it is changed throughout the program.

Fig. 4.5.4.6

In C++ variables can be declared at any point in the program. This means that a variable
can be local to a small block of code. In the following example i is only available in the for
loop. The output statement after the loop is illegal as i no longer exists.

for (int i = 1; i<=10; i++)
{
cout<<i
}
cout<<i //This is illegal, i no longer exists.

We now have the ability to allow variables to be used only in certain parts of a program or
in any part. Global variables should be used as sparingly as possible as they can cause a
program to be very difficult to debug. This is because it is not always clear when global
variables are being changed.

What happens if a variable is declared as both global and local? The following code declares
C as global and C as local in the function PerimeterOfRectangle and the result of running it
is shown in Fig. 4.5.4.7. Notice that the value of C in the function cmdShow_Click is not
changed although its value is changed in PerimeterOfRectangle. This is because C is
declared as a local variable in this function and this means that the global C is not used.

Public C As Integer 'global declaration

Private Sub cmdShow_Click()

Dim A As Integer
Dim B As Integer
Dim Perimeter As Integer

A = 3
B = 4
C = 5

picResults.Print "Before call to Sub A = "; A; " and B = "; B; " and C = "; C
Perimeter = PerimeterOfRectangle(A, B)
picResults.Print "Perimeter = "; Perimeter
picResults.Print "After call to Sub A ="; A; " and B = "; B; " and C = "; C

End Sub

Public Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer

Dim C As Integer 'local declaration

C = 10

picResults.Print "In Sub C = "; C

X = 2 * X
Y = 2 * Y
PerimeterOfRectangle = X + Y

End Function

Fig. 4.5.4.7

4.5.5 Stacks and Procedures

When a procedure or function is called, the computer needs to know where to return to
when the function or procedure is completed. That is, the return address must be known.
Further, functions and procedures may call other functions and procedures which means
that not only must several return addresses be stored but they must be retrieved in the
right order. This can be achieved by using a stack. Fig. 4.5.5.1 shows what happens when
three functions are called after one another. The numbers represent the addresses of the
instructions following the calls to functions.

Main program
.
.
.
Call Function A Function A
100 …
.
. .
.
.
. Call Function B Function B
.
.
. 150 …
.
. .
.
.
End Return Call Function C Function C
250 …
.
. .
.
.
Return Return

Fig. 4.5.5.1

Notice that the addresses will be stored in the order 100, 150 then 250. When the returns
take place the addresses will be needed in the order 250, 150 then 100. That is, the last
address stored is the first address needed on returning from a function. This means that we
need a data structure that provides a last in first out facility. A stack does precisely this, so
we store the return addresses in a stack. In the above example, the addresses will be
stored in the stack each time a function is called and will be removed from the stack each
time a return instruction is executed. This is shown in Fig. 4.5.5.2.

Calls and Returns Stack

Call Function A

Stack pointer 100
Push return address onto stack

Call Function B

Stack pointer 150
100
Push return address onto stack

Call Function C

Stack pointer 250
150
100
Push return address onto stack

Return from C

250
Stack pointer 150
100
Pop return address off stack

Return from B

250
150
Stack pointer 100
Pop return address off stack

Return from A

250
150
100
Stack pointer NULL
Pop return address off stack

Fig.4.5.5.2

Now suppose that values need to be passed to, or from, a function or procedure. Again a
stack can be used. Suppose we have a main program and two procedures, Proc A(A1, A2)
and Proc B(B1, B2, B3). That is, A1 and A2 are the formal parameters for Proc A and B1, B2
and B3 are the formal parameters for Proc B. Now look at Fig. 4.5.5.3 which shows the
procedures being called and the return addresses that must be placed on the stack.

Main program
.
.
.
Call Proc A(X1,X2) Proc A(A1,A2)
200 … .
.
.
Call Proc B(Y1,Y2,Y3) Proc B(B1,B2,B3)
400 …
.
. .
.
.
Return Return

Fig. 4.5.5.3

Now let us suppose that all the parameters are being passed by value. Then, when the
procedures are called the actual parameters must be placed on the stack and the
procedures must pop the values off the stack and store the values in the formal
parameters. This is shown in Fig. 4.5.5.4; note how the stack pointer is moved each time an
address or actual parameter is popped onto or popped off the stack.

Stack pointer X2
X1
200
Call Proc A(X1, X2)
PUSH 200
PUSH X1
PUSH X2

X2
X1
Stack pointer 200
A2 = X2 (POP X2)
A1 = X1 (POP X1)

Stack pointer Y3
Y2
Y1
400
200
Call Proc B(Y1,Y2,Y3)
PUSH 400
PUSH Y1
PUSH Y2
PUSH Y3

Y3
Y2
Y1
Stack pointer 400
200
B3 = Y3 (POP Y3)
B2 = Y2 (POP Y2)
B1 = Y1 (POP Y1)

Y3
Y2
Y1
400
Stack pointer 200
Return from Proc B

Y3
Y2
Y1
400
200
Stack pointer NULL
Return from Proc A

Next we must consider what happens if the values are passed by reference. This works in
exactly the same way as the addresses of variables are passed so there is no need to
return the values via parameters. The procedures, or functions, will access the actual
addresses where the variables are stored. Finally, how do functions return values? Simply
push them on the stack immediately before returning. The calling program can then pop the
value off the stack. Note that the return address has to be popped off the stack before
pushing the return value onto the stack.

4.5.6 Object-Oriented Programming (OOP)

Section 4.5.1 showed some simple examples of programs written in an object-oriented
language. There are many such languages, some of which were designed as such (Eiffel,
Smalltalk) and others which have evolved (C++, Visual Basic). Java is an OOP language
whose syntax is based on C++. All these languages have classes and derived classes and
use the concepts encapsulation, inheritance and polymorphism. In this Section we consider
these concepts in a general way, using diagrams rather than any particular language.

Data encapsulation (or data hiding) has been explained in Section 4.5.1. It is the concept
that data can only be accessed via the methods provided by the class. This is shown in Fig.
4.5.6.1 where the objects, that is, instantiations of a class, are prevented from directly
accessing the data by the methods.

Fig. 4.5.6.1

Objects can only access the data by using the methods provided. An object cannot
manipulate the data directly. In the case of the rectangle class, an object of this class
cannot directly calculate its area. That is, we cannot write

area : = myRectangle.width * myRectangle.length;

where myRectangle is an object of type Rectangle.

To find the area of myRectangle, class Rectangle must provide a suitable method. The
example in Section 4.5.1 does not do this. The class Rectangle calculates the area when an
instance of a class is instantiated. The only way to find the area is to use the write( )
method which outputs the area. If a user wishes to access the width and length of a
rectangle, the class must provide methods to do this. Methods to return the width and length
are given below.
integer getWidth( ) {
getWidth := width;
}//end of getWidth method.

integer getLength( ) {
getLength := length;
}//end of getLength method.

myRectangle can now use these methods to get at the width and length. However, it cannot
change their values. To find the perimeter we can write

myWidth := myRectangle.getWidth( );
myLength := myRectangle.getLength( );
myPerimeter := 2 * (myWidth + myLength);

Thus, an object consists of the data and the methods provided by the class. The concept of
data being only accessible by means of the methods provided is very important as it
ensures data integrity. Once a class has been written and fully tested, neither its methods
nor the data can be tampered with. Also, if the original design of a method is found to be
inefficient, the design can be changed, unknowingly to the user, without the user's program
being affected.

Another powerful concept is that of inheritance. Inheritance allows the re-use of code and
the facility to extend the data and methods without affecting the original code. In the
following diagrams, we shall use a rounded rectangle to represent a class. The name of the
class will appear at the top of the rectangle, followed by the data followed by the methods.

Consider the class Person that has data about a person's name and address and a methods
called outputData( ) that outputs the name and address, getName( ) and getAddress( ) that
return the name and address respectively. This is shown in Fig. 4.5.6.2.

Fig. 4.5.6.2

Now suppose we want a class Employee that requires the same data and methods as Person
and also needs to store and output an employee's National Insurance number. Clearly, we
do not wish to rewrite the contents of the class person. We can do this by creating a class
called Employee that inherits all the details of the class Person and adds on the extra data
and methods needed. This is shown in Fig. 4.5.6.3 where the arrow signifies that Employee
inherits the data and methods provided by the class Person. Person is called the super-class
of Employee and Employee is the derived class from Person. An object of type Employee
can use the methods provided by Employee and those provided by Person.

Fig. 4.5.6.3

Notice that we now have two methods with the same name. How does the program
determine which one to use? If myPerson is an instantiation of the Person class, then

myPerson.outputData( );

will use the outputData( ) method from the Person class. The statement

myEmp.outputData( );

will use the method outputData( ) from the Employee class if myEmp is an instantiation of
the Employee class.

The concept of a method having two different meanings is called polymorphism.

Now suppose we have two types of employee; one is hourly paid and the other is paid a
salary. Both of these require the data and methods of the classes Person and Employee but
they also need different data to one another. This is shown in Fig. 4.5.6.4.

Fig. 4.5.6.4

How can an object of type Employee output the name and address as well as the N.I.
number? The outputData( ) method in class Employee can refer to the outputData( ) method
of its superclass. This is done by writing a method, in class Employee, of the form

void outputData( ) {
super.outputData( );
System.out.println("The N.I. number is " + NINumber);
}//end of outputData method.

Here super. outputData( ) calls the outputData( ) method of the super-class and then
outputs the N.I. number. Similarly, the other derived classes can call the methods of their
super classes.

In the above, we have explained the meanings of terms such as data encapsulation, class
and inheritance. However, sometimes the examiner may ask you to simply state the
meanings of these terms. In this case a simple definition is all that is required. Note also
that there will only be one (or possibly two) marks for this type of question. The following
definitions would be satisfactiory answers to questions that say 'State the meaning of the
term … '.

Definitions

Data encapsulation is the combining together of the variables and the methods that can
operate on the variables so that the methods are the only ways of using the variables..

A class describes the variables and methods appropriate to some real-world entity.

An object is an instance of a class and is an actual real-world entity.

Inheritance is the ability of a class to use the variables and methods of a class from which
the new class is derived.

4.5.7 Declarative Languages

In Section 4.5.1, we saw that, in declarative languages, the programmer can simply state
what is wanted having declared a set of facts and rules. We now look at how this works
using examples of Prolog scripts. In order to do this, we shall use the following facts.

female(jane).
female(anne).
female(sandip).
male(charnjit).
male(jaz).
male(tom).
parent(jane,mary).
parent(jane, rajinder).
parent(charnjit, mary).
parent(charnjit, rajinder).
parent(sandip, atif).
parent(jaz, atif).

Remember that variables must start with an uppercase letter; constants start with a
lowercase letter.

Suppose we ask
male(X).

Prolog starts searching the database and finds male(charnjit) matches male(X) if X is given
the value charnjit. We say that X is instantiated to charnjit. Prolog now outputs

X = charnjit

Prolog then goes back to the database and continues its search. It finds male(jaz) so
outputs

X = jaz

and again continues its search. It continues in this way until the whole database has been
searched. The complete output is

X = charnjit
X = jaz
X = tom
No

The last line means that there are no more solutions.

The query male(X) is known as a goal to be tested. That is, the goal is to find all X that
satisfy male(X). If Prolog finds a match, we say that the search has succeeded and the goal
is true. When the goal is true, Prolog outputs the corresponding values of the variables.

Now we shall add the rule

father(X, Y) :- parent(X, Y), male(X).

This rule states that X is father of Y if (the :- symbol) X is a parent of Y and (the comma) X
is male.

The database now looks like this.

female(jane).
female(anne).
female(sandip).
male(charnjit).
male(jaz).
male(tom).
parent(jane,mary).
parent(jane, rajinder).
parent(charnjit, mary).
parent(charnjit, rajinder).
parent(sandip, atif).
parent(jaz, atif).
father(X, Y) :- parent(X, Y), male(X).

Suppose our goal is to find the father of rajinder. That is, our goal is to find all X that satisfy

father(X, rajinder).

In the database and the rule the components female, male, parent and father are called
predicates and the values inside the parentheses are called arguments. Prolog now looks for
the predicate father and finds the rule

father(X, Y) :- parent(X, Y), male(X).

In this rule Y is instantiated to rajinder and Prolog starts to search the data base for

parent(X, rajinder)

This is the new goal. Prolog finds the match

parent(jane, rajinder)

if X is instantiated to jane. Prolog now uses the second part of the rule

male(X)

with X = jane. That is, Prolog's new goal is male(jane) which fails. Prolog does not give up
at this stage but backtracks to the match

parent(jane, rajinder)

and starts again, from this point in the database, to try to match the goal

parent(X, rajinder)

This time Prolog finds the match

parent(charnjit, rajinder)

with X instantiated to charnjit. The next step is to try to satisfy the goal

male(charnjit)

This is successful so

parent(charnjit, rajinder) and male(charnjit)

are true. Thus father(charnjit, rajinder) is true and Prolog returns

X = charnjit

Prolog continues to see if there are any more matches. There are no more matches so
Prolog outputs

No

A powerful tool in Prolog is recursion. This can be used to create alternative versions for a
rule. The Fig. 4.5.7.1 shows how ancestor is related to parent.

Fig. 4.5.7.1

This shows that X is an ancestor of Y if X is a parent of Y. But it also shows that X is an
ancestor of Y if X is a parent of Z and Z is a parent of Y. It also shows that X is an ancestor
of Y if X is a parent of Z , and Z is a parent of W and W is a parent of Y. This can continue
forever. Thus the rule is recursive. In Prolog we require two rules that are written as

ancestor(X, Y) :- parent(X, Y).
ancestor(X, Y) :- parent(X, Z),
ancestor(Z, Y).

The first rule states that X is an ancestor of Y if X is a parent of Y. In Fig. 4.5.7.1, this is
saying that a is an ancestor of b, b is an ancestor of c and c is an ancestor of d.

The second rule is in two parts. Let us see how it works using Fig.4.5.7.1 which represents
the database

parent(a, b).
parent(b, c).
parent(c, d).

by setting the goal

ancestor (a, c).

Prolog finds the first rule and tries to match parent(a, c) with each predicate in the
database. Prolog fails but does not give up. It backtracks and looks for another rule for
ancestor. Prolog finds the second rule and tries to match

parent(a, Z).

It finds

parent(a, b)

so instantiates Z to b.

This is now put into the second part of the rule to produce

ancestor(b, c).

This means that Prolog has to look for a rule for ancestor. It finds the first rule

ancestor(X, Y) :- parent(X, Y).

Thus Prolog looks for

parent(b, c)

and succeeds.

This means that with X = a, Y = c we have Z = b and the second rule succeeds. Therefore
Prolog returns Yes.

Now try to trace what happens if the goals are

ancestor(a,d)

and

ancestor(c, b).

You should find that the first goal succeeds and the second fails.

This form of programming is based on the mathematics of predicate calculus. Predicate
calculus is a branch of mathematics that deals with logic. All we have done in this Section is
based on the rules of predicate calculus. Prolog stands for Programming in LOGic and its
notation is based on that of predicate calculus. In predicate calculus the simplest structures
are atomic sentences such as

Mary loves Harry
Philip likes Zak

The words in italics are part of the names of relationships. We also have atomic conclusions
such as

Mary loves Harry if Harry loves Mary
John likes dogs if Jane likes dogs
Frank likes x if x likes computing

In the last atomic conclusion, x is a variable. The meaning of the conclusion is that Frank
likes anything (or anybody) that likes computing.

Joint conditions use the logical operators or and and, examples of which are

John likes x if x is female and x is blonde
Mary loves Bill or Mary loves Colin if Mary loves Don

The atomic formulae that serve as conditions and conclusions may be written in a simplified
form. In this form the name of the relation is written in front of the atomic formula. The
names of the relations are called predicate symbols. Examples are

loves(Mary, Harry)
likes(Philip, Zak)

We use the symbol ¬ to represent if and have

loves(Mary, Harry) ¬ loves(Harry, Mary)
likes(John, dogs) ¬ likes(Jane, dogs)

The and is represented by a comma in the condition part of the atomic conclusion. For
example

likes(John, x) ¬ female(x), blonde(x)

The or is represented by a comma in the conclusion. For example

female(x), male(x) ¬ human(x)

These examples show the connection between Prolog and predicate calculus. You do not
need to understand how to manipulate the examples you have seen any further than has
been shown in this Section.

AS with the previous Section we include here the definitions of terms used in this Section.
Remember, they can be used when a question says ''State the meaning of the term … '.

Definitions

Backtracking is going back to a previously found successful match in order to continue a
search.

Instantiation is giving a variable in a statement a value.

Predicate logic is a branch of mathematics that manipulates logical statements that can be
either True or False.

A goal is a statement that we are trying to prove whether or not it is True or False.
4.5.8 List Processing and Functional Languages

In Section 4.5.1 we introduced functional languages and saw how functions are defined in
the Haskell programming language. The important concept is pattern matching. Suppose we
have two numbers x and y and wish to output y if x is zero, otherwise output x. We can
write this as

output :: Int ® Int ® Int
output x, y
| x = = 0 = y
| otherwise = x

Here, the function called output uses guards to decide what to do. (Remember, guards are
like IF … THEN … ELSE … or CASE statements in other languages.) An alternative is to write
the function output in two parts.

output :: Int ® Int ® Int
output 0, y = y
output x, y = x

In this case pattern matching is used in a sequential order. If we write

output 0, 7

we will have output set equal to zero as

output 0, 7

matches with

output 0, y

As this match is successful, Haskell will not look at the next definition of output.

Now try

output 4, 7

Haskell tries to match this with

output 0, y

and fails. Haskell now continues its search and finds

output x, y

which produces a match if

x = 4 and y = 7

so output is set equal to 4.

In fact, in the second definition, y is never used so the definition can be written

output x, _ = x

The underscore is called a wild card and any value will match it.

Lists are written inside square brackets. The empty list is written [ ]. The list [2, 4, 6]
consists of three elements. In this case 2 is the head of the list and the list [4, 6] is the tail.
We can write

[2, 4, 6] = 2 : [4, 6]

We can also write

2 : 4 : 6 : [ ] =2 : 4 : [6]
=2 : [4, 6]
=[2, 4, 6]

Note the use of the empty list.

In general, if x is the head of a list, an xs is the tail we have

list = x : xs

Note that xs is a list. Thus, we have

x : y : zs = x : (y : zs)

That is, the colon operation is right associative.

let us now see how we can define a function head that returns the head of a non-empty list.

head :: [a] ® a
head [x : _ ] = x

Note the use of the underscore to represent the tail. This is because we are not interested in
the contents of the tail. Similarly we can define the tail.

tail :: [a] ® a
tail [_ : xs ] = xs

The letter a is used to represent a generic data type. That is, the list could be a list of Int, or
of char of any other data type.
Now let us see how to find the sum of a list on integers. We have

sum[1, 2, 4, 7, 11] = 1 + sum[2, 4, 7, 11]
= 1 + (2 + sum[4, 7, 11] )
= 1 + (2 + (4 + sum[7, 11] ))
= 1 + (2 + (4 + (7 + sum[11])))
= 1 + (2 + (4 + (7 + (11 +sum[ ]))))

But the sum of the empty list is zero. Thus we clearly have recursion as the sum of the list
is the head plus the sum of the tail of the list. The tail is steadily decreasing as each head is
removed so that the tail will eventually become the empty list. This gives us our stopping
value since the sum of an empty list is zero. Our function is

sum :: [Int] ® Int
sum [ ] = 0
sum (x : xs) = x + sum xs

Remember x is an Int and xs is a list, that is why there are no square brackets.

Let us trace this defintion for

sum [2, 3, 5]

sum [2, 3, 5]
match with sum [ ] and fail
match with sum ( x : xs) and succeed
sum = 2 + sum [3, 5]
match with sum [ ] and fail
match with sum ( x : xs) and succeed
sum = 3 + sum [5]
match with sum [ ] and fail
match with sum ( x : xs) and succeed
sum = 5 + sum [ ]
match with sum [ ] and succeed
sum = 0

sum = 5 + 0
= 5

sum = 3 + 5
= 8

sum = 2 + 8
= 10

Recursion is a very powerful and common feature of functional programming. Hence, here
is another similar example to clarify recursion.
Suppose we wish to find the product of a list of integers. That is

product [2, 3, 7] = 2 x 3 x 7 = 42

Clearly, this gives

product [ 2, 3, 7] = 2 x product [3, 7]
= 2 x (3 x product[7])
= 2 x (3 x (7 x product [ ]))

To make this work, we must define product [ ] as 1. We then have

product :: [Int] ® Int
product [ ] = 1
product (x : xs) = x * product xs

You should dry run this definition of product in the same way as sum was dry run to
convince yourself that it works.

Let us now look at a different problem. Suppose we wish to find the squares of each
element of a list of integers. That is, given the list

[2, 5, 3]

we wish to produce the list

[4, 25, 9]

Thus our input is a list of integers and our output is a list of integers. So we have

squares :: [Int] ® [Int]

Now

[2, 5, 3] = 2 : [5, 3] = 2 : (5 : [3]) = 2 : (5 : (3 : [ ]))

and

[4, 25, 9] = 4 : [25, 9] = 4 : (25 : [9]) = 4 : (25 : (9 : [ ]))

that is

squares (x : xs) = x2 : square xs
with
squares [ ] = [ ]

Our full definition is

squares :: [Int] ® [Int]
squares [ ] = [ ]
squares (x : xs) = x * x : square xs

The problem with recursion is that it makes very heavy use of memory. This is because it
has to store the values of the variables and the return address each time a call is made.
(See Section 4.5.5.) One way of overcoming this is to use tail recursion. That is, make the
recursive call the very last statement. This does not mean that it can be part of the last
statement, the call itself must be the last statement. Consider the following function.

fact :: Int ® Int
fact 0 =1
fact n = n * fact (n-1)

which calculates the factorial of n = n x (n – 1) x (n – 2) x … x 2 x 1.

When fact (n – 1) is called, the value of n must be stored so that when the value of fact (n –
1) is known the multiplication n * fact (n – 1) can be carried out. We want to avoid storing n
because the result

n x (n – 1) x (n – 2) x … x 2 x 1

cannot be evaluated until fact 0 is found. This can involve a lot of storage if n is large. Let us
define two new functions called factorial and newFact.

factorial :: Int ® Int
factorial n = newFact n, 1

newFact :: Int ® Int ® Int
newFact 0, p = p
newFact n, p = newFact (n – 1) (n * p)

Let us follow this by finding factorial 4 ( = 4 x 3 x 2 x 1).

factorial 4 = newFact 4,1
= newFact 3, 4
= newFact 2, 12
= newFact 1, 24
= newFact 0, 24
=24

which requires no storage of variables.

The following defines the sum of a list using tail recursion. Trace the solution of sumList [
2,5, 6,3].

sumList :: [Int] ® Int
sumList x : xs = newSum (x : xs) 0

newSum :: [Int] ® Int ® Int
newSum [ ] p = p
newSum (x : xs) p = newSum xs (x + p)

Head recursion is when the recursive call is made at the start of the function, immediately
after the decision that causes termination. This is very inefficient on storage as all variables
used in the function must be stored every time a call is made. Further examples of
definitions requiring tail and head recursion are given in Section 4.5.11.

As in the two previous Sections we include here two definitions that can be used to answer
questions of the form 'State the meaning of the term …'.

Defintions

Tail recursion is when the last statement in a function calls the function itself and is the only
content of the last statement.

Head recursion is when the call, to the function containing the call, is at the start of the
function and all other statements that manipulate the data are after the call.

4.5.9 Low Level Languages and the Use of Registers

Fig. 4.5.9.1 shows the minimum number of registers needed to execute instructions.
Remember that these are used to execute machine code instructions not high-level
language instructions.

Fig. 4.5.9.1

The program counter (PC) is used to keep track of the location of the next instruction to be
executed. This register is also known as the Sequence Control Register (SCR).

The memory address register (MAR) holds the address of the instruction or data that is to
be fetched from memory.

The current instruction register (CIR) holds the instruction that is to be executed, ready for
decoding.

The memory data register (MDR) holds data to be transferred to memory and data that is
being transferred from memory, including instructions on their way to the CIR. Remember
that the computer cannot distinguish between data and instructions. Both are held as binary
numbers. How these binary numbers are interpreted depends on the registers in which they
end up. The MDR is the only route between the other registers and the main memory of the
computer.

The accumulator is where results are temporarily held and is used in conjunction with a
working register to do calculations.

The index register is a special register used to adjust the address part of an instruction. This
will be explained in more detail later.

Note that the diagram does not show the control bus and the signals needed for instructions
to be correctly executed. These are not required for this examination.

We shall now see how these registers are used to execute instructions. In order to do this
we shall assume that a memory location can hold both the instruction code and the address
part of the instruction. For example, a 32-bit memory location may use 12 bits for the
instruction code and 20 bits for the address part. This will allow us to use up to 212 (= 4096)
instruction codes and 220 (= 1 048 576) memory addresses.

To simplify things further, we shall use mnemonics for instructions such as

Code Meaning
LDA load the accumulator
STA store the accumulator
ADD add the contents of memory to the accumulator
STOP Stop

We shall also use decimal numbers rather than binary for the address part of an instruction.

Suppose four instructions are stored in locations 300, 301, 302 and 303 as shown in the
following table and that the PC contains the number 300.

Address Contents Notes
.
.
. .
.
.
300 LDA 400 Load accumulator with contents of location 400
301 ADD 401 Add contents of location 401 to accumulator
302 STA 402 Store contents of accumulator in location 402
303 STOP Stop
304
.
.
. .
.
.
400 5 Location 400 contains the number 5
401 7 Location 401 contains the number 7
402 ? Not known what is in location 402
.
.
. .
.
.

The fetch part of the instruction is

Action PC MAR CIR MDR
Copy contents of PC to MAR 300 300 ? ?
Add 1 to PC 301 300 ? ?
Copy contents of location pointed to by MAR into MDR 301 300 ? LDA 400
Copy instruction in MDR to CIR 301 300 LDA 400 LDA 400

The instruction is now decoded (not shown in the table) and is interpreted as 'load the
contents of the location whose address is given into the accumulator'.

We now start the execution phase. As the contents of an address are needed, the address
part of the instruction is copied into the MAR, in this case 400.

Action PC MAR CIR MDR
Copy address part of instruction to MAR 301 400 LDA 400 LDA 400

Now use the MAR to find the value required and copy it into the MDR.

Action PC MAR CIR MDR
Copy contents of address given in MAR to MDR 301 400 LDA 400 5

Finally copy the contents of the MDR to the accumulator.

Now use the same steps to fetch and execute the next instruction. Note that the PC already
contains the address of the next instruction.

Note that all data moves between memory and the MDR via the data bus. All addresses use
the address bus.

A summary of the steps needed to fetch and execute the LDA instruction are shown in Fig.
4.5.9.2

Fig. 4.5.9.2

In Fig. 4.5.9.3, what happens during the execute cycle depends on the instruction. For
example, the STA n (store the contents of the accumulator in the location with address n)
has the execute steps shown in Fig. 4.5.9.3.

Fig. 4.5.9.3

This process works fine but only allows for the sequential execution of instructions. This is
because the PC is only changed by successively adding 1 to it. How can we arrange to
change the order in which instructions are fetched? Consider these instructions.

Address Contents Notes
.
.
. .
.
.
300 ADD 500 Add the contents of location 500 to the accumulator
301 JLZ 300 If accumulator < 0 go back to the instruction in location 300
.
.
. .
.
.

Suppose the PC contains the number 300, after the instruction ADD 500 has been fetched
and executed the PC will hold the number 301. Now the instruction JLZ 300 will be fetched in
the usual way and the PC will be incremented to 302. The next step is to execute this
instruction. The steps are shown in Fig. 4.5.9.4.

Fig. 4.5.9.4

So far we have used two copy instructions (LDA and STA), one arithmetic instruction (ADD)
and one jump instruction (JLZ). In the case of the copy and arithmetic instructions, the
address part has specified where to find or put the data. This is known as direct addressing.

An alternative method of using the address part of an instruction is called indirect
addressing. Here the address part of the instruction specifies where to find the address to
be used. Fig. 4.5.9.5 shows how this type of instruction is executed if the instruction is LDI
(load the accumulator indirectly).

Fig. 4.5.9.5
The advantage of this mode of addressing is that the actual address used in our example
can be the full 32 bits giving 232 addresses.

Often it is necessary to access successive memory locations for data. Suppose we wish to
add a series of numbers stored in locations with addresses 600 to 609. We do not want to
write a load instruction followed by nine add instructions. After all, what would happen if we
wished to add 100 numbers? We can solve this problem by using indexed addressing. We
could have an instruction, say ADDX, which uses index addressing. Index addressing uses
an index register that the programmer initially sets to zero. Index addressing adds the
contents of the index register to the address part of the instruction before using the
address. After each add instruction is executed, the programmer increments the index
register (IR).

Consider the instruction, in location 300,

ADDX 700

and let the contents of the index register be 5. The situation is

PC MDR CIR MAR IR
301 ADDX 700 ADDX 700 700 5

Before the ADDX instruction is executed the contents of the IR must be added to the MAR to
give

PC MDR CIR MAR IR
301 ADDX 700 ADDX 700 705 5

Thus the contents of address 705 are added to the accumulator. The programmer then
increments the IR to make it 6 so that the next time the ADDX 700 instruction is executed
the addressed used will be 706.

There are other addressing modes such as immediate addressing (the address part
specifies the value to be used), relative addressing (the address part specifies where the
location to be used is relative to where the current instruction is stored) and register
addressing (the address is held in a special register). However, none of these are in the
OCR specification.

4.5.10 Third and Fourth Generation Languages

Third generation languages are those that use a structured syntax such as C, C++ and
Pascal. Early versions of Fortran and BASIC were not structured and are usually treated as
second generation languages. However, Visual Basic is structured and can be treated as a
third generation language.

Third generation languages need the user to specify clearly all the steps that need to be
taken to solve a problem. Fourth generation languages do not do this. Languages that
accompany modern database, word processing and spreadsheet packages do not need the
user to do this. The users of these packages tell the application what they want to do not
how to do it. An example is mail merge . Here all the user has to do is tell the software what
table or database to use and the mail merge will take place. Databases often use query by
example (QBE). Here the user simply states what is required and the software will do the
task. For example, Microsoft Access lets a user specify conditions such as DOB < 01/01/90
and the necessary coding will be done. In fact Access uses the Structured Query Language
(SQL) to create the queries. Consider the following table called Students.

name height weight
Alan 150 31.2
Brenda 140 27.8
Charnjit 148 30.7
Dalvinder 152 32.8
Elmira 143 28.1
Frank 158 33.4
Georgina 151 28.2

Now suppose we wish to find the names of all those students who have a height greater
than 150. In Access we could simply create a query with columns for name and height and
in the height column we would write

> 150

for the criteria. We could also specify, by means of a check box, that only the name should
be printed. The result would be

Dalvinder
Frank
Georgina

In fact, we can write the query in SQL as

SELECT name
FROM Students
WHERE height > 150;

This is what Access does.

Notice that we do not have to give the steps needed to check each entry in the table
Students. A more complicated query is

SELECT name
FROM Students
WHERE height > 145
AND
weight > 32;

Again, we do not tell the computer exactly how to find the answer required as we would with
a third generation language.

The development of fourth generation languages has meant that people who are not
programmers can produce useful results.

4.5.11 Backus Naur Form and Syntax Diagrams

Since all programming languages have to be translated to machine code by means of a
computer, they must be clearly defined. Each statement must be of a prescribed form. An
example of the start of a FOR loop in Visual Basic is

For count = 1 To 10

but C++ expects

for (count = 1, count <= 10, count++)

A Visual Basic compiler would not understand the C++ syntax and vice versa. We therefore
need, for each language, a set of rules that specify precisely every part of the language.
These rules are specified using Backus Naur Form (BNF) or syntax diagrams.

All languages use integers, so we shall start with the definition of an integer. An integer is a
sequence of the digits 0, 1, 2, … , 9. Now the number of digits in an integer is arbitrary. That
is, it can be any number. A particular compiler will restrict the number of digits only because
of the storage space set aside for an integer. But a computer language does not restrict the
number of digits. Thus the following are all valid integers.

0
2
415
3040513002976
0000000123

Thus, an integer can be a single digit. We can write this as

<integer> ::= <digit>

This is read 'an integer is defined to be (::=) a digit'.

But we must now define a digit. A digit is 0 or 1 or 2 or … or 9 and we write this as

<digit> ::= 0|1|2|3|4|5|6|7|8|9

where the vertical line is read as or. Notice that all the digits have to be specified and that
they are not inside angle brackets (< and >) like <integer> and <digit>. This is because
integer and digit have definitions elsewhere; the digits 0, 1, 2, … , 9 do not.

Our full definition of a single digit integer is

<integer> ::= <digit>
<digit> ::= 0|1|2|3|4|5|6|7|8|9

This is called Backus Naur Form (BNF).
But how are we going to specify integers of any length? Consider the integer

147

This is a single digit integer ( 1 ) followed by the integer 47. But 47 is a single digit integer (
4 ) followed by a single digit integer ( 7 ). Thus, all integers of more than one digit start with
a single digit and are followed by an integer. Eventually the final integer is a single digit
integer. Thus, an indefinitely long integer is defined as

<integer> ::= <digit><integer>

This is a recursive definition as integer is defined in terms of itself. Applying this definition
several times produces the sequence

<integer> ::= <digit><integer>

=<digit><digit><integer>

=<digit><digit><digit><integer>

To stop this we use the fact that, eventually, <integer> is a single digit and write

<integer> ::= <digit>|<digit><integer>

That is, <integer> is a <digit> or a <digit> followed by an <integer>. This means that at
any time <integer> can be replaced by <digit> and the recursion stops. Strictly speaking
we have defined an unsigned integer as we have not allowed a leading plus sign ( + ) or
minus sign ( - ). This will be dealt with later. We now have the full definition of an unsigned
integer which, in BNF, is

<unsigned integer> ::= <digit>|<digit><unsigned integer>
<digit> ::= 0|1|2|3|4|5|6|7|8|9

This definition of an unsigned integer can also be described by means of syntax diagrams as
shown in Fig. 4.5.11.1.

Fig. 4.5.11.1

Now we shall define a signed integer such as

+27
-3415

This is simply an unsigned integer preceded by a + or – sign. Thus

<signed integer> ::= + <unsigned integer>| - <unsigned integer>

and we can use the earlier definition of an <unsigned integer>. It is usual to say that an
integer is an unsigned integer or a signed integer. If we do this we get the following
definition, in BNF, of an integer.

<integer> ::= <unsigned integer>|<signed integer>
<signed integer> ::= + <unsigned integer>| - <unsigned integer>
<unsigned integer> ::= <digit>|<digit><unsigned integer>
<digit> ::= 0|1|2|3|4|5|6|7|8|9

There are other valid ways of writing these definitions. However, it is better to use several
definitions than try to put all the possibilities into a single definition. In other words, try to
start at the top with a general definition and then try to break the definitions down into
simpler and simpler ones. That is, we have used top-down design when creating these
definitions. We have broken the definitions down until we have terms whose values can be
easily determined.

Fig.4.5.11.2 shows the corresponding syntax diagrams.

Fig.4.5.11.2

Care must be taken when positioning the recursion in the definitions using BNF. Suppose we
define a variable as a sequence of one or more characters starting with a letter. The
characters can be any letter, digit or the underscore. Valid examples are

A
x
sum
total24
mass_of_product
MyAge

Let us see what happens if we use a similar definition to that for an unsigned integer.

<variable> ::= <letter>|<character><variable>
<character> ::= <letter>|<digit>|<under-score>

Now 2Sum is valid as we use

<character><variable>

with <character> = 2 and <variable> = Sum. Continuing in this way we use 2, S and u for
<character> and then m for <letter>. This means that our definition simply means that we
must end with a letter not start with one. We must rewrite our definition in such a way as to
ensure that the first character is a letter. Moving the recursive call to the front of
<character> can do this. This means that the last time it is called it will be a letter and this
will be at the head of the variable. The correct definition is

<variable> ::= <letter>|<variable><character>
<character> ::= <letter>|<digit>|<under-score>
<letter> ::= <uppercase>|<lowercase>
<uppercase> ::= A|B|C|D|E|F|G|H|I|J|K|ZL|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z
<lowercase> ::= a|b|c|d|e|f|g|h|i|j|k|zl|m|n|o|p|q|r|s|t|u|v|w|x|y|z
<digit> ::= 0|1|2|3|4|5|6|7|8|9
<under-score> ::= _

A syntax diagram can also represent this. This is left as an exercise. You should also note
that, in the definition of integer, we used tail recursion, but here we have used head
recursion.

Let us now use our definition of an integer to define a real number such as

0.347
-2.862
+14.34
00235.006

The result is very simple, it is

<real number> ::= <integer> . <unsigned integer>

Finally, suppose we do not want to allow leading zeros in our integers. That is

00135 is not allowed
0 is allowed.

This means that an integer can be a

zero digit
non-zero digit
non-zero digit followed by any digit.

This means that an integer is

zero or digits

where digits must start with a non-zero digit. In BNF, this is

<integer> ::= <zero>|<digits>

<digits> must be a single non-zero digit or a non-zero digit followed by any digits. This
gives us

<digits> ::= <non-zero digit>|<digits><digit>

where

<zero> ::= 0
<non-zero integer> ::= 1|2|3|4|5|6|7|8|9
<digit> ::= <zero>|<non-zero digit>

Fig. 4.5.11.4 shows the corresponding syntax diagram.