|
Programming Tools and Techniques 1.3.1 Structure The solving of problems using computing techniques relies heavily on being able to divide the problem into a series of steps that should be solved in a sequence in order to solve the entire problem. The sequence of steps that need to be carried out in order to solve a problem, which takes into account the order of the steps, is called an algorithm. There are a number of ways of presenting such algorithms. Consider the problem: How to calculate the perimeter of a rectangle when it has length (l) and breadth (b). We all know that the answer is obtained by simply finding the lengths of the four sides and adding them up. Indeed, the last sentence is an algorithm. It tells you how to solve the problem that has been set and if the instructions are followed, the answer should be worked out. This is a prose solution to the problem because the answer is given as though it is spoken language. While it is a perfectly good description of the algorithm here, the problem is a very simple one. The difficulty with spoken language is that it is hard to describe more complex solutions because of ambiguities. Consider some other methods for writing down an algorithm. 1. In maths we use formulae for showing the solutions to simple problems, in this case Perimeter = length + breadth + length + breadth or Perimeter = 2 * (length + breadth) It is important to realise that there is no single right answer. Either of these formulae will work, and when we are considering an algorithm some may be more efficient than others, but if they work out the answer to the problem then that’s enough. A simple algorithm like this can sometimes be expressed as a single formula. 2. There are two things that have to be done to produce an algorithm to solve this problem. We have to times by two, and we have to add the length and breadth together. However there is one further factor, they must be done in the right order or the answer will be wrong. This leads to another way of writing down an algorithm: write down the steps and number them so that they are done in numerical order. In this case the algorithm is 1. Find the length and breadth 2. Add the length and the breadth together 3. Multiply the result by 2 4. End 3. Another way of showing this order is to use arrows pointing from one instruction to the next The example shows the individual instructions in nice neat little boxes, this is certainly not necessary, and also uses the normal rule which of reading down a page. However, providing the arrows are in the correct order then the instructions can be placed anywhere. This type of representation of the algorithm is known as a flow diagram. There are many other ways of defining the solutions to problems, some of which you will meet later in this course. 1.3.2 Techniques for Writing Software. When a piece of software needs to be produced, the problem to be solved is likely to be far more complex than the calculation of the perimeter of a rectangle. An algorithm like that is easy for a human being to be able think about because it is so short, however, most useful problems involve such complex algorithms that it is not possible to consider them in one go. For this reason problems are normally divided up into a number of smaller problems, each one of which can be solved separately, and then combined to give a solution to the whole problem. Consider that you have been asked to produce a solution to the problem of computerising a set of traffic lights in the centre of a town. The problem is a very complex one, but if it is divided into: How is the data going to be collected from sensors and stored in the system? How is the decision going to be made about the changing of the lights? How is the data collected from the pedestrian crossing and then stored.. and processed? What outputs are necessary, and how are they controlled? Then each problem becomes more manageable. This sort of approach to problem solving is called a top-down approach, in that we started with an original problem (the top) and split the problem into smaller and smaller parts until they became manageable. When each of the individual problems has been solved the solutions are combined to produce a solution to the whole problem. These individual parts of the solution are known as modules. There are other advantages in this sort of approach. More than one person can be engaged on solving parts of the same problem simultaneously, consequently the problem should be solved more quickly. Different people are good at different things, so our company uses Vicki to solve the problem of collecting the information from the sensors, because Leon is not very good at that, whereas he is an expert concerning controlling things using a computer, so he does the processing module. Last month, our company produced a similar solution for a set of traffic lights in a different town. The sensors used were different, so Vicki still has to produce her module, and the number of roads is different so Leon will still need to come up with a solution for his module. However, the lights and other output devices being controlled are the same, so we might as well use the same module that we used last time. This happens often, that one of the modules is the same as one used previously. Because of this, software firms store the modules that they have produced so that they can be used again if the chance arises. Collections of modules like this are known as software libraries. Because the algorithms for the modules are so much shorter than for the whole problem, it is likely that the person producing a module will make fewer mistakes. Also, because the module is quite short, it is much easier to spot any errors that have been made, and correct them. This means that the finished software should be more reliable when it is being used. One way of describing the modularisation of a problem is to use a diagram to show how the parts fit together. Don’t worry about such diagrams at the moment, we will meet them again in module 3 and they will become important when we want to explain the solutions of the major project in module 5 of this course. 1.3.3 Programming Constructs As we saw in the first section of this module, the solution to a problem can often be thought of as a sequence of steps which should be taken in order. The same is true when a program is being written. Generally, you start at the beginning and follow the steps, in order, until you come to the end, by which time the problem should have been solved. Looked at diagrammatically, a computer program could be drawn as Start Program instructions End This type of combination of operations is known as a sequence. However, the real power of computer programs becomes apparent when we change these rules. There are a number of ways of breaking out of the simple sequence, we will look at two of them here. 1. Selection Selection is the process of making a choice. The simplest example is the command If…Then…Else… This is used to divide the sequence into two possible routes. There is a condition (something like an answer being more than 10) and then two possible outcomes, one is followed if the condition is met (for instance, use a discount price because of the number bought), and the other is followed if the condition is not met (use the standard price). Diagrammatically, this can be represented as Start Start Condition Compare answer with 10 Met Not met Answer >10 Answer <= 10 This example simply gives two options, the condition statement can have a number of possible results, each giving a different route. Note that, eventually, all these different routes must either come back together or must have some suitable end point. 2. Repetition There are a number of ways of repeating the same instructions more than once. We will have a look at three methods here. Each one of these methods have two things in common. The first is some sort of pointer that sends control of an algorithm to some point other than the next instruction in the sequence, and the second is some method of controlling whether the pointer is followed or whether the normal sequence of instructions is followed. A repeating construct is often called a loop. a) The first of these loops is one where the control of the loop is at the end of the loop. It is called a REPEAT…UNTIL loop. Keep repeating the following commands (LOOP) Instructions inside loop Until some condition is met, (Condition not met) then exit loop (condition met) Note that the instructions in the loop have to be carried out at least once before there is the chance of meeting the condition. b) The second loop structure is a WHILE…ENDWHILE loop. This structure has the condition at the start so that it is possible to miss out the instructions in the loop entirely. (Condition not met) While the condition is met Do the following commands (Condition met) (LOOP) Instructions Go back to the start of the loop c) The third type of loop is called a FOR…NEXT loop. This loop is repeated a predetermined number of times. A counter keeps check of the number of times the loop has been done, and when it has done enough the loop is exited. (Decide how many times to do the loop) If the counter is less than, or equal to, the number of times selected, do the loop again Add 1 to the counter (LOOP) Go back to the start of the loop Note that with the For/Next loop the number of times around the loop is determined before entry into the loop. 1.3.4 Recursion and 1.3.5 Procedures and functions As we saw in section 1.3.2, it is useful to be able to divide a problem into modules so that each part of the problem is easier to solve. When the algorithm is turned into a computer program it makes sense to program each of these smaller modules separately and then connect them together to make the whole program. These small segments of program are called procedures and functions. To start with we will call them all procedures, they are very similar anyway, and later we’ll see what the difference is. Imagine a computer program. It is made up of a number of instructions taken in order. In section 1.3.3 we saw some constructs that could change the simple order, but for the sake of clarity we will ignore them and return to our original diagram of a program: Start Program instructions End If the program has been split into sections by modularising the algorithm, it is more likely to look like this: Start Start of Module 1 Start of module 2 End of module 1 End of module 2 End Each of the modules is called a procedure when it is in the form of a program of instructions. The diagram shows a main program which, just under half way through, runs a procedure to carry out a particular task, which itself runs another procedure. Note two things about these procedures: 1. If they work something out, presumably they must pass the answers back to the main program, otherwise they have been wasting their time. 2. When the procedure has been finished the next instruction is back where it came from in the first place. This means that the computer must remember whereabouts it was when it went to the procedure so that it can go back there. We are not going to answer either of these questions now, but will return to them later. A procedure could be used for putting a menu on the screen, and then another could be used for reading the choice made and acting upon it. At some point a procedure could be used to work out the square root of a number. This is slightly different because the result at the end of the procedure is a value which, presumably, will be used in the main program. A procedure which gives a value at the end is known as a function. Recursion Imagine a procedure which, instead of pointing to another procedure, points to itself so that it starts itself again. It would look like this: Start Start of procedure End of procedure End This sort of procedure has to be handled very carefully because it may get stuck in a never ending loop. However, the concept of a procedure pointing to itself is a very powerful one in programming. It is given a special name, procedures that point to themselves are called recursive, and the process of doing this is known as recursion. Recursion will be looked at again later in the course. 1.3.6 Program Translation When a human being writes a computer program it is important for them that they can use a language which is easy to understand. After all, the algorithm will probably be difficult enough to follow without the language being difficult as well. Computer languages like this, that the person finds easy to understand, but the computer can’t, are called high-level languages. A computer can only understand 0s and 1s, in other words, binary numbers. So the language which the computer uses is all in binary and very difficult for human beings to use. Languages like this are called machine code languages. If the program is written in a high level language and the computer can only run it in machine code, then it has to be translated from one to the other, just as a letter in German may need to be translated into English if an English speaker is to be able to read it. High-level Machine Language Translator code A program which is written in a high level language is called the source code. After the source code has been translated the program is called the object code (or executable form). Translator The object code is normally much bigger than the source code because it takes lots of machine code commands for each of the high level language commands. There are two types of translator, one is called a compiler and the other is called an interpreter. We do not need to know any more about these two translators yet. 1.3.7 Types of Programming Errors When programs are being written it is not surprising that mistakes are made, after all they are very complicated. There are three types of error which can be made when program code is being produced. 1. Syntax errors. These are errors in the use of the language. The rules have been broken. A command word might be PRINT. If it was typed in as PLINT the translator program would not recognise it and so would not be able to translate it. This is not a difficult error to spot because when the programmer tells the translator to translate it, a message will be returned saying that a mistake has been made and telling the programmer which word it cannot recognise. There are other ways of breaking the rules, for instance a language might be set up to only accept a mathematical formula if it is in the form of a variable followed by an equals sign and then a collection of variables. So X = 3*(2+A) would be accepted by the translator, but 3*(2+A) = X would be rejected as an error even though, mathematically, it is identical. 2. Logic errors. A logic error is a mistake in the way the program solution has been designed. For example, an instruction in a program may tell the computer to jump to another part of the program. This is fine as long as the instruction tells the computer to go to the right part of the program. If it tells the computer to go to the wrong place, it is highly unlikely that the program will produce the results that were expected. Unfortunately, such a mistake does not usually make the program stop working, it just makes it produce wrong results. This makes it difficult to spot that a mistake has been made, and even when the programmer realises that something has gone wrong, finding where the error is can be very difficult. Another sort of logic error is sometimes referred to as an arithmetic error. 3. Arithmetic error. This is a type of logic error. Arithmetic errors are created when inappropriate arithmetic is used. A good example is when a computer is asked to divide by 0, this is impossible and would cause the program to fail. 1.3.8 Testing Methods When a system is designed it is important that some consideration is given to making sure that no mistakes have been made. A schedule should be drawn up which contains a test for every type of input that could be made and methods of testing that the program actually does what it was meant to do. This schedule is known as the test plan. Note that it is produced before the system is produced. There are a number of ways of testing a program. 1. Different values can be input for variables to determine whether the program can cope with them. These values should include typical values, borderline values and values which are not acceptable. For example, if a program is written which uses marks out of 100 from a maths examination as input, the test data would include typical data like 27, 73.., borderline data which would be 0 and 100, and unacceptable data like –34, 123, 16.345 This type of testing is called black box testing. 2. White box testing is testing the program to determine whether all the possible paths through the program produce the desired results. As a large program can have a very large number of routes, when you take into account the different condition statements and loops, white box testing is rarely carried out exhaustively. 3. Alpha and Beta testing. When you have written a program and you sit down to test it, you have a certain advantage because you know what to expect. After all, you wrote the program. This can be extended to the whole software company, as the employees are all computer-minded people. Testing carried out by people like this is known as alpha testing. Eventually, the company will want ordinary users to test the program because they are likely to find errors that the software specialists did not find. Testing carried out by the users of the program is called beta testing. If you continue the course next year, in order to turn your AS grade into an A level, you will have to write a project. This involves using a computer to solve a problem for someone. When you have finished your project you will be expected to test whether or not it works, this is alpha testing. You will also get marks if you persuade the person whose problem you are solving to test it for you, this is the beta testing. 1.3.9 Debugging Errors in computer solutions are called bugs. They create two problems. One is correcting the error, this is normally fairly straightforward because most errors are caused by silly mistakes. The second problem, however, is much more complicated, the errors have to be found before they can be corrected. Finding where the error is, identifying it, can be very difficult and there are a number of techniques available for solving such problems. 1. Translator diagnostics. In section 1.3.6 we were introduced to the idea of a translator program. Each of the commands that are in the original program are looked at separately by the translator. Each command will have a special word which says what sort of command it is. The translator looks at the special word in the command and then goes to its dictionary to look it up. The dictionary tells the translator program what the rules are for that particular special word. If the word has been typed in wrongly, the translator will not be able to find it in the dictionary and will know that something is wrong. If the word is there, but the rules governing how it should be used have not been followed properly, the translator will know that there is something wrong. Either way, the translator program knows that a mistake has been made, it knows where the mistake is and, often, it also knows what mistake has been made. A message detailing all this can be sent to the programmer to give hints as to what to do. These messages are called translator diagnostics. 2. Sometimes the program looks alright to the translator, but it still doesn’t work properly. Debugging tools are part of the software which help the user to identify where the errors are. The techniques available include: a) Cross-referencing. This software checks the program that has been written and finds places where particular variables have been used. This lets the programmer check to make sure that the same variable has not been used twice for different things. b) Traces. A trace is where the program is run and the values of all the relevant variables are printed out, as are the individual instructions, as each instruction is executed. In this way, the values can be checked to see where they suddenly change or take on an unexpected value. c) Variable dumps. At specified parts of the program, the values of all the variables are displayed to enable the user to compare them with the expected results. 3. Desk checking is sometimes known as a dry run. The user works through the program instructions manually, keeping track of the values of the variables. Most computer programs require a very large number of instructions to be carried out, so it is usual to only dry run small segments of code that the programmer suspects of harbouring an error. 4. In section 1.3.2 we discussed the splitting up of a problem into smaller and smaller parts, until each part was a manageable size. When the parts were combined they would produce a solution to the original problem. Because we are starting with a big problem and splitting it into smaller problems, this was called top-down design. When the program is written, each small program is written separately, allowing the small programs to be tested thoroughly before being combined. This is called bottom-up programming. The technique is particularly useful for testing the finished program because it is far easier to test a lot of small programs than it is to test one large one. One problem can arise, because the small programs have to be joined together, these joints have to be tested too to make sure that no silly mistakes have been made like using the same variable name for two different things in two parts of the program (tested by cross referencing). 5. Test strategies are important to establish before the start of testing to ensure that all the elements of a solution are tested, and that unnecessary duplication of tests is avoided. 1.3.10 Annotation Program writers need to be aware that the programs that they write will need to be maintained long after they have gone on to other projects. For example, a program is written to control the stock in a warehouse. The program has been fully tested and any problems appear to have been solved. Some months later, a particular chain of events occurs that means that the software does some instructions in an order that has never arisen before, and an error occurs. Someone has to find the error in the program. Some time later, the warehouse expands in order to deal with an increased supply of goods and the computer system needs to be amended to allow enquiries from the Internet. This again implies that the program will need to be altered to accommodate the new circumstances. In both these cases it is highly unlikely that the original programmer will be the person that has to do the adapting of the code. The new programmer is going to be grateful for all the help they can get in understanding the old program. Help given in the code itself is known as annotation, the purpose of which is to make the code as simple as possible to follow. One way to help demonstrate the structure of the program is to make parts of the code stand out from the rest by indentation. If this is done without some thought then it is meaningless, but if a procedure of a program is made to stand out in this way then it is very easy to see whereabouts the procedure is. If there is a loop structure, perhaps a While…Endwhile loop inside the procedure, then that can also be made to stand out by using a further indent. Many programming languages automatically indent some of the programming constructs available in the language. Another way is to write comments in the code of the program which are aimed at a human being looking at the program, rather than at the computer that is running it. Most programming languages have an instruction that can be used for this. It simply tells the computer that anything that comes in this instruction is to be ignored. A further technique is to use sensible variable names. If an input is required at some point in the program, and the input is the number of marks that the student obtained in their maths exam, the variable used to store the number could be many things. However, if the variable name X was used, another programmer looking at the code would need extra information to understand what X stood for. If the original programmer had used the variable name MATHS_MARK, the name is self-explanatory and hence makes the program easier to follow. The other obvious way of making the program easier to follow is to adopt a top-down design which enables the individual procedures to be easier to understand than one large program. Example Questions. 1. A roll of wallpaper is 50cm wide. Each roll is 15m long. A room measures x metres long by y metres wide and 2.5 metres high. Ignoring the door and any windows that there may be, describe an algorithm for calculating the number of rolls of wallpaper needed to paper the room. (4) A. -Add twice x to twice y/Work out total length of walls -Divide by .5metres/work out the number of strips of paper needed -Divide 15metres by 2.5metres/work out the number of strips in one roll of paper -Divide number of strips needed by the number of strips in one roll -Indication of the order required. Note that the mark scheme is not presented in any particular algorithmic form. This is because the syllabus does not specify that a particular form be used so it is entirely up to the candidate, provided that the two points about stating the tasks to be carried out and specifying an order are met. Also notice that the first three steps of the algorithm are presented in two ways. One could be called mathematical while the other is descriptive. Either is perfectly acceptable. One last point about the mark scheme is that an algorithm implies that order is important, so it is highly likely that in an exam, such a question would insist on an order as an essential feature of the solution. In other words one mark would be reserved for providing a definite statement of order. 2. Set the counter to 0 and the total to 0 Read a number Add 1 to the counter Add the number to the total Yes Are there any more numbers? No Divide the Total by the counter Report the answer Describe the effect of this algorithm. (4) A. -An unknown number of numbers… -is read into the algorithm. -When there are no more numbers to read -The average (mean) is worked out -The answer is then output. Producing an algorithm to anything other than a trivial problem under exam conditions is difficult. One solution to this problem is for the examiner to produce the algorithm for the candidate and then expect the candidate to follow the logic. The example here is a very simple one but illustrates two points. First, the fact that whatever the algorithm is it will be short, and secondly that while many algorithms are mathematically based the mathematics is not necessary to attain full marks. When answering a question like this pay particular attention to the number of marks available, and be careful not to simply write down what is given in the algorithm. E.g. no marks would be given here for saying that one is added to the counter. 3. Give three advantages to using a top down approach when designing a solution to a problem. (3) A. -Many people can be involved in the solution -Individual skills can be used -A module can be used from the software library -Each module is far easier to understand than the whole program is. -Fewer errors are likely to be made -Any errors that are made will be easier to correct. A standard question, whose answer jumps from the notes. 4. A piece of software is to be written which will control a central heating system. It is decided that the software is to be written as a set of four modules. State four modules, whose combined solution would be a solution for the whole problem. A. -Handling of the input from the sensors -Interface with the owner of the system allowing input of important values -Processing module which makes decisions about the heating requirements -Control of the output devices in order to carry out the processing decisions -Alarm system in case of errors occurring. 5. X = 1 WHILE X < 9 DO X = X + 1 Y = X * X PRINT Y ENDWHILE The intention of this module is to print out the square numbers from 1 to 100. a) Write down the output that the module will produce. (2) A. -- 4,9,16,25,36,49,64,81 - An indication that the candidate realises that the first value of X that is used is 2 instead of 1 - An indication that the last value used is 9 when it should have been 10. Notice that the arithmetic is not being tested here. It is possible to get full marks even if you can’t multiply. The examiner is not trying to test your mental arithmetic but is seeing whether you can follow a small section of logic. b) Using the same structure, produce a similar module that will work. (2) A. - Change the X = 1 to X = 0 - Change the WHILE X<9 DO to WHILE X<10 DO These are not unique possibilities. There are plenty of other ways of achieving the same results. The mark scheme is really one mark for correcting each of the output errors in the original. 6. REPEAT INPUT NUMBER IF NUMBER <> 0 THEN TOTAL = TOTAL + NUMBER COUNT = COUNT + 1 ELSE ZERO = ZERO + 1 UNTIL COUNT = 5 ANSWER = TOTAL/COUNT PRINT ANSWER, ZERO a) If the numbers 3,5,4,0,4,0,9,1 are used as data for this procedure, state what the printout will be when the procedure is completed. (2) A. 5 and 2. (1 mark each) b) Identify the condition statement, and explain what it does. (3) A. -IF…THEN… ELSE… -Decides whether the number input is zero or not. -A count is kept of all the zeros input, and the other numbers are used in the calculation. Note that the algorithm is provided. It would be hard to expect candidates to produce algorithms of this type in an exam, to all but the most trivial of problems. The description in part (b) can be in the broadest terms. The examiner is not looking for a brilliant insight into the logic, simply for an indication that the student understands the general principles. 7. State what is meant by the terms (i) Source code (ii) Object code (iii) Machine code (4) A. -Generally, code means a set of instructions designed to carry out a task -Source code is the version of the program written by the programmer… - in high level language. -Object code is the executable form of the program… -created by the translator -Machine code is in binary form. Note that there are plenty of marks available here, and the fact that the mark comes at the end of the question implies that the dotties are lumped together and that the marks are largely interchangeable. Teacher’s note: there is no distinction between interpreter and compiler as yet. Even to the extent that students are not penalised for not knowing that the interpreter does not produce an object code. It is not yet in the syllabus, so should not be expected. 8. Describe the relationship between the source code and the object code. (2) A. -The source code is the original program while the object code is after translation. - There is a one to many relationship between the two. Note. This could have been a very complex answer. However, you don’t know a lot about translators and their effects yet, and a look at the number of marks available tells you that the answer is strictly limited. The second mark point is rather strangely put, it simply means that the object code is longer than the source code, and if that’s what you said you would get the mark. 9. State three types of program error and give an example of each. (6) -Syntax error -Mistyping a reserved word, eg typing plint instead of print -Logic error -A jump instruction that tells the computer to jump to the wrong place -Arithmetic error -Inappropriate use of arithmetic. Eg dividing by zero Note. There are many possible errors that could be given as examples of each of these. It is sensible to stick to the standard answers that way there is no chance of picking a wrong one. 10. A program is written that will allow the user to input a list of 5 numbers and will print out the largest and the smallest of them. The program is to be tested to see that it works under all conditions. The programmer decides to use the test data 3,4,2,5,6,7 in order to test what will happen if more than 5 numbers are input. Give four other different sets of test data explaining the purpose of each. (8) A. - 1,3,-4,6,5 -To determine how it handles negative numbers. - 1,3,4,-6,5 -To determine whether it chooses the largest number according to magnitude or whether it takes the sign into consideration. - 1.5,3,4,5,6 - To determine whether it can deal with fractions. - 1,1,1,1,1 -To see what happens if the largest and the smallest are the same. Note. There are many more sensible sets of test data that could be used. These are just examples. Make sure that you read the question carefully and give ‘different’ sets of test data. 11. Explain the difference between alpha testing and beta testing. (2) A. -Alpha testing is done by the programmer or in the software house. -Beta testing is done by the user. 12. Explain how the translator program can issue error diagnostics and give an example of an error that can be spotted by the translator. (3) A. - Translator compares reserved word/syntax (rules) in its dictionary - If different to what is in the dictionary the user can be told what and where the error is. -eg PLINT is not in the dictionary of words that it knows, so there must be an error. Note. The same example has been used again. I know this one is right, why risk changing it? 13. Bugs are common in computer programs. Describe two techniques that can be used to help debug a program. (4) A. - Cross referencing of variables - checks the program for occurences of the same variable being used more than once. - Trace table. - As the program is run the values of the variables are printed out as the instructions are executed, allowing the user to see where sudden, unexpected, changes have occurred. Note. There are lots of possible answers here, always choose the two answers that you are absolutely sure of. 14.a)Describe what is meant by annotation of a program code. (2) A. -Explanation of the techniques used in the algorithm -Needed by people employed to maintain or amend the program in the future… -by explaining the logic and the reasoning behind the code. b) Describe two methods that can be used to annotate code. (4) -Indentation of groups of statements. -This not only makes the group stand out, but also ensures that the person looking at the code knows which statements should be treated together. - Use of sensible variable names. -This means that long references to look up tables are not needed and helps to make the understanding of lines of instruction easier to follow. Note. Again, these are standard answers, but no less effective for that and safe because we know they are going to be right. |