




























































































Data Structures & Algorithms
Typology: Study notes
UNIT I
Lesson 1 Introduction of Algorithm
Lesson 2 Analyzing and Designing Algorithms
UNIT II
Lesson 3 Internal Sorting
Lesson 4 Searching
UNIT III
Lesson 5 Graphs
Lesson 6 Spanning Tree
Lesson 7 String Matching
UNIT IV
Lesson 8 Polynomials
Lesson 9 Matrices
Lesson 10 Dynamic Programming
UNIT V
Lesson 11 Knapsack
Lesson 12 Other Algorithm
Introduction of Algorithm
Algorithms
After studying this lesson, you should be able to:
- Explain the concepts: problem, solution, instance of a problem, algorithm, computer program
- Understand the characteristics of an algorithm
- Learn the role of available tools in solving a problem
- Describe the basic instructions and control structures that are used in building up programs
- Explain how a problem may be analyzed in order to find out its characteristics so as to help us in designing its solution/algorithm
We are constantly involved in solving problems. The problems may concern our survival in a competitive and hostile environment, our curiosity to know more about various facets of nature, or any other issue of interest to us. A problem may be regarded as a state of mind of a living being that is not satisfied with some situation. However, for our purpose, we may take the unsatisfactory, unacceptable or undesirable situation itself as the problem.

One way of looking at a possible solution of a problem is as a sequence of activities (if such a sequence exists at all) that, if carried out using the allowed or available tools, leads us from the unsatisfactory (initial) position to an acceptable, satisfactory or desired position. For example, the solution of the problem of baking a delicious pudding may be thought of as a sequence of activities that, when carried out, gives us the pudding (the desired state) from the raw materials, which may include sugar, flour and water (constituting the initial position), using cooking gas, an oven and some utensils (the tools). The sequence of activities, when carried out, gives rise to a process.

Technically, the statement or description of the process in some notation is called an algorithm, the raw materials are called the inputs, and the resulting entity (in the above case, the pudding) is called the output. In view of the importance of the concept of algorithm, we repeat: An algorithm is a description or statement of a sequence of activities that constitute a process of getting the desired outputs from the given inputs. Later we consider in detail the characteristic features of an algorithm. Next, we define the closely related concept of a computer program.

Computer Program: An algorithm, when expressed in a notation that can be understood and executed by a computer system, is called a computer program or simply a program. We should be clear about the distinction between the terms process, program and algorithm.
A process is a sequence of activities actually being carried out or executed to solve a problem. Algorithms and programs, on the other hand, are just descriptions of a process in some notation. Further, a program is an algorithm in a notation that can be understood and executed by a computer system. It may be noted that, for some problems and the available tools, there may not exist any algorithm that gives the desired output. For example, the problem of baking a delicious pudding may not be solvable if no cooking gas or any other source of heat is available. Similarly, the problem of reaching the moon is unsolvable if no spaceship is available for the purpose.
These examples also highlight the significance of the available tools in solving a problem. Later, we discuss some mathematical problems which are not solvable. But, again, these problems are said to be unsolvable because the operations (i.e., the tools) that are allowed to be used in solving them come from a restricted, pre-assigned set.

Notation for Expressing Algorithms

The issue of notation for representing algorithms will be discussed in some detail later. Mainly, however, some combination of mathematical symbols, English phrases and sentences, and some sort of pseudo-high-level language notation shall be used for the purpose. In particular, the symbol '←' is used for assignment. For example, x ← y + 3 means that 3 is added to the value of the variable y and the resultant value becomes the new value of the variable x. However, the value of y remains unchanged. If, in an algorithm, more than one variable is required to store values of the same type, notation of the form A[1..n] is used to denote the n variables A[1], A[2], …, A[n]. In general, for integers m, n with m ≤ n, A[m..n] is used to denote the variables A[m], A[m+1], …, A[n]. However, we must note that another similar notation, A[m,n], is used to indicate the element of the matrix (or two-dimensional array) A which is in the m-th row and n-th column.

Role and Notation for Comments

Comments do not form a part of an algorithm to which there corresponds an (executable) action in the process. However, comments help the human reader to better understand the algorithm. Different programming languages have different notations for incorporating comments. We use the convention of putting comments between a pair of braces, i.e., { }. Comments may be inserted at any place within an algorithm.
For example, if an algorithm finds roots of a quadratic equation, then we may add the following comments, somewhere in the beginning of the algorithm, to tell what the algorithm does: {this algorithm finds the roots of a quadratic equation in which the coefficient of x^2 is assumed to be non-zero}.
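As a rough illustration of how such a comment might look in an actual program, here is a minimal Python sketch. The function name `quadratic_roots`, the choice to reject complex roots, and the sample coefficients are assumptions made for this example only; Python marks comments with `#` rather than with the braces used in these notes.

```python
import math


def quadratic_roots(a, b, c):
    # This function finds the real roots of a*x^2 + b*x + c = 0,
    # in which the coefficient a of x^2 is assumed to be non-zero.
    if a == 0:
        raise ValueError("the coefficient of x^2 must be non-zero")
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        # Assumption of this sketch: only real roots are handled.
        raise ValueError("the equation has no real roots")
    root = math.sqrt(discriminant)
    return ((-b + root) / (2 * a), (-b - root) / (2 * a))


# Sample call with illustrative coefficients.
print(quadratic_roots(3, 4, 1))
```

The comments have no effect on what the function computes; they only tell the reader what it does and which assumption it relies on.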
An algorithm, named after the ninth-century scholar Abu Jafar Muhammad Ibn Musa Al-Khowarizmi, is defined as follows:
- An algorithm is a set of rules for carrying out calculation either by hand or on a machine.
- An algorithm is a finite step-by-step procedure to achieve a required result.
- An algorithm is a sequence of computational steps that transform the input into the output.
- An algorithm is a sequence of operations performed on data that have to be organized in data structures.
- An algorithm is an abstraction of a program to be executed on a physical machine (model of computation).
{a new variable is used to store the remainder which is obtained by dividing m by n, with 0 ≤ r < n}
m ← n; {the value of n is assigned as new value of m; but at this stage value of n remains unchanged}
n ← r; {the value of r becomes the new value of n and the value of r remains unchanged}
end {of while loop}
return (n).
end; {of algorithm}
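The fragment above appears to be the tail of a remainder-based greatest-common-divisor loop; its opening lines are not shown here. The following Python sketch is one way to complete it, under two assumptions that do not come from the text: the loop runs while n is non-zero, and the value left in m when the loop ends is the one returned.

```python
def gcd(m, n):
    # Assumption: m and n are non-negative integers, not both zero.
    while n != 0:
        r = m % n  # remainder of dividing m by n, so 0 <= r < n
        m = n      # the old value of n becomes the new value of m
        n = r      # the remainder becomes the new value of n
    # Under the assumed loop condition, n is 0 here and m holds the result.
    return m


print(gcd(48, 18))
```

Each pass strictly reduces n, so the loop ends after finitely many steps.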
The difference between the two concepts, viz., 'problem' and 'instance', can be understood in terms of the following example. An instance of a problem is also called a question. We know that the roots of a general quadratic equation

ax^2 + bx + c = 0, a ≠ 0 …(1)

are given by the equation

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \qquad …(2)$$

where a, b, c may be any real numbers, subject to the restriction that a ≠ 0. Now, if we take a = 3, b = 4 and c = 1, we get the particular equation

3x^2 + 4x + 1 = 0 …(3)

Using these values of a, b and c in equation (2):

$$x = \frac{-4 \pm \sqrt{4^2 - 4 \cdot 3 \cdot 1}}{2 \cdot 3} = \frac{-4 \pm 2}{6},$$

i.e., $x = -\frac{1}{3}$ or $x = -1$.

With reference to the above discussion, the issue of finding the roots of the general quadratic equation ax^2 + bx + c = 0, with a ≠ 0, is called a problem, whereas the issue of finding the roots of the particular equation 3x^2 + 4x + 1 = 0 is called a question or an instance of the (general) problem. In general, a problem may have a large, possibly infinite, number of instances. The above-mentioned problem of finding the roots of the quadratic equation ax^2 + bx + c = 0, with a ≠ 0 and b and c real numbers, has infinitely many instances, each obtained by giving some specific real values to a, b and c, taking care that the value assigned to a is not zero. However, not all problems are of a generic nature. For some problems, there may be only one instance/question corresponding to the problem. For example, the problem of finding the largest integer that can be stored or arithmetically operated on in a given computer is a single-instance problem. Many interesting problems, like the ones given below, are just single-instance problems.
Problem (i): Crossing the river in a boat which can carry at one time, along with the boatman, only one of a wolf, a horse and a bundle of grass, in such a way that neither the wolf harms the horse nor the horse eats the grass. In the presence of the boatman, neither does the wolf attack the horse, nor does the horse attempt to eat the grass.

Problem (ii): The Four-Colour Problem, which requires us to find out whether a political map of the world can be drawn using only four colours, so that no two adjacent countries get the same colour. The problem may be further understood through the following explanation. Suppose we are preparing a coloured map of the world and we use green for the terrestrial part of India. Another country is a neighbour of a given country if it has some boundary in common with it. For example, according to this definition, Pakistan, Bangladesh and Myanmar (or Burma) are some of the countries which are India's neighbours. Then, in the map, we cannot use green for any of the neighbours of India, including Pakistan, Bangladesh and Myanmar. The problem is to show that the minimum number of colours required is four, so that we are able to colour the map of the world under the restrictions of the problem.
We now consider the concept of algorithm in more detail. While designing an algorithm as a solution to a given problem, we must take care of the following five important characteristics of an algorithm:

Finiteness: An algorithm must terminate after a finite number of steps and, further, each step must be executable in a finite amount of time. In order to establish a sequence of steps as an algorithm, it should be established that it terminates (in a finite number of steps) on all allowed inputs.

Definiteness (no ambiguity): Each step of an algorithm must be precisely defined; the action to be carried out must be rigorously and unambiguously specified for each case. Through the next example, we show how an instruction may not be definite.

Inputs: An algorithm has zero or more, but only a finite number of, inputs. Examples of algorithms requiring zero inputs:
- Print the largest integer, say MAX, representable in the computer system being used.
- Print the ASCII code of each of the letters in the alphabet of the computer system being used.
- Find the sum S of the form 1 + 2 + 3 + …, where S is the largest such sum less than or equal to MAX.

Output: An algorithm has one or more outputs. The requirement of at least one output is obviously essential because, otherwise, we cannot know the answer/solution provided by the algorithm. The outputs have a specific relation to the inputs, where the relation is defined by the algorithm.

Effectiveness: An algorithm should be effective. This means that each of the operations to be performed in an algorithm must be sufficiently basic that it can, in principle, be done exactly and in a finite length of time by a person using pencil and paper. It may be noted that the 'FINITENESS' condition is a special case of 'EFFECTIVENESS'. If a sequence of steps is not finite, then it cannot be effective either.
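To make the idea of a zero-input algorithm more concrete, the third example above (the largest sum 1 + 2 + 3 + … that does not exceed MAX) can be sketched in Python as follows. Python integers are unbounded, so the value of `MAX` below is an assumption standing in for the largest integer of a 32-bit system, and both function names are hypothetical.

```python
MAX = 2**31 - 1  # assumption: largest integer of a hypothetical 32-bit system


def largest_sum_within(limit):
    # Add 1, 2, 3, ... for as long as the running total stays <= limit.
    total = 0
    k = 1
    while total + k <= limit:
        total += k
        k += 1
    return total


def largest_sum_within_max():
    # Zero-input version: everything it needs is fixed in advance.
    return largest_sum_within(MAX)


print(largest_sum_within_max())
```

The sketch is finite (the total grows on every pass, so the loop must stop), definite (each step is a single addition or comparison), takes no input, and produces exactly one output.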
Hence, by halving successive values of m (or of (m – 1) when m is odd, as explained below), we expect to reduce m to zero ultimately and stop, without affecting the required product at any stage, by doubling successive values of n along with some other modifications, if required. However, if m is odd, then m/2 is not an integer. In this case, we write m = (m – 1) + 1, so that (m – 1) is even and (m – 1)/2 is an integer. Then

m * n = ((m – 1) + 1) * n = (m – 1) * n + n = ((m – 1)/2) * (2n) + n,

where (m – 1)/2 is an integer as m is an odd integer. For example, let m = 7 and n = 11. Then

m * n = 7 * 11 = ((7 – 1) + 1) * 11 = (7 – 1) * 11 + 11 = ((7 – 1)/2) * (2 * 11) + 11 = 3 * 22 + 11 = 77.
Therefore, if at some stage m is even, we halve m, double n, multiply the two numbers so obtained, and repeat the process. But if m is odd at some stage, then we halve (m – 1), double n, multiply the two numbers so obtained, and then add to the product the value of n which we had before doubling it. Next, we describe the a'la russe method/algorithm. The algorithm, which uses four variables, viz., First, Second, Remainder and Partial-Result, may be described as follows:

Step 1: Initialize the variables First, Second and Partial-Result respectively with m (the first given number), n (the second given number) and 0.

Step 2: If First or Second is zero, return Partial-Result as the final result and then stop. Else, set the value of Remainder as 1 if First is odd, else set Remainder as 0. If Remainder is 1, then add Second to Partial-Result to get the new value of Partial-Result.

Step 3: The new value of First is the quotient obtained on (integer) division of the current value of First by 2. The new value of Second is obtained by multiplying Second by 2. Go to Step 2.

Example: The logic behind the a'la russe method, consisting of Step 1, Step 2 and Step 3 given above, may be better understood, in addition to the argument given in the box above, through the following explanation. Let First = 9 and Second = 16. Then

First * Second = 9 * 16 = (4 * 2 + 1) * 16 = 4 * (2 * 16) + 1 * 16,

where 4 = [9/2] = [First/2] and 1 = Remainder. Substituting the values back, we get

First * Second = [First/2] * (2 * Second) + Second.

Let us take First_1 = [First/2] = 4, Second_1 = 2 * Second = 32 and Partial-Result_1 = First_1 * Second_1.
Then, from the above argument, we get

First * Second = First_1 * Second_1 + Second = Partial-Result_1 + Second.

Here, we may note that First = 9 is odd and hence Second is added to Partial-Result. Also,

Partial-Result_1 = 4 * 32 = (2 * 2 + 0) * 32 = (2 * 2) * 32 + 0 * 32 = 2 * (2 * 32) = First_2 * Second_2.

Again we may note that First_1 = 4 is even, and so we do not add Second_1 to Partial-Result_2, where Partial-Result_2 = First_2 * Second_2.

Next, we execute the a'la russe algorithm to compute 45 * 19.

| Step | First | Second | Remainder | Partial-Result |
|---|---|---|---|---|
| Initially | 45 | 19 | | 0 |
| Step 2 (First ≠ 0, hence continue) | | | 1 | 0 + 19 = 19 |
| Step 3 | 22 | 38 | | |
| Step 2 (First ≠ 0, hence continue) | | | 0 | 19 |
| Step 3 | 11 | 76 | | |
| Step 2 (First ≠ 0, hence continue) | | | 1 | 76 + 19 = 95 |
| Step 3 | 5 | 152 | | |
| Step 2 (First ≠ 0, hence continue) | | | 1 | 152 + 95 = 247 |
| Step 3 | 2 | 304 | | |
| Step 2 (First ≠ 0, hence continue) | | | 0 | 247 |
| Step 3 | 1 | 608 | | |
| Step 2 (First ≠ 0, hence continue) | | | 1 | 608 + 247 = 855 |
| Step 3 | 0 | 1216 | | |
| Step 2 | | | | 855 |

In the last Step 2, as the value of First is 0, the value 855 of Partial-Result is returned as the result and the algorithm stops.
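Steps 1 to 3 of the a'la russe method can be sketched in Python as below. The function name `a_la_russe` and the restriction to non-negative integers are assumptions of this sketch; the variable names mirror First, Second, Remainder and Partial-Result from the description.

```python
def a_la_russe(m, n):
    # Assumption: m and n are non-negative integers.
    # Step 1: initialise First, Second and Partial-Result.
    first, second, partial_result = m, n, 0
    # Step 2: stop as soon as First or Second is zero.
    while first != 0 and second != 0:
        remainder = first % 2         # 1 if First is odd, else 0
        if remainder == 1:
            partial_result += second  # add Second when First is odd
        # Step 3: halve First (integer division) and double Second.
        first //= 2
        second *= 2
    return partial_result


print(a_la_russe(45, 19))
```

Only halving, doubling and addition are used, which is why the method is attractive when general multiplication is not available as a basic operation.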
We enumerate the basic actions and corresponding instructions used in a computer system based on a von Neumann architecture. We may recall that an instruction is a notation for an action, and that a sequence of instructions defines a program whereas a sequence of actions constitutes a process. An instruction is also called a statement. The following three basic actions and corresponding instructions form the basis of any imperative language. For the purpose of explanation, a notation similar to that of a high-level programming language is used.
In order to understand and to express an algorithm for solving a problem, it is not enough to know just the basic actions, viz., assignments, reads and writes. In addition, we must know and understand the control mechanisms. These are the mechanisms by which human beings and the executing system become aware of the next instruction to be executed after finishing the one currently in execution. The sequence of execution of instructions need not be the same as the sequence in which the instructions occur in the program text. First, we consider three basic control mechanisms or structuring rules, before considering more complicated ones.

Direct Sequencing: When the sequence of execution of instructions is to be the same as the sequence in which the instructions are written in the program text, the control mechanism is called direct sequencing. The control structure (i.e., the notation for the control mechanism) for direct sequencing is obtained by writing the instructions one after the other on successive lines, or even on the same line if there is enough space, separated by some statement separator, say semi-colons, and in the order of intended execution. For example, the sequence of instructions A; B; C; D; denotes that the execution of A is to be followed by the execution of B, then by that of C, and finally by that of D. When the composite action consisting of the actions denoted by A, B, C and D, in this order, is to be treated as a single component of some larger structure, brackets such as 'begin … end' may be introduced, i.e., in this case we may use the structure Begin A; B; C; D end. The above is then also called a (composite/compound) statement consisting of the four (component) statements A, B, C and D.

Selection: In many situations, we intend to carry out some action A if condition Q is satisfied and some other action B if condition Q is not satisfied.
This intention can be denoted by:

If Q then do A else do B,

where A and B are instructions, which may even be composite instructions obtained by applying these structuring rules recursively to other instructions. Further, in some situations the action B is null, i.e., if Q is false, then no action is stated. This new situation may be denoted by

If Q then do A

In this case, if Q is true, A is executed. If Q is not true, then the remaining part of the instruction is ignored, and the next instruction, if any, in the program is considered for execution. Also, there are situations when Q is not just a Boolean variable, i.e., a variable which can assume either a true or a false value only. Rather, Q is some variable capable of assuming some finite number of values, say a, b, c, d, e, f. Further, suppose that, depending upon the value of Q, the corresponding intended action is as given by the following table:

| Value | Action |
|---|---|
| a | A |
| b | A |
| c | B |
| d | NO ACTION |
| e | D |
| f | NO ACTION |

The above intention can be expressed through the following notation:

Case Q of
a, b : A;
c : B;
e : D;
end;

Repetition: Iterative or repetitive execution of a sequence of actions is the basis of expressing long processes by a comparatively small number of instructions. As we deal with only finite processes, the repeated execution of the sequence of actions has to be terminated. The termination may be achieved either through some condition Q or by stating in advance the number of times the sequence is intended to be executed. When we intend to execute a sequence S of actions repeatedly, while condition Q holds, the following notation may be used for the purpose:

While (Q) do begin S end;
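The three basic control structures described above map quite directly onto most programming languages. The following Python sketch shows one possible rendering; the function names and the use of strings to stand for the actions A, B and D are illustrative assumptions, not part of the notation used in these notes.

```python
def select(q):
    # Selection: "If Q then do A else do B".
    if q:
        return "A"
    else:
        return "B"


def case_action(q):
    # "Case Q of  a, b : A;  c : B;  e : D;  end"
    if q in ("a", "b"):
        return "A"
    elif q == "c":
        return "B"
    elif q == "e":
        return "D"
    return None  # values d and f: no action


def repeat_while(limit):
    # Repetition: "While (Q) do begin S end", with Q being "i < limit".
    performed = []
    i = 0
    while i < limit:
        # Direct sequencing: these two statements run in textual order.
        performed.append(i)
        i += 1
    return performed


print(select(True), case_action("c"), repeat_while(3))
```

The loop terminates because i grows on every pass and must eventually reach limit, which is the kind of terminating condition that finiteness requires.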
Though the above-mentioned three control structures, viz., direct sequencing, selection and repetition, are sufficient to express any algorithm, the following two advanced control structures have proved to be quite useful in facilitating the expression of complex algorithms, viz., procedure and recursion. Let us first take the advanced control structure procedure.
In order to explain the ideas involved, let us consider the following simple examples of a procedure and a program that calls the procedure. In order to simplify the discussion, we assume in the following that the inputs etc. are always of the required types only, and we make other simplifying assumptions.

Recursion

Next, we consider another important control structure, namely recursion. In order to facilitate the discussion, we recall from Mathematics one of the ways in which the factorial of a natural number n is defined:

factorial (1) = 1
factorial (n) = n * factorial (n – 1). …(3)

For those who are familiar with recursive definitions like the one given above for factorial, it is easy to understand how the value of n! is obtained from the above definition of the factorial of a natural number. However, for those who are not familiar with recursive definitions, let us compute factorial (4) using the above definition. By definition,

factorial (4) = 4 * factorial (3).

Again by the definition,

factorial (3) = 3 * factorial (2).

Similarly,

factorial (2) = 2 * factorial (1),

and by definition factorial (1) = 1. Substituting back the values of factorial (1), factorial (2) etc., we get

factorial (4) = 4 * 3 * 2 * 1 = 24,

as desired. This definition suggests the following procedure/algorithm for computing the factorial of a natural number n. In the following procedure factorial (n), let fact be the variable which is used to pass the value computed by the procedure factorial to a calling program. The variable fact is initially assigned the value 1, which is the value of factorial (1).

Procedure factorial (n)
fact : integer;
begin
fact ← 1
if n equals 1 then return fact
else begin
fact ← n * factorial (n – 1)
return (fact)
end;
end;

In order to compute factorial (n – 1), the procedure factorial is called by itself, but this time with the (simpler) argument (n – 1). The repeated calls with simpler arguments continue until factorial is called with argument 1.
Successive multiplications of partial results by 2, 3, …, up to n finally deliver the desired result.
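The recursive definition of factorial translates almost line for line into a programming language. The Python sketch below follows the pseudocode above, including the local variable fact; the error raised for arguments smaller than 1 is an addition of this sketch and not part of the procedure as given.

```python
def factorial(n):
    # Assumption of this sketch: n is a natural number (n >= 1).
    if n < 1:
        raise ValueError("n must be a natural number (n >= 1)")
    fact = 1  # each call gets its own, separate variable fact
    if n == 1:
        return fact  # simplest case: value given directly
    # Recursive case: the procedure calls itself with the simpler argument n - 1.
    fact = n * factorial(n - 1)
    return fact


print(factorial(4))
```

Each call waits for the result of the call with the next smaller argument, so the multiplications happen on the way back, from factorial (1) up to factorial (n).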
Though it has already been mentioned, in view of the significance of the matter it is repeated below. Each procedure call defines a variable fact; however, the variables fact defined by different calls are different from each other. In our discussions, we may use the names fact1, fact2, fact3 etc. However, if there is no possibility of confusion, then we may use the name fact throughout.

Let us consider how the procedure executes for n = 4 to compute the value of factorial (4). Initially, 1 is assigned to the variable fact. Next, the procedure checks whether the argument n equals 1. This is not true (as n is assumed to be 4). Therefore, the next line is executed with n = 4, i.e., fact is to be assigned the value of 4 * factorial (3). Now n, the parameter in the heading of the procedure factorial (n), is replaced by 3. Again, as n ≠ 1, the next line is executed with n = 3, i.e., fact ← 3 * factorial (2). On similar grounds, we get fact as 2 * factorial (1), and at this stage n = 1. The value 1 of fact is returned by the last call of the procedure factorial.

And here lies the difficulty in understanding how the desired value 24 is returned. After this stage, the recursive procedure under consideration executes as follows. When the factorial procedure is called with n = 1, the value 1 is assigned to fact and this value is returned. This value of factorial (1) is passed to the statement fact ← 2 * factorial (1), which on execution assigns the value 2 to the variable fact. This value is passed to the statement fact ← 3 * factorial (2), which on execution gives the value 6 to fact. Finally, this value of fact is passed to the statement fact ← 4 * factorial (3), which in turn gives the value 24 to fact. This value 24 is returned as the value of factorial (4).

Coming back from the definition and procedure for computing factorial (n), let us come to the general discussion.
Summarizing, a recursive mathematical definition of a function suggests the definition of a procedure to compute the function. The suggested procedure calls itself recursively with simpler arguments and terminates for some simple argument, the required value for which is given directly within the algorithm or procedure.

Definition: A procedure which can call itself is said to be a recursive procedure/algorithm.

For successful implementation of the concept of recursive procedure, the following conditions should be satisfied:
- There must be an in-built mechanism in the computer system that supports the calling of a procedure by itself, e.g., there may be in-built stack operations on a set of stack registers.
- There must be conditions within the definition of a recursive procedure under which, after a finite number of calls, the procedure is terminated.
- The arguments in successive calls should be simpler, in the sense that each succeeding argument takes us towards the terminating conditions mentioned in the second condition.

In view of the significance of the concept of procedure, and especially of the concept of recursive procedure, in solving some complex problems, we discuss another recursive algorithm, for the problem of finding the sum of the first n natural numbers, discussed earlier. For the discussion, we assume n is a non-negative integer.

Procedure SUM (n : integer) : integer
s : integer;
If n = 0 then return (0)