Describing Algorithms
In Analysis of algorithms we study and compare two or more algorithms under some specified circumstances which remains same for all and their results are measured on the basis of their space and run time requirements (Almuallim & Dietterich, 2014). The reason why analyze the algorithms is, to discover the most suitable algorithms for a particular application depending on its environment. To analyze the execution time of an algorithm the procedure that has to be followed is:
Implementation of all algorithms using same language.
Evaluate the time taken to perform all the basic operations
Discover unidentified quantities which can help to define basic operations’ execution frequency.
As an input to the program develop a realistic model.
Assuming the modeled input, examine the unknown quantities.
Evaluate the total execution time: multiply frequency by time and add.
To calculate the theoretical time efficiency of the two algorithms, Average-Case Analysis has been performed on both of them i.e. both algorithms have been tested over all possible inputs. To test the efficiency, the algorithms have been implemented in JAVA language using all its data sets and structure alignment. Similar basic operations and problem size have been applied to both to compare the results and predict the efficiency. A proper approach has been followed to develop test data and chose cases under different input sizes (Almuallim & Dietterich, 2014).
The last section of this report describes the outcome of the analysis and defines the best and worst algorithm, the basis of the outcome has been counted as ‘number of basic operations’ and ‘execution time’ i.e. space and time requirements.
The two algorithms provided for the comparison and analysis purpose performs the same function i.e. to calculate the maximum distance between two of its elements. For example, suppose there are five random numbers 12, 34, 40, 1 and 6, the algorithm will calculate the difference between every pair of numbers like (12 and 34), (12 and 40), (12 and 1), (12 and 6), (34 and 40) and so on (Almuallim & Dietterich, 2014). Every time a greater difference is found between two variables, the value of the return variable will be altered with the greater difference and after the loop ends the maximum difference possible will be returned. In this case the output that will be returned is 39 as the maximum difference between its two elements, which is the difference between 1 and 40. No other pair will have greater difference than this (Biebricher, Fuhr, Lustig, Schwantner & Knorz, 2016).
Let’s understand both algorithms in a more clear and understandable manner with the help of algorithm properties and elements.
The algorithm MaxDistance has been designed to generate the maximum distance between two of the elements that has been provided as input. It has following properties.
Finiteness: Algorithm MaxDistance has a finite number of steps and instructions and it will surely terminate after completion of all steps. This algorithm does not have any infinite loop running in it (Balcom, Laura & Tong, 2015).
Definiteness: Every instruction in the algorithm MaxDistance has been defined precisely and in a very clear manner which can be understood with very little efforts. No step or instruction has any kind of ambiguity.
Algorithm 1: MaxDistance
Example:
The above lines have been copied from the given algorithm which clearly shows that input and output statements have been defined in such manners which are self-explanatory (Balcom et. al, 2015).
Input: The algorithm MaxDistance takes a list of integer numbers as input which is entered by the user during run time. These numbers are finite and can either be defined in the source code or can be asked from the user during run time. These numbers will then be stored as an array of integers (Balcom et. al, 2015).
Output: The algorithm returns one single output that is the maximum distance between two of the elements provided in the input section as the list of integers (Balcom et. al, 2015).
The working of algorithm MaxDistance:
Acquire Data (Input): During run time the user will be prompted to enter ‘n’ numbers. This ‘n’ can either be defined explicitly within the source code with some finite digit or can be asked from the user while executing. Depending upon the size allowed an integer array A[] has been declared that can store that many numbers. When user is prompted to enter ‘n’ numbers, it is expected from the user to input a list of integer elements only otherwise run time error will be accounted (Biebricher, et. al, 2016).
Computation and Iteration: Few variables has been declared and defined for the computation purposes of the algorithm which are dmax, i and j. All three variables have been initialized with the value 0. There are two iterations in this algorithm one of them runs as the outer loop which is controlled by the ‘i’ variable and another runs as the inner loop which is controlled by the ‘j variable (Biebricher, et. al, 2016). See below:
The first loop runs from 0 to ‘n-1’ positions and second loop also runs from ‘0’ to ‘n-1’ positions. This means that there will be total ‘n’ iterations for the outer loop where inside each iteration of the outer loop there will be ‘n’ iterations. So there will be total ‘n * n’ iterations. For example:
When ‘i’ will be 0 it will be counted as:
i=0; Iteration 1:
j=0; iteration 1
j=1; iteration 2
.
j=n-1; iteration n
Similarly for i=1
i=1; Iteration 2
j=0; iteration 1
j=1; iteration 2
.
j=n-1; iteration n
upto i=n-1
Computations and evaluations inside inner loop:
For each iteration of variable ‘i’ these steps will be performed:
Step 1: The value of ‘j’ will be initialized to zero
Step 2: The value of ‘i’ and ‘j’ will be compared and if they are equal, the inner loop will break at that very point; next iteration of ‘j’ will be computed. In case variable ‘i’ and variable ‘j’ are not equal difference will be calculated between A[i] and A[j] and if the difference comes out to be greater than the value of dmax, the value of dmax will be replaced by the difference of A[i] and A[j]. The value of the variable ‘j’ would be incremented automatically by 1 and step 2 will be repeated until value of ‘j’ reaches ‘n-1’. After the value of ‘j’ becomes equal to ‘n’, the counter will move to step 1.
Properties
The outer loop will terminate after variable ‘i’ reaches value ‘n’. The iteration will not be calculated for ‘i=n’ and as soon the value of ‘i’ becomes equal to ‘n’ it will come out of the outer loop.
Report Results: The final value of the variable ‘dmax’ which has been computed in the processing steps will be returned and displayed to the user. This value will represent the maximum distance between two of its elements.
The algorithm MaxDistance2 has the same function as that of MaxDistance been i.e. to generate the maximum distance between two of the elements that has been provided as input. It has similar properties as that of MaxDistance.
Finiteness: Algorithm MaxDistance2 has a finite number of steps and instructions and it will surely terminate after completion of all steps. This algorithm does not have any infinite loop running in it (Balcom et. al, 2015).
Definiteness: Every instruction in the algorithm MaxDistance2 has been defined precisely and in a very clear manner which can be understood with very little efforts. No step or instruction has any kind of ambiguity (Balcom et. al, 2015).
Input: The algorithm MaxDistance2 takes a list of integer numbers as input which is entered by the user during run time. These numbers are finite and can either be defined in the source code or can be asked from the user during run time. These numbers will then be stored as an array of integers (Balcom et. al, 2015).
Output: The algorithm returns one single output that is the maximum distance between two of the elements provided in the input section as the list of integers (Balcom et. al, 2015).
The working of algorithm MaxDistance2:
Acquire Data (Input): During run time the user will be prompted to enter ‘n’ numbers. This ‘n’ can either be defined explicitly within the source code with some finite digit or can be asked from the user while executing. Depending upon the size allowed an integer array A[] has been declared that can store that many numbers. When user is prompted to enter ‘n’ numbers, it is expected from the user to input a list of integer elements only otherwise run time error will be accounted.
Computation and Iteration: Few variables has been declared and defined for the computation purposes of the algorithm which are dmax, temp, i and j. The three variables dmax, i and j have been initialized with the value 0 while temp has only been declared and not initialized. There are two iterations in this algorithm one of them runs as the outer loop which is controlled by the ‘i’ variable and another runs as the inner loop which is controlled by the ‘j variable. See below:
The first loop runs from 0 to ‘n-2’ positions and second loop runs from ‘i+1’ to ‘n-1’ positions. This means that the outer loop will run for total ‘n-1’ iterations while inner loop will start running first for ‘n-1’ iterations and with every increasing iteration of ‘i’, the number of times the loop ‘j’ will run will be one less than the previous time, and for the last iteration of j the loop will run only for one time. Hence for every iteration of ‘i’ the jth loop will run for (n-1)*(n-i-1) times. For example:
Elements
When ‘i’ will be 0 it will be counted as:
i=0; Iteration 1:
j=1; iteration 1
j=2; iteration 2
.
j=n-1; iteration n-1
Similarly for i=1
i=1; Iteration 2
j=2; iteration 1
j=3; iteration 2
.
j=n-1; iteration n-1
upto i=n-2, the last iteration for ‘i’ will be:
i=n-2; Iteration n-1
j=n-1; iteration 1
Terminate
Computations and evaluations inside inner loop:
Step 1: The value of variable ‘j’ will be initialized to ‘i+1’.
Step 2: The difference between the elements present at location A[i] and A[j] will be calculated. The value of the difference will be stored in the variable ‘temp’. Now value of ‘temp’ will be compared to the value of variable ‘dmax’, in case the value of ‘temp’ is greater than the value of ‘dmax’: the content of dmax will be changed by the value of temp i.e. dmax will now contain the new value which will be equal to the current difference between A[i] and A[j]. In case the value of ‘temp’ is smaller than ‘dmax’: the variable ‘j’ will be incremented by 1 automatically and step 2 will be repeated until value of ‘j’ reaches ‘n-1’. After the value of ‘j’ becomes equal to ‘n’, the counter will move to step 1.
The outer loop will terminate after variable ‘i’ reaches value ‘n-1’. The iteration will not be calculated for ‘i=n-1’ and as soon the value of ‘i’ becomes equal to ‘n-1’ it will come out of the outer loop.
Report Results: The final value of the variable ‘dmax’ which has been computed in the processing steps will be returned and displayed to the user. This value will represent the maximum distance between two of its elements.
Partial Correctness
To prove the total correctness of an algorithm, we must prove that the algorithm returns an output i.e. partial correctness of an algorithm and Termination of the algorithm (Cerny, Okseniuk & Lawrence, 2013).
|- assert(P); C; assert(Q);
An algorithm is partially correct if the following statement can be proved true:
“If the pre-condition is true, the post-condition must be true”.
In the statement ‘|- assert(P); C; assert(Q);’
Assert(P) defines the pre-condition before the execution of the algorithm, C defines the execution of the algorithm and Assert(Q) defines the post-condition before the execution of the algorithm (Cerny et. al, 2013).
Pre-Condition: We have assumed that pre-condition for MaxDistance is initially satisfied i.e. The array of integers A[1…n] hold only natural numbers between 1 and n such that 1<s<n; where s and n are natural numbers. Also the value of dmax has been defined to be natural integer (Cerny et. al, 2013).
dmax =0 and A[1—n] = 1<s<n
Post-Condition: The variable dmax will contain the maximum distance between two of its elements. To rule out the confusion of initial value of variable dmax and final value of dmax, let’s denote the final value of dmax by dmax’.
dmax = either 0 or Maximum Distance
These pre-conditions and post-conditions remain same for both the algorithms.
Proof:
To prove the correctness of given algorithm:
i=0;j=0;dmax=0;
for (i<n;i++)
{
for (j<n;j++)
{
if (i==j) and (A[i]-A[j])>dmax
dmax= A[i]-A[j]);
} end for
}end for
return dmax
We can see that the major part of the algorithm contains a loop; hence if we prove that the loop works in a correct manner and output is being produced, we can prove the partial correctness.
The following table shows the value of interest:
Iteration Number for outer loop (value of i) |
0 |
1 |
2 |
3 |
——– |
n-1 |
||||||||||||||||
Iteration Number for inner loop (value of j) |
0 |
1 |
2 |
n-1 |
0 |
1 |
2 |
n-1 |
0 |
1 |
2 |
n-1 |
0 |
1 |
2 |
n-1 |
——– |
0 |
1 |
2 |
n-1 |
|
dmax |
Max(A[i]-A[j])=dmax’ |
Max(A[i]-A[j])=dmax’ |
dmax’ |
dmax’ |
——- |
dmax’ |
Loop Invariant:
dmaxi>=0 and dmaxi = max(A[i]-A[j]) = dmax’
Base Case: Before the start of loop the value of i and j are 0. So the dmax = A[0]-A[0] =0 , so loop invariant holds.
Induction Step: Let us assume s>=0 and loop invariant holds for sth iteration. Then loop invariant will hold true for the s+1th iteration automatically.
Dmaxs+1 = Max of (A[s] – A[0],A[s]-A[1] ———–, A[s]-A[n])
= dmax’.
Hence for s+1th iteration the loop invariant hold true. This proves that the end of the program the algorithm returns the maximum difference between two of the array elements which proves the partial correctness.
Proof:
To prove the correctness of given algorithm:
i=0;j=i+1;dmax=0;
for (i<n-1;i++)
{
for (j<n;j++)
{
temp=A[i]-A[j];
if temp>dmax
dmax=temp;
} end for
}end for
return dmax
We can see that the major part of the algorithm contains a loop; hence if we prove that the loop works in a correct manner and output is being produced, we can prove the partial correctness.
Iteration Number for outer loop (value of i) |
0 |
1 |
2 |
3 |
——– |
n-2 |
||||||||||||
Iteration Number for inner loop (value of j) |
1 |
2 |
3 |
n-1 |
2 |
3 |
4 |
n-1 |
3 |
4 |
5 |
n-1 |
4 |
5 |
6 |
n-1 |
——– |
n-1 |
dmax |
Max(A[i]-A[j])=dmax’ |
Max(A[i]-A[j])=dmax’ |
dmax’ |
dmax’ |
——- |
dmax’ |
Loop Invariant:
dmaxi>=0 and dmaxi = max(A[i]-A[j]) = dmax’
Base Case: Before the start of loop the value of i is 0 and j is i+1 i.e. 1. So the dmax = A[0]-A[1] =either 0 or greater because statement ‘ if temp>dmax’ makes sure that dmax will only alter its value in case it is greater than the previous one.
, so loop invariant holds.
Induction Step: Let us assume s>=0 and loop invariant holds for sth iteration. Then loop invariant will hold true for the s+1th iteration automatically.
Dmaxs+1 = Max of (A[s] – A[s+1],A[s]-A[s+2] ———–, A[s]-A[n-1])
= dmax’.
As we can see that the algorithm looks for max distance between A[s]th variable and the rest of the list while the greater distance can even exist between A[s] and A[0,1—-s-1], the algorithm will only work correctly if the input is a non-decreasing list of integers, otherwise the output will not be as expected.
Hence for s+1th iteration the loop invariant hold true. This proves that the end of the program the algorithm returns the maximum difference between two of the array elements which proves the partial correctness.
In both algorithms MaxDistance and MaxDistance2:
As jk increases at each iteration, ‘n-jk’ decreases and at one point the value of jk will become sufficiently large to dis-satisfy the given condition and hence it will terminate after a finite number of steps.
Similarly, in the outer loop the value of ik increases with every iteration while the value of n-ik decreases. As per the condition has been defined the loop will terminate in MaxDistance as soon the value of ‘i’ becomes equal to ‘n’ while in algorithm MaxDistance2 the outer loop will terminate when ik reaches value n-1.
The efficiency of an algorithm depends upon two main parameters: Time and Space. Time refers to the time taken by the algorithm during execution to traverse all instructions and terminate and produce the output, whereas space denotes the number memory locations occupied by the variables defined in the algorithm.
To measure time efficiency of any algorithm we calculate the time complexity of the algorithm which is generally referred as function of input size or ‘problem size’, T(n), where n denotes the problem size. There are two ways to calculate time complexity: Empirical analysis or Posteriori and the other method is Mathematical analysis or Apriori.
The method used in this report to evaluate time complexity for MaxDistance and MaxDistance2 is Apriori.
In both algorithms there are two inputs required to be entered by the user.
The total number of elements to be entered into the array
The list of elements.
To calculate the maximum complexity of algorithms, the input size for total number of elements has been restricted to 10. It means that there can be maximum 10 elements in the array that needs to be compared.
Also, for every list element the input size is restricted to 1 to 100. It is assumed that user will not enter a list item which is greater than 100.
Also, it has been assumed that the value of n cannot be 0 and it should be at least 1 or greater than 1. So for value of n, the following condition must be true:
1<=n<=10
And for the list elements the condition that holds true is:
1<= A[i] <=100
The above statements are common base for evaluating efficiency of both algorithms and will hold true in each case.
When there are loops present in an algorithm, its time complexity is measured by the expression
T (n) ? n × TR
So for any statement S that requires ‘p’ operations and executes ‘q’ times within a loop, the total count of basic operations is counted as p × q operations. This principle is known as multiplication principle.
Also, the time complexity of decision statement i.e. IF than Else is represented as:
T(n) = Maximum { TP, TQ} where P represents the statement to be executed in case the condition holds true and Q otherwise. The maximum of both is chosen as the final count. But in given algorithms MaxDistance and MaxDistance2 there are no statements to be executed when case holds false so it will not make any difference in this case.
Now, let’s calculate time complexity for given algorithms:
Instruction |
Cost |
Frequency |
dmax=0 |
C1 |
1 |
For (i=0 to n-1)do |
C2 |
n |
For (j=0 to n-1)do |
C3 |
n |
If Condition |
C4 |
n |
dmax = A[i] – A[j] |
C5 |
n(maximum) |
return dmax |
C6 |
1 |
The total cost will be: C1 + (n.C2 (n.C3 (n.C4+n.C5))) +C6
àC1 + n2 C2 .C3(n.C4+n.C5) + C6
àC1 + n3C2.C3(C4+C5)+C6
Ignoring the co-efficient in the above function and using big O notation the complexity comes out to be O (n^3).
Instruction |
Cost |
Frequency |
dmax=0 |
C7 |
1 |
For (i=0 to n-2)do |
C8 |
n-1 |
For (j=i+1 to n-1)do |
C9 |
n-1 |
temp= A[i] – A[j] |
C10 |
n-1 |
If Condition |
C11 |
n-1 |
dmax = temp |
C12 |
n-1(maximum) |
return dmax |
C13 |
1 |
The total cost will be: C7 + (n-1).C8 (n-1.C9(n-1.C10+n-1.C11+ n-1.C12)) +C13
àC7 + (n-1)2 C8 .C9(n-1.C10+n-1.C11+n-1.C12) + C13
àC7 + (n-1)3 C8 .C9(C10+C11+C12) + C13
Ignoring the co-efficient in the above function and using big O notation the complexity comes out to be O (n-1^3).
Clearly, Algorithm MaxDistance2 is more efficient than MaxDistance on the Time Complexity factor as the loop runs for fewer times in MaxDistance2 as compared to MaxDistance.
In the case of first algorithm i.e. MaxDistance, we want to discover the case that whether it is possible to have less execution time than O (n^3). The answer is yes.
Suppose the input array is of size 5 where elements are [A[0]=5, A[1]= 1, A[2]= 2, A[3]= 3, A[4]= 4,
In this case statement:
dmaxß A[i]-A[j]
run only for 1 time as we get the maximum distance of two elements in the 2nd loop of the 1st iteration i.e. A[0]-A[1] where 0 represents i and 1 represents ‘j’.
Similarly worst case will be the input array is of size 5 where elements are [A[0]=3, A[1]= 4, A[2]= 2, A[3]= 1, A[4]= 5,
In this case statement:
dmaxß A[i]-A[j] runs for maximum number of times i.e. n*n as we get the maximum distance of two elements in the last loop of the last iteration i.e. A[4]-A[3] where 4 represents i and 3 represents ‘j’.
As we can see that the both loops (Outer and Inner) will continue till end termination, the worst case, best case and average case makes hardly any difference to the time efficiency of this algorithm (Gale & Church, 2010). In all cases the time efficiency of the algorithm MaxDistance remains same:
Worst Case: O (n^3)
Best Case: O (n^3)
Average Case: O (n^3)
Let’s discuss an average case of input for the algorithm MaxDistance and evaluate its complexity.
This is considerably the most useful measure because worst and best cases might be rare and in general the random input size does not belong to either of these cases. For example: The input size for the variable n is ‘4’ and the list of elements in the array are: (A [0] = 7, A [1] = 5, A [2] = 20, A [3] = 6). The execution run time will be as follows
Instruction |
Cost |
Frequency |
dmax=0 |
C1 |
1 |
For (i=0 to n-1)do |
C2 |
n |
For (j=0 to n-1)do |
C3 |
n |
If Condition |
C4 |
n |
dmax = A[i] – A[j] |
C5 |
n-1 |
return dmax |
C6 |
1 |
We see that the total cost in this case will be: C1 + (n.C2 (n.C3(n.C4+n-1.C5))) +C6.
àC1 + n2 C2 .C3(n.C4+n-1.C5) + C6
? C1 + n3C2.C3(C4+C5)+C6 (approximately)
Ignoring the co-efficient in the above function and using big O notation the complexity comes out to be O (n^3). Hence for any average case the time complexity would be same i.e. O (n^3).
In the second algorithm MaxDistance2, let’s determine the execution time for the same case we determined for MaxDistance, i.e. The input size for the variable n is ‘4’ and the lists of elements in the array are: (A [0] = 7, A [1] = 5, A [2] = 20, A [3] = 6). The execution run time will be as follows:
Instruction |
Cost |
Frequency |
dmax=0 |
C7 |
1 |
For (i=0 to n-2)do |
C8 |
n-1 |
For (j=i+1 to n-1)do |
C9 |
n-1 |
temp= A[i] – A[j] |
C10 |
n-1 |
If Condition |
C11 |
n-1 |
dmax = temp |
C12 |
n-3 |
return dmax |
C13 |
1 |
The statement dmax=temp will be executed less time while all other statements will execute maximum times. The complexity will be:
The total cost will be: C7 + (n-1).C8 (n-1.C9(n-1.C10+n-1.C11+ n-3.C12)) +C13
àC7 + (n-1)2 C8 .C9(n-1.C10+n-1.C11+n-3.C12) + C13
? C7 + (n-1)3 C8 .C9(C10+C11+C12) + C13 (approximately)
Ignoring the co-efficient in the above function and using big O notation the complexity comes out to be O (n-1^3).
For the example case n is equal to 4, so the time complexity for the algorithm MaxDistance will be (4*4*4) = 64, while the time complexity for the algorithm MaxDistance2 will be (3*3*3) = 27 which is too much faster than the MaxDistance. This implies that the algorithm MaxDistance2 is better in terms of time efficiency than MaxDistance (Fuhr, Hartmann, Lustig, & Schwantner, 2011).
Programming Language and Environment
After the time efficiency has been calculated theoretically, a practical evaluation has been conducted to experiment out the comparison and compare the results calculated theoretically. To implement both algorithms, JAVA has been chosen as the programming language chosen. The best advantage and reason of choosing this language is platform independency i.e. JAVA is hardware independent language (Buntine, 2010). When we compile a JAVA source code, it provides an equivalent bytecode that can be run on any machine irrespective of its hardware configurations. It is the responsibility of the JRE of the computer system to actually execute the program. So, by implementing code in JAVA language, it gives a degree of freedom to execute the algorithms on any hardware platform but the comparison results will always be same (Buntine, 2010).
In today’s heterogeneous networks to operate on multiple platforms, Java has been designed dynamically adaptable, portable and architecturally neutral. The programming concepts of Java are simple and familiar to be learned easily, and it works on the principles of object oriented programming i.e. abstraction, encapsulation, polymorphism and inheritance to take advantage of modern software development methodologies and for achieving the compatibility with client systems (Buntine, 2010)..
The computing environment under which both algorithms have been executed is:
Java Virtual Machine (JVM) 32-bit operating system
Java Platform Version- 1.8
Java Product Version- 1.8.0_121
Architecture : x86
All java objects for 32 bit JVM are aligned by 4 bytes boundary meaning that if there is an object with two fields ‘byte’ and ‘int’ doesn’t occupies 17 (12 for object header, 4 for int and 1 for byte ) but 20 bytes. The byte needed to store different data types in java is as foolows:
double, long à8 bytes
float, int à 4 bytes
char, short à 2 bytes
boolean, byte à 1 byte
In both algorithms MaxDistance and MaxDistance2, we are only concerned with integer type data type. So to hold each data, memory size requirement is of one word, because in our computing environment of JVM 32 bit one word is equal to 4 bytes (Deogun & Raghavan, 2016).
Java Implementation of MaxDistance
The first statement of the given algorithm MaxDistance (A[0—–n-1]) explains that MaxDistance is a function that contains the instruction to calculate the maximum distance between two elements , also it takes as input an array of integer numbers which ranges from 0 to n-1.
The implementation for this statement in java has been done like:
int MaxDistance (int A[],int n) {}
In the above statement the first int represents the return type of the method as the method returns single integer value. MaxDistance is the definition of the method while int A[] and int n defines that the method receives two input as formal parameters which is the list of array elements and size n.
int i,j,dmax=0;
for(i=0;i<n;i++)
{
for(j=0;j<n;j++)
{
if((i!=j)&&((A[i]-A[j])>dmax))
dmax= A[i]-A[j];
}
}
return dmax;
As we can see in the algorithm MaxDistance there are three more variables used along with A [] and n which are ‘i’, ‘j’, ‘dmax’ so while implementing the code in java it is must to declare and initialize them. The statement below has been used to declare and initialize the variable as integer data type which means that these variables can only accommodate integer values in it.
int i,j,dmax=0;
The statements below are the logic statements to the algorithm which processes the actual input and calculates output. “{}” These brackets show the entry and exit of the loop which explains that statement (for (j=0;j<n;j++) is running as an inner loop. The alignment of these loops in the given algorithm shows that both of these loops are nested hence implemented.
for(i=0;i<n;i++)
{
for(j=0;j<n;j++)
{
if((i!=j)&&((A[i]-A[j])>dmax))
dmax= A[i]-A[j];
}
}
Statement if((i!=j)&&((A[i]-A[j])>dmax)) has been replicated for the given statement if i ≠ j and |A[i] – [A[j] |> dmax where ‘not equal to’ sign and ‘and’ sign has been changed as per the rules of java. Also the java does not support brackets ‘||” so in place of them ‘()’ these brackets have been used.
return dmax;
The above statement returns the output of the function within the variable dmax.
To execute the method MaxDistance, a main method has been created as per language requirements. The main method traverses the statements in an orderly manner (Dolan, Goldman, Cuda, & Nakamura, 2014).
public static void main (String args[])throws Exception
All statements under this method have been used to make program behave in a required manner.
System.out.println(“Enter total Numbers Numbers to be entered”);
n = Integer.parseInt(br.readLine());
System.out.println(“Enter “+n+ ” Numbers”);
for(i=0;i<n;i++)
A[i] = Integer.parseInt(br.readLine());
These statements display a message to the user to enter the required input and take the input from the user and store them in the memory.
CalcMaxDistance obj = new CalcMaxDistance();
dmax=obj.MaxDistance(A,n);
The above statements creates an object of the class in which method MaxDistance has been defined and with the help of the object created, a call has been given to the method MaxDistance along with input data. The method MaxDistance returns an integer value that need to be stored, so dmax has been defined here to store the returned value.
System.out.println(“The maximum distance is: “+ dmax);
This statement displays the final output to the user.
The first statement of second algorithm is almost same as the first one with a slight difference in the method definition MaxDistance2 (A [0—–n-1]). Like algorithm MaxDistance takes as input an array of integer numbers which ranges from 0 to n-1 (Creecy, Masand, Smith & Waltz, 2014).
The implementation for this statement in java has been done like:
int MaxDistance2 (int A[],int n) {}
The functions of all variables are same as defined in the previous section with only difference in method name (DeJong, 2015).
In algorithm MaxDistance2, there are four local variables: i, j, dmax and temp. All these variables hold integer values and thus have been defined with the statement.
int i,j,dmax=0,temp;
The statements below are the logic statements to the algorithm which processes the actual input and calculates output. “{}” These brackets show the entry and exit of the loop which explains that statement (for (j=i+1;j<n;j++) is running as an inner loop. The alignment of these loops in the given algorithm shows that both of these loops are nested hence implemented (Crawford, Fung, Appelbaum & Tong, 2009).
for(i=0;i<n-1;i++)
{
for(j=i+1;j<n;j++)
{
temp=A[i]-A[j];
if(temp>dmax)
dmax= temp;
}
}
It can be seen that here all statements are exactly same as defined in the algorithm which makes sure that the logic has been replicated the structure of the algorithms as faithfully as possible. Also, like MaxDistance, this also has a main function which helps in traversing the statements in an orderly manner.
To test the correctness of the program, both programs were compiled using java compiler and then executed by the JVM installed in the computer system. The programs were thoroughly tested using different types of input and similar inputs were used on both programs to check their validity.
The programs were checked against:
Run time errors, such as divide by zero and loop may run indefinitely. As there is no division operation occurring in any of the program, divide by zero error has been ruled out. When the program executed, it terminated after a finite number of steps which shows that none of the loop runs indefinitely and program terminates in finite steps.
Input Data: Although it has been mentioned and assumed that the value of ‘n’ entered by the user is between 1<=n<=10, but in case user enters a value 0 or other than this, the program does not terminates abruptly. In case user enters a value ‘0’ for variable ‘n’ then the results displayed will be as follows:
The program has been checked on several groups of data which counts as worst case, best case and average case and the program executed correctly for every input data type in both algorithms. Although algorithm MaxDistance2 has greater time efficiency over algorithm MaxDistance but there is one major drawback that the algorithm will return maximum distance between its two elements only if the higher element existence in array is before lower element, for example: If there are two elements 1 and 5 stored in array which are lowest and highest element then MaxDistance2 will return correct results only if element 5 is stored in the array at a position before position of element 1, otherwise it will not evaluate correct results while algorithm MaxDistance works correctly in all kind of input data.
To generate the test data no automated generation tools have been used and all data has been generated manually. Different values have been assigned to the variable ‘n’ which defines the input size. The input size assigned has been within the boundary limits decided for the algorithms. The different values decided for the variable ‘n’ has been applied to both algorithms and results were compared.
After the different values of ‘n’ have been decided a random data was generated to fill up the list of array elements and similar data has been applied to both algorithms. The techniques used to generate different kind of random input data are:
Generated list of array elements sorted in ascending order
Generated list of array elements sorted in descending order
Generated list of array elements unsorted
Worst case scenarios where maximum distance between two elements lies at the last loop of the last iteration.
Best case scenarios where maximum distance found second loop of the first iteration.
To measure basic operations count, count variables have been inserted at appropriate positions and maintained. In both algorithms, initially count was declared and initialized with value 1 as declaring the variables itself is a basic operation which counts as 1 operation (Cleverdon, 2011).
After declaring the variable ‘count’ and initializing, the count has been increased by one every time any basic operation encountered with the help of the statement (Cleverdon, 2011)
count = count + 1
The below codes shows the use of count variables at different positions while implementation (Cleverdon, 2011):
Program MaxDistance
int i,j,dmax=0,count=1;
for(i=0;i<n;i++)
{
count = count + 1;
for(j=0;j<n;j++)
{
count=count + 1;
if((i!=j)&&((A[i]-A[j])>dmax))
{
dmax= A[i]-A[j];
count=count+1;
}
count=count+1;
}
}
System.out.println(“The total count is”+count);
Program MaxDistance2
int i,j,dmax=0,temp,count=1;
for(i=0;i<n-1;i++)
{
count = count+1;
for(j=i+1;j<n;j++)
{
count=count+1;
temp=A[i]-A[j];
count=count+1;
if(temp>dmax)
{
dmax= temp;
count=count+1;
}
count=count+1;
}
}
System.out.println(“The total count is”+count);
To measure execution time of both the programs MaxDistance and MaxDistance2 a variable has been declared at the start of main method “startTime” that fetches the system current time and store it in a long data variable. After that another variable has been declared and defined at the end of the main method which evaluates the difference between variable startTime and the system’s time after execution (Borko & Bernick, 2013).
After that it stores the value which has been calculated in nano-seconds into a variable declared as ‘duration’. To understand the execution time in a better way the time has been divided by 1000000000 to get the results converted in seconds which makes the execution time more clear (Borko & Bernick, 2013).
After the implementation of execution time it was proved practically too that in case of time efficiency algorithm MaxDistance2 is better than MaxDistance (Borko & Bernick, 2013).
The code that has been used for implementation purposes is:
final long startTime = System.nanoTime();
final long duration = System.nanoTime() – startTime;
System.out.println(“The total execution time is: “+duration);
second= duration/1000000000;
System.out.println(“The total execution time is: “+second);
References:
Almuallim, H. & Dietterich, T. G. (2014). Learning with many irrelevant features. AAAI-91, pages 547, 552.
Balcom, M., Laura, B. & Tong, R. M. (2015). Advanced Decision Systems: Description of the CODEX system as used for MUC-3. In Proceedings of the Third Message Understanding Evaluation and Conference, Los Altos, CA: Morgan Kaufmann.
Biebricher, P., Fuhr, N., Lustig, G., Schwantner., M. & Knorz, G. (2016). The automatic indexing system AIR/PHYS|from research to application. In Eleventh International Conference on Research & Development in Information Retrieval, pages 333:342.
Borko, H. & Bernick, M. (2013). Automatic document classication. Journal of the Association for Computing Machinery, pages 151:161, Volume 2, doi:10.1111/j.1365-2648.2007.04412.x.
Buntine, W. (2010). A theory of learning classication rules. PhD thesis, School of Computing Science, University of Technology, Sydney.
Cerny, B. A., Okseniuk, A. & Lawrence, J. D. (2013). A fuzzy measure of agreement between machine and manual assignment of documents to subject categories. In Proceedings of the 46th ASIS Annual Meeting, page 265 Volume 1, issue no 34.
Cleverdon, C. W. (2011). The signicance of the Craneld tests of index languages. In Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 3, 12.
Crawford, S. L., Fung, R. M., Appelbaum, L. A., & Tong, R. M. (2009). Classication trees for information retrieval. In Eighth International Workshop on Machine Learning, pages 245,249.
Creecy, R. H., Masand, B. M., Smith, S. J., & Waltz, D. L. (2014). Trading MIPS and memory for knowledge engineering: automatic classification of census returns on a massively parallel supercomputer. Technical Report TMC-192, Thinking Machines Corp., Cambridge, MA.
DeJong, G. (2015). An overview of the FRUMP system. In Lehnert, Wendy G. and Ringle, Martin H., editors, Strategies for Natural Language Processing, pages 149,176. Lawrence Erlbaum Associates, Hillsdale, New Jersey.
Dolan, C. P., Goldman, S. R., Cuda, T. V., & Nakamura, A. M. (2014). Hughes Trainable Text Skimmer: Description of the TTS system as used for MUC-3. In Proceedings of the Third Message Understanding Evaluation and Conference, Los Altos, CA: Morgan Kaufmann.
Deogun, J. S. and Raghavan, V. V. (2016). Description of the UNL/USL system used for MUC-3. In Proceedings of the Third Message Understanding Evaluation and Conference, Los Altos, CA: Morgan Kaufmann.
Fuhr, N., Hartmann, S., Lustig, G., & Schwantner, M. (2011) AIR/X|a rule-based multistage indexing system for large subjectelds. In RIAO 91 Conference Proceedings: Intel ligent Text and Image Hand ling, pages 606,623.
Gale, W. A. & Church, K. W. (2010). Poor estimates of context are worse than none. In Speech and Natural Language Workshop, pages 283,287, San Mateo, CA: Morgan Kaufmann.