[warebiz] :: Programming Bizness

Specializing in programming tutorials that make sense!

 Programming Tutorials

 Resources     » Articles
    » Projects
    » CS Dictionary
    » References
    » Books

 Site     » Home
    » About
    » Support
    » Make Donation
binary digits

 Random Insight ::

binary digits

 Book References ::

binary digits

 Geek Wisdom ::

Searching Arrays Article
Articles - Techniques for Sorting Arrays 

Introduction

Like array searching techniques, techniques for sorting array data are normally taught during entry level programming classes after control structures have been investigated. Also like search algorithms, sort algorithms require logical thinking and "looping" skills. It is extremely important for beginning programmers to fully understand these algorithms. The student should never attempt to use an algorithm that he does not actually understand. Fully understanding the algorithm will prevent logic errors and program "bugs".

There are many problems that may require a programmer to write a function to sort a particular array. Maybe a teacher would like to sort her grade book in descending order based on student term grades to reward her top students. Maybe a bank owner would like to sort a list of accounts in ascending order based on "bounced" check history to reward his reliable clients. In this article, I will present the following four standard sorting techniques: selection sort, bubble sort, merge sort, and tag sort. The tag sort is actually dependent upon a selection or bubble sort. The neat aspect of a tag sort is that it doesn't actually manipulate array elements. It simply keeps track of a sorted "order" of the elements. You will see more about this later in the reading. Please note that with minimal changes, these algorithms may be applied to arrays used to store integer, floating-point, and character values. In this article, I will first present a simple sorting problem, and then show examples of each sort method using the C++ and ADA programming languages.

"Sorting" Mindset

Array sorting techniques require much of the same thinking skills as searching. The most important aspect by far is simply knowing how to manipulate an array. By definition, an array is a homogeneous structure consisting of a collection of memory locations used to store data values all of the same type. In particular, the programmer will also need to be aware of how arrays store data using index or subscript values. Arrays always have a beginning subscript value and a terminating ending value. These subscript values make up the "range" of the array. For instance in C++, an array always has a beginning subscript value of zero. In other languages such as Ada, the beginning subscript value doesn't have to be zero. The programmer will also need to be comfortable with looping, especially for loops. When investigating the bubble sort, we will see how "nested" loops, loop structures within another loop structure, come into play. As with searching, array sorting will require the algorithm to traverse the "range" of the array while sorting array elements. We are most concerned with the method used to sort while traversing the array. The size of the array should also be known by the programmer. Here again just like searching, it is sometimes best to declare the array size as a global constant. If familiar with array and for loop logic, the code should be fairly straightforward to understand. I recommend using your "mind" to execute the code using sample data with pen and paper before attempting any actual program compilations. This will force you to fully understand the concept of sorting.

The Problem [Jump to Example Program]

You have been hired by some company leading its industry to develop a program that will sort the company's daily earnings during each month. The daily earnings need to be sorted into descending order for each month throughout the year. Fortunately for you, the company stores their daily earnings in an array. There is an array for each month. Since you are working with C++, the element in position 0 of a particular array will denote the company's earnings for day one of that month. Your job is to simply write a function that is capable of sorting an array of earnings (floating-point values). This function should be able to sort each month's earnings. Therefore, it should not be dependent on a specific size for sorting.

Sort Algorithms

You decide that it would be best to write the function using three different sorting techniques: selection, bubble, and tag. We will first attack the selection sort.

Selection Sort
A selection sort is capable of sorting a one-dimensional array into ascending or descending order. The sort itself, depending on if the array is to be sorted into ascending or descending order, will perform comparisons during each loop cycle to find the smallest or largest value that is in an "unlocked" position. This value will then be placed in a "locked" position so it is not compared in the next cycle with the other values. For example, if an array is to be sorted into ascending order, during the first loop cycle, the selection sort will find the smallest element and will "lock" that element in the first location of the array ( array[0] ). During the next cycle, the sort will begin searching from the next unlocked position (subscript two), find the next smallest value, and lock it into the second array location ( array[1] ). Because array values will be moved from an initial position to a new position, a "swap" function will be implemented and used to handle this sorting operation. The following is an example of a selection sort written in C++. Study the code very carefully. Try to mimic the "mind" of the computer while executing example array values of your choice.


// selection sort used to sort an array of double precision
// values into descending order
void selectionSort ( double arr[], int size )
{
    int indexOfMax, pass, j;
    for ( pass = 0; pass < size - 1; pass++ )
    {
            indexOfMax = pass;
            for ( j = pass + 1; j < size; j++ )
                if ( arr[j] > arr[indexOfMax] )
                    indexOfMax = j;
            if ( indexOfMax != pass )
            	swap ( &arr[pass], &arr[indexOfMax] );
    }
}

// swap function for floating point values
void swap ( double* x, double* y )
{
    double temp;
    temp = *x;
    *x = *y;
    *y = temp;
}

The above selection sort is a standard technique for sorting arrays. Notice the "enlarged" greater than sign (>) inside the "if" statement of the nested loop. This statement performs relational comparisons to find the largest element in the current "unlocked" array positions. The statement:


indexOfMax = pass;

sets a default subscript value for the largest value in an "unlocked" position for the current cycle. If during the rest of the loop cycle a value is found that is larger than the value subscripted by indexOfMax, we record its position and set it as indexOfMax. NOTE: If we need to sort the array into ascending order, the only code segment which needs to be changed is:


if ( arr[j] < arr[indexOfMin] )

Here, we change the (>) sign to (<) so we can find the smallest element in the current "unlocked" positions. Notice also that I renamed indexOfMax to indexOfMin for clearer code. Here is the same algorithm as above written using the ADA programming language.


MAXSIZE : constant integer := 10;
type tmyArr is array(0..MAXSIZE-1) of float;
.
.
.
procedure swap(x : in out float; y : in out float) is
	temp : float;
begin
	temp := x;
	x := y;
	y := temp;
end swap;

procedure selectionSort(arr : in out tmyArr; size : in integer) is
	indexOfMax  : integer;
begin
	for pass in 0..size-2 loop
		indexOfMax := pass;
		for n in pass+1..size-1 loop
			if arr(n) > arr(indexOfMax) then
				indexOfMax := n;
			end if;
		end loop;

		 if indexOfMax /= pass then
			swap(arr(indexOfMax), arr(pass));
		end if;
	end loop;
end selectionSort;

Bubble Sort
The bubble sort is another standard technique for sorting data in an array. Instead of making one "swap" after cycling through the array like the selection sort, the bubble sort could possibly make several "swaps" of values depending on whether we want to sort the data into ascending or descending order. The following occurs when we want to use the bubble sort to sort an array of values into descending order. The sort begins by evaluating the first two elements in the array. If the first element is less than the second element, then a "swap" will be made. If not, no "swap" is made. The second element and the third element are evaluated next. If the second element is less than the third, then a "swap" is made. If not, no "swap" is made. This exact same process of comparing adjacent array elements is repeated. If the first array element is the smallest value in the array, it will continue to be "swapped" or "bubbled" through the array until it reaches the last position in the array. Whereas the selection sort "locks" values into beginning locations of an array, the bubble sort "locks" values into locations at the end of the array. Once again, we must use a "swap" function along with our bubble sort function when sorting the array. The following is a C++ bubble sort function used to sort an array of double precision values into descending order.


// bubble sort function defined for descending order
// assortment of double precision values
void bubbleSort ( double arr [ ], int size )
{
    int last = size - 2;
    int isChanged = 1, k;

    while ( last >= 0 && isChanged )
    {
            isChanged = 0;
            for ( k = 0; k <= last; k++ )
                if ( arr[k] < arr[k+1] )
                {
                    swap ( &arr[k], &arr[k+1] );
                    isChanged = 1;
                }
            last--;
    }
}

// swap function defined for double precision values
void swap ( double* x, double* y )
{
    double temp;
    temp = *x;
    *x = *y;
    *y = temp;
}


Notice again within the above code the enlarged (>) sign within the "if" statement of the for loop. This statement compares adjacent elements to determine if a "swap" of the elements needs to be made. If we need to sort the array into ascending order, we can simple modify this statement to the following:


if ( arr[k] > arr[k+1] )

Here, we changed the (<) to (>) so we can "swap" adjacent elements when an arbitrary value arr[k] is greater in value that an arbitrary value arr[k+1]. Here is the same algorithm as above written using the ADA programming language.


MAXSIZE : constant integer := 10;
type tmyArr is array(0..MAXSIZE-1) of float;
.
.
.
procedure swap(x : in out float; y : in out float) is
	temp : float;
begin
	temp := x;
	x := y;
	y := temp;
end swap;

procedure bubbleSort(arr : in out tmyArr; size : in integer) is
	last : integer := size - 2;
	isChanged : boolean := TRUE;
begin
	while last >= 0 and isChanged loop
		isChanged := FALSE;
		for n in 0..last loop
			if arr(n) < arr(n+1) then
				swap(arr(n), arr(n+1));
				isChanged := TRUE;
			end if;
		end loop;
		last := last - 1;
	end loop;
end bubbleSort;

We have now covered two standard techniques for sorting data in an individual array: a selection and bubble sort. These algorithms are capable of solving our previous problem of sorting company daily earnings. However, you have probably noticed that both of these methods directly manipulate the array being sorted. That is, these methods actually move the elements from array position to array position until the array is sorted. What if there was an algorithm used to sort an array without modifying its original element positioning? Well, there is one. It is called the tag sort. This clever method will keep an "ordered" listing of array subscripts. This listing can then be used when needing to display the array in a sorted manner.

Tag Sort

Sometimes, a programmer will want to directly sort the elements in an array using a selection or bubble sort, but there may be times when he needs to keep the actual array in tact and use a "tagged" array to store the correct positioning of the array when it is sorted. When the programmer needs to refer to the sorted array, he can call upon this "tagged" array. In other words, the actual elements are not being changed during the sort process. The positions in the tag array are being changed so they will hold the correct ordering of the elements when they are sorted. For example, consider the following double precision array and double precision tag array:


arr[6] = { 334.3, 776.4, 112.58, 1252.31, 3998.45, 445.91 };
tag[6] = {     0,     1,      2,       3,       4,      5 };

This associates the position 0 in tag[] to the value of 334.3 in arr[], the position 1 in tag[] to the value of 776.4 in arr[], the position 2 in tag[] to the value of 112.58 in arr[], the position 3 in tag[] to the value of 1252.31 in arr[], the position 4 in tag[] to the value of 3998.45 in arr[], and the position 5 in tag[] to the value of 445.91 in arr[]. After the tag sort (for descending order) is executed, the arrays would contain the following elements:


arr[6] = { 334.3, 776.4, 112.58, 1252.31, 3998.45, 445.91 };
tag[6] = {     4,     2,      5,       1,       0,      3 };

As shown, the original elements in arr[] were not changed, but the original elements in tag[] were manipulated. The tag[] array now holds the correct subscript ordering of the elements in arr[] so the array can be sorted into descending order when the tag[] array is called upon.

The following is a tag sort function used to sort an array of double precision values into descending order based upon the selection sort. Note that the function will also use a "swap" function so values can be manipulated during the sort process:


// tag sort based on selection sort
void tagSort(double dataArr[], int tagArr[], int size)
{
    int k, indexOfMax, pass;

    for (k = 0; k < size; k++) // fill tag array with initial values
        tagArr[k] = k;

    for (pass = 0; pass < size - 1; pass++)
    {
        indexOfMax = pass;
        for (k = pass + 1; k < size; k++)
            if (dataArr[ tagArr[k] ] > dataArr[ tagArr[indexOfMax] ])
                indexOfMax = k;
        if (indexOfMax != pass)
            swap ( &tagArr[pass], &tagArr[indexOfMax] );
    }
}// end tagSort( )

// swap function defined for integers
void swap ( int* x, int* y )
{
    int temp;
    temp = *x;
    *x = *y;
    *y = temp;
}

Here is a tag sort with the help of the bubble sort written in ADA:


MAXSIZE : constant integer := 10;
type tmyFloatArr is array(0..MAXSIZE-1) of float;
type tmyIntArr is array(0..MAXSIZE-1) of integer;
.
.
.
procedure swap(x : in out integer; y : in out integer) is
	temp : integer;
begin
	temp := x;
	x := y;
	y := temp;
end swap;

procedure tagSort(dataArr : in out tmyFloatArr; tagArr : in out tmyIntArr;
					size : in integer) is
	last : integer := size - 2;
	isChanged : boolean := TRUE;
begin
	for k in 0..maxSize-1 loop
		tagArr(k) := k;
	end loop;

	while last >= 0 and isChanged loop
		isChanged := FALSE;
		for n in 0..last loop
			if dataArr(tagArr(n)) < dataArr(tagArr(n+1)) then
				swap(tagArr(n), TagArr(n+1));
				isChanged := TRUE;
			end if;
		end loop;
		last := last - 1;
	end loop;
end tagSort;

When you need to use the tag sort technique in a program, you obviously need to first call the tagSort() function, and then use the tagArr[] for array subscripting when you are displaying the contents of the actual sorted array. For example, if we wanted to display the dataArr array from above, which has a total of size elements, we could call our tagSort() function and use the following code when we want to display the sorted array:


int k;
for (k = 0; k < size; k++)
       cout << dataArr[ tagArr[k] ] << " ";

We have now covered three sorting techniques. Each method would be able to solve our previous company daily earnings sorting problem. Since the array data would probably never need to be kept in its original form, any algorithm would be sufficient for sorting. However, the bubble sort algorithm is far more efficient than the selection sort. When speed and efficiency is a factor, like when dealing with very large, space consuming arrays, the bubble sort will execute much faster.

Program Solution to the Problem

The following is a complete program written using C++ illustrating how the tag sort can be used to solve the problem presented at the beginning of this article. The program simply illustrates how the array denoting the earnings for the month of January can be sorted. The tag sort is based upon the bubble sort algorithm. The program successfully compiled using Microsoft Visual C++ V. 6 compiler.

Source Code

View Source Code


Merge Sort

NOTE: The following sort technique is not related to our previous sorting problem.

While the selection sort, bubble sort, and tag sort are used to sort individual arrays, the merge sort is used to combine two sorted arrays into a third array. The third array should be defined to keep the assortment of either ascending or descending order in tact. This will be taken care of during the "merge" process. It is important to note that a merge sort requires two arrays that have been previously sorted into ascending or descending order. When the sort first begins (for ascending order), the first elements of each of the arrays are compared to determine which array contains the smaller beginning value. The smaller value will then be placed as the first element in the third array. The sort continues to compare the elements of both arrays until the end of one of the arrays is reached. Once the sort has reached the end of one array, the remaining values in the other array will be attached to the end of the third array. The following is a complete program written in C++ demonstrating how to merge two arrays previously sorted into ascending order:

Source Code

View Source Code


This completes the Techniques for Sorting Arrays article. I hope you have found it to be interesting and informative. Happy coding!

[Back to Top]

.: Report Bug : Error Log : About Webmaster :.