Lab: 24 - Quicksort

Assigned
Wednesday, 25 October 2023
Summary
Let’s practice with sorting algorithms.

Lab Write Up Planning

In this lab, you will be modifying code and analyzing its performance over several files. You will hand in a copy of your data collection and your evaluation of what type of operation is the most significant for the Quicksort algorithm.

So, I recommend that you start a spreadsheet and prepare to record for each of the files:

  • The total number of tokens in the file (each is on its own line). This is your N.
  • The number of swaps done when you run the instrumented sort.
  • The number of assignments done when you run the instrumented sort.
  • The number of comparisons between an array element and another data element when you run the instrumented sort.

Remember that you need to track this information for each of the three files mentioned in the lab.

Setting Up

For today’s lab, we’ll take Weiss’s implementation of Quicksort as a starting point. It can be found on MathLAN on lines 159–248 of the file

/home/jimenezp/csc207/Lab-Sorting/Sort.java

You might already have this file in another Eclipse project and can copy the appropriate lines.

  • Create a new Java project for this lab, create a file for a new Quicksorter class, and copy into the definition of that class the definitions of the methods that make up Weiss’s implementation of quicksort. Look through Sort.java and copy the methods that are called for sorting arrays with quicksort. Note that you don’t need to copy the definition of the quickSelect method, but you should copy the insertionSort method that is used as a helper method by quicksort.

Read through the methods.

  • Trace the execution of quicksort on the following array {55, 10, 20, 60, 100, 70} (Update the code as needed to trace quicksort on an array of at least 3 elements).
    • Show the values of low and high for each recursive call, the pivot value, the array elements after each swap and the array elements returning from each call.
  • What strategy does this implementation use for determining the pivot?
  • Why is the pivot placed at the position high - 1?
  • Why the partitioning process uses the values i = low and j = high - 1 for the loop?
  • What is the purpose of CUTOFF?

QuicksorterApp

  • Create a file for a QuickSorterApp program. In this file, design and write a static method isCorrectlyOrdered that takes an array as its argument and determines whether its elements are correctly ordered. The elements of the array must implement the Comparable interface and in particular must be comparable with one another. You should be able to use your method from the previous lab with little or no modification.
  • As in the MergeSort lab, create and test a small Integer array to test that quicksort and isCorrectlyOrdered are working correctly. Once your testing and quicksort programs seem to work correctly, use quicksort on the eight-cousins file: Write and execute a program to determine whether Weiss’s quicksort procedure correctly rearranges an array that initially contains strings read in from the file:
    /home/jimenezp/csc207/Lab-Sorting/eight-cousins.txt.tokenized
    

    (This file contains the full text of Louisa May Alcott’s novel Eight Cousins, with each word or punctuation mark on a separate line.)

Counting Operations

  • Add to the Quicksorter class a static field called swaps that counts the number of calls to the swapReferences method. Add a statement at the beginning of the public quicksort method that initializes swaps to 0 and a statement at the end that prints out a message that includes the value of swaps at the end of the sorting process. Finally, add a statement to the swapReferences method that increments swaps by 1 each time swapReferences is invoked. Use your “instrumented” version of Weiss’s code to count the swaps that the quicksort method makes in sorting the Alcott data.
  • After sorting an array that initially contains the values 24, 13, 26, 1, 2, 27, 38, and 15, in that order, the instrumented version of Weiss’s quicksort method will report that no swaps at all are made. Referring to the code, give an explanation for this result. You may find it helpful to use the Eclipse debugger for tracing the call to the quicksort method.
  • Instead of counting swaps, suppose that we wanted to count data movements, that is, executions of assignment statements in which a value is moved from one array location to another, from an array location to a temporary variable, or from a temporary variable to an array location. Adapt Weiss’s code so that it counts and reports the total number of data movements in the course of one execution of the public quicksort method.
  • Adapt Weiss’s code so that it also counts and reports the total number of data comparisons, which are evaluations of Boolean expressions comparing one array element to another or one array element to the value of some variable. Note that when the test condition of an if-statement or a for-loop does not hold, it should be also counted as a data comparison.

Data Analysis

  • Considering the total numbers obtained above, what is the most executed operation? Confirm that this operation is the relevant operation by comparing its count to N log N (the Big-O running time of quicksort algorithm for an input size N). Do this comparison on two additional files: rose-in-bloom.txt.tokenized and little-men.txt.tokenized. (These files are in the same resources directory path shown in the QuicksorterApp section.)

You can compute log (base 2) in Java or use Excel or similar spreadsheet programs (you should be able to calculate the log (base 2) of a number with the formula LOG(N, 2)).

  1. Per each file include:
    1. the number of tokens in each file (this is your N)
    2. the number of swaps performed on array elements
    3. the number of assignments performed with array elements
    4. the number of comparisons performed with array elements
  2. For each file, you should calculate the expected N log N
  3. Based on your data and the expect value of N log N, which operation seems to be the most significant operation?

Lab Submission

[4 points] In your write-up report include the Data Analysis Section and answer to posted questions.

I strongly recommend that each student keep a copy of the lab. Therefore, don’t forget to share the files/folder with your lab partner!

Remember: Do not use magic numbers, write your code anticipating errors and print user-friendly error messages, all your public methods should be well documented (use Javadoc comments). Please include comments or organize your code and report in a way that will help the grader to find the answer to the exercises or posted questions.

Acknowledgements

From Data Structures & Problem Solving Using Java by Mark Allen Weiss