Sorting: Algorithm and Data Structure

No comments

The objective of the sorting algorithm is to rearrange the records so that their keys are ordered according to some well-defined ordering rule.
Problem:   Given an array of n real number A[1.. n].
Objective: Sort the elements of A in ascending order of their values. Internal Sort
 
External Sort
  • Sorting files from tape or disk.
  • In this method, an external sort algorithm must access records sequentially, or at least in the block.
 
Memory Requirement
  1. Sort in place and use no extra memory except perhaps for a small stack or table.
  1. Algorithm that use a linked-list representation and so use N extra words of memory for list pointers.
  1. Algorithms that need enough extra memory space to hold another copy of the array to be sorted.
 
Stability
The Decision Tree Model
 
 
A Lower Bound for the Worst Case
Taking logarithms on both sides

(lg(n!) ≤ h
        h  ≥ lg(n!)

Taking logarithms on both sides
(lg(n!) ≤ h
        h  ≥ lg(n!)

(lg(n!) ≤ h
        h  ≥ lg(n!)

        h  ≥ lg(n!)
If the file to be sorted will fit into memory or equivalently if it will fit into an array, then the sorting method is called internal. In this method, any record can be accessed easily.
A sorting algorithm is called stable if it is preserves the relative order of equal keys in the file. Most of the simple algorithm are stable, but most of the well-known sophisticated algorithms are not.


There are two classes of sorting algorithms namely, O(n2)-algorithms and O(n log n)-algorithms. O(n2)-class  includes bubble sort, insertion sort, selection sort and shell sort. O(n log n)-class includes heap sort, merge sort and quick sort.


O(n2) Sorting Algorithms



O(n log n) Sorting Algorithms


Now we show that comparison-based sorting algorithm has an Ω(log n) worst-case lower bound on its running time operation in sorting, then this is the best we can do. Note that in a comparison sort, we use only comparisons between elements to gain information about an input sequence <a1a2, . . . , an>. That is, given two elements ai and aj we perform one of the tests, ai < ajai ≤ aj,ai aj and ai ≥ aj to determine their relative order.
Given all of the input elements are distinct (this is not a restriction since we are deriving a lower bound), comparisons of the form ai =aj are useless, so no comparison of ai aj are made. We also note that the comparison ai ≤  aj , ai ≥ aj and ai < aare all equivalent. Therefore we assume that all comparisons have form ai ≥ aj.


Each time a sorting algorithm compares two elements aand aj , there are two outcomes: "Yes" or "No". Based on the result of this comparison, the sorting algorithm may perform some calculation which we are not interested in and will eventually perform another comparison between two other elements of input sequence, which again will have two outcomes. Therefore, we can represent a comparison-based sorting algorithm with a decision tree T.
As an example, consider the decision tree for insertion sort operating on given elements a1a2 and a3. There are are 3! = 6 possible permutations of the three input elements, so the decision tree must have at least 6 leaves.


In general, there are n! possible permutations of the n input elements, so decision tree must have at least n! leaves.
The length of the longest path from the root to any of its leaves represents the worst-case number of comparisons the sorting algorithm perform. Consequently, the worst-case number of comparisons corresponds to the height of its tree. A lower bound on the height of the tree is therefore a lower bound on the running time of any comparison sort algorithm.

Theorem    The running time of any comparison-based algorithm for sorting an n-element sequence is Ω(n lg n) in the worst case.

Examples of comparison-based algorithms (in CLR) are insertion sort, selection sort, merge sort, quicksort, heapsort, and treesort.

Proof    Consider a decision tree of height h that sorts n elements. Since there are n! permutation of n elements and the tree must have at least n! leaves. We have
n! ≤ 2h

Since the lg function is monotonically increasing, from Stirling's approximation we have
                          n! > (n/e)n         where e = 2.71828 . . .
           h  ≥  (n/e)n          which is Ω(lg n)

No comments :

Post a Comment