- Selection sort and bubble sort
- Posted by cr88192 on October 19th, 2007
"user923005" <dcorbit@connx.com> wrote in message
news:1192774610.213261.31130@e34g2000pro.googlegro ups.com...
I think the main issue is how one chooses the median.
in that particular version (the one I had used in a BWT implementation, and
a few others), 1 pass was used to determine the min and max, and the mid
point was calculated exactly.
more often (since then) I use either the middle or a random element.
now, one could always fall back to bubble sort as well (in cases where the
recursion depth is too deep, as opposed to selection sort as I had done in
the past...).
now, in this case, these rare cases are just slow (rather than overflowing
the stack due to recursing too deep...).
this still retains a reasonable amount of simplicity in any case...
seems it will go now...
- Posted by user923005 on October 19th, 2007
On Oct 19, 4:03 pm, "cr88192" <cr88...@nospam.hotmail.com> wrote:
This version will usually perform quite well, but loses some in
efficiency, since each recurive call will visit every item in the set
to calculate the median. Usually, a set is sampled randomly and the
median is estimated using the sample.
Unless you calculate the median exactly (and that cannot be done by
averaging min and max) then you can still have serious problems with
some distributions. Calculating the median is O(n) on average, but
there is a large constant of proportionaltiy.
It's a lot simpler to simply revert to heapsort if we see that we have
recursed farther than log(n) times.
Falling back to bubble sort will not solve the problem. Bubble sort
is O(N^2) and worst case quick sort is O(N^2) and so the fallback to
bubble sort won't change the behavior when it goes bad.
- Posted by cr88192 on October 20th, 2007
"user923005" <dcorbit@connx.com> wrote in message
news:1192837761.239691.303570@e34g2000pro.googlegr oups.com...
yeah, I had used both approaches at different times (exact calculation, and
the '(min+max)/2' estimate).
yes, but note that heapsort is not the simplest of sorting methods (unlike
selection sort, or a modified bubble sort), which is why I had not done so
before...
yes, it does change the behavior. how: bad inputs on large sets will not
cause the stack to overflow...
(with a 'pure' quicksort, I had ran into this problem more than a few
times).
well, also note that I had typically not used a pure bubble sort, but a
modified bubble sort (a general approach I had tweaked over the years to
make it faster).
the tweaks had consisted of going back and forth in both directions
(generally caused values to sort out much faster), and also eliminating the
min and max extremes (noting that the largest and smallest values tend to
move directly to the ends). likewise, I tended to cut out early, as very
often sorting would seem to get done much faster than the theoretical limit.
this was actually my preferred sorting method for years until I figured out
how to implement in-place quicksort...
but, it may not seem like much, but these approaches have held up fairly
well IME...
- Posted by Charles Richmond on October 21st, 2007
user923005 wrote:
One case where bubble sort has its *best* run time, and quicksort
has its *worst* run time... is on data that is *almost* sorted,
just one or two items out of order. Say, if two items are out
of order in a list of 1000 items, the bubble sort will only make
three or so passes over the list of items, but quicksort will
have a run time of O(N^2).
If you have a list of data items where you *know* for some reason
that there will only be one or two items out of order, then bubble
sort will beat quicksort.
--
+----------------------------------------------------------------+
| Charles and Francis Richmond richmond at plano dot net |
+----------------------------------------------------------------+
- Posted by user923005 on October 22nd, 2007
On Oct 20, 3:40 am, "cr88192" <cr88...@nospam.hotmail.com> wrote:
/*
This version of heapsort is adapted from:
Data Structures and Algorithm Analysis in C (Second Edition)
by Mark Allen Weiss
Addison-Wesley, 1997
ISBN: 0-201-49840-5
Page 228.
It is simple and interesting, but quick-sort is better than heap-sort.
*/
#define LeftChild( i ) ( 2 * ( i ) + 1 )
void PERCDOWN(Etype A[], int i, int N)
{
int Child;
Etype Tmp;
for (Tmp = A[i]; LeftChild(i) < N; i = Child) {
Child = LeftChild(i);
if (Child != N - 1 && GT(A[Child + 1], A[Child]))
Child++;
if (LT(Tmp, A[Child]))
A[i] = A[Child];
else
break;
}
A[i] = Tmp;
}
void HEAPSORT(Etype A[], int N)
{
int i;
for (i = N / 2; i >= 0; i--)
/* BuildHeap */
PERCDOWN(A, i, N);
for (i = N - 1; i > 0; i--) {
SWAP(&A[0], &A[i]);
/* DeleteMax */
PERCDOWN(A, 0, i);
}
}
No. WHen you switch to heapsort, you have no danger of stack overflow
either. So that is not an advantage.
This is always an interesting exercise.
You can limit stack problems with quicksort to a large degree by
always searching the smallest partition first. And (of course) the
'introspective sort' approach completely eliminates the problem.
- Posted by user923005 on October 22nd, 2007
On Oct 21, 12:29 am, Charles Richmond <friz...@tx.rr.com> wrote:
This is true for the naive version of quicksort, but if the data is
mostly sorted (or mostly reversed) then a partition check for sorted
or reversed detects the situation and is also O(n).
/* Test to see if segment [Lo, ... Hi] is sorted already */
PRELUDE
long ARRAYISSORTED(Etype A[], unsigned long Lo, unsigned
long Hi)
{
unsigned long i;
unsigned int ascending;
if (Lo == Hi)
return 1;
if ((ascending = (LE(A[Lo], A[Lo + 1]))))
for (i = Lo + 1; i < Hi; i++) {
/* Is the next item bigger than this one? */
if (GT(A[i], A[i + 1])) {
/* Were all of them the same size up to here? */
if (EQ(A[i], A[Lo]))
/* Maybe the partition is reversed? */
if (ARRAYISREVERSED(A, i, Hi)) {
/* It was reversed. Invert it and we are
done! */
REVERSEARRAY(A, Lo, Hi);
return 1;
} else { /* Nope. The partition is not
reversed */
return 0;
}
else
return 0;
}
}
/* not ascending */
else if (ARRAYISREVERSED(A, Lo, Hi)) {
/* It was reversed. Invert it and we are done! */
REVERSEARRAY(A, Lo, Hi);
return 1;
} else
return 0;
return 1;
}
/* Test to see if segment [Lo, ... Hi] is reverse sorted already */
PRELUDE
long ARRAYISREVERSED(Etype A[], unsigned long Lo, unsigned
long Hi)
{
unsigned long i;
for (i = Lo; i < Hi; i++) {
if (LE(A[i], A[i + 1]))
return 0;
}
return 1;
}
/*
Reverse the order of a partition.
*/
PRELUDE
void REVERSEARRAY(Etype * A, unsigned long Lo, unsigned
long Hi)
{
Etype *r = A + Lo;
for (r = A + (Hi - Lo); A < r; A++, r--) {
Etype Tmp = *A;
*A = *r;
*r = Tmp;
}
}
But you had better be pretty darn sure.
;-)