A ShellSort with a twist - Revised and even faster! (7 March 2009)

Submitted on: 2/19/2015 1:34:00 AM
By: Rde (from psc cd)  
Level: Intermediate
User Rating: By 20 Users
Compatibility: VB 4.0 (32-bit), VB 5.0, VB 6.0
Views: 2199
 
This shell algorithm is founded on a very solid performer that was originally developed by vb2themax and optimized further by several very talented coders, before I twisted a little more out of it in this hybrid version.

This latest version employs a SAFEARRAY substitution technique to trick VB into treating the four-byte string pointers in the string array as VB longs in a native VB long array, and is now much faster. It also uses valid addition and subtraction of unsigned long integers to guarantee safe arithmetic operations on memory address pointers.

Includes both Shell and ShellHyb algorithms, with indexed versions, in a 25kb class and demo project.

Update 21 June 2008, inspired by comments from Roger Gilchrist: improved the indexed sorting support routine to better exploit fast re-sorting performance, and added much-needed index sorting documentation.

Obscure bug fix 7 March 2009: I documented that the routines 'can sort sub-sets of the array data', but with the indexed versions an error *could* occur when doing so without this very small change.

This article has accompanying files

 
				

Shell Hybrid


This shell algorithm is founded on a very solid performer that was originally developed by vb2themax and optimized further by several very talented coders, before I twisted a little more out of it in this hybrid version. This shellsort is heavily optimized and is a solid performer on all data, with no known weaknesses.

Although ShellHyb is optimized for re-sorting and reverse sorting, it also has been greatly boosted in raw outright speed to square up against anything, anytime!

This is the fastest shellsort** I know of. This algorithm excels on both un-sorted and pre-sorted data.

**A hybrid shellsort with a twist, and no bubbles.

Shell algorithms very smartly reduce large arrays down to many small semi-sorted chunks, but then spend most of their short working time reducing these chunks down to ordered pairs, to finish like a bubblesort.

As these chunks get smaller the code makes fewer and fewer actual changes, and the bubble finishing run therefore makes very few changes to the almost sorted array.

This means a shell algorithm is very good at pre-sorting but falls behind the fastest quicksort in outright speed because of its dependence on a slow bubble finish.

This hybrid shell addresses this issue by replacing the bubble with a built-in algorithm that is optimized for pre-sorted data.
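As a rough illustration of the idea above, written in Python rather than VB and not taken from the ShellHyb code, the sketch below runs gapped passes and then hands the nearly sorted array to a separate finishing pass. The halving gap sequence, the stop_gap value of 4, and the use of a plain insertion sort as the finisher are all assumptions chosen for brevity; the finishing algorithm that ShellHyb actually uses is in the download.

```python
def gapped_pass(a, gap):
    """One shellsort pass: insertion-sort each chain of elements gap apart."""
    for i in range(gap, len(a)):
        item = a[i]
        j = i
        while j >= gap and a[j - gap] > item:
            a[j] = a[j - gap]
            j -= gap
        a[j] = item


def finish_presorted(a):
    """Stand-in finisher (assumption): plain insertion sort,
    which does little work on input that is already nearly sorted."""
    gapped_pass(a, 1)


def shell_hybrid_sketch(a, stop_gap=4):
    """Run gapped passes down to stop_gap, then finish in a single pass."""
    gap = len(a) // 2
    while gap >= stop_gap:
        gapped_pass(a, gap)
        gap //= 2
    finish_presorted(a)
    return a


data = ["pear", "fig", "apple", "kiwi", "date", "lime", "plum", "pea", "yam"]
result = shell_hybrid_sketch(list(data))
```

The point of the split is that the small-gap passes are skipped and the remaining disorder is left to a routine that is cheap on almost sorted input.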

Revised Version

The latest version of this algorithm employs a SAFEARRAY substitution technique to trick VB into thinking the four-byte string pointers in the string array are just VB longs in a native VB long array.

The technique simply uses CopyMemory to point a VB long array (defined in the class) at the first of the string pointers in memory, and sets its lower-bound and item count to match (as if it had been redimmed).

This allows us to treat the string pointers as if they were simply four-byte long values in a long array, so they can be swapped around as needed without touching the actual strings that are pointed to.

Reading and assigning to a VB long array is lightning fast, and proves to be considerably faster when copying only one item than the previous method of copying the string pointers using CopyMemory.
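The VB class does this with a SAFEARRAY descriptor and CopyMemory. The same idea can be approximated in Python with ctypes, purely as an analogy: an integer array is overlaid on the memory of a pointer array, so swapping two integers reorders the strings without copying any character data. The names strs and ptrs are illustrative only, and the sketch assumes c_size_t is the same size as a pointer, which it checks before continuing.

```python
import ctypes

words = [b"pear", b"apple", b"fig"]

# An array of C string pointers, standing in for VB's array of string pointers.
strs = (ctypes.c_char_p * len(words))(*words)

# Assumption: c_size_t is pointer-sized on this platform.
assert ctypes.sizeof(ctypes.c_size_t) == ctypes.sizeof(ctypes.c_char_p)

# Overlay an integer array on the same memory; no copy is made.
ptrs = (ctypes.c_size_t * len(words)).from_buffer(strs)

# Swapping two integers swaps the pointers; the string bytes never move.
ptrs[0], ptrs[1] = ptrs[1], ptrs[0]

reordered = list(strs)
```

Reading strs after the swap shows the first two strings exchanged, even though only integer values were assigned.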

Indexed Sort

Also included are indexed versions, which receive a dynamic long array that holds references to the string array indices; this is known as an indexed sort. No changes are made to the source string array.

After a sort procedure is run, the long array is ready as a sorted index (lookup table) to the string array items, so strA(idxA(lo)) returns the item at sorted position lo, whose actual index may be anywhere in the string array.
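A minimal Python sketch of the indexed lookup follows. It uses Python's built-in sorted instead of the shellsort, and zero-based lists instead of VB arrays; str_a and idx_a mirror strA and idxA from the text, while build_sorted_index is a hypothetical helper name.

```python
def build_sorted_index(strs):
    """Return the positions of strs in ascending order of their strings."""
    return sorted(range(len(strs)), key=lambda i: strs[i])


str_a = ["pear", "apple", "fig"]
idx_a = build_sorted_index(str_a)        # [1, 2, 0]

# Walking the index visits the strings in sorted order.
in_order = [str_a[i] for i in idx_a]     # ['apple', 'fig', 'pear']
```

The source list is left exactly as it was; only the index carries the ordering.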

Usage: The index array can be redimmed to match the source string array boundaries, or it can be erased or left uninitialized, before sorting a string array for the first time. However, if you modify string items and re-sort, do not redim or erase the index array; leaving it intact lets you take advantage of the fast refresh sorting performance. This also allows the index array to be passed on to other sorting processes to be further manipulated.

Even when using redim with the preserve keyword and adding more items to the string array, you can pass the index array unchanged and the new items will be sorted into the previously sorted array. The index array will automatically return with boundaries matching the string array boundaries.

Only when you reload the string array items with new array boundaries should you erase the index array for the first sorting operation. Also, if you redim the source string array to smaller boundaries you should erase the index array before sorting the new smaller data set for the first time.
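The usage rules above can be pictured with the following Python sketch. It is not the class's support routine: refresh_index is a hypothetical name, the insertion sort is a stand-in for the real re-sort, and raising an error when the source has shrunk is an assumption made here to keep the sketch safe (the text only says the index should be erased in that case).

```python
def refresh_index(strs, idx):
    """Re-sort idx against strs, reusing its existing order where possible."""
    if len(idx) > len(strs):
        # Assumption: a shrunken source needs a fresh (erased) index.
        raise ValueError("source array shrank; pass an empty index instead")
    # An empty index becomes 0..n-1; a short one gains the new positions.
    idx.extend(range(len(idx), len(strs)))
    # Insertion sort on the index: cheap when most entries are already in order.
    for i in range(1, len(idx)):
        item = idx[i]
        j = i
        while j > 0 and strs[idx[j - 1]] > strs[item]:
            idx[j] = idx[j - 1]
            j -= 1
        idx[j] = item
    return idx


str_a = ["pear", "apple", "fig"]
idx_a = refresh_index(str_a, [])    # first sort: empty ("erased") index
str_a.append("date")                # like redim preserve plus a new item
refresh_index(str_a, idx_a)         # same index passed back in unchanged
```

On the second call only the new entry has to be moved into place, which is the kind of fast refresh the text describes.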

Unsigned Longs

This new version also uses a function to enable valid addition and subtraction of unsigned long integers. This guarantees safe arithmetic operations on memory address pointers.
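VB's Long is a signed 32-bit type, so an address at or above &H80000000 appears as a negative number, and a plain addition that crosses &H7FFFFFFF overflows. The Python sketch below shows one common way to get wrap-around unsigned behaviour while still returning values in the signed 32-bit range. The function names are hypothetical and the code is not taken from the class.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(u):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit value."""
    return u - 0x100000000 if u & 0x80000000 else u


def unsigned_add(a, b):
    """Add two 32-bit values as unsigned, wrapping at 2**32."""
    return to_signed32(((a & MASK32) + (b & MASK32)) & MASK32)


def unsigned_sub(a, b):
    """Subtract two 32-bit values as unsigned, wrapping at 2**32."""
    return to_signed32(((a & MASK32) - (b & MASK32)) & MASK32)


# Stepping 8 bytes past &H7FFFFFFC lands on &H80000004,
# which a signed 32-bit long stores as a negative number.
stepped = unsigned_add(0x7FFFFFFC, 8)
```

Subtracting the same offset again returns the original address, which is the property that pointer arithmetic relies on.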

Features

This algorithm has the following features:

- It can handle sorting arrays of millions of string items.
- It can handle sorting in ascending and descending order.
- It can handle case-sensitive and case-insensitive criteria.
- It can handle zero-based or higher-based arrays.
- It can handle a negative lower bound with a positive upper bound.
- It can handle a negative lower bound with a zero or negative upper bound.
- It can sort sub-sets of the array data.
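Some of the options in this list (sort direction, case sensitivity and sub-set sorting) can be pictured with the short Python sketch below. The function name and parameters are hypothetical and do not reflect the class's actual interface, and Python's built-in sorted stands in for the shellsort. Python lists are always zero-based, so the array-bound features are not shown.

```python
def sort_range(items, lo, hi, descending=False, ignore_case=False):
    """Sort items[lo..hi] (inclusive) in place; elements outside stay put."""
    key = str.lower if ignore_case else None
    items[lo:hi + 1] = sorted(items[lo:hi + 1], key=key, reverse=descending)
    return items


data = ["pear", "Fig", "apple", "Date", "kiwi"]
sort_range(data, 1, 3, ignore_case=True)
# data is now ['pear', 'apple', 'Date', 'Fig', 'kiwi']
```

Only positions 1 to 3 are reordered; the first and last items keep their places.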

Happy coding :)

...
