This shell algorithm is founded on a very solid performer that was originally developed by vb2themax, and optimized further by several very talented coders, before I twisted a little more out of it in this hybrid version.
----
This latest version employs a clever SAFEARRAY substitution technique to trick VB into thinking the four-byte string pointers in the string array are just VB longs in a native VB long array, and is now much faster.
----
Also uses valid addition and subtraction of unsigned long integers to guarantee safe arithmetic operations on memory address pointers.
----
Includes both Shell and ShellHyb algorithms with indexed versions in 25kb class and demo project.
----
Update 21 June 2008 inspired by comments from Roger Gilchrist. Improved indexed sorting support routine to better exploit fast re-sorting performance, and added much needed index sorting documentation.
----
Obscure Bug Fix 7 March 09. I documented they 'can sort sub-sets of the array data' but with the indexed versions if you do an error *could* occur without this very small change.
Terms of Agreement:
By using this article, you agree to the following terms...
You may use
this article in your own programs (and may compile it into a program and distribute it in compiled format for languages that allow it) freely and with no charge.
You MAY NOT redistribute this article (for example to a web site) without written permission from the original author. Failure to do so is a violation of copyright laws.
You may link to this article from another website, but ONLY if it is not wrapped in a frame.
You will abide by any additional copyright restrictions which the author may have placed in the article or article's description.
Shell Hybrid
This shell algorithm is founded on a very solid performer that was originally developed by vb2themax, and optimized further by several very talented coders, before I twisted a little more out of it in this hybrid version. This shellsort is unbelievably optimized and is a solid performer on all data with no known weaknesses.
Although ShellHyb is optimized for re-sorting and reverse sorting, it also has been greatly boosted in raw outright speed to square up against anything, anytime!
This is the fastest shellsort** I know of. This algorithm excels on both un-sorted and pre-sorted data.
**A hybrid shellsort with a twist, and no bubbles.
Shell algorithms very smartly reduce large sized arrays down to many small semi-sorted chunks, but then spend most of their short working time reducing these chunks down to ordered pairs to finish like a bubblesort.
As these chunks get smaller the code makes less and less actual changes, and the bubble finishing run therefore makes very few changes to the almost sorted array.
This means a shell algorithm is very good at pre-sorting but falls behind the fastest quicksort in outright speed because of its dependance on a slow bubble finish.
This hybrid shell addresses this issue by replacing the bubble with a built in algorithm that is optimized for pre-sorted data.
Revised Version
The latest version of this algorithm employs a SAFEARRAY substitution
technique to trick VB into thinking the four-byte string pointers in
the string array are just VB longs in a native VB long array.
The technique simply uses CopyMemory to point a VB long array (defined
in the class) at the first of the string pointers in memory, and sets
its lower-bound and item count to match (as if it had been redimmed).
This allows us to treat the string pointers as if they were simply
four-byte long values in a long array and can be swapped around as
needed without touching the actual strings that are pointed to.
Reading and assigning to a VB long array is lightning fast, and proves
to be considerably faster when copying only one item than the previous
method of copying the string pointers using CopyMemory.
Indexed Sort
Included are indexed versions which receive a dynamic long array to hold
references to the string array indices which is known as an indexed sort.
No changes are made to the source string array.
After a sort procedure is run the long array is ready as a sorted
index (lookup table) to the string array items,
so strA(idxA(lo)) returns
the lo item in the string array whose index
may be anywhere in the string array.
Usage: The index array can be redimmed to match the source string array
boundaries or it can be erased or left uninitialized before sorting
a string array for the first time. However, if you modify string items
and re-sort you should not redim or erase the index array to
take advantage of the fast refresh sorting performance. This also allows
the index array to be passed on to other sorting processes to be further manipulated.
Even when using redim with the preserve keyword and adding more items to
the string array you can pass the index array unchanged and the new items
will be sorted into the previously sorted array. The index array will automatically
return with boundaries matching the string array boundaries.
Only when you reload the string array items with new array boundaries should you
erase the index array for the first sorting operation. Also, if you redim the
source string array to smaller boundaries you should erase the index array before
sorting the new smaller data set for the first time.
Unsigned Longs
This new version also uses a function to enable valid addition and subtraction of unsigned long integers. This guarantees safe arithmetic operations on memory address pointers.
Features
This algorithm has the following features:
- It can handle sorting arrays of millions of string items.
- It can handle sorting in ascending and descending order.
- It can handle case-sensitive and case-insensitive criteria.
- It can handle zero or higher based arrays.
- It can handle negative lb and positive ub.
- It can handle negative lb and zero or negative ub.
- It can sort sub-sets of the array data.
Note: Due to the size or complexity of this submission, the author has submitted it as a .zip file to shorten your download time. Afterdownloading it, you will need a program like Winzip to decompress it.Virus note:All files are scanned once-a-day by Planet Source Code for viruses, but new viruses come
out every day, so no prevention program can catch 100% of them. For your own safety, please:
Re-scan downloaded files using your personal virus checker before using it.
NEVER, EVER run compiled files (.exe's, .ocx's, .dll's etc.)--only run source code.
Scan the source code with Minnow's Project Scanner
If you don't have a virus scanner, you can get one at many places on the net
including:McAfee.com
Terms of Agreement:
By using this article, you agree to the following terms...
You may use
this article in your own programs (and may compile it into a program and distribute it in compiled format for languages that allow it) freely and with no charge.
You MAY NOT redistribute this article (for example to a web site) without written permission from the original author. Failure to do so is a violation of copyright laws.
You may link to this article from another website, but ONLY if it is not wrapped in a frame.
You will abide by any additional copyright restrictions which the author may have placed in the article or article's description.
Acknowledgement must go to LukeH (Selftaught) and Tai Chi Minh Ralph Eastwood for sharing the SAFEARRAY substitution technique used in this latest version.
Luke introduced me to this trick with his Replace function that is the fastest in existance!
Ralphs string sorting routines are also the fastest around, and his integer/unicode technique is both clever and elegant.
This submission is the result of the sharing of code for the benifit of others.
Thanks Paul and Russell for your generous votes. (If this comment was disrespectful, please report it.)
GREAT JOB!! Well documented and F-A-S-T! This one is a keeper. I'm adding this to my sort collection. Thanks for sharing - we all benefit from your great work!
Sincerely, James Tracy JamesTracy95820@gmail.com Sacramento, California USA
(If this comment was disrespectful, please report it.)
Thanks fella's for you kind words. I'm glad you like it and you're welcome.
Well James, thanks for the very humbling comment but I could not claim the credit for this as it's strength lies squarely on the shell algo developed and optimized by others before I added my little bit.
It is amazing that this shell algo is such a solid performer with such little amount of code, a testament to the coders who have lovingly optimized it to within an inch of its life :)
Even my additions are the result of generous code collaboration.
Happy coding, Rd :) (If this comment was disrespectful, please report it.)
How can i sort starting from a position within the strings? example: =====V start sort from here to right.ty 3234/AA... 4323/BB... 5324/CC... (If this comment was disrespectful, please report it.)
I dream that someone will attempts writing a program to sort s file in VB that faster than Foxpro :) (If this comment was disrespectful, please report it.)
At your earliest convenience - would you please provide me working sample of how to call the sort routine(s). (If this comment was disrespectful, please report it.)
I agree with Nigel... with a working sample of how to call the sort routine(s) the article could be more explicit... you have the option to upload one. (If this comment was disrespectful, please report it.)
'+++++++++++++++++++++++ With cShellSort .SortOrder = eDirection .SortMethod = eCaseSensitive .strShellHyb sA, lb, ub End With '+++++++++++++++++++++++ For i = lb To ub lstDisplay.AddItem sA(i) Next '+++++++++++++++++++++++ Indexed: '+++++++++++++++++++++++ With cShellSort .SortOrder = eDirection .SortMethod = eCaseSensitive curElapse = ProfileStart .strShellHybIndexed sA, idxA, lb, ub End With '+++++++++++++++++++++++ For i = lb To ub lstDisplay.AddItem sA(idxA(i)) Next '+++++++++++++++++++++++ (If this comment was disrespectful, please report it.)
Dim eDirection As Long If chkDescending.Value = 0 Then '0 >> 1 : 1 >> -1 eDirection = 1 Else 'chkDescending.Value = 1 eDirection = -1 End If
Dim eCaseSensitive As Long If chkCaseSensitive.Value = 0 Then '0 >> 1 : 1 >> 0 eCaseSensitive = 1 Else 'chkDescending.Value = 1 eCaseSensitive = 0 End If
Private cShellSort As cStrShell
Private Sub Form_Load() Set cShellSort = New cStrShell End Sub
Private Sub Form_Unload(Cancel As Integer) Set cShellSort = Nothing End Sub (If this comment was disrespectful, please report it.)
Public Enum eSortOrder Descending = -1& Default = 0& Ascending = 1& End Enum
Public Enum eCompareMethod BinaryCompare = &0 TextCompare = &1 End Enum
Dim eDirection As Long If chkDescending.Value = 0 Then eDirection = Ascending Else 'chkDescending.Value = 1 eDirection = Descending End If
Dim eCaseSensitive As Long If chkCaseSensitive.Value = 0 Then eCaseSensitive = TextCompare Else 'chkDescending.Value = 1 eCaseSensitive = BinaryCompare End If (If this comment was disrespectful, please report it.)
Thanks Rohan .. :D .. Dave. (If this comment was disrespectful, please report it.)
Add Your Feedback
Your feedback will be posted below and an email sent to
the author. Please remember that the author was kind enough to
share this with you, so any criticisms must be stated politely, or they
will be deleted. (For feedback not related to this particular article, please
click here instead.)