Important alert: (current site time 7/16/2013 2:18:58 AM EDT)
 

article

A ShellSort with a twist - Revised and even faster! (7 March 2009)

Email
Submitted on: 3/11/2009 10:59:34 AM
By: Rde 
Level: Intermediate
User Rating: By 21 Users
Compatibility: VB 4.0 (32-bit), VB 5.0, VB 6.0
Views: 14461
author picture
 
     This shell algorithm is founded on a very solid performer that was originally developed by vb2themax, and optimized further by several very talented coders, before I twisted a little more out of it in this hybrid version. ---- This latest version employs a clever SAFEARRAY substitution technique to trick VB into thinking the four-byte string pointers in the string array are just VB longs in a native VB long array, and is now much faster. ---- Also uses valid addition and subtraction of unsigned long integers to guarantee safe arithmetic operations on memory address pointers. ---- Includes both Shell and ShellHyb algorithms with indexed versions in 25kb class and demo project. ---- Update 21 June 2008 inspired by comments from Roger Gilchrist. Improved indexed sorting support routine to better exploit fast re-sorting performance, and added much needed index sorting documentation. ---- Obscure Bug Fix 7 March 09. I documented they 'can sort sub-sets of the array data' but with the indexed versions if you do an error *could* occur without this very small change.

This article has accompanying files

 
 
Terms of Agreement:   
By using this article, you agree to the following terms...   
  1. You may use this article in your own programs (and may compile it into a program and distribute it in compiled format for languages that allow it) freely and with no charge.
  2. You MAY NOT redistribute this article (for example to a web site) without written permission from the original author. Failure to do so is a violation of copyright laws.   
  3. You may link to this article from another website, but ONLY if it is not wrapped in a frame. 
  4. You will abide by any additional copyright restrictions which the author may have placed in the article or article's description.
				

Shell Hybrid


This shell algorithm is founded on a very solid performer that was originally developed by vb2themax, and optimized further by several very talented coders, before I twisted a little more out of it in this hybrid version. This shellsort is unbelievably optimized and is a solid performer on all data with no known weaknesses.

Although ShellHyb is optimized for re-sorting and reverse sorting, it also has been greatly boosted in raw outright speed to square up against anything, anytime!

This is the fastest shellsort** I know of. This algorithm excels on both un-sorted and pre-sorted data.

**A hybrid shellsort with a twist, and no bubbles.

Shell algorithms very smartly reduce large sized arrays down to many small semi-sorted chunks, but then spend most of their short working time reducing these chunks down to ordered pairs to finish like a bubblesort.

As these chunks get smaller the code makes less and less actual changes, and the bubble finishing run therefore makes very few changes to the almost sorted array.

This means a shell algorithm is very good at pre-sorting but falls behind the fastest quicksort in outright speed because of its dependance on a slow bubble finish.

This hybrid shell addresses this issue by replacing the bubble with a built in algorithm that is optimized for pre-sorted data.

Revised Version

The latest version of this algorithm employs a SAFEARRAY substitution technique to trick VB into thinking the four-byte string pointers in the string array are just VB longs in a native VB long array.

The technique simply uses CopyMemory to point a VB long array (defined in the class) at the first of the string pointers in memory, and sets its lower-bound and item count to match (as if it had been redimmed).

This allows us to treat the string pointers as if they were simply four-byte long values in a long array and can be swapped around as needed without touching the actual strings that are pointed to.

Reading and assigning to a VB long array is lightning fast, and proves to be considerably faster when copying only one item than the previous method of copying the string pointers using CopyMemory.

Indexed Sort

Included are indexed versions which receive a dynamic long array to hold references to the string array indices which is known as an indexed sort. No changes are made to the source string array.

After a sort procedure is run the long array is ready as a sorted index (lookup table) to the string array items, so strA(idxA(lo)) returns the lo item in the string array whose index may be anywhere in the string array.

Usage: The index array can be redimmed to match the source string array boundaries or it can be erased or left uninitialized before sorting a string array for the first time. However, if you modify string items and re-sort you should not redim or erase the index array to take advantage of the fast refresh sorting performance. This also allows the index array to be passed on to other sorting processes to be further manipulated.

Even when using redim with the preserve keyword and adding more items to the string array you can pass the index array unchanged and the new items will be sorted into the previously sorted array. The index array will automatically return with boundaries matching the string array boundaries.

Only when you reload the string array items with new array boundaries should you erase the index array for the first sorting operation. Also, if you redim the source string array to smaller boundaries you should erase the index array before sorting the new smaller data set for the first time.

Unsigned Longs

This new version also uses a function to enable valid addition and subtraction of unsigned long integers. This guarantees safe arithmetic operations on memory address pointers.

Features

This algorithm has the following features:

- It can handle sorting arrays of millions of string items.
- It can handle sorting in ascending and descending order.
- It can handle case-sensitive and case-insensitive criteria.
- It can handle zero or higher based arrays.
- It can handle negative lb and positive ub.
- It can handle negative lb and zero or negative ub.
- It can sort sub-sets of the array data.

Happy coding :)

...

winzip iconDownload article

Note: Due to the size or complexity of this submission, the author has submitted it as a .zip file to shorten your download time. Afterdownloading it, you will need a program like Winzip to decompress it.Virus note:All files are scanned once-a-day by Planet Source Code for viruses, but new viruses come out every day, so no prevention program can catch 100% of them. For your own safety, please:
  1. Re-scan downloaded files using your personal virus checker before using it.
  2. NEVER, EVER run compiled files (.exe's, .ocx's, .dll's etc.)--only run source code.
  3. Scan the source code with Minnow's Project Scanner

If you don't have a virus scanner, you can get one at many places on the net including:McAfee.com

 
Terms of Agreement:   
By using this article, you agree to the following terms...   
  1. You may use this article in your own programs (and may compile it into a program and distribute it in compiled format for languages that allow it) freely and with no charge.
  2. You MAY NOT redistribute this article (for example to a web site) without written permission from the original author. Failure to do so is a violation of copyright laws.   
  3. You may link to this article from another website, but ONLY if it is not wrapped in a frame. 
  4. You will abide by any additional copyright restrictions which the author may have placed in the article or article's description.


Other 52 submission(s) by this author

 


Report Bad Submission
Use this form to tell us if this entry should be deleted (i.e contains no code, is a virus, etc.).
This submission should be removed because:

Your Vote

What do you think of this article (in the Intermediate category)?
(The article with your highest vote will win this month's coding contest!)
Excellent  Good  Average  Below Average  Poor (See voting log ...)
 

Other User Comments

6/8/2008 6:50:35 PMRussell Sanders

It seems you have put a lot of effort into this. Good job, and thanks. *****
(If this comment was disrespectful, please report it.)

 
6/9/2008 5:23:31 AMPaul Turcksin

A keeper. Thanks for sharing.
(If this comment was disrespectful, please report it.)

 
6/9/2008 8:24:25 AMRde


Acknowledgement must go to LukeH (Selftaught) and Tai Chi Minh Ralph Eastwood for sharing the SAFEARRAY substitution technique used in this latest version.

Luke introduced me to this trick with his Replace function that is the fastest in existance!

Ralphs string sorting routines are also the fastest around, and his integer/unicode technique is both clever and elegant.

This submission is the result of the sharing of code for the benifit of others.

Thanks Paul and Russell for your generous votes.
(If this comment was disrespectful, please report it.)

 
6/9/2008 1:41:14 PMJames Tracy

GREAT JOB!! Well documented and F-A-S-T! This one is a keeper. I'm adding this to my sort collection. Thanks for sharing - we all benefit from your great work!

Sincerely,
James Tracy
JamesTracy95820@gmail.com
Sacramento, California
USA

(If this comment was disrespectful, please report it.)

 
6/11/2008 9:50:29 AMRde

Thanks fella's for you kind words. I'm glad you like it and you're welcome.

Well James, thanks for the very humbling comment but I could not claim the credit for this as it's strength lies squarely on the shell algo developed and optimized by others before I added my little bit.

It is amazing that this shell algo is such a solid performer with such little amount of code, a testament to the coders who have lovingly optimized it to within an inch of its life :)

Even my additions are the result of generous code collaboration.

Happy coding,
Rd :)
(If this comment was disrespectful, please report it.)

 
6/23/2008 11:03:17 AMNoName

How can i sort starting from a position within the strings?
example:
=====V start sort from here to right.ty
3234/AA...
4323/BB...
5324/CC...
(If this comment was disrespectful, please report it.)

 
6/26/2008 9:15:21 AMRde

Hi O Nameless One

I think this is your answer:

StrComp(vba.mid$(sa(),5),vba.mid$(sa(),5))=
(If this comment was disrespectful, please report it.)

 
6/26/2008 9:24:09 AMRde

StrComp(vba.mid$(sa(),6),vba.mid$(sa(),6))=
(If this comment was disrespectful, please report it.)

 
6/29/2008 5:41:53 AMRde

StrComp(vba.mid$(sItem,6),vba.mid$(sa(),6))=
(If this comment was disrespectful, please report it.)

 
9/18/2008 1:25:53 PMAchmad Junus

I dream that someone will attempts writing a program to sort s file in VB that faster than Foxpro :)
(If this comment was disrespectful, please report it.)

 
11/22/2008 2:17:14 AMRde

Thanks everyone for your generous votes to make this the June 08 contest winner!

Hope you all find it useful,
and happy coding from Rd :)


(If this comment was disrespectful, please report it.)

 
1/3/2010 12:28:30 AMNigel

At your earliest convenience - would you please provide me working sample of how to call the sort routine(s).
(If this comment was disrespectful, please report it.)

 
6/2/2010 3:30:20 PMSalvador

I agree with Nigel... with a working sample of how to call the sort routine(s) the article could be more explicit... you have the option to upload one.
(If this comment was disrespectful, please report it.)

 
6/6/2010 4:16:40 AMRde

Hi Nigel and Salvador

This submission has always had a demo project which shows the
usage of the sorting routines?

Feel free to let me know if you have any problems using these sorting routines.

Happy coding,
Rd :)
(If this comment was disrespectful, please report it.)

 
6/6/2010 4:28:14 AMRde

'+++++++++++++++++++++++
With cShellSort
.SortOrder = eDirection
.SortMethod = eCaseSensitive
.strShellHyb sA, lb, ub
End With
'+++++++++++++++++++++++
For i = lb To ub
lstDisplay.AddItem sA(i)
Next
'+++++++++++++++++++++++
Indexed:
'+++++++++++++++++++++++
With cShellSort
.SortOrder = eDirection
.SortMethod = eCaseSensitive
curElapse = ProfileStart
.strShellHybIndexed sA, idxA, lb, ub
End With
'+++++++++++++++++++++++
For i = lb To ub
lstDisplay.AddItem sA(idxA(i))
Next
'+++++++++++++++++++++++
(If this comment was disrespectful, please report it.)

 
6/9/2010 3:22:38 PMSalvador


Excelent code... thanks to the examples you posted I could try it, and it is very fast.

The only doubt I have now is what values I have to use for:
.SortOrder = eDirection
.SortMethod = eCaseSensitive
if you can, please orient me.

Thank you

(If this comment was disrespectful, please report it.)

 
6/11/2010 7:21:00 AMRde

Dim eDirection As Long
If chkDescending.Value = 0 Then '0 >> 1 : 1 >> -1
eDirection = 1
Else 'chkDescending.Value = 1
eDirection = -1
End If

Dim eCaseSensitive As Long
If chkCaseSensitive.Value = 0 Then '0 >> 1 : 1 >> 0
eCaseSensitive = 1
Else 'chkDescending.Value = 1
eCaseSensitive = 0
End If

Private cShellSort As cStrShell

Private Sub Form_Load()
Set cShellSort = New cStrShell
End Sub

Private Sub Form_Unload(Cancel As Integer)
Set cShellSort = Nothing
End Sub
(If this comment was disrespectful, please report it.)

 
6/12/2010 3:36:51 AMRde

Public Enum eSortOrder
Descending = -1&
Default = 0&
Ascending = 1&
End Enum

Public Enum eCompareMethod
BinaryCompare = &0
TextCompare = &1
End Enum

Dim eDirection As Long
If chkDescending.Value = 0 Then
eDirection = Ascending
Else 'chkDescending.Value = 1
eDirection = Descending
End If

Dim eCaseSensitive As Long
If chkCaseSensitive.Value = 0 Then
eCaseSensitive = TextCompare
Else 'chkDescending.Value = 1
eCaseSensitive = BinaryCompare
End If
(If this comment was disrespectful, please report it.)

 
10/23/2010 2:42:36 AMDave Carter

Thanks Rohan .. :D .. Dave.
(If this comment was disrespectful, please report it.)

 

Add Your Feedback
Your feedback will be posted below and an email sent to the author. Please remember that the author was kind enough to share this with you, so any criticisms must be stated politely, or they will be deleted. (For feedback not related to this particular article, please click here instead.)
 

To post feedback, first please login.