
Create string in VB more than 200 times faster! See code!

Submitted on: 1/30/2015 4:42:00 AM
By: jbay101 (from psc cd)  
Level: Advanced
User Rating: By 14 Users
Compatibility: VB 5.0, VB 6.0
Views: 2117
 
This article shows how to use the OLE Automation DLL to create a string without relying on VB to do it. When VB creates a string, it automatically fills it with data (which takes a great deal of time when dealing with large strings). This method bypasses VB and creates the string directly, without filling it with data. The result? 200 times faster in the included benchmark! I have updated the tutorial to include Example 1 re-written using the faster method. The code includes a benchmark and the function described below. Please vote or leave a comment!

This article has accompanying files

 
				







Fast string in Visual Basic, version 2

Visual Basic vs. C++

Visual Basic stores its strings in a type referred to in C++ as a BSTR. This type is quite different from a C string (char *). A C string is stored as an array of bytes that ends at the first null byte (character 0x0). A BSTR instead carries a 4-byte length prefix in front of its data, so it can contain embedded nulls (a terminating null is still appended after the data), and VB stores its characters as 2-byte Unicode. Unlike C or C++, when you create a string in VB it is automatically filled with data.
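To make the BSTR header a little more concrete, here is a minimal sketch that reads the 4-byte length prefix stored just in front of a string's character data. The helper name BstrByteLength is hypothetical (it is not part of the download), and the sketch assumes 32-bit VB 5/6, where pointers fit in a Long.

```vb
Private Declare Sub RtlMoveMemory Lib "kernel32" (dst As Any, src As Any, ByVal nBytes As Long)

'Hypothetical helper: returns the byte count stored in the BSTR header
Public Function BstrByteLength(sText As String) As Long
    Dim lData As Long      'address of the first character
    Dim lHeader As Long    'address of the 4-byte length prefix
    Dim lBytes As Long     'value read from the prefix
    lData = StrPtr(sText)
    If lData <> 0 Then     'a null string (vbNullString) has no buffer at all
        lHeader = lData - 4
        RtlMoveMemory lBytes, ByVal lHeader, 4&
    End If
    BstrByteLength = lBytes
End Function
```

Calling Debug.Print BstrByteLength("Hello") prints 10, the same value as LenB("Hello"): five characters at two bytes each, with the terminating null not counted.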

The SLOW way  - Visual Basic's String creation
When you dynamically create a string in Visual Basic, there are only two methods that VB supports. These are:
1. Using the String function
    Example:
    Dim strData As String    'our string variable
    Open "test.bin" For Binary Access Read As #1    'open a file
    strData = String(LOF(1), 0)    'create a buffer
    Get #1, , strData    'read data into the buffer
    Close #1    'close the file
    The String function takes two parameters, the length of the string and the character to fill the string with.

2. Using the Space function
    This is much like using the String function, except it automatically fills the string with spaces.

Now, for the example above all we want is an empty storage space to fill with data. But VB doesn't do this. In both instances, VB fills the string with data, which can take a lot of time. This is where the API optimization comes into play.

 

The FAST way - the OLE Automation library
The OLE Automation library provides support not only for the BSTR type but also for all variable-related operations. To increase the speed of string creation, we want to tell the OLE Automation library to create a region of memory that we can access, without filling it with data. To do this we will use two functions: RtlMoveMemory in the Windows kernel (kernel32) and SysAllocStringByteLen in the OLE Automation library (oleaut32). The declarations are below.

Declare Sub RtlMoveMemory Lib "kernel32" (dst As Any, src As Any, ByVal nBytes&)
Declare Function SysAllocStringByteLen& Lib "oleaut32" (ByVal olestr&, ByVal BLen&)

The RtlMoveMemory function copies nBytes bytes from the src address to the dst address. SysAllocStringByteLen allocates BLen bytes of storage for a BSTR (in this case a Visual Basic String) and returns its address; when olestr is 0, the contents are left uninitialized. Because VB strings use two bytes per character, we ask for lSize + lSize bytes to hold lSize characters. In reality, a Visual Basic String variable is nothing more than a pointer, or a reference to an address in memory that is used to store the data. With this in mind, we can create our own string allocation function, as shown below.

Public Function AllocString(ByVal lSize As Long) As String
RtlMoveMemory ByVal VarPtr(AllocString), SysAllocStringByteLen(0&, lSize + lSize), 4&
End Function

This may look a bit complicated at first but it is really relatively simple. The function allocates the space and then copies the 4 byte pointer from this space to the string returned by the function. If we were to expand the function a little it would look like this:

Public Function AllocString(ByVal lSize As Long) As String
Dim lPtr As Long    'the address of the allocated memory
Dim lRetPtr As Long    'the pointer to the return variable
Dim sBuffer As String    'the variable to return
lRetPtr = VarPtr(sBuffer)    'the pointer to the string buffer
lPtr = SysAllocStringByteLen(0&, lSize + lSize)    'allocate the memory and get its pointer
RtlMoveMemory ByVal lRetPtr, lPtr, 4&    'copy the pointer address
AllocString = sBuffer    'return the string (this assignment copies the buffer, which the one-line version avoids)
End Function

As someone highlighted in the previous tutorial, when a value is returned it is duplicated and added to the stack. When the function ends, this value is popped off the stack and returned to the assigned variable. However, this is where some more knowledge of how VB works is required. Visual Basic is not duplicating the data. All that Visual Basic is doing is duplicating the pointer to the data. Why move 30 MB when you can move 4 bytes? Still, returning a value does take time, and if you are looking for a few more milliseconds you could try making the call inline (removing the function altogether). For example, if your string is called strBuffer and is still empty, you could use the code below.

Dim strBuffer As String
RtlMoveMemory ByVal VarPtr(strBuffer), SysAllocStringByteLen(0&, 100 + 100), 4& ' allocate 100 characters (200 bytes)

This method will be slightly faster, but I don't think it's worth the trouble (unless you only need to allocate the data once).

As most of you know, when dealing with the API it is very important to free all the memory you allocate, otherwise you can easily develop memory leaks. But the best part of using the above method is that we don't have to worry about freeing the memory block. When your Visual Basic program ends (or a function/sub containing the relevant variable ends), VB automatically checks each variable and frees the memory associated with it. But this is not a VB variable, you may say? Wrong. This is a normal VB string variable; we have just created it without VB. Any Visual Basic string function will still work on the data. VB just doesn't know how it was allocated, and VB doesn't care. If you really wanted, you could write a small function to delete a string. Just be careful about how you do it. Since we created the variable by just copying the pointer, some people may think that the code below would free the string.

Public Function DeallocString(sString As String)
Dim lPtr As Long    'the address of the string variable
lPtr = VarPtr(sString)    'the pointer to the string variable
RtlMoveMemory ByVal lPtr, 0&, 4&    'overwrite the pointer with nulls (this leaks the buffer!)
End Function

When dealing with other API types (and VB types), erasing the pointer will tell VB or the API that the variable hasn't been initialized. But in this case VB will lose track of all the memory associated with the string. For the enclosed sample, that is 30 MB of RAM!!! The correct way to remove the string is to use VB to do it. The easiest way is to assign "" to it. But if you really MUST write a function, you could try the one below.

Public Function DeallocString(sString As String)
sString = ""
End Function

Sometimes the simplest way is the best!
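To check that a string allocated this way really behaves like any other VB string, a small sketch along these lines can be pasted into an empty standard module and run from the Immediate window. It repeats the declarations so that it stands on its own, and the DemoAllocString name is just an illustration, not part of the download.

```vb
Private Declare Sub RtlMoveMemory Lib "kernel32" (dst As Any, src As Any, ByVal nBytes As Long)
Private Declare Function SysAllocStringByteLen Lib "oleaut32" (ByVal olestr As Long, ByVal BLen As Long) As Long

Private Function AllocString(ByVal lSize As Long) As String
    'allocate lSize characters (2 bytes each) and hand the pointer to the return value
    RtlMoveMemory ByVal VarPtr(AllocString), SysAllocStringByteLen(0&, lSize + lSize), 4&
End Function

Public Sub DemoAllocString()
    Dim strBuffer As String
    strBuffer = AllocString(5)
    Debug.Print Len(strBuffer)         '5 characters
    Debug.Print LenB(strBuffer)        '10 bytes
    Mid$(strBuffer, 1, 5) = "Hello"    'overwrite the uninitialized contents
    Debug.Print strBuffer              'Hello
    strBuffer = ""                     'VB frees the buffer as usual
    Debug.Print Len(strBuffer)         '0
End Sub
```

Until something is written into it, the buffer holds whatever happened to be in memory, so it should only be used as a target for data (as in the file example), never read first.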

Now that we know how to allocate strings the fast way, we can re-write the sample in Example 1.

    Dim strData As String    'our string variable
    Open "test.bin" For Binary Access Read As #1    'open a file
    strData = AllocString(LOF(1))    'create a buffer using our function
    Get #1, , strData    'read data into the buffer
    Close #1    'close the file

It really is not that difficult, and it makes a HUGE speed increase. This article comes with the above function and a benchmark to show the dramatic speed difference.
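For readers without the download, a rough timing comparison could look something like the sketch below. This is not the benchmark shipped with the article; the BenchmarkAlloc name, the string size, and the round count are arbitrary assumptions, and GetTickCount only resolves to roughly 10-16 ms, so treat the numbers as a coarse indication. It is meant for an empty standard module of its own.

```vb
Option Explicit

Private Declare Sub RtlMoveMemory Lib "kernel32" (dst As Any, src As Any, ByVal nBytes As Long)
Private Declare Function SysAllocStringByteLen Lib "oleaut32" (ByVal olestr As Long, ByVal BLen As Long) As Long
Private Declare Function GetTickCount Lib "kernel32" () As Long

Private Function AllocString(ByVal lSize As Long) As String
    RtlMoveMemory ByVal VarPtr(AllocString), SysAllocStringByteLen(0&, lSize + lSize), 4&
End Function

Public Sub BenchmarkAlloc()
    Const CHARS As Long = 1000000    'characters per string (assumed size)
    Const ROUNDS As Long = 100       'allocations per method (assumed count)
    Dim strData As String
    Dim i As Long
    Dim lStart As Long

    'the slow way: VB allocates and fills the string
    lStart = GetTickCount
    For i = 1 To ROUNDS
        strData = String$(CHARS, 0)
    Next
    Debug.Print "String$:     "; GetTickCount - lStart; " ms"

    'the fast way: allocate without filling
    lStart = GetTickCount
    For i = 1 To ROUNDS
        strData = AllocString(CHARS)
    Next
    Debug.Print "AllocString: "; GetTickCount - lStart; " ms"
End Sub
```

The actual ratio will depend on the machine, the string size, and whether the project is compiled or run in the IDE, so it is worth measuring with sizes close to your real data.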

The next tutorial will talk about making string functions (compare, join etc) as fast as C and will show how to make a C string in Visual Basic. Please leave a comment or vote!
