VB icon

_A Smart String Comparison

Email
Submitted on: 1/7/2015 11:53:00 AM
By: Atul Brad Buono (from psc cd)  
Level: Advanced
User Rating: By 34 Users
Compatibility: VB 3.0, VB 4.0 (16-bit), VB 4.0 (32-bit), VB 5.0, VB 6.0, VB Script, ASP (Active Server Pages)
Views: 3784
 
     This takes 2 strings and returns the percent alike that they are. (i.e. "test string number 1" is 86.48% similar to "teststring numb 2") This function is very useful! You can use it in databases to match data that may have errors in it. Examples being people's names, company names, addresses, or anything else where you may encounter misspellings or inconsistencies in the data. Your feedback and/or votes are greatly appreciated! -- NEW - updated to use byte arrays instead of strings, 50-300% performance improvement! An implementation of the , Ratcliff/Obershelp/Levenshtein method.
 
code:
Can't Copy and Paste this?
Click here for a copy-and-paste friendly version of this code!
				
'**************************************
' Name: _A Smart String Comparison
' Description:This takes 2 strings and returns the percent alike that they are. (i.e. "test string number 1" is 86.48% similar to "teststring numb 2")
This function is very useful! You can use it in databases to match data that may have errors in it. Examples being people's names, company names, addresses, or anything else where you may encounter misspellings or inconsistencies in the data. Your feedback and/or votes are greatly appreciated! -- NEW - updated to use byte arrays instead of strings, 50-300% performance improvement!
An implementation of the , Ratcliff/Obershelp/Levenshtein method.
' By: Atul Brad Buono (from psc cd)
'
' Inputs:mainstring and checkstring, the 2 strings to compare
'
' Returns:how similar the 2 strings are (percent, as in .8)
'
' Assumes:This code recursively loops through the 2 strings, finding the largest common substring, then checking the remainder of the string.
'**************************************

Private b1() As Byte
Private b2() As Byte
Public Function Simil(String1 As String, String2 As String) As Double
 Dim l1 As Long
 Dim l2 As Long
 Dim l As Long
 Dim r As Double
 If UCase(String1) = UCase(String2) Then
r = 1
 Else
l1 = Len(String1)
l2 = Len(String2)
If l1 = 0 Or l2 = 0 Then
 r = 0
Else
 ReDim b1(1 To l1): ReDim b2(1 To l2)
 For l = 1 To l1
b1(l) = Asc(UCase(Mid(String1, l, 1)))
 Next
 For l = 1 To l2
b2(l) = Asc(UCase(Mid(String2, l, 1)))
 Next
 r = SubSim(1, l1, 1, l2) / (l1 + l2) * 2
End If
 End If
 Simil = r
 Erase b1
 Erase b2
End Function
Private Function SubSim(st1 As Long, end1 As Long, st2 As Long, end2 As Long) As Long
 Dim c1 As Long
 Dim c2 As Long
 Dim ns1 As Long
 Dim ns2 As Long
 Dim i As Long
 Dim max As Long
 If st1 > end1 Or st2 > end2 Or st1 <= 0 Or st2 <= 0 Then Exit Function
 For c1 = st1 To end1
For c2 = st2 To end2
 i = 0
 Do Until b1(c1 + i) <> b2(c2 + i)
i = i + 1
If i > max Then
 ns1 = c1
 ns2 = c2
 max = i
End If
If c1 + i > end1 Or c2 + i > end2 Then Exit Do
 Loop
Next
 Next
 max = max + SubSim(ns1 + max, end1, ns2 + max, end2)
 max = max + SubSim(st1, ns1 - 1, st2, ns2 - 1)
 SubSim = max
End Function


Other 2 submission(s) by this author

 


Report Bad Submission
Use this form to tell us if this entry should be deleted (i.e contains no code, is a virus, etc.).
This submission should be removed because:

Your Vote

What do you think of this code (in the Advanced category)?
(The code with your highest vote will win this month's coding contest!)
Excellent  Good  Average  Below Average  Poor (See voting log ...)
 

Other User Comments


 There are no comments on this submission.
 

Add Your Feedback
Your feedback will be posted below and an email sent to the author. Please remember that the author was kind enough to share this with you, so any criticisms must be stated politely, or they will be deleted. (For feedback not related to this particular code, please click here instead.)
 

To post feedback, first please login.