Public Function LevenshteinDistance( _ ByVal sString1 As String, _ ByVal sString2 As String) As Integer '********************************************************** ' Copyright © Michael Gilleland ' Merriam Park Software ' http://www.merriampark.com/ld.htm '********************************************************** ' Calculate the Levenshtein Distance '********************************************************** ' Levenshtein distance (LD) is a measure of the similarity ' between two strings, which are referred to as the source ' string (s) and the target string (t). The distance is the ' number of deletions, insertions, or substitutions required ' to transform s into t. For example: ' ' If s is "test" and t is "test", then LD(s,t) = 0, because ' no transformations are needed. The strings are already ' identical. ' If s is "test" and t is "tent", then LD(s,t) = 1, because ' one substitution (change "s" to "n") is sufficient to ' transform s into t. ' The greater the Levenshtein distance, the more different ' the strings are. ' ' The Levenshtein distance is named after the Russian scientist ' Vladimir Levenshtein, who devised the algorithm in 1965. If ' you can't spell or pronounce Levenshtein, the metric is also ' sometimes called edit distance. ' ' The Levenshtein distance algorithm has been used in: ' * Spell checking ' * Speech recognition ' * DNA analysis ' * Plagiarism detection '********************************************************** Dim iMatrix() As Integer ' Matrix Dim iLenString2 As Integer ' Length of sString2 Dim iLenString1 As Integer ' Length of sString1 Dim iCtrString1 As Integer ' Counter to iterate through sString1 Dim iCtrString2 As Integer ' Counter to iterate through sString2 Dim String1Char As String ' nth character of sString1 Dim String2Char As String ' nth character of sString2 Dim iCost As Integer ' Cost ' Step 1 ' Construct a matrix containing 0..m rows and 0..n columns iLenString1 = Len(sString1) iLenString2 = Len(sString2) If iLenString1 = 0 Then LevenshteinDistance = iLenString2 Exit Function End If If iLenString2 = 0 Then LevenshteinDistance = iLenString1 Exit Function End If ReDim iMatrix(0 To iLenString1, 0 To iLenString2) As Integer ' Step 2 ' Initialize the first row to 0..n ' Initialize the first column to 0..m For iCtrString1 = 0 To iLenString1 iMatrix(iCtrString1, 0) = iCtrString1 Next iCtrString1 For iCtrString2 = 0 To iLenString2 iMatrix(0, iCtrString2) = iCtrString2 Next iCtrString2 ' Step 3 ' Examine each character of sString1 (from 1 to n). For iCtrString1 = 1 To iLenString1 String1Char = Mid$(sString1, iCtrString1, 1) ' Step 4 ' Examine each character of sString2 (from 1 to m). For iCtrString2 = 1 To iLenString2 String2Char = Mid$(sString2, iCtrString2, 1) ' Step 5 ' If the character in sString1 equals the character in ' sString2, the cost is 0, otherwise the cost is 1. If String1Char = String2Char Then iCost = 0 Else iCost = 1 End If ' Step 6 ' Set the bottom-right matrix cell equal to the minimum of: ' a. The cell immediately above, plus 1. ' b. The cell immediately to the left, plus 1. ' c. The cell diagonally above and to the left, plus the cost. iMatrix(iCtrString1, iCtrString2) = _ Minimum(iMatrix(iCtrString1 - 1, iCtrString2) + 1, _ iMatrix(iCtrString1, iCtrString2 - 1) + 1, _ iMatrix(iCtrString1 - 1, iCtrString2 - 1) + iCost) Next iCtrString2 Next iCtrString1 ' Step 7 ' Return the overall cost (the bottom-right matrix cell). LevenshteinDistance = iMatrix(iLenString1, iLenString2) Erase iMatrix End Function Private Function Minimum(ByVal iVal1 As Integer, _ ByVal iVal2 As Integer, _ ByVal iVal3 As Integer) As Integer '************************************** ' Copyright © Michael Gilleland ' Merriam Park Software ' http://www.merriampark.com/ld.htm '************************************** ' Return the minimum of three values '************************************** Dim iMin As Integer iMin = iVal1 If iVal2 < iMin Then iMin = iVal2 End If If iVal3 < iMin Then iMin = iVal3 End If Minimum = iMin End Function