Algorithms

From Dedupe


Approximate String Matching

Unlike a phonetic algorithm, an approximate string matching function normally accepts two strings, str1 and str2, and returns a numeric result.

Note: Passing str1 and str2 to such a function is normally not enough on its own; the numeric result has to be given meaning. We may decide that every function returns a value between 0 and 1 (0 being no match and 1 being a definite match). Consider the following examples, in which the numeric result from the Levenshtein distance function represents the total cost of the edits required to transform str1 into str2:

str1 = Computational
str2 = Computxrbonel
levenshtein distance (str1, str2) = 4

str1 = hello
str2 = tests
levenshtein distance (str1, str2) = 4
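
The two results above can be checked with a short dynamic-programming sketch. This is a minimal illustration, not the implementation used by any particular library; the function name levenshtein and the unit cost for every insertion, deletion and substitution are assumptions made for the example.

```python
def levenshtein(str1: str, str2: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions and substitutions needed to turn str1 into str2.

    Assumption for this sketch: every edit costs 1.
    """
    # previous[j] is the distance between the part of str1 processed
    # so far (before the current character) and the first j characters
    # of str2. Initially str1's part is empty, so the distance is j.
    previous = list(range(len(str2) + 1))
    for i, c1 in enumerate(str1, start=1):
        # Turning the first i characters of str1 into "" takes i deletions.
        current = [i]
        for j, c2 in enumerate(str2, start=1):
            cost = 0 if c1 == c2 else 1
            current.append(min(
                previous[j] + 1,         # delete c1
                current[j - 1] + 1,      # insert c2
                previous[j - 1] + cost,  # substitute c1 with c2, or keep it
            ))
        previous = current
    return previous[-1]


print(levenshtein("Computational", "Computxrbonel"))  # 4
print(levenshtein("hello", "tests"))                  # 4
```

Only two rows of the table are kept at a time, so memory use grows with the length of str2 rather than with the product of both lengths.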

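Both examples produce the same result, yet the first pair is clearly more likely to be a match: 4 edits across 13 characters, against 4 edits across 5. How to interpret the raw result is an area for discussion within each article. One suggestion is to use a normalized Levenshtein distance, where the computed value is divided by the length of the longer of the two strings. This gives about 0.31 for the first example and 0.8 for the second (lower means closer); subtracting the normalized distance from 1 yields a score on the 0 to 1 scale described above.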
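
The normalization suggested here can be sketched as follows. The names normalized_distance and similarity are hypothetical, chosen for this example, and treating two empty strings as a perfect match is an assumption rather than something the suggestion specifies.

```python
def levenshtein(str1: str, str2: str) -> int:
    """Unit-cost Levenshtein distance (illustrative helper)."""
    previous = list(range(len(str2) + 1))
    for i, c1 in enumerate(str1, start=1):
        current = [i]
        for j, c2 in enumerate(str2, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (c1 != c2),  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_distance(str1: str, str2: str) -> float:
    """Distance divided by the length of the longer string (0.0 to 1.0)."""
    longest = max(len(str1), len(str2))
    if longest == 0:
        # Assumption: two empty strings are considered identical.
        return 0.0
    return levenshtein(str1, str2) / longest


def similarity(str1: str, str2: str) -> float:
    """Score where 0 means no match and 1 means a definite match."""
    return 1.0 - normalized_distance(str1, str2)


print(f"{similarity('Computational', 'Computxrbonel'):.2f}")  # 0.69
print(f"{similarity('hello', 'tests'):.2f}")                  # 0.20
```

With this scaling the two examples no longer tie: the first pair scores well above the second, which matches the intuition that it is the more likely match.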

Improving Results

OpnSeason sent me a very interesting paper (Image:IEEESoundexV5.pdf) by David Holmes (david.holmes@ncr.com) and M. Catherine McCabe (mary.catherine.mccabe@home.com).
