Caverphone

Definition

The exact algorithm is as follows:

  1. Convert to lowercase
  2. Remove anything that is not a letter a-z (the name is already lowercase at this point)
  3. If the name starts with
    1. cough make it cou2f
    2. rough make it rou2f
    3. tough make it tou2f
    4. enough make it enou2f
    5. gn make it 2n
  4. If the name ends with mb make it m2
  5. Replace
    1. cq with 2q
    2. ci with si
    3. ce with se
    4. cy with sy
    5. tch with 2ch
    6. c with k
    7. q with k
    8. x with k
    9. v with f
    10. dg with 2g
    11. tio with sio
    12. tia with sia
    13. d with t
    14. ph with fh
    15. b with p
    16. sh with s2
    17. z with s
    18. any initial vowel with an A
    19. all other vowels with a 3
    20. 3gh3 with 3kh3
    21. gh with 22
    22. g with k
    23. groups of the letter s with an S
    24. groups of the letter t with a T
    25. groups of the letter p with a P
    26. groups of the letter k with a K
    27. groups of the letter f with an F
    28. groups of the letter m with an M
    29. groups of the letter n with an N
    30. w3 with W3
    31. wy with Wy
    32. wh3 with Wh3
    33. why with Why
    34. w with 2
    35. any initial h with an A
    36. all other occurrences of h with a 2
    37. r3 with R3
    38. ry with Ry
    39. r with 2
    40. l3 with L3
    41. ly with Ly
    42. l with 2
    43. j with y
    44. y3 with Y3
    45. y with 2
  6. Remove all
    1. 2s
    2. 3s
  7. Put six 1s on the end
  8. Take the first six characters as the code
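
The steps above can be sketched in Python as follows. This is a minimal, untested-in-production sketch and not the vb_caverphone code mentioned on this page: the names caverphone, _PREFIXES and _STEP5 are made up for illustration, regular expressions stand in for the plain "replace" steps, and the mb rule is applied only at the end of the name.

```python
import re

# Step 3: prefixes rewritten only at the start of the name.
_PREFIXES = [
    ("cough", "cou2f"), ("rough", "rou2f"), ("tough", "tou2f"),
    ("enough", "enou2f"), ("gn", "2n"),
]

# Step 5: ordered (pattern, replacement) pairs. Order matters, because
# later rules operate on the output of earlier ones.
_STEP5 = [
    ("cq", "2q"), ("ci", "si"), ("ce", "se"), ("cy", "sy"), ("tch", "2ch"),
    ("c", "k"), ("q", "k"), ("x", "k"), ("v", "f"), ("dg", "2g"),
    ("tio", "sio"), ("tia", "sia"), ("d", "t"), ("ph", "fh"), ("b", "p"),
    ("sh", "s2"), ("z", "s"),
    ("^[aeiou]", "A"),   # any initial vowel
    ("[aeiou]", "3"),    # all other vowels
    ("3gh3", "3kh3"), ("gh", "22"), ("g", "k"),
    ("s+", "S"), ("t+", "T"), ("p+", "P"), ("k+", "K"),
    ("f+", "F"), ("m+", "M"), ("n+", "N"),
    ("w3", "W3"), ("wy", "Wy"), ("wh3", "Wh3"), ("why", "Why"), ("w", "2"),
    ("^h", "A"),         # any initial h
    ("h", "2"),          # all other occurrences of h
    ("r3", "R3"), ("ry", "Ry"), ("r", "2"),
    ("l3", "L3"), ("ly", "Ly"), ("l", "2"),
    ("j", "y"), ("y3", "Y3"), ("y", "2"),
]


def caverphone(name: str) -> str:
    """Return the six-character key for a single word or name."""
    # Steps 1-2: lowercase, then drop everything that is not a-z.
    s = re.sub(r"[^a-z]", "", name.lower())

    # Step 3: rewrite a matching prefix (at most one can match).
    for prefix, replacement in _PREFIXES:
        if s.startswith(prefix):
            s = replacement + s[len(prefix):]
            break

    # Step 4: a trailing "mb" becomes "m2".
    if s.endswith("mb"):
        s = s[:-2] + "m2"

    # Step 5: apply the ordered replacements.
    for pattern, replacement in _STEP5:
        s = re.sub(pattern, replacement, s)

    # Step 6: remove the placeholder digits 2 and 3.
    s = s.replace("2", "").replace("3", "")

    # Steps 7-8: pad with six 1s and keep the first six characters.
    return (s + "111111")[:6]


print(caverphone("Lee"))       # L11111
print(caverphone("Thompson"))  # TMPSN1
```

Keeping the rules in an ordered table rather than a long chain of calls makes it easier to compare the code against the numbered list line by line.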

Examples

Lee -> lee
lee -> l33
l33 -> L33
L33 -> L
L -> L111111
L111111 -> L11111

Thompson -> thompson
thompson -> th3mps3n
th3mps3n -> Th3MPS3N
Th3MPS3N -> T23MPS3N
T23MPS3N -> TMPSN
TMPSN -> TMPSN111111
TMPSN111111 -> TMPSN1

Code

Note that the caverphone() function generates a key for a single word or name only. To generate keys for a full name or any other sequence of words, split the input into tokens and pass each token to the function separately.
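
A minimal sketch of that tokenising step, in Python rather than VB. The helper name keys_for_full_name and its key_func parameter are hypothetical; key_func stands for whatever single-word key function is in use. Splitting on whitespace only is an assumption, and how to treat hyphenated names is left open here.

```python
from typing import Callable, List


def keys_for_full_name(full_name: str, key_func: Callable[[str], str]) -> List[str]:
    """Split a full name on whitespace and key each token separately."""
    # str.split() with no arguments also discards leading, trailing
    # and repeated whitespace, so no empty tokens are produced.
    return [key_func(token) for token in full_name.split()]


# Placeholder key function for demonstration only; in practice pass
# the real single-word key function instead.
def placeholder_key(word: str) -> str:
    return word.upper()


print(keys_for_full_name("  Mary   Thompson ", placeholder_key))
# ['MARY', 'THOMPSON']
```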

Visual Basic

--Ltickett 16:00, 24 April 2006 (BST)

I've thrown together a basic caverphone function in VB: vb_caverphone0.1.bas. I haven't had time to fully comment, tidy or test the code, but it will give you/me something to start from. Please note that for the function to work you must also download the latest vb_replace and vb_replace_recur functions.

References and papers
