Peacock Data introduced the next generation of their fuzzy logic technology with last month’s release of the California-based firm’s pdSurname Pro last name matching software.
Accordion to company spokesperson Barbara Adair, “pdSurname facilitates identifying last names that are true variations or phonetically similar, while the fuzzy logic technology in the enhanced Pro edition allows finding names even when there are misspellings or other typographical errors.”
“We introduced fuzzy logic with our pdNickname Pro and pdGender Pro software in late 2013, but the new fuzzy logic generation 2.0 is a great enhancement,” Barbara Adair exclaimed.
According to the company, most of the enhancements were achieved after they developed a giant library of more than 80,000 language rules based on hundreds of dialects from around the world. Barbara Adair said, “Many misspellings occur as transcribers enter the sounds they hear. The character sequences and the sounds they produce are different for each language and situation, such as before, after, or between certain vowels and consonants, so our substitutions are language-rule based.”
The company explained additionally that their algorithms go even further by considering both how a name may sound to someone who speaks English as well as how it may sound to someone who speaks Spanish, which is often different. Barbara Adair explained, “Take the letter-pair ‘SC’ as an example. Before the vowels ‘E’ or ‘I’ it is most likely to be misspelled by an English speaker as ‘SHE’ or ‘SHI’ while a Spanish speaker may hear ‘CHE’ or ‘CHI’ and sometimes ‘YE’ or ‘YI’.”
Company literature indicates the new fuzzy logic generation 2.0 technology has five layers:
1. Phonetic misspellings: such as GUALTIERREZ misspelled as GUALTIEREZ, AAGARD misspelled as OUGHGARD, and YOUNGMAN misspelled as YONGMAN.
2. Reversed letters: such as DIELEMAN misspelled as DEILEMAN and RODREGUEZ misspelled as RODREUGEZ. These algorithms look for errors due to reversed digraphs (two letter sequences that form one phoneme or distinct sound) which are a common typographical issue, such as “IE” substituted for “EI”.
3. Double letter misspellings: such as HUMBER misspelled as HUMBEER and ZWOLLE misspelled as ZWOLE. The most common typographical issues occur with the characters, in order of frequency, “SS”, “EE”, “TT”, “FF”, “LL”, “MM”, and “OO”.
4. Missed keystrokes: such as HUNTER misspelled as UNTER, missing the initial “H”, and TAMERON misspelled as TAMRON, missing the “E” in the middle.
5. Other typographical errors: which cover a variety of additional misspelling issues.
The pdSurname Pro software with the new fuzzy logic generation 2.0 technology is available for immediate download and can currently be purchased at a 25 percent introductory discount (sale, $371.25; regular, $495) or as part of bundles also on sale, pdSuite Names (sale, $645; regular $795) and pdSuite Master Collection (sale, $795; regular, $995).
For users of other Peacock Data name software, Barbara Adair noted, “pdNickname Pro and pdGender Pro will be updated with fuzzy logic generation 2.0 capabilities this fall, and the upgrades will be free for anyone owning the older version.”