March 12, 2013
SOUNDEX was invented as a compact indexing code and retrieval tool for name records, using phonetic approximation to tolerate "dirty" data. The underlying principle is to exploit fundamental limitations in the variety of vocal tract configurations, hence coding for necessarily similar sounding formants. In US1261167, Russell distinguished: oral resonants; labials and labio-dentals; gutturals and sibilants; dental mutes; palatal fricative; labio-nasal; dento or lingua-nasal; and dental fricative. Notionally, that efficiently encodes two formants per byte, and compression by combination further improved matters for his purpose and scope. His purpose, though, was to better manage card index files, with his scope limited to pronunciation in (American) English and necessarily the sounds of the formants therein. Later developments addressed other languages and better optimization for electronic databases, but without significant change in purpose or much in scope.
If we return to the underlying principles, it would seem that they might be usefully applied in the domain of Onomastics. All humans have the same vocal tracts, and the only linguistic area not covered appears to be the click and yodelling languages. Where Russell grouped formants like those for "B" and "P", "T" and "D", for similarity of formation, name drift can be fairly trivially shewn to be common. Have SOUNDEX-like approaches been applied in this field?
November 6, 2012
Yes, I have used Soundex to identify pairs of place-name forms which are potentially variant spellings of the same name. An implementation of Soundex in Python is available at http://code.activestate.com/re.....algorithm/, so it is easy to integrate with other software. It would be interesting to develop such methods further to cope with e.g. medieval spelling systems.
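For anyone wanting to try the pairing approach, here is a minimal sketch in Python. It is not the ActiveState recipe linked above, and it follows the common American Soundex rules rather than Russell's original patent categories. The function names `soundex` and `candidate_pairs` and the sample spellings are made up for illustration only.

```python
from collections import defaultdict
from itertools import combinations

# Consonant groups of the common American Soundex variant (an assumption:
# Russell's patent groups the sounds somewhat differently).
_GROUPS = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
_CODES = {ch: digit for digit, letters in _GROUPS.items() for ch in letters}


def soundex(name):
    """Return a 4-character Soundex code, or "" if name has no A-Z letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = [letters[0]]                 # the first letter is kept as-is
    prev = _CODES.get(letters[0], "")     # its code still suppresses a repeat
    for c in letters[1:]:
        if c in "HW":
            continue                      # H and W do not break a run of equal codes
        code = _CODES.get(c, "")          # vowels and Y get no code and break a run
        if code and code != prev:
            result.append(code)
        prev = code
    return ("".join(result) + "000")[:4]  # pad with zeros, truncate to 4


def candidate_pairs(forms):
    """Pair up forms that share a Soundex code (hypothetical helper)."""
    by_code = defaultdict(list)
    for form in forms:
        by_code[soundex(form)].append(form)
    pairs = []
    for group in by_code.values():
        pairs.extend(combinations(group, 2))
    return pairs


# Illustrative spellings only, not taken from any real dataset.
print(candidate_pairs(["Bristol", "Brystoll", "Bath", "Bathe"]))
```

Note that this variant keeps the first letter unchanged, so initial drift such as "B" to "P" would not be caught without also coding the first letter. A shared code only flags a pair as a candidate; it still needs checking by hand.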