SOUNDEX etc. | General | Forum

SOUNDEX etc.
April 12, 2013
1:48 pm
kb

SOUNDEX was invented as a compact indexing code and retrieval tool for name records, tolerating "dirty" data by phonetic approximation. The underlying principle is to exploit fundamental limitations in the variety of vocal tract configurations, hence coding for necessarily similar-sounding formants. In US1261167, Russell distinguished: oral resonants; labials and labio-dentals; gutturals and sibilants; dental mutes; palatal fricative; labio-nasal; dento- or lingua-nasal; and dental fricative. Notionally, that efficiently encodes two formants per byte, and compression by combination further improved matters for his purpose and scope. His purpose, though, was to better manage card index files, with his scope limited to pronunciation in (American) English and necessarily the sounds of the formants therein. Later developments addressed other languages and better optimization for electronic databases, but without significant change in purpose or much in scope.

If we return to the underlying principles, it would seem that they might be usefully applied in the domain of Onomastics. All humans have the same vocal tracts, and the only linguistic area not covered appears to be the click and yodelling languages. Where Russell grouped formants like those for "B" and "P", "T" and "D", for similarity of formation, name drift can be fairly trivially shewn to be common. Have SOUNDEX-like approaches been applied in this field?
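
To make the grouping of "B" with "P" and "T" with "D" concrete, here is a minimal Python sketch. It assumes the later, widely used "American Soundex" letter classes rather than the exact table in Russell's patent, and it only handles unaccented ASCII letters. The names `CODES` and `soundex` are chosen for illustration.

```python
# Letter classes of the later "American Soundex" variant (an assumption:
# this is not Russell's original patent table). Letters within one class
# are formed in a similar way and so receive the same digit.
CODES = {}
for letters, digit in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return a 4-character code, or "" if the name has no ASCII letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]                    # the first letter is kept literally
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")        # vowels, H, W and Y have no digit
        if digit and digit != prev:      # collapse runs of the same class
            code += digit
        if ch not in "HW":               # H and W do not break up a run
            prev = digit
    return (code + "000")[:4]            # pad or truncate to four characters
```

With this sketch, "Robert" and "Rupert" both encode as R163, and "Hatfield" and "Hadfield" both encode as H314. Because the first letter is kept as-is, a B/P or T/D change in initial position is not merged.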

April 24, 2013
4:23 pm
Keith

Yes, I have used Soundex to identify pairs of place-name forms that are potentially variant spellings of the same name. A Soundex implementation in Python is available at http://code.activestate.com/re.....algorithm/, so it is easy to integrate with other software. It would be interesting to develop such methods further to cope with, e.g., medieval spelling systems.
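
As a rough illustration of that kind of pairing (not the linked recipe, and not necessarily the workflow described above), the sketch below buckets spellings by code and lists every pair that shares a bucket. The encoder is a compact restatement of simplified American Soundex so the block stands alone. The function name `candidate_pairs` and the sample spellings are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Simplified American Soundex classes (assumed variant), digit per letter.
_CLASSES = {c: str(d)
            for d, group in enumerate(
                ["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1)
            for c in group}


def soundex(name):
    """Return a 4-character code, or "" if the name has no ASCII letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code, prev = letters[0], _CLASSES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CLASSES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":
            prev = digit
    return (code + "000")[:4]


def candidate_pairs(forms):
    """Return sorted pairs of distinct spellings that share a Soundex code."""
    buckets = defaultdict(set)
    for form in forms:
        buckets[soundex(form)].add(form)
    pairs = []
    for code, group in sorted(buckets.items()):
        if code:  # skip forms that produced no code at all
            pairs.extend(combinations(sorted(group), 2))
    return pairs


# Invented spellings, for illustration only.
forms = ["Hatfield", "Hadfeld", "Hetfelde", "Bedford", "Bedeford", "Luton"]
pairs = candidate_pairs(forms)
```

The pairs are only candidates: a shared code suggests, but does not establish, that two forms are variants of one name, and forms whose initial letters differ are never paired by this approach.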

Keith
