SOUNDEX etc. | General | Forum

SOUNDEX etc.
April 12, 2013, 1:48 pm
kb
Member
Members
Forum Posts: 6
Member Since: March 12, 2013

SOUNDEX was invented as a compact indexing code and retrieval tool for name records; its phonetic approximation makes it tolerant of "dirty" data. The underlying principle is to exploit the fundamentally limited variety of vocal tract configurations, and hence to code together formants that necessarily sound similar. In US1261167, Russell distinguished: oral resonants; labials and labio-dentals; gutturals and sibilants; dental mutes; palatal fricative; labio-nasal; dento or lingua-nasal; and dental fricative. Notionally, that efficiently encodes two formants per byte (eight classes need only three bits each), and compression by combination further improved matters for his purpose and scope. His purpose, though, was to better manage card index files, and his scope was limited to pronunciation in (American) English and hence to the sounds of the formants therein. Later developments addressed other languages and better optimization for electronic databases, but without significant change in purpose or much in scope.
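For readers who have not met the code itself, here is a minimal sketch of the idea in Python. It assumes the later, widely used American Soundex letter classes (six consonant classes, vowels dropped, first letter kept), not Russell's exact eight-class scheme from the patent; the name `soundex` and the table `_CODES` are just illustrative.

```python
# Letter classes of the common American Soundex variant (an assumption here;
# Russell's patent distinguishes eight classes, including the vowels).
_CLASSES = {
    "1": "BFPV",      # labials and labio-dentals
    "2": "CGJKQSXZ",  # gutturals and sibilants
    "3": "DT",        # dental mutes
    "4": "L",
    "5": "MN",        # the two nasals, merged in this variant
    "6": "R",
}
_CODES = {ch: digit for digit, letters in _CLASSES.items() for ch in letters}


def soundex(name: str) -> str:
    """Return a four-character code: first letter plus up to three digits."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters of the same class
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # a vowel resets prev, so a class may repeat across it
    return (code + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))  # both R163
```

Note that the first letter is kept verbatim in this variant, so "B" and "P" only fall together when they occur after the initial.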

If we return to the underlying principles, it would seem that they might be usefully applied in the domain of Onomastics. All humans have the same vocal tracts, and the only linguistic area not covered appears to be the click and yodelling languages. Where Russell grouped formants like those for "B" and "P", "T" and "D", for similarity of formation, name drift can be fairly trivially shewn to be common. Have SOUNDEX-like approaches been applied in this field?

April 24, 2013, 4:23 pm
Keith
New Member
Members
Forum Posts: 1
Member Since: November 6, 2012

Yes, I have used Soundex to identify pairs of place-name forms that are potentially variant spellings of the same name. A Python implementation of Soundex is available at http://code.activestate.com/re.....algorithm/, so it is easy to integrate with other software. It would be interesting to develop such methods further to cope with, e.g., medieval spelling systems.

Keith
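To illustrate the kind of pairing described in the reply above, here is a rough sketch in Python. It is not the ActiveState recipe or Keith's actual workflow: the compact `soundex` function assumes the common American Soundex rules, `candidate_pairs` is a hypothetical helper, and the place names are illustrative inputs only.

```python
from collections import defaultdict
from itertools import combinations

# Assumed letter classes (common American Soundex variant).
_CODES = {ch: d for d, letters in {
    "1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R",
}.items() for ch in letters}


def soundex(name):
    """First letter plus up to three class digits, padded with zeros."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code, prev = letters[0], _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W are transparent
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit
    return (code + "000")[:4]


def candidate_pairs(forms):
    """Return pairs of name forms that share a Soundex code."""
    groups = defaultdict(list)
    for form in forms:
        groups[soundex(form)].append(form)
    pairs = []
    for members in groups.values():
        pairs.extend(combinations(members, 2))
    return pairs


# Illustrative inputs, not data from the thread.
forms = ["Aldborough", "Aldeburgh", "Bristol", "Brighton", "Carlisle"]
for a, b in candidate_pairs(forms):
    print(a, b, soundex(a))
```

With these inputs "Bristol" and "Brighton" are paired as well (both code to B623), which shows why such matches are only candidates for variant spellings and still need checking by hand.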

Forum Timezone: Europe/London
