SOUNDEX etc. | General | Forum

SOUNDEX etc.
April 12, 2013
1:48 pm
kb

SOUNDEX was invented as a compact indexing code and retrieval tool for name records, with tolerance for "dirty" data through phonetic approximation. The underlying principle is to exploit fundamental limitations in the variety of vocal tract configurations, hence coding for necessarily similar sounding formants. In US1261167, Russell distinguished: oral resonants; labials and labio-dentals; gutturals and sibilants; dental mutes; palatal fricative; labio-nasal; dento or lingua-nasal; and dental fricative. Notionally, that efficiently encodes two formants per byte, and compression by combination further improved matters for his purpose and scope. His purpose, though, was to better manage card index files, with his scope limited to pronunciation in (American) English and necessarily the sounds of the formants therein. Later developments addressed other languages and better optimization for electronic databases, but without significant change in purpose or much in scope.
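To make the coding idea concrete, here is a minimal sketch of the later "American Soundex" variant, which uses six consonant classes rather than the classes listed in Russell's patent. The function name and the handling of non-ASCII letters (they are simply dropped) are assumptions made for illustration, not part of any standard library.

```python
# Consonant classes of American Soundex (a later variant, not Russell's
# original patent classes). Vowels, Y, H and W carry no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return a 4-character code: first letter plus up to three digits."""
    # Assumption: keep only A-Z; accented letters are dropped, not folded.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    out = [letters[0]]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate two letters of the same class
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated class is coded again
    return "".join(out)[:4].ljust(4, "0")


print(soundex("Robert"), soundex("Rupert"))
```

Because "B" and "P" share a class, as do "T" and "D", the two names in the last line receive the same code.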

If we return to the underlying principles, it would seem that they might be usefully applied in the domain of Onomastics. All humans have the same vocal tracts, and the only linguistic area not covered appears to be the click and yodelling languages. Where Russell grouped formants like those for "B" and "P", "T" and "D", for similarity of formation, name drift can be fairly trivially shown to be common. Have SOUNDEX-like approaches been applied in this field?

April 24, 2013
4:23 pm
Keith

Yes, I have used Soundex to identify pairs of place-name forms which are potentially variant spellings of the same name. An implementation of Soundex in Python is available at http://code.activestate.com/re…..algorithm/, so it is easy to integrate with other software. It would be interesting to develop such methods further to cope with e.g. medieval spelling systems.
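As a rough sketch of how such pairing might work (this is not the linked recipe, and not necessarily the procedure I used): forms are bucketed by a phonetic key, and every pair inside a bucket is reported as a candidate for manual checking. The compact American Soundex key, the function names, and the sample spellings are all assumptions for illustration; the spellings are invented, not attested forms.

```python
from collections import defaultdict
from itertools import combinations

# American Soundex classes as a translation table (B,F,P,V -> 1, and so on).
_TABLE = str.maketrans("BFPVCGJKQSXZDTLMNR", "111122222222334556")


def soundex(name):
    """Compact American Soundex key; non A-Z characters are ignored."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    out, prev = letters[0], letters[0].translate(_TABLE)
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not break a run of the same class
        code = ch.translate(_TABLE)  # uncoded letters stay as letters
        if code.isdigit() and code != prev:
            out += code
        prev = code
    return out[:4].ljust(4, "0")


def candidate_pairs(names, key=soundex):
    """Return sorted pairs of names that share the same phonetic key."""
    buckets = defaultdict(list)
    for name in names:
        buckets[key(name)].append(name)
    pairs = []
    for _, group in sorted(buckets.items()):
        pairs.extend(combinations(sorted(group), 2))
    return pairs


# Invented spellings, for illustration only.
forms = ["Shrewsbury", "Salisbury", "Shrowsbury", "Worcester",
         "Salesbury", "Wurcester", "Ludlow"]
for a, b in candidate_pairs(forms):
    print(a, "~", b)
```

The `key` parameter is left open so that a coding better suited to, say, medieval spelling conventions could be swapped in without touching the pairing step.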

Keith
