prendradjaja/phoneme-frequencies
Versions of local copies:
- cmudict: 0.7b. Retrieved May 28, 2018.
- kilgarriff: Retrieved May 28, 2018.

Sources:
- https://cmloegcmluin.wordpress.com/2012/11/10/relative-frequencies-of-english-phonemes/
- http://www.speech.cs.cmu.edu/cgi-bin/cmudict
- http://www.kilgarriff.co.uk/bnc-readme.html

To do:
. use more than just first pronunciation in cmudict
. <er> is one phoneme etc
. phone or phoneme?
. manual error checking
. transcribe some of the uncorrelateds
. reread cmloegcmluin
x ARPAbet -> IPA

Changelog:
- v0.3.0: Various
  - Add local copies of source data and results
  - Default to local copies
  - Move all data files into subdirectories
  - Add MIT license
  - Refactor: Move file paths out of Python and into Makefile
- v0.2.0: Translate ARPAbet to IPA
- v0.1.0: First steps; Frequencies generally and post-/w/
  - Processing:
    - Use only the first pronunciation in cmudict
    - Discard uncorrelateds entirely
    - No manual error checking etc
  - Results:
    - Q1: Frequencies of phonemes generally
    - Q2: Frequencies of phonemes post-/w/
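
Example (illustrative sketch, not the repository's actual code): the v0.1.0 processing rules in the changelog could look roughly like the Python below. The function names `first_pronunciations` and `phoneme_counts` are hypothetical. It assumes that cmudict lines look like `WORD  PH1 PH2 ...`, that alternate pronunciations are marked `WORD(1)`, that comment lines start with `;;;`, and that the word frequencies have already been parsed into a dict. Weighting each phoneme by its word's frequency, and reading "uncorrelateds" as words that appear in only one of the two sources, are also assumptions.

```python
import re
from collections import Counter

# Assumed marker for alternate pronunciations, e.g. "WE(1)".
ALTERNATE = re.compile(r".+\(\d+\)$")


def first_pronunciations(cmudict_lines):
    """Map each lowercased word to its first pronunciation, stress digits removed."""
    prons = {}
    for line in cmudict_lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comments
        word, *phones = line.split()
        if ALTERNATE.match(word):
            continue  # use only the first pronunciation
        # "IY1" -> "IY": drop the trailing stress digit
        prons.setdefault(word.lower(), [p.rstrip("012") for p in phones])
    return prons


def phoneme_counts(prons, word_freqs):
    """Return (overall, after_w) phoneme counters, weighted by word frequency."""
    overall = Counter()
    after_w = Counter()
    for word, freq in word_freqs.items():
        phones = prons.get(word)
        if phones is None:
            continue  # discard words with no cmudict entry (assumed "uncorrelated")
        for i, phone in enumerate(phones):
            overall[phone] += freq  # Q1: phonemes generally
            if i > 0 and phones[i - 1] == "W":
                after_w[phone] += freq  # Q2: phonemes right after /w/
    return overall, after_w
```

Dividing each counter value by the counter's total would turn these weighted counts into relative frequencies.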