Digital Speech Generation
Page Under Construction
Background
- Digital Speech Generation, as the name suggests, is the process of making a computer "speak" a sequence of letters/words in a meaningful way.
- For English, this is especially hard, because its spelling is very "unmathematical".
- By unmathematical, I mean that the "a" in "apple" is not pronounced the same way as the "a" in "hate". In other words, the same letter can sound different in different words.
- Consonants are mostly consistent in this respect, but vowels are not, by a long shot.
- Hindi, for example, is a little more mathematical. The "आ" in "आप" (a respectful "you") sounds the same as the "आ" in "आरंभ" ("start"), and this is true for all uses of आ.
- If English were the same way, all we would have to do would be to assign a sound to every letter and then just play it.
- Unfortunately, it's not that simple.
- What is done instead is to use another type of alphabet, the phonetic alphabet, which assigns a sound to every symbol within it. In doing so, this set of symbols can represent any sound in human language (or can it?). http://www.langsci.ucl.ac.uk/ipa/pulmonic.html
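To make the contrast concrete, here is a minimal sketch of a word-to-phoneme lookup. The dictionary entries and the informal sound names (`ae`, `ay`, and so on) are assumptions made up for this illustration; they are not standard IPA symbols and do not come from any IPA data file.

```matlab
% Hypothetical mini-dictionary: word -> list of informal sound names.
% The names are illustrative only (not IPA notation).
pron = containers.Map();
pron('apple') = {'ae', 'p', 'ul'};
pron('hate')  = {'h', 'ay', 't'};

appleSounds = pron('apple');
hateSounds  = pron('hate');

% The letter "a" is the 1st letter of "apple" and the 2nd letter of "hate",
% yet it maps to two different sounds.
sameSound = strcmp(appleSounds{1}, hateSounds{2});
fprintf('"a" in apple -> %s, "a" in hate -> %s\n', appleSounds{1}, hateSounds{2});
```

A per-letter table cannot express this difference, which is why the lookup has to be per word (or per phoneme) rather than per letter.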
Motivation
- The above technique sounds great, right? The question is, how easy is it to implement?
- The following experiment will explore if this can indeed be done.
- Unfortunately, IPA (the International Phonetic Association) does not give you these phonetic elements in a neat little zip file, nicely documented and ready to use.
- What it gives you is a folder called American English, which contains recordings of a bunch of words, along with a $30 Handbook that explains what each phoneme means.
- However, if we ignore our desire to conform to standards for a minute, we can easily sieve out the syllables from these words that contain the phonemes that we need.
- I named my own phonemes, and broke down my test phrase "ECE 438" into a composition of these phonemes. So please don't take the names of these phonemes as standard IPA notation. For example, if I say ee, I mean the ee in eel. The phonetic code for that is /iː/.
- Also, I am required to add this copyright information. For the record, I am allowed to use it for educational purposes. Copyright:phonetic_sounds
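The breakdown described above can be written down as plain data. The sketch below is one possible way to do it, assuming the informal phoneme names used on this page; the token names (`E`, `four`, ...) and the silence marker `gap` are names chosen for this example, and it places a single gap between tokens, which is simpler than the spacing used in the experiment.

```matlab
% Spoken tokens of "ECE 438" (token names are chosen for this sketch).
tokens = {'E', 'C', 'E', 'four', 'thirty', 'eight'};

% Assumed breakdown of each token into the informal phoneme names.
breakdown = struct( ...
    'E',      {{'ee'}}, ...
    'C',      {{'ss', 'ee'}}, ...
    'four',   {{'ff', 'oh'}}, ...
    'thirty', {{'thin', 'tee', 'ee'}}, ...
    'eight',  {{'ay', 'tee'}});

% Flatten into one sequence, with a 'gap' marker between tokens.
sequence = {};
for k = 1:numel(tokens)
    sequence = [sequence, breakdown.(tokens{k}), {'gap'}]; %#ok<AGROW>
end
sequence(end) = [];   % drop the trailing gap

disp(strjoin(sequence, ' '));
```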
Experiment: Can MATLAB say ECE 438?
- The answer to the above question is Yes, it can.
- As I stated above, IPA gives you recordings of a bunch of words like "sky", "tie", etc. in their zip file.
- What I did was load these into MATLAB, cut out the part of each word containing the syllable I needed, and then concatenate these syllables together to form the speech I needed.
- The following code will clarify this method:
% BIG WORDS TO CLEAVE BITS FROM
% Note: wavread/wavwrite are unavailable in recent MATLAB releases;
% audioread/audiowrite are the replacements (audiowrite takes the filename first).
bead  = wavread('Phonetic_sounds\American-English\Vowels\01-bead.wav');
sigh  = wavread('Phonetic_sounds\American-English\Consonants\12-sigh.wav');
fie   = wavread('Phonetic_sounds\American-English\Consonants\04-fie.wav');
bode  = wavread('Phonetic_sounds\American-English\Vowels\07-bode.wav');
thigh = wavread('Phonetic_sounds\American-English\Consonants\10-thigh.wav');
kite  = wavread('Phonetic_sounds\American-English\Consonants\16-kite.wav');
bade  = wavread('Phonetic_sounds\American-English\Vowels\03-bayed.wav');

% WORD BITS
% Each bit is a range of samples cut out of one of the words above.
gap  = zeros(1000, 1);       % short silence (named gap so it does not shadow the built-in pause)
ee   = bead(3000:5000);      clear bead;
ss   = sigh(1:4000);         clear sigh;
ff   = fie(1:3500);          clear fie;
oh   = bode(4000:7000);      clear bode;
thin = thigh(1:4000);        clear thigh;
tee  = kite(6000:10000);     clear kite;
ay   = bade(3000:7000);      clear bade;

% Concatenate the bits into "ECE 438", then play and save the result.
total = [ee; ss; ee; gap; ee; gap; gap; ff; oh; gap; thin; tee; ee; gap; ay; tee];
sound(total, 15000);
wavwrite(total, 15000, 'ece438.wav');
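The same cut-and-concatenate idea can be driven by a list of names instead of a hand-written vector. The following is only a sketch under stated assumptions: the segments here are placeholder tones rather than the real word clips, and the struct `seg` and the name `gap` are hypothetical names introduced for this example.

```matlab
fs = 15000;                  % playback rate used in the experiment above
t  = (0:1999)' / fs;         % 2000-sample time axis (column vector)

% Placeholder segments: plain tones stand in for the real syllable clips.
seg = struct();
seg.ee  = 0.5 * sin(2*pi*300*t);
seg.ss  = 0.5 * sin(2*pi*900*t);
seg.gap = zeros(1000, 1);    % short silence

% Names of the segments to play, in order ("E", "C", gap, "E").
sequence = {'ee', 'ss', 'ee', 'gap', 'ee'};

% Look up each name and stack the column vectors end to end.
parts = cellfun(@(name) seg.(name), sequence, 'UniformOutput', false);
total = vertcat(parts{:});

% sound(total, fs);          % uncomment to listen
```

With real clips, each field of `seg` would hold a slice of a loaded recording, and changing the spoken phrase would only mean editing `sequence`.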
Resources
- While the page gets worked on, here are some really cool web pages about speech and phonetics that I came across in my research:
- The UPenn course slides are particularly good: http://www.ling.upenn.edu/courses/Fall_2008/ling520/week1/week1.pdf (you can change the week number at the end of the hyperlink to access the other slides).
- This page is good to hear what different parts of the phonetic alphabet sound like. http://web.uvic.ca/ling/resources/ipa/charts/IPAlab/IPAlab.htm
- Also, on the UPenn webpage http://www.ling.upenn.edu/courses/Fall_2008/ling520/ try the links to the labs; they have some interesting stuff.
--Dlamba 20:00, 23 October 2009 (UTC)