
Revision as of 22:43, 23 April 2017

Hello, World!

by Alden Fisher


Introduction

My original intent was to load an audio file into Matlab and have it print what I was saying as plain text. This proved to be a challenge for several reasons, which I describe below. What I ended up doing instead was finding the first and second formants in the famous phrase "Hello, World."

Challenges

The toughest part of the original plan was finding a table that mapped out the formants of the 42 phonemes [1] in the English language. Without such a table, I had nothing to compare my results to, so the computer had no reference data to use. For this reason, I changed the direction of the project to the current goal. Additionally, it was hard, even as a native English speaker, to understand how the IPA is set up [3]. In essence, it was difficult to get through the literature and understand how to map a word accurately while preserving each sound. I did find formant charts for vowels [1], and I found the phonemes for each word [4], but there were no data about the formants corresponding to those phonemes.

Approach

I recorded myself in a quiet room saying the phrase "Hello, World." The recording was made on my iPhone, which has a sampling rate of 44.1 kHz. From there, I converted the file [2] to '.wav' format so that it would be compatible with all computing platforms. Once I had the audio file, I manually trimmed the data to remove any dead time. One assumption I made was that each of the 10 letters lasted the same amount of time. For this reason, I took 10 DFTs, one per letter, using the 'DFTwin' function we created in lab 9a [1]. From each DFT, I extracted the two largest peaks, which I took to be the formants. Once I had these, I was able to plot them in 3-space with respect to the letters.
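The steps above (equal-length segments, one windowed DFT per segment, two largest peaks) can be sketched roughly as follows. This is a Python illustration using only the standard library, not the Matlab code or the 'DFTwin' function used in the project. The helper names, the Hamming window, the 8 kHz test rate, and the synthetic two-tone signal standing in for the recording are all assumptions made for the example.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n (assumed window; the project's may differ)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_magnitude(frame, n_bins):
    """Magnitude of the DFT of `frame` for bins 0 .. n_bins-1 (direct sum)."""
    n = len(frame)
    mags = []
    for k in range(n_bins):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def split_equal(samples, n_segments):
    """Cut the trimmed signal into n_segments pieces of equal length."""
    seg = len(samples) // n_segments
    return [samples[i * seg:(i + 1) * seg] for i in range(n_segments)]


def two_largest_peaks(frame, fs):
    """Frequencies (Hz) of the two largest local maxima, lowest first."""
    n = len(frame)
    windowed = [x * w for x, w in zip(frame, hamming(n))]
    mags = dft_magnitude(windowed, n // 2)  # keep positive frequencies only
    peaks = [k for k in range(1, len(mags) - 1)
             if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]
    top = sorted(peaks, key=lambda k: mags[k], reverse=True)[:2]
    return sorted(k * fs / n for k in top)


# Synthetic stand-in for the trimmed recording: two tones at 400 Hz and
# 1200 Hz, sampled at 8 kHz (hypothetical values, not measured formants).
fs = 8000
signal = [math.sin(2 * math.pi * 400 * t / fs)
          + 0.6 * math.sin(2 * math.pi * 1200 * t / fs)
          for t in range(2000)]

# One pair of peak frequencies per "letter" (10 equal segments).
peaks_per_letter = [two_largest_peaks(seg, fs) for seg in split_equal(signal, 10)]
for letter_index, pair in enumerate(peaks_per_letter, start=1):
    print(letter_index, pair)
```

The frequency spacing between DFT bins is the sampling rate divided by the segment length, so short segments give coarse peak estimates. The largest spectral peaks of real speech can also be harmonics of the pitch rather than true formants, which is one source of the error mentioned in the conclusion.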

Audio.PNG
Matlab Code
O Hello.PNG

Conclusion

In a strange (and error-prone) way, I was able to collect some data about my speech and where exactly my formants lie.

3D plot.PNG


[1] Purdue ECE 438, "ECE438 - Laboratory 9: Speech Processing (Week 1)", October 6, 2010, https://engineering.purdue.edu/VISE/ee438L/lab9/pdf/lab9a.pdf.

Lab used for general direction and background information, including the formant chart for vowels

[2] http://www.zamzar.com/

Free online software used to convert the audio file

[3] https://en.wikipedia.org/wiki/International_Phonetic_Alphabet

Background information on how the words are structured and understood

[4] http://phonemicchart.com/transcribe/1000_basic_words.html

Used to get the official phonetic spelling for both words
