Revision as of 08:33, 4 November 2009
Due to a kiwi server failure, my pre-lecture notes are not as substantial as I would have liked. See my post-lecture notes for a more detailed description.
- The server failed??? When?? Zach do you know anything about this? --Mboutin 19:45, 3 November 2009 (UTC)
- It was from around 2pm till about 5:30pm Tuesday. When I tried to preview my page that I had started writing, it said something like "Server not available." --Pclay
- We will look into this. Thanks for the detailed info Peter! --Mboutin 13:33, 4 November 2009 (UTC)
Notes for speech lecture:
Structure:
-> Basic speech stuff (pipes, fricatives)
-> Voiced vs. Unvoiced
1) avg power
2) zero crossing
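The two voiced/unvoiced cues listed above can be sketched as follows. This is only a minimal illustration: the function names, the thresholds, the 8 kHz sampling rate, and the toy frames are all assumptions made for the example, not values from the lecture.

```python
import math

def average_power(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def is_voiced(frame, power_threshold=0.01, zcr_threshold=0.25):
    # Assumed rule of thumb: voiced speech has high power and few zero crossings.
    # Both thresholds are illustrative, not taken from the notes.
    return (average_power(frame) > power_threshold
            and zero_crossing_rate(frame) < zcr_threshold)

fs = 8000  # assumed sampling rate in Hz
# Toy "voiced" frame: strong 100 Hz sinusoid (low frequency, high power).
voiced_like = [0.5 * math.sin(2 * math.pi * 100 * n / fs) for n in range(240)]
# Toy "unvoiced" frame: weak signal that changes sign at every sample.
unvoiced_like = [0.05 * (-1) ** n for n in range(240)]

print(is_voiced(voiced_like), is_voiced(unvoiced_like))
```

In practice the thresholds would be tuned on real speech frames; the point here is only that the two measures pull in opposite directions for voiced and unvoiced sounds.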
-> x(t) -> v(t) => s(t) = conv( x(t), v(t) )
x(t): periodic pulse train, v(t): filter, s(t): phoneme
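A discrete-time sketch of s = conv(x, v) may help here. The pitch period and the three-tap impulse response below are made-up toy values, and `convolve` is a hypothetical helper written for this example.

```python
def convolve(x, v):
    """Direct discrete convolution: s[n] = sum_k x[k] * v[n - k]."""
    s = [0.0] * (len(x) + len(v) - 1)
    for i, x_i in enumerate(x):
        for j, v_j in enumerate(v):
            s[i + j] += x_i * v_j
    return s

pitch_period = 8  # assumed, in samples
# x: periodic pulse train (one unit pulse every pitch period).
x = [1.0 if n % pitch_period == 0 else 0.0 for n in range(24)]
# v: toy impulse response standing in for the vocal tract filter.
v = [1.0, 0.5, 0.25]

s = convolve(x, v)
print(s[:pitch_period + 3])
```

Each pulse in x launches one copy of v, so the output is the filter response repeated once per pitch period.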
-> Model vocal tract as a series of tubes
- Going through tube delays the signal (show function)
- between tubes (show function)
+ This model leads to a transfer function -> Transfer function V(d)
Since the vocal tract is a cavity that resonates, it amplifies certain frequencies.
X(f) = sum_k( a_k * delta(f - k*f_a) )
The amplified frequencies, which are the local maxima of |S(f)|, are called formants.
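To connect the source spectrum to |S(f)|, here is a short worked step. It assumes V(f) denotes the Fourier transform of the filter v(t) and that f_a is the fundamental (pitch) frequency of the pulse train; neither is stated explicitly in the notes.

```latex
\begin{align*}
s(t) &= (x * v)(t) \\
S(f) &= X(f)\,V(f) \\
     &= \sum_{k} a_k\, V(k f_a)\, \delta(f - k f_a)
\end{align*}
```

So S(f) is a line spectrum at the harmonics k f_a, weighted by the filter response at each harmonic. If the a_k vary slowly with k, the envelope of |S(f)| follows |V(f)|, and its peaks sit at the resonances of the vocal tract.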
- Generally, the vocal tract transfer function is an all-pole filter where a real pole or a complex pole pair corresponds to a resonance.
- Also, if you are given a z-model, F = theta / (2*pi*T), where theta is the pole angle and T is the sampling period (same thing as wT = theta).
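As a quick numeric check of F = theta / (2*pi*T), the sketch below converts a pole angle to a frequency in Hz. The pole location and the 8 kHz sampling rate are assumed values chosen for illustration, and `formant_frequency_hz` is a hypothetical helper name.

```python
import cmath
import math

def formant_frequency_hz(pole, sampling_period):
    """Resonance frequency F = theta / (2*pi*T) for a pole at angle theta."""
    theta = abs(cmath.phase(pole))  # the conjugate pole gives the same frequency
    return theta / (2 * math.pi * sampling_period)

T = 1 / 8000                                # assumed sampling period (8 kHz)
pole = 0.95 * cmath.exp(1j * math.pi / 8)   # assumed pole at angle pi/8

print(formant_frequency_hz(pole, T))        # approximately 500 Hz
```

With theta = pi/8 and T = 1/8000 s, the formula gives (pi/8) / (2*pi) * 8000 = 500 Hz.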
- Zeros (anti-resonances) of the transfer function will occur when there is no measurable output (i.e. nasals and fricatives)
- Nasal => output from the mouth is zero
- Fricatives/stop consonants => blockage behind source is infinite (forcing air through constriction)
-> Spectrograms
- Shows frequency content vs. time
- Use a short-time DTFT to obtain useful info about an utterance
X_m(e^jw) = sum_n( x(n) w(n-m) e^(-jwn) )
- Wideband uses window length = one period
- high time resolution, low frequency resolution
- striations due to energy variation
- Narrowband captures several periods
- high frequency resolution, low time resolution
- striations correspond to peaks in frequency spectrum
The formants correspond to the dark bands.
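The short-time DTFT and the wideband/narrowband trade-off can be sketched numerically. Everything specific here is an assumption for illustration: the Hamming window, the window lengths (20 and 200 samples), the 8 kHz sampling rate, the 1 kHz test tone, and the helper names. The window is taken to be nonzero on 0..L-1, so w(n-m) selects samples m..m+L-1.

```python
import cmath
import math

def hamming(length):
    """Hamming window of the given length (assumed window choice)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (length - 1))
            for k in range(length)]

def short_time_dtft(x, m, window, omega):
    """X_m(e^{jw}) = sum_n x[n] w[n - m] e^{-jwn}, frame starting at sample m."""
    total = 0j
    for k, w_k in enumerate(window):
        n = m + k
        if 0 <= n < len(x):
            total += x[n] * w_k * cmath.exp(-1j * omega * n)
    return total

def magnitude_spectrum(x, m, window, num_bins=64):
    """|X_m| sampled at omega = pi * b / num_bins for b = 0..num_bins-1."""
    return [abs(short_time_dtft(x, m, window, math.pi * b / num_bins))
            for b in range(num_bins)]

def bins_above_half_max(spectrum):
    """Crude peak-width measure: number of bins above half the maximum."""
    peak = max(spectrum)
    return sum(1 for value in spectrum if value > 0.5 * peak)

fs = 8000  # assumed sampling rate in Hz
x = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(400)]  # 1 kHz test tone

short_spec = magnitude_spectrum(x, 100, hamming(20))   # short window (wideband-like)
long_spec = magnitude_spectrum(x, 100, hamming(200))   # long window (narrowband-like)

print(bins_above_half_max(short_spec), bins_above_half_max(long_spec))
```

The long window concentrates the tone into a much narrower spectral peak than the short window, at the cost of averaging over ten times as many samples in time. A spectrogram is this computation repeated for many frame positions m.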
-> How to read a spectrogram by Rob Hagiwara
http://home.cc.umanitoba.ca/~robh/howto.html