Neil Armstrong Moon Landing Speech Analysis
Latest revision as of 23:27, 25 November 2016
Introduction
When Neil Armstrong landed on the moon, he said "That's one small step for (a) man; one giant leap for mankind." However, because of all the noise (machine vibration, breathing, and radio-frequency white noise), listeners cannot tell whether he said "a man" or just "man". Many experts have tried different methods but still could not extract the "a" from the background noise, so the answer remains unknown.
This small project focuses on noise reduction techniques that increase the overall signal-to-noise ratio (SNR).
Two techniques will be introduced:
- Bandpass Filter
- Noise Gate
Original Speech Analysis
Speech in Time Domain
The original speech file can be found on the NASA page: July 20, 1969: One Giant Leap For Mankind
While playing the audio, we can hear a constant machine vibration and white noise in the background.
The whole audio can be divided into five parts: two noise-only segments and three speech segments.
We can use MATLAB to load the audio file and cut it into these parts.
Here, the five parts are named noise1, speech1, noise2, One_Man, and One_Mankind.
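% Load the recording; f_sampling is the sampling rate in Hz
[speech,f_sampling] = audioread('Original Speech.wav');

% Cut the recording into five parts (times in seconds, converted to sample indices)
noise1      = speech(1:3*f_sampling);
speech1     = speech(round(3.652*f_sampling):round(5.682*f_sampling));
noise2      = speech(round(5.694*f_sampling):round(15.388*f_sampling));
One_Man     = speech(round(15.337*f_sampling):round(18.584*f_sampling));
One_Mankind = speech(round(20.633*f_sampling):floor(24.1*f_sampling));

% Save the noise-only segments and the speech segments for the later steps
save noise.mat noise1 noise2
save Main.mat speech1 One_Man One_Mankind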
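The cutting step above repeats the same time-to-index conversion for every segment. A minimal sketch of a reusable helper follows, assuming a synthetic signal and an 8 kHz sampling rate for the demo; the name `seg` is hypothetical and is not part of the original code.

```matlab
% Demo data (assumptions): 5 s of sample indices at an assumed 8 kHz rate
fs = 8000;
x  = (1:5*fs)';

% seg (hypothetical helper): return the samples between t0 and t1 seconds,
% clamping the indices so they stay inside the signal
seg = @(sig, rate, t0, t1) sig(max(1, round(t0*rate)) : min(numel(sig), round(t1*rate)));

part = seg(x, fs, 1.0, 2.5);    % samples from 1.0 s to 2.5 s
head = seg(x, fs, 0, 3);        % start time 0 is clamped to index 1
```

Clamping the lower index to 1 lets a segment start at 0 s, which mirrors `speech(1:3*f_sampling)` in the original cutting code.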
Speech in Frequency Domain
Next, we look at the speech in the frequency domain.
The analysis tool is a spectrogram, based on the code from Lab 9a, Speech Processing I.
A spectrogram is a three-dimensional diagram in which the x-axis represents time, the y-axis represents frequency, and the z-axis represents the magnitude of the FFT (fast Fourier transform) over each short period of time. It shows how the frequency content of the audio is distributed over time. Based on the diagram, we can decide how to design our bandpass filter.
clear;          % MATLAB is case-sensitive, so the command is lowercase
FileCutting;    % presumably the cutting script (provides f_sampling and the segments)

delta_t = 40;
overlap = 20;
N = 512;

F1 = figure(1);
autoSpecgm(noise1,f_sampling,delta_t,overlap,N,F1);
title('Noise1')

F2 = figure(2);
autoSpecgm(noise2,f_sampling,delta_t,overlap,N,F2);
title('Noise2')

F3 = figure(3);
autoSpecgm(One_Man,f_sampling,delta_t,overlap,N,F3);
title('One Man')

F4 = figure(4);
autoSpecgm(One_Mankind,f_sampling,delta_t,overlap,N,F4);
title('One Mankind')

function autoSpecgm(signal,fs,delta_t,overlap,N,PIC)
    % Specgm is not listed on this page; the text attributes it to the Lab 9a code
    SIGNAL = Specgm(signal,delta_t,overlap,N);
    SIGNAL = transpose(SIGNAL);
    t = 1:(delta_t-overlap):length(signal);
    f = 0:fs/2;
    figure(PIC)
    % Left panel: image of the magnitude, lower half of the FFT bins only
    subplot(1,2,1)
    row = size(SIGNAL,1);
    imagesc(t,f,abs(SIGNAL(1:row/2,:)))
    axis xy
    % Right panel: the same magnitude as a 3-D mesh
    subplot(1,2,2)
    mesh(abs(SIGNAL(1:row/2,:)))
end
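Since `Specgm` is not listed on this page, here is a minimal sketch of the short-time FFT idea behind a spectrogram. It is not the Lab 9a routine: the test tone, the 8 kHz sampling rate, the window length, and the hop size are all assumptions chosen for the demo.

```matlab
% Synthetic input (assumption): a 1 kHz tone sampled at an assumed 8 kHz
fs  = 8000;
t   = (0:fs-1)'/fs;                  % one second of time samples
x   = sin(2*pi*1000*t);

win = 256;                           % window length in samples (assumed)
hop = 128;                           % step between windows in samples (assumed)
N   = 512;                           % FFT length, zero-padded
w   = 0.5 - 0.5*cos(2*pi*(0:win-1)'/(win-1));   % Hann window built by hand

nFrames = floor((length(x) - win)/hop) + 1;
S = zeros(N/2, nFrames);             % rows: frequency bins, columns: time frames
for k = 1:nFrames
    idx = (k-1)*hop + (1:win)';      % sample indices of frame k
    X   = fft(x(idx).*w, N);         % FFT of the windowed frame
    S(:,k) = abs(X(1:N/2));          % keep the non-negative frequencies
end

f       = (0:N/2-1)*fs/N;                     % frequency of each row in Hz
tFrames = ((0:nFrames-1)*hop + win/2)/fs;     % centre time of each frame in s
[~, peakBin] = max(mean(S, 2));               % strongest bin on average
peakFreq = f(peakBin);
```

Calling `imagesc(tFrames, f, S); axis xy` would then display the matrix as an image, in the same spirit as the left-hand panel of `autoSpecgm`.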
Noise Reduction in frequency domain using bandpass filter
From the spectrograms, we see that the constant noise is concentrated between 1 and 300 Hz, whereas the speech is concentrated between 250 and 2000 Hz.
Therefore, we can design a bandpass filter with a passband from 200 to 2000 Hz. Frequencies outside this range will be attenuated.
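order = 10;
fcutlow = 200;      % lower cutoff in Hz
fcuthigh = 2000;    % upper cutoff in Hz
% Cutoffs are normalized by the Nyquist frequency (f_sampling/2)
[b,a] = butter(order,[fcutlow,fcuthigh]/(f_sampling/2), 'bandpass');
Filtered_speech = filter(b,a,speech);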
After filtering, part of the noise is clearly attenuated. However, the remaining noise is still significant.
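One way to sanity-check the bandpass design is to run it on test tones inside and outside the passband. The sketch below uses the same order and cutoffs as the filter above, but it is an alternative formulation rather than the author's code: it builds the filter as second-order sections (`zp2sos`, `sosfilt`), which tends to be numerically safer than the `[b,a]` form for high orders. The 8 kHz sampling rate and the two tones are assumptions.

```matlab
fs = 8000;                                   % assumed sampling rate in Hz
t  = (0:2*fs-1)'/fs;                         % two seconds of samples
hum  = sin(2*pi*50*t);                       % 50 Hz tone standing in for machine noise
tone = sin(2*pi*1000*t);                     % 1 kHz tone standing in for speech

order = 10; fcutlow = 200; fcuthigh = 2000;  % same settings as the filter above
[z,p,k] = butter(order, [fcutlow,fcuthigh]/(fs/2), 'bandpass');
[sos,g] = zp2sos(z,p,k);                     % convert to second-order sections
bp = @(x) g*sosfilt(sos, x);                 % apply the sections and the overall gain

% Compare RMS levels after the first second, once the start-up transient has decayed
settle = fs;
rmsOf  = @(x) sqrt(mean(x(settle+1:end).^2));
gainHum  = rmsOf(bp(hum))/rmsOf(hum);        % expected to be close to 0
gainTone = rmsOf(bp(tone))/rmsOf(tone);      % expected to be close to 1
```

A tone at 250 to 300 Hz, where the noise and speech bands overlap, would pass almost unchanged, which is consistent with the observation that the filter alone leaves significant noise.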
Noise Gate: a technique that can erase noise
After applying the bandpass filter, we can see that a filter cannot remove all of the noise without also attenuating frequencies the speech occupies, because the noise band (1 to 300 Hz) and the speech band (250 to 2000 Hz) overlap.
Therefore, we need another technique to reduce the noise.
Searching online, I found a technique called a noise gate, which is used by some commercial software.
Basically, we take a noise sample from the audio and calculate its average power. The average powers of background noise and of the signal usually differ, and the signal generally has the higher one. Based on this, we can use an if-else statement to decide whether a given audio segment is signal or noise.
For more details, please look at this wiki page: Noise Gate
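The decision rule described above can be sketched on synthetic data. The `ave_power` used later on this page is not listed, so the mean-square definition here is an assumption, as are the 8 kHz sampling rate, the two tones standing in for noise and speech, and the threshold value.

```matlab
ave_power = @(x) mean(x.^2);          % assumed definition: mean of the squared samples

fs = 8000;                            % assumed sampling rate in Hz
t  = (0:fs-1)'/fs;
noise  = 0.05*sin(2*pi*60*t);         % quiet hum standing in for background noise
speech = 0.5*sin(2*pi*440*t);         % louder tone standing in for speech
x = [noise; noise + speech; noise];   % noise, then speech over noise, then noise

noiseRef   = ave_power(noise);        % reference power from a noise-only sample
windowSize = 1000;                    % samples per decision window
threshold  = 0.01;                    % power margin above the noise reference (assumed)

nWin = floor(length(x)/windowSize);
isSignal = false(nWin, 1);
for i = 1:nWin
    segment = x((i-1)*windowSize+1 : i*windowSize);
    % A window counts as signal when its power clearly exceeds the noise reference
    isSignal(i) = (ave_power(segment) - noiseRef) > threshold;
end
```

In this demo only the middle third of the windows is flagged as signal; a gate would keep those windows and attenuate or zero the rest.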
The noise gate should be applied after frequency filtering, because the filter lowers the overall noise amplitude. The average noise power then decreases, so the noise gate can better erase the noise segments.
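clear;          % MATLAB is case-sensitive, so the command is lowercase
FileCutting;

overlap = 20;
delta_t = 40;
N = 512;

% Bandpass filter first, so that the noise gate sees a lower noise power
order = 10;
fcutlow = 170;
fcuthigh = 1750;
[b,a] = butter(order,[fcutlow,fcuthigh]/(f_sampling/2), 'bandpass');
Filtered_speech = filter(b,a,speech);

% SpeechCutting and ave_power are helper functions that are not listed on this page
SpeechCutting(Filtered_speech)
load('noise.mat');
load('Main.mat');

% Reference noise power, taken from the filtered second noise segment
ave_noise2 = ave_power(Fnoise2);
ave_noise = ave_noise2;

windowSize = 1000;                      % samples per gate window
Len_speech = length(Filtered_speech);
steps = floor(Len_speech/windowSize);
NR_speech = Filtered_speech;
error_edge = 0.01;                      % power margin above the noise reference
error = zeros(1,steps);

for i = 0:steps-1
    current_speech = Filtered_speech(i*windowSize+1:(i+1)*windowSize);
    current_power = ave_power(current_speech);
    error(i+1) = current_power - ave_noise;
    if error(i+1) > error_edge
        % Clearly above the noise reference: keep the window unchanged
        NR_speech(i*windowSize+1:(i+1)*windowSize) = current_speech;
    elseif error(i+1) < error_edge && error(i+1) > 0
        % Only slightly above the noise reference: attenuate strongly
        NR_speech(i*windowSize+1:(i+1)*windowSize) = 0.01 * current_speech;
    elseif error(i+1) < error_edge && error(i+1) < 0
        % Below the noise reference: mute the window
        NR_speech(i*windowSize+1:(i+1)*windowSize) = 0 * current_speech;
    end
end

% Compare the filtered speech (top) with the gated speech (bottom)
figure(1)
subplot(2,1,1)
plot(Filtered_speech)
subplot(2,1,2)
plot(NR_speech)

SpeechCutting(NR_speech)
load('Main.mat');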
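To put a number on the signal-to-noise ratio, one rough estimate compares the power of a speech-plus-noise segment with the power of a noise-only segment. The sketch below is an illustration on synthetic tones, not a measurement from the recording; the mean-square `ave_power`, the 8 kHz rate, and the amplitudes are assumptions, and the estimate also assumes the noise power during speech matches the noise-only segment.

```matlab
ave_power = @(x) mean(x.^2);                     % assumed definition of average power

fs = 8000;                                       % assumed sampling rate in Hz
t  = (0:fs-1)'/fs;
noiseOnly       = 0.1*sin(2*pi*60*t);            % stand-in for a noise-only segment
speechPlusNoise = sin(2*pi*440*t) + noiseOnly;   % stand-in for a speech segment

Pn = ave_power(noiseOnly);                       % noise power
Ps = max(ave_power(speechPlusNoise) - Pn, eps);  % signal power, kept positive
snr_dB = 10*log10(Ps/Pn);                        % SNR in decibels
```

If the gate has muted the noise-only segment completely, `Pn` becomes zero and the ratio is undefined, so this estimate is most useful for comparing the original and the bandpass-filtered audio.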
Conclusion
The bandpass filter and the noise gate can remove a large portion of the noise in Neil Armstrong's speech and increase its signal-to-noise ratio. By optimizing the parameters, we can obtain better speech quality.
References
Noise gate wiki: https://en.wikipedia.org/wiki/Noise_gate?oldformat=true
ECE438 - Laboratory 9:Speech Processing: https://engineering.purdue.edu/VISE/ee438L/lab9/pdf/lab9a.pdf
Noise Reduction: http://wiki.audacityteam.org/wiki/Noise_Reduction