Latest revision as of 15:50, 24 April 2017

Introduction

by Sam Garvis

For my project, I wanted MATLAB to take in phonemes specified by the user and play them back as audio, using wav files I recorded for each phoneme. Creating the files was somewhat difficult because of constraints on what MATLAB could read, but I was able to finish the project.

How it was done

After recording every phoneme, I loaded each file into a variable named after the phoneme. The user refers to these names when entering text for speech conversion. The clips for the entered phonemes are then concatenated into one vector of doubles, which is passed to sound() together with the sampling frequency. The result is also plotted on a 2-D graph to show the waveform of the speech.
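
The concatenation step can be sketched without any recordings. The snippet below is a minimal illustration, not the project code: it assumes two made-up phoneme names (aa and ss) backed by synthetic tones instead of wav data, and it keeps them in a containers.Map rather than in separate variables.

```matlab
fs = 44100;                          % sampling frequency in Hz
t  = (0:round(0.2*fs)-1)' / fs;      % 0.2 s time axis as a column vector

% Hypothetical stand-ins for recorded phonemes (synthetic tones, not speech)
bank = containers.Map();
bank('aa') = 0.5 * sin(2*pi*220*t);
bank('ss') = 0.5 * sin(2*pi*440*t);

text  = 'aa ss aa';                  % phoneme names separated by single spaces
parts = strsplit(text);              % cell array of phoneme names

speech = [];                         % grows by one clip per phoneme
for j = 1:numel(parts)
    speech = [speech; bank(parts{j})]; %#ok<AGROW>
end

% sound(speech, fs)                  % uncomment to play the result
```

Because every clip is a column vector, vertical concatenation simply places the clips one after another in time.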

Issues with project

Because of how I converted my recordings from mp3 to wav, a recording needed a minimum size before the converter would recognize it as a file. Some of the files were therefore lengthened to allow conversion. Most of the files adjusted this way were the unvoiced sounds, since lengthening them is very difficult.
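
One possible way to lengthen a clip that is too short is to pad it with silence. The sketch below is only an illustration under stated assumptions: minSamples is a hypothetical threshold (the converter's actual minimum size is not given here), and a synthetic noise burst stands in for a real unvoiced recording.

```matlab
fs = 44100;                                % sampling frequency in Hz
minSamples = round(0.25 * fs);             % hypothetical minimum clip length (assumption)

clip = 0.1 * randn(round(0.05 * fs), 1);   % 50 ms noise burst standing in for an unvoiced sound

% Append zeros (silence) only when the clip is shorter than the minimum
shortfall = max(0, minSamples - numel(clip));
padded = [clip; zeros(shortfall, 1)];
```

Padding leaves the recorded sound untouched, but the added silence becomes a short pause once the clips are concatenated.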

Results

The combinations of phonemes that humans use are more complicated than we realize. Our ability to modulate our voice, add inflection to our words, and use many other vocal abilities is what allows us to speak so well. This becomes apparent when human speech is compared with how a computer speaks the same sentence from isolated English phonemes.

%% Information
% Here is the text2speech program for Sam Garvis. This program will only
% work with a folder of wav files for text2speech use. It can be edited
% (via wav files) to add more sounds and/or edit current sounds.

Matlab code


% added length to certain files in order for them to be picked up
% by audio converter

%% Sites used

% http://www.auburn.edu/academic/education/reading_genie/spellings.html

% https://online-voice-recorder.com/

% http://online-audio-converter.com/

% https://www.mathworks.com/help/matlab/ref/audioread.html

% https://www.mathworks.com/help/matlab/ref/dir.html

% https://www.mathworks.com/help/matlab/import_export/process-a-sequence-of-files.html

% https://www.mathworks.com/help/matlab/ref/fileparts.html

% https://www.mathworks.com/help/matlab/ref/genvarname.html

%% Loading data from folder
clear, clc % clear workspace and command window

fs = 44100; % sampling frequency (Hz)

phonems = dir('*.wav'); % checks current directory for .wav files
numfiles = length(phonems); % finds number of wav files

% converts each wav file into its corresponding variable with name
for i = 1:numfiles

   titles = phonems(i,1).name;
   [~,sound_name,ext] = fileparts(titles);
   eval([sound_name '= audioread(phonems(i,1).name);']);   

end

% input section
input_phonems = ['Input the word you want in the form\n of the phonemes shown '...
   'as wav files with\n 1 space between each phoneme\n'];

% text = input(input_phonems,'s');

text = 'ha eh ll O wu er ll du th ii ss wu aa eh zz du nn bu I ss aa mm gu ar vv ii ss'; % this is here for the example

parts = strsplit(text); % splits text into readable parts

speech = []; % empty array that will hold the concatenated speech

% adds on to previous speech in order to make the speech
for j = 1:length(parts)

   eval(['section = ',char(parts(j)),';']); % looks up the variable named after this phoneme (a double array)
   speech = [speech; section];   

end

sound(speech,fs) % plays text
plot((0:length(speech)-1)/fs, speech) % plots against time in seconds
title('Speech')
xlabel('time (s)')
ylabel('Amplitude')
% example: ha eh ll O wu er ll du - hello world
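
The loading and lookup steps in the listing above can also be written without eval. The following is a hedged sketch, not the program used for the project: it writes two synthetic clips (hypothetical names aa.wav and ss.wav) to a temporary folder so that it runs without the real recordings, stores each clip in a containers.Map keyed by file name, and reports an error for any phoneme name that has no file.

```matlab
fs = 44100;                                   % sampling frequency in Hz
folder = tempname;                            % throwaway folder for this demo
mkdir(folder);

% Write two synthetic clips so the example does not depend on real recordings
t = (0:round(0.1*fs)-1)' / fs;
audiowrite(fullfile(folder, 'aa.wav'), 0.5*sin(2*pi*220*t), fs);
audiowrite(fullfile(folder, 'ss.wav'), 0.5*sin(2*pi*440*t), fs);

files = dir(fullfile(folder, '*.wav'));       % list the wav files
bank  = containers.Map();                     % phoneme name -> samples
for i = 1:numel(files)
    [~, name] = fileparts(files(i).name);     % file name without extension
    bank(name) = audioread(fullfile(folder, files(i).name));
end

parts  = strsplit('aa ss');                   % example input
speech = [];
for j = 1:numel(parts)
    if ~isKey(bank, parts{j})
        error('Unknown phoneme "%s"', parts{j});
    end
    speech = [speech; bank(parts{j})]; %#ok<AGROW>
end

tAxis = (0:numel(speech)-1)' / fs;            % time in seconds for plotting
% plot(tAxis, speech); xlabel('time (s)'); ylabel('Amplitude')

rmdir(folder, 's');                           % remove the demo folder
```

Keeping the clips in a map means a mistyped phoneme produces a clear error message, and user input is never executed as code.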

Wav Files

Media:A_garvis.wav
Media:aa_garvis.wav
Media:ar_garvis.wav
Media:aw_garvis.wav
Media:bu_garvis.wav
Media:ch_garvis.wav
Media:du_garvis.wav
Media:E_garvis.wav
Media:eh_garvis.wav
Media:er_garvis.wav
Media:ff_garvis.wav
Media:gu_garvis.wav
Media:ha_garvis.wav
Media:I_garvis.wav
Media:ii_garvis.wav
Media:jj_garvis.wav
Media:ks_garvis.wav
Media:ku_garvis.wav
Media:kw_garvis.wav
Media:ll_garvis.wav
Media:mm_garvis.wav
Media:ng_garvis.wav
Media:nn_garvis.wav
Media:O_garvis.wav
Media:oa_garvis.wav
Media:oo_garvis.wav
Media:ou_garvis.wav
Media:ouh_garvis.wav
Media:pu_garvis.wav
Media:rr_garvis.wav
Media:sh_garvis.wav
Media:ss_garvis.wav
Media:th_garvis.wav
Media:tu_garvis.wav
Media:U_garvis.wav
Media:uh_garvis.wav
Media:vv_garvis.wav
Media:wu_garvis.wav
Media:yu_garvis.wav
Media:zh_garvis.wav
Media:zz_garvis.wav
