Latest revision as of 06:19, 21 March 2013
Audio Signal Generating and Processing Project
Student project for ECE438
Introduction:
- Listen to this piece of music: Media:Audio_Signal_Generating_and_Processing_Project_final_verison.wav
- Just so-so, right? But it was generated by computer, using MATLAB.
- Abstract -
- This project is intended to analyze the sounds of different musical instruments and to create artificial instrument sounds that can play a piece of music.
- Procedure -
- A recording of a limited number of keys on a piano keyboard was used. The original sample is here: Media:Orginal_sound_sample.wav
- After the first method proved frustrating, I decided to up/down sample the recorded keys in the right order and place the results at the right keys.
- According to the modern theory of musical intervals (equal temperament), each octave is divided into 12 equally spaced intervals, and a note one octave higher has twice the frequency. So adjacent notes are spaced by a frequency ratio of $ 2^{ \frac{1}{12}} = 1.05946309 $
- But here comes a problem: up/down sampling only works by an integer factor. One cannot upsample by 1.05946309.
- However, consider the rational number $ { \frac{18}{17}} = 1.05882353 $, which is relatively close to 1.05946309.
- A more accurate fraction is $ { \frac{107}{101}} = 1.059405941 $. The gain in accuracy is small in absolute terms, but, as we can see below, the larger factors increase the number of computation steps rapidly. So I chose $ { \frac{18}{17}} = 1.05882353 $ as the approximate factor.
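As a rough check of the fraction choice above, the following sketch searches for the closest fraction to the half-step ratio under a denominator limit. It uses only the Python standard library; the helper names `best_fraction` and `relative_error` are made up for this illustration and are not part of the original MATLAB work.

```python
from fractions import Fraction

# Exact equal-tempered half-step ratio, about 1.05946309.
SEMITONE = 2 ** (1 / 12)

def best_fraction(max_denominator):
    """Closest fraction to SEMITONE whose denominator does not exceed the limit."""
    return Fraction(SEMITONE).limit_denominator(max_denominator)

def relative_error(frac):
    """Relative deviation of a candidate fraction from the exact ratio."""
    return abs(float(frac) - SEMITONE) / SEMITONE

# Compare the two fractions discussed in the text.
for frac in (Fraction(18, 17), Fraction(107, 101)):
    print(frac, f"{float(frac):.8f}", f"relative error {relative_error(frac):.2e}")

# Best candidate when the resampling factors are kept at 20 or below.
print("best with denominator <= 20:", best_fraction(20))
```

Small denominators matter here because the upsampling factor multiplies the number of samples that have to be filtered at each step.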
- Next, using this fraction, apply the following:
- If a note a half step above the original is desired, upsample by 17, then downsample by 18. Call this "move up".
- In this case the signal is preserved, but at a lower sampling frequency. If it is played back at the original sampling frequency, the note a half step above is heard.
- If a note a half step below the original is desired, upsample by 18, then downsample by 17. Call this "move down".
- In this case the signal is preserved, but at a higher sampling frequency. If it is played back at the original sampling frequency, the note a half step below is heard.
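The "move up" and "move down" operations can be sketched as follows. This is only an illustration under a loud assumption: a proper resampler inserts zeros, low-pass filters, and then decimates, whereas this sketch substitutes plain linear interpolation to stay short and dependency-free. The function names and the 440 Hz test tone at 8000 Hz are hypothetical choices, not taken from the project.

```python
import math

def resample_rational(x, up, down):
    """Resample x to about len(x)*up/down samples.

    Assumption: linear interpolation stands in for the usual
    zero-insertion, low-pass filtering, and decimation chain.
    """
    n_out = (len(x) * up) // down
    out = []
    for m in range(n_out):
        pos = m * down / up          # position measured in input samples
        i = int(pos)
        frac = pos - i
        nxt = x[i + 1] if i + 1 < len(x) else x[i]
        out.append((1 - frac) * x[i] + frac * nxt)
    return out

def move_up(x):
    """Half step up: fewer samples, so playback at the original rate is higher."""
    return resample_rational(x, 17, 18)

def move_down(x):
    """Half step down: more samples, so playback at the original rate is lower."""
    return resample_rational(x, 18, 17)

def rising_crossings(x):
    """Count negative-to-nonnegative transitions, a crude pitch indicator."""
    return sum(1 for a, b in zip(x, x[1:]) if a < 0 <= b)

# Hypothetical test tone: one second of 440 Hz sampled at 8000 Hz.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
higher = move_up(tone)
print(len(tone), len(higher))
print(rising_crossings(tone), rising_crossings(higher))
```

The resampled tone contains about the same number of cycles in fewer samples, which is why it sounds a half step higher when played at the original sampling frequency.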
- For each octave (from the lower C to the higher C):
- take the lower C and "move up" one step at a time, recursively, to get a full chromatic scale with exactly the timbre of the lower C; call it map1;
- take the higher C and "move down" one step at a time, recursively, to get a full chromatic scale with exactly the timbre of the higher C; call it map2;
- if we simply take one part of the scale from map1 and the other part from map2, the timbre changes abruptly at the junction, which makes the sound very unnatural.
- You can hear it here: Media:Audio_Signal_Generating_and_Processing_Project_Timber_before.wav
- Instead, apply the following method:
- each note is a weighted sum of its map1 and map2 versions, weighted by how close the note is to each end point.
- For example, the notes C#, F, and A are constructed by
- $ {C^\#} = { \frac{11}{12}}*map_1(C^\#) + { \frac{1}{12}}*map_2(C^\#) $
- $ F = { \frac{7}{12}}*map_1(F) + { \frac{5}{12}}*map_2(F) $
- $ A = { \frac{3}{12}}*map_1(A) + { \frac{9}{12}}*map_2(A) $
- This takes care of the timbre difference. Minor details are still not perfect, but simply replacing the original signal might improve them; it is a very poorly recorded signal.
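The weighting rule above can be written as a small sketch. The weights follow the three example equations: a note k half steps above the lower C gets weight (12 - k)/12 on its map1 version and k/12 on its map2 version. The names `blend_weights` and `blend` are hypothetical, and truncating both versions to the shorter length is an assumption made here, since the text does not say how differing lengths were handled.

```python
from fractions import Fraction

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def blend_weights(note):
    """Return (weight for map1, weight for map2) for a note name."""
    k = NOTE_NAMES.index(note)       # half steps above the lower C
    return Fraction(12 - k, 12), Fraction(k, 12)

def blend(note, from_map1, from_map2):
    """Weighted sum of the two versions of one note.

    Assumption: both versions are truncated to the shorter length.
    """
    w1, w2 = blend_weights(note)
    n = min(len(from_map1), len(from_map2))
    return [float(w1) * from_map1[i] + float(w2) * from_map2[i] for i in range(n)]

for name in ("C#", "F", "A"):
    print(name, blend_weights(name))
```

Because the weights change by 1/12 per half step, the timbre moves gradually from that of the lower C to that of the higher C instead of jumping at one junction.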
- Error analysis:
- The ratio I picked is 1.05882353, versus the exact factor 1.05946309.
- The error factor per step is $ \frac{1.05946309}{1.05882353} = 1.00060403 $
- Since this error accumulates, and I am generating 12 notes from one real note, take $ 1.00060403^{12} = 1.00727253 $ as the maximum error factor.
- This difference is $ \log_{1.05946309}(1.00727253) = 0.12544891 $, about 1/8 of a step.
- In modern music, small pitch differences are measured in "cents". Each half step contains 100 cents.
- In this case, the error is within 13 cents. For pure tones, the smallest pitch difference the human ear can distinguish is about 6 to 7 cents.
- On string instruments, people can distinguish about 12 to 20 cents.
- These figures still need to be verified, but in my opinion they represent the best that any human can do. I ran an experiment with my music teacher: I myself can only distinguish about 1/3 of a step on a string instrument (about 35 cents), and even my music teacher can only distinguish about 1/4 of a step (about 25 cents).
- On the other hand, a poorly tuned piano can easily be off by 20 cents.
- So I claim that this approach is acceptable in terms of pitch.
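The arithmetic of the error analysis can be reproduced with a few lines. This sketch uses the standard definition of cents, 1200 times the base-2 logarithm of a frequency ratio, so that one half step is 100 cents; the variable names are chosen for this illustration only.

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents (100 cents per half step)."""
    return 1200 * math.log2(ratio)

exact = 2 ** (1 / 12)   # exact half-step ratio
approx = 18 / 17        # resampling ratio used in the project

per_step = cents(exact / approx)   # pitch error of one "move up" or "move down"
worst = 12 * per_step              # error after 12 accumulated steps

print(f"per step: {per_step:.3f} cents")
print(f"after 12 steps: {worst:.3f} cents")
```

Multiplying the per-step error by 12 is equivalent to raising the error factor to the 12th power, because cents are a logarithmic unit.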
- Hence, we now have a full piano keyboard.
- The data was saved as a matrix in a .mat file.
- A script was written that uses a pitch matrix and a rhythm matrix to select the corresponding columns of the keyboard matrix.
- It then combines the notes of different pitches and durations into a song, as you heard at the beginning.
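The assembly step might look roughly like the sketch below. The original script and its matrix layout are not shown in this page, so everything here is an assumption: the keyboard is modeled as a list with one list of samples per key, pitches are key indices, durations are lengths in samples, and short notes are padded with silence.

```python
def assemble_song(keyboard, pitches, durations):
    """Concatenate notes selected from a keyboard table.

    keyboard  -- list of sample lists, one per key (hypothetical layout)
    pitches   -- key index for each note of the song
    durations -- length of each note in samples
    """
    song = []
    for key, n in zip(pitches, durations):
        note = keyboard[key][:n]
        note = note + [0.0] * (n - len(note))   # pad with silence if the sample is short
        song.extend(note)
    return song

# Tiny made-up keyboard with two "keys" of four samples each.
demo_keyboard = [[1.0] * 4, [2.0] * 4]
print(assemble_song(demo_keyboard, [0, 1, 0], [2, 4, 6]))
```

Keeping pitch and rhythm in separate sequences means the same keyboard table can play any melody without being regenerated.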
- There is another method I tried, which was to generate the signal directly by inspecting a musical instrument's FFT, but it did not produce good results. Documentation can be found here:
Audio_Signal_Generating_and_Processing_Project,_Previous_method