Revision as of 18:45, 26 January 2013
QE2012_AC-3_ECE580-2
Solution:
$ f = \frac{1}{2}x^TQx - x^Tb+c $
Use initial point $ x^{(0)} = [0, 0]^T $ and $ H_0 = I_2 $
In this case
$ g^{(k)} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} x^{(k)} - \begin{bmatrix} 2 \\ 1 \end{bmatrix} $
Hence $ g^{(0)} = \begin{bmatrix} -2 \\ -1 \end{bmatrix}, $ $ d^{(0)} = -H_0g^{(0)} =- \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} -2 \\ -1 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix} $
Because f is a quadratic function, the exact line search has a closed form:
$ \alpha_0 = \arg\min_{\alpha} f(x^{(0)} + \alpha d^{(0)}) = - \frac{g^{(0)^T}d^{(0)}}{d^{(0)^T}Qd^{(0)}} = - \frac{\begin{bmatrix} -2 & -1 \end{bmatrix}\begin{bmatrix} 2 \\ 1 \end{bmatrix}}{\begin{bmatrix} 2 & 1\end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} 2 \\ 1 \end{bmatrix}} = \frac{1}{2} $
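The closed-form step size comes from a one-dimensional minimization. As a short derivation, write $ \phi(\alpha) = f(x^{(k)} + \alpha d^{(k)}) $ (the symbol $ \phi $ is introduced here only for illustration) and assume Q is symmetric positive definite, so that $ d^{(k)^T}Qd^{(k)} > 0 $:
$ \phi'(\alpha) = d^{(k)^T}\left(Q(x^{(k)} + \alpha d^{(k)}) - b\right) = g^{(k)^T}d^{(k)} + \alpha \, d^{(k)^T}Qd^{(k)} = 0 $
$ \Rightarrow \alpha_k = - \frac{g^{(k)^T}d^{(k)}}{d^{(k)^T}Qd^{(k)}} $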
$ x^{(1)} = x^{(0)} + \alpha_0 d^{(0)} = \frac{1}{2} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ \frac{1}{2} \end{bmatrix} $
$ \Delta x^{(0)} = x^{(1)}- x^{(0)} = \begin{bmatrix} 1 \\ \frac{1}{2} \end{bmatrix} $ $ g^{(1)} =\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} x^{(1)} - \begin{bmatrix} 2 \\ 1 \end{bmatrix}= \begin{bmatrix} -\frac{1}{2} \\ 1 \end{bmatrix} $
$ \Delta g^{(0)} = g^{(1)} - g^{(0)} = \begin{bmatrix} \frac{3}{2} \\ 2 \end{bmatrix} $
Observe that
$ \Delta x^{(0)} \Delta x^{(0)^T} = \begin{bmatrix} 1 \\ \frac{1}{2} \end{bmatrix} \begin{bmatrix} 1 & \frac{1}{2} \end{bmatrix} = \begin{bmatrix} 1 & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{4} \end{bmatrix} $
$ \Delta x^{(0)^T} \Delta g^{(0)} = \begin{bmatrix} 1 & \frac{1}{2} \end{bmatrix}\begin{bmatrix} \frac{3}{2} \\ 2 \end{bmatrix} = \frac{5}{2} $
$ H_0 \Delta g^{(0)} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{3}{2} \\ 2 \end{bmatrix} = \begin{bmatrix} \frac{3}{2} \\ 2 \end{bmatrix}, $ $ (H_0 \Delta g^{(0)})(H_0 \Delta g^{(0)})^T = \begin{bmatrix} \frac{9}{4} & 3 \\ 3 & 4 \end{bmatrix} $
$ \Delta g^{(0)^T}H_0 \Delta g^{(0)} = \begin{bmatrix} \frac{3}{2} & 2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{3}{2} \\ 2 \end{bmatrix} = \frac{25}{4} $
Using the above, we now have
$ H_1 = H_0 + \frac{\Delta x^{(0)} \Delta x^{(0)^T}}{\Delta x^{(0)^T} \Delta g^{(0)}} - \frac{(H_0 \Delta g^{(0)})(H_0 \Delta g^{(0)})^T}{\Delta g^{(0)^T}H_0 \Delta g^{(0)} } = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} \frac{2}{5} & \frac{1}{5} \\ \frac{1}{5} & \frac{1}{10} \end{bmatrix} - \frac{4}{25}\begin{bmatrix} \frac{9}{4} & 3 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} \frac{26}{25} & -\frac{7}{25} \\ -\frac{7}{25} & \frac{23}{50} \end{bmatrix} $
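The rank-two update can also be checked with exact rational arithmetic. The following is a minimal Python sketch, not part of the original solution; the helper name `dfp_update` is hypothetical, and the inputs are the values of $ H_0 $, $ \Delta x^{(0)} $ and $ \Delta g^{(0)} $ computed for this problem.

```python
from fractions import Fraction as F

def dfp_update(H, dx, dg):
    """Rank-two DFP update of an n x n inverse-Hessian approximation H.

    Hypothetical helper for illustration: H is a list of rows,
    dx and dg are lists of Fractions.
    """
    n = len(dx)
    Hdg = [sum(H[i][j] * dg[j] for j in range(n)) for i in range(n)]  # H * dg
    dx_dg = sum(dx[i] * dg[i] for i in range(n))                      # dx^T dg
    dg_Hdg = sum(dg[i] * Hdg[i] for i in range(n))                    # dg^T H dg
    return [[H[i][j] + dx[i] * dx[j] / dx_dg - Hdg[i] * Hdg[j] / dg_Hdg
             for j in range(n)] for i in range(n)]

H0 = [[F(1), F(0)], [F(0), F(1)]]
dx0 = [F(1), F(1, 2)]
dg0 = [F(3, 2), F(2)]

H1 = dfp_update(H0, dx0, dg0)
for row in H1:
    print([str(entry) for entry in row])
```

A useful sanity check on any DFP update is the secant condition $ H_1 \Delta g^{(0)} = \Delta x^{(0)} $, which this $ H_1 $ satisfies.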
Then we have $ d^{(1)} = -H_1 g^{(1)} = - \begin{bmatrix} \frac{26}{25} & -\frac{7}{25} \\ -\frac{7}{25} & \frac{23}{50} \end{bmatrix} \begin{bmatrix} -\frac{1}{2} \\ 1 \end{bmatrix} = \begin{bmatrix} \frac{4}{5} \\ -\frac{3}{5} \end{bmatrix} $
$ \alpha_1 = \arg\min_{\alpha} f(x^{(1)} + \alpha d^{(1)}) = - \frac{g^{(1)^T}d^{(1)}}{d^{(1)^T}Qd^{(1)}} = - \frac{\begin{bmatrix} -\frac{1}{2} & 1 \end{bmatrix}\begin{bmatrix} \frac{4}{5} \\ -\frac{3}{5} \end{bmatrix}}{\begin{bmatrix} \frac{4}{5} & -\frac{3}{5}\end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} \frac{4}{5} \\ -\frac{3}{5} \end{bmatrix}} = \frac{5}{2} $
$ x^{(2)} = x^{(1)} + \alpha_1 d^{(1)} = \begin{bmatrix} 1 \\ \frac{1}{2} \end{bmatrix} + \frac{5}{2}\begin{bmatrix} \frac{4}{5} \\ -\frac{3}{5} \end{bmatrix} = \begin{bmatrix} 3 \\ -1 \end{bmatrix} $
$ \Delta x^{(1)} = x^{(2)} - x^{(1)} = \begin{bmatrix} 2 \\ -\frac{3}{2} \end{bmatrix} $ $ g^{(2)} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} x^{(2)} - \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $
Note that we have $ d^{(0)^T}Qd^{(1)} = 0; $ that is, $ d^{(0)} = \begin{bmatrix} 2 \\ 1 \end{bmatrix} $ and $ d^{(1)} = \begin{bmatrix} \frac{4}{5} \\ -\frac{3}{5} \end{bmatrix} $ are Q-conjugate directions.
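The Q-conjugacy of the two search directions can be confirmed numerically. This is a small Python sketch using exact fractions; the helper `quad_form` is a hypothetical name introduced only for this check.

```python
from fractions import Fraction as F

def quad_form(u, M, v):
    """Return u^T M v for small dense vectors and matrices (hypothetical helper)."""
    return sum(u[i] * M[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))

Q = [[F(1), F(1)], [F(1), F(2)]]
d0 = [F(2), F(1)]
d1 = [F(4, 5), F(-3, 5)]

cross = quad_form(d0, Q, d1)   # zero exactly when d0 and d1 are Q-conjugate
self0 = quad_form(d0, Q, d0)   # the denominator used for alpha_0
print(cross, self0)
```

The cross term vanishes, while $ d^{(0)^T}Qd^{(0)} = 10 $ is strictly positive, as expected for a positive definite Q.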
Solution 2:
$ \text{Let the initial point be } x^{(0)}= \begin{bmatrix} 0\\ 0 \end{bmatrix} \text{ and the initial inverse Hessian approximation be } H_0=\begin{bmatrix} 1 & 0\\ 0 & 1\end{bmatrix} $
$ g^{(k)} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} x^{(k)} - \begin{bmatrix} 2 \\ 1 \end{bmatrix} , \text{so} $
$ g^{(0)} = \begin{bmatrix} -2 \\ -1 \end{bmatrix}, $ $ d^{(0)} = -H_0 g^{(0)} =- \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} -2 \\ -1 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix} $
$ \alpha_0 = - \frac{g^{(0)^T}d^{(0)}}{d^{(0)^T}Qd^{(0)}} = - \frac{\begin{bmatrix} -2 & -1 \end{bmatrix}\begin{bmatrix} 2 \\ 1 \end{bmatrix}}{\begin{bmatrix} 2 & 1\end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} 2 \\ 1 \end{bmatrix}} = \frac{1}{2} $
$ x^{(1)} = x^{(0)} + \alpha_0 d^{(0)} = \frac{1}{2} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ \frac{1}{2} \end{bmatrix} $
$ \Delta x^{(0)} = x^{(1)}- x^{(0)} = \begin{bmatrix} 1 \\ \frac{1}{2} \end{bmatrix} $
$ g^{(1)} =\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} x^{(1)} - \begin{bmatrix} 2 \\ 1 \end{bmatrix}= \begin{bmatrix} -\frac{1}{2} \\ 1 \end{bmatrix} $
$ \Delta g^{(0)} = g^{(1)} - g^{(0)} = \begin{bmatrix} \frac{3}{2} \\ 2 \end{bmatrix} $
$ \text{Substituting these values into the DFP update formula gives} $
$ H_1 = H_0 + \frac{\Delta x^{(0)} \Delta x^{(0)^T}}{\Delta x^{(0)^T} \Delta g^{(0)}} - \frac{(H_0 \Delta g^{(0)})(H_0 \Delta g^{(0)})^T}{\Delta g^{(0)^T}H_0 \Delta g^{(0)} } = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} \frac{2}{5} & \frac{1}{5} \\ \frac{1}{5} & \frac{1}{10} \end{bmatrix} - \frac{4}{25}\begin{bmatrix} \frac{9}{4} & 3 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} \frac{26}{25} & -\frac{7}{25} \\ -\frac{7}{25} & \frac{23}{50} \end{bmatrix} $
$ d^{(1)} = -H_1 g^{(1)} = - \begin{bmatrix} \frac{26}{25} & -\frac{7}{25} \\ -\frac{7}{25} & \frac{23}{50} \end{bmatrix} \begin{bmatrix} -\frac{1}{2} \\ 1 \end{bmatrix} = \begin{bmatrix} \frac{4}{5} \\ -\frac{3}{5} \end{bmatrix} $
$ \alpha_1 = - \frac{g^{(1)^T}d^{(1)}}{d^{(1)^T}Qd^{(1)}} = - \frac{\begin{bmatrix} -\frac{1}{2} & 1 \end{bmatrix}\begin{bmatrix} \frac{4}{5} \\ -\frac{3}{5} \end{bmatrix}}{\begin{bmatrix} \frac{4}{5} & -\frac{3}{5}\end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} \frac{4}{5} \\ -\frac{3}{5} \end{bmatrix}} = \frac{5}{2} $
$ x^{(2)} = x^{(1)} + \alpha_1 d^{(1)} = \begin{bmatrix} 1 \\ \frac{1}{2} \end{bmatrix} + \frac{5}{2}\begin{bmatrix} \frac{4}{5} \\ -\frac{3}{5} \end{bmatrix} = \begin{bmatrix} 3 \\ -1 \end{bmatrix} $
$ \Delta x^{(1)} = x^{(2)} - x^{(1)} = \begin{bmatrix} 2 \\ -\frac{3}{2} \end{bmatrix} $
$ g^{(2)} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} x^{(2)} - \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $
$ \text{When the gradient is 0, we reach the minimum point, which is } x^{(2)}=\begin{bmatrix} 3 \\ -1 \end{bmatrix} $
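As an independent cross-check, the minimizer of a quadratic with positive definite Q satisfies $ Qx = b $, so the DFP result should agree with a direct linear solve. This is a minimal Python sketch under that assumption; `solve_2x2` is a hypothetical helper that applies Cramer's rule.

```python
from fractions import Fraction as F

def solve_2x2(Q, b):
    """Solve Q x = b for a 2x2 system by Cramer's rule (assumes det(Q) != 0)."""
    det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
    x0 = (b[0] * Q[1][1] - Q[0][1] * b[1]) / det
    x1 = (Q[0][0] * b[1] - Q[1][0] * b[0]) / det
    return [x0, x1]

Q = [[F(1), F(1)], [F(1), F(2)]]
b = [F(2), F(1)]

x_star = solve_2x2(Q, b)
print([str(v) for v in x_star])
```

The direct solve agrees with $ x^{(2)} $, consistent with DFP terminating in at most n = 2 steps on a two-dimensional quadratic when exact line searches are used.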
$ \color{blue}\text{Related Problem: } $ Employ the DFP method to find the minimizer of the following function
$ f = \frac{1}{2}x^TQx - x^Tb+c $ $ =\frac{1}{2}x^T \begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix}x-x^T\begin{bmatrix} -1 \\ 1 \end{bmatrix} $
Solution: This problem is essentially the same as the previous one, except that the numbers are different.
$ \text{Let the initial point be } x^{(0)}= \begin{bmatrix} 0\\ 0 \end{bmatrix} \text{ and the initial inverse Hessian approximation be } H_0=\begin{bmatrix} 1 & 0\\ 0 & 1\end{bmatrix} $
$ g^{(k)} = \begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix} x^{(k)} - \begin{bmatrix} -1 \\ 1 \end{bmatrix} , \text{so} $
$ g^{(0)} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, $ $ d^{(0)} = -H_0 g^{(0)} =- \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \end{bmatrix} $
$ \alpha_0 = - \frac{g^{(0)^T}d^{(0)}}{d^{(0)^T}Qd^{(0)}} = - \frac{\begin{bmatrix} 1 & -1 \end{bmatrix}\begin{bmatrix} -1 \\ 1 \end{bmatrix}}{\begin{bmatrix} -1 & 1\end{bmatrix}\begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1 \\ 1 \end{bmatrix}} = - \frac{-2}{2} = 1 $
$ x^{(1)} = x^{(0)} + \alpha_0 d^{(0)} = \begin{bmatrix} -1 \\ 1 \end{bmatrix} $
$ \Delta x^{(0)} = x^{(1)}- x^{(0)} = \begin{bmatrix} -1 \\ 1 \end{bmatrix} $
$ g^{(1)} =\begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix} x^{(1)} - \begin{bmatrix} -1 \\ 1 \end{bmatrix}= \begin{bmatrix} -1 \\ -1 \end{bmatrix} $
$ \Delta g^{(0)} = g^{(1)} - g^{(0)} = \begin{bmatrix} -2 \\ 0 \end{bmatrix} $
$ \text{Substituting these values into the DFP update formula gives} $
$ H_1 = H_0 + \frac{\Delta x^{(0)} \Delta x^{(0)^T}}{\Delta x^{(0)^T} \Delta g^{(0)}} - \frac{(H_0 \Delta g^{(0)})(H_0 \Delta g^{(0)})^T}{\Delta g^{(0)^T}H_0 \Delta g^{(0)} } = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} \\ -\frac{1}{2} & \frac{3}{2} \end{bmatrix} $
$ d^{(1)} = -H_1 g^{(1)} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} $
$ \alpha_1 = - \frac{g^{(1)^T}d^{(1)}}{d^{(1)^T}Qd^{(1)}} = \frac{1}{2} $
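Writing out the arithmetic behind this step size, with $ g^{(1)} = \begin{bmatrix} -1 \\ -1 \end{bmatrix} $ and $ d^{(1)} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} $ as computed for this problem:
$ \alpha_1 = - \frac{\begin{bmatrix} -1 & -1 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}}{\begin{bmatrix} 0 & 1\end{bmatrix}\begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}} = - \frac{-1}{2} = \frac{1}{2} $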
$ x^{(2)} = x^{(1)} + \alpha_1 d^{(1)} = \begin{bmatrix} -1 \\ 3/2 \end{bmatrix} $
$ \Delta x^{(1)} = x^{(2)} - x^{(1)} = \begin{bmatrix} 0 \\ \frac{1}{2} \end{bmatrix} $
$ g^{(2)} = \begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix} x^{(2)} - \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $
$ \text{When the gradient is 0, we reach the minimum point, which is } x^{(2)}=\begin{bmatrix} -1 \\ 3/2 \end{bmatrix} $
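The whole iteration can be reproduced with a short script. The following is a minimal Python sketch of DFP with exact line search on a quadratic, assuming $ H_0 = I $ and a symmetric positive definite Q; the function names (`matvec`, `dot`, `dfp_quadratic`) are hypothetical, and exact fractions are used so the iterates match the hand computation.

```python
from fractions import Fraction as F

def matvec(M, v):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dfp_quadratic(Q, b, x, max_iter=10):
    """Minimize 0.5*x^T Q x - x^T b by DFP with exact line search (sketch)."""
    n = len(x)
    H = [[F(int(i == j)) for j in range(n)] for i in range(n)]  # H0 = identity
    g = [q - bi for q, bi in zip(matvec(Q, x), b)]              # g = Qx - b
    for _ in range(max_iter):
        if all(gi == 0 for gi in g):                            # gradient is zero: done
            break
        d = [-v for v in matvec(H, g)]                          # d = -H g
        alpha = -dot(g, d) / dot(d, matvec(Q, d))               # exact step for a quadratic
        x_new = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = [q - bi for q, bi in zip(matvec(Q, x_new), b)]
        dx = [a - c for a, c in zip(x_new, x)]
        dg = [a - c for a, c in zip(g_new, g)]
        Hdg = matvec(H, dg)
        dx_dg, dg_Hdg = dot(dx, dg), dot(dg, Hdg)
        H = [[H[i][j] + dx[i] * dx[j] / dx_dg - Hdg[i] * Hdg[j] / dg_Hdg
              for j in range(n)] for i in range(n)]             # rank-two DFP update
        x, g = x_new, g_new
    return x

Q = [[F(4), F(2)], [F(2), F(2)]]
b = [F(-1), F(1)]
x_min = dfp_quadratic(Q, b, [F(0), F(0)])
print([str(v) for v in x_min])
```

Calling the same function with $ Q = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} $ and $ b = \begin{bmatrix} 2 \\ 1 \end{bmatrix} $ reproduces the minimizer of the first problem.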