Gradient descent algorithms are iterative methods for finding a local minimum of a differentiable function on $ \Re^n $.

The general form of the algorithm is: $ x^{(k+1)} = x^{(k)} - \alpha_k \nabla f(x^{(k)}) $

Here $ (k) $ denotes the current iteration and $ (k+1) $ the next. The step size $ \alpha_k $ is itself a function that differs between the various gradient descent algorithms. Lastly, $ \nabla f(x^{(k)}) $ is the gradient of $ f $ evaluated at the current $ x $; its negative, $ -\nabla f(x^{(k)}) $, is the direction in which the algorithm moves. The algorithm terminates when the gradient equals the zero vector, which indicates that a stationary point (in practice, usually a local minimum) has been reached.
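The iteration above can be sketched in a few lines of code. This is a minimal illustration, not a production solver: the function names, the fixed step size, and the example objective $ f(x, y) = (x-1)^2 + 2(y+2)^2 $ are all chosen here for demonstration, and the loop stops when the gradient norm is sufficiently small, since an exactly zero gradient is rarely reached in floating point.

```python
import numpy as np

def gradient_descent(grad_f, x0, alpha=0.1, tol=1e-8, max_iter=10000):
    """Fixed-step gradient descent: x_{k+1} = x_k - alpha * grad_f(x_k).

    Stops when the gradient norm falls below tol, i.e. when we are
    numerically at a stationary point.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - alpha * g  # move against the gradient
    return x

# Example: f(x, y) = (x - 1)^2 + 2*(y + 2)^2, whose gradient is
# (2(x - 1), 4(y + 2)); the minimizer is (1, -2).
minimum = gradient_descent(
    lambda x: np.array([2 * (x[0] - 1), 4 * (x[1] + 2)]),
    x0=[0.0, 0.0])
```

Here $ \alpha_k $ is held constant; the algorithms discussed below differ precisely in how this step size is chosen at each iteration.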

For an example of a common $ \alpha_k $, consider the steepest descent algorithm, one of many gradient descent algorithms. If the function is quadratic, an exact line search gives $ \alpha_k = \frac{\nabla f(x^{(k)})^T \nabla f(x^{(k)})}{\nabla f(x^{(k)})^T F(x^{(k)}) \nabla f(x^{(k)})} $,

where $ F(x) $ is the Hessian of $ f(x) $.
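For a quadratic $ f(x) = \frac{1}{2} x^T Q x - b^T x $ with symmetric positive definite $ Q $, the gradient is $ Qx - b $ and the Hessian is constant, $ F(x) = Q $, so the steepest descent step size can be computed in closed form. The sketch below assumes this form of $ f $; the function and matrix names are illustrative.

```python
import numpy as np

def steepest_descent_quadratic(Q, b, x0, tol=1e-10, max_iter=1000):
    """Steepest descent for f(x) = 0.5 * x^T Q x - b^T x, with
    Q symmetric positive definite. grad f(x) = Q x - b, Hessian = Q.

    Each step uses the exact line-search step size
    alpha_k = (g^T g) / (g^T Q g).
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = Q @ x - b
        if np.linalg.norm(g) < tol:
            break
        alpha = (g @ g) / (g @ Q @ g)  # exact minimizer along -g
        x = x - alpha * g
    return x

Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_star = steepest_descent_quadratic(Q, b, x0=np.zeros(2))
# x_star solves Q x = b, the unique minimizer of the quadratic.
```

This choice of $ \alpha_k $ minimizes $ f $ exactly along the current search direction, which is why it is the standard step size for steepest descent on quadratics.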
