
Revision as of 21:23, 6 December 2020

Proof of Fisher's Information

Variance can also be written as the difference between the expected value of the square of a random variable Y and the square of the expected value of Y. This identity is proven below:

$ \begin{align} Var(Y) &= E[(Y-E[Y])^2]\\ &= E[Y^2-2YE[Y]+(E[Y])^2]\\ &= E[Y^2]-2(E[Y])^2+(E[Y])^2\\ &= E[Y^2] - (E[Y])^2 \end{align} $

note: This identity is used often in statistics, so the steps above are not explained in further detail. If you would like to know more, check our "More Sources" tab.
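The variance identity can also be checked numerically. The following is a minimal sketch, not part of the original proof: the helper name `variance_two_ways` and the normal test data are assumptions chosen only for illustration. It computes the variance once from the definition and once from the identity, using population (divide by n) averages in both cases.

```python
import random


def variance_two_ways(samples):
    """Return (variance from the definition, variance from the identity).

    Both use population averages (divide by n), so they should agree
    up to floating-point rounding.
    """
    n = len(samples)
    mean = sum(samples) / n
    # Definition: E[(Y - E[Y])^2]
    var_def = sum((y - mean) ** 2 for y in samples) / n
    # Identity: E[Y^2] - (E[Y])^2
    var_identity = sum(y * y for y in samples) / n - mean ** 2
    return var_def, var_identity


# Illustrative data only: 10,000 draws from a normal distribution.
random.seed(0)
ys = [random.gauss(2.0, 3.0) for _ in range(10_000)]
var_def, var_identity = variance_two_ways(ys)
print(var_def, var_identity)
```

The two printed values should agree up to rounding error, for this or any other sample.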


Applying the variance identity above to the score function, we can show:
$ I(θ) = Var(s(θ;X)) = E[(s(θ;X))^2] - (E[s(θ;X)])^2 $

As you can see, we already have the $ E[(s(θ;X))^2] $ term, which is part of our definition. From here, we can use integrals to show that the remaining term, $ (E[s(θ;X)])^2 $, is equal to zero.
Recall that the score function is the gradient with respect to θ of the natural log of the likelihood function L(θ,X). It is denoted like this: $ s(θ;X) = \nabla_θ [\ln(L(θ,X))] $
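To make the score function concrete, here is a small sketch for one specific model. The Bernoulli model, the parameter value `p = 0.3`, and the helper names `score` and `expectation` are all assumptions introduced for illustration; they do not come from the proof itself. For a single Bernoulli observation the likelihood is $ L(p,x) = p^x (1-p)^{1-x} $, so the expectations can be computed exactly by summing over x = 0 and x = 1.

```python
def score(p, x):
    """Score of one Bernoulli(p) observation x in {0, 1}.

    ln L(p, x) = x*ln(p) + (1 - x)*ln(1 - p), so the derivative
    with respect to p is x/p - (1 - x)/(1 - p).
    """
    return x / p - (1 - x) / (1 - p)


def expectation(f, p):
    """Exact E[f(X)] for X ~ Bernoulli(p): sum over both outcomes."""
    return (1 - p) * f(0) + p * f(1)


p = 0.3  # hypothetical parameter value
mean_score = expectation(lambda x: score(p, x), p)        # E[s]
mean_sq = expectation(lambda x: score(p, x) ** 2, p)      # E[s^2]
fisher = mean_sq - mean_score ** 2                        # Var(s)

print(mean_score, mean_sq, fisher, 1 / (p * (1 - p)))
```

In this example E[s] comes out as zero up to rounding, so Var(s) and E[s^2] coincide, and both match the closed form 1/(p(1-p)) for the Bernoulli model.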
