Fisher Information and Frequentist Statistics:
What is Frequentist Statistics?
Frequentist statistics draws conclusions from sample data by emphasizing the frequency or proportion with which outcomes occur. This area of statistics is associated with the frequentist interpretation of probability, which treats any given experiment as one of an infinite series of possible repetitions of the same experiment, each producing results that are statistically independent of the others. The frequentist approach requires that the procedure used to draw conclusions from the data reach the correct conclusion with high probability across this set of repetitions.
When using Fisher information, recall that θ is the unknown parameter. To infer its value, one might provide a best guess in the form of a point estimate, propose a value and test whether it is consistent with the data, or derive a confidence interval (a range of values computed from the data by a procedure that captures the true parameter value in a specified proportion of repeated experiments). In the frequentist framework, each of these tools exploits the data-generative interpretation of a probability mass function (pmf).
Given a model $ f(x^n ∣ θ) $ and a known θ, one can view the resulting pmf $ p_θ(x^n) = f(x^n ∣ θ) $ as a recipe that reveals how θ defines the chances with which $ X^n $ takes on the possible outcomes $ x^n $. This data-generative outlook is central to Fisher’s use of the maximum likelihood estimator.
For example, using the graph above as a guide, consider a model of a coin flip. A hypothetical propensity of θ = 0.5 generates 7 heads out of 10 trials with an 11.7% chance, whereas a hypothetical propensity of θ = 0.7 generates the same outcome with a 26.7% chance. For this specific observation of 7 heads in 10 trials, θ = 0.7 is the maximum likelihood estimate, since no other value of θ assigns the observed outcome a higher probability.
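The two percentages in the coin example can be checked numerically. The sketch below assumes the 10 flips are independent with a common propensity θ, so that the number of heads follows a binomial distribution. The helper name `binomial_pmf` and the grid of candidate values are illustrative choices, not part of the original text.

```python
from math import comb


def binomial_pmf(k: int, n: int, theta: float) -> float:
    """Chance of exactly k heads in n independent flips with propensity theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


n, k = 10, 7  # 7 heads observed in 10 trials

# Chance of the observed outcome under the two hypothetical propensities.
p_half = binomial_pmf(k, n, 0.5)
p_seven_tenths = binomial_pmf(k, n, 0.7)
print(f"theta = 0.5: {p_half:.3f}")          # 0.117
print(f"theta = 0.7: {p_seven_tenths:.3f}")  # 0.267

# Assumed illustration: scan a grid of candidate propensities and keep the
# one that makes the observed outcome most probable.
grid = [i / 1000 for i in range(1001)]
theta_hat = max(grid, key=lambda t: binomial_pmf(k, n, t))
print(f"grid maximizer: {theta_hat:.3f}")    # 0.700
```

The grid search is only a rough stand-in for maximizing the likelihood analytically. Under the binomial assumption the maximizer is the observed proportion of heads, k/n = 7/10, which agrees with the grid result.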