Bayes' Theorem
by Maliha Hossain
keyword: probability, Bayes' Theorem, Bayes' Rule
INTRODUCTION
Bayes' Theorem (or Bayes' Rule) allows us to calculate P(A|B) from P(B|A) given that P(A) and P(B) are also known, where A and B are events. In this tutorial, we will derive Bayes' Theorem and illustrate it with a few examples.
Note that this tutorial assumes familiarity with conditional probability and the axioms of probability.
Contents - Bayes' Theorem - Proof - Example 1: Quality Control - Example 2 - Example 3 - References
Bayes' Theorem
Let $ B_1, B_2, \ldots, B_n $ be a partition of the sample space $ S $, i.e. $ B_1, B_2, \ldots, B_n $ are mutually exclusive events whose union equals the sample space $ S $. Suppose that the event $ A $ occurs, with $ P[A] > 0 $. Then, by Bayes' Theorem, we have that
$ P[B_j|A] = \frac{P[A|B_j]P[B_j]}{P[A]}, \quad j = 1, 2, \ldots, n $
Bayes' Theorem is also often expressed in the following form:
$ P[B_j|A] = \frac{P[A|B_j]P[B_j]}{\sum_{k=1}^n P[A|B_k]P[B_k]} $
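The second form translates directly into a short computation. The following is a minimal Python sketch, not part of the original tutorial; the function name `posterior` and the numerical values of the priors and likelihoods are made up purely for illustration.

```python
def posterior(priors, likelihoods):
    """Return the list of P[B_j | A] for j = 1..n.

    priors[j]      is P[B_j]     (the B_j are assumed to partition S)
    likelihoods[j] is P[A | B_j]
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    # Numerators P[A | B_j] P[B_j], one per event in the partition
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    # Denominator: P[A] by the theorem of total probability
    p_a = sum(joint)
    if p_a == 0:
        raise ValueError("P[A] is zero, so P[B_j | A] is undefined")
    return [j / p_a for j in joint]


# Illustrative (hypothetical) numbers for a partition with n = 3
priors = [0.5, 0.3, 0.2]
likelihoods = [0.1, 0.4, 0.7]
post = posterior(priors, likelihoods)
print(post)
```

Since the $ B_j $ partition the sample space, the returned values should sum to 1, which serves as a quick sanity check.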
Proof
We will now derive Bayes' Theorem as it is expressed in the second form, which simply takes the expression one step further than the first.
Let $ A $ and $ B_j $ be as defined above. By definition of the conditional probability, we have that
$ P[A|B_j] = \frac{P[A\cap B_j]}{P[B_j]} $
Multiplying both sides by $ P[B_j] $, we get
$ P[A\cap B_j] = P[A|B_j]P[B_j] $
Using the same argument as above, we have that
$ P[B_j|A] = \frac{P[B_j\cap A]}{P[A]} $
$ \Rightarrow P[B_j\cap A] = P[B_j|A]P[A] $
Because intersection is commutative, $ A\cap B_j = B_j\cap A $, so the two expressions for the joint probability are equal:
$ P[B_j|A]P[A] = P[A|B_j]P[B_j] $
Dividing both sides by $ P[A] $, we get
$ P[B_j|A] = \frac{P[A|B_j]P[B_j]}{P[A]} $
Finally, the denominator can be broken down further using the theorem of total probability so that we have the following expression
$ P[B_j|A] = \frac{P[A|B_j]P[B_j]}{\sum_{k=1}^n P[A|B_k]P[B_k]} $
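For reference, the total probability step used in the denominator can be written out as follows. This is the standard expansion, and it relies on the $ B_k $ being mutually exclusive with union $ S $, so that the events $ A\cap B_k $ are themselves mutually exclusive:
$ A = A\cap S = \bigcup_{k=1}^n (A\cap B_k) $
$ \Rightarrow P[A] = \sum_{k=1}^n P[A\cap B_k] = \sum_{k=1}^n P[A|B_k]P[B_k] $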
Example 1: Quality Control
The following problem has been adapted from a few practice problems from chapter 2 of Probability, Statistics and Random Processes for Electrical Engineers by Alberto Leon-Garcia. The example illustrates how Bayes' Theorem plays a role in quality control.
A manufacturer produces a mix of "good" chips and "bad" chips. The proportion of good chips whose lifetime exceeds $ t $ seconds decreases exponentially at the rate $ \alpha $. The proportion of bad chips whose lifetime exceeds $ t $ decreases much faster, at the rate $ 1000\alpha $. Suppose that the fraction of bad chips is $ p $ and the fraction of good chips is $ 1 - p $.
Let $ C $ be the event that the chip is functioning after $ t $ seconds. Let $ G $ be the event that the chip is good. Let $ B $ be the event that the chip is bad.
Here's what we can infer from the problem statement thus far:
the probability that the lifetime of a good chip exceeds $ t $: $ P[C|G] = e^{-\alpha t} $
the probability that the lifetime of a bad chip exceeds $ t $: $ P[C|B] = e^{-1000\alpha t} $
So by the theorem of total probability, we have that
$ P[C] = P[C|G]P[G] + P[C|B]P[B] $
$ = e^{-\alpha t}(1-p) + e^{-1000\alpha t}p $
Now suppose that in order to weed out the bad chips, every chip is tested for $ t $ seconds prior to leaving the factory. The chips that fail are discarded and the remaining chips are sent out to customers. Can you find the value of $ t $ for which 99% of the chips sent out to customers are good?
The problem requires that we find the value of $ t $ such that
$ P[G|C] = 0.99 $
We find $ P[G|C] $ by applying Bayes' Theorem
$ P[G|C] = \frac{P[C|G]P[G]}{P[C|G]P[G] + P[C|B]P[B]} $
$ = \frac{e^{-\alpha t}(1-p)}{e^{-\alpha t}(1-p) + e^{-1000\alpha t}p} $
$ = \frac{1}{1 + \frac{pe^{-1000\alpha t}}{e^{-\alpha t}(1-p)}} = 0.99 $
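The above equation can be solved for $ t $. Rearranging gives $ \frac{p}{1-p}e^{-999\alpha t} = \frac{1}{99} $, and taking the natural logarithm of both sides yields
$ t = \frac{1}{999\alpha}\ln\left(\frac{99p}{1-p}\right) $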