
Back to all ECE 600 notes
Previous Topic: Conditional Probability
Next Topic: Random Variables: Definition


The Comer Lectures on Random Variables and Signals

Slectures by Maliha Hossain


Topic 4: Statistical Independence



Definition

People generally have an idea of the concept of independence, but we will formalize it for probability theory here.

Definition $ \qquad $ Given (S,F,P) and A, B ∈ F, events A and B are statistically independent iff

$ P(A\cap B) = P(A)P(B) $

It may seem more intuitive if we consider that, from the definition above, we can say that A and B are statistically independent iff P(A|B)P(B) = P(A)P(B), which means P(A|B) = P(A) (assuming P(B) > 0).
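The following is a minimal Python sketch, not part of the original lecture, that verifies the defining product rule by direct counting on a finite, equally likely sample space; the fair-die events A and B below are an illustrative choice.

from fractions import Fraction

def prob(event, sample_space):
    # Probability of an event under an equally likely (uniform) measure
    return Fraction(len(event & sample_space), len(sample_space))

S = frozenset(range(1, 7))      # roll a fair die
A = frozenset({2, 4, 6})        # A = "roll is even"
B = frozenset({1, 2, 3, 4})     # B = "roll is at most 4"

lhs = prob(A & B, S)            # P(A n B) = 1/3
rhs = prob(A, S) * prob(B, S)   # P(A)P(B) = (1/2)(2/3) = 1/3
print(lhs, rhs, lhs == rhs)     # 1/3 1/3 True, so A and B are independent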

Notation $ \qquad $ A, B are independent ⇔

$ A \perp\!\!\!\perp B $

Note $ \qquad $ We will usually refer to statistical independence as simply independence in this course.

Note $ \qquad $ If A and B are independent, then A' and B are also independent.

Proof:

$ \begin{align} B&=(A\cap B)\cup(A'\cap B) \\ \Rightarrow P(B)&=P(A\cap B)+P(A'\cap B) \\ &=P(A)P(B) + P(A'\cap B) \end{align} $
$ \begin{align} \Rightarrow P(A'\cap B) &= P(B) - P(A)P(B) \\ &= P(B)[1-P(A)] \\ &= P(B)P(A') \end{align} $

Definition $ \qquad $ Events $ A_1,A_2,...,A_n $ are statistically independent iff for any $ k=1,2,...,n $, and any distinct indices 1 ≤ $ j_1<j_2<...<j_k $ ≤ n, for any finite n, we have that

$ P(A_{j_1}\cap...\cap A_{j_k}) = \prod_{i=1}^k P(A_{j_i}) $

Note that to show that a collection of sets is disjoint, we only need to consider pairwise intersections of the sets. But to show that a collection of sets is independent, we need to consider every possible combination of sets from the collection; when the product rule holds for all of them, we say that the events are mutually independent. Mutual independence implies pairwise independence, but the converse is not true.
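For concreteness, writing the definition out for n = 3, the conditions that must hold are

$ \begin{align} P(A_1\cap A_2) &= P(A_1)P(A_2) \\ P(A_1\cap A_3) &= P(A_1)P(A_3) \\ P(A_2\cap A_3) &= P(A_2)P(A_3) \\ P(A_1\cap A_2\cap A_3) &= P(A_1)P(A_2)P(A_3) \end{align} $

that is, all three pairs plus the triple; the first three alone give only pairwise independence.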
Note that for a given n, there are $ 2^n-(n+1) $ combinations. To see this, use the Binomial Theorem:

$ \sum_{k=0}^n {n\choose k}a^kb^{n-k} = (a+b)^n, \;\;\mbox{with}\; a=b=1 $

Then

$ \begin{align} \sum_{k=0}^n {n\choose k}a^kb^{n-k} &= \sum_{k=0}^n {n\choose k} \\ &=2^n \end{align} $

So,

$ \begin{align} \sum_{k=2}^n {n\choose k} &= {n\choose 2} + {n\choose 3} +...+{n\choose n} \\ &=2^n-{n\choose 0} - {n\choose 1} \\ &=2^n-(n+1) \end{align} $

So there are $ 2^n-(n+1) $ combinations of sets that need to be checked for independence.
In practice, we generally do not check all $ 2^n-(n+1) $ combinations for every n. Often, we just assume independence based on intuition and see if the results are reasonable. We cannot, however, use this approach in this class.
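To make the bookkeeping concrete, here is a small Python sketch (an illustration added to these notes, not from the lecture) that enumerates every combination of two or more events and tests the product rule; the three events below form the standard two-coin example that is pairwise independent but not mutually independent.

from fractions import Fraction
from itertools import combinations

def prob(event, sample_space):
    # Uniform probability on a finite sample space
    return Fraction(len(event & sample_space), len(sample_space))

def mutually_independent(events, sample_space):
    # Check P(A_{j1} n ... n A_{jk}) = prod_i P(A_{ji}) for every k >= 2,
    # i.e. all 2^n - (n+1) combinations of the n events
    n = len(events)
    for k in range(2, n + 1):
        for combo in combinations(events, k):
            inter = sample_space
            p = Fraction(1)
            for A in combo:
                inter = inter & A
                p *= prob(A, sample_space)
            if prob(inter, sample_space) != p:
                return False
    return True

# Two fair coin tosses; an outcome is (first toss, second toss)
S  = frozenset({('H','H'), ('H','T'), ('T','H'), ('T','T')})
A1 = frozenset({('H','H'), ('H','T')})   # first toss is heads
A2 = frozenset({('H','H'), ('T','H')})   # second toss is heads
A3 = frozenset({('H','H'), ('T','T')})   # the two tosses agree

# Each pair passes, but the triple fails (1/4 vs 1/8), so the three events
# are pairwise independent yet not mutually independent
print(mutually_independent([A1, A2], S))      # True
print(mutually_independent([A1, A3], S))      # True
print(mutually_independent([A2, A3], S))      # True
print(mutually_independent([A1, A2, A3], S))  # False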



Combined Experiments

Consider two experiments with probability spaces ($ S_1, F_1, P_1 $) and ($ S_2, F_2, P_2 $). Consider a combined experiment with probability space (S,F,P), where

$ \begin{align} \mathcal S &= \mathcal S_1 \times \mathcal S_2 \\ &= \{(\omega_1,\omega_2):\;\omega_1\in\mathcal S_1,\;\omega_2\in\mathcal S_2 \} \end{align} $

where S is the Cartesian product of $ S_1 $ and $ S_2 $.
What should F be?
Let

$ \mathcal F = \{A\times B:\;A\subset\mathcal S_1,\;B\subset\mathcal S_2\} $

Or, let

$ \mathcal F = \{A\times B:\;A\in\mathcal F_1,\;B\in\mathcal F_2\} $

Note that for discrete S, these are typically the same. But there exist subsets of S = $ S_1\times S_2 $ that are not in F under either of these definitions, some of which might be events we care about.

Example
Experiment 1: Toss a coin
Experiment 2: Roll a die

$ \begin{align} \mathcal S_1 &= \{H,T\} \\ \mathcal S_2 &= \{1,2,3,4,5,6\} \\ \Rightarrow\mathcal S &= \mathcal S_1\times\mathcal S_2 \\ &=\{(i,j):\;i=H,T; \; j=1,2,...,6\} \end{align} $
$ \begin{align} \mathcal F_1 &= \{ \{H\},\; \{T\},\;\mathcal S_1=\{H,T\},\;\varnothing \} \\ \mathcal F_2 &= \mathcal{P(S_2)} \end{align} $

If we let

$ \mathcal F= \{A\times B:\;A\in \mathcal F_1;\;\;B\in \mathcal F_2\} $

then, the event $ \{(H,2),(T,6)\} $ is not in F, since $ \{(H,2),(T,6)\} \neq A\times B $ for any A ∈ $ F_1 $, B ∈ $ F_2 $.

Instead, we let

$ \mathcal F = \sigma(\{ A\times B:\;A\in\mathcal F_1;B\in\mathcal F_2 \}) $

This F contains the events we want.

For example, consider $ \{(H,2),(T,6)\} $. Let A = {H} ∈ $ F_1 $, B = {2} ∈ $ F_2 $, C = {T} ∈ $ F_1 $, D = {6} ∈ $ F_2 $
Then, $ A\times B $ = {(H,2)} ∈ F and $ C\times D $ = {(T,6)} ∈ F.
Since F is a $ \sigma $-field, {(H,2)} ∪ {(T,6)} = {(H,2),(T,6)} ∈ F.

Now, what about P? We need P(C) ∀C ∈ F. Start with $ P(A\times S_2)=P_1(A) $ ∀A ∈ $ F_1 $, since $ A\times S_2 $ is the event that A occurs in Experiment 1 and the outcome of Experiment 2 is a "don't care".
Also, let $ P(S_1\times B)=P_2(B) $ ∀B ∈ $ F_2 $.
What about other Cs?
We will assume that the two experiments are independent in the sense that ∀A ∈ $ F_1 $, B ∈ $ F_2 $,

$ A\times\mathcal S_2 \perp\!\!\!\perp \mathcal S_1\times B $

Then, we have

$ \begin{align} P(A\times B) &= P((A\times \mathcal S_2)\cap(\mathcal S_1\times B)) \\ &= P(A\times \mathcal S_2)P(\mathcal S_1\times B) \\ &= P_1(A)P_2(B) \end{align} $

So now we have P(C) if

  • C = A × $ S_2 $ or C = $ S_1 $ × B for some A ∈ $ F_1 $ or B ∈ $ F_2 $
  • C = A × B for A ∈ $ F_1 $, B ∈ $ F_2 $.

What about other Cs?
Any $ C $ that cannot be written as $ A\times B $ for some A ∈ $ F_1 $ and B ∈ $ F_2 $ can be written in terms of unions, intersections and complements of such $ A\times B $ sets, so the probability axioms can be used to define $ P(C) $ in that case.
Note that the above discussion can easily be extended to n combined experiments.
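As an illustration (a sketch added to these notes, assuming the uniform measure on each space, which encodes a fair coin, a fair die, and independent experiments), the coin-and-die example can be built explicitly and the rule P(A × B) = $ P_1 $(A)$ P_2 $(B) checked by direct counting on the combined space.

from fractions import Fraction
from itertools import product

S1 = frozenset({'H', 'T'})          # Experiment 1: toss a coin
S2 = frozenset(range(1, 7))         # Experiment 2: roll a die
S  = frozenset(product(S1, S2))     # combined space S = S1 x S2

def prob(event, space):
    # Uniform probability on a finite sample space
    return Fraction(len(event & space), len(space))

A   = frozenset({'H'})              # event in Experiment 1
B   = frozenset({2, 4, 6})          # event in Experiment 2
AxB = frozenset(product(A, B))      # the rectangle A x B as an event in S

# Independence of the experiments: P(A x B) = P1(A) P2(B)
print(prob(AxB, S))                 # 1/4, counted directly on S
print(prob(A, S1) * prob(B, S2))    # 1/4, from the product rule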



Bernoulli Trials

Bernoulli trials are an important combined experiment that models repeated independent trials of an experiment with one particular event of interest. Consider an experiment with probability space ($ S_0, F_0, P_0 $), and an event A ∈ $ F_0 $ with P(A) = p. For this combined experiment, let S = $ S_0 $ × ... × $ S_0 $ (n terms) ⇒ S = {n-tuples with each component an element of $ S_0 $}, where n is the number of trials. We are interested in the probability that A occurs k times in the n trials for k = 0,1,...,n.
Let $ p_n(k) $ ≡ P(A occurs k times in n trials).
Then, it can be shown that

$ p_n(k) = {n\choose k}p^k(1-p)^{n-k} $

Overview of proof: From combinatorics, there are $ ^nC_k $ (n choose k) ways to have k successes (A occurs) in n trials. Due to the independence of the trials, the probability that a given combination occurs is $ p^k(1-p)^{n-k} $. Since the collection of all combinations is disjoint,

$ \begin{align} p_n(k) &= \sum_{j=1}^{n\choose k}p^k(1-p)^{n-k} \\ &= {n\choose k} p^k(1-p)^{n-k} \end{align} $


Example:
Consider a binary communication system.
Let A = {bit error} = {send 0, decode 1} ∪ {send 1, decode 0}
Let p = P(A) = $ 10^{-3} $
Transmit each bit 3 times and use the majority vote to decode.
Then, P(wrong decision) = P(2 or more bit errors) = $ p_3(2)+p_3(3) $, which is approximately $ 3\times 10^{-6} $.
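As a quick numerical check (a short sketch added to these notes), evaluating $ p_n(k) $ from the formula above reproduces the majority-vote error probability.

from math import comb

def p_n(n, k, p):
    # Probability that the event A occurs exactly k times in n independent trials
    return comb(n, k) * p**k * (1 - p)**(n - k)

p = 1e-3                       # per-bit error probability P(A)
# Majority vote over 3 transmissions fails when 2 or more bits are in error
p_wrong = p_n(3, 2, p) + p_n(3, 3, p)
print(p_wrong)                 # about 2.998e-06, i.e. roughly 3 x 10^-6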






Questions and comments

If you have any questions, comments, etc. please post them on this page



Back to all ECE 600 notes
Previous Topic: Conditional Probability
Next Topic: Random Variables: Definition
