
Revision as of 15:53, 13 March 2013

Bayes' Theorem

by Maliha Hossain

 keyword: probability, Bayes' Theorem, Bayes' Rule 

INTRODUCTION

Bayes' Theorem (or Bayes' Rule) allows us to calculate P(A|B) from P(B|A) given that P(A) and P(B) are also known, where A and B are events. In this tutorial, we will derive Bayes' Theorem and illustrate it with a few examples.

Note that this tutorial assumes familiarity with conditional probability and the axioms of probability.

 Contents
- Bayes' Theorem
- Proof
- Example 1
- Example 2
- Example 3
- References

Bayes' Theorem

Let $ B_1, B_2, ..., B_n $ be a partition of the sample space $ S $, i.e. $ B_1, B_2, ..., B_n $ are mutually exclusive events whose union equals the sample space $ S $. Suppose that the event $ A $ occurs, with $ P[A] > 0 $. Then, by Bayes' Theorem, we have that

$ P[B_j|A] = \frac{P[A|B_j]P[B_j]}{P[A]}, j = 1, 2, . . . , n $

Bayes' Theorem is also often expressed in the following form:

$ P[B_j|A] = \frac{P[A|B_j]P[B_j]}{\sum_{k=1}^n P[A|B_k]P[B_k]} $
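The second form can be evaluated directly once the priors $ P[B_j] $ and the likelihoods $ P[A|B_j] $ are known. The following is a minimal sketch, not part of the original tutorial: the function name `posterior` and the numbers are made up for illustration, and exact fractions are used so that the check that the priors sum to 1 is not disturbed by floating-point rounding.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Return P[B_j|A] for every j, given P[B_j] and P[A|B_j]."""
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if sum(priors) != 1:
        raise ValueError("priors must sum to 1 because the B_j partition S")
    # Numerators of Bayes' Theorem: P[A|B_j] P[B_j].
    joint = [l * p for l, p in zip(likelihoods, priors)]
    # Denominator: P[A] written as the sum over the partition.
    p_a = sum(joint)
    if p_a == 0:
        raise ValueError("P[A] must be positive")
    return [j / p_a for j in joint]


# Hypothetical numbers, chosen only for illustration.
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 10), Fraction(1, 5), Fraction(1, 2)]
post = posterior(priors, likelihoods)
print(post)
```

With these assumed inputs, $ B_3 $ has the smallest prior but the largest posterior, because $ A $ is much more likely under $ B_3 $ than under the other two events.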


Proof

We will now derive Bayes' Theorem as it is expressed in the second form, which simply takes the expression one step further than the first.

Let $ A $ and $ B_j $ be as defined above. By definition of the conditional probability, we have that

$ P[A|B_j] = \frac{P[A\cap B_j]}{P[B_j]} $

Multiplying both sides by $ P[B_j] $, we get

$ P[A\cap B_j] = P[A|B_j]P[B_j] $

Using the same argument as above, we have that

$ P[B_j|A] = \frac{P[B_j\cap A]}{P[A]} $

$ \Rightarrow P[B_j\cap A] = P[B_j|A]P[A] $

Because intersection is commutative, $ A\cap B_j = B_j\cap A $, so we can say that

$ P[B_j|A]P[A] = P[A|B_j]P[B_j] $
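Dividing both sides of this equality by $ P[A] $ gives the first form of the theorem, and writing $ P[A] $ as $ \sum_{k=1}^n P[A|B_k]P[B_k] $ gives the second. The sketch below checks each of these steps on a small finite sample space. The die, the event $ A $ and the partition are assumptions chosen for illustration; they do not come from the tutorial.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
S = frozenset(range(1, 7))


def prob(event):
    """Probability of an event (a subset of S) with equally likely outcomes."""
    return Fraction(len(event & S), len(S))


def cond(event, given):
    """Conditional probability P[event | given] = P[event and given] / P[given]."""
    return prob(event & given) / prob(given)


A = frozenset({2, 3, 4})
partition = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5, 6})]

for B in partition:
    # Both products equal the probability of the intersection of A and B_j.
    assert cond(B, A) * prob(A) == cond(A, B) * prob(B) == prob(A & B)

# P[A] as a sum over the partition.
assert prob(A) == sum(cond(A, B) * prob(B) for B in partition)

# Dividing by P[A] gives P[B_j|A]; compare with the direct computation.
bayes = [cond(A, B) * prob(B) / prob(A) for B in partition]
direct = [cond(B, A) for B in partition]
print(bayes == direct)
```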


The Navier-Stokes momentum equation can be written in vector form as

$ \frac{\partial \rho \vec{u}}{\partial t} + \left(\vec{u}\cdot\nabla\right)\rho\vec{u} = -\nabla P + \nabla\cdot(\mu\nabla\vec{u}) $

Written out in terms of partial derivatives in Cartesian coordinates, the Navier-Stokes equations read

$ \frac{\partial \rho u}{\partial t} + u\frac{\partial \rho u}{\partial x} + v\frac{\partial \rho u}{\partial y} + w\frac{\partial \rho u}{\partial z} = -\frac{\partial P}{\partial x} + \frac{\partial}{\partial x}(\mu \frac{\partial u}{\partial x}) + \frac{\partial}{\partial y}(\mu \frac{\partial u}{\partial y}) + \frac{\partial}{\partial z}(\mu \frac{\partial u}{\partial z}) $

$ \frac{\partial \rho v}{\partial t} + u\frac{\partial \rho v}{\partial x} + v\frac{\partial \rho v}{\partial y} + w\frac{\partial \rho v}{\partial z} = -\frac{\partial P}{\partial y} + \frac{\partial}{\partial x}(\mu \frac{\partial v}{\partial x}) + \frac{\partial}{\partial y}(\mu \frac{\partial v}{\partial y}) + \frac{\partial}{\partial z}(\mu \frac{\partial v}{\partial z}) $

$ \frac{\partial \rho w}{\partial t} + u\frac{\partial \rho w}{\partial x} + v\frac{\partial \rho w}{\partial y} + w\frac{\partial \rho w}{\partial z} = -\frac{\partial P}{\partial z} + \frac{\partial}{\partial x}(\mu \frac{\partial w}{\partial x}) + \frac{\partial}{\partial y}(\mu \frac{\partial w}{\partial y}) + \frac{\partial}{\partial z}(\mu \frac{\partial w}{\partial z}) $

The subtle point is that although the latter three equations would appear different if written in cylindrical coordinates (the partial derivatives in $ x, y, z $ would be replaced with ones in $ r,\theta,z $), the vector equation does not. However, the implementations of the operators gradient and divergence do depend on the coordinate system.

In Cartesian coordinates, gradient and divergence are defined as below, where $ n $ is the number of spatial dimensions involved. If $ x_1, x_2, ..., x_n $ are the coordinate directions and

$ \hat{e}_i , i = 1,2,...,n $

are the unit vectors in those directions, then

$ \nabla\cdot\vec{v} = \sum_{i=1}^n \frac{\partial v_i}{\partial x_i} \text{, where } \vec{v} = \sum_{i=1}^n v_i \hat{e}_i $

$ \nabla\phi = \sum_{i=1}^n \frac{\partial \phi}{\partial x_i} \hat{e}_i $

Based on this definition, one might expect that in cylindrical coordinates the gradient could be found by simply taking the partial derivatives of $ \phi $ with respect to each coordinate direction, multiplying each derivative by the corresponding unit vector, and adding the resulting components together. This is not correct for coordinate systems other than Cartesian:

$ \nabla\phi \neq \frac{\partial \phi}{\partial r}\hat{e}_r + \frac{\partial \phi}{\partial \theta}\hat{e}_{\theta} + \frac{\partial \phi}{\partial z}\hat{e}_z $

One could arrive at the correct formula for the gradient by performing some tedious changes of variables, and repeat the process for the other vector derivatives. However, that approach has many opportunities for error and offers little insight into why the coefficients of the partial derivatives are what they are. This tutorial shows a different way to arrive at the same results with less calculation.


Preliminaries

This tutorial will denote vector quantities with an arrow atop a letter, except for the unit vectors that define coordinate systems, which will have a hat. 3-D Cartesian coordinates will be indicated by $ x, y, z $ and cylindrical coordinates by $ r,\theta,z $.

CoordinateSystems.jpg

This tutorial will make use of vector derivative identities, in particular the product rule for the divergence of a scalar field times a vector field:

$ \nabla\cdot(\phi\vec{v}) = \nabla\phi\cdot\vec{v} + \phi \nabla\cdot\vec{v} $

On some occasions we will also have to translate between partial derivatives in various coordinate systems. Start with the multivariate chain rule:

$ \frac{\partial \phi}{\partial r} = \frac{\partial \phi}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial \phi}{\partial y}\frac{\partial y}{\partial r} + \frac{\partial \phi}{\partial z}\frac{\partial z}{\partial r} $

$ \frac{\partial \phi}{\partial \theta} = \frac{\partial \phi}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial \phi}{\partial y}\frac{\partial y}{\partial \theta} + \frac{\partial \phi}{\partial z}\frac{\partial z}{\partial \theta} $

$ \frac{\partial \phi}{\partial z} = \frac{\partial \phi}{\partial x}\frac{\partial x}{\partial z} + \frac{\partial \phi}{\partial y}\frac{\partial y}{\partial z} + \frac{\partial \phi}{\partial z}\frac{\partial z}{\partial z} $

In matrix form:

$ \begin{bmatrix} \frac{\partial \phi}{\partial r} \\ \frac{\partial \phi}{\partial \theta} \\ \frac{\partial \phi}{\partial z} \end{bmatrix} = \begin{bmatrix} \frac{\partial x}{\partial r} & \frac{\partial y}{\partial r} & \frac{\partial z}{\partial r} \\ \frac{\partial x}{\partial \theta} & \frac{\partial y}{\partial \theta} & \frac{\partial z}{\partial \theta} \\ \frac{\partial x}{\partial z} & \frac{\partial y}{\partial z} & \frac{\partial z}{\partial z} \end{bmatrix} \begin{bmatrix} \frac{\partial \phi}{\partial x} \\ \frac{\partial \phi}{\partial y} \\ \frac{\partial \phi}{\partial z}\end{bmatrix} $

The entries of the square matrix come from the coordinate transformation itself:

$ x = r \cos \theta \rightarrow \frac{\partial x}{\partial r} = \cos \theta \text{ , } \frac{\partial x}{\partial \theta} = -r\sin \theta $

$ y = r \sin \theta \rightarrow \frac{\partial y}{\partial r} = \sin \theta \text{ , } \frac{\partial y}{\partial \theta} = r\cos \theta $

$ z = z \rightarrow \frac{\partial x}{\partial z} = \frac{\partial y}{\partial z} = \frac{\partial z}{\partial r} = \frac{\partial z}{\partial \theta} = 0 \text{ , } \frac{\partial z}{\partial z} = 1 $

$ \begin{bmatrix} \frac{\partial \phi}{\partial r} \\ \frac{\partial \phi}{\partial \theta} \\ \frac{\partial \phi}{\partial z}\end{bmatrix} = \begin{bmatrix} \cos \theta & \sin \theta & 0 \\ -r \sin\theta & r \cos\theta & 0 \\ 0 & 0 & 1\end{bmatrix} \begin{bmatrix} \frac{\partial \phi}{\partial x} \\ \frac{\partial \phi}{\partial y} \\ \frac{\partial \phi}{\partial z} \end{bmatrix} $

This gives the partial derivatives with respect to cylindrical coordinate variables in terms of partial derivatives with respect to Cartesian coordinate variables. We can go the other way by inverting this linear system:

$ \begin{bmatrix} \frac{\partial \phi}{\partial x} \\ \frac{\partial \phi}{\partial y} \\ \frac{\partial \phi}{\partial z}\end{bmatrix} = \begin{bmatrix} \cos \theta & -\frac{\sin \theta}{r} & 0 \\ \sin \theta & \frac{\cos\theta}{r} & 0 \\ 0 & 0 & 1\end{bmatrix} \begin{bmatrix} \frac{\partial \phi}{\partial r} \\ \frac{\partial \phi}{\partial \theta} \\ \frac{\partial \phi}{\partial z} \end{bmatrix} $
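A quick way to check the inversion is to multiply the two matrices and confirm that the product is the identity. The sketch below does this at one sample point; the helper names and the values of $ r $ and $ \theta $ are arbitrary choices made for illustration, and any point with $ r > 0 $ should behave the same way.

```python
import math


def to_cylindrical(r, theta):
    """Matrix taking (d/dx, d/dy, d/dz) to (d/dr, d/dtheta, d/dz)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-r * s, r * c, 0.0],
            [0.0, 0.0, 1.0]]


def to_cartesian(r, theta):
    """Claimed inverse, taking (d/dr, d/dtheta, d/dz) to (d/dx, d/dy, d/dz)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s / r, 0.0],
            [s, c / r, 0.0],
            [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


r, theta = 2.5, 0.7  # arbitrary sample point with r > 0
product = matmul(to_cartesian(r, theta), to_cylindrical(r, theta))
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
is_inverse = all(math.isclose(product[i][j], identity[i][j], abs_tol=1e-12)
                 for i in range(3) for j in range(3))
print(is_inverse)
```

The division by $ r $ in the second matrix is why the inverse does not exist at $ r = 0 $.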

Note that $ \phi $ can be any scalar field for which all partial derivatives exist, including the coordinate variables themselves.

We are now ready to tackle the gradient in cylindrical coordinates.

Gradient in Cylindrical Coordinates

The gradient can be written in terms of the unit vectors of both the cylindrical and the Cartesian coordinate systems as

$ a\frac{\partial \phi}{\partial r}\hat{e}_r + b\frac{\partial \phi}{\partial \theta}\hat{e}_{\theta} + c\frac{\partial \phi}{\partial z}\hat{e}_z = \nabla\phi = \frac{\partial \phi}{\partial x}\hat{e}_x + \frac{\partial \phi}{\partial y}\hat{e}_y + \frac{\partial \phi}{\partial z}\hat{e}_z $

where $ a,b,c $ are coefficients to be determined. We can single out components of the left-hand side by taking dot products with the cylindrical unit vectors. This approach yields three equations:

$ a\frac{\partial \phi}{\partial r} = \frac{\partial \phi}{\partial x}\hat{e}_x\cdot\hat{e}_r + \frac{\partial \phi}{\partial y}\hat{e}_y\cdot\hat{e}_r = \frac{\partial \phi}{\partial x}\cos\theta + \frac{\partial \phi}{\partial y}\sin\theta $

$ b\frac{\partial \phi}{\partial \theta} = \frac{\partial \phi}{\partial x}\hat{e}_x\cdot\hat{e}_{\theta} + \frac{\partial \phi}{\partial y}\hat{e}_y\cdot\hat{e}_{\theta} = -\frac{\partial \phi}{\partial x}\sin\theta + \frac{\partial \phi}{\partial y}\cos\theta $

$ c\frac{\partial \phi}{\partial z} = \frac{\partial \phi}{\partial z} \rightarrow c = 1 $

Solve for $ a,b $ by substituting into the first two of these equations the first two rows of the change-of-variable matrix:

$ \frac{\partial \phi}{\partial x}\cos\theta + \frac{\partial \phi}{\partial y}\sin\theta = a\left(\frac{\partial \phi}{\partial x} \cos\theta + \frac{\partial \phi}{\partial y} \sin\theta \right) \rightarrow a = 1 $

$ -\frac{\partial \phi}{\partial x}\sin\theta + \frac{\partial \phi}{\partial y}\cos\theta = b\left(-\frac{\partial \phi}{\partial x} r\sin\theta + \frac{\partial \phi}{\partial y} r\cos\theta\right) \rightarrow b = \frac{1}{r} $

So the gradient expression we sought turns out to be

$ \nabla\phi = \frac{\partial \phi}{\partial r}\hat{e}_r + \frac{1}{r}\frac{\partial \phi}{\partial \theta}\hat{e}_{\theta} + \frac{\partial \phi}{\partial z}\hat{e}_z $
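The result can be checked numerically by comparing it with the Cartesian gradient of the same field. The sketch below uses central finite differences and a made-up test field; the field, the sample point, the step size and the tolerance are all assumptions chosen for illustration. It evaluates the cylindrical expression twice, once with $ b = 1/r $ and once with the naive choice $ b = 1 $.

```python
import math


def phi_cart(x, y, z):
    """Hypothetical test field, chosen only for illustration."""
    return x * x * y + z * x


def phi_cyl(r, theta, z):
    """The same field, using x = r cos(theta) and y = r sin(theta)."""
    return phi_cart(r * math.cos(theta), r * math.sin(theta), z)


def partial(f, point, i, h=1e-5):
    """Central-difference estimate of the derivative of f in argument i."""
    hi = list(point)
    lo = list(point)
    hi[i] += h
    lo[i] -= h
    return (f(*hi) - f(*lo)) / (2 * h)


def grad_from_cylindrical(r, theta, z, b):
    """Cartesian components of d_r e_r + b d_theta e_theta + d_z e_z."""
    d_r, d_t, d_z = (partial(phi_cyl, (r, theta, z), i) for i in range(3))
    c, s = math.cos(theta), math.sin(theta)
    # e_r = (cos, sin, 0), e_theta = (-sin, cos, 0), e_z = (0, 0, 1)
    return (d_r * c - b * d_t * s, d_r * s + b * d_t * c, d_z)


def close(u, v, tol=1e-6):
    """True if two vectors agree component by component within tol."""
    return all(abs(p - q) < tol for p, q in zip(u, v))


r, theta, z = 2.0, 0.6, 1.5  # arbitrary sample point with r > 0
x, y = r * math.cos(theta), r * math.sin(theta)
reference = tuple(partial(phi_cart, (x, y, z), i) for i in range(3))
correct = grad_from_cylindrical(r, theta, z, b=1.0 / r)
naive = grad_from_cylindrical(r, theta, z, b=1.0)
print(close(correct, reference), close(naive, reference))
```

At this sample point the version with $ b = 1/r $ should agree with the Cartesian gradient, while the naive version should not. A single point is a sanity check, not a proof.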

Divergence in Cylindrical Coordinates

We want an expression for

$ \nabla\cdot\vec{u} = \nabla\cdot\left(u_r\hat{e}_r + u_{\theta}\hat{e}_{\theta} + u_z\hat{e}_z\right) $

that involves only derivatives in cylindrical coordinates. Using the vector identity mentioned in the preliminaries, this equation can be expanded as:

$ \nabla\cdot\vec{u} = \left(\nabla u_r\right)\cdot\hat{e}_r + u_r\left(\nabla\cdot\hat{e}_r\right) + \left(\nabla u_{\theta}\right)\cdot\hat{e}_{\theta} + u_{\theta}\left(\nabla\cdot\hat{e}_{\theta}\right) + \left(\nabla u_z\right)\cdot\hat{e}_z + u_z\left(\nabla\cdot\hat{e}_z\right) $

The terms involving gradients of the components of the vector field simplify to the partial derivatives of components with respect to their corresponding directions, multiplied by the coefficients found in the previous section:

$ \nabla\cdot\vec{u} = \frac{\partial u_r}{\partial r} + \frac{1}{r}\frac{\partial u_{\theta}}{\partial \theta} + \frac{\partial u_z}{\partial z} + u_r\left(\nabla\cdot\hat{e}_r\right) + u_{\theta}\left(\nabla\cdot\hat{e}_{\theta}\right) + u_z\left(\nabla\cdot\hat{e}_z\right) $

So a divergence "correction" must be applied, which arises from the divergence of the unit vector fields. Strictly speaking, the unit "vectors" referred to in this tutorial are vector fields, since the unit vectors of a coordinate system are defined at every point in space (except on the $ z $-axis, $ r = 0 $, where $ \hat{e}_r $ and $ \hat{e}_{\theta} $ are undefined).

File:UnitVectorFields.jpg

So we are now interested in the divergences of these fields in order to complete the previous equation.
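These divergences can be estimated numerically by writing each unit vector field in Cartesian components and applying the Cartesian definition of divergence from the introduction. The sketch below is a finite-difference experiment at one arbitrary point, not the tutorial's derivation; the function names, sample point and step size are assumptions. Its output is consistent with $ \nabla\cdot\hat{e}_r = \frac{1}{r} $ and $ \nabla\cdot\hat{e}_{\theta} = \nabla\cdot\hat{e}_z = 0 $.

```python
import math


def e_r(x, y, z):
    """Radial unit vector field in Cartesian components (undefined at r = 0)."""
    rho = math.hypot(x, y)
    return (x / rho, y / rho, 0.0)


def e_theta(x, y, z):
    """Azimuthal unit vector field in Cartesian components."""
    rho = math.hypot(x, y)
    return (-y / rho, x / rho, 0.0)


def e_z(x, y, z):
    """Axial unit vector field, the same in both coordinate systems."""
    return (0.0, 0.0, 1.0)


def divergence(field, point, h=1e-5):
    """Cartesian divergence: sum of d(v_i)/d(x_i), by central differences."""
    total = 0.0
    for i in range(3):
        hi = list(point)
        lo = list(point)
        hi[i] += h
        lo[i] -= h
        total += (field(*hi)[i] - field(*lo)[i]) / (2 * h)
    return total


point = (1.2, -0.9, 0.4)  # arbitrary point off the z-axis
r = math.hypot(point[0], point[1])
div_r = divergence(e_r, point)
div_theta = divergence(e_theta, point)
div_z = divergence(e_z, point)
print(div_r, 1.0 / r, div_theta, div_z)
```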


