
Engineering Probability Class 16 Mon 2020-03-23

1   Revised course format because of the current situation

  1. No physical lectures.

  2. I will use Webex.

    Connect via W Randolph Franklin's Personal Room.

  3. For most classes after today, I'll assign one of Prof Rich Radke's lectures to watch before class.

  4. Class time will be used for enrichment material, for discussion where you can ask questions, and for me to work out textbook exercises.

  5. I will add a small homework after each class, due in a few days.

  6. I will continue to distribute material with this class blog.

  7. Continue to use Piazza for questions and written discussions.

  8. Continue to submit work with Gradescope.

  9. I do not plan to use LMS.

  10. Your feedback is welcome. This is an unprecedented situation. We profs have been officially told to be humane. So, no more 57-question homeworks. :-)

2   Normal distribution table

For your convenience; I computed it with Matlab. Here f(x) is the standard normal pdf, F(x) is the CDF, and Q(x) = 1 - F(x):

x          f(x)      F(x)      Q(x)
-3.0000    0.0044    0.0013    0.9987
-2.9000    0.0060    0.0019    0.9981
-2.8000    0.0079    0.0026    0.9974
-2.7000    0.0104    0.0035    0.9965
-2.6000    0.0136    0.0047    0.9953
-2.5000    0.0175    0.0062    0.9938
-2.4000    0.0224    0.0082    0.9918
-2.3000    0.0283    0.0107    0.9893
-2.2000    0.0355    0.0139    0.9861
-2.1000    0.0440    0.0179    0.9821
-2.0000    0.0540    0.0228    0.9772
-1.9000    0.0656    0.0287    0.9713
-1.8000    0.0790    0.0359    0.9641
-1.7000    0.0940    0.0446    0.9554
-1.6000    0.1109    0.0548    0.9452
-1.5000    0.1295    0.0668    0.9332
-1.4000    0.1497    0.0808    0.9192
-1.3000    0.1714    0.0968    0.9032
-1.2000    0.1942    0.1151    0.8849
-1.1000    0.2179    0.1357    0.8643
-1.0000    0.2420    0.1587    0.8413
-0.9000    0.2661    0.1841    0.8159
-0.8000    0.2897    0.2119    0.7881
-0.7000    0.3123    0.2420    0.7580
-0.6000    0.3332    0.2743    0.7257
-0.5000    0.3521    0.3085    0.6915
-0.4000    0.3683    0.3446    0.6554
-0.3000    0.3814    0.3821    0.6179
-0.2000    0.3910    0.4207    0.5793
-0.1000    0.3970    0.4602    0.5398
      0    0.3989    0.5000    0.5000
 0.1000    0.3970    0.5398    0.4602
 0.2000    0.3910    0.5793    0.4207
 0.3000    0.3814    0.6179    0.3821
 0.4000    0.3683    0.6554    0.3446
 0.5000    0.3521    0.6915    0.3085
 0.6000    0.3332    0.7257    0.2743
 0.7000    0.3123    0.7580    0.2420
 0.8000    0.2897    0.7881    0.2119
 0.9000    0.2661    0.8159    0.1841
 1.0000    0.2420    0.8413    0.1587
 1.1000    0.2179    0.8643    0.1357
 1.2000    0.1942    0.8849    0.1151
 1.3000    0.1714    0.9032    0.0968
 1.4000    0.1497    0.9192    0.0808
 1.5000    0.1295    0.9332    0.0668
 1.6000    0.1109    0.9452    0.0548
 1.7000    0.0940    0.9554    0.0446
 1.8000    0.0790    0.9641    0.0359
 1.9000    0.0656    0.9713    0.0287
 2.0000    0.0540    0.9772    0.0228
 2.1000    0.0440    0.9821    0.0179
 2.2000    0.0355    0.9861    0.0139
 2.3000    0.0283    0.9893    0.0107
 2.4000    0.0224    0.9918    0.0082
 2.5000    0.0175    0.9938    0.0062
 2.6000    0.0136    0.9953    0.0047
 2.7000    0.0104    0.9965    0.0035
 2.8000    0.0079    0.9974    0.0026
 2.9000    0.0060    0.9981    0.0019
 3.0000    0.0044    0.9987    0.0013

x is often called z.

More info: https://en.wikipedia.org/wiki/Standard_normal_table
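
For anyone who wants to regenerate or extend the table, here's a short Python sketch (my own alternative to the Matlab run above, assuming NumPy and SciPy are installed):

import numpy as np
from scipy.stats import norm

x = np.arange(-3.0, 3.0 + 1e-9, 0.1)   # -3.0 to 3.0 in steps of 0.1
f = norm.pdf(x)                         # standard normal pdf, f(x)
F = norm.cdf(x)                         # CDF, F(x)
Q = norm.sf(x)                          # survival function, Q(x) = 1 - F(x)

print(f"{'x':>9} {'f(x)':>9} {'F(x)':>9} {'Q(x)':>9}")
for xi, fi, Fi, Qi in zip(x, f, F, Q):
    print(f"{xi:9.4f} {fi:9.4f} {Fi:9.4f} {Qi:9.4f}")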

3   The large effect of a small bias

This is enrichment material. It is not in the text, and will not be on the exam. However, it might be in a future homework.

Consider tossing $n=10^6$ fair coins.

  1. P[more heads than tails] ≈ 0.5 (by symmetry, ignoring the tiny probability of an exact tie).

  2. Now assume that each coin has probability $p=0.5005$ of coming up heads.

    What's P[more heads than tails]?

    1. Approximate with a Gaussian: $\mu = np = 500500$, $\sigma = \sqrt{np(1-p)} \approx 500$.
    2. Let X be the r.v. for the number of heads.
    3. $P[X>500000] \approx Q\left(\frac{500000-500500}{500}\right) = Q(-1) \approx 0.84$ (from the table above).
    4. I.e., increasing the probability of winning one toss by 1 part in 1000 increased the probability of winning the majority of 1,000,000 tosses from 50% to 84%.
  3. Now assume that 999,000 of the coins are fair, but 1,000 will always be heads.

    What's P[more heads than tails]?

    1. Let X = the number of heads among the 999,000 fair coins.
    2. More heads than tails overall means $X + 1000 > 999000 - X$, i.e., $X > 499000$, so we want P[X>499,000].
    3. Approximate with a Gaussian: $\mu=499500$, $\sigma \approx 500$.
    4. $P[X>499000] \approx Q(-1) \approx 0.84$, as before.
    5. I.e., fixing 0.1% of the coins increased the probability of winning the majority of 1,000,000 tosses from 50% to 84%. (A Monte Carlo check of both scenarios follows this list.)
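
Here's a minimal Monte Carlo sketch (mine, not from the text, assuming NumPy) that checks both Gaussian approximations above; with these parameters both estimates should come out near 0.84:

import numpy as np

rng = np.random.default_rng(0)
n, trials = 10**6, 2000

# Scenario 2: every coin has P[heads] = 0.5005.
heads = rng.binomial(n, 0.5005, size=trials)
print("biased coins: P[more heads than tails] ~", np.mean(heads > n - heads))

# Scenario 3: 999,000 fair coins plus 1,000 coins that are always heads.
fair_heads = rng.binomial(999_000, 0.5, size=trials)
total_heads = fair_heads + 1_000
print("fixed coins:  P[more heads than tails] ~", np.mean(total_heads > n - total_heads))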

4   Section 5.7 Conditional probability ctd

I'll do these sections only if there's time and interest.

  1. Example 5.35 Maximum A Posteriori Receiver on page 268.
  2. Example 5.37, page 270.
  3. Remember equations 5.49a and 5.49b on pages 269-70: total probability for the conditional expectation of Y given X (restated below).
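
For reference, here is the law of total expectation that those equations express, in discrete and continuous form (my restatement; the text's notation may differ slightly):

$$E[Y] = \sum_k E[Y \mid X=x_k]\, p_X(x_k) \qquad\text{and}\qquad E[Y] = \int_{-\infty}^{\infty} E[Y \mid X=x]\, f_X(x)\, dx .$$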

5   Section 5.8 page 271: Functions of two random variables, ctd

  1. Example 5.39 Sum of Two Random Variables, page 271.

  2. Example 5.40 Sum of Nonindependent Gaussian Random Variables, page 272.

    I'll do an easier case: the sum of two independent N(0,1) r.v. The sum is Gaussian with mean 0 and variance 2, i.e., standard deviation $\sqrt{2}$. (A quick numerical check follows this list.)

  3. Example 5.44, page 275. Transform two independent Gaussian r.v. from (X,Y) to (R, $\theta$).
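
A quick numerical check of both items (my own sketch, assuming NumPy): the sum of two independent N(0,1) samples has standard deviation about $\sqrt{2}$, and the same samples in polar form give a Rayleigh radius and a uniform angle:

import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(1_000_000)
y = rng.standard_normal(1_000_000)

# Independent case of Example 5.40: X + Y should have standard deviation sqrt(2).
s = x + y
print("std of X+Y:", s.std(), "(expect sqrt(2) ~ 1.414)")

# Example 5.44: the same pair in polar coordinates.
# R is Rayleigh (mean sqrt(pi/2) ~ 1.253); theta is uniform on (-pi, pi].
r = np.hypot(x, y)
theta = np.arctan2(y, x)
print("mean of R:", r.mean(), "(expect ~ 1.253)")
print("theta min/max:", theta.min(), theta.max(), "(expect ~ -pi and pi)")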

6   Section 5.9, page 278: pairs of jointly Gaussian r.v.

  1. I will simplify formula 5.61a by assuming that $\mu_X=\mu_Y=0$ and $\sigma_X=\sigma_Y=1$.

    $$f_{XY}(x,y)= \frac{1}{2\pi \sqrt{1-\rho^2}} e^{ \frac{-\left( x^2-2\rho x y + y^2\right)}{2(1-\rho^2)} } $$ .

  2. The r.v. are in general dependent; $\rho$ says how much.

  3. The formula degenerates if $|\rho|=1$. Off the line $y=\pm x$ the exponent goes to $-\infty$, so the density goes to 0; on that line the exponent is the indeterminate form 0/0. The underlying distribution is still valid, but all of its probability is concentrated on that line, so there is no ordinary 2-variable pdf.

  4. The lines of equal probability density are ellipses.

  5. The marginal pdf is a 1 variable Gaussian.

  6. Example 5.47, page 282: Estimation of signal in noise

    1. This is our perennial example of signal and noise. However, here the signal is not just $\pm1$ but is normal. Our job is to find the "most likely" input signal for a given output.
  7. Important concept in the noisy channel example (with X and N both being Gaussian): The most likely value of X given Y is not Y, but something closer to 0, depending on the relative sizes of $\sigma_X$ and $\sigma_N$. This is true in spite of $\mu_N=0$. It would be really useful for you to understand this intuitively. Here's one way:

    If you don't know Y, then the most likely value of X is 0. Knowing Y gives you more information, which you combine with your initial info (that X is $N(0,\sigma_X)$) to get a new estimate for the most likely X. The smaller the noise, the more valuable Y is. If the noise is very small, then the most likely X is close to Y. If the noise is very large (on average), then the most likely X is still close to 0.
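
To make this concrete, here's the standard result in my own notation (compare Example 5.47 in the text): with $Y=X+N$, $X \sim N(0,\sigma_X^2)$ and $N \sim N(0,\sigma_N^2)$ independent, X given Y=y is Gaussian with mean

$$E[X \mid Y=y] = \frac{\sigma_X^2}{\sigma_X^2+\sigma_N^2}\, y .$$

Since a Gaussian's mode is its mean, this shrunk value is also the most likely X. When $\sigma_N \ll \sigma_X$ the factor is near 1 (estimate close to y); when $\sigma_N \gg \sigma_X$ it is near 0 (estimate close to the prior mean 0).

Here's a small simulation sketch (mine, not the text's, assuming NumPy) that checks the shrinkage factor:

import numpy as np

rng = np.random.default_rng(3)
sigma_x, sigma_n = 1.0, 0.5
n = 2_000_000

x = sigma_x * rng.standard_normal(n)      # signal
noise = sigma_n * rng.standard_normal(n)  # noise
y = x + noise                             # observed output

y0 = 1.0                                  # condition on Y near y0
near = np.abs(y - y0) < 0.02
print("average X given Y near", y0, ":", x[near].mean())
print("predicted value:", sigma_x**2 / (sigma_x**2 + sigma_n**2) * y0)  # 0.8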

7   Tutorial on probability density - 2 variables

In class 15, I tried to motivate the effect of changing one variable on probability density. Here's an attempt to motivate the effect of changing two variables.

  1. We're throwing darts uniformly at a one-foot-square dartboard.
  2. We observe 2 random variables, X, Y, where the dart hits (in Cartesian coordinates).
  3. $$f_{X,Y}(x,y) = \begin{cases} 1& \text{if}\,\, 0\le x\le1 \cap 0\le y\le1\\ 0&\text{otherwise} \end{cases}$$
  4. $$P[.5\le x\le .6 \cap .8\le y\le.9] = \int_{.5}^{.6}\int_{.8}^{.9} f_{XY}(x,y)\, dy \, dx = 0.01 $$
  5. Transform to centimeters (approximating 1 ft ≈ 30 cm): $$\begin{bmatrix}V\\W\end{bmatrix} = \begin{pmatrix}30&0\\0&30\end{pmatrix} \begin{bmatrix}X\\Y\end{bmatrix}$$
  6. $$f_{V,W}(v,w) = \begin{cases} 1/900& \text{if } 0\le v\le30 \cap 0\le w\le30\\ 0&\text{otherwise} \end{cases}$$
  7. $$P[15\le v\le 18 \cap 24\le w\le27] = \\ \int_{15}^{18}\int_{24}^{27} f_{VW}(v,w)\, dw\, dv = \frac{ (18-15)(27-24) }{900} = 0.01$$ (A numerical check follows this list.)
  8. See Section 5.8.3 on page 286.
  9. Next: We've seen 1 r.v., we've seen 2 r.v. Now we'll see several r.v.
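
Here's a small Monte Carlo check of the dartboard numbers (my own sketch, assuming NumPy): whether you measure in feet or in centimeters, the probability of landing in that patch comes out near 0.01:

import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
x = rng.uniform(0.0, 1.0, n)   # X, Y uniform on the unit (1 ft) square
y = rng.uniform(0.0, 1.0, n)

# Probability of the patch, measured in feet.
p_ft = np.mean((0.5 <= x) & (x <= 0.6) & (0.8 <= y) & (y <= 0.9))

# Same darts, measured in centimeters (using the 30 cm/ft scaling above).
v, w = 30 * x, 30 * y
p_cm = np.mean((15 <= v) & (v <= 18) & (24 <= w) & (w <= 27))

print(p_ft, p_cm)              # both should be close to 0.01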