Engineering Probability Class 22 Thu 2019-04-04

1   iClicker questions

  1. X and Y are two independent uniform r.v.s on the interval [0,1]. Z=X+Y. What is E[Z]? (All seven questions are checked numerically in the sketch in item 8 below.)

    1. 0
    2. 1/2
    3. 2/3
    4. 1
  2. Now let W=max(X,Y). What is E[W]?

    1. 0
    2. 1/2
    3. 2/3
  3. Experiment: toss two fair coins, one after the other. Observe two random variables:

    1. X is the number of heads.
    2. Y is the toss on which the first head occurred, with 0 meaning both coins were tails.

    What is P[X=1]?

    1. 0
    2. 1/4
    3. 1/2
    4. 3/4
    5. 1
  4. What is P[Y=1]?

    1. 0
    2. 1/4
    3. 1/2
    4. 3/4
    5. 1
  5. What is P[Y=1 & X=1]?

    1. 0
    2. 1/4
    3. 1/2
    4. 3/4
    5. 1
  6. What is P[Y=1|X=1]?

    1. 0
    2. 1/4
    3. 1/2
    4. 3/4
    5. 1
  7. What is P[X=1|Y=1]?

    1. 0
    2. 1/4
    3. 1/2
    4. 3/4
    5. 1
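  8. Not part of the quiz: a minimal Python sketch that checks these answers (assumptions mine: NumPy Monte Carlo for questions 1-2, exact enumeration of the four equally likely outcomes for questions 3-7).

      import itertools
      import numpy as np

      rng = np.random.default_rng(0)

      # Questions 1-2: X, Y independent Uniform[0,1], Z = X+Y, W = max(X,Y).
      x = rng.random(1_000_000)
      y = rng.random(1_000_000)
      print("E[Z] ~", (x + y).mean())              # -> 1
      print("E[W] ~", np.maximum(x, y).mean())     # -> 2/3

      # Questions 3-7: two fair coins; each of the 4 outcomes has probability 1/4.
      outcomes = list(itertools.product("HT", repeat=2))
      def X(o): return o.count("H")                         # number of heads
      def Y(o): return o.index("H") + 1 if "H" in o else 0  # toss of first head
      pX1 = sum(1 for o in outcomes if X(o) == 1) / 4
      pY1 = sum(1 for o in outcomes if Y(o) == 1) / 4
      pXY = sum(1 for o in outcomes if X(o) == 1 and Y(o) == 1) / 4
      print("P[X=1] =", pX1)              # 1/2 (HT, TH)
      print("P[Y=1] =", pY1)              # 1/2 (HH, HT)
      print("P[Y=1 & X=1] =", pXY)        # 1/4 (HT)
      print("P[Y=1 | X=1] =", pXY / pX1)  # 1/2
      print("P[X=1 | Y=1] =", pXY / pY1)  # 1/2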

2   Mathematica demo

  1. Exercise 6.47, page 353.

3   Material from text

3.1   Section 6.5, page 332: Estimation of random variables

  1. Assume that we want to know X but can only see Y, which depends on X.

  2. This is a generalization of our long-running noisy communication channel example. We'll do things a little more precisely now.

  3. Another application would be to estimate tomorrow's price of GOOG (X) given the prices to date (Y).

  4. Sometimes, but not always, we have a prior probability for X.

  5. For the communication channel we do; for GOOG, we don't.

  6. If we do, it's a "maximum a posteriori" (MAP) estimator.

  7. If we don't, it's a "maximum likelihood" (ML) estimator. We effectively assume that the prior probability of X is uniform, even though that may not completely make sense.

  8. You toss a fair coin 3 times. X is the number of heads, from 0 to 3. Y is the position of the 1st head, from 0 to 3. If there are no heads, we'll say that the first head's position is 0.

    (X,Y) p(X,Y)
    (0,0) 1/8
    (1,1) 1/8
    (1,2) 1/8
    (1,3) 1/8
    (2,1) 2/8
    (2,2) 1/8
    (3,1) 1/8

    E.g., 1 head can occur 3 ways (out of 8 equally likely sequences): HTT, THT, TTH. The 1st (and only) head occurs in position 1 in exactly one of those ways (HTT), so p(1,1)=1/8.

  9. Conditional probabilities:

    p(x|y)            y=0    y=1     y=2     y=3
    x=0                1      0       0       0
    x=1                0     1/4     1/2      1
    x=2                0     1/2     1/2      0
    x=3                0     1/4      0       0

    $g_{MAP}(y)$       0      2    1 or 2     1
    $P_{error}(y)$     0     1/2     1/2      0
    p(y)              1/8    1/2     1/4     1/8

    The total probability of error is $\sum_y p(y)\,P_{error}(y) = \frac{1}{2}\cdot\frac{1}{2} + \frac{1}{4}\cdot\frac{1}{2} = \frac{3}{8}$. (The sketch in item 17 below recomputes this table.)

  10. We observe Y and want to guess X from Y. E.g., if we observe $$\small y= \begin{pmatrix}0\\1\\2\\3\end{pmatrix} \text{ then } x= \begin{pmatrix}0\\ 2 \text{ most likely} \\ 1, 2 \text{ equally likely} \\ 1 \end{pmatrix}$$

  11. There are different formulae. The one above was MAP, the maximum a posteriori probability estimator:

    $$g_{\text{MAP}}(y) = \arg\max_x\, p_X(x|y) \text{ or } \arg\max_x\, f_X(x|y)$$

    That is, $g_{\text{MAP}}(y)$ is the value of $x$ that maximizes $p_X(x|y)$ (or $f_X(x|y)$ for continuous X).

  12. What if we don't know p(x|y)? If we know p(y|x), we can use Bayes' rule: $$p(x|y) = \frac{p(y|x)\,p(x)}{p(y)}$$ We might measure p(y|x) experimentally, e.g., by sending many messages over the channel.

  13. Bayes' rule requires p(x). What if we don't even know that? E.g., we don't know the probabilities of the different possible transmitted messages.

  14. Then use the maximum likelihood (ML) estimator: $$g_{\text{ML}}(y) = \arg\max_x\, p_Y(y|x) \text{ or } \arg\max_x\, f_Y(y|x)$$

  15. There are other estimators for different applications. E.g., least-squares regression might attempt to predict a graduate's QPA from his or her entering SAT scores. At Saratoga in August we might attempt to predict a horse's chance of winning a race from its speed in previous races; some years ago, an Engineering Associate Dean did that each summer.

  16. Historically, IMO, some of the techniques, like least squares and logistic regression, have been used more because they're computationally easy than because they're logically justified.
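
  17. A minimal Python sketch (mine, not from the text) that enumerates the 8 equally likely coin sequences of item 8, recomputes the p(x|y) table and the MAP rule of item 11, and, for comparison, the ML rule of item 14.

      from itertools import product
      from collections import Counter
      from fractions import Fraction

      joint = Counter()                                # joint pmf p(x,y)
      for seq in product("HT", repeat=3):              # 8 equally likely sequences
          x = seq.count("H")                           # X = number of heads
          y = seq.index("H") + 1 if "H" in seq else 0  # Y = position of 1st head
          joint[x, y] += Fraction(1, 8)

      def p_y(y): return sum(p for (_, yy), p in joint.items() if yy == y)
      def p_x(x): return sum(p for (xx, _), p in joint.items() if xx == x)

      total_error = Fraction(0)
      for y in range(4):
          posterior = {x: joint[x, y] / p_y(y) for x in range(4)}  # p(x|y), Bayes
          x_map = max(posterior, key=posterior.get)  # ties (y=2) broken arbitrarily
          total_error += p_y(y) * (1 - posterior[x_map])
          # ML ignores the prior: argmax over x of p(y|x) = p(x,y)/p(x).
          lik = {x: joint[x, y] / p_x(x) for x in range(4)}
          print(f"y={y}: g_MAP={x_map}, g_ML={max(lik, key=lik.get)}")
      print("total MAP P[error] =", total_error)       # -> 3/8

    Note that for y=1 the two rules disagree: MAP picks the most probable x given the prior, while ML picks the x under which y=1 is most likely.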

3.2   Central limit theorem etc

  1. Review: almost no matter what distribution the random variable X has, the c.d.f. $F_{M_n}$ of the sample mean $M_n = \frac{1}{n}\sum_{i=1}^n X_i$ quickly becomes Gaussian as n increases; n=5 already gives a good approximation. (Item 4 below demonstrates this by simulation.)
  2. Nice applets:
    1. http://onlinestatbook.com/stat_sim/normal_approx/index.html This tests how good the normal approximation to the binomial distribution is.
    2. http://onlinestatbook.com/stat_sim/sampling_dist/index.html This lets you define a distribution and take repeated samples of a given size. It shows how the means of the samples are distributed. For samples with more than a few observations, they look fairly normal.
    3. http://www.umd.umich.edu/casl/socsci/econ/StudyAids/JavaStat/CentralLimitTheorem.html This might also be interesting.
  3. Sample problems.
    1. Problem 7.1 on page 402.
    2. Problem 7.22.
    3. Problem 7.25.
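  4. A minimal simulation sketch of the claim in item 1 (assumptions mine: an exponential(1) parent distribution, whose mean and standard deviation are both 1, and sample sizes n = 1, 5, 30). The skewness of the standardized sample mean shrinks toward 0, the Gaussian value, like $2/\sqrt{n}$.

      import numpy as np

      rng = np.random.default_rng(0)
      for n in (1, 5, 30):
          # 100,000 sample means, each of n i.i.d. exponential(1) draws.
          means = rng.exponential(scale=1.0, size=(100_000, n)).mean(axis=1)
          z = (means - 1.0) * np.sqrt(n)   # standardize: mu = sigma = 1
          print(f"n={n:2d}: skewness of M_n ~ {np.mean(z**3):.2f}")  # 0 if Gaussian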