Engineering Probability Class 20 Mon 2018-04-02

1   Final exam conflicts

  1. If you have 3 exams on that day, or another exam at the same time, please email me this week. Tell me the other courses.
  2. The RPI rule is that the lower-numbered course has precedence. If one of your other classes has a number higher than 2500, then that course gives the conflict exam. If all the other courses with exams that day are numbered lower than 2500, then I do.

2   Normal distribution table

For your convenience; I computed it with Matlab. Here f is the standard normal pdf, F the cdf, and Q(x) = 1 - F(x):

x          f(x)      F(x)      Q(x)
-3.0000    0.0044    0.0013    0.9987
-2.9000    0.0060    0.0019    0.9981
-2.8000    0.0079    0.0026    0.9974
-2.7000    0.0104    0.0035    0.9965
-2.6000    0.0136    0.0047    0.9953
-2.5000    0.0175    0.0062    0.9938
-2.4000    0.0224    0.0082    0.9918
-2.3000    0.0283    0.0107    0.9893
-2.2000    0.0355    0.0139    0.9861
-2.1000    0.0440    0.0179    0.9821
-2.0000    0.0540    0.0228    0.9772
-1.9000    0.0656    0.0287    0.9713
-1.8000    0.0790    0.0359    0.9641
-1.7000    0.0940    0.0446    0.9554
-1.6000    0.1109    0.0548    0.9452
-1.5000    0.1295    0.0668    0.9332
-1.4000    0.1497    0.0808    0.9192
-1.3000    0.1714    0.0968    0.9032
-1.2000    0.1942    0.1151    0.8849
-1.1000    0.2179    0.1357    0.8643
-1.0000    0.2420    0.1587    0.8413
-0.9000    0.2661    0.1841    0.8159
-0.8000    0.2897    0.2119    0.7881
-0.7000    0.3123    0.2420    0.7580
-0.6000    0.3332    0.2743    0.7257
-0.5000    0.3521    0.3085    0.6915
-0.4000    0.3683    0.3446    0.6554
-0.3000    0.3814    0.3821    0.6179
-0.2000    0.3910    0.4207    0.5793
-0.1000    0.3970    0.4602    0.5398
      0    0.3989    0.5000    0.5000
 0.1000    0.3970    0.5398    0.4602
 0.2000    0.3910    0.5793    0.4207
 0.3000    0.3814    0.6179    0.3821
 0.4000    0.3683    0.6554    0.3446
 0.5000    0.3521    0.6915    0.3085
 0.6000    0.3332    0.7257    0.2743
 0.7000    0.3123    0.7580    0.2420
 0.8000    0.2897    0.7881    0.2119
 0.9000    0.2661    0.8159    0.1841
 1.0000    0.2420    0.8413    0.1587
 1.1000    0.2179    0.8643    0.1357
 1.2000    0.1942    0.8849    0.1151
 1.3000    0.1714    0.9032    0.0968
 1.4000    0.1497    0.9192    0.0808
 1.5000    0.1295    0.9332    0.0668
 1.6000    0.1109    0.9452    0.0548
 1.7000    0.0940    0.9554    0.0446
 1.8000    0.0790    0.9641    0.0359
 1.9000    0.0656    0.9713    0.0287
 2.0000    0.0540    0.9772    0.0228
 2.1000    0.0440    0.9821    0.0179
 2.2000    0.0355    0.9861    0.0139
 2.3000    0.0283    0.9893    0.0107
 2.4000    0.0224    0.9918    0.0082
 2.5000    0.0175    0.9938    0.0062
 2.6000    0.0136    0.9953    0.0047
 2.7000    0.0104    0.9965    0.0035
 2.8000    0.0079    0.9974    0.0026
 2.9000    0.0060    0.9981    0.0019
 3.0000    0.0044    0.9987    0.0013
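A table like the one above can be regenerated in a few lines. The notes used Matlab; this is an equivalent sketch in Python using only the standard library (the normal cdf is expressed through the error function):

```python
import math

def f(x):   # standard normal pdf
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def F(x):   # standard normal cdf, via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def Q(x):   # tail probability, Q(x) = 1 - F(x)
    return 1 - F(x)

for i in range(-30, 31):
    x = i / 10
    print(f"{x:8.4f} {f(x):9.4f} {F(x):9.4f} {Q(x):9.4f}")
```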

3   Enrichment (not in text): the large effect of a small bias

Consider tossing $n=10^6$ fair coins.

  1. P[more heads than tails] = 0.5

  2. Now assume that each coin has chance of being heads $p=0.5005$.

    What's P[more heads than tails]?

  3. Now assume that 999,000 of the coins are fair, but 1,000 will always be heads.

    What's P[more heads than tails]?
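A numeric sketch of questions 2 and 3 using the normal approximation to the binomial (these are approximations, not exact binomial sums, and the continuity correction is ignored):

```python
import math

def Phi(x):  # standard normal cdf
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

n = 10**6

# Question 2: each coin is heads with p = 0.5005.
p = 0.5005
mu = n * p                              # 500500
sigma = math.sqrt(n * p * (1 - p))      # about 500
p_more_heads = 1 - Phi((n / 2 - mu) / sigma)
print(p_more_heads)                     # roughly 0.84

# Question 3: 999,000 fair coins plus 1,000 guaranteed heads.
fair = 999_000
mu = 1000 + fair / 2                    # 500500 again
sigma = math.sqrt(fair * 0.25)          # about 499.75
p_more_heads_3 = 1 - Phi((n / 2 - mu) / sigma)
print(p_more_heads_3)                   # also roughly 0.84
```

A bias of only 0.1% (or 1,000 rigged coins in a million) moves the probability of a head majority from 0.5 to about 0.84.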

4   Material from text

  1. Example 5.17 on page 253. P[X+Y<=1]

  2. Example 5.18 on page 253. Joint Gaussian.

  3. Example 5.19 on page 255. Independence.

  4. Example 5.20 on page 255. Independence of Q and R in the block transmission example.

  5. Independence: Example 5.22 on page 256. Are 2 normal r.v. independent for different values of $\rho$ ?

  6. Example 5.31 on page 264. This is a noisy comm channel, now with Gaussian (normal) noise. The problems are:

    1. what input signal to infer from each output, and
    2. how accurate is this?
  7. 5.6.2 Joint moments etc

    1. Work out for 2 3-sided dice.
    2. Work out for tossing dart onto triangular board.
  8. Example 5.27: correlation measures ''linear dependence''. If the dependence is more complicated, the variables may be dependent but not correlated.

  9. Covariance, correlation coefficient.

  10. Section 5.7, page 261. Conditional pdf. There is nothing majorly new here; it's an obvious extension of 1 variable.

    1. Discrete: Work out an example with a pair of 3-sided loaded dice.
    2. Continuous: a triangular dart board. There is one little trick: P[X=x]=0 since X is continuous, so how can we compute P[Y=y|X=x] = P[Y=y & X=x]/P[X=x]? The answer is that we take the limiting probability P[x<X<x+dx] etc. as dx shrinks, which nets out to using f(x) etc.
  11. Example 5.31 on page 264. This is a noisy comm channel, now with Gaussian (normal) noise. This is a more realistic version of the earlier example with uniform noise. The application problems are:

    1. what input signal to infer from each output,
    2. how accurate is this, and
    3. what cutoff minimizes this?

    In the real world there are several ways you could reduce that error:

    1. Increase the transmitted signal,
    2. Reduce the noise,
    3. Retransmit several times and vote.
    4. Handshake: Include a checksum and ask for retransmission if it fails.
    5. Instead of just deciding X=+1 or X=-1 depending on Y, have a 3rd decision, i.e., uncertain if $|Y|<0.5$, and ask for retransmission in that case.
  12. Section 5.8 page 271: Functions of two random variables.

    1. We already saw how to compute the pdf of the sum and max of 2 r.v.
  13. What's the point of transforming variables in engineering? E.g. in video, (R,G,B) might be transformed to (Y,I,Q) with a 3x3 matrix multiply. Y is brightness (mostly the green component). I and Q are approximately the red and blue. Since we see brightness more accurately than color hue, we want to transmit Y with greater precision. So, we want to do probabilities on all this.

  14. Functions of 2 random variables

    1. This is an important topic.
    2. Example 5.44, page 275. Transform two independent Gaussian r.v. from (X,Y) to (R, $\theta$).
    3. Linear transformation of two Gaussian r.v.
    4. Sum and difference of 2 Gaussian r.v. are independent.
  15. Section 5.9, page 278: pairs of jointly Gaussian r.v.

    1. I will simplify formula 5.61a by assuming that $\mu=0, \sigma=1$.

      $$f_{XY}(x,y)= \frac{1}{2\pi \sqrt{1-\rho^2}} e^{ \frac{-\left( x^2-2\rho x y + y^2\right)}{2(1-\rho^2)} } $$ .

    2. The r.v. are probably dependent. $\rho$ says how much.

    3. The formula degenerates if $|\rho|=1$, since the $\sqrt{1-\rho^2}$ factor and the exponent's denominator both become zero. The distribution is still valid, but it collapses onto a line. You could analyze the limit with l'Hopital's rule.

    4. The lines of equal probability density are ellipses.

    5. The marginal pdf is a 1 variable Gaussian.

  16. Example 5.47, page 282: Estimation of signal in noise

    1. This is our perennial example of signal and noise. However, here the signal is not just $\pm1$ but is normal. Our job is to find the ''most likely'' input signal for a given output.
  17. Next time: We've seen 1 r.v., we've seen 2 r.v. Now we'll see several r.v.
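The Gaussian-noise channel of Example 5.31 can be illustrated with a small Monte Carlo simulation. This is a rough sketch, not the text's derivation; the noise level $\sigma=1$ and the sample size are assumptions:

```python
import math
import random

random.seed(0)
sigma = 1.0          # assumed noise standard deviation
n = 200_000

errors = 0
for _ in range(n):
    x = random.choice((-1, 1))        # transmitted signal
    y = x + random.gauss(0, sigma)    # received = signal + Gaussian noise
    x_hat = 1 if y > 0 else -1        # decision rule: infer the sign of Y
    if x_hat != x:
        errors += 1

empirical = errors / n
# Theory: an error occurs when the noise pushes Y across 0,
# which happens with probability Q(1/sigma).
theoretical = 0.5 * (1 - math.erf(1 / (sigma * math.sqrt(2))))
print(empirical, theoretical)   # both near 0.159
```

Raising the signal or lowering $\sigma$ shrinks both numbers, which is the point of the error-reduction list above.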

Engineering Probability Homework 8 due Mon 2018-04-09 2359 EST

How to submit

Submit to LMS; see details in syllabus.

Questions

All questions except #1 are from the text.

Each part of a question is worth 5 points.

  1. (15 pts) This exercise shows how the sum of independent r.v. starts looking like a normal distribution.
    1. Compute the pdf of the sum of 2,3,4, and 5 independent r.v. that are each U[0,1].
    2. Compute the mean and standard deviation of each pdf.
    3. For each sum, plot the pdf and overlay it with a plot of the pdf of the normal distribution with the same mean and standard deviation.
  2. (30 pts) Textbook problem 5.1 on page 288 - 6 parts: a, b, c, da, db, dc.
  3. (20 pts) Problem 5.2. (c) is 2 parts. Do all parts.
  4. (20 pts) Problem 5.3 (c) is 2 parts. Do all parts.
  5. (35 pts) Problem 5.12. (d) has 4 parts, giving 7 parts total.
  6. (35 pts) Problem 5.13. (d) has 3 parts, giving 7 total.

Total: 155.

Engineering Probability Class 19 and Exam 2 - Thu 2018-03-29

Name, RCSID:

(answer space)

Rules:

  1. You have 80 minutes.
  2. You may bring two 2-sided 8.5"x11" sheets of paper with notes.
  3. You may bring a calculator.
  4. You may not share material with each other during the exam.
  5. No collaboration or communication (except with the staff) is allowed.
  6. Check that your copy of this test has all eleven pages.
  7. Each part of a question is worth 5 points.
  8. You may cross out two questions, which will not be graded.
  9. When answering a question, don't just state your answer, prove it.

Questions:

  1. Consider this probability distribution:

    $$f_X(x)= \begin{cases} a(2-x) & \text{if } 0\le x\le1\\ 0&\text{otherwise}\end{cases}$$

    1. What is $a$?

      (answer space)
      
    2. What is $F_X(x)$?

      (answer space)
      
    3. What is E[X]?

      (answer space)
      
    4. What is the reliability, R[x]?

      (answer space)
      
    5. What is the MTTF?

      (answer space)
      
    6. What is the failure rate, r(x)?

      (answer space)
      
    7. What is $f_X(x|x>.5)$?

      (answer space)
      
  2. Define a new r.v. Y=2X, where X is the r.v. in the previous question.

    1. What is $f_Y(y)$?

      (answer space)
      
    2. What is $F_Y(y)$?

      (answer space)
      
    3. What is E[Y]?

      (answer space)
      
  3. Your web server gets on the average 1 hit per second. The possible clients are independent of each other.

    1. What is the name of appropriate distribution for the number of hits per second?

      (answer space)
      
    2. What is the probability that it gets exactly one hit in the next two seconds?

      (answer space)
      
    3. What is the name of appropriate probability distribution for the time between successive hits?

      (answer space)
      
    4. What is the probability that the time between two successive hits is less than two seconds?

      (answer space)
      
  4. Let X be an exponential random variable with mean 1.

    1. Using the Markov inequality, what's P[X>3]?

      (answer space)
      
    2. Using the Chebyshev inequality, what's P[X>3]?

      (answer space)
      
    3. What's the exact P[X>3]?

      (answer space)
      
  5. Let X be a normal random variable with mean 100 and standard deviation 10. Give the following numbers, using the supplied table.

    1. P[X>100].

      (answer space)
      
    2. P[80<X<100].

      (answer space)
      
  6. You're tossing 10000 fair coins. What's the probability of getting between 5000 and 5100 heads? Use the table.

    (answer space)
    
  7. Evaluate $$\int_0^\infty e^{-2 x^2} dx$$

    (answer space)
    
  8. Let $f_X(x) = 1$ and $f_Y(y)=2y$, both in the range $0\le x, y\le1$.

    Let Z=max(X,Y).

    What is E[Z]?

    (answer space)
    

Normal distribution:

x          f(x)      F(x)      Q(x)
-3.0000    0.0044    0.0013    0.9987
-2.9000    0.0060    0.0019    0.9981
-2.8000    0.0079    0.0026    0.9974
-2.7000    0.0104    0.0035    0.9965
-2.6000    0.0136    0.0047    0.9953
-2.5000    0.0175    0.0062    0.9938
-2.4000    0.0224    0.0082    0.9918
-2.3000    0.0283    0.0107    0.9893
-2.2000    0.0355    0.0139    0.9861
-2.1000    0.0440    0.0179    0.9821
-2.0000    0.0540    0.0228    0.9772
-1.9000    0.0656    0.0287    0.9713
-1.8000    0.0790    0.0359    0.9641
-1.7000    0.0940    0.0446    0.9554
-1.6000    0.1109    0.0548    0.9452
-1.5000    0.1295    0.0668    0.9332
-1.4000    0.1497    0.0808    0.9192
-1.3000    0.1714    0.0968    0.9032
-1.2000    0.1942    0.1151    0.8849
-1.1000    0.2179    0.1357    0.8643
-1.0000    0.2420    0.1587    0.8413
-0.9000    0.2661    0.1841    0.8159
-0.8000    0.2897    0.2119    0.7881
-0.7000    0.3123    0.2420    0.7580
-0.6000    0.3332    0.2743    0.7257
-0.5000    0.3521    0.3085    0.6915
-0.4000    0.3683    0.3446    0.6554
-0.3000    0.3814    0.3821    0.6179
-0.2000    0.3910    0.4207    0.5793
-0.1000    0.3970    0.4602    0.5398

      0    0.3989    0.5000    0.5000
 0.1000    0.3970    0.5398    0.4602
 0.2000    0.3910    0.5793    0.4207
 0.3000    0.3814    0.6179    0.3821
 0.4000    0.3683    0.6554    0.3446
 0.5000    0.3521    0.6915    0.3085
 0.6000    0.3332    0.7257    0.2743
 0.7000    0.3123    0.7580    0.2420
 0.8000    0.2897    0.7881    0.2119
 0.9000    0.2661    0.8159    0.1841
 1.0000    0.2420    0.8413    0.1587
 1.1000    0.2179    0.8643    0.1357
 1.2000    0.1942    0.8849    0.1151
 1.3000    0.1714    0.9032    0.0968
 1.4000    0.1497    0.9192    0.0808
 1.5000    0.1295    0.9332    0.0668
 1.6000    0.1109    0.9452    0.0548
 1.7000    0.0940    0.9554    0.0446
 1.8000    0.0790    0.9641    0.0359
 1.9000    0.0656    0.9713    0.0287
 2.0000    0.0540    0.9772    0.0228
 2.1000    0.0440    0.9821    0.0179
 2.2000    0.0355    0.9861    0.0139
 2.3000    0.0283    0.9893    0.0107
 2.4000    0.0224    0.9918    0.0082
 2.5000    0.0175    0.9938    0.0062
 2.6000    0.0136    0.9953    0.0047
 2.7000    0.0104    0.9965    0.0035
 2.8000    0.0079    0.9974    0.0026
 2.9000    0.0060    0.9981    0.0019
 3.0000    0.0044    0.9987    0.0013

End of exam 2, total 100 points (considering that 2 questions aren't graded).

Engineering Probability Class 19 and Exam 2 Solution - Thu 2018-03-29

Name, RCSID:

WRF solution

Rules:

  1. You have 80 minutes.
  2. You may bring two 2-sided 8.5"x11" sheets of paper with notes.
  3. You may bring a calculator.
  4. You may not share material with each other during the exam.
  5. No collaboration or communication (except with the staff) is allowed.
  6. Check that your copy of this test has all eleven pages.
  7. Each part of a question is worth 5 points.
  8. You may cross out two questions, which will not be graded.
  9. When answering a question, don't just state your answer, prove it.

Questions:

  1. Consider this probability distribution:

    $$f_X(x)= \begin{cases} a(2-x) & \text{if } 0\le x\le1\\ 0&\text{otherwise}\end{cases}$$

    1. What is $a$?

      We require that $\int_0^1 f(x)\,dx = 1$.

      So, $a=2/3$. You don't have to write it, but this gives $f_X(x) = \frac{4}{3}-\frac{2}{3}x$ if $0<x<1$.

    2. What is $F_X(x)$?

      $$F(x) = \int_0^x f(w) dw = \begin{cases} 0 & \text{if } x\le0 \\ \frac{4}{3}x - \frac{x^2}{3} & 0<x<1 \\ 1 & 1\le x \end{cases} $$

      You can use any placeholder variable (other than $x$) in place of $w$.

      It doesn't matter where you say $<$ versus $\le$.

    3. What is E[X]?

      $$E[X] = \int_0^1 x f(x) dx = \int (4/3 x -2/3 x^2)dx = \left. \left(\frac{2 x^2}{3} - \frac{2 x^3}{9}\right)\right|_0^1 = \frac{4}{9}$$

    4. What is the reliability, R[x]?

      $$R(x) = 1 - F(x) = \begin{cases} 1 - \frac{4}{3}x + \frac{x^2}{3} & 0<x<1 \\ 0 & 1\le x \end{cases} $$

      Lifetimes are nonnegative, so I deleted the case for $x<0$, but it doesn't matter.

    5. What is the MTTF?

      MTTF = $$\int_0^1 R(x) dx = \int \left(1 - \frac{4}{3}x + \frac{x^2}{3}\right) dx = \frac{4}{9}$$

      MTTF=E[X]. The integral goes up to 1 because R(x) is 0 when x>1.

    6. What is the failure rate, r(x)?

      For $0<x<1$,

      $$r(x) = \frac{-R'(x)}{R(x)} = \frac{\frac{4}{3}-\frac{2x}{3}}{1 - \frac{4}{3}x + \frac{x^2}{3}} = \frac{4-2x}{3 - 4 x + x^2}$$

      You don't need to simplify it.

      Note that $r(x)$ grows to infinity as $x$ approaches 1.

    7. What is $f_X(x|x>.5)$?

      $P[x>.5] = 1-F(.5) = 5/12$.

      $$f_X(x|x>.5) = f(x)/P[x>.5] = \begin{cases} \frac{8}{5}(2-x) & \text{if } 0.5\le x\le1\\ 0&\text{otherwise}\end{cases}$$

      As a check, you can see that $\int f(x|x>.5) dx = 1$.

  2. Define a new r.v. Y=2X, where X is the r.v. in the previous question.

    1. What is $f_Y(y)$?

      The nonzero domain for $f_X(x)$ is $0<x<1$, and Y=2X.

      So the nonzero domain for $f_Y(y)$ will be $0<y<2$.

      $dy/dx = 2$, so

      $$f_Y(y) = f_X(y/2)/2 = \begin{cases} \frac{2}{3} - \frac{y}{6} & 0<y<2\\ 0 & \text{otherwise}\end{cases}$$

    2. What is $F_Y(y)$?

      $$F_Y(y) = F_X(y/2)$$

      $$F_Y(y)= \begin{cases} \frac{2}{3} y - \frac{y^2}{12} & 0<y<2\\ 0 & y<0 \\ 1 & y>2 \end{cases}$$

    3. What is E[Y]?

      $$ E[Y] = \int_0^2 y\left(\frac{2}{3} - \frac{y}{6}\right) dy = \left.\left(y^2/3-y^3/18\right)\right|_0^2 = \frac{8}{9}$$

      Alternatively, E[Y] = 2 E[X].

  3. Your web server gets on the average 1 hit per second. The possible clients are independent of each other.

    1. What is the name of appropriate distribution for the number of hits per second?

      Poisson

    2. What is the probability that it gets exactly one hit in the next two seconds?

      That r.v. is Poisson with $\alpha=2$ so $$P[X=1] = \frac{2^1 e^{-2}}{1!} = 2 e^{-2} = .27$$

      full points for $2 e^{-2}$.

    3. What is the name of appropriate probability distribution for the time between successive hits?

      Exponential

    4. What is the probability that the time between two successive hits is less than two seconds?

      Mean: $1/\lambda= 1$. $F(x) = 1-e^{-x}$. $F(2) = 1-e^{-2}=1-.14=.86$.

      full points for $1-e^{-2}$.

  4. Let X be an exponential random variable with mean 1.

    1. Using the Markov inequality, what's P[X>3]?

      See page 181. $\mu=1$, so $P[X>3]\le \mu/3 = 1/3$.

    2. Using the Chebyshev inequality, what's P[X>3]?

      $\mu=\sigma=1$, P[X<0]=0, so $$P[X>3]= P[|X-1|>2] \le 1/4$$

    3. What's the exact P[X>3]?

      $F(x) = 1-e^{-x}$ so $P[X>3] = 1-F[3] = e^{-3} = .05$

      full points for $e^{-3}$.

  5. Let X be a normal random variable with mean 100 and standard deviation 10. Give the following numbers, using the supplied table.

    1. P[X>100].

      0.5

    2. P[80<X<100].

      Converted to $\mu=0,\ \sigma=1$, this is P[-2<Y<0] = F(0)-F(-2) = .5 - .02 = .48.

  6. You're tossing 10000 fair coins. What's the probability of getting between 5000 and 5100 heads? Use the table.

    This is a binomial r.v. with $\mu=5000,\ \sigma=\sqrt{npq}=50$.

    Use a normal approximation; the answer is F(2)-F(0) = .98-.5 = .48.

  7. Evaluate $$\int_0^\infty e^{-2 x^2} dx$$

    For $\mu=0$, what value of $\sigma$ would make $f(x) = c e^{-2 x^2}$?

    $$f(x) = \frac{1}{\sqrt{2\pi} \cdot \sigma} \exp\left(\frac{-x^2}{2\sigma^2}\right)$$

    so let $\sigma=1/2$ and

    $$f(x) = \sqrt{\frac{2}{\pi} } e^{\left(-2 x^2\right)}$$

    So $$\int_{-\infty}^\infty e^{-2 x^2}\, dx = \sqrt{\frac{\pi}{2} }$$

    and $$\int_0^\infty e^{-2 x^2}\, dx = \sqrt{\frac{\pi}{8} },$$

    which could be written various ways.

  8. Let $f_X(x) = 1$ and $f_Y(y)=2y$, both in the range $0\le x, y\le1$.

    Let Z=max(X,Y).

    What is E[Z]?

    $F_X(x) = \int f_X(x) dx = x$,

    $F_Y(y) = y^2$.

    $F_Z(z) = F_X(z) F_Y(z) = z^3$ (using independence of X and Y),

    $f_Z(z) = 3 z^2$,

    $$E[Z] = \int_0^1 3 z^3 dz = \left.\frac{3}{4} z^4\right|_0^1 = \frac{3}{4}$$.
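The Gaussian integral in question 7 can be sanity-checked numerically. A quick sketch using a plain Riemann sum (truncating "infinity" at 5, where the integrand is negligible):

```python
import math

# Numerically integrate e^(-2 x^2) from 0 to (effectively) infinity.
N = 50_000
dx = 5 / N
total = sum(math.exp(-2 * (k * dx) ** 2) * dx for k in range(N))

print(total, math.sqrt(math.pi / 8))   # both about 0.6267
```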


End of exam 2, total 100 points (considering that 2 questions aren't graded).

Engineering Probability Class 18 Mon 2018-03-26

1   Exam 2, Thurs 3/29

1.1   Summary

  1. Closed book but a calculator and two 2-sided letter-paper-size note sheets are allowed.
  2. Material is mostly from chapter 4, with maybe some from chapters 1-3.
  3. Questions will be based on book, class, and homework, examples and exercises.
  4. The hard part for you may be deciding what formula to use.
  5. Any calculations will (IMHO) be easy.
  6. Speed should not be a problem; most people should finish in 1/2 the time.

1.2   Material on exam

  1. distributions:

    1. uniform: discrete, continuous

    2. exponential: This is the interarrival time between i.i.d. (independent and identically distributed) events, e.g., radioactive decays or web server hits.

    3. normal

    4. Poisson: This is the probability distribution for the number of events in a fixed time, when each possible event is independent and identically distributed.

      Examples:

      1. number of atoms decaying in a block of radium.
      2. number of hits on your web server.
      3. number of students visiting bursar.
    5. binomial

    6. Bernoulli

    7. Geometric

    For each distribution: pdf/pmf, cdf, mean, variance. (Pages 116, 164).

  2. I might give you a new pdf and ask you to compute the cdf.

  3. conditional probabilities.

  4. Markov and Chebyshev inequalities.

  5. function of a r.v.

  6. Reliability. R(t) = 1-F(t).

  7. MTTF (new) p 190.

    MTTF = $\int_0^\infty R(t) dt$

    Ex. MTTF of U[0,1].

  8. Failure rate.

    r(t)dt is the probability of failing in the next dt.

    $$r(t) = \frac{-R'(t)}{R(t)}$$

    Ex: do on U[0,1].

  9. Reliability.

  10. pdf/cdf of the max/min/sum of 2 r.v.
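For the U[0,1] examples in items 7 and 8 above: R(t) = 1 - t, so MTTF = $\int_0^1 (1-t)\,dt = 1/2$ and r(t) = 1/(1-t). A quick numeric sketch:

```python
# Reliability of U[0,1]: R(t) = 1 - F(t) = 1 - t on [0,1].
def R(t):
    return 1 - t

# MTTF = integral of R(t) from 0 to 1, here as a Riemann sum.
N = 100_000
dt = 1 / N
mttf = sum(R(k * dt) * dt for k in range(N))
print(mttf)          # about 0.5

# Failure rate r(t) = -R'(t)/R(t) = 1/(1 - t); it blows up as t -> 1.
def r(t):
    return 1 / (1 - t)

print(r(0.5))        # 2.0
```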

1.3   Material not on exam

  1. characteristic functions.
  2. moment generating functions.
  3. Matlab.
  4. entropy.
  5. generating random variables.
  6. Chi-square
  7. Weibull.

3   Normal Q function

Table from UC Davis E&CE411, Spring 2009.

4   Iclicker questions

  1. Consider a fair tetrahedral die, with faces labeled 1, 2, 3, 4. What is f(2)?
    1. 1/6.
    2. 1/4.
    3. 1/2.
  2. For the same die, what is F(2)?
    1. 1/6.
    2. 1/4.
    3. 1/2.
  3. Consider the continuous distribution with f(x) = $x^2$ for $0\le x \le \sqrt[3]{3}$. What is F(x)?
    1. $x$
    2. $x^2$
    3. $x^3$
    4. $x^2/2$
    5. $x^3/3$
  4. For that distribution, what is F(1)?
    1. 0
    2. 1/3
    3. 1/2
    4. 1
    5. None of the above.
  5. For the uniform U[0,2] distribution, what's the reliability R(1/2)?
    1. 1/2
    2. 1/4
    3. 3/4
    4. 1
    5. None of the above.
  6. For that distribution, what's the failure rate at 1/2?
    1. 1/2
    2. 1/4
    3. 3/4
    4. 1
    5. None of the above.

Engineering Probability Class 17 Thurs 2018-03-22

1   Exam 2, Thurs 3/29

  1. Bring 2 2-sided crib sheets.
  2. Bring a calculator if you wish. I'll try to set questions that don't require it. E.g., it's ok to write down an expression w/o evaluating it.
  3. The exam will include whatever normal distribution tables you need. It's legal if your calculator can do these, but I'll try to set questions where that doesn't help.

2   Chapter 5, Two Random Variables, ctd

  1. Read up to page 257.

  2. Review of some useful summations

    1. $$\sum_{k=0}^\infty a^k = \frac{1}{1-a} \quad\text{for } |a|<1$$

      1. E.g. $$\sum_{k=0}^\infty 2^{-k} = 1 + 1/2 + 1/4 + \cdots = 2$$

      2. I'll prove it.

      3. E.g. for a geometric dist with $$f(k) = (1-p) p^k$$,

        $$\sum_{k=0}^\infty f(k) = 1 $$

        which is correct.

      4. That's Eqn 2.42b on page 64, and Example 3.15 on page 106, but with a different notation. Those use q where this uses p.

    2. $$\sum_{k=0}^\infty k a^k = \frac{a}{(1-a)^2} \quad\text{for } |a|<1$$

      1. E.g. $$\sum_{k=0}^\infty k\, 2^{-k} = 1/2 + 2/4 + 3/8 + 4/16 + \cdots = 2$$
      2. I'll prove it.
      3. For the geometric dist, the mean, $$\sum_{k=0}^\infty k f(k) = \frac{p}{1-p} $$
      4. That's Eqn 3.15 on page 106.
  3. Notation inconsistency.

    1. p in the geometric distribution here is called q in earlier chapters.
    2. The book often uses q=1-p. In Example 5.9, though, q is the number of full blocks (q for quotient).
  4. Example 5.9 on page 242.

    1. The math is also relevant to filesystems where there is an underlying block size, like 4K. Some filesystems pack the partial last blocks of several files into one block.
  5. Example 5.12 on page 246.

  6. Section 5.3.1: Example 5.14 on page 247.

    This has mixed continuous - discrete random variables. The input signal X is 1 or -1. It is perturbed by noise N that is U[-2,2] to give the output Y. What is P[X=1|Y<=0]?

  7. Example 5.15 on page 251. CDF of joint uniform r.v.

    $$f_{X,Y} = \frac{1}{2\pi \sqrt{1-\rho^2}} \exp{\left(\frac{-{(x^2-2\rho x y + y^2)}}{{2(1-\rho^2)}}\right)}$$
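The two summations reviewed above can be checked with partial sums. A quick sketch with a = 1/2 (an assumed example value), where both formulas give 2:

```python
a = 0.5
s1 = sum(a**k for k in range(200))          # geometric series: 1/(1-a)
s2 = sum(k * a**k for k in range(200))      # weighted series: a/(1-a)^2

print(s1, 1 / (1 - a))        # both 2.0
print(s2, a / (1 - a) ** 2)   # both 2.0
```

Two hundred terms is far more than needed; the tail beyond that is astronomically small.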

Engineering Probability Class 16 Mon 2018-03-19

1   Review of normal distribution

  1. Review of the normal distribution. If $\mu=0, \sigma=1$ (to keep it simple), then: $$f_N(x) = \frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}} $$
  2. Show that $\int_{-\infty}^{\infty} f(x) dx =1$. This is example 4.21 on page 168.
  3. Iclicker: Consider a normal r.v. with $\mu=500, \sigma=100$. What is the probability of being in the interval [400,600]? Page 169 might be useful.
    1. .02
    2. .16
    3. .48
    4. .58
    5. .84
  4. Iclicker. Repeat that question for the interval [500,700].
  5. Iclicker. Repeat that question for the interval [0,300].
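The three iclicker intervals can be worked out by standardizing to $Z = (X-500)/100$ and reading F from the table. A quick sketch (computing the cdf directly rather than from the table):

```python
import math

def Phi(x):  # standard normal cdf
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def prob(lo, hi, mu=500, sigma=100):
    # P[lo < X < hi] after standardizing to Z = (X - mu)/sigma
    return Phi((hi - mu) / sigma) - Phi((lo - mu) / sigma)

print(prob(400, 600))   # Phi(1) - Phi(-1), about 0.68
print(prob(500, 700))   # Phi(2) - Phi(0), about 0.48
print(prob(0, 300))     # Phi(-2) - Phi(-5), about 0.02
```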

2   Chapter 5, Two Random Variables

  1. See intro I did in last class.
  2. Today's reading: Chapter 5, page 233-242.
  3. Review: An outcome is a result of a random experiment. It need not be a number. The set of all possible outcomes is the sample space. A random variable is a function mapping an outcome to a real number. An event is an interesting set of outcomes.
  4. Example 5.3 on page 235.
  5. Example 5.5 on page 238.
  6. Example 5.6 on page 240.
  7. Example 5.7 on page 241.
  8. Example 5.8 on page 242.

3   Next time

  1. Cdf of mixed continuous - discrete random variables: section 5.3.1 on page 247. The input signal X is 1 or -1. It is perturbed by noise N that is U[-2,2] to give the output Y. What is P[X=1|Y<=0]?

  2. Review and extend section 5.3.1, Example 5.14 on page 247.

  3. Independence: Example 5.22 on page 256. Are 2 normal r.v. independent for different values of $\rho$ ?

  4. Example 5.31 on page 264. This is a noisy comm channel, now with Gaussian (normal) noise. The problems are:

    1. what input signal to infer from each output, and
    2. how accurate is this?
  5. 5.6.2 Joint moments etc

    1. Work out for 2 3-sided dice.
    2. Work out for tossing dart onto triangular board.
  6. Example 5.27: correlation measures ''linear dependence''. If the dependence is more complicated, the variables may be dependent but not correlated.

  7. Covariance, correlation coefficient.

  8. Section 5.7, page 261. Conditional pdf. There is nothing majorly new here; it's an obvious extension of 1 variable.

    1. Discrete: Work out an example with a pair of 3-sided loaded dice.
    2. Continuous: a triangular dart board. There is one little trick: P[X=x]=0 since X is continuous, so how can we compute P[Y=y|X=x] = P[Y=y & X=x]/P[X=x]? The answer is that we take the limiting probability P[x<X<x+dx] etc. as dx shrinks, which nets out to using f(x) etc.
  9. Example 5.31 on page 264. This is a noisy comm channel, now with Gaussian (normal) noise. This is a more realistic version of the earlier example with uniform noise. The application problems are:

    1. what input signal to infer from each output,
    2. how accurate is this, and
    3. what cutoff minimizes this?

    In the real world there are several ways you could reduce that error:

    1. Increase the transmitted signal,
    2. Reduce the noise,
    3. Retransmit several times and vote.
    4. Handshake: Include a checksum and ask for retransmission if it fails.
    5. Instead of just deciding X=+1 or X=-1 depending on Y, have a 3rd decision, i.e., uncertain if $|Y|<0.5$, and ask for retransmission in that case.
  10. Section 5.8 page 271: Functions of two random variables.

    1. We already saw how to compute the pdf of the sum and max of 2 r.v.
  11. What's the point of transforming variables in engineering? E.g. in video, (R,G,B) might be transformed to (Y,I,Q) with a 3x3 matrix multiply. Y is brightness (mostly the green component). I and Q are approximately the red and blue. Since we see brightness more accurately than color hue, we want to transmit Y with greater precision. So, we want to do probabilities on all this.

  12. Functions of 2 random variables

    1. This is an important topic.
    2. Example 5.44, page 275. Transform two independent Gaussian r.v. from (X,Y) to (R, $\theta$).
    3. Linear transformation of two Gaussian r.v.
    4. Sum and difference of 2 Gaussian r.v. are independent.
  13. Section 5.9, page 278: pairs of jointly Gaussian r.v.

    1. I will simplify formula 5.61a by assuming that $\mu=0, \sigma=1$.

      $$f_{XY}(x,y)= \frac{1}{2\pi \sqrt{1-\rho^2}} e^{ \frac{-\left( x^2-2\rho x y + y^2\right)}{2(1-\rho^2)} } $$ .

    2. The r.v. are probably dependent. $\rho$ says how much.

    3. The formula degenerates if $|\rho|=1$ since the numerator and denominator are both zero. However the pdf is still valid. You could make the formula valid with l'Hopital's rule.

    4. The lines of equal probability density are ellipses.

    5. The marginal pdf is a 1 variable Gaussian.

  14. Example 5.47, page 282: Estimation of signal in noise

    1. This is our perennial example of signal and noise. However, here the signal is not just $\pm1$ but is normal. Our job is to find the ''most likely'' input signal for a given output.
  15. Next time: We've seen 1 r.v., we've seen 2 r.v. Now we'll see several r.v.
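The noisy-channel numbers in Example 5.31 are easy to check numerically. Here is a minimal sketch in Python (the course uses Matlab, but the arithmetic is the same); the noise level sigma = 0.5 is a made-up value for illustration:

```python
import math

# Numeric sketch of Example 5.31 above: transmit X = +1 or -1 (equally
# likely), receive Y = X + N with N Gaussian, mean 0, std dev sigma.
# sigma = 0.5 is an assumed value, just for illustration.

def q(z):
    """Standard normal tail probability Q(z) = P[Z > z]."""
    return 0.5 * math.erfc(z / math.sqrt(2))

sigma = 0.5

# With the decision threshold at 0, an error happens when the noise
# carries Y across 0, so P[error] = Q(1/sigma).
p_error = q(1 / sigma)
print(p_error)   # Q(2), about 0.023
```

Doubling the signal (or halving the noise) drops the error to Q(4), which is why "increase the transmitted signal" and "reduce the noise" head the list above.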

Engineering Probability Homework 7 due Thurs 2018-03-22 2359 EST

How to submit

Submit to LMS; see details in syllabus.

Questions

All questions are from the text.

Each part of a question is worth 5 points.

  1. 4.63 on page 221.
  2. 4.68 on page 222.
  3. 4.69 on page 222.
  4. 4.85 on page 223.
  5. 4.90 on page 223.
  6. 4.99 a and c on page 224.
  7. 4.126 on page 226. Assume that devices that haven't been used yet aren't failing.
  8. 4.133 (a) on page 227.

Total: 70 pts.

Engineering Probability Class 15 Thurs 2018-03-08

1   Grades

1.1   Exam 1

  1. We'll hand back Exam 1 today. Amelia will keep the exams that are not picked up.
  2. If you have a question, write it in an email, to Amelia and me, by March 19. Then, talk to her that week.

1.2   Knowitall points

  1. I (WRF) uploaded them to LMS.
  2. Write me about missing points, with a copy of your email to me mentioning the point that I ignored.

1.3   Iclicker

  1. 23 students (registered in LMS) have not registered iclickers. I emailed all of them. Register within 2 weeks if you want these points.

  2. Eight classes used iclickers. I assigned one point for each class where you answered at least one question, regardless of whether it was correct.

  3. The first version of the upload did not give credit for a day when all answers were wrong. That's been fixed.

  4. I uploaded these to LMS.

  5. I'll make manual corrections, e.g., for excused absences, later, since it's difficult to merge them in in a way that won't be reversed by the next download from iclicker.

  6. If you're having trouble registering, see me after class, and we'll try here:

    https://www.iclicker.com/remote-registration-form-for-classic

1.4   Piazza

  1. See the Syllabus, Section 8.4.
  2. For the course to date, I gave points for contributing at least twice.
  3. These were uploaded to LMS.

2   Future Iclicker grading policy

In future classes, to get credit for a day, you'll need to get at least one question right.

3   Tutorial on probability density

Since the meaning of probability density when you transform variables is still causing problems for some people, think of changing units from English to metric. First, with one variable, X.

  1. Let X be in feet and be U[0,1].

    $$f_X(x) = \begin{cases} 1& \text{if } 0\le x\le1\\ 0&\text{otherwise} \end{cases}$$

  2. $P[.5\le x\le .51] = 0.01$.

  3. Now change to centimeters. The transformation is $Y=30X$.

  4. $$f_Y(y) = \begin{cases} 1/30 & \text{if } 0\le y\le30\\ 0&\text{otherwise} \end{cases}$$

  5. Why is 1/30 reasonable?

  6. First, the pdf has to integrate to 1: $$\int_{-\infty}^\infty f_Y(y)\,dy =1$$

  7. Second, $$\begin{align} & P[.5\le x\le .51] \\ &= \int_{.5}^{.51} f_X(x) dx \\& =0.01 \\& = P[15\le y\le 15.3] \\& = \int_{15}^{15.3} f_Y(y) dy \end{align}$$
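The two probabilities above can be checked numerically; a minimal Python sketch (the course uses Matlab, but any language works here):

```python
# Check the unit-change example numerically: X ~ U[0,1] in feet, Y = 30 X in cm.
def f_x(x):
    return 1.0 if 0 <= x <= 1 else 0.0

def f_y(y):
    return 1/30 if 0 <= y <= 30 else 0.0

def integrate(f, a, b, n=10000):
    """Midpoint-rule numeric integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

p_x = integrate(f_x, 0.5, 0.51)    # P[0.5 <= X <= 0.51]
p_y = integrate(f_y, 15.0, 15.3)   # P[15 <= Y <= 15.3]
print(p_x, p_y)                    # both 0.01
```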

4   4.8 Reliability

  1. The reliability R(t) is the probability that the item is still functioning at t. R(t) = 1-F(t).

  2. What is the reliability of an exponential r.v.?

  3. The Mean Time to Failure (MTTF) is the expected lifetime, $E[T]$.

  4. ... for an exponential r.v.?
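
    A quick sketch of the answer, using $E[T]=\int_0^\infty R(t)\,dt$ for a nonnegative lifetime:

    $$MTTF = \int_0^\infty e^{-\lambda t}\,dt = \frac{1}{\lambda}$$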

  5. The failure rate is the probability that a widget that is still alive now dies in the next second.

  6. If the failure rate is constant, the distribution is exponential.

  7. The importance of getting the fundamentals (or foundations) right:

    (I mentioned this in an earlier class.)

    In the past 40 years, two major bridges in the Capital District have collapsed because of inadequate foundations. The Green Island Bridge collapsed on 3/15/77; see http://en.wikipedia.org/wiki/Green_Island_Bridge . The Thruway (I-90) bridge over Schoharie Creek collapsed on 4/5/87, killing 10 people; see http://cbs6albany.com/news/local/recalling-the-schoharie-bridge-collapse-30-years-later .

    Why RPI likes the Roeblings: none of their bridges collapsed. E.g., when designing the Brooklyn Bridge, Roebling Sr knew what he didn't know. He realized that something hung on cables might sway in the wind, in a complicated way that he couldn't analyze. So he added a lot of diagonal bracing. The designers of the original Tacoma Narrows Bridge were smart enough that they didn't need this expensive margin of safety.

  8. Another way to look at reliability: think of people.

    1. Your reliability R(t) is the probability that you live to age t, given that you were born alive. In the US, that's 98.7% for age 20, 96.4% for 40, 87.8% for 60.
    2. MTTF is your life expectancy at birth. In the US, that's 77.5 years.
    3. Your failure rate, r(t), is your probability of dying in the next dt, divided by dt, at different ages. E.g. for a 20-year-old, it's 0.13%/year for a male and 0.046%/year for a female http://www.ssa.gov/oact/STATS/table4c6.html . For 40-year-olds, it's 0.24% and 0.14%. For 60-year-olds, it's 1.2% and 0.7%. At 80, it's 7% and 5%. At 100, it's 37% and 32%.
  9. P190: If the failure rate is constant, then the distribution is exponential. We'll show this.
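
    Sketch: since $f(t) = -R'(t)$, the failure rate is $r(t) = f(t)/R(t) = -\frac{d}{dt}\ln R(t)$. If $r(t)=\lambda$ is constant, then with $R(0)=1$,

    $$R(t) = e^{-\lambda t}, \quad F(t) = 1-e^{-\lambda t}, \quad f(t) = \lambda e^{-\lambda t}$$

    i.e., the exponential distribution.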

  10. If several subsystems are all necessary, e.g., are in serial, then their reliabilities multiply. The result is less reliable.

    If only one of them is necessary, e.g. are in parallel, then their complementary reliabilities multiply. The result is more reliable.

    An application would be different types of RAIDs (Redundant Array of Inexpensive, or Independent, Disks). In one version you stripe a file over two hard drives to get increased speed, but decreased reliability. In another version you triplicate the file over three drives to get increased reliability. (You can also do a hybrid setup.)

    (David Patterson at Berkeley invented RAID (and also RISC). He intended I to mean Inexpensive. However he said that when this was commercialized, companies said that the I meant Independent.)
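The serial/parallel rules above take only a few lines to sketch; the per-disk reliability of 0.9 below is a made-up number for illustration:

```python
# Sketch of the serial/parallel reliability rules from the notes above.
def r_series(rs):
    """All components necessary: reliabilities multiply."""
    p = 1.0
    for r in rs:
        p *= r
    return p

def r_parallel(rs):
    """Any one component suffices: complementary reliabilities multiply."""
    q = 1.0
    for r in rs:
        q *= (1 - r)
    return 1 - q

r = 0.9                          # assumed per-disk reliability
print(r_series([r, r]))          # striping over 2 disks: 0.81, less reliable
print(r_parallel([r, r, r]))     # triplicating over 3 disks: 0.999, more reliable
```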

5   4.9 Generating r.v

Ignore. It's surprisingly hard to do right, and has been implemented in built-in routines. Use them.

6   4.10 Entropy

Ignore since it's starred.

7   Max of two r.v.

This is not in the text, but is an intro to multiple r.v.

  1. pdf and cdf of the max of 2 random variables:

    If Z=max(X,Y), with X and Y independent, then \(F_Z(x) = F_X(x) F_Y(x)\).

    E.g. if X and Y are U[0,1], so \(F_X(x) = x\) for 0<=x<=1, then \(F_Z(x) = x^2\).

    What are the pdf and mean here? What about the max of 3 r.v.? What about the min?

  2. Iclicker. What is the cdf (for 0<=x<=1) of the max of 3 r.v. that are each U[0,1]?

    1. x
    2. x^2
    3. x^3
    4. 1
    5. 0
  3. pdf of the sum of 2 r.v. If Z=X+Y, with X and Y independent, then \(f_Z(z) = \int_{-\infty}^\infty f_X(x) f_Y(z-x) dx\). E.g., if X and Y are U[0,1], then \(f_Z(z) =\) ?

    What is the mean?

  4. The pdf of the sum of two uniform r.v. is a hat function. It looks a little more like a normal distribution than the square uniform distribution did.

    The sum of 3 uniform r.v. would look even more normal, and so on.
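The claims about the max and the sum can be checked by simulation; a Python sketch (seed fixed so the run is repeatable):

```python
import random

# Monte Carlo check of the max and sum results above, for X, Y ~ U[0,1].
random.seed(0)
n = 200_000
xs = [random.random() for _ in range(n)]
ys = [random.random() for _ in range(n)]

# Max: F_Z(x) = x^2, so P[Z <= 0.5] = 0.25; also E[Z] = 2/3.
zmax = [max(x, y) for x, y in zip(xs, ys)]
p_max = sum(z <= 0.5 for z in zmax) / n
mean_max = sum(zmax) / n

# Sum: the hat pdf is z on [0,1] and 2-z on [1,2], so
# P[S <= 0.5] = 1/8 and E[S] = 1.
s = [x + y for x, y in zip(xs, ys)]
p_sum = sum(v <= 0.5 for v in s) / n
mean_sum = sum(s) / n

print(p_max, mean_max, p_sum, mean_sum)
```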

8   Chapter 5, Two Random Variables

  1. One experiment might produce two r.v. E.g.,
    1. Shoot an arrow; it lands at (x,y).
    2. Toss two dice.
    3. Measure the height and weight of people.
    4. Measure the voltage of a signal at several times.
  2. The definitions for pmf, pdf and cdf are reasonable extensions of one r.v.
  3. The math is messier.
  4. The two r.v. may be '''dependent''' and '''correlated'''.
  5. The '''correlation coefficient''', $\rho$, is a dimensionless measure of linear dependence. $-1\le\rho\le1$.
  6. $\rho$ may be 0 when the variables have a nonlinear dependent relation.
  7. Integrating (or summing) out one variable gives a marginal distribution.
  8. We'll do some simple examples:
    1. Toss two 4-sided dice.
    2. Toss two 4-sided ''loaded'' dice. The marginal pmfs are uniform.
    3. Pick a point uniformly in a square.
    4. Pick a point uniformly in a triangle. x and y are now dependent.
  9. The big example is a 2 variable normal distribution.
    1. The pdf is messier.
    2. It looks elliptical unless $\rho$=0.
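The first simple example, two fair 4-sided dice, is small enough to enumerate; a sketch of the joint pmf and a marginal:

```python
from fractions import Fraction

# Joint pmf of two fair 4-sided dice, and the marginal of X obtained by
# summing out Y -- the discrete analog of integrating out a variable.
p = Fraction(1, 16)
joint = {(x, y): p for x in range(1, 5) for y in range(1, 5)}

marginal_x = {x: sum(pr for (i, j), pr in joint.items() if i == x)
              for x in range(1, 5)}
print(marginal_x)   # each value is 1/4
```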

Engineering Probability Class 14 Mon 2018-03-05

1   How to find all these blog postings

Click the archive or tags button at the top of each posting.

2   Exam 1 grades

Here is a scatterplot showing the lack of correlation between the order in which a student finished exam 1 and the grade.

/images/exam1-scatter.png

3   Matlab review

This is my opinion of Matlab.

  1. Advantages
    1. Excellent quality numerical routines.
    2. Free at RPI.
    3. Many toolkits available.
    4. Uses parallel computers and GPUs.
    5. Interactive - you type commands and immediately see results.
    6. No need to compile programs.
  2. Disadvantages
    1. Very expensive outside RPI.
    2. Once you start using Matlab, you can't easily move away when their prices rise.
    3. You must force your data structures to look like arrays.
    4. Long programs must still be developed offline.
    5. Hard to write in Matlab's style.
    6. Programs are hard to read.
  3. Alternatives
    1. Free clones like Octave are not very good.
    2. The excellent math routines in Matlab are also available free in C++ libraries.
    3. With C++ libraries using template metaprogramming, your code looks like Matlab.
    4. They compile slowly.
    5. Error messages are inscrutable.
    6. Executables run very quickly.

4   Matlab ctd

Finish the examples from last Thurs.

5   Chapter 4 ctd

  1. Exponential r.v. page 166.
    1. Memoryless.
    2. $f(x) = \lambda e^{-\lambda x}$ if $x\ge0$, 0 otherwise.
    3. Example: time for a radioactive atom to decay.
  2. Gaussian r.v.
    1. $$f(x) = \frac{1}{\sqrt{2\pi} \cdot \sigma} e^{\frac{-(x-\mu)^2}{2\sigma^2}}$$
    2. The cdf is often called $\Phi(x)$.
    3. cdf complement:
      1. $$Q(x)=1-\Phi(x) = \int_x^\infty \frac{1}{\sqrt{2\pi}} e^{-t^2/2} dt$$ For general $\mu, \sigma$: $P[X>x] = Q\left(\frac{x-\mu}{\sigma}\right)$.
      2. E.g., if $\mu=500, \sigma=100$,
        1. P[x>400]=0.84
        2. P[x>500]=0.5
        3. P[x>600]=0.16
        4. P[x>700]=0.02
        5. P[x>800]=0.001
  3. Skip the other distributions (for now?).
  4. Example 4.22 page 169.
  5. Example 4.24 page 172.
  6. Functions of a r.v.: Example 4.29 page 175.
  7. Linear function: Example 4.31 on page 176.
  8. Markov and Chebyshev inequalities.
    1. Your web server averages 10 hits/second.
    2. It will crash if it gets 20 hits.
    3. By the Markov inequality, that has a probability at most 0.5.
    4. That is way way too conservative, but it makes no assumptions about the distribution of hits.
    5. For the Chebyshev inequality, assume that the variance is 10.
    6. It gives the probability of crashing at under 0.1. That is tighter.
    7. Assuming the distribution is Poisson with a=10, use Matlab 1-cdf('Poisson',20,10). That gives 0.0016.
    8. The more we assume, the better the answer we can compute.
    9. However, our assumptions had better be correct.
    10. (Editorial): In the real world, and especially economics, the assumptions are, in fact, often false. However, the models still usually work (at least, we can't prove they don't work). Until they stop working, e.g., https://en.wikipedia.org/wiki/Long-Term_Capital_Management . Jamie Dimon, head of JP Morgan, has observed that the market swings more widely than is statistically reasonable.
  9. Section 4.7, page 184, Transform methods: characteristic function.
    1. The characteristic function \(\Phi_X(\omega)\) of a pdf f(x) is like its Fourier transform.
    2. One application is that the moments of f can be computed from the derivatives of \(\Phi\).
    3. We will compute the characteristic functions of the uniform and exponential distributions.
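
      For the exponential pdf $f(x)=\lambda e^{-\lambda x}$, $x\ge0$, the computation is one integral:

      $$\Phi_X(\omega) = E[e^{j\omega X}] = \int_0^\infty \lambda e^{-\lambda x} e^{j\omega x}\,dx = \frac{\lambda}{\lambda-j\omega}$$

      and then, e.g., $E[X] = \frac{1}{j}\Phi_X'(0) = \frac{1}{\lambda}$.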
    4. The table of p 164-5 lists a lot of characteristic functions.
  10. For discrete nonnegative r.v., the moment generating function is more useful.
    1. It's like the Laplace transform.
    2. The pmf and moments can be computed from it.
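Two of the numeric claims above (the Gaussian tail values and the web-server bounds) can be rechecked with a short script; a Python sketch (the class examples use Matlab, but the numbers are the same):

```python
import math

# Check two sets of numbers from the notes above:
# (1) Gaussian tails for mu = 500, sigma = 100;
# (2) Markov / Chebyshev bounds vs. the exact Poisson tail for the
#     web-server example (mean 10 hits/s, crash at 20 hits).

def q(z):
    """Standard normal tail Q(z) = P[Z > z]."""
    return 0.5 * math.erfc(z / math.sqrt(2))

mu, sigma = 500, 100
tails = {x: q((x - mu) / sigma) for x in (400, 500, 600, 700, 800)}
print(tails)   # about 0.84, 0.5, 0.16, 0.023, 0.0013

mean, var, threshold = 10, 10, 20
markov = mean / threshold                  # 0.5
chebyshev = var / (threshold - mean)**2    # 0.1
# Exact: P[X > 20] for X ~ Poisson(10), i.e. 1 - cdf at 20,
# matching the Matlab call 1-cdf('Poisson',20,10) above.
exact = 1 - sum(math.exp(-mean) * mean**k / math.factorial(k)
                for k in range(threshold + 1))
print(markov, chebyshev, exact)            # 0.5, 0.1, about 0.0016
```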