Engineering Probability Class 20 Mon 2021-04-12

1 Textbook material

1.1 Min, max of 2 r.v.

  1. Example 5.43, page 274.
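For reference, the standard identities behind this (Example 5.43 may treat a particular case; the notation here is mine):

$$F_{\max(X,Y)}(w) = P[X \le w, Y \le w] = F_{X,Y}(w,w)$$

$$F_{\min(X,Y)}(w) = P[\{X \le w\} \cup \{Y \le w\}] = F_X(w) + F_Y(w) - F_{X,Y}(w,w)$$

When X and Y are independent, these become $F_X(w)\,F_Y(w)$ and $1-\bigl(1-F_X(w)\bigr)\bigl(1-F_Y(w)\bigr)$.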

1.2 Chapter 6: Vector random variables, page 303-

  1. Skip the starred sections.

  2. Examples:

    1. arrivals in a multiport switch,

    2. audio signal at different times.

  3. pmf, cdf, marginal pmf and cdf are obvious.

  4. The conditional pmf has a nice chaining rule, e.g., $p(x_1,x_2,x_3) = p(x_3|x_1,x_2)\, p(x_2|x_1)\, p(x_1)$.

  5. For continuous random variables, the pdf, cdf, conditional pdf etc are all obvious.

  6. Independence is obvious.

  7. Work out Example 6.5, page 306. The input ports are a distraction. This problem reduces to a multinomial probability where N is itself a random variable.
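A minimal simulation sketch of that reduction. The specific numbers (a Poisson packet count with mean 4 and three equally likely output ports) are my assumptions for illustration, not necessarily the textbook's setup:

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(0)

lam = 4          # assumed mean number of packets N (illustrative, not the textbook's value)
ports = 3        # assumed number of output ports, each equally likely
trials = 100_000

# Estimate P[exactly one packet at each of the 3 ports]
hits = 0
for _ in range(trials):
    n = rng.poisson(lam)                              # N is itself random
    counts = rng.multinomial(n, [1 / ports] * ports)  # given N, the port counts are multinomial
    if np.all(counts == 1):
        hits += 1

# Exact value: need N = 3, then the multinomial probability 3!/3^3
exact = exp(-lam) * lam**3 / factorial(3) * factorial(3) / ports**3
print("simulated:", hits / trials, " exact:", exact)
```

The point is only that conditioning on N gives an ordinary multinomial, and N's own pmf then averages over it.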

1.3 6.1.2 Joint Distribution Functions, ctd.

  1. Example 6.7 Multiplicative Sequence, p 308.

1.4 6.1.3 Independence, p 309

  1. Definition 6.16.

  2. Example 6.8 Independence, p. 309.

  3. Example 6.9 Maximum and Minimum of n Random Variables

    Apply this to uniform r.v.s; see the sketch after this list.

  4. Example 6.10 Merging of Independent Poisson Arrivals, p 310

  5. Example 6.11 Reliability of Redundant Systems

  6. Reminder for exponential r.v.:

    1. $f(x) = \lambda e^{-\lambda x}$

    2. $F(x) = 1-e^{-\lambda x}$

    3. $\mu = 1/\lambda$
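A quick numerical check of item 3 for uniform r.v.s (my own sketch): for $n$ i.i.d. uniform(0,1) variables, independence gives $P[\max \le x] = x^n$ and $P[\min > x] = (1-x)^n$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 5, 200_000
u = rng.random((trials, n))        # each row: n i.i.d. uniform(0,1) samples
x = 0.7

# Empirical vs. theoretical cdf of the max, and survival function of the min
print("P[max <= 0.7]:", np.mean(u.max(axis=1) <= x), " theory:", x**n)
print("P[min >  0.7]:", np.mean(u.min(axis=1) > x), " theory:", (1 - x)**n)
```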

1.5 6.2.2 Transformations of Random Vectors

  1. Let A be a 1 km cube in the atmosphere. Your coordinates are in km.

  2. Pick a point uniformly in it. $f_X(\vec{x}) = 1$.

  3. Now transform to use m, not km. Z=1000 X.

  4. $f_Z(\vec{z}) = \frac{1}{1000^3} f_X(\vec{z}/1000)$
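The factor comes from the general change-of-variables rule for an invertible linear map $\vec{Z} = A\vec{X}$, here with $A = 1000\, I_3$ (standard result, written in my notation):

$$f_{\vec{Z}}(\vec{z}) = \frac{1}{|\det A|}\, f_{\vec{X}}(A^{-1}\vec{z}) = \frac{1}{1000^3}\, f_{\vec{X}}\!\left(\frac{\vec{z}}{1000}\right)$$

The density drops by a factor of $10^9$ because the same total probability is now spread over $10^9$ cubic meters per cubic kilometer; the probability of any fixed region is unchanged.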

1.6 6.2.3 pdf of General Transformations

We skip Section 6.2.3. However, a historical note about Student's T distribution:

Student was the pseudonym of William Sealy Gosset, a statistician working for Guinness in Ireland. He developed several statistical techniques to sample beer to assure its quality. Guinness didn't let him publish under his real name because these techniques were trade secrets.

1.7 6.3 Expected values of vector random variables, p 318

  1. Section 6.3, page 316, extends the covariance to a matrix. Even with N variables, note that we're comparing only pairs of variables. If there were a complicated three-variable dependency, which could happen (and did in a much earlier example), all the pairwise covariances could still be 0.

  2. Note the sequence.

    1. First, the correlation matrix has the expectations of the products.

    2. Then the covariance matrix corrects for the means not being 0.

    3. Finally, the correlation coefficients (not shown here) correct for the variances not being 1.
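A small numerical sketch of that sequence; the synthetic data and variable names are mine, not the textbook's:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
# Two correlated variables plus an independent one, just to have something to measure
x1 = rng.normal(2.0, 1.0, n)
x2 = 0.5 * x1 + rng.normal(0.0, 1.0, n)
x3 = rng.normal(-1.0, 2.0, n)
X = np.column_stack([x1, x2, x3])    # rows = samples, columns = variables

R = X.T @ X / n                      # correlation matrix: entries are E[X_i X_j]
m = X.mean(axis=0)
K = R - np.outer(m, m)               # covariance matrix: corrects for the means not being 0
rho = K / np.sqrt(np.outer(np.diag(K), np.diag(K)))   # correlation coefficients: correct for the variances
print(np.round(R, 2), np.round(K, 2), np.round(rho, 2), sep="\n\n")
```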

1.9 Section 6.5, page 332: Estimation of random variables

  1. Assume that we want to know X but can only see Y, which depends on X.

  2. This is a generalization of our long-running noisy communication channel example. We'll do things a little more precisely now.

  3. Another application would be to estimate tomorrow's price of GOOG (X) given the prices to date (Y).

  4. Sometimes, but not always, we have a prior probability for X.

  5. For the communication channel we do; for GOOG we don't.

  6. If we do, it's a "maximum a posteriori estimator" (MAP).

  7. If we don't, it's a "maximum likelihood estimator" (ML). We effectively assume that the prior probability of X is uniform, even though that may not completely make sense.

  8. You toss a fair coin 3 times. X is the number of heads, from 0 to 3. Y is the position of the first head, from 0 to 3; if there are no heads, we'll say that the first head's position is 0. (The sketch after this list rebuilds this table by brute force.)

    (X,Y)    p(X,Y)
    (0,0)    1/8
    (1,1)    1/8
    (1,2)    1/8
    (1,3)    1/8
    (2,1)    2/8
    (2,2)    1/8
    (3,1)    1/8

    E.g., 1 head can occur 3 ways (out of 8): HTT, THT, TTH. In exactly one of those ways the first (and only) head is in position 1, so p(X=1, Y=1) = 1/8.

  9. Conditional probabilities:

    p(x|y)            y=0    y=1    y=2       y=3
    x=0               1      0      0         0
    x=1               0      1/4    1/2       1
    x=2               0      1/2    1/2       0
    x=3               0      1/4    0         0
    $g_{MAP}(y)$      0      2      1 or 2    1
    $P_{error}(y)$    0      1/2    1/2       0
    p(y)              1/8    1/2    1/4       1/8

    The total probability of error is $\sum_y p(y)\,P_{error}(y) = \tfrac{1}{2}\cdot\tfrac{1}{2} + \tfrac{1}{4}\cdot\tfrac{1}{2} = 3/8$.

  10. We observe Y and want to guess X from Y. E.g., if we observe $$\small y= \begin{pmatrix}0\\1\\2\\3\end{pmatrix} \text{then } x= \begin{pmatrix}0\\ 2 \text{ most likely} \\ 1, 2 \text{ equally likely} \\ 1 \end{pmatrix}$$

  11. There are different formulae. The above one was the MAP, maximum a posteriori probability.

    $$g_{\text{MAP}}(y) = \arg\max_x p_X(x|y) \text{ or } \arg\max_x f_X(x|y)$$

    That is, $g_{\text{MAP}}(y)$ is the value of $x$ that maximizes $p_X(x|y)$.

  12. What if we don't know p(x|y)? If we know p(y|x), we can use Bayes. We might measure p(y|x) experimentally, e.g., by sending many messages over the channel.

  13. Bayes requires p(x). What if we don't know even that? E.g. we don't know the probability of the different possible transmitted messages.

  14. Then use the maximum likelihood estimator, ML (illustrated in the sketch after this list). $$g_{\text{ML}}(y) = \arg\max_x p_Y(y|x) \text{ or } \arg\max_x f_Y(y|x)$$

  15. There are other estimators for different applications. E.g., regression using least squares might attempt to predict a graduate's QPA from his/her entering SAT scores. At Saratoga in August we might attempt to predict a horse's chance of winning a race from its speed in previous races. Some years ago, an Engineering Assoc Dean would do that each summer.

  16. Historically, IMO, some of the techniques, like least squares and logistic regression, have been used more because they're computationally easy than because they're logically justified.
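Here is a brute-force check of the coin example in items 8-11 and 14: a minimal Python sketch (my own, not the textbook's) that enumerates the 8 outcomes, rebuilds the joint pmf and the conditional table, applies the MAP and ML rules, and confirms the 3/8 error probability.

```python
from itertools import product
from collections import Counter, defaultdict

# Enumerate the 8 equally likely outcomes of 3 fair coin tosses.
joint, p_x = Counter(), defaultdict(float)
for toss in product("HT", repeat=3):
    x = toss.count("H")                               # X = number of heads
    y = toss.index("H") + 1 if "H" in toss else 0     # Y = position of first head (0 if none)
    joint[(x, y)] += 1 / 8
    p_x[x] += 1 / 8

p_y = defaultdict(float)
for (x, y), p in joint.items():
    p_y[y] += p

total_error = 0.0
for y in sorted(p_y):
    # MAP: maximize p(x|y) = p(x,y)/p(y)
    post = {x: joint[(x, y)] / p_y[y] for x in p_x if joint[(x, y)] > 0}
    x_map = max(post, key=post.get)
    p_err = 1 - post[x_map]
    total_error += p_y[y] * p_err

    # ML: maximize p(y|x) = p(x,y)/p(x); it ignores the prior p(x)
    lik = {x: joint[(x, y)] / p_x[x] for x in p_x if joint[(x, y)] > 0}
    x_ml = max(lik, key=lik.get)

    post_rounded = {x: round(p, 2) for x, p in post.items()}
    print(f"y={y}: p(x|y)={post_rounded}, g_MAP={x_map}, P_err={p_err:.2f}, g_ML={x_ml}")

print("total probability of error (MAP):", total_error)   # 0.375 = 3/8
```

Note that for y=2 the MAP rule is a tie (x=1 and x=2 are equally likely) and the code just picks one of them, and that the ML rule gives g_ML(1)=3 where MAP gave 2, because ML ignores how improbable 3 heads is.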

1.10 Central limit theorem etc

  1. Review: almost no matter what distribution the random variable X has, $F_{M_n}$, the cdf of the sample mean $M_n$ of $n$ samples of X, quickly becomes approximately Gaussian as $n$ increases. $n=5$ already gives a good approximation; see the sketch at the end of these notes.

  2. nice applets:

    1. http://onlinestatbook.com/stat_sim/normal_approx/index.html This shows how good the normal approximation to the binomial distribution is.

    2. http://onlinestatbook.com/stat_sim/sampling_dist/index.html This lets you define a distribution and take repeated samples of a given size. It shows how the means of the samples are distributed. For samples with more than a few observations, the means look fairly normal.

  3. Sample problems.

    1. Problem 7.1 on page 402.

    2. Problem 7.22.

    3. Problem 7.25.
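As a quick check of the $n=5$ claim in item 1 above, in the spirit of the applets: the sketch below (mine; the exponential distribution and the quantile comparison are just one convenient choice) standardizes means of samples of size 5 from a decidedly non-Gaussian distribution and compares a few quantiles with the standard normal.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 5, 50_000
samples = rng.exponential(scale=1.0, size=(reps, n))   # exponential(1) is skewed, not Gaussian
means = samples.mean(axis=1)                            # M_n for each repetition

# Standardize: exponential(1) has mu = 1, sigma = 1, so M_n has standard deviation 1/sqrt(n)
z = (means - 1.0) / (1.0 / np.sqrt(n))
for q in (0.05, 0.25, 0.5, 0.75, 0.95):
    print(f"quantile {q}: empirical {np.quantile(z, q):+.2f}")
# Standard normal quantiles for comparison: -1.64, -0.67, 0.00, +0.67, +1.64
```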