Engineering Probability Class 11 Thu 2022-02-17

1 Midterm exam 1: Feb 28

  1. Reminder that it will be on Mon Feb 28

  2. In class,

    1. unless you are quarantined etc.

    2. If so, remind me before, and you'll do it remotely.

  3. It will use gradescope.

  4. So bring your computers.

  5. If you have an accommodation, we'll move you to my lab for the rest of the time.

  6. The exam is not intended to be a speed contest, but opinions may differ.

  7. FWIW, when I graph finish time vs grade, there has been no correlation.

  8. You are allowed one 2-sided, 8.5"x11" crib sheet. It may be produced mechanically. You may form consortia and mass produce and sell crib sheets. If so, give me a copy.

  9. You are expected not to use your computers for anything except putting answers into gradescope.

2 In-class vs remote lectures

  1. I don't personally care which you do.

  2. Except exams, which are in person.

  3. Suggestions for improving the technical quality of the videos are welcome.

  4. However I've already thought of most of the obvious things.

  5. I don't want to reduce the quality for the students that do attend in person.

3 Continuing Chapter 4

  1. Text 4.2 p 148 pdf

  2. Simple continuous r.v. examples: uniform, exponential.

  3. The exponential distribution complements the Poisson distribution. The Poisson describes the number of arrivals per unit time. The exponential describes the distribution of the times between consecutive arrivals.

    The exponential is the continuous analog to the geometric. If the random variable is the integral number of seconds, use geometric. If the r.v. is the real number time, use exponential.

    Ex 4.7 p 150: exponential r.v.

  4. Properties

    1. Memoryless.

    2. \(f(x) = \lambda e^{-\lambda x}\) if \(x\ge0\), 0 otherwise.

    3. Example: time for a radioactive atom to decay.

  5. Skip 4.2.1 for now.

  6. The most common continuous distribution is the normal distribution.

  7. 4.2.2 p 152. Conditional probabilities work the same with continuous distributions as with discrete distributions.

  8. p 154. Gaussian r.v.

    1. \(f(x) = \frac{1}{\sqrt{2\pi} \cdot \sigma} e^{\frac{-(x-\mu)^2}{2\sigma^2}}\)

    2. cdf often called \(\Psi(x)\)

    3. cdf complement:

      1. \(Q(x)=1-\Psi(x) = \int_x^\infty \frac{1}{\sqrt{2\pi} \cdot \sigma} e^{\frac{-(t-\mu)^2}{2\sigma^2}} dt\)

      2. E.g., if \(\mu=500, \sigma=100\),

        1. P[x>400]=0.84

        2. P[x>500]=0.5

        3. P[x>600]=0.16

        4. P[x>700]=0.02

        5. P[x>800]=0.001

  9. Text 4.3 p 156 Expected value

  10. Skip the other distributions (for now?).
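
The Gaussian tail probabilities in item 8 can be checked numerically. This is a minimal sketch in Python (my choice of language, standard library only), using the identity \(P[X>x] = \frac{1}{2}\,\mathrm{erfc}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\). The function name `gaussian_tail` is made up for this illustration.

```python
import math

def gaussian_tail(x, mu, sigma):
    """P[X > x] for X ~ Normal(mu, sigma^2), via the complementary error function."""
    z = (x - mu) / sigma          # standardize
    return 0.5 * math.erfc(z / math.sqrt(2))

# Same parameters as the example above: mu = 500, sigma = 100.
for x in (400, 500, 600, 700, 800):
    print(x, gaussian_tail(x, 500, 100))
```

Standardizing first means the same function covers every \(\mu\) and \(\sigma\); only the distance from the mean in units of \(\sigma\) matters.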

4 Examples

4.11, p153.

5 4.3.2 Variance

p160

6 Memoryless Exponential Distn

p 166.
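
Memoryless means \(P[X>s+t \mid X>s] = P[X>t]\): having already waited s tells you nothing about the remaining wait. Here is a small numeric check in Python, a sketch only. The rate and the times s, t are arbitrary values I picked, and `exp_survival` is a made-up helper name.

```python
import math

def exp_survival(t, lam):
    """P[X > t] for X ~ Exponential(lam); equals 1 for t < 0."""
    return math.exp(-lam * t) if t >= 0 else 1.0

lam = 0.5          # assumed rate (arrivals per second)
s, t = 3.0, 2.0    # assumed: already waited s, ask about t more

# Conditional probability P[X > s+t | X > s] = P[X > s+t] / P[X > s].
conditional = exp_survival(s + t, lam) / exp_survival(s, lam)
unconditional = exp_survival(t, lam)
print(conditional, unconditional)   # the two agree up to rounding
```

The agreement is just \(e^{-\lambda(s+t)}/e^{-\lambda s} = e^{-\lambda t}\).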

7 4.4.3 Normal (Gaussian) dist

p 167.

Show that the pdf integrates to 1.
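
One standard way to do this (a sketch, not necessarily the text's route): substitute \(z=(x-\mu)/\sigma\), so it suffices to show that

$I = \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi}} e^{-z^2/2}\, dz = 1$

Square it and switch to polar coordinates:

$I^2 = \frac{1}{2\pi} \int_{-\infty}^\infty \int_{-\infty}^\infty e^{-(y^2+z^2)/2}\, dy\, dz = \frac{1}{2\pi} \int_0^{2\pi} \int_0^\infty e^{-r^2/2}\, r\, dr\, d\theta = \frac{1}{2\pi} \cdot 2\pi \cdot 1 = 1$

Since \(I>0\), \(I=1\).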

Lots of different notations:

Generally, F(x) = P(X<=x).

For normal: that is called $\Psi(x)$ .

$Q(x) = 1-\Psi(x)$ .

Example 4.22 page 169.

8 4.4.4 Gamma r.v.

  1. 2 parameters

  2. Has several useful special cases, e.g., chi-squared and m-Erlang.

  3. The sum of m independent exponential r.v.s with the same rate has the m-Erlang dist.

  4. Example 4.24 page 172.
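
To see the m-Erlang claim in action, here is a sketch in Python that compares the closed-form m-Erlang cdf, \(F(x) = 1 - e^{-\lambda x}\sum_{k=0}^{m-1} \frac{(\lambda x)^k}{k!}\), against a simulation that adds up m independent exponentials. The values of m, the rate, and x are arbitrary choices of mine, and the function names are made up for this illustration.

```python
import math
import random

def erlang_cdf(x, m, lam):
    """P[S <= x] where S has the m-Erlang distribution with rate lam."""
    if x <= 0:
        return 0.0
    partial = sum((lam * x) ** k / math.factorial(k) for k in range(m))
    return 1.0 - math.exp(-lam * x) * partial

def simulated_cdf(x, m, lam, trials=50_000, seed=0):
    """Estimate P[X_1 + ... + X_m <= x] for independent Exponential(lam) X_i."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    hits = sum(
        sum(rng.expovariate(lam) for _ in range(m)) <= x
        for _ in range(trials)
    )
    return hits / trials

m, lam, x = 3, 2.0, 1.5   # assumed illustrative values
exact = erlang_cdf(x, m, lam)
estimate = simulated_cdf(x, m, lam)
print(exact, estimate)    # should agree to about two decimal places
```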

9 Functions of a r.v.

  1. Example 4.29 page 175.

  2. Linear function: Example 4.31 on page 176.
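
For a linear function \(Y = aX + b\) with \(a \ne 0\), the pdf is \(f_Y(y) = \frac{1}{|a|} f_X\left(\frac{y-b}{a}\right)\). The sketch below (Python, with made-up helper names and parameter values of my choosing, not taken from the text's example) checks this for a Gaussian X, where Y should again be Gaussian with mean \(a\mu+b\) and standard deviation \(|a|\sigma\).

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian pdf with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma)

def pdf_of_linear(y, a, b, f_x):
    """pdf of Y = a*X + b at y, given the pdf f_x of X (requires a != 0)."""
    if a == 0:
        raise ValueError("a must be nonzero")
    return f_x((y - b) / a) / abs(a)

mu, sigma = 500.0, 100.0   # assumed parameters for X
a, b = -0.01, 5.0          # assumed: chosen so that Y is standard normal

def f_x(x):
    return normal_pdf(x, mu, sigma)

for y in (-2.0, 0.0, 1.5):
    via_formula = pdf_of_linear(y, a, b, f_x)
    direct = normal_pdf(y, a * mu + b, abs(a) * sigma)
    print(y, via_formula, direct)   # the two columns agree up to rounding
```

The \(1/|a|\) factor is what keeps the total area equal to 1 after the axis is stretched by a.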

10 Markov and Chebyshev inequalities (Section 4.6, page 181)

  1. Your web server averages 10 hits/second.

  2. It will crash if it gets 20 hits.

  3. By the Markov inequality, that has probability at most \(10/20 = 0.5\).

  4. That is way way too conservative, but it makes no assumptions about the distribution of hits.

  5. For the Chebyshev inequality, assume that the variance is 10.

  6. It bounds the probability of being at least 10 away from the mean, and so the probability of crashing, by \(10/10^2 = 0.1\). That is tighter.

  7. Assuming the distribution is Poisson with a=10, use Matlab 1-cdf('Poisson',20,10), which computes P[X>20]. That gives 0.0016.

  8. The more we assume, the better the answer we can compute.

  9. However, our assumptions had better be correct.

  10. (Editorial): In the real world, and especially economics, the assumptions are, in fact, often false. However, the models still usually work (at least, we can't prove they don't work). Until they stop working, e.g., https://en.wikipedia.org/wiki/Long-Term_Capital_Management . Jamie Dimon, head of JP Morgan, has observed that the market swings more widely than is statistically reasonable.
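
The three numbers for the web server can be reproduced in a few lines. This is a sketch in Python rather than Matlab (my substitution, standard library only); `poisson_cdf` is a made-up helper that sums the pmf directly. Note that 1-cdf at 20 is P[X>20]; if "gets 20 hits" means 20 or more, the tail starts one term earlier.

```python
import math

mean_hits = 10.0    # average hits per second
variance = 10.0     # assumed for Chebyshev (also the Poisson variance)
threshold = 20      # the server crashes at this many hits

# Markov: P[X >= c] <= E[X] / c, for nonnegative X.
markov_bound = mean_hits / threshold

# Chebyshev: P[|X - m| >= d] <= variance / d^2, with d = threshold - mean.
chebyshev_bound = variance / (threshold - mean_hits) ** 2

def poisson_cdf(k, a):
    """P[X <= k] for X ~ Poisson(a), by summing the pmf."""
    return sum(math.exp(-a) * a ** i / math.factorial(i) for i in range(k + 1))

tail_gt_20 = 1 - poisson_cdf(20, mean_hits)   # P[X > 20], like the Matlab call
tail_ge_20 = 1 - poisson_cdf(19, mean_hits)   # P[X >= 20]

print(markov_bound, chebyshev_bound, tail_gt_20, tail_ge_20)
```

Each step down uses one more assumption: Markov needs only the mean, Chebyshev also the variance, and the Poisson answer needs the whole distribution.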

11 Reliability (section 4.8, page 189)

  1. The reliability R(t) is the probability that the item is still functioning at t. R(t) = 1-F(t).

  2. What is the reliability of an exponential r.v.? ( $F(t)=1-e^{-\lambda t}$ ).

  3. The Mean Time to Failure (MTTF) is obvious. The equation near the top of page 190 should be

    $E[T] = \int_0^\infty \textbf{t} f(t) dt$

  4. ... for an exponential r.v.?

  5. The failure rate r(t) is the probability that a widget that is still alive at t dies in the next dt, divided by dt. That is, $r(t) = f(t)/R(t)$.

  6. The importance of getting the fundamentals (or foundations) right:

    In the past 50 years, two major bridges in the Capital District have collapsed because of inadequate foundations. The Green Island Bridge collapsed on 3/15/77, see http://en.wikipedia.org/wiki/Green_Island_Bridge . The Thruway (I-90) bridge over Schoharie Creek collapsed on 4/5/87, killing 10 people, see http://cbs6albany.com/news/local/recalling-the-schoharie-bridge-collapse-30-years-later .

    Why RPI likes the Roeblings: none of their bridges collapsed. E.g., when designing the Brooklyn Bridge, Roebling Sr knew what he didn't know. He realized that something hung on cables might sway in the wind, in a complicated way that he couldn't analyze. So he added a lot of diagonal bracing. The designers of the original Tacoma Narrows Bridge were smart enough that they didn't need this expensive margin of safety.

  7. Another way to look at reliability: think of people.

    1. Your reliability R(t) is the probability that you live to age t, given that you were born alive. In the US, that's 98.7% for age 20, 96.4% for 40, 87.8% for 60.

    2. MTTF is your life expectancy at birth. In the US, that's 77.5 years.

    3. Your failure rate, r(t), is your probability of dying in the next dt, divided by dt, at different ages. E.g. for a 20-year-old, it's 0.13%/year for a male and 0.046%/year for a female http://www.ssa.gov/oact/STATS/table4c6.html . For 40-year-olds, it's 0.24% and 0.14%. For 60-year-olds, it's 1.2% and 0.7%. At 80, it's 7% and 5%. At 100, it's 37% and 32%.

  8. Example 4.47, page 190. If the failure rate is constant, the distribution is exponential.

  9. If several subsystems are all necessary, i.e., they are in series, then their reliabilities multiply. The result is less reliable.

    If only one of them is necessary, i.e., they are in parallel, then their complementary reliabilities (their failure probabilities) multiply. The result is more reliable.

    An application would be different types of RAIDs (Redundant Array of Inexpensive, or Independent, Disks). In one version you stripe a file over two hard drives to get increased speed, but decreased reliability. In another version you triplicate the file over three drives to get increased reliability. (You can also do a hybrid setup.)

    (David Patterson at Berkeley invented RAID (and also RISC). He intended I to mean Inexpensive. However he said that when this was commercialized, companies said that the I meant Independent.)

  10. Example 4.49 page 193, reliability of series subsystems.

  11. Example 4.50 page 193, increased reliability of parallel subsystems.
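
Here is a sketch in Python tying the reliability pieces together, assuming independent subsystems with exponential lifetimes. The function names and the failure rate are made up for this illustration. It shows \(R(t)=e^{-\lambda t}\), the constant failure rate \(f(t)/R(t)=\lambda\), and the series and parallel rules applied to the two RAID setups above.

```python
import math

def exp_reliability(t, lam):
    """R(t) = 1 - F(t) = exp(-lam * t) for an exponential lifetime."""
    return math.exp(-lam * t)

def exp_failure_rate(t, lam):
    """r(t) = f(t) / R(t); for the exponential this is lam at every t."""
    pdf = lam * math.exp(-lam * t)
    return pdf / exp_reliability(t, lam)

def series_reliability(rs):
    """All subsystems are needed: the reliabilities multiply."""
    return math.prod(rs)

def parallel_reliability(rs):
    """One working subsystem suffices: the failure probabilities multiply."""
    return 1.0 - math.prod(1.0 - r for r in rs)

lam = 0.1                            # assumed failure rate, per year
r_disk = exp_reliability(1.0, lam)   # one disk surviving one year

striped = series_reliability([r_disk, r_disk])              # file striped over 2 disks
mirrored = parallel_reliability([r_disk, r_disk, r_disk])   # file triplicated on 3 disks
print(r_disk, striped, mirrored)
print(exp_failure_rate(0.5, lam), exp_failure_rate(7.0, lam))   # both equal lam
```

For the exponential, the MTTF works out to \(\int_0^\infty t\,\lambda e^{-\lambda t}\,dt = 1/\lambda\), so the assumed rate above corresponds to a 10-year mean lifetime.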

12 4.9 Generating r.v.

Ignore. It's surprisingly hard to do right, and has been implemented in builtin routines. Use them.

13 4.10 Entropy

Ignore since it's starred.

14 Xkcd comic

Frequentists vs. Bayesians