Engineering Probability Class 14 Mon 2018-03-05

1   How to find all these blog postings

Click the archive or tags button at the top of each posting.

2   Exam 1 grades

Here is a scatterplot showing the lack of correlation between the order in which a student finished exam 1 and the grade.

/images/exam1-scatter.png

3   Matlab review

This is my opinion of Matlab.

  1. Advantages
    1. Excellent quality numerical routines.
    2. Free at RPI.
    3. Many toolkits available.
    4. Uses parallel computers and GPUs.
    5. Interactive - you type commands and immediately see results.
    6. No need to compile programs.
  2. Disadvantages
    1. Very expensive outside RPI.
    2. Once you start using Matlab, you can't easily move away when their prices rise.
    3. You must force your data structures to look like arrays.
    4. Long programs must still be developed offline.
    5. Hard to write in Matlab's style.
    6. Programs are hard to read.
  3. Alternatives
    1. Free clones like Octave are not very good.
    2. The excellent math routines in Matlab are also available free in C++ libraries.
    3. With C++ libraries using template metaprogramming, your code looks like Matlab.
    4. However, such code compiles slowly.
    5. Error messages are inscrutable.
    6. Executables run very quickly.

4   Matlab ctd

Finish the examples from last Thursday.

5   Chapter 4 ctd

  1. Exponential r.v. page 166.
    1. Memoryless.
    2. $f(x) = \lambda e^{-\lambda x}$ if $x\ge0$, 0 otherwise.
    3. Example: time for a radioactive atom to decay.
  2. Gaussian r.v.
    1. $$f(x) = \frac{1}{\sqrt{2\pi} \cdot \sigma} e^{\frac{-(x-\mu)^2}{2\sigma^2}}$$
    2. cdf often called $\Psi(x)$
    3. cdf complement:
      1. $$Q(x)=1-\Psi(x) = \int_x^\infty \frac{1}{\sqrt{2\pi} \cdot \sigma} e^{\frac{-(t-\mu)^2}{2\sigma^2}} dt$$
      2. E.g., if $\mu=500, \sigma=100$,
        1. P[x>400]=0.84
        2. P[x>500]=0.5
        3. P[x>600]=0.16
        4. P[x>700]=0.02
        5. P[x>800]=0.001
  3. Skip the other distributions (for now?).
  4. Example 4.22 page 169.
  5. Example 4.24 page 172.
  6. Functions of a r.v.: Example 4.29 page 175.
  7. Linear function: Example 4.31 on page 176.
  8. Markov and Chebyshev inequalities.
    1. Your web server averages 10 hits/second.
    2. It will crash if it gets 20 hits.
    3. By the Markov inequality, $P[X\ge20] \le E[X]/20 = 10/20 = 0.5$.
    4. That is way way too conservative, but it makes no assumptions about the distribution of hits.
    5. For the Chebyshev inequality, assume that the variance is 10.
    6. Then $P[X\ge20] \le P[|X-10|\ge10] \le \sigma^2/10^2 = 10/100 = 0.1$. That is tighter.
    7. Assuming the distribution is Poisson with a=10, use Matlab 1-cdf('Poisson',20,10). That gives 0.0016.
    8. The more we assume, the better the answer we can compute.
    9. However, our assumptions had better be correct.
    10. (Editorial): In the real world, and especially in economics, the assumptions are in fact often false. However, the models still usually work (at least, we can't prove that they don't). Until they stop working, e.g., https://en.wikipedia.org/wiki/Long-Term_Capital_Management . Jamie Dimon, head of JP Morgan, has observed that the market swings more widely than is statistically reasonable.
  9. Section 4.7, page 184, Transform methods: characteristic function.
    1. The characteristic function \(\Phi_X(\omega) = E[e^{j\omega X}] = \int_{-\infty}^\infty f(x) e^{j\omega x}\,dx\) of a pdf f(x) is its Fourier transform (with the sign of the exponent reversed).
    2. One application is that the moments of f can be computed from the derivatives of \(\Phi\).
    3. We will compute the characteristic functions of the uniform and exponential distributions.
    4. The table on pp. 164-165 lists many characteristic functions.
  10. For a discrete nonnegative r.v., the probability generating function is more useful.
    1. It's like the z-transform of the pmf.
    2. The pmf and moments can be computed from it.
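The memoryless property of the exponential r.v. above is easy to check numerically. Here is a minimal sketch in Python (the course uses Matlab, but Python's standard library suffices); the rate λ = 0.5 and the times s, t are illustrative choices, not values from the notes:

```python
import math

lam = 0.5                       # rate lambda (illustrative choice)

def sf(x):
    # survival function of Exponential(lam): P[X > x] = e^(-lambda*x)
    return math.exp(-lam * x)

s, t = 2.0, 3.0                 # arbitrary positive times
# Memoryless: P[X > s+t | X > s] = P[X > s+t] / P[X > s] should equal P[X > t]
lhs = sf(s + t) / sf(s)
rhs = sf(t)
print(lhs, rhs)                 # both are e^(-1.5) = 0.22313...
assert abs(lhs - rhs) < 1e-12
```

The same identity holds for any s, t > 0, and the exponential is the only continuous distribution with this property.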
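The Gaussian tail probabilities in the μ=500, σ=100 example can be reproduced from the complementary error function; a sketch in Python (math.erfc is in the standard library):

```python
import math

mu, sigma = 500, 100

def Q(x):
    # Gaussian tail probability P[X > x], via the complementary error function
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2)))

for x in (400, 500, 600, 700, 800):
    print(x, round(Q(x), 4))
# 400 0.8413, 500 0.5, 600 0.1587, 700 0.0228, 800 0.0013
```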
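The web-server comparison of the Markov bound, the Chebyshev bound, and the exact Poisson tail can also be sketched in Python, as an alternative to the Matlab call 1-cdf('Poisson',20,10) quoted above:

```python
import math

mean = 10        # average hits/second
var = 10         # assumed variance (also the Poisson mean)
a = 20           # crash threshold

markov = mean / a                        # P[X >= a] <= E[X]/a
chebyshev = var / (a - mean) ** 2        # P[X >= a] <= P[|X-10| >= 10] <= var/10^2

# Exact Poisson tail, matching Matlab's 1-cdf('Poisson',20,10):
# P[X > 20] = 1 - sum_{k=0..20} e^(-10) 10^k / k!
tail = 1 - sum(math.exp(-mean) * mean**k / math.factorial(k)
               for k in range(a + 1))

print(markov, chebyshev, round(tail, 4))   # 0.5 0.1 0.0016
```

Each added assumption (a mean, then a variance, then the full distribution) tightens the answer by an order of magnitude or more.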
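The claim that moments come from derivatives of the characteristic function can be checked for the exponential distribution, whose characteristic function is $\Phi_X(\omega)=\lambda/(\lambda-j\omega)$ (from the book's table). A numerical sketch in Python using central finite differences; λ = 2 is an illustrative choice:

```python
lam = 2.0

def phi(w):
    # characteristic function of Exponential(lam): lambda / (lambda - j*w)
    return lam / (lam - 1j * w)

h = 1e-5
# E[X]   = phi'(0)  / j    -> should be 1/lambda   = 0.5
# E[X^2] = phi''(0) / j^2  -> should be 2/lambda^2 = 0.5
EX = ((phi(h) - phi(-h)) / (2 * h) / 1j).real
EX2 = -((phi(h) - 2 * phi(0) + phi(-h)) / h ** 2).real
print(EX, EX2)
assert abs(EX - 1 / lam) < 1e-6
assert abs(EX2 - 2 / lam ** 2) < 1e-3
```

In general $E[X^n] = \Phi_X^{(n)}(0)/j^n$; the finite differences here just approximate those derivatives.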