Engineering Probability Class 2 Thu 2021-01-28

2 Probability in the real world - enrichment

  1. How Did Economists Get It So Wrong? is an article by Paul Krugman (2008 Nobel Memorial Prize in Economic Science). It says, "the economics profession went astray because economists, as a group, mistook beauty, clad in impressive-looking mathematics, for truth." You might see a certain relevance to this course. You have to get the model right before trying to solve it.

    Though I don't know much about it, I'll cheerfully try to answer any questions about econometrics.

    Another relevance to this course, in an enrichment sense, is that some people believe that the law of large numbers does not apply to certain variables, like stock prices. They think that the averages of larger and larger samples do not converge, because the underlying distribution is heavy-tailed, with infinite variance. This also is beyond this course.

3 Chapter 1 ctd

  1. The Rossman-Chance coin toss applet demonstrates how the observed frequencies converge (slowly) to the theoretical probability; a minimal simulation of the same idea follows.
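
    A minimal simulation of that convergence, assuming only Python's standard random module (the seed and flip counts are arbitrary choices for this sketch):

    ```python
    import random

    random.seed(1)                         # make this run reproducible
    heads = 0
    for n in range(1, 100001):
        heads += random.random() < 0.5     # one fair coin flip (True counts as 1)
        if n in (10, 100, 1000, 10000, 100000):
            print(f"{n:>6} flips: relative frequency of heads = {heads / n:.4f}")
    ```

    The relative frequency wanders for small n and only slowly settles near 0.5, which is the point of the applet.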

  2. Example of unreliable channel (page 12)

    1. Want to transmit a bit: 0, 1

    2. It arrives wrong with probability e, say 0.001

    3. Idea: transmit each bit 3 times and vote.

      1. 000 -> 0

      2. 001 -> 0

      3. 011 -> 1

    4. 3 bits arrive correct with probability \((1-e)^3\) = 0.997002999

    5. 1 error with probability \(3(1-e)^2e\) = 0.002994003

    6. 2 errors with probability \(3(1-e)e^2\) = 0.000002997

    7. 3 errors with probability \(e^3\) = 0.000000001

    8. corrected bit is correct if 0 or 1 errors, with probability \((1-e)^3+3(1-e)^2e\) = 0.999997002

    9. We reduced the probability of error from \(10^{-3}\) to about \(3\times10^{-6}\), a factor of more than 300.

    10. Cost: triple the transmission plus a little logic HW. (A numeric check of these values follows this list.)
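
    A minimal numeric check of those values, assuming independent bit errors with probability e (the variable names are just for this sketch):

    ```python
    # Triple-repetition code with majority voting; bit errors assumed independent.
    e = 0.001                          # per-bit error probability

    p0 = (1 - e)**3                    # no errors
    p1 = 3 * (1 - e)**2 * e            # exactly 1 error
    p2 = 3 * (1 - e) * e**2            # exactly 2 errors
    p3 = e**3                          # 3 errors

    p_ok = p0 + p1                     # majority vote still gives the right bit
    print(p0, p1, p2, p3)              # 0.997002999  0.002994003  2.997e-06  1e-09
    print("P[corrected bit correct] =", p_ok)        # ~0.999997002
    print("error probability:", e, "->", 1 - p_ok)   # 0.001 -> ~3e-06
    ```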

  3. Example of text compression (page 13)

    1. Simple way: Use 5 bits for each letter: A=00000, B=00001

    2. In English, 'E' common, 'Q' rare

    3. Use fewer bits for E than Q.

    4. Morse code did this 170 years ago.

      1. E = .

      2. Q = _ _ . _

    5. Aside: An expert Morse coder is faster than texting.

    6. English can be compressed to about 1 bit per letter (with difficulty); 2 bits per letter is easy. (A toy illustration of variable-length coding follows this list.)

    7. Aside: there is so much structure in English text that if you add (XOR) the bit strings for 2 different texts bit-by-bit, both texts can usually be mostly reconstructed.

    8. That's how cryptanalysis works.
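
    A toy illustration of the variable-length idea, with an invented 4-letter alphabet, invented frequencies, and an invented prefix code (none of these are real English statistics):

    ```python
    # Hypothetical alphabet, frequencies, and prefix code (all invented for illustration).
    freq = {'E': 0.60, 'T': 0.25, 'Q': 0.10, 'Z': 0.05}
    code = {'E': '0', 'T': '10', 'Q': '110', 'Z': '111'}   # no codeword is a prefix of another

    fixed = 2                                              # 4 letters need 2 bits each, fixed-length
    avg = sum(freq[c] * len(code[c]) for c in freq)        # expected bits per letter
    print("fixed-length:", fixed, "bits/letter")
    print("variable-length:", avg, "bits/letter")          # 0.6*1 + 0.25*2 + 0.10*3 + 0.05*3 = 1.55
    ```

    With these made-up frequencies the average is 1.55 bits per letter versus 2 bits fixed-length; real compressors exploit the same idea with real letter (and word) statistics.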

  4. Example of reliable system design (page 13)

    1. Nuclear power plant fails if

      1. water leaks

      2. and operator asleep (a surprising number of disasters happen in the graveyard shift).

      3. and backup pump fails

      4. or was turned off for maintenance

    2. What's the probability of failure? This depends on the probabilities of the various failure modes. Those might be impossible to determine accurately. (A sketch with invented numbers follows this list.)

    3. Design a better system? Coal mining kills.

    4. The backup procedures themselves can cause problems (and are almost impossible to test). A failure with the recovery procedure was part of the reason for a Skype outage.
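
    A sketch of how such a failure probability could be combined if the failure modes were independent, using invented probabilities (as noted above, the real numbers may be impossible to determine):

    ```python
    # Invented, assumed-independent failure probabilities (for illustration only).
    p_leak   = 1e-3        # water leaks
    p_asleep = 1e-2        # operator asleep
    p_pump   = 1e-2        # backup pump fails
    p_maint  = 1e-3        # backup pump turned off for maintenance

    # Backup is unavailable if it fails OR it is off for maintenance.
    p_backup_out = p_pump + p_maint - p_pump * p_maint

    # Plant fails if the water leaks AND the operator is asleep AND the backup is unavailable.
    p_fail = p_leak * p_asleep * p_backup_out
    print("P[failure] =", p_fail)       # about 1.1e-07 with these made-up numbers
    ```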

4 Chapter 2

  1. A random experiment (page 21) has 2 parts:

    1. experimental procedure

    2. set of measurements

  2. A random experiment may have subexperiments and sequences of experiments.

  3. Outcome or sample point \(\zeta\): a non-decomposable observation.

  4. Sample space S: set of all outcomes

  5. \(|S|\):

    1. finite, e.g. {H,T}, or

    2. discrete = countably infinite, e.g., 1, 2, 3, 4, ... (sometimes "discrete" also includes finite sets), or

    3. uncountable, e.g., \(\Re\), aka continuous.

  6. Types of infinity:

    1. Some sets have finite size, e.g., 2 or 6.

    2. Other sets have infinite size.

    3. Those are either countable or uncountable.

    4. A countably infinite set can be arranged in order so that its elements can be numbered 1,2,3,...

    5. The set of natural numbers is obviously countable.

    6. The set of rational numbers in \((0,1]\) is also countable. You can order them by increasing denominator: \(\frac{1}{1}, \frac{1}{2}, \frac{1}{3}, \frac{2}{3}, \frac{1}{4}, \frac{3}{4}, \frac{1}{5}, \frac{2}{5}, \frac{3}{5}, \cdots\) (A short sketch of this enumeration follows this list.)

    7. The set of real numbers is not countable (aka uncountable). Proving this is beyond this course. (It uses something called diagonalization.)

    8. Uncountably infinite is a bigger infinity than countably infinite, but that's beyond this course.

    9. Georg Cantor, who formulated this, was hospitalized in a mental health facility several times.
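
    A small sketch of the enumeration in item 6, listing the rationals in (0, 1] in lowest terms, ordered by denominator (the function name rationals is just for this illustration):

    ```python
    from fractions import Fraction
    from math import gcd

    def rationals(max_denominator):
        """Yield the rationals in (0, 1] in lowest terms, ordered by denominator."""
        for q in range(1, max_denominator + 1):
            for p in range(1, q + 1):
                if gcd(p, q) == 1:
                    yield Fraction(p, q)

    print([str(r) for r in rationals(5)])
    # ['1', '1/2', '1/3', '2/3', '1/4', '3/4', '1/5', '2/5', '3/5', '4/5']
    ```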

  7. Why is this relevant to probability?

    1. We can assign probabilities to discrete outcomes, but not to individual continuous outcomes.

    2. We can assign probabilities to some events, i.e., to sets of continuous outcomes.

  8. E.g. Consider this experiment to watch an atom of sodium-26.

    1. Its half-life is 1 second (Applet: Nuclear Isotope Half-lifes)

    2. Define the outcomes to be the number of complete seconds before it decays: \(S=\{0, 1, 2, 3, \cdots \}\)

    3. \(|S|\) is countably infinite, i.e., discrete.

    4. \(p(0)=\frac{1}{2}, p(1)=\frac{1}{4}, \cdots\) \(p(k)=2^{-(k+1)}\)

    5. \(\sum_{k=0}^\infty p(k) = 1\)

    6. We can define events like these (checked numerically after this list):

      1. The atom decays within the 1st second. p=.5.

      2. The atom decays within the first 3 seconds. p=.875.

      3. The atom's lifetime is an even number of seconds. \(p = \frac{1}{2} + \frac{1}{8} + \frac{1}{32} + \cdots = \frac{2}{3}\)
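
    A quick numeric check of the pmf and of the three events above, truncating the infinite sums at k = 200:

    ```python
    # pmf for the discrete Na-26 experiment: p(k) = 2^{-(k+1)}, k = 0, 1, 2, ...
    def p(k):
        return 2.0 ** -(k + 1)

    print(sum(p(k) for k in range(200)))            # ~1.0: the probabilities sum to 1
    print(p(0))                                     # decays within the 1st second: 0.5
    print(p(0) + p(1) + p(2))                       # within the first 3 seconds: 0.875
    print(sum(p(k) for k in range(0, 200, 2)))      # even lifetime: ~2/3
    ```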

  9. Now consider another experiment: Watch another atom of Na-26

    1. But this time the outcome is defined to be the real number, x, that is the time until it decays.

    2. \(S = \{ x | x\ge0 \}\)

    3. \(|S|\) is uncountably infinite.

    4. We cannot talk about the probability that x=1.23 exactly. (Any single exact value has probability 0.)

    5. However, we can define the event that \(1.23 < x < 1.24\), and talk about its probability.

    6. \(P[x>x_0] = 2^{-x_0}\)

    7. \(P[1.23 < x < 1.24]\) \(= 2^{-1.23} - 2^{-1.24} \approx 0.003\) (checked numerically below)
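
    A quick numeric check, using the survival function \(P[x>x_0]=2^{-x_0}\) given above:

    ```python
    # Survival function for a half-life of 1 s: P[X > x0] = 2^{-x0}.
    def survival(x0):
        return 2.0 ** -x0

    print(survival(1.23) - survival(1.24))   # P[1.23 < X < 1.24], about 0.003
    ```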

  10. Event

    1. collection of outcomes, subset of S

    2. what we're interested in.

    3. e.g., outcome is voltage, event is V>5.

    4. certain event: S

    5. null event: \(\emptyset\)

    6. elementary event: one discrete outcome

  11. Set theory

    1. Sets: S, A, B, ...

    2. Universal set: U

    3. elements or points: a, b, c

    4. \(a\in S, a\notin S\), \(A\subset B\)

    5. Venn diagram

    6. empty set: {} or \(\emptyset\)

    7. operations on sets: equality, union, intersection, complement, relative complement

    8. properties (axioms): commutative, associative, distributive

    9. theorems: de Morgan

  12. Prove de Morgan's law 2 different ways.

    1. Use the fact that A equals B iff A is a subset of B and B is a subset of A.

    2. Look at the Venn diagram; there are only 4 cases. (They are enumerated in the sketch after this list.)
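
    The 4-case argument can be checked mechanically; a minimal sketch:

    ```python
    from itertools import product

    # (A ∪ B)^c = A^c ∩ B^c: check all 4 membership cases for a point.
    for in_A, in_B in product([False, True], repeat=2):
        lhs = not (in_A or in_B)             # point lies in (A ∪ B)^c
        rhs = (not in_A) and (not in_B)      # point lies in A^c ∩ B^c
        print(in_A, in_B, lhs == rhs)        # prints True for all four cases
    ```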

  13. 2.1.4 Event classes

    1. Remember: an event is a set of outcomes of an experiment, e.g., voltage.

    2. In a continuous sample space, we're interested only in some possible events.

    3. We're interested in events that we can measure.

    4. E.g., we're not interested in the event that the voltage is exactly an irrational number.

    5. Events that we're interested in are intervals, like [.5,.6] and [.7,.8].

    6. Also unions and complements of intervals.

    7. This matches the real world. You can't measure a voltage as 3.14159265...; you measure it in the range [3.14,3.15].

    8. Define \(\cal F\) to be the class of events of interest: the intervals, together with the sets built from them by complements and countable unions.

    9. We assign probabilities only to events in \(\cal F\).

  14. 2.2 Axioms of probability

    1. An axiom system is a general set of rules. The probability axioms apply to all probabilities.

    2. Axioms start with common sense rules, but get less obvious.

    3. I: 0<=P[A]

    4. II: P[S]=1

    5. III: \(A\cap B=\emptyset \rightarrow\) \(P[A\cup B] = P[A]+P[B]\)

    6. III': For \(A_1, A_2, ....\) if \(\forall_{i\ne j} A_i \cap A_j = \emptyset\) then \(P[\bigcup_{i=1}^\infty A_i]\) \(= \sum_{i=1}^\infty P[A_i]\)

  15. Example: cards. Q=event that card is queen, H=event that card is heart. These events are not disjoint. Probabilities do not sum.

    1. \(Q\cap H \ne\emptyset\)

    2. P[Q] = 1/13 = 4/52, P[H] = 1/4 = 13/52, P[Q \(\cup\) H] = 16/52, not 17/52.

  16. Example C=event that card is clubs. H and C are disjoint. Probabilities do sum.

    1. \(C\cap H = \emptyset\).

    2. P[C] = 13/52, P[H] = 1/4 = 13/52, P[C \(\cup\) H] = 26/52.

  17. Example. Flip a fair coin repeatedly. \(A_i\) is the event that the first time you see heads is the i-th flip, for \(i\ge1\).

    1. We can assign probabilities to these countably infinite number of events.

    2. \(P[A_i] = 1/2^i\)

    3. They are disjoint, so probabilities sum.

    4. Probability that the first head occurs on the 10th or later toss = \(\sum_{i=10}^\infty 1/2^i = 1/2^9 = 1/512\). (Checked numerically below.)
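
    A quick numeric check of that tail sum, truncating the infinite series:

    ```python
    # P[first head on toss 10 or later] = sum_{i=10..inf} 2^{-i} = 2^{-9}
    tail = sum(2.0 ** -i for i in range(10, 200))   # truncate the infinite sum
    print(tail, 2.0 ** -9)                          # both about 0.001953125 = 1/512
    ```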

  18. Corollary 1

    1. \(P[A^c] = 1-P[A]\)

    2. E.g., P[heart] = 1/4, so P[not heart] = 3/4

  19. Corollary 2: P[A] <= 1

  20. Corollary 3: P[\(\emptyset\)] = 0

  21. Corollary 4:

    1. For \(A_1, A_2, .... A_n\) if \(\forall_{i\ne j} A_i \cap A_j = \emptyset\) then \(P\left[\bigcup_{i=1}^n A_i\right] = \sum_{i=1}^n P[A_i]\)

    2. Proof by induction from axiom III.

  22. Prove de Morgan's law (page 28)

  23. Corollary 5 (page 33): \(P[A\cup B] = P[A] + P[B] - P[A\cap B]\)

    1. Example: Queens and hearts. P[Q]=4/52, P[H]=13/52, P[Q \(\cup\) H]=16/52, P[Q \(\cap\) H]=1/52. (Checked with exact fractions after this list.)

    2. \(P[A\cup B] \le P[A] + P[B]\)
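
    A quick check of Corollary 5 on the queens-and-hearts example, using exact fractions:

    ```python
    from fractions import Fraction as F

    P_Q  = F(4, 52)      # queen
    P_H  = F(13, 52)     # heart
    P_QH = F(1, 52)      # queen of hearts

    P_union = P_Q + P_H - P_QH               # Corollary 5
    print(P_union, float(P_union))           # 4/13, i.e., 16/52 ≈ 0.3077
    ```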

5 Questions

Continuous probability:

  1. S is the real interval [0,1].

  2. P([a,b]) = b-a if 0<=a<=b<=1.

  3. Event A = [.2,.6].

  4. Event B = [.4,1].

Questions:

  1. What is P[A]?

    1. .2

    2. .4

    3. .6

    4. .8

  2. What is P[B]?

    1. .2

    2. .4

    3. .6

    4. .8

  3. What is P[A \(\cup\) B]?

    1. .2

    2. .4

    3. .6

    4. .8

  4. What is P[A \(\cap\) B]?

    1. .2

    2. .4

    3. .6

    4. .8

  5. What is P[A \(\cup\) \(B^c\)]?

    1. .2

    2. .4

    3. .6

    4. .8

  6. Retransmitting a noisy bit 3 times: Set e=0.1. What is the probability of no error in the 3 bits?

    1. 0.1

    2. 0.3

    3. 0.001

    4. 0.729

    5. 0.9

  7. Flipping a fair coin until we get heads: How many times will it take until the probability of seeing a head is >=.8?

    1. 1

    2. 2

    3. 3

    4. 4

    5. 5

  8. This time, the coin is weighted so that p[H]=.6. How many times will it take until the probability of seeing a head is >=.8?

    1. 1

    2. 2

    3. 3

    4. 4

    5. 5

7 To read

Leon-Garcia, chapter 2.

8 To watch

Rich Radke's Probability Bites:

  1. Axioms of Probability

  2. Discrete Sample Spaces

  3. Combinatorics

https://www.youtube.com/playlist?list=PLuh62Q4Sv7BXkeKW4J_2WQBlYhKs_k-pj