Engineering Probability Class 28 Mon 2022-04-25

1 Homework 11

  1. The due date was accidentally set too late, and has been changed to Thurs. (Assignments are not allowed in reading period.)

  2. As soon as possible after that, we'll calculate what letter grade you'd get if you didn't write the final, and upload it to LMS.

2 Misc statistics topics

Reviewing the videos.

2.1 T-test

  1. You have 2 populations.

  2. Do they have the same mean?

  3. Take a sample of observations from each population.

  4. Calculate the sample means.

  5. They're probably different.

  6. What's the probability that the sample means would be at least that different if the population means were the same?

  7. "At least that different" can be 1-sided or 2-sided.
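The question in step 6 can be sketched numerically. This is a minimal sketch, not the classical t-test itself: it computes Welch's t statistic, but estimates the probability by randomly re-splitting the pooled observations (a permutation test) instead of looking up the t distribution, because the Python standard library has no t-distribution CDF. The function names and the sample values are made up for illustration.

```python
import random
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Difference of the sample means, in units of its estimated standard error."""
    return (mean(x) - mean(y)) / sqrt(variance(x) / len(x) + variance(y) / len(y))

def permutation_p_value(x, y, trials=2000, two_sided=True, seed=0):
    """Estimate P(statistic at least this extreme | both samples come from the same population)."""
    rng = random.Random(seed)
    observed = welch_t(x, y)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # pretend the group labels are meaningless
        t = welch_t(pooled[:len(x)], pooled[len(x):])
        if two_sided:
            hits += abs(t) >= abs(observed)  # different in either direction
        else:
            hits += t >= observed            # x larger than y only
    return hits / trials

# Hypothetical observations from the two populations.
x = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]
y = [4.4, 4.8, 4.1, 5.0, 4.6, 4.3]
print("t statistic:", welch_t(x, y))
print("2-sided p:", permutation_p_value(x, y))
print("1-sided p:", permutation_p_value(x, y, two_sided=False))
```

A small p-value says that a difference this large would be unusual if the population means were the same.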

2.2 ANOVA

  1. Analysis of variance

  2. Test for possible difference in several groups.

  3. E.g. you're searching for a cure for lycanthropy.

  4. 5 possible treatments: aspirin, silver crosses, sunlight, being bitten by Dracula, nothing.

  5. Take 100 people with lycanthropy.

  6. Assign different treatments randomly.

  7. Measure length of hair at next full moon.

  8. Did any treatment work?

  9. Real world application: The worldwide pharma industry grosses $$10^{12}$$ dollars a year. A new drug costs several $$10^9$$ dollars to develop, including the costs of the failures. To get a new drug approved, you have to prove, with trials and statistics, that it works.
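As a rough sketch of what "did any treatment work?" means numerically, one-way ANOVA compares the variation between the group means to the variation within the groups, giving the F statistic. The function name `one_way_f` and the hair lengths below are hypothetical, and only three of the five treatments are shown. Turning F into a probability needs the F distribution, which this sketch does not compute.

```python
from statistics import mean

def one_way_f(groups):
    """One-way ANOVA F statistic for a list of samples (one sample per treatment)."""
    k = len(groups)                      # number of treatments
    n = sum(len(g) for g in groups)      # total number of observations
    grand = mean(v for g in groups for v in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical hair lengths (cm) at the next full moon.
hair = {
    "aspirin":  [9.8, 10.4, 10.1, 9.9],
    "sunlight": [7.2, 6.9, 7.8, 7.5],
    "nothing":  [10.2, 9.7, 10.5, 10.0],
}
print("F =", one_way_f(list(hair.values())))
```

An F near 1 is what you would expect if no treatment had any effect; a large F suggests that at least one group mean differs.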

2.3 Linear regression

  1. To explore possible linear relationships between several variables.

  2. Several possible independent variables.

  3. One dependent variable.

  4. One independent variable example:

    1. student score vs time on exam 2:

    2. Independent variable: time to finish.

    3. Dependent variable: score.

    4. Is there a linear relationship?

    5. What is it?

    6. How good is it?

  5. Multiple independent variables example:

    1. Try to predict first year student performance at RPI.

    2. Dependent variable: first year GPA.

    3. Independent variables:

      1. high school grade

      2. high school rank

      3. number of AP

      4. fraternity?

      5. athlete?

      6. home state

      7. height

      8. weight

    4. which one is the strongest predictor?

    5. Add the independent variables one by one in order of importance.

    6. However, independent variables may be correlated with each other.

    7. with enough independent variables you can explain anything.

    8. what about nonlinear relationships?
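For the one-independent-variable example above (score vs. time to finish), the three questions can be sketched with an ordinary least-squares fit. The function name `fit_line` and the data are made up for illustration; $$r^2$$ is used here as one common answer to "how good is it?".

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares line y = slope * x + intercept, plus r^2 as a measure of fit."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)   # fraction of the variance in y explained by x
    return slope, intercept, r_squared

# Hypothetical data: minutes to finish the exam, and the score.
minutes = [35, 42, 50, 58, 65, 72, 80]
score = [88, 92, 79, 85, 70, 74, 66]
slope, intercept, r2 = fit_line(minutes, score)
print(f"score = {slope:.3f} * minutes + {intercept:.1f}, r^2 = {r2:.2f}")
```

An $$r^2$$ near 1 means the line explains most of the variation in the scores; near 0 means it explains almost none.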

2.4 Non parametric stats

  1. no assumptions about the distribution, except that the observations are independent.

  2. Often use order stats.

  3. E.g., Wilcoxon rank-sum test (aka Mann-Whitney) to test whether two populations have the same mean (strictly, the same distribution):

    1. combine the observations from the two populations, X and Y.

    2. sort them all together

    3. see if the observations from population X are clustered at the start.

    4. by computing the U score: count the number of pairs with $$X_i>Y_j$$.

      1. if the populations are the same, then for large enough n, U is approximately normal, with mean $$n^2/2$$ and variance $$n^2(2n+1)/12$$, where n is the number of observations in each sample.

    5. what's the probability that the observations would be this biased (towards the start) if the population means were the same? I.e., that U would be this far off its mean?

  4. There are many tests.

  5. You need to decide what "biased" means. I.e., pick your alternative hypothesis.

  6. Not as powerful as parametric tests, but more robust.
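The U-score steps above can be sketched as follows. This assumes the general form of the normal approximation, with mean $$n_1 n_2/2$$ and variance $$n_1 n_2 (n_1+n_2+1)/12$$, which reduces to the formulas above when both samples have n observations. Ties count one half, there is no continuity or tie correction, and the function names and data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mann_whitney_u(x, y):
    """Number of pairs (xi, yj) with xi > yj; a tie counts as one half."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)

def u_test_p_value(x, y):
    """2-sided p-value from the normal approximation to U."""
    n1, n2 = len(x), len(y)
    u = mann_whitney_u(x, y)
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma                 # how far U is from its mean
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical observations; X tends to sort before Y.
x = [1.2, 2.3, 2.9, 3.1, 4.0, 4.4]
y = [3.5, 4.8, 5.1, 5.6, 6.2, 7.0]
print("U =", mann_whitney_u(x, y))
print("2-sided p =", u_test_p_value(x, y))
```

With samples this small the normal approximation is rough; it is shown only to make the calculation concrete.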

2.5 How to lie with statistics

https://en.wikipedia.org/wiki/How_to_Lie_with_Statistics

https://www.amazon.com/How-Lie-Statistics-Darrell-Huff/dp/0393310728

2.6 Machine learning

A current hot application of statistics.

3 Final exam

  1. The material will go up to homework 11.

  2. There will be no statistics or paradoxes, since we didn't have homeworks on that.

  3. The final exam will be as specified by the registrar.

  4. It will be in person, using gradescope.

  5. Bring blank scratch paper.

  6. You may have three (3) 2-sided crib sheets.

  7. As specified in the syllabus, all 3 exams have the same weight, and the lowest will be dropped.

  8. The lowest homework will also be dropped.

  9. There was no final exam last year because RPI was shut down by the computer hack.

  10. Here is the final from 2 years ago: Spring 2020 final exam. Answers.

    1. However, the material covered changes somewhat each year.

4 After the course

We have a professional relationship. I'm available to discuss any legal, ethical topic even after you graduate.

Even after I retire, you have my non-RPI email.

Parting advice: look at the famous alumni on the Darrin windows. What can you do in later life, so your picture goes there also?