WEBVTT 1 00:00:24.690 --> 00:00:30.359 Oh, hey, technology is sort of working today. 2 00:00:30.359 --> 00:00:33.539 So first, what I was doing on 3 00:00:33.539 --> 00:00:44.820 Thursday is I was on a review panel for the National Science Foundation. For two days, we were looking at a pile of proposals, and there were 4 00:00:44.820 --> 00:00:51.960 eight of us on the panel, two NSF program directors, and we were reviewing the proposals. 5 00:00:51.960 --> 00:01:02.670 It's like 100 pages each, over 201 point, and rating each proposal on many different categories and then making a recommendation 6 00:01:02.670 --> 00:01:12.750 to NSF, and they may or may not follow our panel's recommendation, and then maybe 10% to 20% of them would get funded. 7 00:01:12.750 --> 00:01:25.920 So that's what I was doing. I do it whenever NSF asks me. RPI likes me to do those things, because RPI likes good relations with the National Science Foundation. 8 00:01:25.920 --> 00:01:33.269 And it's also a chance to see what good proposals and bad proposals look like. 9 00:01:34.650 --> 00:01:43.109 I'm being a little vague about which panel I was on because, actually, my membership on the panel is 10 00:01:43.109 --> 00:01:51.629 a confidential matter, the specific panels I'm on. And in fact, if you ask NSF, 11 00:01:51.629 --> 00:02:01.674 you know, what the names of the reviewers were, if you submitted a proposal to NSF and, say, NSF declined it, as they do with maybe 80% of the proposals, 12 00:02:01.674 --> 00:02:07.165 if you ask NSF who the reviewers were, NSF would refuse to tell you, 13 00:02:07.709 --> 00:02:11.520 because the names of the reviewers are confidential. 14 00:02:11.520 --> 00:02:20.490 And the legal way that that happens is that we reviewers are providing pre-decisional advice to NSF. 15 00:02:20.490 --> 00:02:25.889 They don't have to follow our advice. They generally do, but not always. 
16 00:02:25.889 --> 00:02:33.659 For example, for distributional requirements, they might fund a proposal that's not at the top of the list. 17 00:02:33.659 --> 00:02:34.080 Um, 18 00:02:34.735 --> 00:02:46.044 if it's from a state or a group that doesn't submit many proposals, then a proposal from there would be more likely to get funded, and the program director would then overrule the panel, let's say. But that's why, 19 00:02:46.405 --> 00:02:46.794 um, 20 00:02:46.824 --> 00:02:47.875 if you see in the news, 21 00:02:47.875 --> 00:02:48.594 for example, 22 00:02:48.594 --> 00:02:51.865 you'll see about some government advisory group, 23 00:02:52.555 --> 00:02:57.324 and the names of the members of the advisory group are confidential, and the government will, 24 00:02:57.324 --> 00:02:59.125 say, refuse to say who they are. 25 00:02:59.634 --> 00:03:01.044 The reason is that the group, 26 00:03:01.379 --> 00:03:05.939 they're only providing opinions and advice. They're not, 27 00:03:05.939 --> 00:03:11.129 you know, they're not making decisions. They don't actually do something. 28 00:03:11.129 --> 00:03:14.370 And that's where weird legal things happen. 29 00:03:15.900 --> 00:03:23.669 But I actually, at one point, spent a couple of years at the National Science Foundation, about 20 years ago. I was a program director. 30 00:03:23.844 --> 00:03:38.034 And so I just made recommendations to my boss, but each year I recommended the spending of about 10 million dollars on various proposals. I also did some liaison with DARPA, the Defense Advanced 31 00:03:38.939 --> 00:03:53.159 Research Projects Agency, and we funded some work together. Actually, I put some money in, the mathematicians at NSF put some money in, and DARPA put some money in, and we funded two rounds of a joint program. So 32 00:03:53.159 --> 00:04:05.430 it was a fun chance to see how DARPA operated. 
Okay, so that's your revolving door that you read about, people going from the government to private industry. And also with NSF, half the people at NSF 33 00:04:05.430 --> 00:04:19.769 are what are so-called rotators. Their full-time day jobs are at a university, usually, or sometimes at some sort of other government agency, and they rotate to the National Science Foundation for a couple of years. 34 00:04:19.769 --> 00:04:26.459 In my case, it was almost three years, and then they rotate back to their home, 35 00:04:26.459 --> 00:04:33.988 to their home institution. So a lot of faculty members have spent time with the National Science Foundation. 36 00:04:33.988 --> 00:04:38.879 And then the other half of the program directors are 37 00:04:38.879 --> 00:04:45.928 full time. And the highest-level people at NSF are rotators also, for six years, four years. 38 00:04:45.928 --> 00:04:50.939 At DARPA, everyone's a rotator. At DARPA they're there for four years, or 39 00:04:50.939 --> 00:04:54.809 maybe max six years or so, and then they go back somewhere else. 40 00:04:54.809 --> 00:05:02.038 In DARPA's case, it could be a university, it could be a private company, it could be a military agency or something. But 41 00:05:02.038 --> 00:05:09.269 everyone at DARPA does so. And the support staff, they don't work for DARPA. They work for some 42 00:05:09.269 --> 00:05:21.959 consulting company that actually has its building across the street from DARPA. So the staff, the assistants, are not DARPA employees. The only DARPA employees are the professional people. 43 00:05:23.069 --> 00:05:27.119 But so it spreads the expertise around the country. 44 00:05:28.348 --> 00:05:36.869 So I used to joke, in fact, because the DARPA headquarters at that time was... what is going on here? Just a second, this 45 00:05:36.869 --> 00:05:43.019 keeps stopping. 46 00:05:44.759 --> 00:05:48.959 Just a second here. 
47 00:05:56.309 --> 00:06:00.959 Okay, so with luck, maybe this is broadcasting. 48 00:06:02.548 --> 00:06:09.028 As long as I'm not typing, a second. So DARPA, I used to joke, because the 49 00:06:10.858 --> 00:06:13.889 building, the headquarters, is on a public street 50 00:06:13.889 --> 00:06:27.869 and just set in, you know, 100 feet or so. So if there ever was a car bomb or something, it wouldn't be that bad, because everyone there's a rotator and half the people there are on travel. And the real work is done by the community. 51 00:06:27.869 --> 00:06:34.168 Okay, so what is happening here today? 52 00:06:36.629 --> 00:06:42.449 Okay, so first, homework 11: the date was wrong. We're setting it to Thursday, which is 53 00:06:42.449 --> 00:06:46.709 three days away, and we'll allow it to be submitted late, but 54 00:06:46.709 --> 00:06:50.488 the thing is, I'm not allowed to have due dates that are in 55 00:06:50.488 --> 00:06:54.119 the reading period. It's already one day in, but we'll ignore that. 56 00:06:54.119 --> 00:07:01.918 And when homework 11 gets graded, I'll calculate your guaranteed grade if you don't write the final exam. 57 00:07:01.918 --> 00:07:12.838 So, from the syllabus, the three exams have equal weights; the top two will be used and the third will be dropped. And for the homeworks, the lowest homework will be dropped. 58 00:07:12.838 --> 00:07:20.069 And I'll calculate the letter grade that you will get if you do not write the final. 59 00:07:20.069 --> 00:07:24.869 If you do write the final, it cannot lower your grade. It might raise your grade. 60 00:07:24.869 --> 00:07:31.259 Particularly some people, for example, who skipped the second exam, 61 00:07:31.259 --> 00:07:37.709 they will very much be helped by writing the third exam and getting a grade in it. 62 00:07:37.709 --> 00:07:45.059 The final exam is at the date and time mentioned, um, down here somewhere. 
63 00:07:45.059 --> 00:07:50.759 They're mentioned on the registrar's page. It'll be in that room, 64 00:07:50.759 --> 00:07:53.968 using Gradescope, just like we did before. 65 00:07:53.968 --> 00:07:57.178 You know, bring some blank work paper if you want. 66 00:07:57.178 --> 00:08:05.579 If you want to bring a calculator, go ahead. We've tried to write questions so that it won't help you. You can bring three crib sheets 67 00:08:05.579 --> 00:08:09.358 for the third exam, so presumably your first two sheets and one more. 68 00:08:09.713 --> 00:08:18.053 There will be no statistics on the exam, because we haven't had homework on that yet. So it'll be up to the end of chapter 7. 69 00:08:18.564 --> 00:08:29.244 The TAs will have several office hours before the exam. We'll try to have an office hour every day before the exam. We'll announce them. 70 00:08:29.519 --> 00:08:38.879 There was no final exam in this course last year, because RPI was shut down because our computers got hacked. 71 00:08:38.879 --> 00:08:47.519 RPI refused to offer any specifics, except it's supposed to have been some administrative person who screwed up. 72 00:08:47.519 --> 00:08:58.139 I continued going okay, mostly because I do my email outside. You know, my preferred email's outside RPI. 73 00:08:58.139 --> 00:09:08.369 And I've had a private account for 20 years now. And the master copy of my websites is on my own machine. 74 00:09:08.369 --> 00:09:14.038 I push copies out to the RPI web servers and everything, so I just 75 00:09:14.038 --> 00:09:22.499 pushed a new copy out to another, you know, web server off RPI. I was doing okay, but we still canceled the exam last year. 76 00:09:22.499 --> 00:09:28.499 However, uh, sorry for that one. I'll fix that. 77 00:09:28.499 --> 00:09:36.149 I didn't reread it. Okay, I'll fix the link so you'll have a link from, um, 
78 00:09:38.458 --> 00:09:44.099 So you'll have a link to see the exam from two years ago, understanding that 79 00:09:44.099 --> 00:09:52.438 topics changed slightly, perhaps. One more point, for after the course, before I get to some material. 80 00:09:52.438 --> 00:09:55.499 So we've got a professional relationship, 81 00:09:55.499 --> 00:10:00.808 and that lasts after you leave. 82 00:10:00.808 --> 00:10:07.379 Then you're welcome to discuss any legal and ethical topic with me. 83 00:10:07.379 --> 00:10:14.759 And I encourage you to get your own non-RPI email. I 84 00:10:14.759 --> 00:10:20.038 put my own non-RPI email on all of my signature files. 85 00:10:20.038 --> 00:10:27.359 And I may retire at some point, but you'll have an archive email to get hold of me. 86 00:10:27.359 --> 00:10:31.019 And one parting thing also: outside, 87 00:10:31.019 --> 00:10:41.453 the windows have these pictures of famous alumni. They were once miserable undergrads at some point. Okay, they did okay. So you might be asking yourself, after you leave, 88 00:10:42.053 --> 00:10:47.303 could you do something that might end up with your picture being up on a window there? So just think about it. 89 00:10:47.969 --> 00:10:51.418 Make some difference in the world when you graduate. 90 00:10:52.438 --> 00:10:58.318 Okay, back to the material, talking about statistics. 91 00:11:00.418 --> 00:11:06.688 So I'll review some videos that I gave you to look at on your own on Thursday, if you wanted, and just give 92 00:11:06.688 --> 00:11:12.119 my take on some of them, just for a few minutes. 93 00:11:13.139 --> 00:11:18.599 So the t-test, as I mentioned a little before, but 94 00:11:18.599 --> 00:11:25.589 it's worth saying. Okay, I'm going to take the keyboard off. 95 00:11:25.589 --> 00:11:38.038 The disadvantage is, of course, that the camera is not pointing at me. 
It's pointing up at the ceiling, but the advantage is that with the iPad flat on the counter, I can actually write on it semi-legibly. So the t-test is: 96 00:11:38.038 --> 00:11:42.418 you've got, to remind you, you've got two populations. 97 00:11:44.219 --> 00:11:49.078 And you're plotting things, so you've got red and blue, let's say. 98 00:11:49.078 --> 00:11:59.428 You've got red dots here, 99 00:11:59.428 --> 00:12:04.438 red observations, and then you've got blue observations. 100 00:12:06.808 --> 00:12:13.438 Okay, something like that, if you can see the color differences. So basically, are the, you know, 101 00:12:17.009 --> 00:12:23.729 means different, or whatever? I mentioned it a little a week ago. 102 00:12:23.729 --> 00:12:28.349 The standard deviations, variances, or whatever. 103 00:12:29.369 --> 00:12:40.019 And we construct, I keep mentioning this every time, but we construct a null hypothesis. I'm saying this again and again because it's important. 104 00:12:40.019 --> 00:12:44.999 The null hypothesis is that the means are the same. 105 00:12:44.999 --> 00:12:51.869 Okay, and then the alternative hypothesis 106 00:12:54.899 --> 00:13:03.599 is that they're different. Now, you might get particular about how they're different. 107 00:13:11.068 --> 00:13:16.318 Particular about how they're different: 108 00:13:22.048 --> 00:13:26.278 say, blue is greater than red. 109 00:13:27.599 --> 00:13:32.158 Okay, so then the question, the big question, is: 110 00:13:33.869 --> 00:13:42.239 if there's no difference, how likely is what you saw? 111 00:13:46.769 --> 00:13:52.589 Okay, and here the blues look a little higher than the reds. The lowest bottom two are reds. 112 00:13:52.589 --> 00:13:57.448 And that's the t-test. Okay, so there was a little video on that. 113 00:14:00.839 --> 00:14:05.849 And we saw different versions, and I mentioned it quickly a week ago. 114 00:14:05.849 --> 00:14:09.239 Then the next video is something called 
115 00:14:09.239 --> 00:14:18.509 ANOVA, called analysis of variance. 116 00:14:18.509 --> 00:14:28.229 I'll give an example, a fake example I made up here. You can take this example and map it to current events, if you want, by changing the names. 117 00:14:28.229 --> 00:14:34.739 We have a population of people, and some people are infected with lycanthropy. 118 00:14:34.739 --> 00:14:41.969 And the symptom of lycanthropy is that on full moons they turn into a werewolf, so the hair gets longer. 119 00:14:41.969 --> 00:14:45.869 And let's say we test if they are 120 00:14:45.869 --> 00:14:57.058 suffering from lycanthropy by measuring the length of their hair. So if the hair gets longer in a full moon, then we say they're suffering from lycanthropy. Let's say hair gets longer, whatever, 121 00:14:57.058 --> 00:15:00.778 teeth get longer, I don't know. Okay. 122 00:15:01.073 --> 00:15:15.653 So there's various possible cures that we're investigating, cures for lycanthropy. Well, first, we do nothing. That's a control group. Second, we may give the people aspirin. 123 00:15:15.683 --> 00:15:28.734 Third, we may have the people wear a cross. Fourth, we may try lots of sunlight during the daytime. We might bring in Dracula and have Dracula bite the people, 124 00:15:29.634 --> 00:15:30.384 or whatever, 125 00:15:30.744 --> 00:15:45.443 and so we have maybe five different groups of people: the control group, people with crosses, people with aspirin, people treated with sunlight, people being bitten by Dracula. And we measure the length of their hair and their teeth at the next full 126 00:15:45.443 --> 00:15:45.803 moon. 127 00:15:46.614 --> 00:15:58.793 And for each of these five subpopulations, you know, we've got a distribution for hair lengths and tooth lengths. 128 00:15:58.823 --> 00:16:02.004 And we're looking at this, and 
129 00:16:02.308 --> 00:16:08.698 we're trying to decide, do we see a difference? 130 00:16:08.698 --> 00:16:18.239 And so I'm giving you the executive summary. The analysis of variance looks at all these different groups and says, are we seeing something 131 00:16:18.239 --> 00:16:29.938 happening? I mean, the thing is, you're going to get some variance anyway. People will have their hair grow different lengths in a full moon anyway, maybe. 132 00:16:29.938 --> 00:16:44.573 But you're looking to see if these different populations look different. So it's related to the t-test, but it looks at several populations, and this is very popular with social scientists. 133 00:16:44.604 --> 00:16:59.423 They don't know any mathematics. They just plug it into, take your favorite statistics package, and it calculates it. So I'm giving you the executive summary of this. Now, real world: lycanthropy is a fictional disease, but let me throw real-world numbers 134 00:16:59.423 --> 00:16:59.903 at you. 135 00:17:00.629 --> 00:17:05.429 I mentioned these a little before, but there's enough money that it's worth looking at again. 136 00:17:05.429 --> 00:17:14.009 Look at the worldwide pharmaceutical industry, the drug business. Every year it grosses one trillion dollars, 137 00:17:14.009 --> 00:17:22.048 half of which is in the United States. So 500 billion dollars of drugs are bought in the United States each year, 138 00:17:22.048 --> 00:17:31.284 which is 1500 dollars per person on the average. Legal drugs, okay, this is all legal stuff. I'm not talking marijuana or cocaine or anything. The illegal might be bigger, 139 00:17:31.284 --> 00:17:39.203 I don't know. The legal pharmaceutical industry in the United States is one half of a trillion dollars, 500 billion dollars, 1500 bucks a person. 
140 00:17:41.699 --> 00:17:42.358 Um, 141 00:17:43.253 --> 00:17:49.403 and so they want to develop new drugs because, 142 00:17:49.433 --> 00:17:49.703 you know, 143 00:17:49.703 --> 00:17:50.334 the drugs, 144 00:17:50.364 --> 00:17:50.933 um, 145 00:17:50.963 --> 00:17:51.713 they start off, 146 00:17:51.713 --> 00:17:54.203 they're on patent and they're very expensive, 147 00:17:54.203 --> 00:17:54.683 but then, 148 00:17:54.713 --> 00:17:55.884 as they go off patent, 149 00:17:55.884 --> 00:17:57.114 they become very cheap. 150 00:17:57.929 --> 00:18:03.689 And then the revenues of the company fall, so they're developing new drugs. 151 00:18:03.689 --> 00:18:18.509 Developing one new successful drug costs a few billion dollars when you work in the cost of the failures, because most of the attempts are failures. So bringing a successful drug to market is, 152 00:18:18.509 --> 00:18:27.989 the numbers vary, but maybe 2 billion dollars or 25 billion dollars or something, which sounds like a lot of money, but remember they're grossing a trillion. 153 00:18:28.163 --> 00:18:35.304 Okay, so one new drug is, you know, whatever, a third of 1% of the annual gross, so you can afford the number. Okay. 154 00:18:35.334 --> 00:18:44.574 So now, what does it take to bring the new drug to market? In the United States, you have to prove that it works to the satisfaction of the FDA. 155 00:18:45.929 --> 00:18:52.979 So that involves clinical trials and statistical tests and so on. So this is where the statistical tests 156 00:18:52.979 --> 00:18:58.709 come in, and, you know, they're published, and they'll say they tested 100 patients. 157 00:18:58.709 --> 00:19:10.828 And again, you can work it into the news. You've got diseases that people have, and you've got 158 00:19:10.828 --> 00:19:15.568 things that are claimed to be cures, and things that are officially, 
159 00:19:15.568 --> 00:19:18.598 you know, they're official, 160 00:19:18.598 --> 00:19:27.898 you know, blessed as official things that will help. And how do they do that? They did the statistical tests, and they found out that some people were hurt by the 161 00:19:27.898 --> 00:19:37.409 new drug and some people were helped, and the drug company's statisticians convinced the FDA that the drug, 162 00:19:37.409 --> 00:19:43.169 on balance, was good, and therefore the FDA approves it. 163 00:19:43.169 --> 00:19:49.318 And that's how the system works. Well, first, that it's better, and second, that it 164 00:19:49.318 --> 00:19:54.659 doesn't have really bad side effects. 165 00:19:54.659 --> 00:19:59.489 I mean, the poster child for the regulation, that I mentioned before, is 166 00:19:59.489 --> 00:20:08.308 several decades ago there was a drug called thalidomide that helped with morning sickness for pregnant women, I think, and 167 00:20:08.308 --> 00:20:11.699 basically every country in the world but the United States 168 00:20:11.699 --> 00:20:17.338 approved it for use, because it did help with morning sickness. 169 00:20:17.338 --> 00:20:26.759 The problem is that it also meant the poor baby, when it was born, had no arms or legs. 170 00:20:26.759 --> 00:20:38.999 Not in the United States, because the United States had never legalized it. In any case, back to statistics: this is the real-world application where statistics is very important. 171 00:20:38.999 --> 00:20:43.409 So, okay. 172 00:20:46.229 --> 00:20:53.009 And these, you know, you read the medical literature; they're published, and they'll say they did this test, and 173 00:20:53.009 --> 00:20:59.009 the drug... It doesn't talk about type 1 and type 2 errors; it's a slightly different way of doing the analysis, but 174 00:20:59.483 --> 00:21:13.193 you know, they do it. 
One big thing for making these stats valid is they have to say what they're measuring before they measure it. So what you must do for these 175 00:21:14.068 --> 00:21:22.679 drug trials, et cetera, is, you know, define 176 00:21:25.348 --> 00:21:32.009 what will be measured, and 177 00:21:32.009 --> 00:21:42.148 what success will be, before the test. Okay. 178 00:21:42.148 --> 00:21:45.749 Before the trial, let's say. 179 00:21:47.278 --> 00:21:51.209 What is very bad is what's called data dredging. 180 00:21:55.469 --> 00:21:59.548 And data dredging is, you look at the results 181 00:22:01.949 --> 00:22:06.898 to find, you know, to find whatever interesting, 182 00:22:10.858 --> 00:22:22.439 interesting things. What's very bad is you do your experiments, you do your trials, and you look at the results and see if there's any interesting stuff you can see in the results. 183 00:22:22.439 --> 00:22:37.193 And that's bad, because you see coincidences and stuff. What you have to do is, before the experiment or the trial, you define in advance what the measure of success will be. And then you run the trial. 184 00:22:37.679 --> 00:22:44.759 And then, did the trial meet that measure of success that you defined in advance? 185 00:22:44.759 --> 00:22:50.398 And that's the key, yeah, before the trials. 186 00:22:51.294 --> 00:23:05.364 That's key to making things solid, because the companies are going to push things, and so you have to have rules here. You also want to have the rules defined in advance because so very much money is involved. 187 00:23:05.814 --> 00:23:08.124 If there's any ambiguity, then 188 00:23:08.759 --> 00:23:18.449 everyone gets, you know, there's lawsuits flying everywhere. Okay, so that's where things like analysis of variance, t-tests, and stuff like that come in. 
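The two tests summarized so far can be made concrete with a few lines of arithmetic. What follows is only a sketch under stated assumptions: the hair-length numbers are invented for illustration, the function names `welch_t` and `anova_f` are hypothetical, and the code stops at the test statistics. Converting them into "how likely is what you saw" requires the t and F distributions from a statistics package, which is not shown here.

```python
import math
import statistics


def welch_t(first, second):
    """Welch's t statistic for the difference in means of two samples."""
    mean_gap = statistics.mean(second) - statistics.mean(first)
    # Standard error of the gap, using each sample's own variance.
    std_err = math.sqrt(statistics.variance(first) / len(first)
                        + statistics.variance(second) / len(second))
    return mean_gap / std_err


def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical hair-length measurements (made-up numbers, arbitrary units).
control = [5.1, 4.8, 5.5, 5.0, 4.9]
aspirin = [5.0, 5.2, 4.7, 5.3, 4.9]
sunlight = [3.9, 4.1, 3.6, 4.3, 4.0]

t_stat = welch_t(control, sunlight)
f_stat = anova_f([control, aspirin, sunlight])
print(f"t = {t_stat:.2f}, F = {f_stat:.2f}")
```

A large |t| or a large F says the group means differ by more than the scatter inside the groups would suggest. With exactly two equal-sized groups the two statistics agree, since F equals t squared, which is one way to see that analysis of variance extends the t-test to several populations.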
189 00:23:18.449 --> 00:23:27.239 And so, you know, the underlying math is available if you want to look at that, but I gave you the executive summary. 190 00:23:27.239 --> 00:23:34.979 Questions? Okay. The next thing I had a film on was linear regression. 191 00:23:40.169 --> 00:23:47.699 So again, you see, the whole thing with statistics, let's talk about, 192 00:23:49.348 --> 00:23:52.888 is, you know, we want to find parameters. 193 00:23:52.888 --> 00:24:00.028 Find parameters, test hypotheses, 194 00:24:04.558 --> 00:24:14.548 and find relations, and so on. Find relations, okay, between stuff. 195 00:24:14.548 --> 00:24:21.778 You know, and the goal is, say, to predict, predict the future 196 00:24:21.778 --> 00:24:31.618 better. So the stock market fell by 3% last Friday. I mean, I haven't even looked at my phone today. I've been working all day, but 197 00:24:31.618 --> 00:24:35.308 you know, what's it doing today? 198 00:24:46.888 --> 00:24:57.568 Trying to figure out, uh, I can't even read my chart. 199 00:25:00.538 --> 00:25:06.959 I think it may have gone up today or something, but it's still below a few days ago. So you want 200 00:25:06.959 --> 00:25:10.288 to do predictions, and one way 201 00:25:10.288 --> 00:25:14.729 is a linear regression, that you look at, um, 202 00:25:16.588 --> 00:25:20.608 no, let me use the stock market, say, 203 00:25:22.229 --> 00:25:33.534 the S&P 500. It's a weighted average of approximately the top 500 stocks, and 500's only an approximation. It could be 499 or something. 204 00:25:33.534 --> 00:25:44.933 And they're weighted by the value of the stock outstanding, which is another question that's not well defined. So we might look at something like, here are old values for 
205 00:25:49.288 --> 00:25:59.368 you know, maybe this, hypothetically, this is Friday, Thursday, Wednesday, Tuesday, Monday, and so we'd want to say, 206 00:26:02.009 --> 00:26:05.729 um, whatever, 207 00:26:05.729 --> 00:26:09.088 what's going to happen here? 208 00:26:09.088 --> 00:26:12.449 Linear regression means, say, you fit a line. 209 00:26:14.009 --> 00:26:17.939 You fit a line to the data and then maybe extrapolate it. Okay. 210 00:26:17.939 --> 00:26:26.759 And here, I'll throw some math at you, just for fun. It can't just be, you know, we've got to have a little quantitative stuff. 211 00:26:26.759 --> 00:26:33.719 And so what we have is, say, these are points x_i and y_i. 212 00:26:33.719 --> 00:26:38.429 And we want to fit a line y = a x + b, let's say. 213 00:26:38.429 --> 00:26:43.439 I'll do it on the next page here. I'll rewrite this. So we've got the data, 214 00:26:45.328 --> 00:26:49.949 that's x_i, y_i, 215 00:26:49.949 --> 00:27:04.348 and I gave you the example, say, for exam 2, where I gave you the actual data from exam 2. The x axis was 216 00:27:04.348 --> 00:27:08.368 time to finish the exam and the y axis was grade. 217 00:27:08.368 --> 00:27:18.328 Okay, so what we do here is we want to, you know, we want to find 218 00:27:19.528 --> 00:27:24.568 a and b to minimize error. 219 00:27:26.308 --> 00:27:32.788 And, okay, now error. Basically, you know, how to define the error. 220 00:27:32.788 --> 00:27:41.128 And what we do often is we'll have the vertical distance here, and we want to minimize it. 221 00:27:41.128 --> 00:27:45.509 So we define e_i equals, say, 222 00:27:47.578 --> 00:27:53.159 a x_i plus b, minus y_i, or something, and squared. 223 00:27:53.159 --> 00:27:59.098 So this is the actual value and this is the computed, 224 00:28:00.838 --> 00:28:04.828 you know, from the line, from the line. Okay. 
225 00:28:04.828 --> 00:28:09.358 So we take the difference between the computed number from our 226 00:28:09.358 --> 00:28:17.459 regression line, and we take the difference from the actual y, and we square it. Now, why do we square it? 227 00:28:17.459 --> 00:28:22.199 The real reason, I think, is because the math is easier to do. 228 00:28:22.523 --> 00:28:35.153 And what it does is, it means that big errors are treated more seriously than small errors, because if the error is twice as big, it has four times the weight, because it's squared here. 229 00:28:35.993 --> 00:28:44.453 So in any case, this is what's conventional, to minimize the sum of the squared errors. So we want to minimize the sum of all the, 230 00:28:44.759 --> 00:28:51.959 sum of all the squared errors. So, minimize the sum of all the 231 00:28:54.118 --> 00:28:58.378 a times x_i plus b minus y_i, 232 00:28:58.378 --> 00:29:05.159 squared. So we want to pick a and b to minimize this, 233 00:29:05.159 --> 00:29:12.328 minimize that. Okay. So how do we do that? 234 00:29:12.328 --> 00:29:16.199 a and b are variable, and the x's and the y's are all fixed. 235 00:29:16.199 --> 00:29:20.249 Well, so let's call our big error E. 236 00:29:22.409 --> 00:29:26.969 Well, E equals 237 00:29:29.729 --> 00:29:34.229 you know, the sum of all the, okay. 238 00:29:37.588 --> 00:29:40.648 Oh, I can expand the square. Okay. 239 00:29:40.648 --> 00:29:44.189 So E is the 240 00:29:44.189 --> 00:29:47.398 a squared times the sum of all the x_i squared, 241 00:29:47.398 --> 00:29:51.778 plus n b squared, 242 00:29:51.778 --> 00:29:54.898 plus the sum of all the y_i squared, 243 00:29:55.949 --> 00:30:00.959 plus 2 a b times the sum of the x_i, 244 00:30:02.578 --> 00:30:08.909 minus 2 a times the sum of all the x_i y_i, 245 00:30:10.618 --> 00:30:16.288 um, minus 246 00:30:16.288 --> 00:30:19.558 2 b times the sum of all the y_i. 
247 00:30:19.558 --> 00:30:25.499 One, two, three, four, five, six. 248 00:30:25.499 --> 00:30:35.009 Okay, now here, some of those things are 249 00:30:36.989 --> 00:30:45.209 constant, like the sum of all the y_i squared. Okay. So now what we need to do is, 250 00:30:45.209 --> 00:30:51.538 actually, did I get this right? 251 00:30:59.878 --> 00:31:03.328 So now we want to find a and b to minimize that. 252 00:31:04.648 --> 00:31:10.078 Well, I can take dE over da, the derivative: 253 00:31:11.278 --> 00:31:14.308 2 a times the sum of all the x_i squared, 254 00:31:17.278 --> 00:31:23.219 plus 2 b times the sum of all the x_i, 255 00:31:23.219 --> 00:31:28.288 minus 2 times the sum of the x_i y_i, and that's it. 256 00:31:28.288 --> 00:31:33.778 Okay, and you want to set it equal to 0. It goes to 0 at the min. 257 00:31:36.749 --> 00:31:39.898 Okay, so basically what we have 258 00:31:42.509 --> 00:31:48.538 is a times the sum of the x_i squared, plus b times the sum of all the x_i, 259 00:31:48.538 --> 00:31:52.888 minus the sum of the x_i y_i, equals 0. 260 00:31:54.328 --> 00:31:59.038 Okay, and that is one equation. 261 00:31:59.038 --> 00:32:06.239 And now we can also take dE over db. 262 00:32:06.239 --> 00:32:15.058 And that is 2 n b, 263 00:32:17.429 --> 00:32:24.209 plus 2 a times the sum of the x_i, 264 00:32:26.098 --> 00:32:32.818 minus 2 times the sum of the y_i, 265 00:32:35.729 --> 00:32:41.489 equals 0. And so what we get is 266 00:32:44.249 --> 00:32:50.159 n b, 267 00:32:50.159 --> 00:33:01.348 sorry, this should be n, plus a times the sum of the x_i, minus the sum of the y_i, equals 0, or something, and then we can solve 268 00:33:02.969 --> 00:33:08.038 for a and b, and this is how we can do linear regression. 269 00:33:08.038 --> 00:33:16.798 I might have gotten something wrong. You get the idea, and you can solve for a and b, and then you can also compute, 270 00:33:19.169 --> 00:33:23.098 okay, yeah. Okay. 
271 00:33:23.098 --> 00:33:29.578 Basically, that's linear regression. 
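Setting the two derivatives to zero gives two linear equations in a and b, which can be solved directly. The sketch below is one minimal way to do that; the function name `fit_line` and the index values are hypothetical, chosen only to illustrate the "fit a line, then extrapolate" idea, and are not the data written on the board in the lecture.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b from the two normal equations.

    a * sum(x^2) + b * sum(x) = sum(x*y)
    a * sum(x)   + b * n      = sum(y)
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx  # zero only when every x value is the same
    if det == 0:
        raise ValueError("need at least two distinct x values")
    a = (n * sxy - sx * sy) / det
    b = (sy * sxx - sx * sxy) / det
    return a, b


# Hypothetical closing values for Monday..Friday (made-up numbers).
days = [0, 1, 2, 3, 4]
closes = [4010.0, 4025.0, 4018.0, 4040.0, 3920.0]

slope, intercept = fit_line(days, closes)
next_day = slope * 5 + intercept  # extrapolate one day past the data
print(f"slope = {slope:.2f} per day, predicted next value = {next_day:.1f}")
```

Extrapolating a fitted line says nothing about whether the trend will continue; the fit only summarizes the points it was given.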
272 00:33:31.378 --> 00:33:35.249 That's the basic thing. Now it can start getting interesting, 273 00:33:35.249 --> 00:33:39.028 because next is, um, 274 00:33:39.028 --> 00:33:43.019 up here. Next, you can get things such as 275 00:33:46.409 --> 00:33:54.028 multiple linear regression. 276 00:33:54.028 --> 00:34:06.719 So you've got several independent variables, so what you want is, your output y will be, say, a_1 x_1 plus a_2 x_2 plus a_3 x_3 plus some error term. 277 00:34:08.369 --> 00:34:15.659 All right, put lots of i's in here. So you've got several independent variables. 278 00:34:15.659 --> 00:34:21.179 And I use an example here: let's suppose RPI was trying to predict 279 00:34:21.179 --> 00:34:26.728 freshman GPAs. That's the dependent variable, GPA. 280 00:34:27.384 --> 00:34:38.364 And then what RPI might be using to try to predict this: they might look at your high school grade, your high school class rank, which are two different things, 281 00:34:39.143 --> 00:34:43.793 the number of AP courses you had, are you in a fraternity, 282 00:34:44.128 --> 00:34:56.039 that would just be a zero-one variable; are you an athlete, a zero-one variable; the home state you're from; maybe how far you travel to come to RPI; maybe your height, your weight. 283 00:34:56.039 --> 00:35:03.599 So in the example here, I've got eight different possible independent variables. And then the dependent variable is your GPA this year. 284 00:35:03.599 --> 00:35:08.068 And then, so we'd want to do the regression. 285 00:35:08.068 --> 00:35:14.009 And we'd get the eight coefficients of how much each one of these affected your GPA. 286 00:35:14.009 --> 00:35:22.619 And then we'd get the error, which is, you know, how well those eight of them do. 287 00:35:23.034 --> 00:35:36.503 And you could do the formula. You define the error, you know, the unknowns are the coefficients, and then you just go and you do the derivative and optimize it. 
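The same "define the error, take derivatives, set them to zero" recipe gives one linear equation per coefficient, and the resulting system can be solved mechanically. The sketch below assumes a constant (intercept) term is added to the model, which the formula spoken above does not include, and uses plain Gaussian elimination so that nothing outside the standard library is needed. The function names and the student data are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be redundant")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


def fit_multiple(rows, ys):
    """Least-squares coefficients for y = a0 + a1*x1 + ... + ak*xk."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 is the intercept
    k = len(design[0])
    # Normal equations: (X^T X) a = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical students: (high school average, AP courses, athlete 0/1).
students = [(92, 3, 0), (85, 1, 1), (97, 5, 0),
            (88, 2, 0), (90, 4, 1), (80, 0, 1)]
gpas = [3.4, 2.9, 3.8, 3.1, 3.5, 2.6]  # made-up first-semester GPAs

coefficients = fit_multiple(students, gpas)
print([round(c, 3) for c in coefficients])
```

The first returned value is the intercept and the rest line up with the columns of `students`. The "singular system" error is what happens when one predictor is an exact combination of the others, so the equations no longer pin down a unique answer.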
288 00:35:36.893 --> 00:35:39.923 Well, that's how you get to derive the formula, then, to use. 289 00:35:40.918 --> 00:35:46.739 I'm sorry, um, okay. 290 00:35:48.539 --> 00:35:53.849 Do share screen. 291 00:35:59.969 --> 00:36:01.164 It's being recorded, 292 00:36:01.164 --> 00:36:02.364 it's being shared, 293 00:36:06.474 --> 00:36:08.304 so here's the probability question, 294 00:36:08.304 --> 00:36:13.164 let the random variable be the time until the screen sharing crashes again, 295 00:36:13.164 --> 00:36:15.414 it's crashed twice so far in 30 minutes. 296 00:36:17.034 --> 00:36:31.733 So this is a random variable. Do you think it's exponentially distributed or what? You know, so, what we could do is we could record the time until the, um, Webex screen sharing crashes again. And we get a number of data points for the class. 297 00:36:32.213 --> 00:36:39.114 And then we could say, do they look like they're exponential or what? That's testing the fit of a distribution. 298 00:36:40.108 --> 00:36:51.449 Okay, so, in any case, I'm giving you a high level description of multi-linear regression. We've got a number of independent variables and we want to find coefficients to predict. 299 00:36:51.449 --> 00:36:54.628 Say your GPA at Christmas or something. 300 00:36:54.628 --> 00:37:01.228 Um, the next thing is to go stepwise. 301 00:37:03.389 --> 00:37:09.628 Stepwise regression, and we want to. 302 00:37:09.628 --> 00:37:16.619 Pick the most important independent variables. 303 00:37:22.648 --> 00:37:29.398 And add them to the mix 1 by 1 by 1 by 1. 304 00:37:31.018 --> 00:37:37.528 And so what you might do is you might use, say, the correlation coefficient. 305 00:37:46.378 --> 00:37:51.449 So, what you do is, um, say, that's called rho typically. So we compute. 306 00:37:53.699 --> 00:37:58.318 Rho for each independent variable separately. 307 00:38:02.969 --> 00:38:13.289 And the dependent variable, and we pick the variable.
308 00:38:15.750 --> 00:38:20.940 With the biggest absolute value of rho. 309 00:38:20.940 --> 00:38:25.619 Okay, and then we can do that and then what we can do. 310 00:38:25.619 --> 00:38:31.530 Is we can then, um, compute the residual. 311 00:38:37.079 --> 00:38:42.210 Um, so the residual errors, let's say. 312 00:38:42.210 --> 00:38:48.420 And add the new most important variable. 313 00:38:55.619 --> 00:38:59.280 And this will be a stepwise multi-linear regression. Um. 314 00:39:01.079 --> 00:39:05.400 So, at each step, you, um, you have a new, um. 315 00:39:06.480 --> 00:39:17.610 Dependent variable, it's the errors, and you find the most important independent variable you haven't yet used, and you add it in, you compute errors and recompute a new regression. 316 00:39:17.610 --> 00:39:24.179 And as you add more and more independent variables, the fit gets better and better. 317 00:39:24.179 --> 00:39:28.260 You hope so, um. 318 00:39:30.869 --> 00:39:37.440 So, that's, um, let's say, um, you know, that's used a lot in the social sciences and so on, um. 319 00:39:39.210 --> 00:39:46.380 Now, funny things can happen because independent variables might be strongly correlated with each other. 320 00:39:46.380 --> 00:39:58.289 See, you might have 2 independent variables, and they both have very strong correlations with the dependent, but they're also strongly correlated with each other. So you don't need both of them. You only need 1. Let me give an example here. 321 00:39:58.289 --> 00:40:02.039 So, I'm trying to predict your, say, your 1st. 322 00:40:02.039 --> 00:40:08.219 Semester's GPA at RPI, and maybe your high school grade is a strong predictor. 323 00:40:08.219 --> 00:40:15.809 And maybe your high school rank is a strong predictor, but your grade and your rank are correlated with each other so strongly that you don't need both of them. 324 00:40:15.809 --> 00:40:19.380 That sort of thing, so weird things can happen. So.
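NOTE A rough Python sketch of the stepwise idea described above, under a stated simplification: at each step it picks the unused variable whose correlation coefficient rho with the current residuals is largest in absolute value, then fits a one-variable regression to those residuals. This is a forward stagewise variant; it does not jointly refit all coefficients after each addition, which a full stepwise regression would do. All names and the toy data are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def correlation(xs, ys):
    """Pearson correlation coefficient rho; 0.0 if either side is constant."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def stagewise_select(columns, ys, steps):
    """columns: dict mapping variable name -> list of values.

    Returns a list of (name, slope) pairs in the order variables were picked.
    """
    y_mean = mean(ys)
    residuals = [y - y_mean for y in ys]  # errors of the constant-only fit
    unused = dict(columns)
    picked = []
    for _ in range(min(steps, len(unused))):
        # Most important remaining variable: biggest |rho| with the residuals.
        name = max(unused, key=lambda k: abs(correlation(unused[k], residuals)))
        xs = unused.pop(name)
        mx = mean(xs)
        sxx = sum((x - mx) ** 2 for x in xs)
        sxr = sum((x - mx) * r for x, r in zip(xs, residuals))
        slope = 0.0 if sxx == 0 else sxr / sxx
        # The new dependent variable is what this variable failed to explain.
        residuals = [r - slope * (x - mx) for x, r in zip(xs, residuals)]
        picked.append((name, slope))
    return picked


# Toy data: y depends only on "grade"; "height" is an unrelated column.
columns = {"grade": [1, 2, 3, 4, 5], "height": [3, 1, 4, 1, 5]}
ys = [2 * g for g in columns["grade"]]
picked = stagewise_select(columns, ys, steps=1)
```

Because selection looks only at the residuals, a second variable that is strongly correlated with one already picked tends to score low, which matches the point that you only need one of two redundant predictors.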
325 00:40:20.969 --> 00:40:27.179 And another thing is that if you bring in enough independent variables, you can explain anything and it doesn't mean anything. So. 326 00:40:28.650 --> 00:40:32.190 And you may have nonlinear relationships so, um. 327 00:40:33.719 --> 00:40:37.860 You see that with statistics about health and so on that, um. 328 00:40:40.704 --> 00:40:53.635 Well, an example, silly example would be if you, um, eat fewer calories, you will live longer. And that's actually really the only really proven way to live longer, is eating fewer calories. 329 00:40:53.635 --> 00:40:59.094 But if you eat too few calories, you die of starvation. So there's a nonlinear relationship there. 330 00:41:00.000 --> 00:41:09.960 Okay, so that's examples of linear regression to motivate where it fits in the universe. 331 00:41:11.550 --> 00:41:18.510 And if there's a nonlinear effect, then you add the square of some variable as a new independent variable and you fit on that one. 332 00:41:18.510 --> 00:41:28.260 Okay, now there's other ways to measure errors than just the square of the difference. You could have the absolute value of the difference and so on. 333 00:41:28.260 --> 00:41:33.960 And people might argue 1 is better than the other, but all those other ones are very difficult mathematically. 334 00:41:33.960 --> 00:41:48.804 Well, there's a number of tests that you see that statisticians, people, tend to use, and if you ask why they use these tests and not some other 335 00:41:48.804 --> 00:41:49.434 tests, 336 00:41:50.034 --> 00:41:52.315 the honest answer is back when they were developed, 337 00:41:52.315 --> 00:41:55.375 they were actually easier to work with before computers. 338 00:41:55.585 --> 00:41:55.914 So. 339 00:41:56.280 --> 00:41:59.730 But the books don't necessarily say that's the real reason. 340 00:42:00.809 --> 00:42:05.789 Okay, um, so. 341 00:42:14.909 --> 00:42:19.860 Nonparametric stats have a couple of different definitions, but maybe.
342 00:42:19.860 --> 00:42:29.699 1 is, don't assume anything about the distribution, let's say. 343 00:42:34.559 --> 00:42:47.159 Um, so maybe the distribution of the underlying things is not normal, um, or you don't know. So these tests, they're, they're much more robust. 344 00:42:50.519 --> 00:42:56.639 But they're weaker, um, which makes common sense. 345 00:42:57.719 --> 00:43:04.860 So, um, so maybe we're testing, um, we're looking at some numbers and. 346 00:43:07.260 --> 00:43:13.289 Let me, um, that's my. 347 00:43:18.449 --> 00:43:25.769 I got some blue things here and let's say, I don't know, and we got some red things. Um. 348 00:43:32.610 --> 00:43:43.920 Something like that perhaps, um, okay, 1, 2, 3, 4, 5 blue, 1, 2, 3, 4, 5 red. Okay. So now the question is. 349 00:43:47.039 --> 00:43:54.869 Um, are the red observations smaller? 350 00:43:58.469 --> 00:44:05.699 Or, well, does the red population have a smaller mean? So. 351 00:44:18.900 --> 00:44:24.389 A parametric method would be to do a t-test, so we're not gonna use that. We're going to use. 352 00:44:27.150 --> 00:44:32.579 Use only their order. 353 00:44:32.579 --> 00:44:38.880 Okay, and what we're going to do. 354 00:44:40.619 --> 00:44:47.969 We're going to count the number of times, sort of, a red number is less than a blue number. 355 00:44:50.760 --> 00:44:57.630 Okay, and here the 1st red is, um, less than all 5 blue numbers. 356 00:44:57.630 --> 00:45:06.179 For the U statistic, the 2nd red is less than all 5 blue numbers. The 3rd red is less than 4 of them. 357 00:45:06.179 --> 00:45:11.969 The 4th red is less than 3 and the 5th red is less than 3. 358 00:45:11.969 --> 00:45:17.670 And we're going to get 10, 14, and 6, which is 20. 359 00:45:17.670 --> 00:45:22.440 And this is called a U stat, this is called a. 360 00:45:22.440 --> 00:45:31.050 This is a Mann-Whitney U statistic.
361 00:45:32.429 --> 00:45:44.670 And the thing is that if the red and the blue populations are the same and n is reasonably big, then U is actually normally distributed. And, um. 362 00:45:49.199 --> 00:45:53.219 And, actually, that's the null hypothesis. 363 00:45:53.219 --> 00:46:03.030 Then the mean of U is, like, n squared over 2, and n is 5 here, so it will be 12.5, and then the, um. 364 00:46:03.030 --> 00:46:07.679 Then the variance for U. 365 00:46:09.030 --> 00:46:13.289 Okay, um, I don't actually mind if you make a. 366 00:46:13.289 --> 00:46:16.980 A betting pool among yourselves to, um. 367 00:46:16.980 --> 00:46:20.849 As to when this will happen again, so it's okay by me. 368 00:46:20.849 --> 00:46:29.610 I want 10%. Um, okay. 369 00:46:34.199 --> 00:46:37.230 So this U stat, so I count the number of times. 370 00:46:37.230 --> 00:46:42.750 A red observation is less than a blue 1. I do it for all the n squared pairs. 371 00:46:43.105 --> 00:46:54.594 And, um, so there's, um, that's 25 pairs, and I count the number of times red is less than blue. If it's all random, it'll be 12 and a half times. Actually, it was 20 times. 372 00:46:54.594 --> 00:47:01.644 And the variance on U, if it's random, is 25 times 11 divided by 12. 373 00:47:04.110 --> 00:47:07.800 Which is about 23 or something. 374 00:47:07.800 --> 00:47:11.039 So, the sigma is, say. 375 00:47:12.750 --> 00:47:19.110 4.8 or whatever, and so, in other words, so I observed 20. 376 00:47:19.110 --> 00:47:25.170 Um, that the red is less than blue 20 times, so, which is way more than a sigma. 377 00:47:25.170 --> 00:47:31.739 So, um, okay, so this suggests that this, this. 378 00:47:31.739 --> 00:47:45.030 So, the null hypothesis, the red and blue means are the same. 379 00:47:51.300 --> 00:47:59.610 You know, this is something like, um, you know, that's, you know, let's give or take, um. 380 00:48:00.840 --> 00:48:09.869 1.5, um, you know, as a normal variable or something.
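NOTE A small Python sketch that reproduces the arithmetic above: count the red-below-blue pairs to get U, then compare it with the mean n squared over 2 and the variance 25 times 11 over 12. The lecture does not give the actual red and blue numbers, so the values below are hypothetical, chosen only so that the per-red counts come out as 5, 5, 4, 3, 3. Ties are ignored in this sketch.

```python
import math


def mann_whitney_u(red, blue):
    """Count the (red, blue) pairs in which the red value is smaller."""
    return sum(1 for r in red for b in blue if r < b)


def normal_approximation(u, n_red, n_blue):
    """Mean, standard deviation, and z-score of U under the null hypothesis."""
    mean = n_red * n_blue / 2
    variance = n_red * n_blue * (n_red + n_blue + 1) / 12
    sigma = math.sqrt(variance)
    return mean, sigma, (u - mean) / sigma


# Hypothetical observations (not from the lecture), see the note above.
red = [1, 2, 11, 13, 13.5]
blue = [10, 12, 14, 16, 18]

u = mann_whitney_u(red, blue)
mean, sigma, z = normal_approximation(u, len(red), len(blue))
# One-sided tail probability of a standard normal beyond z.
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))
print(u, mean, round(sigma, 2), round(z, 2))  # 20 12.5 4.79 1.57
```

With only 5 observations per group the normal approximation is rough, which is the same caveat the lecture raises about the small n.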
381 00:48:09.869 --> 00:48:16.320 So, it could happen, but it's very unlikely. So, and also my n, 5 here, is quite small. 382 00:48:16.320 --> 00:48:22.289 That's it. So, yeah, I, I took the shortcuts here. 383 00:48:25.079 --> 00:48:39.744 This is unlikely. Okay. So what I showed you was what's called a nonparametric way, it was a quick way. I had 2 populations, red and blue, that maybe had different means. Maybe not. 384 00:48:39.744 --> 00:48:41.605 My null hypothesis is the same mean. 385 00:48:42.570 --> 00:48:49.469 And, yeah, I did not use a t-test because that would assume that the populations are normal. 386 00:48:49.469 --> 00:48:56.730 Um, this is called a, or it's called a Wilcoxon rank-sum test or a Mann-Whitney U test. 387 00:48:56.730 --> 00:49:03.809 And I sort the observations, reds and blues together, and I count the number of times a red is less than a blue. 388 00:49:03.809 --> 00:49:15.210 All n squared pairs. So, if the underlying population is, in fact, normal, this is a weak test. It will. 389 00:49:15.210 --> 00:49:23.250 It will not reject the null hypothesis at times when the null hypothesis ought to be rejected, perhaps. But if the distribution is really weird. 390 00:49:23.250 --> 00:49:30.000 Then, um, then this will be more robust, so. 391 00:49:31.139 --> 00:49:34.500 And there's a lot of different tests like that. So. 392 00:49:37.139 --> 00:49:40.590 And finally, oh, not finally. 393 00:49:40.590 --> 00:49:46.170 Well, okay, the current hot thing with statistics is, of course, machine learning. 394 00:49:46.170 --> 00:49:50.610 So, um, let me put this the. 395 00:49:59.010 --> 00:50:03.869 This is your current hot application. This is the current. 396 00:50:05.849 --> 00:50:10.349 Big application of statistics. Okay. 397 00:50:12.300 --> 00:50:20.489 So, um, you know, so I drive a Tesla and, um.
398 00:50:20.489 --> 00:50:27.539 So Tesla, you know, they want to have full self driving, which I think basically is a fraud at the moment, but it's getting there. 399 00:50:27.539 --> 00:50:36.239 So, you know, it tries to recognize objects, recognize speed limit signs, and lines on the road and bicyclists and whatever. 400 00:50:36.239 --> 00:50:41.190 And so it's using statistics of the videos. 401 00:50:41.190 --> 00:50:50.369 So, it's computing and using massive amounts of compute power, server farms. So that's the big application, that's whole courses on its own. But that's the current hot. 402 00:50:50.369 --> 00:50:54.210 Topic for statistics, machine learning, so. 403 00:50:54.210 --> 00:50:58.349 So, I won't go into that in much more detail, but. 404 00:50:58.349 --> 00:51:05.309 Then the next fun thing is, of course, since we're learning statistics, what we got to do is see how to lie with statistics. 405 00:51:06.690 --> 00:51:21.449 Um, and there's various books on it and so on. So, this one's rather old, but it's a famous 1, um, 16, 17 years old but, um. 406 00:51:21.449 --> 00:51:26.730 So, you can have fun looking at these, how to mangle things and, um. 407 00:51:29.940 --> 00:51:33.750 All your jokes about lies, damned lies, and statistics, so. 408 00:51:35.639 --> 00:51:38.005 Bring in Mark Twain, stuff like that. 409 00:51:50.815 --> 00:51:54.114 [unclear] for sale. This was a new edition. So you can have fun. 410 00:51:55.440 --> 00:52:09.329 Oh, okay. So that's a good point to stop. What I was doing today is giving you some new ideas in statistics, talking about the videos that I recommended for Thursday. So. 411 00:52:09.329 --> 00:52:15.480 We saw things like, well, just a quick thing, but analysis of variance, what it is, very, in a few words. 412 00:52:15.480 --> 00:52:18.750 Regression, um, linear regression. 413 00:52:18.750 --> 00:52:26.699 We're trying to, we have a dependent variable.
It's a function of an independent variable, trying to fit a straight line and then see how good the fit is. 414 00:52:27.264 --> 00:52:39.775 Um, the goodness of fit, that's expressed by the correlation coefficient; the square of that is how much of the variance you explain. Lots of independent variables, that's multiple regression, and we can do a stepwise thing. 415 00:52:40.255 --> 00:52:44.454 We've got lots of possible independent variables. You don't know which ones are important. We. 416 00:52:45.000 --> 00:52:51.480 We compute the correlation coefficient of them with the dependent one, we bring them in 1 by 1 and recompute after every step. 417 00:52:51.480 --> 00:53:05.309 By the way, I had a hobby project, I think when I was in grad school, with a friend of mine, with my roommate, when we were at Harvard. We were doing animal performance statistics with reference to [unclear] Downs. So it was a thoroughbred race course. So. 418 00:53:05.309 --> 00:53:11.250 What we did was we bought, we advertised and bought a 6 foot high stack of racing forms. 419 00:53:11.250 --> 00:53:15.780 And we typed in by hand on computer punch cards. 420 00:53:15.780 --> 00:53:19.440 2000 cards' worth of statistics for horses. 421 00:53:19.440 --> 00:53:23.460 Um, we were trying to predict what horses would win and the. 422 00:53:23.460 --> 00:53:27.780 The independent, that was the dependent one, the independent variables were how many times the horse had. 423 00:53:27.780 --> 00:53:35.429 Won in the past, how many dollars it had won, its speed, how fast it had raced in the past, and so on, and we tried to do, um. 424 00:53:35.875 --> 00:53:48.804 So, we tried to do that and compute that. Then we wrote, like, Fortran programs and computed that. I was interested in the mathematics of it, but my roommate was interested in the horses themselves and so he went to the track a lot. 425 00:53:49.074 --> 00:53:55.344 I was getting a PhD in applied math, computer science.
He was getting a PhD in psychology. 426 00:53:55.800 --> 00:53:59.039 And he spent, like, 4 years on the program and then. 427 00:53:59.039 --> 00:54:02.639 I don't know, he decided he didn't like the topic any more. 428 00:54:02.639 --> 00:54:06.599 He liked statistics and stuff, so he dropped out of the, um. 429 00:54:06.599 --> 00:54:14.400 Infant psych PhD after he'd done all the experiments on infants and seeing how they interacted with each other. 430 00:54:14.400 --> 00:54:18.539 And then he went into. 431 00:54:18.539 --> 00:54:25.079 Law school, got a law degree, then went to work, he's from Canada, for the [unclear] securities commission. 432 00:54:25.079 --> 00:54:34.829 Doing regulating of companies and then doing stuff like that, making much more money than he would have made if he got his PhD in infant psych, of course. 433 00:54:34.829 --> 00:54:39.480 But in any case that's, um, doing multiple stepwise linear regression. So. 434 00:54:40.559 --> 00:54:48.510 There used to be a couple of people at, there was an associate dean of engineering called Paul and the chairman of math. Um. 435 00:54:48.510 --> 00:54:54.179 Used to go up to, um, Saratoga in the summer. In fact, they did okay. 436 00:54:54.179 --> 00:55:03.210 Where mathematics could actually make you some money. Okay, so that's a reasonable point to stop. Again, to remind you, um. 437 00:55:03.210 --> 00:55:07.530 After your homework 11 is done and then. 438 00:55:07.530 --> 00:55:12.239 Made it to Thursday, because that's already 1 day later then we can officially make. 439 00:55:12.239 --> 00:55:18.210 And I'll compute, once it gets graded. It'll take the TAs 2 days to grade it because it's not just multiple choice. Um. 440 00:55:18.210 --> 00:55:24.420 Then I'll compute a guaranteed minimum grade for you, that you'll get if you don't write the final. 441 00:55:24.420 --> 00:55:32.880 If you do write it, it can only help, and then we'll have a Q&A office hour every day before the final.
442 00:55:32.880 --> 00:55:40.079 And good luck, and then after the course, again, I'm open to questions and stuff. 443 00:55:41.369 --> 00:55:45.150 So have fun, and I hope your semester finishes off okay. 444 00:55:55.769 --> 00:55:58.889 Doing 1 person. Okay.