WEBVTT

1
00:00:09.329 --> 00:00:18.868
Hello.

2
00:01:05.129 --> 00:01:09.629
Hello.

3
00:01:32.724 --> 00:03:06.865
Okay,

4
00:03:06.865 --> 00:03:07.735
so.

5
00:03:14.789 --> 00:03:25.020
In theory, I'm actually broadcasting and recording stuff.

6
00:03:26.789 --> 00:03:30.180
Somebody wanted to check that for me. Okay.

7
00:03:31.650 --> 00:03:34.919
Go ahead. So.

8
00:03:40.229 --> 00:03:45.270
Probability class.

9
00:03:48.599 --> 00:03:53.370
26, so.

10
00:04:06.180 --> 00:04:16.500
Oh, I'm working here.

11
00:04:16.500 --> 00:04:22.500
Hello.

12
00:04:26.365 --> 00:04:40.795
Hello.

13
00:04:42.478 --> 00:04:48.658
Okay, so where are we? Class 26.

14
00:04:48.658 --> 00:04:53.098
26, um.

15
00:04:53.098 --> 00:04:56.788
Chapter 8 of Leon-Garcia, the textbook.

16
00:04:56.788 --> 00:05:01.168
And then.

17
00:05:02.309 --> 00:05:07.228
Stop here, come on.

18
00:05:07.228 --> 00:05:13.709
Okay.

19
00:05:13.709 --> 00:05:19.288
So, I've got some blurbs I've typed into the blog, and then I'll have

20
00:05:19.288 --> 00:05:23.399
a couple of points from chapter 8, jumping around.

21
00:05:23.399 --> 00:05:28.528
First, 2 weeks ago or so I showed you a demo in Mathematica,

22
00:05:28.528 --> 00:05:31.678
which is the world's best algebra package.

23
00:05:31.678 --> 00:05:35.069
Um, here is a printout

24
00:05:35.069 --> 00:05:38.129
of the session and a few things before.

25
00:05:38.129 --> 00:05:44.069
So you can see the commands that I did and what the output was. So,

26
00:05:44.069 --> 00:05:47.939
a PDF of the session, um.

27
00:05:47.939 --> 00:05:51.778
Some things that it will show you are that

28
00:05:54.418 --> 00:06:01.319
um, it will do, I did some stuff for the startup, it'll do an integral.

29
00:06:01.319 --> 00:06:07.048
Um, for example, integrate sine squared of x.

30
00:06:07.048 --> 00:06:12.869
And it did, it's fine. Um, and a definite integral: I could put in limits and that gives me a number.

31
00:06:12.869 --> 00:06:16.528
Ignore Manipulate. It will do summation. Here was a

32
00:06:16.528 --> 00:06:21.928
um, input 3 was a fixed sum of x squared

33
00:06:21.928 --> 00:06:29.908
from 0 to 10. Well, actually x was a variable, so that was sort of a silly thing to do. Um, that's...

34
00:06:29.908 --> 00:06:33.598
But I could also say K was the high end, um.

35
00:06:33.598 --> 00:06:41.579
And here on line 6, summing x squared, varying x from 0 to K, gave me the answer. Integrals...

36
00:06:41.579 --> 00:06:45.418
Now, integrals again.

37
00:06:45.418 --> 00:06:54.869
Most expressions do not have closed integrals. If a closed integral exists, then Mathematica will find it.

38
00:06:54.869 --> 00:07:06.088
If it does not exist, it will use numerical techniques to give you an approximation. It'll always do derivatives. So line 8 there is the derivative of sine cubed times an exponential.

39
00:07:06.088 --> 00:07:11.579
I can define functions. Like, I can define the function.

40
00:07:11.579 --> 00:07:14.999
The syntax for a function is a little messy.

41
00:07:14.999 --> 00:07:20.699
Um, basically, I have to put an underscore after

42
00:07:20.699 --> 00:07:29.369
x, and on the left-hand side it's colon-equals. Ignore the syntax. But what I did in line 10 there is define the Gaussian

43
00:07:29.369 --> 00:07:34.709
probability density function. It's built into the system. It's an existing function. I just showed you

44
00:07:34.709 --> 00:07:40.798
how to define it. In line 11, I could plot a function, say from -3 to 3.

45
00:07:40.798 --> 00:07:46.499
And I could also, if this was a live session, drag it, and so on.

46
00:07:47.608 --> 00:07:56.399
Um, then line 12, uh, I integrated that density function from -1 to 1. It's not a closed integral.

47
00:07:56.399 --> 00:08:01.559
Mathematica gave me an answer in terms of an error function. Um.

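NOTE The integral of the Gaussian density from -1 to 1, which Mathematica returns in terms of the error function, can be checked numerically. This is a minimal Python sketch, not the lecture's Mathematica session. It assumes a standard normal (mean 0, standard deviation 1) and uses the identity P(|X| < k) = erf(k / sqrt(2)); the function name `prob_within` is made up for illustration.

```python
import math
from statistics import NormalDist


def prob_within(k: float) -> float:
    """P(|X| < k) for a standard normal X, written with the error function."""
    return math.erf(k / math.sqrt(2))


p_erf = prob_within(1.0)

# Cross-check against the standard-library normal CDF.
std_normal = NormalDist(mu=0.0, sigma=1.0)
p_cdf = std_normal.cdf(1.0) - std_normal.cdf(-1.0)

print(round(p_erf, 4), round(p_cdf, 4))  # both about 0.6827
```

Both routes give roughly 0.68, the probability of landing within one standard deviation of the mean.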
48
00:08:02.699 --> 00:08:13.619
Erf, which is a predefined function. Um, and then I could say, well, I don't want it in terms of the predefined function. I want a number.

49
00:08:13.619 --> 00:08:24.119
So line 13, I put in N, a function that converts these things to numbers. And the integral of that thing from -1 to 1 was 0.68.

50
00:08:24.119 --> 00:08:29.098
So that's the probability that you're within 1 sigma of the mean.

51
00:08:29.098 --> 00:08:36.899
And then I integrated the density function from 0 to infinity. You can put in Infinity; it means infinity.

52
00:08:36.899 --> 00:08:41.099
It gave me one half, um.

53
00:08:41.099 --> 00:08:46.739
And again, the syntax is weird. Braces mean a list; they don't mean a set.

54
00:08:46.739 --> 00:08:49.979
And function arguments have square brackets.

55
00:08:49.979 --> 00:08:56.908
I take the density function, uh, evaluated it at 2. I get output 15. It involves e,

56
00:08:56.908 --> 00:09:03.269
square root of 2 pi. I want it as a number. I do that, it's 0.05, and so on. Um.

57
00:09:04.769 --> 00:09:11.188
Next thing I'm showing you is, this is a place where Mathematica, you get to love it, is it works with functions that have

58
00:09:11.188 --> 00:09:16.288
case statements in them, like piecewise linear or something.

59
00:09:16.288 --> 00:09:24.269
In line, um, 17, I defined a function which has a conditional. It's a step, it's a square wave.

60
00:09:24.269 --> 00:09:28.499
For x from minus a half to a half, it's 1; otherwise it's 0.

61
00:09:28.499 --> 00:09:31.649
I plotted it in 18.

62
00:09:31.649 --> 00:09:36.239
You don't see the vertical sides of the square wave, but you see the square wave.

63
00:09:36.239 --> 00:09:41.278
I integrate it. Okay. Then I want to do a convolution of that with itself.

64
00:09:41.278 --> 00:09:45.418
And so it's an integral of that function, g, and g

65
00:09:46.438 --> 00:09:54.178
[unclear], integrating. Now, I had to do a few tricks to make this work.
If you actually want to use Mathematica,

66
00:09:54.178 --> 00:09:57.479
um, come and ask me. So,

67
00:09:57.479 --> 00:10:07.558
I'm going through it really quickly now, because I spent some considerable time before that other class making things work. In any case, I integrate the square wave and we get a triangle wave.

68
00:10:07.558 --> 00:10:17.548
Um, and I can print out what it is. So here, it's a thing with a case statement with 3 cases, the triangle wave:

69
00:10:17.548 --> 00:10:22.318
um, left side of the triangle, right side of the triangle, outside the triangle.

70
00:10:22.318 --> 00:10:29.428
And then I can say, okay, I'm going to convolve the triangle wave with itself.

71
00:10:29.428 --> 00:10:37.828
And so this, so the step function is a square wave. The convolution is the sum of 2 square-wave random variables.

72
00:10:37.828 --> 00:10:42.089
The next convolution will be the sum of 4

73
00:10:42.089 --> 00:10:46.139
square-wave random variables. I, um,

74
00:10:47.788 --> 00:10:52.918
by doing another convolution. Let's get it working, and then I plot it.

75
00:10:52.918 --> 00:10:58.019
It's looking like this. It's a piecewise, um, quadratic function.

76
00:10:58.019 --> 00:11:03.269
And it doesn't look that far off from a normal. It's not exactly; it goes down to exactly 0, but

77
00:11:03.269 --> 00:11:09.389
the sum of 4, I mean, the square wave is pretty, you know,

78
00:11:09.389 --> 00:11:15.239
it's not a smooth function at all, but I add 4 square-wave random variables and it starts looking nice and smooth.

79
00:11:15.239 --> 00:11:22.229
Um, and then I'm just plotting all 3 of them at once. Okay, so, well, this is what the

80
00:11:22.229 --> 00:11:27.058
sum of the 4 square-wave random variables looks like. So,

81
00:11:27.058 --> 00:11:33.448
it's fairly, if you had to do this by hand, it's fairly messy. It's got, like, 1, 2, 3, you know, 6 different cases.

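NOTE The convolution steps described above (square wave, then triangle, then the sum of 4) can be approximated on a grid without a symbolic package. This is a rough Python sketch under stated assumptions: the square wave is the Uniform(-1/2, 1/2) density sampled at cell midpoints, the integral is replaced by a Riemann sum, and the names `convolve`, `N`, and `H` are made up here. It is not the Mathematica code from the session.

```python
import math
from statistics import NormalDist

N = 100        # grid cells across the width-1 square pulse
H = 1.0 / N    # grid spacing


def convolve(f, g, h):
    """Riemann-sum approximation of the convolution of two sampled densities."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj * h
    return out


square = [1.0] * N                        # density of one square-wave variable
triangle = convolve(square, square, H)    # density of the sum of 2
four = convolve(triangle, triangle, H)    # density of the sum of 4

peak_triangle = triangle[len(triangle) // 2]   # value at x = 0
peak_four = four[len(four) // 2]               # value at x = 0

# Gaussian with the same variance as the sum of 4 (each uniform has variance 1/12).
peak_gauss = NormalDist(0.0, math.sqrt(4 / 12)).pdf(0.0)

print(round(peak_triangle, 3), round(peak_four, 3), round(peak_gauss, 3))
```

The peak of the sum of 4 comes out near 2/3, against about 0.69 for the matching Gaussian, which is one way to quantify "not that far off from a normal".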
82
00:11:33.448 --> 00:11:39.448
And so I can plot it, I can list what it is, I can print out particular values.

83
00:11:39.448 --> 00:11:44.068
Also, I can plot simple things like sines, and this is a 3D plot.

84
00:11:44.068 --> 00:11:50.548
And when I was showing you the live session, I could drag that plot and rotate it and so on.

85
00:11:50.548 --> 00:11:56.428
Mathematica has a concept of working with distributions. So a distribution is like

86
00:11:56.428 --> 00:12:03.538
the random variable's uniform distribution, [unclear] distribution, whatever; Gaussian, normal distribution. So,

87
00:12:03.538 --> 00:12:08.249
in line 45, n is a variable which

88
00:12:08.249 --> 00:12:16.019
has a distribution in it, and so we can't print it directly. What we do is we take this distribution and do something to it,

89
00:12:16.019 --> 00:12:21.629
like evaluate its density function at 3, and we get output 46 there,

90
00:12:21.629 --> 00:12:27.119
which we can turn into a number in 47.

91
00:12:27.119 --> 00:12:33.058
CDF will give me the cumulative distribution function, and it's plotting it from -3 to 3,

92
00:12:33.058 --> 00:12:37.979
evaluating at some point, like 2, and then

93
00:12:37.979 --> 00:12:41.009
um, getting a, um,

94
00:12:41.009 --> 00:12:48.568
a number there. Um, given the distribution, I can do things like find mean and variance, and it will do the obvious things.

95
00:12:48.568 --> 00:12:54.658
Um, I can do distributions in several variables, like a multi...

96
00:12:54.658 --> 00:13:00.479
So this is a 2-variable one here, line 56.

97
00:13:00.479 --> 00:13:08.578
And what I give for that is I give it the means of the variables separately, the standard deviations of the variables separately.

98
00:13:08.578 --> 00:13:13.438
Oh, the means, and then a covariance matrix, a 2 by 2 matrix,

99
00:13:13.438 --> 00:13:22.889
um, which will be 1, .5, .5, 1, and the off-diagonal will say how strongly the 2 variables are correlated.

100
00:13:22.889 --> 00:13:29.938
And I plot it and you see it's not a perfectly circular thing. It's a

101
00:13:29.938 --> 00:13:35.129
compressed mountain, you might say, because of the correlation between the 2 variables.

102
00:13:35.129 --> 00:13:38.788
And I can do density functions on that.

103
00:13:38.788 --> 00:13:46.048
Um, what I do in 63 is I integrate out 1 of the variables, look at a marginal density.

104
00:13:46.048 --> 00:13:51.658
And so all of this is, if you want to learn Mathematica, there's a learning curve, but it's a lot faster.

105
00:13:51.658 --> 00:13:56.489
Okay, so that's what I have here. That's what I showed you in class,

106
00:13:56.489 --> 00:14:02.729
printed out. And 1 more thing I have is, should you want to play with Mathematica yourself,

107
00:14:02.729 --> 00:14:07.019
what I have here is the session stored as a Mathematica file.

108
00:14:07.019 --> 00:14:11.158
So, if you fire up Mathematica, if you download that session file,

109
00:14:11.158 --> 00:14:14.818
you can read it into Mathematica and now you're in my session. So.

110
00:14:14.818 --> 00:14:23.548
Okay, um, this continued. Now, Thursday, I'm on a government, um,

111
00:14:23.548 --> 00:14:28.349
I'm doing some volunteer work for the federal government, so

112
00:14:28.349 --> 00:14:34.078
and so I won't be in class, so I'll have some videos for you to look at.

113
00:14:34.078 --> 00:14:39.269
And also, um, I may upload some, but here are some videos

114
00:14:39.269 --> 00:14:42.839
um, which give you other ways of looking at

115
00:14:42.839 --> 00:14:47.038
statistics. Um, 2 organizations: Research by Design and,

116
00:14:47.038 --> 00:14:50.099
of course, MIT's OpenCourseWare.

117
00:14:50.099 --> 00:14:56.099
So my value to you in this is selecting videos that I think are good videos that match the course, because

118
00:14:56.099 --> 00:15:00.479
it takes a long time to go through all the stuff on the web to find some stuff that might be interesting.

119
00:15:01.403 --> 00:15:14.933
Um, so 1 of my messages for statistics is, it is counterintuitive, and that has public policy implications, because the mathematics says things that most people might think are not true.

120
00:15:15.653 --> 00:15:18.533
And because most people don't know any math, um,

121
00:15:19.854 --> 00:15:29.874
I actually, years ago, had a letter to the editor published in the Times Union. Yeah, it was a joke letter that they published, though, saying that there was too much math being taught in high school.

122
00:15:30.024 --> 00:15:35.063
Um, so, um, because how often does the average citizen use math?

123
00:15:35.339 --> 00:15:43.048
And then I said, my last sentence in the letter is, the advantage of teaching less math in high schools: McDonald's would have a larger pool of workers.

124
00:15:43.048 --> 00:15:51.298
So, I may upload that letter, but in any case. So, there are some paradoxes. Simpson's paradox is 1.

125
00:15:51.298 --> 00:15:59.999
That thing I showed you with the admission statistics, that fictional case, that's happened in the real world, actually. Um.

126
00:15:59.999 --> 00:16:04.379
Other counterintuitive things, pulling up current events, like:

127
00:16:04.379 --> 00:16:12.328
a year ago, the number of excess deaths in the United States rose by 15% compared to the year before.

128
00:16:12.328 --> 00:16:18.239
And so the question is, if the total number of deaths went up by 15%,

129
00:16:18.239 --> 00:16:23.068
does that mean that the life expectancy went down by 15% or not?

130
00:16:23.068 --> 00:16:31.649
In fact, it did not. The life expectancy went down by a year or 2 in the United States, while the total number of deaths went up by 15%.

131
00:16:31.649 --> 00:16:35.668
And you can crunch through the math and see how that thing is consistent.

132
00:16:35.668 --> 00:16:42.448
Wikipedia has an article on these things, on something called Simpson's paradox. Um.

133
00:16:44.099 --> 00:16:49.139
So, in any case, if you want to learn more about that.

134
00:16:49.139 --> 00:16:52.318
So, bang, bang, bang, bang, bang.

135
00:16:52.318 --> 00:16:58.649
Oh, okay. No questions there. Next thing is getting back to

136
00:17:00.928 --> 00:17:07.048
This is chapter 8 continued. I want to show you some random things from it.

137
00:17:07.048 --> 00:17:13.648
And, um, 1.

138
00:17:15.388 --> 00:17:21.148
416, um, okay.

139
00:17:22.558 --> 00:17:25.769
Parameter estimation. Okay.

140
00:17:25.769 --> 00:17:35.308
I mentioned this before; I mention it again this time in greater

141
00:17:35.308 --> 00:17:38.699
And just for the fun of it, in theory,

142
00:17:38.699 --> 00:17:42.419
this is being recorded by, um,

143
00:17:44.128 --> 00:17:52.558
Webex. So, and in theory, I'm sharing my screen. Just to make sure.

144
00:18:00.778 --> 00:18:10.679
Cash? Yeah. Okay. Okay. So, um.

145
00:18:32.548 --> 00:18:36.568
Okay, see, you have a population.

146
00:18:43.588 --> 00:18:47.699
Draw n random variables.

147
00:18:51.719 --> 00:18:54.719
Okay, and

148
00:19:00.118 --> 00:19:05.699
um, you want to infer

149
00:19:08.939 --> 00:19:12.028
unknown parameters of the distribution.

150
00:19:19.199 --> 00:19:24.148
Um, and what is unknown depends, and so we've got

151
00:19:25.348 --> 00:19:31.588
different cases, um, say,

152
00:19:32.939 --> 00:19:36.118
you know, mu is unknown, let's say.

153
00:19:39.298 --> 00:19:43.828
Maybe sigma, depends. And in any case, um,

154
00:19:50.489 --> 00:19:55.199
Okay, that's, um,

155
00:19:59.009 --> 00:20:04.798
a good estimator of mu, and it's also unbiased. Um,

156
00:20:07.709 --> 00:20:12.868
which means its mean... okay, I talked about that, and it makes some common sense. Um.

157
00:20:14.308 --> 00:20:18.358
So next thing is, suppose

158
00:20:19.888 --> 00:20:26.278
sigma is unknown. Okay. Um.

159
00:20:26.278 --> 00:20:31.798
Sigma squared, just so I don't have to do a square root. Um.

160
00:20:33.838 --> 00:20:43.949
Okay, so the obvious estimator is this, um.

161
00:20:56.128 --> 00:20:59.578
I.e., we take our estimated mean

162
00:20:59.578 --> 00:21:12.838
is this, and we just take that as if it was the true mean, and we take squared differences, 1 over n. That's the, this is an estimate of the variance. And, um,

163
00:21:14.638 --> 00:21:17.699
The problem is, this is the obvious thing.

164
00:21:17.699 --> 00:21:23.128
Now, when I say something is obvious, a lot of the time I follow that by saying it's wrong.

165
00:21:23.128 --> 00:21:27.328
The problem with this is that it's, um,

166
00:21:32.818 --> 00:21:40.019
it's biased. What I mean: the expected value of this

167
00:21:42.449 --> 00:21:56.064
does not equal sigma squared. Okay. What I'm going to do now is show that. Um, chance to throw a little algebra at you. You know, occasionally I show you movies, um, videos; occasionally I hit you with some math.

168
00:21:56.064 --> 00:22:00.203
This will be hit-you-with-some-math for a couple of minutes. Um.

169
00:22:01.949 --> 00:22:07.048
Okay, I'll show you this, so

170
00:22:08.759 --> 00:22:12.628
S, um, squared, so.

171
00:22:20.038 --> 00:22:25.409
Okay, um, what I'm going to do in here

172
00:22:25.409 --> 00:22:29.548
And I'm sorry.

173
00:22:29.548 --> 00:22:38.699
Okay, there we go. Well, okay. Um.

174
00:22:42.088 --> 00:22:52.673
The trick in here is, I'm going to subtract and

175
00:22:52.673 --> 00:22:57.473
add mu. Mu is the, you know, is the actual mean of the distribution.

176
00:22:57.719 --> 00:23:03.088
And so what happens here

177
00:23:08.278 --> 00:23:21.778
Okay.

178
00:23:23.189 --> 00:23:26.669
Now, I can take this and I can just

179
00:23:26.669 --> 00:23:30.358
you know, so then

180
00:23:32.128 --> 00:23:37.798
and we'll just expand, um.

181
00:23:56.909 --> 00:24:01.409
Because, you know, expanding the square.

182
00:24:01.409 --> 00:24:06.449
Now, what I can do is, um,

183
00:24:08.278 --> 00:24:21.479
move the summation inside, and then the mu minus x bar is a constant. So

184
00:24:24.838 --> 00:24:34.318
um, again, mu minus x bar squared, plus there'll be a

185
00:24:34.318 --> 00:24:40.199
And, um, you know, mu...

186
00:24:42.209 --> 00:24:49.318
Okay, now.

187
00:24:54.624 --> 00:24:57.233
What we do here is

188
00:25:11.963 --> 00:25:13.344
this thing here?

189
00:25:13.433 --> 00:25:14.064
Oops.

190
00:25:15.778 --> 00:25:20.338
Not what I wanted. Um.

191
00:25:22.108 --> 00:25:28.499
I don't need this chunk here.

192
00:25:33.473 --> 00:25:48.384
Equals close

193
00:25:48.384 --> 00:25:49.134
that,

194
00:25:49.163 --> 00:25:50.153
um.

195
00:25:54.749 --> 00:26:05.729
Uh.

196
00:26:05.729 --> 00:26:10.949
Okay, um, I got it. I got it.

197
00:26:10.949 --> 00:26:15.749
It's showing you the division. Cool. Okay. Um.

198
00:26:29.338 --> 00:26:32.459
The middle term is, um,

199
00:26:35.759 --> 00:26:40.108
So the 1 over n, in any case, I'll take this in 2 steps.

200
00:26:42.538 --> 00:26:48.929
Um, plus, um.

201
00:26:48.929 --> 00:26:59.009
Over here, actually, what I do, I took that 1 over n, and then I divided through my stuff.

202
00:26:59.009 --> 00:27:07.409
Okay, and so the thing in the middle here is, um,

203
00:27:17.459 --> 00:27:21.749
Thing here is -2

204
00:27:21.749 --> 00:27:26.848
mu minus x bar, squared.

205
00:27:26.848 --> 00:27:33.689
And we take -2 and this, and this adds up to plus or

206
00:27:36.269 --> 00:27:44.038
minus. Okay, so the thing nets out to

207
00:27:51.388 --> 00:27:59.098
Um, okay, let's see.

208
00:28:03.358 --> 00:28:07.709
Well, it doesn't matter which way I put this: x bar minus mu,

209
00:28:07.709 --> 00:28:18.209
squared. Okay, so, and this is the, um, yes.

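NOTE The board work in this stretch is not captured by the captions. What follows is a reconstruction of the algebra being described, in the usual notation, which is assumed here rather than taken from the recording: samples $X_1,\dots,X_n$, sample mean $\bar{X}$, true mean $\mu$.

```latex
\begin{aligned}
\frac{1}{n}\sum_{i=1}^{n}\left(X_i-\bar{X}\right)^2
 &= \frac{1}{n}\sum_{i=1}^{n}\bigl((X_i-\mu)+(\mu-\bar{X})\bigr)^2 \\
 &= \frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)^2
    + \frac{2(\mu-\bar{X})}{n}\sum_{i=1}^{n}(X_i-\mu)
    + (\mu-\bar{X})^2 \\
 &= \frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)^2 - 2(\mu-\bar{X})^2 + (\mu-\bar{X})^2 \\
 &= \frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)^2 - (\bar{X}-\mu)^2 .
\end{aligned}
```

The middle term collapses because $\sum_{i}(X_i-\mu) = n(\bar{X}-\mu)$, which is where the "-2 ... squared" in the captions comes from.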
210
00:28:18.209 --> 00:28:25.588
Where this is my estimator up here. Now, I want this expected value.

211
00:28:25.588 --> 00:28:32.278
Um, expected value of that.

212
00:28:32.278 --> 00:28:37.858
Um, oh.

213
00:29:15.989 --> 00:29:20.338
Okay, um, now.

214
00:29:26.308 --> 00:29:29.489
How do we find the sum of this and

215
00:29:29.489 --> 00:29:43.618
Um, I'll skip this, it will take a while. I'll skip, uh,

216
00:29:43.618 --> 00:29:46.949
a step or 2, but in any case, um,

217
00:29:51.628 --> 00:29:56.578
It turns out that, um, skipping a little,

218
00:29:58.019 --> 00:30:02.368
to finish up, this thing here nets out to, um,

219
00:30:04.078 --> 00:30:09.778
the real, um, sigma, and this thing here nets out to, um,

220
00:30:13.979 --> 00:30:19.048
Um, and the thing nets out to

221
00:30:20.398 --> 00:30:33.719
Um, [unclear], doesn't matter.

222
00:30:33.719 --> 00:30:38.939
What this says is that

223
00:30:38.939 --> 00:30:45.719
if I use this obvious estimator for the variance of the whole population,

224
00:30:45.719 --> 00:30:49.499
calculated from what I saw with these samples,

225
00:30:49.499 --> 00:30:53.068
this estimator is too small.

226
00:30:53.814 --> 00:31:08.153
So, um, it's too small.

227
00:31:10.679 --> 00:31:15.148
Um, so an unbiased

228
00:31:17.308 --> 00:31:23.848
would be, um, is to weight it up by that. So we'll do,

229
00:31:23.848 --> 00:31:29.098
um, it would be, um, I don't know,

230
00:31:29.098 --> 00:31:33.808
a little hat on it or something, would be 1 over n minus 1,

231
00:31:33.808 --> 00:31:41.098
sum of all the x i minus the observed population mean, squared.

232
00:31:42.324 --> 00:31:55.614
Okay. Now, if you read less technical books on statistics, they'll say that you divide by n minus 1 instead of n because it's n minus 1 degrees of freedom, blah, blah, blah. I've never understood that,

233
00:31:56.729 --> 00:32:00.449
why that should be the case, but, um.

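NOTE The claim that the divide-by-n estimator is too small on average, and that dividing by n - 1 fixes it, can be checked exactly on a tiny population. This sketch makes an assumption the lecture does not: it uses a fair six-sided die with n = 3 draws instead of a Gaussian population, because the bias result does not depend on the distribution and a die allows exact enumeration. The names `expected_estimator`, `FACES`, and `N_SAMPLES` are invented for illustration.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)   # a fair die stands in for the population
N_SAMPLES = 3         # number of independent draws per sample


def expected_estimator(divisor: int) -> Fraction:
    """Exact expectation of sum((x - xbar)^2) / divisor over all equally likely samples."""
    total = Fraction(0)
    count = 0
    for sample in product(FACES, repeat=N_SAMPLES):
        xbar = Fraction(sum(sample), N_SAMPLES)
        total += sum((x - xbar) ** 2 for x in sample) / divisor
        count += 1
    return total / count


mu = Fraction(sum(FACES), 6)
true_var = sum((Fraction(x) - mu) ** 2 for x in FACES) / 6

biased = expected_estimator(N_SAMPLES)         # divide by n
unbiased = expected_estimator(N_SAMPLES - 1)   # divide by n - 1

print(true_var, biased, unbiased)  # prints 35/12 35/18 35/12
```

The divide-by-n version averages to (n - 1)/n times the true variance, here 2/3 of 35/12, while the divide-by-(n - 1) version recovers 35/12 exactly.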
234
00:32:01.709 --> 00:32:06.058
In any case, um, this is a mathematical thing, so.

235
00:32:06.058 --> 00:32:09.749
So what I just spent a few minutes showing you was, um,

236
00:32:13.108 --> 00:32:22.169
was how to compute an unbiased estimator for the unknown variance of a Gaussian population

237
00:32:22.169 --> 00:32:25.709
by taking n samples. So, okay.

238
00:32:27.808 --> 00:32:32.219
The next thing I wanted to show you, um, will be

239
00:32:32.219 --> 00:32:35.429
8.4, 430, I believe.

240
00:32:35.429 --> 00:32:42.058
And it will be some confidence intervals. I've alluded to it. I want to do it in a touch more detail. Um.

241
00:32:55.199 --> 00:33:01.048
Okay, so this is section 8.4.

242
00:33:07.979 --> 00:33:11.699
Okay, so what's happening here? Um.

243
00:33:14.159 --> 00:33:20.788
We have a population, which, say,

244
00:33:20.788 --> 00:33:25.048
Unknown.

245
00:33:28.499 --> 00:33:36.989
Take n observations. So,

246
00:33:36.989 --> 00:33:43.528
so, we compute an estimator

247
00:33:46.798 --> 00:33:50.278
of the mean, and then that would be

248
00:33:56.189 --> 00:34:00.479
Okay, it's the sample mean. So our question is,

249
00:34:01.919 --> 00:34:08.518
um, what does this tell us about

250
00:34:08.518 --> 00:34:20.759
the real mean? Okay, the real mean.

251
00:34:20.759 --> 00:34:25.018
And what we're going to work on is that

252
00:34:25.018 --> 00:34:35.338
the real mean, um, basically,

253
00:34:40.528 --> 00:34:50.639
Well, we're going to find, um, something called c, such that the probability that the real mean

254
00:34:50.639 --> 00:34:55.619
Um.

255
00:35:00.208 --> 00:35:11.998
Let's see, let's say the probability, say, equals 0.95, say, or something. Okay. So we're going to find an interval, a number c, such that

256
00:35:11.998 --> 00:35:19.378
the probability that the real mean is within c of our sample mean is, say, 95, happens 95

257
00:35:19.378 --> 00:35:25.829
out of 100 times. So this will be a confidence interval.
Okay.

258
00:35:29.248 --> 00:35:39.119
And there's, you know, there are several different cases. Um.

259
00:35:44.639 --> 00:35:50.099
Cases, um: 1 is, the population is

260
00:35:52.798 --> 00:35:55.889
And we know

261
00:35:55.889 --> 00:36:06.239
Oh, we don't know the mean, okay, let's say. And maybe we don't know sigma. Maybe the population's not

262
00:36:06.239 --> 00:36:09.929
Okay, but we start with a simple case.

263
00:36:09.929 --> 00:36:18.389
Okay, um, so the way we do this is that

264
00:36:19.469 --> 00:36:24.599
is, uh, 0, 1. This means

265
00:36:24.599 --> 00:36:29.039
Um.

266
00:36:35.309 --> 00:36:46.858
Actually, not 0, 1: unknown mean, and we'll say we know the variance, just make it 1.

267
00:36:46.858 --> 00:36:51.628
Okay, um, so this,

268
00:36:51.628 --> 00:36:57.778
which is the sum of all the x i over n, that's, um,

269
00:36:59.188 --> 00:37:03.179
mean mu, and 1 over square root of n.

270
00:37:04.349 --> 00:37:11.759
Okay, I showed that before.

271
00:37:11.759 --> 00:37:17.789
And so.

272
00:37:20.309 --> 00:37:25.409
Oh, sure. So basically that.

273
00:37:25.409 --> 00:37:31.708
Again, I'm assuming, we're making this the real mean, the variance is 1.

274
00:37:33.389 --> 00:37:39.329
Okay, so basically you just look into your, um,

275
00:37:39.329 --> 00:37:42.688
tables for the Gaussian distribution, um,

276
00:37:42.688 --> 00:37:49.259
this table, you know, for

277
00:37:51.509 --> 00:37:54.898
CDF of the Gaussian and so on.

278
00:37:57.028 --> 00:38:10.829
So, to give an example. Um, so maybe from the table you might get something like the probability that, um,

279
00:38:22.230 --> 00:38:27.449
Well, let me do, I'll work my way up to this slowly, um.

280
00:38:31.559 --> 00:38:34.650
No, no, so.

281
00:38:38.699 --> 00:38:41.730
Let me work my way up to it slowly. Okay. Um.

282
00:38:44.190 --> 00:38:52.619
Let Y, say, be normal 0, 1. Then you look into your table, so the probability that, say,

283
00:38:54.329 --> 00:39:01.889
um, Y is, um, greater than, um,

284
00:39:03.539 --> 00:39:11.730
2 might be something like, um, I forget the number, say,

285
00:39:11.730 --> 00:39:15.389
4 percent or something.

286
00:39:15.389 --> 00:39:29.760
Okay, so the probability that the absolute value of Y is greater than 2 is going to be, say, 8% or something. These are approximate.

287
00:39:30.869 --> 00:39:34.469
Okay, so, um.

288
00:39:35.849 --> 00:39:40.920
So if I go back to find confidence intervals for our mean,

289
00:39:40.920 --> 00:39:47.309
Um, so.

290
00:39:47.309 --> 00:39:51.659
The sample mean, say, less than [unclear].

291
00:39:54.059 --> 00:40:08.280
So, if I want that to be, say, okay, suppose I want it to be

292
00:40:08.280 --> 00:40:11.670
95%, let's say, um.

293
00:40:19.980 --> 00:40:23.340
It'd be like the probability that, um,

294
00:40:32.670 --> 00:40:36.989
Go ahead, and this here

295
00:40:36.989 --> 00:40:43.289
is Gaussian with unknown mean, and the standard deviation, um,

296
00:40:44.429 --> 00:40:52.260
square root of n. So now we can look into the table, so our z value, actually,

297
00:40:53.519 --> 00:41:08.454
and the z value will be, um, it'd be like c square root of n, and we look into a table and we can pull it out. Um.

298
00:41:08.849 --> 00:41:16.170
Okay, okay. And get the answer. Okay.

299
00:41:19.800 --> 00:41:26.519
It takes too much time to work through these things in detail, but okay. Um.

300
00:41:27.989 --> 00:41:33.030
So, that's where I know, I know what this variance was. I just said it was, um, 0.

301
00:41:33.030 --> 00:41:36.389
I said it was 1, I'm sorry. So, case 2

302
00:41:36.389 --> 00:41:45.119
is we don't know the mean or the standard deviation.

303
00:41:45.119 --> 00:41:52.019
So, what do you do in a case like this? Um, so.

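NOTE For case 1 above (Gaussian population, unknown mean, known standard deviation), the table lookups can be reproduced with Python's standard library. A minimal sketch under stated assumptions: sigma = 1 as in the lecture, a sample size of n = 25 that is made up for illustration, and a hypothetical helper named `half_width`. The tail figures in the recording were quoted from memory as approximate; the values computed here come out nearer 2.3% for P(Y > 2) and 4.6% for P(|Y| > 2).

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Tail probabilities of the kind read from a Gaussian table.
p_upper = 1.0 - Z.cdf(2.0)    # P(Y > 2)
p_two_sided = 2.0 * p_upper   # P(|Y| > 2), by symmetry


def half_width(n: int, sigma: float = 1.0, confidence: float = 0.95) -> float:
    """c such that P(|sample mean - true mean| < c) = confidence, with sigma known."""
    z = Z.inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95%
    return z * sigma / math.sqrt(n)


c = half_width(n=25)  # n = 25 is an illustrative sample size, not from the lecture
print(round(p_upper, 4), round(p_two_sided, 4), round(c, 3))
```

This matches the relation in the captions: the z value is c times the square root of n when sigma is 1, so c is the z value divided by the square root of n.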
304
00:41:56.039 --> 00:41:59.820
I'll just give you the executive summary. We use something called

305
00:42:01.199 --> 00:42:05.670
a t-test, and this is from

306
00:42:06.750 --> 00:42:13.469
This is from the same guy I told you about, worked at the Guinness brewery, around 1900, a mathematician called

307
00:42:13.469 --> 00:42:20.639
Um, the thing is that we can, um,

308
00:42:27.809 --> 00:42:32.340
is that our, our sample, what we can do is that, um,

309
00:42:37.530 --> 00:42:41.730
we essentially, we essentially normalize our observations. Um.

310
00:42:41.730 --> 00:42:45.780
You see what I did before, if I go back up to the previous case, um,

311
00:42:55.829 --> 00:43:00.900
I normalized the x i's. I took x minus, um,

312
00:43:02.340 --> 00:43:10.920
something like the observed mean, and that centered it, and then I, I dunno,

313
00:43:10.920 --> 00:43:14.880
divided by the known sigma.

314
00:43:14.880 --> 00:43:20.489
Divided by, you see this thing here.

315
00:43:23.159 --> 00:43:26.610
This is normal 0, 1. I converted it.

316
00:43:26.610 --> 00:43:34.409
I divided out the standard deviation, I shifted it by that estimator for the mean, and I converted it to a variable

317
00:43:34.409 --> 00:43:38.940
which is Gaussian. This is where I knew the sigma.

318
00:43:38.940 --> 00:43:43.739
Now, if I don't know the, um,

319
00:43:45.360 --> 00:43:48.539
But now, I don't know sigma either. Okay.

320
00:43:53.579 --> 00:44:01.650
Either. But still, I standardize, uh, my observations, um.

321
00:44:04.679 --> 00:44:16.889
So I standardized my observations, um,

322
00:44:19.079 --> 00:44:24.929
more details on page 483, and I get something like, um,

323
00:44:27.750 --> 00:44:30.780
my observed mean minus, um,

324
00:44:31.949 --> 00:44:35.880
the unknown true mean, divided by my observed, um,

325
00:44:35.880 --> 00:44:39.510
my estimate for the standard deviation.

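NOTE The standardized quantity being described can be written out as follows. The symbols and the $\sqrt{n}$ factor are the standard textbook form, assumed here since the board is not visible in the captions; $S$ is the square root of the divide-by-$(n-1)$ variance estimator from earlier in the lecture.

```latex
T \;=\; \frac{\bar{X}-\mu}{S/\sqrt{n}},
\qquad
S^2 \;=\; \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i-\bar{X}\right)^2 ,
\qquad
P\!\left(\left|\bar{X}-\mu\right| < t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}}\right) = 1-\alpha .
```

For a Gaussian population, $T$ follows Student's t distribution with $n-1$ degrees of freedom, so the quantile $t_{\alpha/2,\,n-1}$ plays the role that the Gaussian z value played in case 1.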
326
00:44:41.760 --> 00:44:46.440
Now, the thing with this is, this is not, um, this is not

327
00:44:51.210 --> 00:44:54.510
It's, um, Student's t distribution, um,

328
00:44:58.079 --> 00:45:04.019
distribution, but you can look at, there are tables for that.

329
00:45:10.230 --> 00:45:14.219
Or, um, you know, there are built-in functions now.

330
00:45:16.409 --> 00:45:25.380
Okay, and you can look into the tables and you can get probabilities, and so on.

331
00:45:26.429 --> 00:45:40.889
So, you can find confidence intervals. So here is the case, you know, the underlying

332
00:45:40.889 --> 00:45:46.679
population is Gaussian, but you don't know its mean and standard deviation.

333
00:45:46.679 --> 00:45:50.730
So you take a pile of observations and you can

334
00:45:50.730 --> 00:45:57.239
compute an estimator for them, and you can put confidence intervals on the estimator for the mean, for example.

335
00:45:57.239 --> 00:46:00.989
And the thing is, for n greater than or equal to 10,

336
00:46:02.429 --> 00:46:10.170
Okay, [unclear] excellent. 5,

337
00:46:10.170 --> 00:46:15.840
pretty good. So.

338
00:46:16.860 --> 00:46:23.010
Okay, things go to normal right fast. Okay.

339
00:46:23.010 --> 00:46:28.289
So, you can do that. Um, I'm going to give you a high-level thing here, um.

340
00:46:29.550 --> 00:46:34.170
The book on page 434 shows the plot of this, um,

341
00:46:34.170 --> 00:46:42.420
So comparing

342
00:46:44.340 --> 00:46:48.780
and t, page

343
00:46:48.780 --> 00:46:54.000
434. Okay.

344
00:46:54.000 --> 00:47:02.070
So case 1 that I showed you, we did not know the mean, but we figured the distribution was, and

345
00:47:02.070 --> 00:47:07.260
we didn't know the mean, but we knew the standard deviation. Case 2, we didn't know the standard deviation.

346
00:47:07.260 --> 00:47:15.659
Um, case 3.

347
00:47:16.710 --> 00:47:25.230
Uh, I'll throw the page numbers in too: 4

348
00:47:25.230 --> 00:47:33.119
35. We don't know the distribution either.

349
00:47:41.489 --> 00:47:47.130
As we go on, we know less and less. Now, if we don't know the distribution either, then, um,

350
00:47:48.809 --> 00:48:00.239
And well, or we know the distribution, but it's not, let's say,

351
00:48:02.039 --> 00:48:06.269
Um, let me reword this. Um.

352
00:48:10.559 --> 00:48:17.400
A little stronger: it's not

353
00:48:19.739 --> 00:48:24.960
Maybe we know it. Um, oh, what's an example? Um.

354
00:48:25.525 --> 00:48:39.505
Device lifetime. A device, its lifetime is a random variable that's exponentially distributed,

355
00:48:39.505 --> 00:48:41.034
but we don't know the mean.

356
00:48:41.275 --> 00:48:41.994
Um.

357
00:48:42.300 --> 00:48:50.099
So.

358
00:48:52.650 --> 00:48:59.820
A random variable; I don't know the mean.

359
00:49:01.139 --> 00:49:09.150
Or another example is the half-life of a radioactive element. Okay.

360
00:49:12.869 --> 00:49:25.230
Okay, um, or

361
00:49:26.429 --> 00:49:34.590
um, cosmic rays, okay. You know, or, say, defects

362
00:49:36.869 --> 00:49:46.289
on a chip or something. Okay, so the Poisson distribution for numbers, um, exponential distributions for interarrival times.

363
00:49:46.289 --> 00:49:49.710
So, we're observing, um,

364
00:49:50.849 --> 00:49:58.644
we're observing this, um, you know, we've got, you know, some radioactive atoms here.

365
00:49:58.675 --> 00:50:09.414
We observe how long each atom goes before it decays, and we wish to estimate the half-life of the element as a whole by looking at what happens to, say, 100 atoms,

366
00:50:10.530 --> 00:50:13.619
1000 atoms or something, um.

367
00:50:13.619 --> 00:50:21.929
The problem is, an exponential distribution is not, you know, this is exponential, goes like that.

368
00:50:24.000 --> 00:50:34.260
You know, this is normal, goes like that. The exponential is not at all normal, so we can't just use those tactics directly.

369 00:50:34.260 --> 00:50:42.179 However, um, what I've been showing you, actually, with Mathematica at the start of this class, 370 00:50:42.179 --> 00:50:51.929 Is that when you sum up some non-Gaussian random variables, some other distribution, the sum really quickly starts looking Gaussian. 371 00:50:51.929 --> 00:50:57.989 This is the law of large numbers, um, which I just alluded to and haven't proved, but, um. 372 00:50:59.400 --> 00:51:03.480 But so here is a nice, um, nice property. 373 00:51:08.010 --> 00:51:14.760 The sum of random variables, say exponentially distributed or something, 374 00:51:20.849 --> 00:51:25.800 Starts to look Gaussian quickly. 375 00:51:28.050 --> 00:51:34.710 n equals 5, I mean, I showed you the square wave function; you know, add 4 of them and it's looking not bad. 376 00:51:34.710 --> 00:51:42.599 So that, um, this is what we can do, and it's called, um. 377 00:51:48.659 --> 00:51:56.489 Something called a batch mean, um. 378 00:51:56.489 --> 00:52:02.730 So, n equals 200, to copy the example in the book. 379 00:52:02.730 --> 00:52:08.489 Um, um. 380 00:52:11.190 --> 00:52:18.090 10 batches, 20 samples each. 381 00:52:22.260 --> 00:52:30.329 Okay, so the mean of, uh, 20 exponential variables is going to be, you won't be able to tell the difference. So. 382 00:52:33.449 --> 00:52:41.219 The mean of 20 exponential random variables is Gaussian, 383 00:52:43.860 --> 00:52:48.900 Pretty good. So, um, so compute 384 00:52:54.030 --> 00:53:00.630 A confidence interval for the batch mean. 385 00:53:04.980 --> 00:53:11.280 Okay, so this is, we can, um. 386 00:53:17.159 --> 00:53:27.420 And that extends, so this is the way we can work with distributions that are not anywhere close to Gaussian. Exponential, let's say. 387 00:53:28.284 --> 00:53:43.074 Okay, um, the next thing, uh, this is case 3 or something, case 4. 388 00:53:49.349 --> 00:53:54.570 Confidence interval for sigma.
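The batch-means procedure described above (200 observations grouped into 10 batches of 20) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated exponential lifetimes with a made-up true mean of 5.0, the seed is arbitrary, and the constant 2.262 is the standard two-sided 95% Student's t table value for 9 degrees of freedom. The near-Gaussian shape of each batch mean is the central limit effect the lecture appeals to.

```python
import math
import random
import statistics

# Hypothetical setup: 200 simulated exponential lifetimes, true mean 5.0 (an assumption).
TRUE_MEAN = 5.0
N_BATCHES = 10
BATCH_SIZE = 20
T_CRIT_9DF_95 = 2.262  # two-sided 95% t critical value, 9 degrees of freedom (table value)

rng = random.Random(26)  # fixed seed so the run is repeatable
samples = [rng.expovariate(1.0 / TRUE_MEAN) for _ in range(N_BATCHES * BATCH_SIZE)]

# Split the 200 observations into 10 batches of 20 and average each batch.
batch_means = [
    statistics.mean(samples[i * BATCH_SIZE:(i + 1) * BATCH_SIZE])
    for i in range(N_BATCHES)
]

# Treat the 10 batch means as roughly Gaussian and build a t interval for the mean.
grand_mean = statistics.mean(batch_means)
std_err = statistics.stdev(batch_means) / math.sqrt(N_BATCHES)
ci_low = grand_mean - T_CRIT_9DF_95 * std_err
ci_high = grand_mean + T_CRIT_9DF_95 * std_err

print(f"estimate of the mean: {grand_mean:.3f}")
print(f"95% confidence interval: ({ci_low:.3f}, {ci_high:.3f})")
```

Because the batches are the same size, the average of the batch means equals the average of all 200 observations; batching changes only how the interval width is estimated.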
389 00:53:54.570 --> 00:54:01.380 Um, sigma squared, I'd say, and, um. 390 00:54:01.380 --> 00:54:04.440 So, remember our, our estimator 391 00:54:07.920 --> 00:54:16.230 Was, um, which is 1 over n minus 1. 392 00:54:20.429 --> 00:54:24.119 And this, uh, this is the observed mean of our sample. 393 00:54:24.119 --> 00:54:30.360 And we take the sum of the squared differences and divide by n minus 1, not by n, to make it unbiased. 394 00:54:30.360 --> 00:54:37.800 Um, so we would like to say that, um. 395 00:54:39.809 --> 00:54:48.239 We'd like to say that the real, um, we'd like to say, put some, you know. 396 00:54:48.239 --> 00:54:54.719 So, we've computed our estimator for the sample variance, so now we'd like to have a confidence interval around it 397 00:54:54.719 --> 00:55:00.090 For the real variance. Now, the thing is that 398 00:55:00.090 --> 00:55:03.119 It's squared, so it's not Gaussian. Um. 399 00:55:05.760 --> 00:55:10.469 Okay, it's always greater than or equal to 0, for example. Um. 400 00:55:10.469 --> 00:55:16.199 However, it's something called a 401 00:55:16.199 --> 00:55:27.510 Chi-square distribution. Uh, you can go through that, but that's the thing. This is something that's 402 00:55:27.510 --> 00:55:38.550 Um, so the chi-square, it looks something like this or whatever, and you get confidence intervals. This goes off to infinity and this is 0. So. 403 00:55:38.550 --> 00:55:50.369 Okay, so one more thing I'd like to show you is, 404 00:55:52.650 --> 00:55:59.849 So beginning on page, um, I did hypothesis testing. I tried to refresh it with [unclear] videos and so on. 405 00:55:59.849 --> 00:56:07.739 Um, I might do more of it later, but I want to jump ahead just to confuse you a little, um. 406 00:56:10.829 --> 00:56:16.230 Which will be page, which, 460 perhaps, let's say. Where are we going here? 407 00:56:20.250 --> 00:56:31.260 462. Oh, okay. So the next thing is, um. 408 00:56:34.110 --> 00:56:38.789 Section 8.7, page
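As a rough illustration of the confidence interval for sigma squared mentioned above, the sketch below computes the n minus 1 sample variance and then a chi-square interval around it. Everything specific here is an assumption for the example: the ten observations are invented, the population is assumed Gaussian, and the two constants are standard chi-square table values for 9 degrees of freedom (lower and upper 2.5% points).

```python
import statistics

# Hypothetical sample of 10 observations, assumed drawn from a Gaussian population.
data = [4.9, 5.3, 4.7, 5.1, 5.6, 4.8, 5.0, 5.4, 4.6, 5.2]
n = len(data)

# Unbiased sample variance: sum of squared differences from the sample mean,
# divided by n - 1 (statistics.variance uses the n - 1 divisor).
sample_mean = statistics.mean(data)
s2 = statistics.variance(data)
s2_by_hand = sum((x - sample_mean) ** 2 for x in data) / (n - 1)

# Chi-square table values for 9 degrees of freedom (95% two-sided interval).
CHI2_LOWER_9DF = 2.700   # 2.5th percentile
CHI2_UPPER_9DF = 19.023  # 97.5th percentile

# (n - 1) * s2 / sigma^2 follows a chi-square distribution with n - 1 degrees
# of freedom, which gives this interval for the true variance.
var_low = (n - 1) * s2 / CHI2_UPPER_9DF
var_high = (n - 1) * s2 / CHI2_LOWER_9DF

print(f"sample variance: {s2:.4f}")
print(f"95% confidence interval for the variance: ({var_low:.4f}, {var_high:.4f})")
```

The interval is not symmetric around the sample variance, which reflects the lopsided shape of the chi-square density: it starts at 0 and has a long tail to the right.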
409 00:56:48.719 --> 00:56:53.099 Okay, so here, I don't even know the distribution. 410 00:56:53.784 --> 00:56:54.385 So, 411 00:56:54.385 --> 00:56:54.985 um, 412 00:57:08.364 --> 00:57:09.114 maybe. 413 00:57:11.429 --> 00:57:16.170 I think it's some, whatever. Okay. 414 00:57:18.599 --> 00:57:22.679 I think it's, whatever, um, Gaussian, say. 415 00:57:25.590 --> 00:57:33.869 So, um, you know, maybe take, perhaps, n equals, say, 100 observations, whatever. 416 00:57:38.039 --> 00:57:42.239 Um, now you can never prove that it's Gaussian, um. 417 00:57:44.159 --> 00:57:51.780 But we can get a sense, and there's something called a chi-square test that will help here. 418 00:58:00.570 --> 00:58:06.360 Um, it could also be useful, say, to test 419 00:58:14.070 --> 00:58:18.389 If a die is fair. Let me give you this as an example, rather than the one in the book. 420 00:58:19.469 --> 00:58:22.619 Okay, so we, um. 421 00:58:24.449 --> 00:58:29.010 Toss a die, let's say, um, 60 times. 422 00:58:31.679 --> 00:58:36.329 And the number of times, and, um. 423 00:58:38.820 --> 00:58:46.110 This is the, this is the, um. 424 00:58:47.519 --> 00:58:55.019 Face 1, 2, 3, 4, 5, 6, and this is the number of times. 425 00:58:56.639 --> 00:59:00.599 Maybe, I dunno, 15, 8, 426 00:59:00.599 --> 00:59:04.050 12, I dunno, 427 00:59:05.429 --> 00:59:10.920 9. We've got here 23, 35, 428 00:59:10.920 --> 00:59:16.170 44. 429 00:59:16.170 --> 00:59:19.829 10, 6. Okay. 430 00:59:21.269 --> 00:59:27.960 So does that die look fair? Okay. Um, but here is the thing: the expected number of times, 431 00:59:30.630 --> 00:59:34.650 If it's a fair die, is 10, 10, 10, 10, 10, 10. 432 00:59:35.789 --> 00:59:39.329 But we saw 15, 8, 12, 9, 10, 6. 433 00:59:39.329 --> 00:59:44.519 Is that fair? Um, it added up to 60. Good. Oh, okay. 434 00:59:45.960 --> 00:59:50.280 So there is a chi-square test, um. 435 00:59:50.280 --> 00:59:55.349 To do this, I'll give you the test. I won't prove it. Um. 436 00:59:57.510 --> 01:00:00.599 So, basically.
437 01:00:05.909 --> 01:00:11.130 The 6 possible outcomes. 438 01:00:13.650 --> 01:00:19.409 Um, and using the same terminology as the book. 439 01:00:22.559 --> 01:00:34.559 Um, she doesn't have a good terminology. Um. 440 01:00:42.570 --> 01:00:52.409 Sure, here goes. Okay. Um, K is the number of outcomes, and 441 01:00:52.409 --> 01:01:00.780 60 observations. All right. Just to be clear. 442 01:01:07.980 --> 01:01:11.880 Okay, um, so basically. 443 01:01:14.880 --> 01:01:22.199 N sub i is the number of times, um, 444 01:01:23.639 --> 01:01:30.630 We saw i, and m sub i is the expected number of times, 445 01:01:30.630 --> 01:01:36.329 The expected number. Okay, so in this particular case, it was always 10. 446 01:01:36.329 --> 01:01:41.429 That's the expected, and what we actually saw is 15, 8, 12, 9, 10, 6. 447 01:01:45.869 --> 01:01:54.210 Okay, so what we do is, we compute something called D squared, and that's going to be the sum over all the i of 448 01:01:54.210 --> 01:01:58.949 Um, N sub i minus m sub i, squared, divided by 449 01:02:01.110 --> 01:02:06.329 m sub i. Okay. 450 01:02:09.719 --> 01:02:12.719 And this is the chi-squared, 451 01:02:12.719 --> 01:02:17.969 Chi-squared distribution, 452 01:02:19.199 --> 01:02:28.260 Um, with K minus 1 degrees of freedom, and we can look into a table to see here. Um. 453 01:02:28.260 --> 01:02:34.829 So, basically, if the distribution was perfect, which of course is improbable in itself, D squared would be 0. 454 01:02:34.829 --> 01:02:41.940 And as the thing is more and more off center it gets larger, and so. 455 01:02:44.400 --> 01:02:52.260 Um, and this works better, um, 456 01:02:57.030 --> 01:03:05.699 If all the m sub i are greater than 5 or something. 457 01:03:05.699 --> 01:03:12.300 And as everything gets big enough, D squared starts to look Gaussian, like everything else does. So okay. 458 01:03:12.300 --> 01:03:16.920 So, um.
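The fair-die test just described can be worked through with the lecture's own counts (15, 8, 12, 9, 10, 6 over 60 tosses). The sketch below is a minimal illustration, not the textbook's code: the variable names are chosen to echo the notation in the lecture, and the critical value 11.07 is the standard chi-square table entry for 5 degrees of freedom at the 5% significance level, which is an assumed choice of level since the lecture does not name one.

```python
# Observed counts for faces 1..6, as given in the lecture.
observed = [15, 8, 12, 9, 10, 6]

n = sum(observed)          # total number of tosses
K = len(observed)          # number of possible outcomes
expected = [n / K] * K     # a fair die gives n / K expected tosses per face

# D^2 = sum over i of (N_i - m_i)^2 / m_i
d_squared = sum((N_i - m_i) ** 2 / m_i for N_i, m_i in zip(observed, expected))

degrees_of_freedom = K - 1
CHI2_CRIT_5DF_05 = 11.07   # table value: chi-square, 5 degrees of freedom, 5% level

# Reject the "die is fair" hypothesis only if D^2 exceeds the critical value.
reject_fair = d_squared > CHI2_CRIT_5DF_05

print(f"n = {n}, K = {K}, degrees of freedom = {degrees_of_freedom}")
print(f"D squared = {d_squared:.2f}")
print("reject fairness at the 5% level" if reject_fair else "no evidence against fairness at the 5% level")
```

With these counts the statistic is well below the table value, so at this level the data give no grounds to call the die unfair. Every expected count is 10, which satisfies the rule of thumb that the expected counts should exceed about 5.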
459 01:03:16.920 --> 01:03:23.550 Might even give you an example next week or something, but in any case, so this is a way you can then 460 01:03:23.550 --> 01:03:33.239 Test. So the distribution doesn't have to be Gaussian. Here I'm testing the hypothesis, when I'm tossing this six-sided die, 461 01:03:33.239 --> 01:03:36.360 That the distribution is uniform. So. 462 01:03:37.619 --> 01:03:48.599 But here, we test the hypothesis, um, that the distribution is uniform. 463 01:03:52.199 --> 01:03:55.289 But you can test others, and this is the way it's also used. 464 01:03:56.969 --> 01:04:01.710 It can be used for drug tests and so on. All you need is your expected number of 465 01:04:01.710 --> 01:04:05.579 Observations in each category and, um, 466 01:04:05.579 --> 01:04:09.900 What you actually observed. Now, you get to pick the categories and so on. 467 01:04:09.900 --> 01:04:16.860 So, if you had a malicious frame of mind, you could keep trying different category groupings until you 468 01:04:16.860 --> 01:04:21.780 Compute what you want to prove, but, um, that's another course. 469 01:04:21.780 --> 01:04:25.230 That'd be a political science course, not a math course. 470 01:04:25.230 --> 01:04:29.519 I'm just being unfair to poli sci people, but okay. Um. 471 01:04:32.340 --> 01:04:36.030 So that's possibly a reasonable point to stop. Um. 472 01:04:36.030 --> 01:04:40.110 I can review what I was doing 473 01:04:40.110 --> 01:04:53.789 For you. Um, first, I, um, I reviewed Mathematica. It's just an enrichment part of the course, but I think it's an important tool that can make your life easier. 474 01:04:53.789 --> 01:04:59.760 If I had my druthers, I would work Mathematica into a lot of courses in engineering, actually. 475 01:04:59.760 --> 01:05:07.199 It's just an enrichment. So the second thing is I came back to statistical paradoxes, 476 01:05:07.199 --> 01:05:13.139 Where the math computes something that is very surprising.
477 01:05:13.139 --> 01:05:20.969 And I gave you the example two weeks ago with the fictional admissions statistics. 478 01:05:20.969 --> 01:05:28.500 If you go through the Wikipedia article, it refers to an actual case at UC Berkeley where this actually happened. 479 01:05:28.500 --> 01:05:35.099 I'm guessing it may happen a lot, but this one got involved in politics or something. Um. 480 01:05:36.570 --> 01:05:43.889 I gave you examples like that, and then I got into some more parts of chapter 8 of the textbook. 481 01:05:43.889 --> 01:05:52.050 I showed a computation of the estimator for the variance of a population, 482 01:05:52.050 --> 01:05:59.460 Where we assume it's Gaussian, let's say. Here you don't know the mean and you don't know the variance either. 483 01:05:59.460 --> 01:06:02.760 You draw 100 observations. 484 01:06:02.760 --> 01:06:09.869 And I showed that the obvious estimator for the variance is biased slightly, 485 01:06:09.869 --> 01:06:15.510 And you have to change an n to an n minus 1 to make this estimator unbiased. Unbiased means 486 01:06:15.510 --> 01:06:21.869 That the mean of the estimated variable actually matches the real sigma, 487 01:06:21.869 --> 01:06:24.929 Sigma squared itself. 488 01:06:24.929 --> 01:06:29.909 Um, and then I showed a little work on confidence intervals, 489 01:06:29.909 --> 01:06:33.809 And getting into some 490 01:06:35.039 --> 01:06:41.190 Different cases there, and also got into some cases here where 491 01:06:41.190 --> 01:06:44.579 We'll start to legitimately see 492 01:06:44.579 --> 01:06:48.420 Distributions that are not Gaussian. 493 01:06:48.420 --> 01:06:52.679 And they do pop up. So, 494 01:06:52.679 --> 01:06:57.119 Like, if you don't know the mean or the variance, 495 01:06:57.119 --> 01:07:01.440 And n is small, then 496 01:07:01.440 --> 01:07:05.400 These things, like your computed mean, 497 01:07:05.400 --> 01:07:12.449 Will not be normally distributed; it'll have a Student's t distribution.
498 01:07:12.449 --> 01:07:19.199 But once n gets to be more than 5 or 10, it looks Gaussian. And if we start looking at distributions 499 01:07:19.199 --> 01:07:25.110 For the estimator for the variance and so on, things might start looking like a chi-square distribution, 500 01:07:25.110 --> 01:07:32.130 Which is, um, the chi-squared distribution you get from the sum of squares of Gaussians, actually. 501 01:07:32.130 --> 01:07:37.920 Um, and then so I showed you doing confidence intervals and so on: you 502 01:07:37.920 --> 01:07:51.179 Don't know the mean but you know the variance; you don't know the mean and you don't know the variance; and then you don't know the distribution, or you know the distribution but it's not Gaussian. It's nothing like Gaussian, like exponential, really non-Gaussian. 503 01:07:51.179 --> 01:07:56.250 Well, then you can appeal to your law of large numbers. 504 01:07:56.250 --> 01:08:00.269 And, say, you have 200 observations, to borrow numbers from the book. 505 01:08:00.269 --> 01:08:04.260 You batch these 200 observations into 10 groups of 20. 506 01:08:04.260 --> 01:08:09.389 Now, you take the sum of 20 exponential random variables. It looks Gaussian. 507 01:08:09.389 --> 01:08:13.889 You can do that in Mathematica next time or something, and, um, 508 01:08:13.889 --> 01:08:17.880 So, that's how you can handle non- 509 01:08:17.880 --> 01:08:22.020 Gaussian distributions. It works better if you've got more observations. 510 01:08:22.020 --> 01:08:31.470 The next thing is, you don't know the distribution; you think maybe it's, whatever, Gaussian or uniform or something. 511 01:08:31.470 --> 01:08:37.140 The chi-square test: you take some observations and the chi-square test will give you 512 01:08:37.140 --> 01:08:42.930 A probability that you saw something at least as uneven as you saw if your hypothesis is 513 01:08:42.930 --> 01:08:46.020 True. That's hypothesis testing.
514 01:08:46.020 --> 01:08:50.850 An example I used was the test of a fair die. 515 01:08:50.850 --> 01:08:55.890 Um, the times you ought to get each of 1 through 6: 516 01:08:55.890 --> 01:09:00.659 10 times. Um, well, it'll never be precise. 517 01:09:00.659 --> 01:09:06.600 But, you know, you get something somewhat off even, and then you can do a chi-square test on it 518 01:09:06.600 --> 01:09:10.229 And get a probability that you would see something, at least, 519 01:09:10.229 --> 01:09:16.649 Under the hypothesis, of something at least that far from what you expected. 520 01:09:16.649 --> 01:09:21.659 The probability of seeing precisely those numbers is almost 0. 521 01:09:21.659 --> 01:09:26.729 But you can say: seeing these numbers, or numbers that are worse, i.e., farther off from expected. 522 01:09:26.729 --> 01:09:29.939 The chi-square test will do that. So. 523 01:09:29.939 --> 01:09:39.239 Numbers could be too good, also. Um, people looked at, you know, Mendel, who was doing, um, 524 01:09:39.239 --> 01:09:45.630 You know, genetics and so on, and he said he observed various peas or whatever, and 525 01:09:45.630 --> 01:09:51.569 Percentages of the time he saw different things. People looked at his published numbers and decided they're actually too good, that 526 01:09:51.569 --> 01:09:55.590 He was faking it, because they really should vary more, but. 527 01:09:55.590 --> 01:09:58.949 Okay, that's enough for today. 528 01:09:58.949 --> 01:10:04.619 Thursday, watch videos. I put them on the blog. I may tie them to Thursday. 529 01:10:04.619 --> 01:10:08.489 I may record a little something also, I don't know, but I've got some good videos for you to look at. 530 01:10:08.489 --> 01:10:20.189 And the videos cover a range. I've got less technical ones if you're feeling sort of blah, and I've got more technical ones if you're feeling that I'm going too slowly for you. 531 01:10:20.189 --> 01:10:23.909 You pick which you want. So.
532 01:10:23.909 --> 01:10:30.329 The more technical ones are from MIT. So, okay, so have a good week. 533 01:10:30.329 --> 01:10:35.399 I'll stay here for a minute or 2 if there are questions. Other than that, I'll see you next Monday. 534 01:10:37.170 --> 01:10:47.460 Hello. 535 01:10:47.460 --> 01:10:50.520 Hello.