WEBVTT 1 00:00:02.968 --> 00:00:09.569 Hey, theory is this, this is recording, um. 2 00:00:18.059 --> 00:00:23.670 I think for all the share. 3 00:00:23.670 --> 00:00:33.719 Sure, I'm sorry, you got to turn off the filter. 4 00:00:33.719 --> 00:00:38.039 So, you filter on your background, but. 5 00:00:38.039 --> 00:00:41.909 So, we won't be able to see it because recognize your face. Yeah. 6 00:00:41.909 --> 00:00:50.399 Where is remind me where is to filter on that? I do not know. The problem is that all these different platforms for Webex meet. 7 00:00:50.399 --> 00:00:55.469 Have the filter in different places um. 8 00:00:55.469 --> 00:01:02.789 Virtual background. 9 00:01:02.789 --> 00:01:06.450 Okay. 10 00:01:14.099 --> 00:01:19.109 Yeah, you turn it off your job. 11 00:01:19.109 --> 00:01:23.159 I need to just pointed out the thing and, uh. 12 00:01:23.159 --> 00:01:27.689 Wonderful, it's it needs to focus on just to it. 13 00:01:27.689 --> 00:01:34.859 I just did the pregnancy. Yeah, that's there. It probably could be better. If you. 14 00:01:36.359 --> 00:01:43.200 If you move it closer by taking the SharePoint, I couldn't hear you 1, maybe take this chair and move it closer. 15 00:01:43.200 --> 00:01:47.189 Okay here we go. All right. 16 00:01:48.780 --> 00:01:53.790 Okay. 17 00:01:58.500 --> 00:02:05.909 Okay, oh. 18 00:02:05.909 --> 00:02:09.780 Um, various points in here. 19 00:02:12.300 --> 00:02:15.900 And I'm also teaching some operating systems so, um. 20 00:02:22.289 --> 00:02:27.599 Okay, cool. Sort of things. 21 00:02:27.599 --> 00:02:34.949 Is the command um, there's a sort of. 22 00:02:34.949 --> 00:02:37.949 That are set this is Webex. 23 00:02:37.949 --> 00:02:42.569 And some of them are 2 small stack size there. 24 00:02:42.569 --> 00:02:48.449 You could. 25 00:02:52.469 --> 00:02:56.849 Oh. 26 00:02:56.849 --> 00:03:05.069 Now, the interesting thing about the Linux operating system with virtual memory. 
27 00:03:05.069 --> 00:03:10.229 It does not allocate a page until you touch the page, 28 00:03:10.229 --> 00:03:15.870 until you attempt to read or write a page of the virtual memory, because 29 00:03:15.870 --> 00:03:19.710 initially it's just reserved in virtual address space. 30 00:03:19.710 --> 00:03:24.030 So there's no problem in allocating a big stack. 31 00:03:24.030 --> 00:03:29.069 And here's the cool thing that people don't realize. 32 00:03:30.330 --> 00:03:33.990 If you have several threads running together, 33 00:03:33.990 --> 00:03:41.520 each of them can have its own very large stack. So I might have 20 threads and each has a gigabyte. 34 00:03:41.520 --> 00:03:44.580 And you might wonder 35 00:03:44.580 --> 00:03:51.330 how you do it because, you know, you don't want to be reserving a gigabyte each for 25; that's 25 gigabytes. 36 00:03:51.330 --> 00:03:58.439 But here's the point: these are reservations in virtual address space, and until you 37 00:03:58.439 --> 00:04:01.650 actually touch the pages, like I said, 38 00:04:01.650 --> 00:04:04.650 it's okay, it's not using any resources, 39 00:04:04.650 --> 00:04:10.800 except some entries in the lookaside buffer. And as a thread 40 00:04:10.800 --> 00:04:16.019 pushes stuff onto the stack, it just starts grabbing pages of virtual memory. So 41 00:04:16.019 --> 00:04:22.110 my point is that in a parallel process, each thread 42 00:04:22.110 --> 00:04:27.809 can have a private stack that's very big. 43 00:04:27.809 --> 00:04:31.528 And if you take an operating systems course, 44 00:04:31.528 --> 00:04:38.369 the operating systems texts say that in memory you could have 1 stack going up from the bottom and another stack going down from the top. 45 00:04:38.369 --> 00:04:41.848 That's 2 stacks. How do you have 3 stacks when you don't know how big they're going to be?
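The point about reserving address space without using memory can be tried directly. The following is only a sketch, assuming Linux and the `mmap` flags `MAP_ANONYMOUS` and `MAP_NORESERVE`; the helper names `reserve_region` and `touch_pages` are made up for illustration and are not from the lecture. It reserves a large region the way a big thread stack would be reserved, and physical pages are only committed for the few pages that get touched.

```cpp
#include <sys/mman.h>   // mmap, munmap (the flags used below are Linux-specific)
#include <cstddef>

// Reserve `bytes` of virtual address space without committing physical pages.
// Returns nullptr on failure. (Hypothetical helper, for illustration only.)
char* reserve_region(std::size_t bytes) {
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? nullptr : static_cast<char*>(p);
}

// Write one byte into each of the first `pages` pages of the region.
// Only these pages need real memory; the rest stays a reservation.
std::size_t touch_pages(char* region, std::size_t pages, std::size_t page_size) {
    std::size_t touched = 0;
    for (std::size_t i = 0; i < pages; ++i) {
        region[i * page_size] = 1;
        ++touched;
    }
    return touched;
}
```

For example, calling `reserve_region(1ull << 30)` asks for a gigabyte of address space, and `touch_pages(region, 4, 4096)` then makes the kernel back just four pages. Release the region with `munmap(region, 1ull << 30)` when done.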
46 00:04:41.848 --> 00:04:46.439 But that's that's an awesome idea. Now it was virtual memory. So. 47 00:04:46.439 --> 00:04:51.178 There's no penalty to setting the stack very big. 48 00:04:52.288 --> 00:04:57.269 And I mentioned that there and there is a variable, um. 49 00:04:58.949 --> 00:05:03.749 And I figured out what it does anything yet I wanted it to, um. 50 00:05:06.358 --> 00:05:12.149 With, um. 51 00:05:15.298 --> 00:05:23.759 More introduction country we use, um. 52 00:05:23.759 --> 00:05:29.819 And open M. E. as I've said, it's a living standard they've added. 53 00:05:29.819 --> 00:05:36.778 Connection to somewhat. I haven't been able to get it worked all that. Well. 54 00:05:36.778 --> 00:05:40.619 But I'll give you some of the ideas and. 55 00:05:42.269 --> 00:05:45.778 Okay, just a quick call. 56 00:05:45.778 --> 00:05:54.209 A quick review of some stuff level detail I haven't done before. So you've got the 1 friend and lots of threads and then it joins again. 57 00:05:55.348 --> 00:06:01.738 History, um, now find a way this tutorial. 58 00:06:01.738 --> 00:06:06.478 Is coming from overage who goes to synergy level? Of course. 59 00:06:08.038 --> 00:06:12.449 There, um, oh, um. 60 00:06:14.098 --> 00:06:19.858 Yeah, nothing here. 61 00:06:22.379 --> 00:06:27.028 There, um. 62 00:06:31.079 --> 00:06:38.369 1 thing here that I was giving you just an introduction to open up either. A lot about the clause is. 63 00:06:38.369 --> 00:06:44.699 I haven't talked to told you about that. They had all the data gets transferred into the thread and to start. 64 00:06:44.699 --> 00:06:48.689 Try to put out is it private? So, at that. 65 00:06:48.689 --> 00:06:53.428 The default works. Well, so I haven't found that. 66 00:06:54.449 --> 00:07:00.238 True. 67 00:07:02.459 --> 00:07:05.848 Okay, so. 68 00:07:07.769 --> 00:07:11.908 This is working its way up to. 69 00:07:12.928 --> 00:07:16.619 Parallelism on the GPU, the the 1st slide. 
70 00:07:16.619 --> 00:07:20.069 1, going down, uh. 71 00:07:21.149 --> 00:07:26.608 And the thing with the GPU, getting into more detail later, 72 00:07:26.608 --> 00:07:30.209 is it's a hierarchy of parallel threads. 73 00:07:30.209 --> 00:07:34.978 With NVIDIA, for example, we have, um, 74 00:07:34.978 --> 00:07:38.249 32 threads form a warp. 75 00:07:38.249 --> 00:07:44.038 And it's single instruction, multiple thread processing. 76 00:07:44.038 --> 00:07:48.418 Each thread executes the same instruction or is idle. 77 00:07:48.418 --> 00:07:52.108 But they have separate private registers and 78 00:07:52.108 --> 00:07:58.139 can operate on separate data; because they have separate registers, they could be pointing at different data. So 79 00:07:58.139 --> 00:08:01.259 the 1st level is that 32 threads form a warp. 80 00:08:01.259 --> 00:08:07.288 But then at the next level of the hierarchy, you might have 32 warps 81 00:08:07.288 --> 00:08:13.168 forming a, um. The 32 threads in a warp, they're locked together, synchronous. 82 00:08:13.168 --> 00:08:19.199 The 32 warps could form a thread block. 83 00:08:19.199 --> 00:08:24.418 And the warps in a thread block 84 00:08:24.418 --> 00:08:29.668 can be scheduled separately, and they can run at different times. 85 00:08:29.668 --> 00:08:38.489 So within the thread block, they're running the same instructions, not at the same time necessarily. 86 00:08:38.489 --> 00:08:42.869 And then your process, called a parallel kernel, 87 00:08:42.869 --> 00:08:50.519 has a number of thread blocks, which are independent of each other; they don't communicate except through the global memory. 88 00:08:50.519 --> 00:08:53.908 But in any case, 89 00:08:55.438 --> 00:08:59.548 so there's a hierarchy of parallelism on the GPU. 90 00:08:59.548 --> 00:09:04.619 And what this is showing now. 91 00:09:05.818 --> 00:09:14.668 With OpenMP they, I guess as a policy matter, did not want to use NVIDIA-specific terminology,
92 00:09:14.668 --> 00:09:20.999 because the hope is that OpenMP is hardware agnostic. It could work on NVIDIA, it could work on, 93 00:09:20.999 --> 00:09:26.548 you know, other parallel hardware; of course, it could be multicore. 94 00:09:26.548 --> 00:09:31.198 So they chose terminology 95 00:09:31.198 --> 00:09:34.198 which is different from what NVIDIA uses. 96 00:09:34.198 --> 00:09:37.349 That's good because they want to be, you know, 97 00:09:37.349 --> 00:09:43.318 agnostic, but it's bad because it makes it difficult to understand 98 00:09:43.318 --> 00:09:46.859 what they're doing. So 99 00:09:46.859 --> 00:09:50.339 you look at something like this, and they talk about 100 00:09:50.339 --> 00:09:55.469 teams, um, and a team is like a, uh, 101 00:09:55.469 --> 00:10:02.759 thread block, I guess, and they're very vague about the mapping from their terminology to NVIDIA's. 102 00:10:02.759 --> 00:10:08.219 But in any case, if you're not careful 103 00:10:09.418 --> 00:10:15.328 doing something in parallel, you'll be doing the same thing on several separate threads, and it's redundant. So 104 00:10:15.328 --> 00:10:22.408 what they're talking about here, um, what you want to do is, the thing about the, 105 00:10:22.408 --> 00:10:27.119 right, you have a pragma target, and it's targeted to the device, 106 00:10:27.119 --> 00:10:30.899 and distributed across teams, which are several 107 00:10:30.899 --> 00:10:34.619 warps or thread blocks. They're a little vague about that. 108 00:10:34.619 --> 00:10:43.229 And distribute, and parallel, we start the whole thing off in parallel, and for, meaning we're going to have an associated for loop right after that. 109 00:10:43.229 --> 00:10:46.739 How many teams? So this is when we run it on the. 110 00:10:48.869 --> 00:10:52.649 We run it on the, um, so. 111 00:10:53.668 --> 00:10:57.778 You could also do target simd; it's more thinking like.
112 00:10:59.609 --> 00:11:05.068 Plus, it's part of it because gary's big parallel things here. Um. 113 00:11:05.068 --> 00:11:12.778 Do do several devices, but in any case, that's a good thing there for. 114 00:11:16.828 --> 00:11:29.038 Just wanted to spend a minute or 2 on that to show you. 115 00:11:31.109 --> 00:11:43.109 Hey, just woke up. 116 00:11:48.208 --> 00:11:55.889 Okay, um, and see, we read is more detailed quality. Now, the next thing I mentioned it quickly before. 117 00:11:55.889 --> 00:12:03.298 The unified shared memory previously there was a lot of efforts spent on. 118 00:12:03.298 --> 00:12:08.369 Education of a parallel computers about moving stuff back and forth. 119 00:12:08.369 --> 00:12:14.668 Between the host that we say, the, and the device that will be, for example, the. 120 00:12:14.668 --> 00:12:18.568 Now, with the unified shared memory, is that. 121 00:12:18.568 --> 00:12:22.979 They're all with the same address face of the page you have to back and forth. 122 00:12:22.979 --> 00:12:27.028 Automatically so so there is the problem of. 123 00:12:27.028 --> 00:12:30.599 You start with the data on the host, you upload it. 124 00:12:30.599 --> 00:12:34.229 The device that shows 2 devices here uh. 125 00:12:34.229 --> 00:12:37.229 Well, that happens vaguely automatically. Now. 126 00:12:39.149 --> 00:12:47.099 Um, oh, the relevance is that these slides are named people using the. 127 00:12:47.099 --> 00:12:51.418 Using like, you know, 1 of the couple of tasks, the supercomputers. 128 00:12:51.418 --> 00:12:54.418 So these are the tools that they would use, um. 129 00:12:55.589 --> 00:12:59.308 A mapping is an alpha lesson clause. That's. 130 00:12:59.308 --> 00:13:02.698 Uh, totally absolutely, but how you move the data back and forth. 131 00:13:02.698 --> 00:13:07.019 Um. 132 00:13:08.698 --> 00:13:13.438 Um, up to critical. 133 00:13:16.828 --> 00:13:21.269 It says, okay, um. 
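As a rough illustration of the target, teams and map ideas discussed above, here is a minimal sketch. The function name `saxpy_offload` and the choice of a scaled vector add are assumptions for illustration, not the slide's example. The combined directive asks OpenMP to offload the loop, spread it over teams, and run each team's share in parallel; the `map` clauses say which arrays are copied to the device and which are copied back. If the compiler has no offload support, or OpenMP is disabled, the loop simply runs on the host.

```cpp
#include <vector>

// Compute y[i] = a * x[i] + y[i] for all i, offloading the loop if a device is available.
// (Hypothetical example function; the lecture's slide code is not in the transcript.)
void saxpy_offload(float a, const std::vector<float>& x, std::vector<float>& y) {
    const int n = static_cast<int>(x.size());
    const float* xp = x.data();   // raw pointers so the map clauses can name array sections
    float* yp = y.data();

    // map(to:)     copies x to the device before the loop.
    // map(tofrom:) copies y to the device and back to the host afterwards.
    #pragma omp target teams distribute parallel for map(to: xp[0:n]) map(tofrom: yp[0:n])
    for (int i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

With unified shared memory, as described above, the explicit `map` clauses matter less because pages migrate between host and device on demand; they are kept here because they are harmless and make the data movement visible.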
134 00:13:22.619 --> 00:13:31.168 For that sure we, are we a single address space over. 135 00:13:31.168 --> 00:13:39.269 Everything and well, that's a quick. 136 00:13:41.068 --> 00:13:46.918 Okay, now, um. 137 00:13:46.918 --> 00:13:50.609 Oh, you some other programs and show you. 138 00:14:03.594 --> 00:14:53.844 Okay. 139 00:15:06.928 --> 00:15:10.558 Um. 140 00:15:13.528 --> 00:15:18.208 Hello. 141 00:15:26.788 --> 00:15:36.599 Okay. 142 00:15:36.599 --> 00:15:45.839 Okay, so I want to show you some other programs that it showed you quickly. 143 00:15:45.839 --> 00:15:51.599 Um, some of the issues. 144 00:15:51.599 --> 00:15:56.759 With. 145 00:15:58.408 --> 00:16:08.698 So, we have here the. 146 00:16:12.028 --> 00:16:16.408 We have the loop, we're just adding the numbers 0 6 5. 147 00:16:18.119 --> 00:16:25.979 Hello. 148 00:16:28.078 --> 00:16:31.619 By the way how many people are familiar with the may come back. 149 00:16:32.729 --> 00:16:35.969 Um, so okay, I notice. 150 00:16:35.969 --> 00:16:40.349 See, you make it harder. 151 00:16:40.349 --> 00:16:44.519 Okay, what may is. 152 00:16:44.519 --> 00:16:49.678 It's a system to compile groups of files. 153 00:16:49.678 --> 00:16:54.778 And if I say make soft, that's the name, the money's our name. 154 00:16:54.778 --> 00:17:02.999 And what makes does, is it just going to compile a source file that we see could be for graphics? 155 00:17:02.999 --> 00:17:07.138 Could be C plus plus and you say it makes some. 156 00:17:07.138 --> 00:17:13.378 And what it does is, it goes through its searches for the possible source files. 157 00:17:13.378 --> 00:17:17.128 So, for example, I'm kind of summed up in DC. 158 00:17:17.128 --> 00:17:23.818 So, it will do it from the 1st compiler on some doc CC to produce. So. 159 00:17:23.818 --> 00:17:27.179 Now, it does this only if. 160 00:17:27.179 --> 00:17:31.288 Any existing executable evolve, so. 161 00:17:31.288 --> 00:17:36.058 So, it's really quite nice. 
So they said they are some. 162 00:17:37.199 --> 00:17:43.558 Some files that I say, it makes a 2nd time it doesn't do that. 163 00:17:43.558 --> 00:17:48.808 Give a large group of programs that only recompiled what it has to. 164 00:17:48.808 --> 00:17:53.219 Next point is, it has. 165 00:17:55.078 --> 00:17:59.338 It has a make file which gives options and rules. 166 00:17:59.338 --> 00:18:05.608 So, is that what we have here it says is variable. 167 00:18:05.608 --> 00:18:11.939 And the cap CSS is the name of the compiler will use. 168 00:18:11.939 --> 00:18:16.259 And it says various other things that will. 169 00:18:16.259 --> 00:18:20.999 5% colon common. 170 00:18:20.999 --> 00:18:26.548 Says every executable depends on the common files and update the file name. Um. 171 00:18:26.548 --> 00:18:34.499 It will recompile everything and then so there's a general rule for C. plus plus programs map called 4. G. 172 00:18:34.499 --> 00:18:39.148 For that I give a specific role that will be different for the general role. 173 00:18:39.148 --> 00:18:45.358 So, it's extremely powerful and you can use it on large system. Does. 174 00:18:45.358 --> 00:18:48.388 They make as a level above. 175 00:18:48.388 --> 00:18:51.419 Well, see, Meg will, um. 176 00:18:51.419 --> 00:18:54.959 January to make files and then resume. 177 00:18:56.128 --> 00:18:59.249 Oh, that's more than I need it. Okay. So. 178 00:19:00.749 --> 00:19:05.219 So, we do something like this and. 179 00:19:08.519 --> 00:19:11.638 Christian. 180 00:19:25.259 --> 00:19:33.689 Okay, now, what I did was I completed the correct number, but the computer number was wrong. 181 00:19:33.689 --> 00:19:37.169 Because of this problem, um. 182 00:19:38.578 --> 00:19:45.269 Of the requalify, right? So just remember the fact that a computer number is followed with the correct number. 183 00:19:45.269 --> 00:19:56.009 So the issue was that the parallel block, you would have several threads. 
184 00:19:56.009 --> 00:20:00.118 You see, each thread has to read the value of computed, 185 00:20:00.118 --> 00:20:06.298 add i to it, and write it back. But meanwhile, the 2nd thread may read the value of computed. 186 00:20:06.298 --> 00:20:12.479 And so the 2 of them are adding to the same value of computed and writing it back. So i does not always get added to it. 187 00:20:12.479 --> 00:20:20.909 With threads, you have the, uh, the worst of problems. Yeah. So this illustrates why we have to use atomics and 188 00:20:20.909 --> 00:20:27.449 so on. We get the wrong answer with this little thing I like to start with. Okay. 189 00:20:27.449 --> 00:20:35.068 What's going on up here? I hear ya. 190 00:20:36.538 --> 00:20:41.398 So, he was a chart no. 191 00:20:41.398 --> 00:20:49.078 But this is the cost, the fees are just pretty, pretty printing. 192 00:20:49.078 --> 00:20:52.528 What, pretty printing? 193 00:20:52.528 --> 00:20:56.909 Oh, it means adding stuff to the program to make it look 194 00:20:56.909 --> 00:21:00.808 nicer and be more readable. 195 00:21:00.808 --> 00:21:05.398 Sorry, the numbers exactly. 196 00:21:05.398 --> 00:21:08.459 And so in the case 197 00:21:08.459 --> 00:21:13.318 where you have a group of digits, if you add a cost, things like that, they don't mean anything. 198 00:21:13.318 --> 00:21:17.669 It's just to make it readable, C plus plus. 199 00:21:17.669 --> 00:21:22.828 Okay, so we get the wrong answer with this, um. 200 00:21:24.209 --> 00:21:27.689 Then, so how can we handle this? 201 00:21:27.689 --> 00:21:31.078 Well, I have some examples here. 202 00:21:31.078 --> 00:21:42.209 Something atomic. 203 00:21:42.209 --> 00:21:47.009 It puts an atomic pragma in there 204 00:21:47.009 --> 00:21:51.509 that says that the following instruction 205 00:21:51.509 --> 00:21:57.209 will be done atomically, so it cannot get interrupted by another process or another thread. 206 00:21:57.209 --> 00:22:01.528 So, the 1st, let me see.
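The race and its atomic fix can be sketched as follows. This is a reconstruction under assumptions, since the program on screen is not in the transcript: the variable name `computed` follows the lecture, while the function names `sum_racy`, `sum_atomic` and `sum_expected` are invented here. Build with an OpenMP flag such as `-fopenmp` to get threads; without it the pragmas are ignored and both loops run serially.

```cpp
// Racy version: every thread does an unprotected read-modify-write on `computed`,
// so with several threads some additions can be lost. Shown only for contrast.
long long sum_racy(long long n) {
    long long computed = 0;
    #pragma omp parallel for
    for (long long i = 0; i < n; ++i) {
        computed += i;   // data race when run with more than one thread
    }
    return computed;
}

// Atomic version: the update below cannot be interleaved with another thread's update.
long long sum_atomic(long long n) {
    long long computed = 0;
    #pragma omp parallel for
    for (long long i = 0; i < n; ++i) {
        #pragma omp atomic
        computed += i;
    }
    return computed;
}

// Closed form for 0 + 1 + ... + (n-1), used to check the computed value.
long long sum_expected(long long n) {
    return n * (n - 1) / 2;
}
```

Comparing `sum_atomic(n)` with `sum_expected(n)` should always agree. Comparing `sum_racy(n)` with it on a multicore machine may not, and the wrong value can differ from run to run.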
207 00:22:01.528 --> 00:22:05.098 Too long to execute that thing. Um. 208 00:22:06.838 --> 00:22:11.638 Is 13 seconds. Oh, let me show you 1 more thing here. 209 00:22:13.558 --> 00:22:21.388 When we ran last brand, so it's the 13 seconds real, but it took 20,643 seconds. 210 00:22:21.388 --> 00:22:26.578 Because it's the dual 440 on that 656 hybrid threads. 211 00:22:26.578 --> 00:22:38.159 And, um, they were saying that times 56 sort of equals that for now. Well, look at this percent CPU usage. 4800%. Okay. 212 00:22:38.159 --> 00:22:41.699 So, effectively when this was running in parallel. 213 00:22:41.699 --> 00:22:45.148 It was he was 848 friends full. 214 00:22:45.148 --> 00:22:51.328 So so this is the point when you're. 215 00:22:51.328 --> 00:22:55.588 Measuring performance of firewall computers you look at your. 216 00:22:55.588 --> 00:22:59.308 Ballpark time in the CPU time at all. 217 00:23:01.169 --> 00:23:07.348 Oh, okay. Um, let me make it smaller to show you. Oh, oh. 218 00:23:07.348 --> 00:23:12.568 I showed you this, um. 219 00:23:12.568 --> 00:23:16.019 Another thing here. 220 00:23:17.814 --> 00:23:18.534 It down. 221 00:23:36.358 --> 00:23:40.679 Hey, and. 222 00:23:53.159 --> 00:23:57.028 So really. 223 00:23:57.028 --> 00:24:02.249 Here. 224 00:24:03.269 --> 00:24:07.409 Okay, uh, 1st time I read the program here. 225 00:24:07.409 --> 00:24:14.878 It had, um, it was riding with 56 threads and took. 226 00:24:14.878 --> 00:24:18.598 1, and a half 2nd and 76 user seconds. 227 00:24:18.598 --> 00:24:23.009 Down here, now, here, I'm using something in Linux where I've been. 228 00:24:23.009 --> 00:24:26.788 Specified environment variable and then get the executable. 229 00:24:26.788 --> 00:24:32.429 And I've modified this environment variable only for that executable not permanently in the shell. 230 00:24:32.429 --> 00:24:38.368 That's the number of threads that we'll use. So he's the thread is for. 
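The comparison of real time against CPU time made above can also be measured from inside a program. The sketch below is an assumption-laden illustration, not the tool used in the lecture (which appears to be the shell's timing output): `Timing` and `time_it` are hypothetical names, and it relies on `std::clock` reporting processor time for the whole process, which on Linux includes all of its threads.

```cpp
#include <chrono>
#include <ctime>

// Wall-clock and CPU seconds for one piece of work. (Hypothetical helper type.)
struct Timing {
    double wall_seconds;
    double cpu_seconds;
};

// Run `work` once and report both elapsed real time and CPU time consumed.
template <typename F>
Timing time_it(F&& work) {
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    work();

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    Timing t;
    t.wall_seconds = std::chrono::duration<double>(wall_end - wall_start).count();
    t.cpu_seconds = static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    return t;
}
```

Dividing `cpu_seconds` by `wall_seconds` gives roughly the number of cores kept busy, which is the same information as the CPU usage percentage mentioned above.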
231 00:24:38.368 --> 00:24:41.608 Um, it was 56. 232 00:24:41.608 --> 00:24:44.878 Process there is available, we're using for threats. 233 00:24:44.878 --> 00:24:51.179 And crazy thing is it took less, um, real time. 234 00:24:51.179 --> 00:24:55.469 With fewer threads to blast real time. 235 00:24:55.469 --> 00:25:00.298 Watch list CPU time so sometimes greater problem. It doesn't help you. 236 00:25:00.298 --> 00:25:04.348 Oh, the answer was still wrong with them as wrong. 237 00:25:17.638 --> 00:25:23.219 And use the atomic thing, it takes more time, but the answers, correct? 238 00:25:23.219 --> 00:25:27.598 How are you. 239 00:25:29.548 --> 00:25:33.959 Hello. 240 00:25:33.959 --> 00:25:37.288 Nobody know it's all through that. 241 00:25:37.288 --> 00:25:42.689 Increasing the value that you get for some of the almost all the. 242 00:25:42.689 --> 00:25:46.078 Always lower from stronger. Yes yes. 243 00:25:46.078 --> 00:25:52.048 It's just always it's always slower. Well, yeah, because with more parallel. 244 00:25:52.048 --> 00:25:55.949 You've got this error this, uh. 245 00:25:55.949 --> 00:26:00.058 Error is getting worse, right. Okay. So. 246 00:26:01.409 --> 00:26:05.548 Right. It's okay. I see. 247 00:26:05.548 --> 00:26:11.759 So the reason I gave you, that is the townhome, this is the problem. Did. 248 00:26:11.759 --> 00:26:15.689 Anybody to use the atomic, it gives you the right answer. 249 00:26:15.689 --> 00:26:19.888 But the performance that if I go back up to here. 250 00:26:19.888 --> 00:26:25.648 Yeah. 251 00:26:26.909 --> 00:26:33.148 Well, Here's what's good. So, the policy that actually went in parallel nice it's very little, real time and the answer's correct. 252 00:26:33.148 --> 00:26:36.239 Well, the other cool thing about errors. 253 00:26:36.239 --> 00:26:49.409 We go back to solve every time you brought it. It's wrong, but every time you run it, you get a different wrong answer. 254 00:26:49.409 --> 00:26:57.328 Yes, yeah. 
255 00:26:57.328 --> 00:27:00.659 You can have more threads in your program 256 00:27:00.659 --> 00:27:04.229 than you have processors or hyperthreads in the hardware. 257 00:27:04.229 --> 00:27:12.628 And this might be useful because if a thread is blocked, waiting for some resource, depending on the queue, right? 258 00:27:12.628 --> 00:27:16.798 Um, and so you could have. 259 00:27:16.798 --> 00:27:21.929 Normally, if you could have a feel for what we could try that. 260 00:27:35.213 --> 00:27:35.963 More time. 261 00:27:37.108 --> 00:27:45.868 If we do atomic, it was faster to answer, so. 262 00:27:46.979 --> 00:27:55.528 Another way. So the thing with atomic is that the following statement cannot be interrupted by another thread. 263 00:27:55.528 --> 00:27:59.848 But the following statement is very limited in what it can be. It's like 264 00:27:59.848 --> 00:28:03.209 a global variable plus-equals 265 00:28:03.209 --> 00:28:09.598 this local variable or something. It's a very limited thing. 266 00:28:09.598 --> 00:28:14.519 Now. 267 00:28:17.009 --> 00:28:24.778 There's a more general thing called the critical pragma; with critical, the following statement or block 268 00:28:24.778 --> 00:28:31.229 cannot be interrupted. It's done by 1 thread at a time, but 269 00:28:31.229 --> 00:28:34.858 there's a bigger overhead, but it's, um, 270 00:28:37.709 --> 00:28:43.019 overhead, but it's more general, so it's a different way to interlock. So. 271 00:28:43.019 --> 00:28:48.898 And then, of course, the real way to do it is, um. 272 00:28:52.378 --> 00:28:56.368 Okay, so we're like here. 273 00:28:56.368 --> 00:29:00.568 It's like the original 1, we're just adding i in, so you got it. 274 00:29:00.568 --> 00:29:04.469 But the pragma has got this new thing up here, reduction. 275 00:29:04.469 --> 00:29:09.778 And this is a common programming paradigm
276 00:29:09.778 --> 00:29:13.409 in parallel, where we're accumulating a sum. 277 00:29:13.409 --> 00:29:17.189 Okay, so all these threads, you have to accumulate the sum. 278 00:29:17.189 --> 00:29:20.909 And the efficient way to do this is to build a tree 279 00:29:20.909 --> 00:29:26.848 or something, or what you can do, the efficient way, is each thread has a separate private subtotal. 280 00:29:26.848 --> 00:29:30.628 And all of the iterations handled by thread 0 281 00:29:30.628 --> 00:29:40.648 are summed into the private total for thread 0. All the iterations handled by thread 1 are summed into a private total for thread 1, and so on, and they can all be done in parallel. No conflicts. 282 00:29:40.648 --> 00:29:46.499 Then, finally, at the end, you add the 56 private subtotals into 1 grand total. 283 00:29:46.499 --> 00:29:52.499 And this way is both parallelizable and works. You don't get these 284 00:29:52.499 --> 00:29:58.138 cache coherency problems. So if you say this, 285 00:29:58.138 --> 00:30:02.939 what this means is everything I just described is done by the compiler. 286 00:30:02.939 --> 00:30:07.378 So it says that computed is a variable 287 00:30:07.378 --> 00:30:11.999 and it's being reduced, and the plus operator is being used. 288 00:30:11.999 --> 00:30:15.328 So computed is being reduced with the plus operator. So 289 00:30:15.328 --> 00:30:22.709 all the stuff I showed you about using atomic and so on, in this particular case, isn't necessary; just make it a reduction variable. 290 00:30:22.709 --> 00:30:25.769 So, it runs. 291 00:30:25.769 --> 00:30:34.828 Now, before I run it and show you just how fast, I haven't really talked about global and local variables. The default is that, 292 00:30:34.828 --> 00:30:38.098 okay, so we have the pragma omp parallel, and 293 00:30:38.098 --> 00:30:44.638 it applies to the following statement. It has to be 1 statement, but if I put in braces I can have a big block, arbitrarily big.
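The private-subtotal scheme described above, and the reduction clause that automates it, can be sketched side by side. Again this is a reconstruction under assumptions: `computed` is the lecture's variable name, while `sum_critical` and `sum_reduction` are names chosen here. The first function spells out by hand what the lecture says the compiler does for a reduction, using a critical section only once per thread for the final add.

```cpp
// Manual version: each thread accumulates a private subtotal, then adds it
// to the shared total once, inside a critical section.
long long sum_critical(long long n) {
    long long computed = 0;                 // declared outside the block: shared by default
    #pragma omp parallel
    {
        long long subtotal = 0;             // declared inside the block: private to each thread
        #pragma omp for
        for (long long i = 0; i < n; ++i) {
            subtotal += i;                  // no conflict: only this thread touches subtotal
        }
        #pragma omp critical
        computed += subtotal;               // one thread at a time adds its subtotal
    }
    return computed;
}

// Reduction version: the clause asks OpenMP to create the private copies
// and combine them with + at the end.
long long sum_reduction(long long n) {
    long long computed = 0;
    #pragma omp parallel for reduction(+:computed)
    for (long long i = 0; i < n; ++i) {
        computed += i;
    }
    return computed;
}
```

Both functions return the same total as a serial loop. The manual version also shows the default sharing rules in action: the variable declared before the parallel block is shared, and the one declared inside it is private.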
294 00:30:44.638 --> 00:30:50.999 But the thing is this: you've got the global variables up here, like computed, 295 00:30:50.999 --> 00:30:59.878 and they just pull through into the block, and then you get the concurrency issue on computed if it gets written by 2 threads. 296 00:30:59.878 --> 00:31:04.769 But the global variables by default are visible in the parallel block. 297 00:31:04.769 --> 00:31:12.538 And the local things, like i in here, by default are private to the thread. 298 00:31:12.538 --> 00:31:16.348 So the defaults make sense. 299 00:31:16.348 --> 00:31:20.459 We don't have to get explicit about what's private and what's shared. 300 00:31:27.864 --> 00:31:40.554 It gets the right answer and it's fast, so. 301 00:31:41.159 --> 00:31:46.919 Hey, um. 302 00:31:52.078 --> 00:31:55.378 Here. 303 00:31:55.378 --> 00:32:01.979 Oh, another cool little thing of no intellectual significance, but I like little things. 304 00:32:03.568 --> 00:32:07.199 What this does here 305 00:32:07.199 --> 00:32:13.828 says that when I print out the numbers, it will add commas to them automatically. 306 00:32:14.878 --> 00:32:18.659 See, the sent to see how this. 307 00:32:18.659 --> 00:32:27.778 Has lots of digits, we'll get commas put in. Anything set to share. The are the thoughts that pages full of comments for data. So you see output analog. 308 00:32:27.778 --> 00:32:31.199 Putting a foster phase on the input. 309 00:32:31.199 --> 00:32:34.528 Cool. 310 00:32:34.528 --> 00:32:39.838 All right, um, stuff here. 311 00:32:39.838 --> 00:32:46.318 During the last time hello um. 312 00:32:51.118 --> 00:32:55.769 So, are you looking at this for the hallmark. 313 00:32:56.999 --> 00:33:03.328 So the tag with the California connect with those. 314 00:33:03.328 --> 00:33:06.868 Cool. Little things are. 315 00:33:06.868 --> 00:33:14.219 Basically, the examples they have on the tutorials are much bigger than this icon. 316 00:33:15.449 --> 00:33:20.278 Oh, oh, oh, oh.
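The transcript does not show the line that makes printed numbers come out with separators, so the following is a guess at the general technique rather than the lecturer's code. In C++ the grouping of digits on output is controlled by the stream's locale; this sketch installs a small custom `numpunct` facet (the names `CommaGrouping` and `with_commas` are hypothetical) so the result does not depend on which locales are installed on the machine.

```cpp
#include <locale>
#include <sstream>
#include <string>

// A number-punctuation facet that groups digits in threes, separated by commas.
// (Hypothetical name; a system locale could be imbued instead.)
struct CommaGrouping : std::numpunct<char> {
protected:
    char do_thousands_sep() const override { return ','; }
    std::string do_grouping() const override { return "\3"; }  // groups of 3 digits
};

// Format an integer with thousands separators.
std::string with_commas(long long value) {
    std::ostringstream out;
    // The locale takes ownership of the facet and deletes it when no longer used.
    out.imbue(std::locale(std::locale::classic(), new CommaGrouping));
    out << value;
    return out.str();
}
```

The same `imbue` call works on `std::cout`, after which every integer printed through it is grouped. The separators are purely cosmetic and do not change the value.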
317 00:33:20.278 --> 00:33:28.769 Hello? Hello? Hello? 318 00:33:29.818 --> 00:33:37.828 Okay. 319 00:33:41.249 --> 00:33:49.618 Oh, hi. I don't know. Cool. Little thing is, who said that? Sure. I got to find a macro print. Are. 320 00:33:49.618 --> 00:33:52.798 And if you call for dark, I'm using my programs. 321 00:33:52.798 --> 00:33:57.148 It prints the name of the arguments and evaluate that affects the value. 322 00:33:57.148 --> 00:34:00.328 Based on the fact that it's CC plus plus. 323 00:34:00.328 --> 00:34:04.409 Inside of pound are is the string that's the name of the argument. 324 00:34:05.878 --> 00:34:09.148 Okay, so more stuff here. 325 00:34:12.929 --> 00:34:17.099 So I mentioned before you can fire off, uh. 326 00:34:17.099 --> 00:34:24.179 Parallel you can fire off parallel tasks and here's an example of doing it. 327 00:34:24.179 --> 00:34:29.039 Actually, using the recursive path that. 328 00:34:29.039 --> 00:34:34.378 So, you know, your program, you got a chapter class about this, because it's the 1st time. 329 00:34:34.378 --> 00:34:38.398 And so, what's happening here you want to base case. 330 00:34:38.398 --> 00:34:49.139 And it's less than 2, mid pars like 5 say, we do the normal recursive call here for small values event. Okay. 331 00:34:49.139 --> 00:34:55.289 Um, otherwise if Ben is big. 332 00:34:55.289 --> 00:35:04.978 What's happening down here is I'm doing this recursive thing, but I'm calling these 2 things with new parallel tasks. 333 00:35:04.978 --> 00:35:08.668 So, uh. 334 00:35:08.668 --> 00:35:13.739 So otherwise, and again, to show you what the far is fine I think. 335 00:35:13.739 --> 00:35:18.418 Uh, 4,828. 336 00:35:19.588 --> 00:35:26.878 Oh, okay. So so, uh. 337 00:35:28.139 --> 00:35:31.349 So else again is big enough. We fire off. 338 00:35:31.349 --> 00:35:35.369 Parallel task of, uh. 339 00:35:35.369 --> 00:35:40.168 And minus 1, and we're keeping counted number task list because their calendars. 
340 00:35:40.168 --> 00:35:45.298 Has to be a topic and we run a separate parallel task. 341 00:35:45.298 --> 00:35:50.668 Up here, so inside the house, we've got this product, but. 342 00:35:50.668 --> 00:35:54.358 Yeah, that's true. Then minus 1 and also. 343 00:35:54.358 --> 00:36:03.059 Minus 2 in parallel and then we wait for those 2 tasks, supposed to finish and they're firing off parallel paths themselves. 344 00:36:03.059 --> 00:36:06.688 Yes. 345 00:36:08.009 --> 00:36:13.228 It's like all the threads and their values after. 346 00:36:23.998 --> 00:36:28.318 There are ways you can look at the number of threads, running and waiting and so on. 347 00:36:29.489 --> 00:36:34.438 We, you know, the way I can start to understand them saying the file. 348 00:36:34.438 --> 00:36:38.639 That's 1 reason I created the print backdrop. 349 00:36:38.639 --> 00:36:43.708 Okay, let me get serious about why I don't like. 350 00:36:43.708 --> 00:36:47.969 Is it's. 351 00:36:47.969 --> 00:36:53.338 Understand everything like, I cannot say, printed array of the stuff in a sensible way. 352 00:36:53.338 --> 00:36:57.028 If I write those print things, I can say if it's an array for it's the 1st. 353 00:36:57.028 --> 00:37:02.369 And all of us or something, I suppose you could do a CDP, but I haven't figured out how. 354 00:37:03.478 --> 00:37:07.798 So, uh, I'll give you this. 355 00:37:07.798 --> 00:37:12.929 Windows has better visual tool. 356 00:37:12.929 --> 00:37:20.248 So, for the, we decided to go put a little boxes with syntax trigger on top of the. 357 00:37:20.248 --> 00:37:24.719 Okay, how does this. 358 00:37:24.719 --> 00:37:28.708 So, what is the simultaneously. 359 00:37:28.708 --> 00:37:31.889 Like, we're like, 20 years behind the scenes of. 360 00:37:31.889 --> 00:37:35.909 Say that again wise with. 361 00:37:35.909 --> 00:37:39.898 So, we're using that just like, it was a more of a. 362 00:37:41.878 --> 00:37:52.438 So, I missed a couple of words why is what? 
So, you know, it's more of a qualitative comment, not necessarily thing, but for the cases, but it just feels like it's like we're working with, like, video. 363 00:37:52.438 --> 00:37:57.719 See, what bothers and stuff like that. It's like I'm working on working with you. That's already a few 1000, right? 364 00:37:57.719 --> 00:38:04.559 So, you're stuck with debugging with printing. Okay. And he's okay. Here's the serious reason. Is this. 365 00:38:04.559 --> 00:38:10.349 Is objections, I guess, versus call, but as a free software foundation. 366 00:38:10.349 --> 00:38:19.230 And let me tell you why a good B*** has to have the C plus plus parser embedded in. 367 00:38:19.230 --> 00:38:29.550 And because you want the debugging statements to fully understand C. plus plus plus the 1 point is even a look ahead context sensitive grammar. 368 00:38:29.550 --> 00:38:34.289 It's so concise that there's things that you cannot even give the. 369 00:38:34.289 --> 00:38:39.449 Mexico class of a token in all cases does not understand these demands. 370 00:38:39.449 --> 00:38:44.130 I forget this example where is horrible in that sense. 371 00:38:44.130 --> 00:38:50.070 So, you need the full parser, but you have to scrap the parser from. 372 00:38:50.070 --> 00:38:53.969 Plus plus and plug it in without making. 373 00:38:53.969 --> 00:38:59.849 Larger to B*** public, which may be good, but some of the people didn't larger departments don't want to do that. 374 00:38:59.849 --> 00:39:05.969 So, they cannot leverage on g. plus plus in their larger debugger. 375 00:39:05.969 --> 00:39:09.960 Which I think was a more of a case for modifications. 376 00:39:09.960 --> 00:39:13.380 Motivation sorry for the project actually. 377 00:39:14.969 --> 00:39:19.170 The short answer is it's a big thing that. 378 00:39:19.170 --> 00:39:23.489 So, what the video did is, they bought the compile or. 379 00:39:23.489 --> 00:39:28.170 So, then we say, used to be whatever. 
380 00:39:28.170 --> 00:39:37.170 So, they so they're putting money into it um, and they have some debugging tools. I may show later I can figure them all out for myself, but they'll show. 381 00:39:37.170 --> 00:39:41.309 Thread occupancy. 382 00:39:43.230 --> 00:39:48.570 Well, the thing is also he'd like to discuss the program with okay without changing the behavior. 383 00:39:48.570 --> 00:39:52.260 And already that's a problem with single threading. 384 00:39:52.260 --> 00:40:04.500 May have a problem where you write revenue, right off the entrepreneur, right? If you call the 2nd call or something. So now you compile it with the B*** right? Besides that Margaret does not set fault. 385 00:40:04.500 --> 00:40:07.559 Because the double spaced out the erase. 386 00:40:07.559 --> 00:40:12.900 And so already with single threading. 387 00:40:12.900 --> 00:40:19.980 The, um, debugging gets hard the accuracy logging changes the systems. 388 00:40:19.980 --> 00:40:25.469 Behavior, so the thing is like, Bell Brian to solve is interpret everything. 389 00:40:25.469 --> 00:40:28.920 I can let me give you a factor of performance. 390 00:40:28.920 --> 00:40:34.289 We're saying, because they just turn everything into the interpreter, but doesn't have to interpret it perfectly. 391 00:40:35.429 --> 00:40:41.070 There's in theorist debugging testing, which is not used as much as it should be. 392 00:40:41.070 --> 00:40:47.250 Intel has hardware registers. Intel has a ways to go in and the role level and use the hardware. 393 00:40:47.250 --> 00:40:52.260 Piece of hardware registered for debugging your programs. I think some departments might use it. 394 00:40:52.260 --> 00:40:57.840 To the trouble also you want to get a performance so. 395 00:40:57.840 --> 00:41:04.619 How do you measure performance of a little routine that's called a 1Billion times because your counter it'll take more time than the routine. It. 
396 00:41:04.619 --> 00:41:09.539 And then, of course, with C++, it compiles the routine inline. 397 00:41:09.539 --> 00:41:14.039 And optimizes the whole thing, so your routine doesn't exist anymore. How do you measure it? 398 00:41:15.449 --> 00:41:19.260 So, I mean, what I mean is, you might have units for English. 399 00:41:19.260 --> 00:41:24.179 I have an automatic conversion. 400 00:41:25.409 --> 00:41:32.610 And so it's like a one-line routine. Okay. Um, you know, centimeters equals number of inches times 2.54. 401 00:41:32.610 --> 00:41:40.920 And it's a one-line function. Of course, it's inserted inline and optimized. The function, which might be called millions of times, doesn't exist anymore. So. 402 00:41:40.920 --> 00:41:43.949 How do you measure its performance? 403 00:41:45.809 --> 00:41:54.659 So it's a hard problem they haven't solved yet. Okay. Well, one more thing: there are some commercial tools that I looked at. I think all of, you. 404 00:41:54.659 --> 00:41:58.440 Which I even thought of buying, but. 405 00:41:58.440 --> 00:42:02.309 It costs money and it's, um. 406 00:42:02.309 --> 00:42:06.179 There's the learning curve. So there might be an answer there. 407 00:42:06.179 --> 00:42:10.860 If you're weren't going to figure it out project, consider something like oh, yeah. 408 00:42:10.860 --> 00:42:16.949 Which claims they handle some of this. How they do it, I don't know. 409 00:42:16.949 --> 00:42:22.469 Daily basis fire, so parallel processes and then wait for the end. 410 00:42:22.469 --> 00:42:26.280 Parallel processes, and they're firing up more and so on. So. 411 00:42:28.019 --> 00:42:33.269 And so with that, um. 412 00:42:39.719 --> 00:42:50.519 Okay. 413 00:42:53.039 --> 00:43:04.440 So what I did here is I had it print out each thread number as it started, and all of the prints together and. 414 00:43:04.440 --> 00:43:13.949 Uh, add the Tuesday.
415 00:43:13.949 --> 00:43:17.429 5,400 of the CPU 56. 416 00:43:17.429 --> 00:43:20.610 So, in case, that's one thing. 417 00:43:20.610 --> 00:43:24.179 And if you wanted to have fun, you could change the, uh. 418 00:43:39.000 --> 00:43:43.440 Save by the way it's missing for us. 419 00:43:43.440 --> 00:43:48.179 All right, please stay files for us. Um. 420 00:43:48.179 --> 00:43:52.530 Uh, I'm using the I just as far as Pepsi Pepsi. Oh, okay. 421 00:43:52.530 --> 00:43:55.860 Okay, so that is different than. 422 00:43:55.860 --> 00:44:00.389 It's different for each thing how we, uh, oh, yeah they're all good. All the. 423 00:44:00.389 --> 00:44:13.889 Something like standards for you this correct? Because I probably starting off too many parallel for. 424 00:44:13.889 --> 00:44:19.079 Okay, so there's that, uh, let me show you, um. 425 00:44:31.650 --> 00:44:36.989 I wanted to show you a demo, a file that has a counter in it, but I couldn't remember what the file was. 426 00:44:36.989 --> 00:44:41.250 I grepped the source files of the directory for the ones that contain the string counter. 427 00:44:43.559 --> 00:44:48.389 Okay, okay. 428 00:44:49.679 --> 00:44:53.190 So, what's happening here? This is a common thing. 429 00:44:55.860 --> 00:44:59.250 You're generating some answers in parallel. 430 00:44:59.250 --> 00:45:03.599 And you want to store them in one big global destination array. 431 00:45:03.599 --> 00:45:09.719 But each parallel thread is generating an unknown number of answers. 432 00:45:09.719 --> 00:45:16.289 What you want to do: when a thread has an answer, it wants to allocate an element from the global array. 433 00:45:16.289 --> 00:45:19.349 And store its answer there. 434 00:45:19.349 --> 00:45:22.619 So, it needs a way to allocate. 435 00:45:22.619 --> 00:45:27.389 Like a malloc or something, in parallel, really. And so. 436 00:45:28.500 --> 00:45:35.670 We need a way for the, basically, a unique counter. So what this works out to.
So so what this works out to. 437 00:45:35.670 --> 00:45:42.059 We have that we have a counter that we're implementing and what we want to do any thread. 438 00:45:42.059 --> 00:45:45.539 Is incremented and store the value. 439 00:45:45.539 --> 00:45:55.230 Okay, and what we want is so we want industry theory indices to get different values of calendar. 440 00:45:55.230 --> 00:45:58.590 Okay, so if we have a case here. 441 00:45:58.590 --> 00:46:06.150 For, and it's 30, what we want is that each element of indices gets a different counter from several 29. 442 00:46:06.150 --> 00:46:15.840 Okay um, but the thing is, everything is running in parallel, so we're going to have this cash guarantee issue. 443 00:46:15.840 --> 00:46:20.880 And so we want so I have the comment there. Each. 444 00:46:20.880 --> 00:46:26.849 A unique counter, so illustrate your task here's the problem. And also. 445 00:46:26.849 --> 00:46:34.440 And also some solutions so, um, and so that looks like in parallel and. 446 00:46:34.440 --> 00:46:39.780 If you have a time and capture, so that will solve it. It will ever walk. 447 00:46:42.150 --> 00:46:46.289 Okay, um, is this printed out, like. 448 00:46:46.289 --> 00:46:51.510 So, does it like like. 449 00:46:51.510 --> 00:46:56.369 Is there like, okay, so here is the. 450 00:46:56.369 --> 00:47:00.269 I tried running it. I think I just got some jumping mess. Yeah. 451 00:47:00.269 --> 00:47:04.650 All right. Okay. I'll let you do that. Yeah. 452 00:47:04.650 --> 00:47:07.710 Well, here's another cool thing before. 453 00:47:07.710 --> 00:47:11.730 Plus plus the 17 or something, um. 454 00:47:15.210 --> 00:47:19.860 What is that thing in this? For a loop indices is an array. 455 00:47:19.860 --> 00:47:24.840 And this for loop is going over all the elements of the array and. 456 00:47:25.980 --> 00:47:32.519 Signing invite her, so I don't need to index with you right. Explicitly says is, is that. 
457 00:47:32.519 --> 00:47:35.969 Grab the elements of the array one at a time into the variable. 458 00:47:37.260 --> 00:47:46.349 The type of the variable is auto, meaning do the sensible thing. It's a beautiful concept, right, in recent versions of C++. 459 00:47:46.349 --> 00:47:52.590 If you have, like, a derived variable, you declare it with type auto. 460 00:47:52.590 --> 00:47:58.829 Well, I guess that does the obvious thing. 461 00:48:07.679 --> 00:48:11.039 Right. So. 462 00:48:14.610 --> 00:48:22.650 So, what's happening? The goal is we hoped for unique counters, but looking at it, 40 and 3 got duplicated and so on, which is. 463 00:48:22.650 --> 00:48:25.980 So this is not. 464 00:48:25.980 --> 00:48:29.130 Uh, for me to man. 465 00:48:29.130 --> 00:48:32.130 Basically, the file was using this product. 466 00:48:32.130 --> 00:48:37.980 So, dot slash, yeah, I think. 467 00:48:37.980 --> 00:48:41.849 The dot slash is just to make it explicit that I'm running it from. 468 00:48:41.849 --> 00:48:46.650 The current directory, because if it's a common name. 469 00:48:46.650 --> 00:48:51.719 Then, depending on my PATH environment variable, it may try to run another program. 470 00:48:53.280 --> 00:48:57.300 Uh, what's happening here? Uh, this is another aside on operating systems. 471 00:48:58.590 --> 00:49:06.719 Suppose you type something like this, foo, so it's going to run an executable file named foo. Well, where does it look for it? 472 00:49:06.719 --> 00:49:13.500 The answer is in this list of directories here. 473 00:49:13.500 --> 00:49:17.190 So, it will go down them in order, looking for. 474 00:49:17.190 --> 00:49:21.030 In this case, uh, the executable called foo. 475 00:49:21.030 --> 00:49:27.960 And dot is the current directory. For some historical reason, I put it there. I should probably put it at the front. 476 00:49:27.960 --> 00:49:34.889 But if it should find an executable called atomic in any one of these directories first, it will run that.
477 00:49:34.889 --> 00:49:39.900 That's why I say dot slash. Okay, so. 478 00:49:40.920 --> 00:49:46.590 Hello. 479 00:49:47.880 --> 00:49:51.960 So every time we run it, it's different. 480 00:49:51.960 --> 00:49:56.039 But do it with the atomic, then. 481 00:49:56.039 --> 00:50:04.889 Every index is listed once, without a duplicate problem. 482 00:50:06.630 --> 00:50:12.389 Now, you wonder, why did I, oh, it was in the program again. 483 00:50:20.099 --> 00:50:24.389 I have, um. 484 00:50:25.949 --> 00:50:32.070 And indices is a simple array. You wonder why I don't make it a vector. 485 00:50:32.070 --> 00:50:37.199 The answer is that sure. 486 00:50:40.110 --> 00:50:43.800 That was a factor that would be a competitor. 487 00:50:43.800 --> 00:50:48.090 It actually doesn't work in practice. 488 00:50:48.090 --> 00:50:51.420 Software. 489 00:50:51.420 --> 00:50:54.659 I'll tell you what I did there. 490 00:51:01.860 --> 00:51:04.889 A cron job. 491 00:51:04.889 --> 00:51:08.639 Every so often it goes through my home directory and deletes. 492 00:51:08.639 --> 00:51:15.840 Oh, yeah. 493 00:51:16.860 --> 00:51:22.619 Here's what I did: vector of int, uh. 494 00:51:25.170 --> 00:51:30.570 No, I couldn't hear you. What did the. 495 00:51:30.570 --> 00:51:33.960 Yeah, no, with the right. 496 00:51:39.869 --> 00:51:44.039 Hello. 497 00:51:44.039 --> 00:51:48.539 Okay. 498 00:51:48.539 --> 00:51:53.760 Wait, you know, whenever I write a program, half of my time. 499 00:51:53.760 --> 00:52:00.510 Is trying to figure out how to take what I know conceptually I want to do and spell it in a way acceptable to the compiler. Ah. 500 00:52:01.530 --> 00:52:06.690 That one of them, so, and. 501 00:52:06.690 --> 00:52:14.550 How do I know that the capture doesn't do vectors? Because I was writing this program over the weekend and it crashed. Yes. 502 00:52:14.550 --> 00:52:22.019 Why is the sort there? Oh, that's because if you look at the list of numbers, it's too hard to tell what's duplicated.
503 00:52:22.019 --> 00:52:27.389 As far as tickets, that would be easier for you to it. 504 00:52:38.909 --> 00:52:42.090 If you see that they're sorted, you could easily see. 505 00:52:43.110 --> 00:52:49.739 Uh, so when I was creating a whole, so you can really see that to help. You see uh, yeah. 506 00:52:49.739 --> 00:52:54.780 In any case, yeah, so I couldn't do it. 507 00:52:54.780 --> 00:52:59.340 The capture on the vector, maybe because the vector's on the heap or something. So I. 508 00:52:59.340 --> 00:53:05.969 Yeah, the atomic pragma is extremely limited in what it can do. 509 00:53:05.969 --> 00:53:11.159 Hey, are you here. 510 00:53:11.159 --> 00:53:15.840 Uh, uh. 511 00:53:15.840 --> 00:53:18.840 I mentioned the. 512 00:53:20.039 --> 00:53:32.369 Certainly show it to you. By the way, I'm just compiling stuff in the. 513 00:53:32.369 --> 00:53:36.630 Directory that's visible. You've got a copy somewhere else, so. 514 00:53:38.190 --> 00:53:44.760 Okay, I'd like to have fun. Oh, delta clock time is something I wrote that. 515 00:53:44.760 --> 00:53:49.739 Does the obvious thing. Um, so. 516 00:53:54.360 --> 00:53:59.820 Okay. 517 00:53:59.820 --> 00:54:04.469 It's like a case thing that. 518 00:54:04.469 --> 00:54:11.010 Sort of, uh, we say sections and then. 519 00:54:11.010 --> 00:54:14.429 We can say section, section, so. 520 00:54:14.429 --> 00:54:18.329 The sections inside the sections block run in parallel. 521 00:54:18.329 --> 00:54:24.000 So, here in parallel for summing up the agencies that they see that qualify. 522 00:54:24.000 --> 00:54:27.119 This block and this block. 523 00:54:28.949 --> 00:54:32.940 Yes oh, okay. 524 00:54:32.940 --> 00:54:37.139 Okay, so, um, okay, so. 525 00:54:38.610 --> 00:54:46.170 The broader lesson from this stuff, um, is you can see the idea of tools for parallel programming. 526 00:54:46.170 --> 00:54:55.050 A competitor to OpenMP would have different names for stuff, but it would have the same ideas.
So static parallelism, dynamic Carlos among the task. 527 00:54:55.050 --> 00:54:58.289 The problems with capital here. 528 00:54:58.289 --> 00:55:03.269 Which could be worse when you have more parallelism. 529 00:55:03.269 --> 00:55:08.130 And the other obvious thing is the reason you don't make happiness care at all the time is. 530 00:55:08.130 --> 00:55:12.420 So, okay, so that. 531 00:55:20.905 --> 00:55:31.673 Happens most of the stuff I wanted to show you. 532 00:55:32.099 --> 00:55:38.369 Should you want to learn more? There's a lot of tutorial information online. The latest. 533 00:55:38.369 --> 00:55:42.840 Open MP 5 points who has some new stuff that I haven't figured out yet? 534 00:55:42.840 --> 00:55:45.960 I have to learn this stuff before I can talk about it. 535 00:55:45.960 --> 00:55:50.309 And it's not 2 ways to flexible ways to do. 536 00:55:50.309 --> 00:55:55.920 Oh, okay, okay. Um. 537 00:56:00.030 --> 00:56:04.230 So, um, the next step up from open. 538 00:56:04.230 --> 00:56:07.920 M P. is open a. 539 00:56:07.920 --> 00:56:14.130 And I'll just hit it very quickly. And again, it's a, it's, it's a living thing. 540 00:56:14.130 --> 00:56:17.519 To have a website to get files of information. 541 00:56:19.050 --> 00:56:23.969 The. 542 00:56:43.675 --> 00:56:47.275 The problem is that the project is a different resolution to the laptop. 543 00:56:47.579 --> 00:56:52.139 The laptop to justice resolution to match the projector, but I confused. 544 00:56:52.139 --> 00:56:56.250 So, we'll be like that. 545 00:57:35.965 --> 00:58:08.304 Okay. 546 00:58:16.619 --> 00:58:22.889 Hello. 547 00:58:35.519 --> 00:58:47.579 Okay. 548 00:59:30.090 --> 00:59:38.219 So, the large place here, a block of. 549 00:59:38.219 --> 00:59:44.849 Okay, okay. Um. 550 00:59:57.300 --> 01:00:03.750 Here and it's very low. I wanted to do with a couple of things. 551 01:00:05.460 --> 01:00:13.949 Okay, so the sorts of things that happened here, if you feel like. 
552 01:00:13.949 --> 01:00:17.250 You've got a pragma and you have a block. 553 01:00:17.250 --> 01:00:21.960 But something like this says that when you go into the parallel region, copy. 554 01:00:21.960 --> 01:00:27.360 The array in when you start, and copy it out when you finish, so. 555 01:00:29.820 --> 01:00:33.989 Uh, that is handled, generated automatically to some extent. 556 01:00:33.989 --> 01:00:37.980 The new things in parallel and this. 557 01:00:37.980 --> 01:00:42.659 Says it's prebaked to do the same firewall. 558 01:00:42.659 --> 01:00:49.920 So, the takeaway from that is that is also. 559 01:00:49.920 --> 01:00:53.789 Um, there. 560 01:00:53.789 --> 01:00:59.429 The way you do something in parallel: you write your sequential code as an algorithm that's parallelizable. 561 01:00:59.429 --> 01:01:04.320 All right, um, and. 562 01:01:05.699 --> 01:01:13.019 And again, um, this tutorial is aimed at people at Oak Ridge and so on that are using the. 563 01:01:13.019 --> 01:01:16.469 Um, supercomputers, um. 564 01:01:18.030 --> 01:01:23.369 You can do something like that, that tells the compiler to try and do stuff in parallel. 565 01:01:23.369 --> 01:01:28.769 So. 566 01:01:30.269 --> 01:01:35.489 People using it, so it's like, yes, yes. 567 01:01:35.489 --> 01:01:47.130 Guessing, but like, so, like, for example, with OpenMP, like, you think about parallel for, which is, like, an explicit thing that OpenMP understands. 568 01:01:47.130 --> 01:01:55.289 Whereas with OpenACC, like, okay, I have this block of code and, like, it probably can be. 569 01:01:55.289 --> 01:01:58.289 No, it's a little more. 570 01:01:58.289 --> 01:02:01.650 I mean, not actually correct. It's if it's. 571 01:02:04.050 --> 01:02:07.199 If its analysis shows it's probably. 572 01:02:07.199 --> 01:02:12.840 Parallelize it, that's the theory of all the theory of in practice. 573 01:02:13.949 --> 01:02:17.429 By the way, should you be inclined.
574 01:02:17.429 --> 01:02:26.010 If a compiler is trying to do any sort of optimizing, there's some programming techniques which are brutal for that, like using aliases. 575 01:02:26.010 --> 01:02:31.019 You get the same address that has two different names, and the compiler doesn't know about the different names. 576 01:02:31.019 --> 01:02:36.599 So, that will destroy any sort of optimizing. 577 01:02:42.750 --> 01:02:48.539 So, I'm trying to fill it. 578 01:02:48.539 --> 01:02:53.250 You talked about profiling, so there are profiling tools. 579 01:02:53.250 --> 01:02:56.880 Parallel director stuff. 580 01:02:56.880 --> 01:03:09.360 What's happening here is this concept: you've got levels of parallelism in NVIDIA, but they're trying not to use NVIDIA terminology. So the result is to confuse everyone. 581 01:03:09.360 --> 01:03:13.079 Um, yeah. 582 01:03:15.179 --> 01:03:25.559 And, okay, so you can have the reduction clause. 583 01:03:25.559 --> 01:03:32.940 So, as I mentioned, it transparently and invisibly builds this tree in parallel. 584 01:03:32.940 --> 01:03:36.179 So, a private one for each thread, and then go off. 585 01:03:39.840 --> 01:03:46.860 Okay, one thing I have not talked about at all: of course, none of the programs I showed you are optimized. 586 01:03:46.860 --> 01:03:50.039 You do something like this, that optimizes that. 587 01:03:50.039 --> 01:03:53.730 And. 588 01:03:54.840 --> 01:03:58.530 Uh, hey, um. 589 01:04:01.260 --> 01:04:06.360 And you can specify the target. So another thing, lots of information. 590 01:04:06.360 --> 01:04:10.739 If you say, it has a switch, -Minfo, I mean. 591 01:04:10.739 --> 01:04:17.309 The compiler will give larger amounts of information about what it was doing, depending on what. 592 01:04:17.309 --> 01:04:21.599 Yeah, so and. 593 01:04:21.599 --> 01:04:24.840 And they're talking about speedup. 594 01:04:24.840 --> 01:04:29.099 So, I was trying some of their examples. I was getting smaller amounts of speedup.
595 01:04:30.750 --> 01:04:33.929 Hello. 596 01:04:33.929 --> 01:04:45.360 Oh. 597 01:04:47.519 --> 01:04:58.679 Um, for example, uh. 598 01:05:26.934 --> 01:05:27.204 Oh. 599 01:05:36.719 --> 01:05:42.659 And I'm comparing, I'll show you the source code in a minute it just summing numbers and giving the time. 600 01:05:42.659 --> 01:05:48.269 This is the correct answer, um, is going to be correct. 601 01:05:50.789 --> 01:05:55.650 Later and they're all correct. Oh, I was doing. 602 01:05:55.650 --> 01:06:01.559 Well, that's big. 603 01:06:05.250 --> 01:06:09.659 I did see, I didn't compile that anything. 604 01:06:24.480 --> 01:06:29.940 So, open pages the wrong answer that. 605 01:06:39.030 --> 01:06:43.769 So, sequential types of 2nd, uh, open ACC. 606 01:06:43.769 --> 01:06:49.320 Takes a different, um, every time I write it, it takes a different amount of time. 607 01:06:49.320 --> 01:06:53.610 Why so I don't know what's going on. 608 01:06:55.710 --> 01:07:01.380 Twice the open ATC blocked for 2 seconds the 1st time for 3rd. A 2nd. 609 01:07:01.380 --> 01:07:04.559 The final thing was the topics. 610 01:07:04.559 --> 01:07:08.639 Which will get the right answer hasn't finished yet. Uh. 611 01:07:08.639 --> 01:07:12.539 Oh, that's. 612 01:07:14.610 --> 01:07:20.400 Well, I have so many new numbers 102Billion minus. 613 01:07:30.119 --> 01:07:37.019 Okay. 614 01:07:37.019 --> 01:07:43.110 The open hcc is less mature. 615 01:07:43.110 --> 01:07:49.889 Which is the so they say it does give the speed up. 616 01:07:49.889 --> 01:07:54.210 Um, there's slides are. 617 01:07:54.210 --> 01:07:57.630 For the test for all the work specifically. 618 01:08:00.179 --> 01:08:04.860 Hey, so that. 619 01:08:06.599 --> 01:08:09.690 On opening. 620 01:08:09.690 --> 01:08:18.210 We'll get to next is the parallel version of the standard template library paradigms. 621 01:08:18.210 --> 01:08:22.079 We're working in parallel where everything is a map reduce. 
622 01:08:23.130 --> 01:08:27.210 After that well. 623 01:08:27.210 --> 01:08:31.710 Yes. 624 01:08:33.510 --> 01:08:37.109 See, the number of. 625 01:08:37.109 --> 01:08:43.260 So, for all you always has been, uh. 626 01:08:43.260 --> 01:08:47.520 It was before I was. 627 01:08:47.520 --> 01:08:51.689 Which was the right function to call, and the maximum is. 628 01:08:51.689 --> 01:09:00.149 All the changes and okay, I think that one is the current number of threads that are running. 629 01:09:00.149 --> 01:09:04.590 So, unless you call it inside a parallel block behind the scene. 630 01:09:06.210 --> 01:09:12.420 Yeah, I'm learning this stuff as I'm going along and. 631 01:09:14.159 --> 01:09:22.500 One version of the program that might be helpful is I called that, but I called it inside a parallel block, and I put a critical around it. 632 01:09:22.500 --> 01:09:30.300 So otherwise it would have been called 56 times there. 633 01:09:33.270 --> 01:09:39.750 But I do like the fact that they give you the tools to investigate, to determine the state of the system. 634 01:09:40.829 --> 01:09:45.420 Okay, so that's just showing there. 635 01:09:45.420 --> 01:09:50.159 Where he's here, so oh, okay. Um. 636 01:09:52.020 --> 01:10:01.829 Let's see, it's not like that one. 637 01:10:03.689 --> 01:10:10.289 Okay, so. 638 01:10:12.899 --> 01:10:17.340 So, a review of the stuff today: I'm just showing you two. 639 01:10:17.340 --> 01:10:24.239 Extensions to C++ to do parallel programming. Once your algorithm is coded in a parallel way. 640 01:10:24.239 --> 01:10:29.520 You could have the pragmas to run it either on the many-core. 641 01:10:29.520 --> 01:10:37.260 That would be the Xeon on the parallel machine, or, or multicore, that's the Xeon with. 642 01:10:37.260 --> 01:10:42.060 56 hyperthreads, or many-core. That's the. 643 01:10:43.079 --> 01:10:46.260 We haven't talked about the hardware details yet. 644 01:10:46.260 --> 01:10:50.250 And OpenMP is more mature.
645 01:10:50.250 --> 01:10:57.329 It's lower level, but it's more mature. The problem in the past was it did not generate target code for a. 646 01:10:57.329 --> 01:11:01.439 Now it does, to some extent. I haven't played with it that much. 647 01:11:01.439 --> 01:11:05.220 My task really evaluate this stuff runs. 648 01:11:06.329 --> 01:11:09.869 Is it worth using? I don't know, but certainly OpenMP. 649 01:11:09.869 --> 01:11:14.010 It's well worth using. OpenACC is a higher level. 650 01:11:14.010 --> 01:11:18.060 The theory is, the compiler determines what to parallelize. 651 01:11:18.060 --> 01:11:21.390 They're using it at the Energy labs. Uh. 652 01:11:22.800 --> 01:11:26.579 How hard it was for them to use it, I don't know. When you read this stuff. 653 01:11:26.579 --> 01:11:31.229 They don't always tell you how much work it took them to get things working. So. 654 01:11:31.229 --> 01:11:35.130 There are some books on OpenMP, a couple of them. 655 01:11:35.130 --> 01:11:39.930 And there's, so there's free stuff, if you wanted free stuff. 656 01:11:39.930 --> 01:11:47.130 On the public directory. There's also a couple of OpenACC books you can buy. 657 01:11:48.329 --> 01:11:56.100 I was looking to support it and next time, um, we'll move on to parallel versions of. 658 01:11:56.100 --> 01:12:04.079 So, operating over vectors in parallel, we'll see some new paradigms that work only in parallel. 659 01:12:04.079 --> 01:12:07.109 That are not useful for sequential machines but are useful. 660 01:12:07.109 --> 01:12:13.529 Very useful in parallel. That's it, if there's any questions. 661 01:12:15.300 --> 01:12:21.029 So, my teaching style is I like to work from the specific to the general. So. 662 01:12:21.029 --> 01:12:25.079 Some other clients, so I'm giving you some specifics. 663 01:12:27.869 --> 01:12:31.560 Good question. 664 01:12:31.560 --> 01:12:36.689 This was a weird way to do it, but I. 665 01:12:36.689 --> 01:12:40.529 If it worked, um.
666 01:12:48.899 --> 01:12:56.369 And there was an old 1, so I'll upload this.