WEBVTT 1 00:00:00.000 --> 00:00:07.230 It's the same thing, but they're not in the fall for the day. This is great. 2 00:00:07.230 --> 00:00:11.308 I think it would fall into, I don't know. 3 00:00:11.308 --> 00:00:14.728 Laughing. 4 00:00:14.728 --> 00:00:18.028 Hello. 5 00:00:18.028 --> 00:00:26.399 Okay, so what's happening today is. 6 00:00:26.399 --> 00:00:30.179 Continuing on with Thrust and. 7 00:00:30.179 --> 00:00:34.049 I put piles of extra stuff online for you. 8 00:00:34.049 --> 00:00:40.439 I'm now sharing from a set of notes. It's a few years old, presented, uh, at the GPU Technology Conference. 9 00:00:40.439 --> 00:00:54.204 Because it's well presented and NVIDIA has not updated that. And the parts that it's discussing are still valid. What's been added since then are newer things like working with unified memory and a few extra operators and so on. 10 00:00:54.234 --> 00:00:57.145 But the core ideas are still the same. 11 00:00:57.780 --> 00:01:04.799 Um, I'll, um, walk you through a little, some of the. 12 00:01:05.969 --> 00:01:12.030 That's true. Um, some of the website here, let's see. 13 00:01:12.030 --> 00:01:18.450 Okay, a little smaller, get a little more on it. 14 00:01:18.450 --> 00:01:22.769 Okay, um, so. 15 00:01:26.189 --> 00:01:29.489 Okay. 16 00:01:29.489 --> 00:01:33.420 Um, Stanford has some well-presented, um. 17 00:01:33.420 --> 00:01:39.599 Slides talking about Thrust, but I'm working with the NVIDIA thing, which has a little more information in it. 18 00:01:39.599 --> 00:01:45.030 But I moved stuff around a little here. A couple of things I did. 19 00:01:45.030 --> 00:01:51.840 Is, there were several other key teaching kits on the NVIDIA developer website. 20 00:01:51.840 --> 00:01:56.040 And I brought them in for you, if you wanted to look at them. We were looking from. 21 00:01:56.040 --> 00:02:00.719 The GPU, um, accelerated computing kit for some time. 
22 00:02:00.719 --> 00:02:08.189 That, but there's also, you can see, some data science, some deep learning. 23 00:02:08.189 --> 00:02:11.460 And so on, and robotics. 24 00:02:11.460 --> 00:02:14.969 NVIDIA sees deep learning as a major. 25 00:02:14.969 --> 00:02:22.500 Area that customers want, so that has some tutorial information. In fact, they're packaging the. 26 00:02:22.500 --> 00:02:29.250 And servers intended for deep learning and so on, and they're used, for example, well. 27 00:02:29.250 --> 00:02:32.879 You know, cars have NVIDIA in them. 28 00:02:32.879 --> 00:02:38.699 And so on. Okay, so I've got that 2 samples. That's in the past. 29 00:02:38.699 --> 00:02:44.219 I moved stuff around for Thrust here. There's some of the documentation I'll go from. 30 00:02:44.219 --> 00:02:48.870 GTC part 2. Other stuff that's in here, um. 31 00:02:48.870 --> 00:02:55.800 NVIDIA, what do I have in here? I'm putting stuff in here, but, um. 32 00:02:55.800 --> 00:03:02.729 Some programs, some of NVIDIA's, some programs I've worked on. Let me show a program or two. 33 00:03:02.729 --> 00:03:08.789 Before I go back to the documentation, um, this is an old set of examples from. 34 00:03:08.789 --> 00:03:13.050 I'll update them later, but. 35 00:03:14.219 --> 00:03:20.430 Well, let's see one of the earlier ones, say, SAXPY. 36 00:03:20.430 --> 00:03:24.300 The biggest possible, um. 37 00:03:28.650 --> 00:03:32.669 So, it's just a simple example of. 38 00:03:32.669 --> 00:03:38.370 Multiplying a vector by a constant factor and adding another vector to it. 39 00:03:38.370 --> 00:03:41.699 The sum a times x plus y. 40 00:03:41.699 --> 00:03:45.569 It's a standard operation in a lot of numerical computing. 41 00:03:45.569 --> 00:03:50.729 And so here, let me, um, demonstrate that it actually works. 42 00:03:52.199 --> 00:03:57.479 You can you help? Oh, well, but to show you what the program is. 
43 00:03:58.650 --> 00:04:02.430 Include a pile of header things here. 44 00:04:03.479 --> 00:04:14.879 What do we have up here? Well, two different ways it might be implemented. The first thing is to review. This is this crazy thing in C++. 45 00:04:14.879 --> 00:04:18.839 Where you can overload the parenthesis operator. 46 00:04:18.839 --> 00:04:24.209 So, this is defining a class, because a struct and a class are the same. 47 00:04:24.209 --> 00:04:29.999 Except that struct members are public by default and class members are private by default. 48 00:04:29.999 --> 00:04:36.509 Other than that, they're the same. So this is defining a new class called saxpy functor and. 49 00:04:36.509 --> 00:04:40.259 It extends a base class, which you hardly actually need. 50 00:04:40.259 --> 00:04:43.439 Now, in this class, there is a member. 51 00:04:43.439 --> 00:04:48.778 Um, a float here that's called a, and it's a const. 52 00:04:48.778 --> 00:04:56.098 So, if you have a constant member, the only way you can set its value is when you construct an element of the class. 53 00:04:56.098 --> 00:05:03.658 So, that's when you construct a new element of the class. So that's an exception to const. 54 00:05:03.658 --> 00:05:10.559 But it's locked down once you do that. Okay, so you've got this member here. And the only way it's going to get a value is, um. 55 00:05:10.559 --> 00:05:18.718 When you construct it. What we have here, again, since people have different levels of experience with C++. 56 00:05:18.718 --> 00:05:24.869 So here's a function, and it's in the class. Its name is the name of the class. 57 00:05:24.869 --> 00:05:29.218 So this is a constructor. So if you call. 58 00:05:29.218 --> 00:05:32.338 If you construct a new element of the class, then. 59 00:05:32.338 --> 00:05:35.908 It will, if appropriate, call this function. 60 00:05:35.908 --> 00:05:39.389 This function, I mean, there's a default constructor also. 
61 00:05:39.389 --> 00:05:44.189 But what it will attempt to do is, when you construct an element of the class, you can give it. 62 00:05:44.189 --> 00:05:51.329 An argument, and if it can match the argument with a float, then it will call this function. 63 00:05:52.499 --> 00:05:59.759 And what the function will do is, it will construct an element of the class, then it will assign a value to a. 64 00:05:59.759 --> 00:06:05.129 Now, there's some new syntax in here also that you may not have seen. 65 00:06:05.129 --> 00:06:08.189 Well, first, the underscore is just a convention for. 66 00:06:08.189 --> 00:06:16.709 Temporaries. The idea is, if you have an underscored name, then it will not conflict with a user-supplied name, perhaps. 67 00:06:16.709 --> 00:06:21.209 Now, the body of the function is empty. 68 00:06:21.209 --> 00:06:25.199 And here's something some of you might not have seen before. 69 00:06:25.199 --> 00:06:29.968 You've got the header for the function and you've got a colon. 70 00:06:29.968 --> 00:06:42.088 Then you've got this notation. This notation is a way you can give initial values to variables inside the function. 71 00:06:42.088 --> 00:06:49.678 Or, in this case, in the constructor. So what this says is that a resolves to this member of the class. 72 00:06:49.678 --> 00:06:52.798 Because we're inside of the class definition. 73 00:06:52.798 --> 00:06:59.548 And what this says here is that the element, the member a of the class. 74 00:06:59.548 --> 00:07:02.968 Will be initialized with the value of underscore a. 75 00:07:02.968 --> 00:07:06.418 Which is the argument here. 76 00:07:07.889 --> 00:07:11.069 So, if you construct an element of the class. 77 00:07:11.069 --> 00:07:16.978 And you give it an argument, then this will construct an element of the class and assign a at construction time. 78 00:07:16.978 --> 00:07:21.899 Then this whole thing, it returns an element of the class, and a has now been locked down. 
79 00:07:23.459 --> 00:07:32.668 So that's the constructor thing here, and what we have down in here is we're overloading the parenthesis operator. So. 80 00:07:32.668 --> 00:07:36.418 If we call variables of the class as if they were functions. 81 00:07:36.418 --> 00:07:41.759 We'll, in fact, execute this thing here. Um. 82 00:07:41.759 --> 00:07:46.738 Host device: this function will be compiled twice. 83 00:07:46.738 --> 00:07:51.418 It will be compiled once so it can run on the host; that would be the. 84 00:07:51.418 --> 00:07:58.709 Xeon, and it will be compiled again, with different machine code, so it can run on the device. That would be the GPU. 85 00:07:58.709 --> 00:08:02.309 So, it would be compiled twice. Um. 86 00:08:03.809 --> 00:08:08.728 And so this is operator parens. This is how you overload an operator. 87 00:08:08.728 --> 00:08:15.178 You know, the word operator, then followed by the name of the operator, and it returns a float. 88 00:08:15.178 --> 00:08:18.689 It takes 2 arguments. 89 00:08:18.689 --> 00:08:22.108 X and Y, by reference. 90 00:08:22.108 --> 00:08:28.288 Now, they don't change, and there's another const here, basically. 91 00:08:28.288 --> 00:08:32.818 Saying the function doesn't go changing stuff, and then it returns this. 92 00:08:32.818 --> 00:08:37.499 Now, so you can see, x and y, they're arguments that came in. 93 00:08:37.499 --> 00:08:42.208 And what is a? It's this member variable up here. 94 00:08:42.208 --> 00:08:46.078 Because this function, it's defined inside the definition for the class. 95 00:08:46.078 --> 00:08:51.778 So, it has access to elements of the class, like the member variable a, for example. 96 00:08:53.339 --> 00:08:58.288 Now, one point up here is that, so if you say. 97 00:08:58.288 --> 00:09:04.078 You've got some variable and you call it as a function, it will call this only if it can do a match. 98 00:09:04.078 --> 00:09:09.058 On the arguments; it can cast them to floats, for example. 
99 00:09:10.259 --> 00:09:16.288 If you call the thing as a function with 1 argument or 3 arguments, then it would not call this because it. 100 00:09:16.288 --> 00:09:23.099 It could not do the match at compile time. Now, thinking of optimization. 101 00:09:23.099 --> 00:09:27.599 The syntax here for this thing is horrible, and it can be replaced with a lambda. 102 00:09:27.599 --> 00:09:32.698 Which is one reason I like that, and we'll show it can be replaced with placeholder notation. 103 00:09:32.698 --> 00:09:37.379 Also, um, we'll see different ways to do this, but, um. 104 00:09:37.379 --> 00:09:44.099 The thing with this is what the compiler will do if you turn on any optimization at all with a compiler. 105 00:09:44.099 --> 00:09:49.558 Is, it will take this, because you look at a little thing like this, the function body's really small. 106 00:09:49.558 --> 00:09:53.339 It would take more time to call the function and bind the arguments and. 107 00:09:53.339 --> 00:09:56.399 Handle the return than it would take to actually compute this. 108 00:09:56.399 --> 00:10:00.928 But the point is that this little block of code will get inserted inline. 109 00:10:00.928 --> 00:10:06.719 So, the function call will vanish. This little block of code gets inserted inline, with. 110 00:10:06.719 --> 00:10:12.089 X and Y being replaced by the arguments. It's like a macro call, basically, except a little more. 111 00:10:12.089 --> 00:10:23.908 Intelligent. And then this little chunk of code gets put in where the function call was, and then the optimizer proceeds to re-optimize everything. So it's really efficient. 112 00:10:23.908 --> 00:10:29.068 It's better than certain other popular languages like Python. 113 00:10:29.068 --> 00:10:34.349 Okay, um, and, um. 114 00:10:36.479 --> 00:10:41.698 And what we have up here is just another way to play with it. Okay. Um. 
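The functor described above can be sketched in plain C++ as follows. This is a minimal host-only sketch, not the code shown in the lecture: the name `saxpy_functor` follows the spoken description, the helper `call_once` is hypothetical, and the `__host__ __device__` qualifiers and the base class that the Thrust version carries are left out so the block compiles with an ordinary C++ compiler.

```cpp
#include <cassert>

// A struct is a class whose members are public by default.
struct saxpy_functor {
    const float a;  // const member: its value can only be set at construction

    // Constructor. The member initializer list after the colon
    // initializes the member a from the argument a_.
    saxpy_functor(float a_) : a(a_) {}

    // Overloaded call operator: lets an object be called like a function.
    // It takes exactly two arguments, so a call with one or three
    // arguments fails to match at compile time.
    float operator()(const float& x, const float& y) const {
        return a * x + y;
    }
};

// Hypothetical helper: construct once, then call the object as a function.
inline float call_once() {
    saxpy_functor f(2.0f);           // a is locked at 2 from here on
    float r = f(3.0f, 1.0f);         // invokes operator(): 2*3 + 1
    assert(r == 7.0f);               // exact, since all values are small integers
    return r;
}
```

Because the body of `operator()` is so small, an optimizing compiler will normally inline it at the call site, which is the point made above about the call overhead vanishing.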
115 00:10:43.678 --> 00:10:47.759 So, what we have here is a way to call this. Actually, it's going to be a. 116 00:10:47.759 --> 00:10:52.349 A user function, saxpy fast, and, um. 117 00:10:54.839 --> 00:11:04.948 And what this is going to do here is it will apply SAXPY to vectors. A is a constant, a is a float. And in this case, x and y are vectors. So. 118 00:11:04.948 --> 00:11:09.389 So, this will apply this to the vectors, showing you the. 119 00:11:09.389 --> 00:11:12.479 The Thrust idea of how you would do this. So this. 120 00:11:12.479 --> 00:11:16.828 And, um. 121 00:11:16.828 --> 00:11:20.489 So, what this has as arguments is a, and then, um. 122 00:11:20.489 --> 00:11:25.708 A couple of device vectors. So again, the only real. 123 00:11:25.708 --> 00:11:29.369 Particular data type in Thrust is vectors. 124 00:11:29.369 --> 00:11:34.769 That can be on the host or on the device, and you say host vector or device vector. 125 00:11:34.769 --> 00:11:40.139 And they're sort of like, say, a C++ vector, with a little extra overhead attached to them. 126 00:11:40.139 --> 00:11:44.249 So you have to convert some back and forth, but. 127 00:11:44.249 --> 00:11:51.688 Okay, now, so with Thrust, the idea is you're working with vectors and. 128 00:11:51.688 --> 00:12:00.629 Functional programming style. You just work with vectors; you do not have to explicitly loop over them. That's done inside Thrust. 129 00:12:00.629 --> 00:12:08.938 And the sorts of things you can do with vectors are, um, transform and, um. 130 00:12:09.958 --> 00:12:18.719 Um, transform's got a couple of different forms, depending on the arguments that it has. 131 00:12:18.719 --> 00:12:27.899 This one form here, the input will be two vectors: vector x, with a beginning and an end iterator. 132 00:12:27.899 --> 00:12:37.048 And vector y. What it will do is it will operate element by element on two vectors x and y. 
133 00:12:37.048 --> 00:12:43.769 And write out a third vector, which will also be y. So we're overwriting y in place. 134 00:12:43.769 --> 00:12:47.038 And it turns out that that will work. 135 00:12:47.038 --> 00:12:51.688 It doesn't always work, but it works in this case. You have to. 136 00:12:51.688 --> 00:12:58.828 Definitely, you know, read the documentation on transform. So transform will combine two vectors element by element and write them out. 137 00:12:58.828 --> 00:13:04.739 So now, what will transform do to the pairs of elements, one from x, one from y? 138 00:13:04.739 --> 00:13:08.578 It will apply this function here. Um. 139 00:13:10.528 --> 00:13:16.889 I've written this thing, I talked about it a little last time, which was a week and a half ago, and I've written it down. I'll show you where I wrote it down. 140 00:13:16.889 --> 00:13:20.578 This here is a function, um. 141 00:13:20.578 --> 00:13:31.139 But what this does is it constructs a new element, a new variable of class saxpy functor. It's calling the constructor explicitly. 142 00:13:31.139 --> 00:13:35.668 The constructor takes one argument, which is a, which came into this function. 143 00:13:35.668 --> 00:13:40.769 From the arguments of the function. So this thing returns a function. 144 00:13:40.769 --> 00:13:44.879 Um, which can now be called. 145 00:13:44.879 --> 00:13:48.359 And now it's thrust transform. 146 00:13:48.359 --> 00:13:53.849 Well, if you look in the definition of thrust transform, it will call this function. 147 00:13:53.849 --> 00:13:59.129 With 2 arguments and expect it to return a value. 148 00:13:59.129 --> 00:14:03.479 So, it calls it with an argument from x and an argument from y. 149 00:14:03.479 --> 00:14:06.958 An element of x and an element of y; it returns a value, which it will write back into y. 150 00:14:06.958 --> 00:14:13.408 So this is using this notation here. 
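The transform call just described can be sketched on the host with the standard library. This is a sketch under stated assumptions: `std::transform` stands in for `thrust::transform` (the two take their arguments in the same order: first input range, start of second input, start of output, operation), `std::vector` stands in for `thrust::device_vector`, and the name `saxpy_fast` follows the spoken description rather than the file shown in class.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct saxpy_functor {
    const float a;
    saxpy_functor(float a_) : a(a_) {}
    float operator()(const float& x, const float& y) const { return a * x + y; }
};

// y <- a*x + y in a single pass over the data.
void saxpy_fast(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    // Inputs: [x.begin(), x.end()) and the range starting at y.begin().
    // Output: also y.begin(), so y is overwritten in place. That is safe
    // here because element i of the output depends only on element i
    // of the inputs.
    // saxpy_functor(a) constructs a temporary functor with a bound inside.
    std::transform(x.begin(), x.end(), y.begin(), y.begin(), saxpy_functor(a));
}
```

With `x = {1, 2, 3}`, `y = {10, 20, 30}` and `a = 2`, the call leaves `y` holding `{12, 24, 36}`.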
151 00:14:13.408 --> 00:14:19.019 A functor is just a function on functions, you might say; that's the definition of it, actually. 152 00:14:19.019 --> 00:14:23.249 So, but this is this weird thing now. 153 00:14:25.589 --> 00:14:31.078 So, the syntax is totally weird, but it actually works and it works fast. So. 154 00:14:33.899 --> 00:14:38.938 And, um. 155 00:14:38.938 --> 00:14:42.778 And saxpy fast and slow, we'll look at it in a minute. 156 00:14:42.778 --> 00:14:49.318 And in the main program, initialize a few things. It doesn't write anything out here. 157 00:14:49.318 --> 00:14:54.239 Well, here, I'll go back to the slow version in a minute. 158 00:14:54.239 --> 00:14:58.769 But it initializes C standard arrays. 159 00:14:58.769 --> 00:15:04.859 Copies; we construct some device vectors here and we give it. 160 00:15:04.859 --> 00:15:09.568 And it will copy from the host vector. 161 00:15:09.568 --> 00:15:16.408 When you have a C-style array like this, and you just use the name, like x, it really is a pointer to the zeroth element. 162 00:15:16.408 --> 00:15:20.639 So, and again, when you do addition. 163 00:15:20.639 --> 00:15:23.999 On pointers, it adds. 164 00:15:23.999 --> 00:15:31.948 Plus 1 is the next item in the list, or the array, or the vector. It's not 1 byte later in memory. 165 00:15:31.948 --> 00:15:36.719 In memory terms, it goes up by the size of the elements of x. 166 00:15:36.719 --> 00:15:41.038 Which is probably 4. Okay. Um. 167 00:15:41.038 --> 00:15:50.369 So, in any case, we can do the SAXPY. So you do the fast way here, call it. 168 00:15:50.369 --> 00:15:55.649 Now, the slow method, um, what is the slow method doing? 169 00:15:58.828 --> 00:16:05.759 The slow method here does not do a x plus y as one operation. 170 00:16:05.759 --> 00:16:11.668 It does it as several operations. It takes a times x. 171 00:16:11.668 --> 00:16:19.918 It gets a, a scalar, as a vector, computes that, and stores it in a temporary, or stores it.
In y, and then does. 172 00:16:19.918 --> 00:16:23.759 The temporary plus y as a separate operation. 173 00:16:23.759 --> 00:16:28.019 So the reason this is slow is that it's storing temporary data. 174 00:16:28.019 --> 00:16:33.568 In memory, so it basically doubled the amount of I/O to the memory. 175 00:16:33.568 --> 00:16:39.389 And since in a lot of parallel programs your rate-limiting factor is data transfer. 176 00:16:39.389 --> 00:16:46.139 I/O, you do not want to be storing temporaries back in memory if you can at all avoid it. 177 00:16:46.139 --> 00:16:49.678 So, the fast method does the a x plus y computation. 178 00:16:49.678 --> 00:16:53.548 All at once; it doesn't go storing and then reading it back. Yeah. 179 00:16:53.548 --> 00:16:59.038 So, that's why this is called slow. You see, this does transform as two steps here. 180 00:16:59.038 --> 00:17:03.599 Um, and. 181 00:17:04.858 --> 00:17:09.058 And plus float, this is the operator provided in the library. 182 00:17:09.058 --> 00:17:16.229 Plus, and it's a template, and it's instantiated with the float type. It just has to do floats. 183 00:17:16.229 --> 00:17:22.348 And again, it's basically constructing a, um. 184 00:17:22.348 --> 00:17:25.348 A new variable of type plus float. 185 00:17:25.348 --> 00:17:30.388 The constructor takes no arguments and then this. 186 00:17:30.388 --> 00:17:33.509 What I've highlighted is a function, which then gets called. 187 00:17:33.509 --> 00:17:40.288 Okay, that's, um, saxpy fast. Okay, so that's, um. 188 00:17:40.288 --> 00:17:47.368 The simple way. Um, let me look at one that would write something out. Um. 189 00:17:47.368 --> 00:17:50.429 Okay, um. 190 00:17:50.429 --> 00:17:53.818 Hello. 191 00:17:53.818 --> 00:17:58.528 A weird way to do random numbers. Don't worry about that. Um. 192 00:17:58.528 --> 00:18:07.169 Generate is a function, which. 193 00:18:07.169 --> 00:18:12.088 You give it a vector, and element by element it calls my rand. 
194 00:18:12.088 --> 00:18:15.479 It returns elements that it stuffs into the vector. So. 195 00:18:15.479 --> 00:18:21.808 Again, here you can copy vectors from host to device and back again. 196 00:18:21.808 --> 00:18:25.888 Um. 197 00:18:25.888 --> 00:18:33.989 So this is the reduce. I showed it to you last time; a quick review, because it's been a week and a half. 198 00:18:33.989 --> 00:18:38.759 It takes a vector specified by the beginning and ending iterators. 199 00:18:38.759 --> 00:18:43.499 Takes an initial value and a way you combine them. And for example, if init is 0. 200 00:18:43.499 --> 00:18:52.169 And the binary op is plus, it sums them, and it does it in log n time; it's doing the reduction in parallel. We saw, I think, how to do that. So. 201 00:18:53.249 --> 00:18:57.028 Okay, um. 202 00:18:58.078 --> 00:19:01.259 Maybe I'll go back to the documentation before. 203 00:19:01.259 --> 00:19:04.888 Let me anticipate things a little. 204 00:19:07.378 --> 00:19:12.868 Anticipating what will be in the documentation. 205 00:19:12.868 --> 00:19:16.409 So you think of an iterator as, it points to a vector. 206 00:19:16.409 --> 00:19:27.118 An element of a vector, and you bump it up and it points to the next element of the vector. Yeah, but an iterator can be a more general idea than that. 207 00:19:27.118 --> 00:19:32.128 It might actually be calling a function to return elements. 208 00:19:32.128 --> 00:19:35.459 And it might be, and what we have here. 209 00:19:35.459 --> 00:19:40.229 Is, um. 210 00:19:40.229 --> 00:19:46.469 So this is, it's an iterator, but it's an iterator to. 211 00:19:46.469 --> 00:19:59.818 A vector that doesn't actually exist, but you can imagine it exists, because you can point to elements of it and read the elements, and what you will always get is a ten. 212 00:19:59.818 --> 00:20:02.939 Well, this is the constant iterator; it's pointing to. 
213 00:20:02.939 --> 00:20:09.898 A, you might say, a virtual or a lazily evaluated conceptual vector. 214 00:20:09.898 --> 00:20:14.368 And it's as if it was a vector that was infinitely long and had all tens in it. 215 00:20:14.368 --> 00:20:22.798 But the point is that it's not actually stored; it takes no space in memory and it takes basically no time to execute. 216 00:20:22.798 --> 00:20:26.608 Because this is handled inside the library, so. 217 00:20:26.608 --> 00:20:36.088 So, for example, if you need a vector, a constant vector, this would construct a vector of everything being the same element for you. 218 00:20:36.088 --> 00:20:46.318 But the advantage of this is, um, first, why would you want a vector of constants? Say you're. 219 00:20:46.318 --> 00:20:50.999 Doing a dot product of two vectors, and say one of the vectors you want to dot. 220 00:20:50.999 --> 00:20:57.598 You want to be all constants. You construct it with something like what I've highlighted, and you don't have to store it. 221 00:20:57.598 --> 00:21:02.038 And here's the big thing: the code that uses it. 222 00:21:02.038 --> 00:21:09.959 Doesn't know that it's a funny vector, a constant vector. This is the big thing. You've got some conceptual unification here. 223 00:21:09.959 --> 00:21:13.919 If you wrote this in some low-level language, like C. 224 00:21:13.919 --> 00:21:22.138 You want to handle, you know, you're doing a dot product of two vectors, and you want to handle the case where one of the vectors is all a constant. 225 00:21:22.138 --> 00:21:31.648 In a low-level language you have two ways to do it. You actually construct a vector of all constants, but that takes space and time. Or you've got a special case in your code. 226 00:21:31.648 --> 00:21:38.848 Here, neither, because it's just a new data type, constant vector, that's defined in the library. 227 00:21:38.848 --> 00:21:44.729 And defined in the header files, actually, and what it does is, it returns constants.
So, this is the. 228 00:21:44.729 --> 00:21:52.979 Thing with this functional programming: they try to unify everything to look like a vector and look like a reduce. And this is one of the techniques that they use. 229 00:21:52.979 --> 00:21:56.459 They provide stuff like the constant vector here. 230 00:21:57.659 --> 00:22:01.169 And what it does is, if we run it, um. 231 00:22:04.288 --> 00:22:12.328 It just added ten to each element of the vector. So it's a special one. They've got a number of these special vectors here. 232 00:22:12.328 --> 00:22:18.689 Constant is one. There's one which increments up and gives you a. 233 00:22:18.689 --> 00:22:22.199 It counts up by a constant every time. 234 00:22:22.199 --> 00:22:33.929 Then there are vectors which actually are functions of other vectors, and the functions are evaluated element by element as you need them. It's lazy evaluation. 235 00:22:33.929 --> 00:22:40.378 Okay, so oh, let me show you a little lambda thing. Um. 236 00:22:41.788 --> 00:22:49.439 Okay, it's a little run. Let's show you a couple of things: placeholder notation. You can read the comments later, but. 237 00:22:50.578 --> 00:22:53.638 Here was the old-fashioned way, the functor notation. 238 00:22:53.638 --> 00:22:57.929 You're creating a new class and overloading parentheses. 239 00:22:57.929 --> 00:23:01.318 Um. 240 00:23:01.318 --> 00:23:06.719 And. 241 00:23:08.459 --> 00:23:16.588 Here's a new way. So transform again, it's operating on two vectors and creating a third vector, operating element by element. 242 00:23:16.588 --> 00:23:21.959 So, again, I showed you last time, but it's worth showing you again. So, this here is a function. 243 00:23:21.959 --> 00:23:25.378 That takes 2 arguments and. 244 00:23:25.378 --> 00:23:31.439 And it will combine them. And what is a? So where is a bound? 245 00:23:31.439 --> 00:23:35.068 Well, it's the normal scope resolution for C++. 
246 00:23:35.068 --> 00:23:38.909 We'll go looking up to enclosing blocks and enclosing blocks. 247 00:23:38.909 --> 00:23:46.409 Until we find where a is defined, and way up here a is defined, up here, as two. 248 00:23:46.409 --> 00:23:53.398 So, the normal rule here, looking up to enclosing blocks to. 249 00:23:53.398 --> 00:24:01.919 Bind free variables. And again, this is a function that will be handed to transform, and transform will call it on pairs of elements. 250 00:24:01.919 --> 00:24:06.719 And again it's fast because. 251 00:24:06.719 --> 00:24:11.459 It's a little chunk of code that gets put inline and. 252 00:24:11.459 --> 00:24:16.828 The arguments get substituted in, and the compiler optimizes this stuff. And the way it's implemented. 253 00:24:16.828 --> 00:24:21.298 Is that variables underscore 1 to underscore 9. 254 00:24:21.298 --> 00:24:27.269 They are in a class called placeholder or something. 255 00:24:27.269 --> 00:24:33.028 And in that class, the common operators, like plus and times, are overloaded. 256 00:24:33.028 --> 00:24:38.189 And it will net out to do the right thing. So the placeholder method, it. 257 00:24:38.189 --> 00:24:43.318 It allows you to write, this is the most compact way you can write the little functions. 258 00:24:43.318 --> 00:24:48.358 So, it's very compact. You can read the code, you see what it's doing. 259 00:24:48.358 --> 00:24:57.538 And it produces code, you can read the source code, and the compiled code runs really quickly. And I like stuff that runs quickly. So. 260 00:24:57.538 --> 00:25:02.098 Okay, so that. 261 00:25:02.098 --> 00:25:06.719 And we could run it, do anything. Um. 262 00:25:06.719 --> 00:25:10.709 You're going to get the same answer. Good. So. 263 00:25:10.709 --> 00:25:14.098 Okay, so that's a couple of things here. 264 00:25:15.239 --> 00:25:21.509 Okay, I want to go to my notes on the blog before I go back to the PowerPoints or PDF. 
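The compact form discussed above can be sketched with a plain C++ lambda. This is a host-only sketch, not the lecture's file: `std::transform` again stands in for `thrust::transform`, and the function name `saxpy_lambda` is hypothetical. One detail differs from the functor version: a C++ lambda has to name the enclosing-scope variables it uses in its capture list, here `[a]`. The Thrust placeholder form appears only in a comment, since it needs the Thrust headers.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// y <- a*x + y, with the per-element function written inline as a lambda.
void saxpy_lambda(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    // [a] copies a from the enclosing scope into the lambda object, so the
    // lambda carries its own value of a, just as the functor's member did.
    std::transform(x.begin(), x.end(), y.begin(), y.begin(),
                   [a](float xi, float yi) { return a * xi + yi; });
    // With Thrust, the placeholder notation expresses the same operation:
    //   using namespace thrust::placeholders;
    //   thrust::transform(x.begin(), x.end(), y.begin(), y.begin(), a * _1 + _2);
}
```

Either way the per-element body is small enough that the compiler can inline it into the loop.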
265 00:25:23.368 --> 00:25:32.848 Okay, so the point about why I'm spending time on this in class is we're seeing some common. 266 00:25:32.848 --> 00:25:38.638 Parallel programming paradigms, and some algorithms also, we'll see, for parallel. 267 00:25:38.638 --> 00:25:42.088 And they've got lots of little demo programs, um. 268 00:25:42.088 --> 00:25:49.499 Stuff from, and provided by, and some of my additions. I'll look at them in the next couple of classes. 269 00:25:49.499 --> 00:25:56.278 And the thing with Thrust is a lot of stuff like that. Um. 270 00:25:56.278 --> 00:25:59.429 That thing will, of course. 271 00:25:59.429 --> 00:26:05.578 Run in parallel, so it's fast. So it will be parallelized, which is very nice, because, see, another thing. 272 00:26:05.578 --> 00:26:09.269 You put stuff in this paradigm of map-reduce. 273 00:26:09.269 --> 00:26:21.778 The reduction thing, because you're reducing, combining. Well, with the mapping, you're operating on elements, and things are independent, so it can do that in parallel very nicely. 274 00:26:21.778 --> 00:26:24.838 And that's good. So. 275 00:26:24.838 --> 00:26:33.179 Um, and things like reduce, that you would think would take linear time, in fact take log time. 276 00:26:33.179 --> 00:26:36.538 And, okay, um. 277 00:26:36.538 --> 00:26:41.699 So, here, I've just written down what I was saying about. 278 00:26:41.699 --> 00:26:47.878 These sorts of things. You can read it on your own. It took me a while to understand what's going on. It's really subtle. 279 00:26:47.878 --> 00:26:52.618 So, okay, um, and here's. 280 00:26:52.618 --> 00:26:56.249 I just wrote down what I just told you, um. 281 00:26:56.249 --> 00:27:01.259 About this functor notation, so. 282 00:27:01.259 --> 00:27:11.159 But what is cool with this functor notation is, every time you construct a new element of this class. 283 00:27:11.159 --> 00:27:14.338 The new element has a different value for that member a. 
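The remark that each constructed object carries its own value of a can be sketched as follows. This is a host-only sketch with hypothetical names: `combine` is a made-up stand-in for a library routine that only knows it can call `op(x, y)`, and `std::transform` is used in place of `thrust::transform`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct saxpy_functor {
    const float a;                       // private-to-the-object value of a
    saxpy_functor(float a_) : a(a_) {}
    float operator()(const float& x, const float& y) const { return a * x + y; }
};

// Hypothetical stand-in for a library routine. It never sees a; it only
// calls op on pairs of elements and stores what comes back.
template <typename Op>
std::vector<float> combine(const std::vector<float>& x,
                           const std::vector<float>& y, Op op) {
    assert(x.size() == y.size());
    std::vector<float> out(x.size());
    std::transform(x.begin(), x.end(), y.begin(), out.begin(), op);
    return out;
}
```

Calling `combine(x, y, saxpy_functor(2.0f))` and `combine(x, y, saxpy_functor(10.0f))` runs the same routine with two different bound values of a; for `x = {1, 2}` and `y = {1, 1}` the results are `{3, 5}` and `{11, 21}`.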
284 00:27:14.338 --> 00:27:20.009 So, you've essentially bound it. It's called a closure, actually, or a binding. 285 00:27:20.009 --> 00:27:23.038 You create different offerings, so. 286 00:27:23.038 --> 00:27:28.858 Each different variable, which has this operator here, has a different value for a hidden inside. 287 00:27:28.858 --> 00:27:33.179 So, and that can turn out to be useful. 288 00:27:33.179 --> 00:27:37.709 You don't have to specify a explicitly when you call. 289 00:27:37.709 --> 00:27:42.959 The function, which is good, because maybe some library thing is calling it, and you're not free to redefine. 290 00:27:42.959 --> 00:27:52.949 How the library operates. So you just create a function you give to the library, and you've got your private value a bound inside the function. It's a very nice, powerful tool. So. 291 00:27:52.949 --> 00:27:57.148 Oh, there's some stuff about bugs I've actually found, um. 292 00:27:57.148 --> 00:28:03.179 A bug or two inside nvcc, inside the NVIDIA software. So. 293 00:28:03.179 --> 00:28:12.689 Got a little annoyed about the first one, because I found a bug that they already knew about, but they hadn't announced. 294 00:28:12.689 --> 00:28:16.108 To the users. That really P*** me off. 295 00:28:16.108 --> 00:28:19.709 And then I put the compiler into an infinite loop at one point. 296 00:28:19.709 --> 00:28:27.509 Okay, oh, and one other interesting thing about some of these things. 297 00:28:27.509 --> 00:28:36.479 Thrust: the complete definition of Thrust is in the header files up here, the include files. 298 00:28:36.479 --> 00:28:40.378 There are no libraries to bind at run time. 299 00:28:40.378 --> 00:28:46.078 Now, it's not as deep as you might think, because, of course, you know, the header files contain code. 300 00:28:46.078 --> 00:28:50.368 You know, they're defining overloaded operators and so on. 301 00:28:50.368 --> 00:28:55.019 Okay, so let me go back to. 
302 00:28:57.058 --> 00:29:03.179 Thrust by Example, here and. 303 00:29:04.949 --> 00:29:10.618 I can make it a slight bit bigger, but not incredibly bigger. 304 00:29:13.739 --> 00:29:18.689 Presentation will. 305 00:29:18.689 --> 00:29:21.749 Start scrolling automatically, so. 306 00:29:21.749 --> 00:29:28.288 Okay, and again, vectors and so on. 307 00:29:28.288 --> 00:29:33.989 Let me go back. So your big things are. 308 00:29:33.989 --> 00:29:40.469 It's using the template mechanism in C++; it has the concept of iterators from the Standard Template Library. 309 00:29:40.469 --> 00:29:44.338 And functors are functions on functions, so. 310 00:29:44.338 --> 00:29:52.769 Okay, um, some best practices that we will see are, um. 311 00:29:52.769 --> 00:30:06.568 Fusion. The idea of fusion is to combine functions to avoid writing temporaries. So we saw that in the SAXPY example: the right way to do it is not to store a temporary. 312 00:30:06.568 --> 00:30:10.378 Vector with intermediate results. You want to fuse things together. 313 00:30:10.378 --> 00:30:18.148 In the functions. So, oh, there's a concept here I've mentioned before. Um. 314 00:30:20.699 --> 00:30:29.459 I'll mention again: the GPU likes to have consecutive elements of an array. 315 00:30:29.459 --> 00:30:38.038 Being like 4 bytes, every 4 bytes or something. It likes to have an array of plain old data types, fundamental data types, ints, floats and so on. 316 00:30:38.038 --> 00:30:45.028 If you want to have a structure, then you want to split it apart element by element, and put each element in a separate array. 317 00:30:45.028 --> 00:30:49.919 And this will make the memory. 318 00:30:49.919 --> 00:30:56.459 In your CUDA functions, when the different threads are reading from the global memory. 319 00:30:56.459 --> 00:31:00.659 Consecutive threads will be reading consecutive words from the global memory. 320 00:31:00.659 --> 00:31:06.689 And it will be much more efficient. It will reduce.
It's called memory coalescing, which we've talked about. 321 00:31:06.689 --> 00:31:10.528 The next thing here is, I just showed you. 322 00:31:10.528 --> 00:31:14.398 Um, with the constant and so on. 323 00:31:14.398 --> 00:31:19.888 A powerful thing with these iterators. 324 00:31:19.888 --> 00:31:29.999 Is that they're more than just a pointer to somewhere in memory. They could in fact trigger a function call that will create the array on demand. 325 00:31:29.999 --> 00:31:35.219 Lazy evaluation, it's called sometimes. And again, um. 326 00:31:35.219 --> 00:31:40.919 It reduces memory bandwidth, so. 327 00:31:40.919 --> 00:31:44.489 Here is an example of fusion. Suppose we. 328 00:31:44.489 --> 00:31:49.378 Have a vector x and we've got 2 functions f and g, and we want to compute g(f(x)). 329 00:31:49.378 --> 00:31:59.308 And, um, the lazy way: compute f(x) and store it in a temporary, y, let's say, and then compute g(y). 330 00:31:59.308 --> 00:32:02.669 But you've got that temporary, and on a. 331 00:32:02.669 --> 00:32:08.548 Host that may not be a problem, really, but on the device you do not want to do that. So. 332 00:32:08.548 --> 00:32:12.179 You do not want to have the temporary here. Let me show you. 333 00:32:12.179 --> 00:32:15.538 Here's a version of transform. It just has 1 vector. 334 00:32:15.538 --> 00:32:20.759 x, an input and an output vector, and it takes a function that it applies. 335 00:32:20.759 --> 00:32:28.469 Now, how does the compiler know which version of transform you're using? It looks at the arguments and. 336 00:32:28.469 --> 00:32:31.828 This is binding, you know, finding the matching. 337 00:32:31.828 --> 00:32:36.568 You know, when you've got an overloaded function call, finding the 1 whose arguments match. 338 00:32:36.568 --> 00:32:43.409 Okay, so this is the slow way to do it because you're doing unnecessary. 339 00:32:43.409 --> 00:32:46.709 Um. 340 00:32:48.239 --> 00:32:52.108 And it is in computations here.
341 00:32:52.108 --> 00:32:56.068 Because it has the temporary, so you can do the math and, um. 342 00:32:56.068 --> 00:33:00.388 That is the bad way to do it. Um. 343 00:33:00.388 --> 00:33:03.959 Here's a crazy good way to do it, and it shows. 344 00:33:03.959 --> 00:33:10.138 This idea that an iterator can be more than just a pointer to a word in memory. 345 00:33:11.638 --> 00:33:19.919 Here, what we're doing is, what is happening here, this is a new concept. I'll outline it. 346 00:33:19.919 --> 00:33:23.788 There, um, okay. 347 00:33:25.888 --> 00:33:30.719 The function make_transform_iterator returns an iterator. 348 00:33:30.719 --> 00:33:38.729 It's read only, you can't write to it. You can read, it's a pointer. You can read it. You can increment it. You can add 5 to it and so on. 349 00:33:38.729 --> 00:33:41.909 Just like an iterator to a point inside a vector. 350 00:33:43.048 --> 00:33:46.979 But what this iterator does when you dereference it. 351 00:33:48.028 --> 00:33:54.778 It grabs an element of the vector x and applies f to it and returns the result. 352 00:33:54.778 --> 00:34:02.608 So what this has is, you have a, you have a lazy virtual vector. 353 00:34:02.608 --> 00:34:07.828 Of x, with each element of x having f applied to it. 354 00:34:07.828 --> 00:34:12.958 But this isn't actually stored. It's generated element by element as you need it. 355 00:34:12.958 --> 00:34:18.748 And so it's called a transform iterator, and make_transform_iterator just, um. 356 00:34:18.748 --> 00:34:22.048 Returns a pointer to the start. 357 00:34:22.048 --> 00:34:30.539 To element 0, and make_transform_iterator applied to x.end(). 358 00:34:30.539 --> 00:34:38.969 Returns an iterator pointing to 1 past the end of f(x). 359 00:34:38.969 --> 00:34:47.248 And notice, by the way, you had better not dereference this here, because it's pointing 1 past the end. You know, it's. 360 00:34:47.248 --> 00:34:50.849 So you'd never dereference an end.
361 00:34:50.849 --> 00:35:01.378 Pointer, okay, because it's pointing 1 past the end of a vector in any case. So we now have a beginning and ending for this virtual vector f(x). 362 00:35:02.398 --> 00:35:06.268 And transform, but the thing is, you can read this. 363 00:35:06.268 --> 00:35:11.728 You can read these iterators just like they're pointing to a real vector, but they're pointing to this virtual vector. 364 00:35:12.869 --> 00:35:19.349 And I don't mean virtual in the sense of virtual memory. It's virtual in the sense of. 365 00:35:19.349 --> 00:35:23.909 It's constructed as you need it, so. 366 00:35:23.909 --> 00:35:28.768 So what we have here are the beginning and ending of the vector f(x). 367 00:35:28.768 --> 00:35:34.289 Except it's never stored. And then, so, transform will take the begin and end for this. 368 00:35:34.289 --> 00:35:40.498 And it will apply g to each element of that. And then it will store the results in z. 369 00:35:40.498 --> 00:35:45.898 And for z it just needs the begin iterator, because it just writes as many elements as it needs. 370 00:35:45.898 --> 00:35:49.978 Which means that you'd better have constructed z to be big enough. 371 00:35:49.978 --> 00:35:54.389 Okay, because you can easily run off the end of z here. 372 00:35:54.389 --> 00:35:59.458 Okay, so the big new idea on the slide up here. 373 00:35:59.458 --> 00:36:06.509 Is this: you can make a transform iterator pointing to this lazy virtual vector. 374 00:36:06.509 --> 00:36:12.148 So, powerful idea. What is the advantage of it? 375 00:36:12.148 --> 00:36:17.579 It's that you're not storing large amounts of temporary data in memory. 376 00:36:17.579 --> 00:36:26.998 So, and another thing, of course, you know, different elements of this virtual vector don't affect each other. 377 00:36:26.998 --> 00:36:31.139 So you see, something like this transform, it can be done in parallel. Okay.
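A minimal sketch of the lazy virtual vector just described, written in plain standard C++ so it runs on the host without a GPU. It is not Thrust's implementation: TransformIter, make_transform_iter, Twice, PlusOne, and fused_g_of_f are hypothetical names made up for this sketch, and the particular f and g are assumed examples. Thrust's own counterpart to the helper is thrust::make_transform_iterator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical stand-in for a transform iterator: read-only, and every
// dereference applies f to the underlying element instead of reading a
// stored result.
template <class Iter, class F>
class TransformIter {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = decltype(std::declval<const F&>()(*std::declval<const Iter&>()));
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = value_type;

    TransformIter(Iter it, F f) : it_(it), f_(f) {}

    value_type operator*() const { return f_(*it_); }  // computed on demand
    TransformIter& operator++() { ++it_; return *this; }
    TransformIter operator++(int) { TransformIter old = *this; ++it_; return old; }
    TransformIter operator+(difference_type n) const { return TransformIter(it_ + n, f_); }
    bool operator==(const TransformIter& o) const { return it_ == o.it_; }
    bool operator!=(const TransformIter& o) const { return it_ != o.it_; }

private:
    Iter it_;
    F f_;
};

template <class Iter, class F>
TransformIter<Iter, F> make_transform_iter(Iter it, F f) {
    return TransformIter<Iter, F>(it, f);
}

// Assumed example functions: f doubles, g adds one.
struct Twice   { float operator()(float v) const { return 2.0f * v; } };
struct PlusOne { float operator()(float v) const { return v + 1.0f; } };

// z = g(f(x)) with no temporary vector holding f(x).
std::vector<float> fused_g_of_f(const std::vector<float>& x) {
    std::vector<float> z(x.size());                      // z must be big enough
    auto first = make_transform_iter(x.begin(), Twice{});
    auto last  = make_transform_iter(x.end(), Twice{});  // never dereferenced
    assert(z.size() >= x.size());
    std::transform(first, last, z.begin(), PlusOne{});
    return z;
}
```

Calling fused_g_of_f on a vector holding 1, 2, 3 yields 3, 5, 7. The doubled values are never stored anywhere; each one exists only while std::transform is reading it.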
378 00:36:33.478 --> 00:36:40.918 Whole idea here. Can anyone think why this transform iterator might just slow you down? 379 00:36:40.918 --> 00:36:46.949 I could think of a case where it actually might make the program slower. Any ideas? 380 00:36:53.429 --> 00:36:57.568 Imagine that f was a really slow function. In the examples we've seen here. 381 00:36:57.568 --> 00:37:01.378 f is a really fast function, it just multiplies by a or something. 382 00:37:01.378 --> 00:37:05.608 But if f was a very slow function and. 383 00:37:05.608 --> 00:37:10.438 You were referencing elements of f(x) many times. 384 00:37:10.438 --> 00:37:14.998 Then f would have to get called redundantly many times on the same element of x. 385 00:37:14.998 --> 00:37:18.329 In that case, I could see it would slow you down. 386 00:37:18.329 --> 00:37:25.079 But usually these little functions are, are little and fast like that. Okay. So the new idea here was this. 387 00:37:25.079 --> 00:37:30.869 Transform iterator, and this shows the power of this functional programming idea. 388 00:37:30.869 --> 00:37:34.259 You work with vectors as. 389 00:37:34.259 --> 00:37:41.099 As a unit, and you work on the whole vector. Deep inside there's a loop going on, but you don't worry about that. 390 00:37:41.099 --> 00:37:46.528 You just work, so it's really compact code, which is nice. 391 00:37:46.528 --> 00:37:50.159 And compact code, it's. 392 00:37:51.329 --> 00:37:55.139 It's readable, I think, and it runs fast. 393 00:37:55.139 --> 00:38:02.278 That's the, and there's a number of these things built into Thrust besides the transform iterator, so. 394 00:38:04.139 --> 00:38:11.909 Okay, um, and here it is counting the storage and the bandwidth. You can do that sort of thing, but. 395 00:38:13.380 --> 00:38:18.000 Yeah, okay. So. 396 00:38:18.000 --> 00:38:24.690 Um, here they're using another constructed example. They want to find the length of a vector, so.
397 00:38:24.690 --> 00:38:32.789 Square root of the sum of the squares. And 1st they define a new class called square, overloading the parentheses operator. 398 00:38:32.789 --> 00:38:38.250 And all it does is it takes an element and squares it. Now. 399 00:38:39.690 --> 00:38:43.230 There's other ways you could do that, but this is this 1 right here. 400 00:38:44.280 --> 00:38:51.900 And the really simple way, this is constructed, okay, is that they take a vector x and they. 401 00:38:51.900 --> 00:38:55.889 Construct a new vector, call it x squared. 402 00:38:57.090 --> 00:39:08.369 Okay, you might imagine it updating in place or something, but, you know, it's an example where they square all the elements, put that in a separate vector, and then they call. 403 00:39:08.369 --> 00:39:17.190 Reduce, and it sums the vector, and reduce returns the sum. So we can now call square root on that. 404 00:39:17.190 --> 00:39:21.449 But this is slow because you've got the temporary in there. 405 00:39:23.130 --> 00:39:26.699 And the fast way is that. 406 00:39:26.699 --> 00:39:32.099 We have the vector x. We make a transform iterator. 407 00:39:32.099 --> 00:39:38.460 So the thing I highlighted is a virtual vector, which is the square of each element of x. 408 00:39:38.460 --> 00:39:41.519 Because when we dereference. 409 00:39:41.519 --> 00:39:47.880 These iterators here, it will point to elements of x and square them. 410 00:39:47.880 --> 00:39:57.210 Because of the square functor up here. So this will, um, using these virtual vectors, using these transform iterators. 411 00:39:57.210 --> 00:40:01.079 Initial vector x, the virtual vector. 412 00:40:01.079 --> 00:40:06.030 Of x being squared. We can take this virtual vector. 413 00:40:06.030 --> 00:40:09.329 And we can throw it into reduce, let's say. 414 00:40:09.329 --> 00:40:14.219 Reduce doesn't know, reduce doesn't care what it's operating on.
415 00:40:14.219 --> 00:40:23.010 These iterators. All reduce cares about is that it gets beginning and ending iterators that it can, that it can dereference to get elements. 416 00:40:23.010 --> 00:40:32.969 And that it can add 1, it can add. So it starts at the beginning, adds 1, 2, 4, 5 to it, and it iterates its way up the vector until it gets to the end. So the only properties. 417 00:40:32.969 --> 00:40:37.139 That reduce requires for its arguments is that it can dereference them. 418 00:40:37.139 --> 00:40:46.710 And bump them up, forward iterate them. It doesn't even require that you backward iterate. Oh, it requires that the thing be randomly accessible also. 419 00:40:46.710 --> 00:40:52.050 It requires that you can add 5 to the iterator and go 5 elements off, that sort of thing. 420 00:40:52.050 --> 00:40:58.289 So, 1st that. Okay, so again, using these, um, transform iterators. 421 00:40:58.289 --> 00:41:02.820 I'm just showing how we would. 422 00:41:02.820 --> 00:41:06.960 Do a transform. Now, if I go back a page. 423 00:41:06.960 --> 00:41:12.210 So what we had here is we had a reduce applied to a transform. 424 00:41:12.210 --> 00:41:16.260 Well, that's such a common operation, Thrust gives you. 425 00:41:16.260 --> 00:41:20.699 A function that does that, and it's transform_reduce. 426 00:41:20.699 --> 00:41:25.679 So transform_reduce, it takes the vector begin and end. 427 00:41:25.679 --> 00:41:29.460 It takes a function to transform the elements of the vector. 428 00:41:29.460 --> 00:41:33.539 And so it transforms and reduces in 1 function call. So. 429 00:41:33.539 --> 00:41:38.219 It isn't actually necessary to have transform_reduce, but. 430 00:41:38.219 --> 00:41:45.389 It makes your code a little smaller. It won't be any faster. And then it returns the, the sum. 431 00:41:45.389 --> 00:41:49.079 It returns that from the function, then we call square root on it. 432 00:41:49.079 --> 00:41:52.469 But it shows a concept in functional programming.
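The vector-length example can be sketched on the host with the C++17 standard library, again as a stand-in rather than the slide's Thrust code. Square, norm_with_temporary, and norm_fused are names assumed for this sketch. One detail worth knowing: std::transform_reduce takes its arguments in the order (first, last, init, reduce op, transform op), whereas thrust::transform_reduce takes (first, last, transform op, init, reduce op).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>
#include <numeric>
#include <vector>

// Functor that squares one element, like the "square" class in the lecture.
struct Square {
    float operator()(float v) const { return v * v; }
};

// Slow way: the squares are written to a temporary vector, then summed.
float norm_with_temporary(const std::vector<float>& x) {
    std::vector<float> squared(x.size());
    std::transform(x.begin(), x.end(), squared.begin(), Square{});
    return std::sqrt(std::accumulate(squared.begin(), squared.end(), 0.0f));
}

// Fused way: each element is squared and added in one pass, no temporary.
float norm_fused(const std::vector<float>& x) {
    return std::sqrt(std::transform_reduce(x.begin(), x.end(), 0.0f,
                                           std::plus<float>(), Square{}));
}
```

Both functions return 5 for the vector (3, 4). The fused version reads x once and writes nothing to memory except the final scalar.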
433 00:41:52.469 --> 00:41:56.159 Where you can work on. 434 00:41:56.159 --> 00:42:00.780 Well, you've got functions of functions. This is a transform and a reduce, so you can. 435 00:42:00.780 --> 00:42:04.289 Operate on functions the way you used to operate on numbers. 436 00:42:04.289 --> 00:42:08.429 You can combine them and so on. That's the point here. 437 00:42:08.429 --> 00:42:16.170 And it's talking about, these are old processors here, but you are getting a speedup. 438 00:42:16.170 --> 00:42:23.940 Because you don't have the, the temporary. 439 00:42:23.940 --> 00:42:27.000 Okay, so that was. 440 00:42:27.000 --> 00:42:33.090 Ok, so, so far we've seen a couple of new ideas. We've seen the idea that. 441 00:42:33.090 --> 00:42:37.349 You can have these virtual vectors where. 442 00:42:37.349 --> 00:42:42.719 The iterator, the pointer, calls a function when you dereference the pointer. 443 00:42:42.719 --> 00:42:46.739 And so you can use, and the 2nd thing we. 444 00:42:46.739 --> 00:42:54.090 We saw was that you can combine functions and so on to reduce temporary storage. 445 00:42:54.090 --> 00:43:00.929 And both of those are big ideas. Now, here's another big idea. And. 446 00:43:00.929 --> 00:43:07.800 It's a way of thinking. It's, you have to think sort of unnaturally to be efficient in parallel things. It's. 447 00:43:07.800 --> 00:43:14.699 It improves memory efficiency. And the concept of coalescing, just to remind you again, is that. 448 00:43:14.699 --> 00:43:21.929 Well, it's specific to NVIDIA, but the ideas will apply to other hardware types. It's just an efficient way to do hardware. 449 00:43:21.929 --> 00:43:27.659 Is that you read a word of memory, say a word is 4 bytes, doesn't have to be. 450 00:43:27.659 --> 00:43:36.420 You read a word of memory from the, the big global memory. Just to remind you, the fast chip on parallel, uh, has 48 gigabytes. 451 00:43:36.420 --> 00:43:42.510 You read a word from that, it actually reads 32 words.
452 00:43:42.510 --> 00:43:51.210 And if all you want is 1 word, it reads the 32 words, gives you the 1 word you want, and ignores the other 31. 453 00:43:51.210 --> 00:43:57.750 And that's not efficient. So if you have a warp, a set of 32 threads. 454 00:43:57.750 --> 00:44:01.769 If the 32 threads, and remember they're running synchronously. 455 00:44:01.769 --> 00:44:07.769 SIMD, they're locked together, they're all executing the same instruction just on different data. 456 00:44:07.769 --> 00:44:13.380 If the 32 threads are reading consecutive words of memory, like. 457 00:44:13.380 --> 00:44:23.219 Address a, a+4, a+8, then the 32 threads in the warp together will all be reading that same 128-byte piece of the global memory. 458 00:44:23.219 --> 00:44:27.570 So that, so there's nothing wasted. 459 00:44:27.570 --> 00:44:31.110 So 1 thread wants. 460 00:44:31.110 --> 00:44:34.769 1 word, and that reads 32 words, 128 bytes. 461 00:44:34.769 --> 00:44:41.429 But its adjacent threads will use the rest of that piece. So that's the coalescing idea. 462 00:44:41.429 --> 00:44:47.010 And that's why you want adjacent threads to read adjacent words from memory. 463 00:44:47.010 --> 00:44:53.039 Now, here's the problem. Suppose your data structure is like a 3 dimensional point. 464 00:44:53.039 --> 00:44:57.659 Like this, I don't know, I can't just highlight it. 465 00:44:59.760 --> 00:45:04.739 And it's, it's called a float3. It's got components x, y, and z, member. 466 00:45:04.739 --> 00:45:13.710 Elements, and you've got a lot of them. You know, we're doing graphics, let's say. We've got 3D points and we're rotating the object or something. 467 00:45:13.710 --> 00:45:21.900 So you've got this structure here, float3, and you'd have an array of it. 1 float3 is 1 3-dimensional point. You have an array of them. 468 00:45:21.900 --> 00:45:30.659 And now imagine you're doing this on the GPU. Each thread is operating on 1 point. Maybe it's rotating it.
469 00:45:30.659 --> 00:45:38.369 But now, here's the problem: 1 float3 is 12 bytes long. So adjacent elements. 470 00:45:38.369 --> 00:45:41.579 Are stepped by 12 in the global memory, not by. 471 00:45:41.579 --> 00:45:51.570 4, and it doesn't coalesce, right? So this is a natural way of thinking, but it's not efficient. 472 00:45:51.570 --> 00:45:55.739 For accessing the global memory. This is called a structure. 473 00:45:55.739 --> 00:46:00.989 An array of structures. We have a structure, and we have an array of that, um, like. 474 00:46:00.989 --> 00:46:08.639 It's using C syntax here, and you'd say things like, you know, points[i].x equals 1. 475 00:46:08.639 --> 00:46:11.820 The efficient way to do this is to invert. 476 00:46:11.820 --> 00:46:15.719 And instead of having an array of structures, have a structure of arrays. 477 00:46:15.719 --> 00:46:19.739 So you have 3 separate arrays, x, y, and z. 478 00:46:19.739 --> 00:46:24.869 And you have a structure of those 3 arrays. Structure of arrays is the way. 479 00:46:24.869 --> 00:46:28.349 And this is using the C notation, which. 480 00:46:28.349 --> 00:46:31.860 It's sort of counterintuitive, but. 481 00:46:32.940 --> 00:46:37.860 But you'd have that. Now, the thing there is, now you have your threads. 482 00:46:37.860 --> 00:46:41.880 They're using x, y, and z, but consecutive threads are stepping up in x. 483 00:46:41.880 --> 00:46:46.409 The consecutive x's are stepped up by 4 bytes and so on. So. 484 00:46:46.409 --> 00:46:53.940 Actually, each thread will be reading from 3 places in the global memory: 1 for the x's, 1 for the y's, 1 for the z's. But. 485 00:46:53.940 --> 00:47:01.409 This will, this will coalesce. Now, the trouble is this concept violates good programming style. 486 00:47:01.409 --> 00:47:06.360 You might say, because you have the conceptual idea of a point, and it splits it apart. 487 00:47:06.360 --> 00:47:11.159 So this efficient idea is sort of fighting.
488 00:47:11.159 --> 00:47:16.469 With the concept of good programming, levels of abstraction, which is a problem. 489 00:47:16.469 --> 00:47:19.829 So people add tools and so on to make it easy for you. 490 00:47:20.940 --> 00:47:25.949 In any case, so the natural way to think about sequential code is the array of structures. 491 00:47:25.949 --> 00:47:28.980 The efficient way will be a structure of arrays. 492 00:47:28.980 --> 00:47:34.289 And there's going to be tools inside Thrust to make this efficient. 493 00:47:34.289 --> 00:47:40.199 But the concept is, things coalesce better. 494 00:47:40.199 --> 00:47:44.400 And now they're just writing down what I just told you here. 495 00:47:44.400 --> 00:47:47.489 You'd like to have consecutive. 496 00:47:47.489 --> 00:47:52.320 Threads access contiguous elements in memory. 497 00:47:52.320 --> 00:47:57.510 Structure of arrays. 498 00:47:58.679 --> 00:48:10.074 Um, doesn't, it doesn't work right. Now, the newer hardware, they try to do things to help. So on newer hardware. 499 00:48:10.074 --> 00:48:15.324 This may not be quite so awful, but still, you want to work with the structure of arrays. 500 00:48:15.900 --> 00:48:20.460 Okay, now. 501 00:48:22.139 --> 00:48:26.130 This is something which is really crazy. 502 00:48:26.130 --> 00:48:31.500 It took me a while to understand it. Um. 503 00:48:33.599 --> 00:48:37.320 If you had an array of structures. 504 00:48:37.320 --> 00:48:42.869 You know, a structure point, you have a pointer to them and you add 1 to the. 505 00:48:42.869 --> 00:48:47.610 To your pointer, you're pointing to the next 3D point, the next 3D point, the next 3D point. 506 00:48:47.610 --> 00:48:54.750 And you take this pointer to a 3D point, then you dereference that and you could pull out x, y, and z, the components. 507 00:48:54.750 --> 00:49:00.690 So that's the way which is good programming style, but not fast. 508 00:49:00.690 --> 00:49:05.789 So what Thrust has, and also has.
509 00:49:05.789 --> 00:49:10.050 Some other things, is a way to make, to. 510 00:49:10.050 --> 00:49:15.570 Think of things sort of like that, but compile efficiently. Um. 511 00:49:17.369 --> 00:49:21.000 It's something called a zip iterator. 512 00:49:21.000 --> 00:49:28.079 What a zip iterator is, it creates a pointer to a virtual 3D point x, y, z. 513 00:49:28.079 --> 00:49:42.235 So, as I said, the way you'd like to have it, you've got a 3D point, it's an element, and you have a pointer to a point and you can get x, y, and z. But the trouble is, the fast way, you've got separate arrays: an array for x. 514 00:49:42.264 --> 00:49:44.784 An array for y, an array for z, and. 515 00:49:47.369 --> 00:49:54.150 What zip does is it takes pointers for the separate arrays for x, y, and z. 516 00:49:54.150 --> 00:49:59.909 And makes a virtual pointer to a virtual point. 517 00:49:59.909 --> 00:50:03.090 Um. 518 00:50:03.090 --> 00:50:09.780 And returns it as a tuple, effectively. So it's called, it's a zip iterator. It zips up iterators. 519 00:50:09.780 --> 00:50:14.159 To the separate x, y, and z and returns an iterator to a virtual. 520 00:50:14.159 --> 00:50:21.420 Tuple of the 3 of them. So I have to show this by examples, but 1st, what is a tuple? 521 00:50:21.420 --> 00:50:26.369 This is standard. 522 00:50:26.369 --> 00:50:30.780 It's a, it's a, it's a. 523 00:50:30.780 --> 00:50:38.309 Sure, it's a list of a couple of different variables that can have different types. The same type or different types. 524 00:50:38.309 --> 00:50:42.000 And so a tuple of int, int, int is. 525 00:50:42.000 --> 00:50:45.210 A little group of 3 ints. 526 00:50:47.219 --> 00:50:50.909 And it's treated as 1, and you can access the elements. 527 00:50:50.909 --> 00:50:55.590 How is it different from an array or vector? 528 00:50:57.150 --> 00:51:02.969 Um, the number of elements is fixed at compile time, is 1.
529 00:51:03.989 --> 00:51:09.869 The elements can have different types. They don't all have to be the same type as they would have to be for an array or vector. 530 00:51:09.869 --> 00:51:14.940 There could be an int and a float and a char, let's say, and that's perfectly okay. 531 00:51:14.940 --> 00:51:19.980 The elements are different types, and the number of elements is fixed at compile time. 532 00:51:19.980 --> 00:51:25.079 And if you change, and this is, again, this is a template. 533 00:51:25.079 --> 00:51:29.760 So if the types of the elements are different, that's a different class. 534 00:51:29.760 --> 00:51:34.019 And it compiles to make fast code. 535 00:51:34.019 --> 00:51:38.579 And again, you may have noticed the theme I have is I like stuff that makes sense. 536 00:51:38.579 --> 00:51:45.239 Okay, so this is another new idea here, the zip iterator, which is that if you have 3 separate vectors. 537 00:51:45.239 --> 00:51:55.889 A gray 1, a blue 1 and a purple 1, it will create an iterator that when you dereference it will return 1 gray, 1 blue and 1 purple. 538 00:51:55.889 --> 00:52:03.809 You add 1 to it and dereference, it'll return the next gray, blue and purple. You add another 1 and dereference, it'll return the 3rd. 539 00:52:03.809 --> 00:52:08.010 Gray, blue and purple. So it takes. 540 00:52:08.010 --> 00:52:12.960 3 separate vectors and creates a virtual. 541 00:52:12.960 --> 00:52:17.849 Array of structures. It creates a virtual structure, which has. 542 00:52:17.849 --> 00:52:24.179 These grouped up: the 0th element of each 1, the 1st element of each 1, the 2nd element of each 1, and so on. 543 00:52:24.179 --> 00:52:29.940 So you can pretend that you have an array of structures, but it's actually physically underneath a structure of arrays. 544 00:52:29.940 --> 00:52:33.510 Everyone's totally confused, maybe, but. 545 00:52:33.510 --> 00:52:39.480 Um, there are other ways, you know, they talk about it in the documentation, so.
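The tuple and gray/blue/purple ideas can be sketched with std::tuple from the C++ standard library (Thrust ships a tuple with the same style of interface). Columns, zipped_at, and first_row are hypothetical names for this sketch, and the element types int, float, and char are an assumption chosen to show that a tuple can mix types.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Three separate arrays: physically a structure of arrays.
struct Columns {
    std::vector<int>   gray;
    std::vector<float> blue;
    std::vector<char>  purple;
};

// "Dereferencing position i" of the virtual array of structures.
// The result is a tuple of references, so nothing is copied or stored,
// and writing through it updates the underlying arrays.
std::tuple<int&, float&, char&> zipped_at(Columns& c, std::size_t i) {
    return std::tie(c.gray[i], c.blue[i], c.purple[i]);
}

// A function that returns several values at once by returning a tuple.
std::tuple<int, float, char> first_row(Columns& c) {
    return zipped_at(c, 0);  // copies the three referenced values out
}

// Element access needs a compile-time constant index: std::get<1>(t) is
// fine, but a run-time index cannot be used, because each position may
// have a different type.
float blue_of(const std::tuple<int, float, char>& t) {
    return std::get<1>(t);
}
```

With gray = {1, 2, 3}, blue = {0.5, 1.5, 2.5}, and purple = {'a', 'b', 'c'}, zipped_at(c, 1) refers to (2, 1.5, 'b'), and assigning through std::get<1> of that result changes c.blue[1] in place.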
546 00:52:39.480 --> 00:52:44.400 Okay, so next big idea. So there's several big ideas in today's slides. 547 00:52:44.400 --> 00:52:48.570 Okay, here's an example. 548 00:52:48.570 --> 00:52:53.639 Of rotating points by something. 549 00:52:53.639 --> 00:53:01.170 A 3 by 3 matrix here, and what we have here is. 550 00:53:02.280 --> 00:53:07.349 Again, using this functor notation, we have a class rotate_float3. 551 00:53:07.349 --> 00:53:11.699 And it will take a point, a float3, which is just. 552 00:53:11.699 --> 00:53:18.090 Floats, and it will rotate it. The rotation matrix is fixed. 553 00:53:18.090 --> 00:53:22.349 Um, and it returns. 554 00:53:22.349 --> 00:53:27.960 A new float3. And a float3, it doesn't specify what it is here. 555 00:53:27.960 --> 00:53:30.960 Okay, um, it's actually not mentioned. 556 00:53:30.960 --> 00:53:35.880 And it doesn't matter. It's something that's defined earlier in what we compiled with, or you define it. 557 00:53:35.880 --> 00:53:43.110 And so, and then there's a function make_float3, which had to be defined somewhere, which takes 3 floats. 558 00:53:43.110 --> 00:53:49.199 And returns a float3, whatever float3 is. And again, it doesn't say here what float3 is. That would be defined earlier. 559 00:53:49.199 --> 00:53:54.119 Okay, so nothing. 560 00:53:55.679 --> 00:54:01.170 Interesting here. 561 00:54:03.630 --> 00:54:08.820 It's sort of weird, actually, the way this is done, but okay. 562 00:54:10.199 --> 00:54:13.349 Transform, and we've seen that before. 563 00:54:13.349 --> 00:54:18.809 The transform takes the vector and transforms it in place and calls rotate_float3. 564 00:54:18.809 --> 00:54:28.349 On each element. And again, rotate_float3() constructs a variable of class rotate_float3. 565 00:54:28.349 --> 00:54:32.159 The constructor is the default constructor, takes no arguments.
566 00:54:32.159 --> 00:54:39.599 Returns a variable of that class, and that variable gets called as a function and does this stuff here. 567 00:54:39.599 --> 00:54:45.539 Okay, now, nice and simple code, but. 568 00:54:45.539 --> 00:54:49.230 How is float3 implemented? 569 00:54:50.519 --> 00:54:54.360 Okay, um. 570 00:54:55.380 --> 00:55:00.510 Here is 1 way we could do it. 571 00:55:01.800 --> 00:55:10.800 Here's our rotation function, and it's a class again. 572 00:55:10.800 --> 00:55:13.860 And what it does. 573 00:55:13.860 --> 00:55:18.150 It rotates a tuple of 3 floats. 574 00:55:18.150 --> 00:55:21.690 So the 3D point is a tuple of 3 floats. So just. 575 00:55:21.690 --> 00:55:28.380 3 floats in memory. The thing with tuples is there's no storage overhead. Like, with a vector you've got overhead. Okay. 576 00:55:28.380 --> 00:55:31.920 And a vector is stored on the heap. That's problem 1. 577 00:55:31.920 --> 00:55:36.389 And there is a header in front of the vector that says how big it is and so on. 578 00:55:36.389 --> 00:55:39.929 Tuples are stored wherever. 579 00:55:39.929 --> 00:55:43.440 A scalar data type would be stored. 580 00:55:43.440 --> 00:55:48.000 And I'm not the same. 581 00:55:48.000 --> 00:55:54.119 They, uh, could be on your local stack, and the size of the floats. 582 00:56:11.760 --> 00:56:15.059 Here with the tuples, the. 583 00:56:15.059 --> 00:56:19.349 A tuple with 2 floats is a different class than a tuple with 3 floats. 584 00:56:19.349 --> 00:56:26.969 Or a tuple with 4 floats or 1 float. So you cannot have a generic function which handles tuples of any size. 585 00:56:26.969 --> 00:56:32.099 Which is a problem, but with tuples of a known fixed size, like a 3D point. 586 00:56:32.099 --> 00:56:36.809 There's no overhead and it's fast. There's your trade off. Okay. 587 00:56:36.809 --> 00:56:40.230 Okay, so, um. 588 00:56:41.639 --> 00:56:46.889 This will rotate, so it's a class rotate_tuple.
589 00:56:46.889 --> 00:56:52.170 Um, it takes 1 tuple as an argument, and it returns another tuple here. 590 00:56:54.360 --> 00:57:02.550 And this shows you, by the way, suppose you've got a function and you want to return 2 things. Some of the. 591 00:57:02.550 --> 00:57:08.400 Um, not just a scalar but a couple of things. So you could put them, you could stuff them in a vector, maybe. 592 00:57:08.400 --> 00:57:14.010 But here is a lower overhead way. You return a tuple from the function, and again. 593 00:57:14.010 --> 00:57:17.670 It's less overhead than a vector, so. 594 00:57:17.670 --> 00:57:21.929 And the tuple, the elements need not all be the same type. Here they are. Okay. 595 00:57:21.929 --> 00:57:26.730 This is an operator. It takes a tuple, it returns a tuple, a tuple of 3 floats. 596 00:57:28.110 --> 00:57:33.750 And the thing with a tuple is, getting elements of the tuple is a little clumsy. 597 00:57:33.750 --> 00:57:39.780 Because we don't know what, they could be different classes, so we can't just subscript them. So. 598 00:57:40.860 --> 00:57:48.059 So, get. Here the 0 and the 1 and the 2 are constants. 599 00:57:48.059 --> 00:57:51.809 You cannot say get of some i. 600 00:57:51.809 --> 00:57:58.710 And the reason is that the different elements of the tuple might sometimes have different types. Not the case here. 601 00:57:58.710 --> 00:58:04.019 So getting elements of the tuple is clumsy. That is a downside. 602 00:58:05.280 --> 00:58:12.329 A reason for macros, actually. We get the elements of the tuple, we apply this, and then we make a new tuple. 603 00:58:12.329 --> 00:58:17.610 By the make_tuple function, and we return it. 604 00:58:17.610 --> 00:58:27.510 Now, by the way, so this function, it's returning something messy, something fairly big. Um. 605 00:58:29.579 --> 00:58:33.269 You might think that there is a temporary involved.
606 00:58:33.269 --> 00:58:38.489 Returning this thing, this tuple, from the function, and then a copy into somewhere else. 607 00:58:38.489 --> 00:58:41.550 All your good C++ compilers. 608 00:58:41.550 --> 00:58:51.150 Will actually, when you call this function, though, and you say something equals, will look at where that returned thing is getting assigned into. 609 00:58:51.150 --> 00:58:55.380 They'll look across the equal sign and look across the assignment. 610 00:58:55.380 --> 00:58:59.429 And they'll bring, and they will optimize that. 611 00:58:59.429 --> 00:59:05.519 So that when this function is constructing its. 612 00:59:05.519 --> 00:59:14.099 Thing, its variable that it's going to return, the compiler will make it construct it straight into the target on the outside of the equal sign. 613 00:59:15.239 --> 00:59:20.010 It's really nice and it reduces temporaries. 614 00:59:20.010 --> 00:59:24.449 Even for sequential stuff, um. 615 00:59:25.889 --> 00:59:35.940 So I might show an example, write down some example. Okay, so this rotates a tuple. 616 00:59:35.940 --> 00:59:41.519 Now, the problem with this rotate thing is it requires its argument be a tuple. 617 00:59:41.519 --> 00:59:50.130 Of 3 floats, but the problem is that, we go down here, our data is x, y, and z, which are separate vectors. 618 00:59:51.929 --> 00:59:56.309 So what do we do here? make_tuple. Um. 619 00:59:58.170 --> 01:00:01.949 So make_tuple takes things and packs them into a tuple. 620 01:00:01.949 --> 01:00:07.110 And its 3 things here are. 621 01:00:07.110 --> 01:00:10.500 All the beginning iterators to x, y, and z. 622 01:00:12.539 --> 01:00:16.019 And it creates a tuple of 3 iterators. 623 01:00:16.019 --> 01:00:23.639 By the way, the class of the, of the return from that function is something horrible. 624 01:00:25.019 --> 01:00:29.369 And you don't care, you're just calling make_tuple, you're not actually.
625 01:00:29.369 --> 01:00:33.599 You know, constructing variables of this, of the class, whatever. 626 01:00:33.599 --> 01:00:38.280 If you did, that's why you use auto. Okay. So. 627 01:00:38.280 --> 01:00:42.929 We create a tuple of the 3 iterators. 628 01:00:45.090 --> 01:00:53.159 And then make_zip_iterator takes this tuple of iterators and adds a little framework on to it. 629 01:00:54.389 --> 01:00:57.929 So now we've got a fancy iterator. 630 01:00:59.159 --> 01:01:02.610 And if you dereference it. 631 01:01:03.630 --> 01:01:09.300 You get a tuple. You get an iterator to a tuple of the. 632 01:01:09.300 --> 01:01:15.989 Of the x0, y0 and z0. So it plays some games with pointers and fancy stuff. 633 01:01:15.989 --> 01:01:20.340 So make_zip_iterator returns a pointer. 634 01:01:20.340 --> 01:01:23.969 To a virtual array of x, y, and z. 635 01:01:23.969 --> 01:01:32.010 It's not stored explicitly, it's a virtual tuple, but you can go and grab elements of it and it will walk through. 636 01:01:32.010 --> 01:01:35.039 And give you elements of x, y, and z. 637 01:01:36.989 --> 01:01:40.170 It converted a structure of arrays into an array of structures. 638 01:01:40.170 --> 01:01:46.230 And the next thing with this make_zip_iterator, this fancy iterator, is how to actually use it. 639 01:01:46.230 --> 01:01:59.519 If you add 1 to it, you've now got an iterator to a tuple, a pointer to the tuple of the next elements of x, y, and z. You add 2 to it, you've got a pointer to the next elements of x, y, and z. 640 01:02:01.619 --> 01:02:05.099 So make_zip_iterator of make_tuple, that. 641 01:02:05.099 --> 01:02:09.360 It created a virtual array of structures for you. 642 01:02:09.360 --> 01:02:13.800 So you can use it as if it's an array of structures. You. 643 01:02:13.800 --> 01:02:19.019 Dereference it, you get 1 structure of x, y, and z, or tuple in this case, not a structure. You.
644 01:02:19.019 --> 01:02:22.559 you increment it and dereference it, you get the next 1, and so on. 645 01:02:25.019 --> 01:02:29.550 And so what we have here is a virtual 646 01:02:29.550 --> 01:02:35.670 tuple of our point, of an X, Y, and Z, and we have an array of them. 647 01:02:35.670 --> 01:02:40.380 And then we do that on end; we've got the ending iterator. 648 01:02:41.820 --> 01:02:46.650 And now we call transform, and now transform with this set of arguments, 649 01:02:46.650 --> 01:02:50.760 those are transforming in place; it's got 3 arguments: 650 01:02:50.760 --> 01:02:54.690 2 iterators and a function. This will call transform in place. 651 01:02:54.690 --> 01:02:59.400 And again, transform, it's not transforming a simple vector, 652 01:02:59.400 --> 01:03:06.239 and transform doesn't care. This is the nice thing about properly designed functional programming. 653 01:03:06.239 --> 01:03:12.780 You see, everything is designed carefully. You can fit the pieces together. It's beautiful. 654 01:03:12.780 --> 01:03:16.110 And that's 1 reason I like it. So, 655 01:03:16.110 --> 01:03:22.139 you know, transform, we saw transforming on vectors; now it's transforming on these. So, 656 01:03:22.139 --> 01:03:29.159 there was another key thing also: the zip iterator, you can dereference it and write to it. 657 01:03:29.159 --> 01:03:32.190 So, it's not just read-only, it's read-write. 658 01:03:32.190 --> 01:03:40.440 And, um, so this version of transform requires that you be able to write 659 01:03:40.440 --> 01:03:45.599 to the dereferenced thing, um. 660 01:03:47.190 --> 01:03:53.159 But given that, as I say, the pieces fit together and this is a nice compact way. 661 01:03:54.329 --> 01:03:58.260 The code is compact and it runs fast. So again, 662 01:03:58.260 --> 01:04:01.619 we did the points, 3-D points. 663 01:04:01.619 --> 01:04:06.840 At the low level, the X, Y, and Z are stored separately in 3 separate arrays.
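The pattern described here (separate X, Y, and Z arrays presented to transform as one array of tuples) can be sketched roughly as follows. This is an illustration, not the code from the slide: the functor `rotate_z90`, the alias `Float3`, and the wrapper `rotate_points` are hypothetical names, and the rotation itself is an arbitrary choice. Note that Thrust's unary `thrust::transform` takes an output iterator as its third argument, so the in-place form passes the beginning zip iterator again as the output.

```cuda
#include <cassert>
#include <thrust/device_vector.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/transform.h>
#include <thrust/tuple.h>

// A 3-D point viewed as a tuple of three floats (hypothetical alias).
using Float3 = thrust::tuple<float, float, float>;

// Hypothetical functor: rotate a point 90 degrees about the z axis,
// i.e. (x, y, z) -> (-y, x, z).
struct rotate_z90 {
  __host__ __device__ Float3 operator()(const Float3& p) const {
    const float x = thrust::get<0>(p);
    const float y = thrust::get<1>(p);
    const float z = thrust::get<2>(p);
    return thrust::make_tuple(-y, x, z);
  }
};

// Rotate every point in place. The coordinates stay in three separate
// arrays; the zip iterator only presents them as one sequence of tuples.
void rotate_points(thrust::device_vector<float>& x,
                   thrust::device_vector<float>& y,
                   thrust::device_vector<float>& z) {
  assert(x.size() == y.size() && y.size() == z.size());
  auto first = thrust::make_zip_iterator(
      thrust::make_tuple(x.begin(), y.begin(), z.begin()));
  auto last = thrust::make_zip_iterator(
      thrust::make_tuple(x.end(), y.end(), z.end()));
  // Writing through the zip iterator stores back into x, y, and z.
  thrust::transform(first, last, first, rotate_z90());
}
```

Calling `rotate_points(x, y, z)` on three equally sized device vectors overwrites them with the rotated coordinates; `auto` hides the iterator type, which is the "something horrible" mentioned above.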
664 01:04:06.840 --> 01:04:10.829 But the abstraction layer on top of that 665 01:04:10.829 --> 01:04:13.829 is, in fact, a vector of 666 01:04:13.829 --> 01:04:22.679 tuples, of triples of floats, and that abstraction layer is created with make_zip_iterator. 667 01:04:22.679 --> 01:04:26.400 And again, this make_zip_iterator, you can, you know, it's 668 01:04:26.400 --> 01:04:34.800 a random access iterator; you can add 1, you know, you can add 5 to it to get the 5th element, and so on, counting from 0. 669 01:04:34.800 --> 01:04:44.849 All that sort of stuff. So now, in my programming, what I actually do, since I'm always calling make_zip_iterator, make_tuple, 670 01:04:44.849 --> 01:04:49.079 is I write a new function, which is those 2 combined. So, 671 01:04:49.079 --> 01:04:54.300 actually, I combine various things. So that's the big 672 01:04:54.300 --> 01:05:02.340 thing on this slide. A lot of stuff on this particular slide here, working with the tuples, and we're working with the, 673 01:05:02.340 --> 01:05:08.880 we can make a tuple, and then the big new thing is the zip iterator, a very powerful idea. 674 01:05:08.880 --> 01:05:14.280 And it returns an iterator that you can throw into transform, you could throw into reduce, whatever. 675 01:05:14.280 --> 01:05:22.500 That's the big idea here, and I'll hit it again on Thursday and I'll show you examples. 676 01:05:22.500 --> 01:05:26.010 Okay. 677 01:05:27.210 --> 01:05:30.389 And again, I remind you, you know, you're 678 01:05:30.389 --> 01:05:34.170 paying a lot of money to come to RPI, and 679 01:05:34.170 --> 01:05:38.699 so I'm not just trying to show you details of 1 specific 680 01:05:38.699 --> 01:05:41.969 programming tool that has competitors. 681 01:05:41.969 --> 01:05:49.230 The reason, again, I'm spending time is showing you general ideas here. These are very powerful, useful ideas
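A combined helper of the kind mentioned above, together with a zip iterator feeding a reduction, might look like the sketch below. The names `make_zip`, `squared_length`, and `sum_squared_lengths` are hypothetical and are not the lecturer's own code; the helper assumes a compiler with C++14 return type deduction.

```cuda
#include <cassert>
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/transform_reduce.h>
#include <thrust/tuple.h>

// Hypothetical helper: make_zip_iterator and make_tuple in one call.
template <typename... Iterators>
auto make_zip(Iterators... its) {
  return thrust::make_zip_iterator(thrust::make_tuple(its...));
}

// Hypothetical functor: squared length of one (x, y, z) point.
struct squared_length {
  __host__ __device__ float operator()(
      const thrust::tuple<float, float, float>& p) const {
    const float x = thrust::get<0>(p);
    const float y = thrust::get<1>(p);
    const float z = thrust::get<2>(p);
    return x * x + y * y + z * z;
  }
};

// Sum of squared lengths over all points. The same zipped range that went
// into transform can go into a reduction; here each tuple is mapped to a
// float and the floats are added up.
float sum_squared_lengths(const thrust::device_vector<float>& x,
                          const thrust::device_vector<float>& y,
                          const thrust::device_vector<float>& z) {
  assert(x.size() == y.size() && y.size() == z.size());
  return thrust::transform_reduce(make_zip(x.begin(), y.begin(), z.begin()),
                                  make_zip(x.end(), y.end(), z.end()),
                                  squared_length(), 0.0f,
                                  thrust::plus<float>());
}
```

The helper only saves typing; the types involved are the same ones `make_zip_iterator(make_tuple(...))` would produce.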
682 01:05:49.230 --> 01:05:57.960 that you'll see in other examples, and it's a way of thinking for parallel programming, which will lead to compact code that runs fast. 683 01:05:57.960 --> 01:06:02.280 So, that's why I'm spending the time on it. I'm using Thrust 684 01:06:02.280 --> 01:06:06.179 just as an example to show you these powerful ideas. 685 01:06:06.179 --> 01:06:13.769 And they're just saying here, they're getting, well, a 1.3 speedup is not interesting. 686 01:06:13.769 --> 01:06:17.969 A 2.5 speedup's not interesting either, really. 687 01:06:20.519 --> 01:06:27.989 Okay, okay, so we saw this, um. 688 01:06:29.429 --> 01:06:36.449 Those things. Now, this slide is showing you what I told you; we had a start here. I talked through a little program example. 689 01:06:36.449 --> 01:06:40.949 They call them implicit sequences. I call them virtual 690 01:06:40.949 --> 01:06:46.110 arrays and so on. So they're iterators into 691 01:06:46.110 --> 01:06:54.809 arrays that are not stored in memory; it calls a function that creates the elements as you read them, so generally read-only. 692 01:06:54.809 --> 01:07:04.559 I showed you the constant 1, and incrementing ones that count up, and so on, and no storage. Constant iterator, and the counting iterator counts up every time you, 693 01:07:04.559 --> 01:07:14.159 well, you increment it. The thing with the constant iterator: increment it and dereference it, you get the same value. 694 01:07:14.159 --> 01:07:20.820 The counting iterator: you increment it, dereference it, you get the higher number. Now again, 695 01:07:20.820 --> 01:07:26.670 what's the point? It brings these things into the general concept 696 01:07:26.670 --> 01:07:31.860 of a vector that you can apply functional programming and map-reduce to. So, 697 01:07:31.860 --> 01:07:40.440 it's a vector just like any other vector at this level of abstraction. Underneath, it's different, but we've got the abstraction layer.
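As a small illustration of these implicit sequences, the sketch below uses Thrust's `counting_iterator` and `constant_iterator` where a stored vector would otherwise be needed. The wrapper functions `sum_first_n` and `add_offset` are hypothetical names chosen for this example.

```cuda
#include <cassert>
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/iterator/constant_iterator.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/reduce.h>
#include <thrust/transform.h>

// Sum 0 + 1 + ... + (n - 1). The sequence is never stored: each value is
// produced when the counting iterator is dereferenced.
int sum_first_n(int n) {
  assert(n >= 0);
  thrust::counting_iterator<int> first(0);
  return thrust::reduce(first, first + n);
}

// Add the same offset to every element of v. The constant iterator acts as
// a second input "vector" whose elements all equal offset, with no storage.
void add_offset(thrust::device_vector<int>& v, int offset) {
  thrust::constant_iterator<int> same(offset);
  thrust::transform(v.begin(), v.end(), same, v.begin(),
                    thrust::plus<int>());
}
```

Neither `reduce` nor `transform` needs to know that one of its inputs is generated on the fly, which is the abstraction boundary described here.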
698 01:07:40.440 --> 01:07:45.780 And underneath it, there's actual vectors that are being stored and there are these implicit sequences. 699 01:07:45.780 --> 01:07:53.010 And above this boundary, no one cares, and that's the powerful idea. 700 01:07:53.010 --> 01:07:56.369 You know, you do some special cases. 701 01:07:56.369 --> 01:08:00.150 Um. 702 01:08:00.150 --> 01:08:06.869 Here, getting to some big examples. Okay. Time to stop here. Um. 703 01:08:06.869 --> 01:08:11.130 So slide 20, I can't, there's too many things in the slides, so 704 01:08:11.130 --> 01:08:14.519 I'll continue this on Thursday. Um. 705 01:08:14.519 --> 01:08:19.529 But as a refresher, I was showing you some programming paradigms. 706 01:08:19.529 --> 01:08:24.989 Functional programming and map-reduce are 2 new ways to look at programming, which are useful, 707 01:08:24.989 --> 01:08:30.270 useful for here. So we'll continue on with slide 23 on, um, 708 01:08:30.270 --> 01:08:36.420 on Thursday, and again, if you look back on. 709 01:08:39.779 --> 01:08:46.949 So, I wrote down here, it took me, as I said, it took me a while to understand what's going on here, 710 01:08:46.949 --> 01:08:50.100 some of this stuff, and 711 01:08:50.100 --> 01:08:53.699 here is the thing I'm projecting from. 712 01:08:53.699 --> 01:08:58.020 And lots, I haven't copied over all of the NVIDIA demos yet. 713 01:08:58.020 --> 01:09:01.529 It's a Git repository inside the Git repository. 714 01:09:01.529 --> 01:09:06.750 NVIDIA makes most of the source code publicly available, then I'll put it under care of. 715 01:09:06.750 --> 01:09:13.439 Which is nice of them. So okay, so that's enough new stuff for today. 716 01:09:13.439 --> 01:09:18.600 No homework today. Um, if there's any questions, I'll 717 01:09:18.600 --> 01:09:23.399 try to answer them. Other than that, have fun. 718 01:09:23.399 --> 01:09:27.899 Oh, and I actually think
719 01:09:27.899 --> 01:09:32.430 I was maybe recording things today. 720 01:09:32.430 --> 01:09:37.350 So that's the theory. So. 721 01:09:37.350 --> 01:09:42.149 Oh, 1st, just curious. 722 01:09:48.420 --> 01:09:55.229 No one was here, but I'll upload the recording in case anyone's interested, assuming it works.