WWDC2019 Session 409

Transcript

[ Music ] [ Applause ] >> So, over the past year, we've been working really hard on adding some great new features to the compiler. I'm Jessica, and today, me and my colleagues, JF and Devin, are going to share some of those great features with all of you. So, we've got a lot of ground to cover today. We're going to talk about new platform support, some low-level code size optimizations, some language-level code size optimizations, some great new diagnostics, and then we're going to finish up with some great new static analyzer checks that will help you find bugs in your code. So, let's get started with new platform support. Specifically, I'd like to talk about the Series 4 Watch. The Series 4 watch usually a fully 64-bit chip. But yet, all of the App Store apps are 32 bit. The curious thing about this is that on day one, all of your apps worked seamlessly on the Series 4 Watch. This seems kind of like magic. You didn't have to recompile anything. But yet, all of the apps worked. How did this happen? Now, it would be awesome if I could get the front row or any row of the audience to give me a drum roll like this, but if you don't want to it's fine. Da da da da da da da da da, the answer is Bitcode. [applause] And I explained absolutely nothing. So, allow me to explain this to you. Let's take some source code, your favorite language. What we're going to do is we're going to hand that off to the compiler. Now, normally what you would do here is you would continue the standard compilation process, and you would get like a binary or something, but we're not going to do that. What we're going to do instead is we're going to stop the compilation process early, and we're going to produce LLVM Bitcode. So, what's cool about LLVM Bitcode is it encodes an intermediate state in the compiler. And what you can do with this intermediate state is you can actually pick up the compilation process where you left it off. So, that's what we do, except this time we do it in the App Store. By doing this, we could take one Bitcode for one app and produce two different apps from it, one to run on a 32-bit chip, and one to run on the fancy new 64-bit chip as well. Now, there kind of is a problem here. The problem is is that the compiler doesn't actually know that you're going to be executing code on a 64-bit device, and if it knew that, it could actually leverage that information to optimize the app even more. And so to get around that, what we do is we collect Bitcode for the 64-bit chip as well. This allows us to create a really, really fast app just from the Bitcode. Now, this is pretty cool. This is like-- this is one of the things that makes compilers kind of magical. It's pretty awesome, eh? Now, let's move onto Code Size Improvements. This is one of my favorite topics. We've been hard at work having the compiler actually produce as small of code as possible for you. Now, code size is pretty important because bigger code means bigger, slower downloads, and bigger apps take up more space on users' devices, and you notice this kind of thing. So, to support users that actually want to prioritize code size over all other metrics, we added a new optimization level to Xcode. And that optimization level is -Oz. And I'm going to keep saying -Oz because I'm Canadian, so we might as well call it Oh Canada. [ Applause ] What I'd like to show you today is an example of the kind of optimization that appears at -Oz, but before doing that, I'm going to do my best to tell all of you how a compiler works in a couple of minutes flat. So, when you compile some code, the initial representation is very, very target independent. It's source code. When you put that into the compiler, it gets lowered to an IR, an intermediate representation. This intermediate representation is still mostly target independent, but it does have some target-dependent features built in. It looks kind of like a generic assembly. At this point, you could just stop the compilation process and spit out some Bitcode, but we're not going to do that. We're going to keep going. That representation is further lowered to what we call MIR, which is short for Machine IR. At the end of this process, the Machine IR looks almost identical to the assembly for the target, in this case arm 64 assembly. We could put the code side by side, for example. Now, in the optimization that we're going to talk about, we're going to be working with the Machine IR, but for the sake of familiarity, I'm just going to do all of our examples in assembly, because that's less scary. What I'd like to tell you about is the code size optimization called Function Outlining. Function Outlining is one of those optimizations that's going to do everything it can to save you some size. It's very, very late in the compiler. It doesn't actually rely on any sort of a source code language. The best way to explain what this does is by example. So, let's say that you have some assembly like this. hasse and kakutani are two random functions in some random program. Now, what's interesting about hasse and kakutani is they have some identical instructions. What we can do with this is we could take those identical instructions and factor them out into a new function. After we factor them out into a new function, what we can do is we can replace the sequences that we found with calls or branches. As a result of this, we end up with a smaller program. And how much smaller? Well, up to 25 percent on our test programs. [applause] Now, some of you might be wondering, okay, well where do these savings come from? Is it because of copied and pasted code? Is it because your code needs to be refactored? Well, no, there's actually something a bit deeper happening here. If you do have a bunch of code copied and pasted around, that will impact outlining behavior, but that's not really the most important thing. Let's look at another example. Say you have this function. It doesn't matter what this function does, but what I'd like to address is what happens if we take this function and we put it into the compiler. We get some assembly. Well, you might get something that looks kind of like this. Once again, you don't really need to understand this assembly, but what I'd like to draw attention to is the instructions at the beginning and the end of the function. These are called the function prologue and epilogue. These instructions don't correspond to any individual lines of source code. These are instructions that are inserted by the compiler to meet some sort of system requirements. So, these types of things, like these stores and these loads could appear in many places throughout your program, and this is the kind of thing that the outliner can actually leverage to reduce the size of the overall program. There are some gotchas to do with this though. First off, when you outline, you change the control flow of your program. Here, you might initially have ulam called collatz, but then you could outline the call to collatz. The thing about what happens here is we've changed the control of flow of the program. So, the problem with this is what happens if you crash inside collatz? Well, what's going to happen is you're going to go and you're going to take your program and you're going to throw it into LLDB, and you're going to see the added outline function in your back trace. So, this is something you have to watch out for if you're actually outlining code. Another thing is that outlining can increase the execution time of your program. You're adding calls, and calls could have an execution time overhead. And this is actually okay though because -Oz prioritizes size over everything else. When you're going through -Oz, you're saying, make it small. Because of this, we don't recommend that you compile performance sensitive code with -Oz. If execution time is king in your program, -Oz is not the best thing to use. However, we do recommend that you use instruments. What instruments is going to tell you is where the hot spots in your program are, and this can allow you to make the best decisions with respect to optimization for your app. The compiler has lots of different optimization levels, and those optimization levels all prioritize different things. For example, -Oz prioritizes size at all cost, and as a result, you might have some slightly slower execution time. But on the other end of the spectrum, you have -O3. -O3 is going to prioritize the execution time of your program at all costs, and as a result, you might get a larger program. -Os is the default optimization level in Xcode because it has a good balance between speed and size. But, you know, you might have different optimization needs, and so you can use instruments to figure this out. The compiler also offers some extra optimizations that I don't have a lot of time to talk about but I would like to cover a little bit about anyway. It offers PGO, which is Profile-Guided Optimization. PGO is pretty cool because it allows you to actually execute your program and then collect information about how your program runs. Then you can use that to guide the compiler when you compile it again. It also offers LTO, which is Link-Time Optimization. What's cool about Link-Time Optimization is at the cost of some compile time, what you can do is you can tell the compiler, okay, let's wait until we have every single file in the program and use that to say provide better inlining and outlining. Optimizations like inlining and outlining do better when they have more context, so LTO could help there. You can also combine these extra optimizations with the existing optimization levels to get some really, really good actual performance. Because I don't have a lot of time to actually cover what these things do, I recommend that you check out the previous What's New in LLVM talk so that you can learn a bit more about them. After all that, you might be wondering, okay, how do I enable -Oz? Well, just go to your project's build settings and select -Oz as the optimization level. You can also enable -Oz or other optimization levels on specific files in Xcode by going to your project's build phases, going to the compile sources list, and setting the compiler flags. All right. So, I just told you a lot of stuff. You're probably wondering, okay, how does this impact my app's code size? How do I figure out this kind of information? Well, to figure out code size, I recommend a little tool called size. I use this tool a lot. It's a nice little terminal app. So, size is going to give you some nice low-level binary information about your app. It is not going to tell you the actual total size of your app, because it doesn't include things like assets, for example. If you have like a giant picture in your app, and you have like a hello world of code, the compiler can't help you with that. But, let's say you want to use size. Here's how you do it. It's easy. You just say size, and here's the path to my binary. What it's going to do is it's going to give you some nice output like this. What it's going to do is it's going to tell you the size of each segment in your binary, and it's going to tell you the overall size of the binary as well. But, the thing is, is each segment in a binary is actually comprised of many sections. I only care about executable instructions in this example. So, what I'm going to do then, is I'm going to give size a couple extra flags. Those extra flags are the -l and the -m flags. When I do this, what size is going to do is it's going to give me a per section breakdown. If I want to find out more about executable instructions, then I can just look at the text section. So, I hope that all of this helps you with gaining some more insight into the code size of your apps. Now, I'm going to leave, and I'm going to hand it off to my fellow Canadian, JF, who's going to tell you about some language-level Code Size Improvements. [applause] >> Thanks, Jessica. [applause] >> So, I'm Jeff, and I'm going to tell you about Language-Level Optimization. So, Jessica told you about low-level, kind of assembly-ish optimizations, and I'm going to tell you about the stuff that happens when you use the language itself, right, so the constructs that you write your code in. I'm going to tell you about four of those optimizations today that also affect code size. The first one is related to objective C, when you use blocks, right. So blocks has a bunch of associated metadata that the compiler generates for you. It has metadata as well as helper functions, and we'll go into a few examples to see what that looks like. So, say you write some code, and it looks something like this. Right. So, you'll notice that I have two blocks in two different functions, and the code doesn't really matter, but the important bit to notice is that the blocks do completely different things, right. The code has nothing to do with each other, but they have fairly similar struct in that the capture for the blocks is similar. In this case, I capture two arc-strong pointers. Now, I told you that there is metadata associated with each block. What does that look like? Well, this is the metadata that we're talking about, right. So the compiler generates this automatically for you when you use blocks to track a bunch of information about the blocks and give them the behavior that the language guarantees. What you'll notice is there's a block size in there. There's a copy helper and a destroy helper, which are two methods that we'll go into in a bit, and there's a block method signature as well as the block layout info. All right. And if you look at the example that I have on the screen, this is actually the synthesized code that the compiler generates. Now, that looks like a bunch of gobbledygook, but what really matters is that the struct itself is the same, so we can duplicate in a lot of cases. Now, you'll notice in this case we can't because despite capturing two arc-strong pointers, there's other captures in there, and the block size itself is different. All right. So, we can't merge these cases, right. But in general, there's some cases where we can end up merging these things, right. You'll notice in this example though, we have the functions, right, so the copy helper and destroy helper can be merged, right. And you'll notice that as of Xcode 11, we do merge them, right. So, those are the same. What does that mean? Well, the copy helper is there to help you when you move a block around, right, and destroy helper is there to help you when the block goes away, right. And the code that the compiler synthesizes when you do this looks about something like this. All right, so in the example I had, you'll remember I said we have two strong arc pointers, and the code we generate is something like that, and we copy it around. You retain, and when you destroy it, you release. Now, there's a bunch of other things that blocks end up having to do when you copy or destroy. You might have C++ objects, in which case you have to color copy constructor. You might have-- and the destructors, right. You might have some weak arc pointers. You might have nontrivial C types or something like that, right. So, there's a bunch of other stuff that needs to happen, but basically you write code using blocks, and when the compiler detects that there is redundancy, we try to eliminate as much of it as we can. Now, how much does this pay off? We found that in objective C applications, it's roughly a 2 to 7 percent code size reduction, right, and that comes for free. It's just enabled by default. [ Applause ] The second optimization I'm going to tell you about is related to Instance Variables of Direct Subclasses of NSObject. Now, that's a mouthful, and I've got to give you an example and explain what that actually means, all right. So let's say I'm writing a card game, and I write code that looks something like this. The key thing to remember here is I'm deriving directly from NSObject, right. When I write the objective C code, my properties corresponds to instance variables, right. So there's instance variables generated automatically to back up the properties that I have here, right. Now, the class itself that I wrote, the compiler looks at it and generates a struct that looks something like this roughly, right. It lays out the members, one after the other. Now, the thing in objective C is that you can have a base class derived from it and then change the code in one framework to the other, and the base class changes, has new members and whatever, and the derived classes don't break, right. And the objective, in C++ you can't really do that. If you derive, then the base class's layout changes. You have new sizes. Now, here I'm deriving from NSObject. NSObject is effectively part of the ABI of the platform. So, we know it's not going to change, right. So, we have this layout of the class, and when we're implementing the class, right, so implement this initWithName method, I know exactly the layout of everything in my class, right. So, the compiler, as of Xcode 11, can go in and say, I know where the offsets are. I can hardcode them. Well, okay, what does that actually mean? So, I look at this initWithName method. It'll look something like this, right. So, I have self.name = name. Now, the setter, to generate a code for that looks about something like this before Xcode 11, right. So, it synthesizes a lookup into a table to know the offset of the name property or the name IVAR rather. Now, that's a small amount of code, but what we do as of Xcode 11 is something like this instead, right. So, we just hardcode the offset when you're implementing the method that's a direct derivative of an S object. That makes sense. We know it's not going to change because we're implementing the thing that you just wrote. And it seems pretty trivial; it's just one instruction out of three, right. But it turns out to be roughly a 2 percent saving in applications. It's pretty s sweet. [applause] The next size optimization I'm going to tell you about is improved debuggability for C++ types. Now, you're going to go, wait, that's not size. It is. Just let me get to it. So, I say I write some code. Fairly straightforward code here. It's a command line application, right, and what I'm doing is I'm taking the parameters from a command line as strings, transforming them as integers, putting them into a std::vector and then printing them out one after the other, right. Really straightforward demo application. Now, what important here is I'm using the standard library's types. Specifically I'm using vector push back, and I want to go in and put a breakpoint here. Now, this used to not to work really well before Xcode 11. The reason was that we were controlling the visibility of lib C++'s methods just like push back by force inlining them into your code, right. Now, generally that works pretty well. The problem is push back is giant, and then the optimizer goes to town, moves around, deletes some code, and the debugger, when you tell it break on push back, you're not trying to step into push back, you just want to put a breakpoint on that line. The debugger doesn't really know where push back is because it's been splat all over the place, right. So, what we do as of Xcode 11 is that we don't force inlining. We let the inliner decide when inlining should happen instead, right. So, what used to happen in this specific example, before Xcode 11 is the breakpoint that you put there would actually break on the second loop, right, because the pushbacks are just everywhere now. So, as of Xcode 11, we don't force inlining, and this is what the debugging session would look like. So, say I run lldb, I run my program, and I say put a breakpoint on line 12. Right. Pretty straightforward. Bugger goes in, says yep, breakpoint, got it. Now, I hit run. This is what happens now. All right. So, I break on line 12. Cool. It just works, right. It's kind of a boring demo because it does what you expect. It used not to. So, the cool thing now is I'm talking about code size optimizations, right. Well, this, because we don't force inline really big things, if you use the stl a lot in your code, you create quite a big amount of code bloat. And so what we've measured is on big applications release mode up to a 7 percent code size reduction when you do this. Now, again, this is in release mode, right. So, that's a pretty good amount of code savings and better debugging. That's nice. The last code size saving I want to tell you about is C++ static destructor suppression. And again, let me walk through an example to show you what I actually mean here. So, say I write some really generic C++ code. Most applications end up having a logger, something like this, right. And when you do logging, you don't want to pass a logger around throughout your application, so you have just a global variable called logger. Fairly straightforward stuff. Now in C++ when you have a global like this, it'll have a destructor that'll run at the end of the application's lifetime. Right. And you'll notice, the logger contains a buffer that's a std::vector of string, so that's what the destructor is going to do. It's going to destroy that vector of strings, right. Straightforward thing so far. Now, I go into my application, and it's a game. Right. So I add this code here. It's just a game. So, again, I just have one application and one game, and so I just have a global here for my game. Totally sensible code, right. Now, the problem is if I go in and I add some logging code to the struct of the game, well, you'll notice, I have the logger that's global. I have the game that's global. That might not work out so well. The reason is in C++, between different translation units, the order in which the destructors are called is not guaranteed, right. In a lot of cases, you'll destroy the logger before you destroy the game. What's going to happen, this is going to crash, not good, right. So, that's kind of a headache right here. And then dig a bit more into how C++ works, and this is my mental view of what that is, right. So, you start adding thread local storage. You start adding threads. There's like the graph for C++ destructor ordering is really complicated, and it's complicated enough that even like compiler people who are supposed to know how that works, we have no idea. I had to fix a bug a few months ago in Clang where Clang would in really rare cases crash upon termination when trying to clean itself up, right. That's kind of embarrassing, but it's just to show like it's not like a trivial thing to get destruction order correct. All right. And let's go a bit further. On iOS, this is what the lifecycle in an application looks like. Right. There's not really a logical time for the application to shut down. There's times when it goes to foreground, it goes to background, and it goes away, but like destruction, like the application shutting down is not really a thing that makes sense with that type of lifecycle. What you end up having is you implement callbacks, something like this, right, and then you're told you're going to go like in the background, you're going to come back, something like that. Destructors don't really run in a logical place. So, if we go back to the code we had earlier, right. It's an application. It has a logger. This is what we wrote, well there's not really a logical time for this logger to flush its buffer, right. But the, like in the destructor. What you really want to do is say well, if you're going to go in the background, please flush the buffer first, right. So you don't have any cleanup to do in the destructor. It's kind of silly to have a destructor. There's a bunch of code that gets generated for nothing, right. So, we added, as of Xcode 11 is an attribute that allows you to say, hey, don't destroy this thing, right. It's global. It doesn't need to have a destructor. Of course, you're still going to flush manually when a callback happens, and you can go into your entire application in Xcode and use the setting to do that for the entire application. It seems pretty trivial, but it gives you, depending on how much C++ you use in your code, maybe like a 1 percent code size reduction, right, which is pretty nice. So, let's move on from code size reduction and talk about diagnostics instead, right. So, I'm going to tell you about five diagnostics that are all on by default in Xcode 11. The first one is call-to-pure-virtual function from a constructor or a destructor. What does that mean? So let's write some good object-oriented code starting with a table. All right. So, I have this table here, and I go and I want to have a pure virtual function to illustrate what I'm talking about. So, I'm going to go and write this galahad function that's pure virtual, and I'm going to have a destructor for the table, and when the table gets destroyed, I'm going to say, galahad, please go and find the grip, right. It makes a lot of sense. So, I do that, and I get a warning. The reason you get this new warning as of Xcode 11 is because calling a pure virtual function from a constructor or a destructor really doesn't make sense. There's nothing to call. Because the table is the base class, and the most derived classes have already been destroyed in this case, right. So there's no implementation of this galahad function to call anymore. So, how would you fix this? Well, you could go, and in the derived class that implements galahad, its destructor could call find galahad and return a grail or something like that. So that makes some sense. All right. Let's move onto another diagnostic. Memset with transposed arguments. So, say I have this struct called inbox, right, and I have a bunch of emails in it. I come back from vacation, and I'll get to inbox 0. What do I do? Well, I just memset the whole inbox to 0. Now, I wrote this code. Who can spot the bug? Right. I've transposed the argument to memset. Like, I make that mistake semi-frequently because I have no idea what the order of argument to memset is. Whether the value I'm trying to set destruct to is the first argument or whether it's the second one, right. Whether it's the size that I'm trying to set it to. And so we now detect that as Xcode 11, and we tell you about it. How do you fix it? It's pretty simple; you just flip the arguments around, right. Now, one thing you might want to consider here is instead of using memset, which is hard to get, and even like look at that code, it's still not obvious that it's correct, right. What you might want to do instead is use something like std::fill. In some cases it makes sense. You rewrite the code to look something like that, and now it's much harder to get wrong, and it's easier to get what it's doing. Right. So, that's kind of neat. The third warning I want to tell you about, return of std move. So, move is a bit complicated in C++, but there's a bunch of diagnostics over time that help you really use it the proper way, right. So, again, let's write an object-oriented code to understand what I'm talking about here. So, say I have three structs, lion, goat, and snake, and love object orientation, I'm going to compose them into a chimara, right. Good code. So, I go in and assign bellerophon and I want to go slay the chimara, and then return with proof that I've slain it. Well, I now have a diagnostic that tells me, hey, you know what? Like you're returning a chimara, but you're just, like the return type that you're returning is actually just a goat, right. So, I'm going to take that vector, copy it over, because it doesn't make sense to slice out the vector out of the chimara and put it into a goat instead. Right. So, what you're doing here, the code you wrote is basically what you've been trained to do related to std::move. You're trained to rely on copy elision, right. And most times, you don't need to put a std::move when you do return. In this case you do because it does a copy, right. The language says returning, just slicing out a part of the class is kind of weird anyways. Move shouldn't be implicit, right. So the warning tells that you probably don't want to do this, right. So, first way to fix it, you go in, you call std::move. Right. Now, that will move the vector into the goat, right. So it's much more efficient to do the move this way. So, another thing you might want to do is well instead of returning just the goat and having like people trust you that that's actually a chimara, well you can just return a chimara. That makes sense, and here you get copy elision, right. If you were to add std::move, the compiler would tell you that you're pessimizing things by adding the move. Another thing you might want to do because you're not sure if you're going to get a chimara or not is you might want to return a std::optional with chimara, which again does the right thing. It's not slicing things out of the class and so the language says, yes, this gets implicit copy elision. All right. Another diagnostic I want to tell you about, size-of-pointer-div. What does this look like? Well, say I write this code. Totally fine code. No problem right now, right. So, what I'm saying is I have this array. Take the size of the array divided by the 0th element, and that gives you the number of elements inside the array. That's really standard code for C style code, right, a really common idiom. So, the problem here is if I refactor this code and I do something like this. So, I pass the array as a parameter instead. Well, what happens here is that the C rules say that the array now decays to a pointer. And the new diagnostic tells you, hey, this probably doesn't do what you want, right. This won't return you the number of elements in the array. So, that's a problem. We catch it. How would you actually fix it? Well, you could have written the code a bit differently. So, instead of using that idiom, right, you could have gone in and used something like std::size, which means that instead of refactoring the code wrongly, you would have caught that problem when you tried to refactor it, right. So std::size just does the right thing here. So that's a kind of a neat warning that catches errors. The last diagnostic I want to tell you about is defaulted-function-delete. So, again, say I write this beautiful code here, all right. So, I have this struct aberration. It has some floating eyestalks, some eyes and mouth, and I'm like I want to default aberration. Please give me a default aberration. Well, the compiler comes in now and tells you, hey I don't know what a default aberration is. Why not? Well, I have a float reference, and I can't synthesize the fault constructor for that reference. It's a type that I can't default create. Right. So there's a bunch of other ways, not just references in C++, to create things that can't be default constructed, and the compiler now tells you about it. If you ask for a default constructor, and I can't give you a default constructor, it'll tell you. So, how would you fix this? Well, one way is to just create the constructor yourself. As you pass in the eyestalks, that creates a reference automatically. That's kind of neat. But personally, I think like beauty is in the eye of the beholder, but maybe this aberration should be coded differently and not have a float reference, and maybe you should just do this instead, right. That makes a lot of sense, right. So, now I can default create to the aberration. All right. So, that was the diagnostics I wanted to the tell you about. Now, I'm going to hand it off to Devin, who's going to tell you about New Static Analyzer Checks. [ Applause ] >> Up until this point in the talk, the warnings that we've told you about have come from the compiler when you build. But we have other tools that can help you find bugs. One of them is the static analyzer. The analyzer finds deep bugs in your code, and it can do it without even running your app. This makes it great at testing and catching those hard-to-reproduce bugs that you never even thought to write tests for. And it can even show you the sequence of steps along which the bug manifests. This makes it easy to understand the problem and fix it. Today, I'm going to tell you about three new C++ checks that we've added. A check for use after move bugs, a check for dangling C string pointers with C++ std::string and to check for reference counting bugs in the new DriverKit and in IOKit. Let's start with use after move. In C++, moves allow you to avoid unwanted copies. And here's an example of where you might want to do that. Let's suppose that I have written a novel, and for those of you who know me, I can be quite verbose. And so, I don't want to pay the performance cost of copying the entire text of the novel when I hand it off to my publisher. So, I'll use a move. This moves from the source variable rather than copying it. And what's great about this is that it allows me to enforce a unique ownership semantics where there's no confusion about who has the latest version of the novel, me or my publisher. But I do need to be careful here, and that's because moves leave the source in an unspecified state. Let's see how this can go wrong. Suppose I were to add a call to spell check my novel after I published it. This can have unexpected results or even crash depending on the implementation of the book type. Fortunately, the static analyzer can now catch this bug. To fix this, what I should do is reorder the code. It really does make sense to spell check the novel before I publish it. All right. Let's move on to dangling pointers from std::string. Those of you who have mixed C++ and C strings know that this can be really tricky, and here's an example. I've created this generateGreeting function. It takes in a C string name and returns a C string greeting. And in the implementation of this function, I've chosen to use a C++ std::string because it's easier to manipulate. I declare a std::string local variable, initialize it to hello, append the passed in name and then, because the function returns a C string, I call the c str method on the C++ string, and that's where things start to go wrong. And the key thing to note here is that c str returns an inner pointer to a buffer inside of the std::string. And this buffer is deallocated when the std::string goes out of scope. What this means is that I'm returning a pointer to memory that will be deallocated, and then when I use that memory, the program could crash. The static analyzer can now catch this bug. So how should I fix this? Well, we recommend matching the lifetimes of your C++ and C strings. Here, I have changed the generateGreeting function to return a std::string, and then I store the result into a local variable. This means that when I call the c str method that local variable stays in scope for as long as I need to use the C string. In essence, what I have done here is change the scope of the std::string to last as long as I need it. And I will note that it's often easier to stay within the C++ world as long as possible and only get the C string out at exactly the moment where I need it. All right. So, let's move on to the third and final check I'm going to tell you about, for Reference Counting bugs in DriverKit and IOKit. These driver frameworks use Manual Retain/Release for their memory management. And for those of you who are familiar with CoreFoundation or with Objective-C without Automated Reference Counting, it's very similar. Manual Retain/Release gives you a lot of control over memory management, but it does come with some additional responsibilities. You need to be careful to not over release memory, because if you do so, it could be deallocated, and then when you use it, your program could crash. Similarly, you should not under-release your memory because the memory could leak. Let me give you an example of a leak. Here, I've written some code that allocates a new array of devices. It then fills in those devices and sets them up. And the key thing to note here is that OSArray::withCapacity, it allocates a new array and returns it retained. What this means is that the array will leak if it's not released. The analyzer can now catch this bug. So, how should I fix this? All I need to do is make sure to release the array when I'm done with it. Now, the memory management rules are all based around naming conventions, and these are very similar in spirit to CoreFoundation and Objective-C under Manual Retain/Release. But I do want to point out one key difference for IOKit and DriverKit, and that is that the default convention is to return retained. Or, as we sometimes call it, at +1. What this means is that clients must call release on the result of a method that they call. Otherwise, the object could leak. An important exception to this rule is that getters return unretained, or as we call it, at +0. Clients should not release the result of a getter. Now, you might write code that differs from this convention, and here's an example of some code that I wrote. This method finds the first device in an array, and it has the default convention. It should return retained. But if we look at the implementation, it returns the result of a getter, and getters return unretained. So, there's a mismatch here. Fortunately, the analyzer can tell us about this. So, how should I fix this? Well, I have three different options. The first is to change the behavior to follow the convention. Here, the convention is that the method should return retained, so I could retain the result before returning it. Another possibility is to rename the method. And if I look at this findFirstDevice method, it looks a lot like a getter. So, I could just rename it to getFirstDevice, and that would follow the guidelines. But you might have a method that has the behavior that you want, and it has the perfect name. You don't want to change it, and that's okay. What you should do in this case is add an annotation to tell both readers of your code and the analyzer that you're not following the convention on purpose. In this case, I can add the DRIVERKIT RETURNS NOT RETAINED annotation to express my intent. So, if you have an IOKit driver or you're writing a new DriverKit driver, I highly encourage you to run the analyzer on your code. To do that, all you need to do is go to Xcode's product menu and choose analyze. And you can even have Xcode run the analyzer every time you hit build by going to your target's build settings and enabling Analyze During Build. This will help you catch your bugs before you even commit them. All right, so we told you about a lot today. We showed you how LLVM Bitcode enabled a seamless 64-bit transition for watchOS where your 32-bit apps worked on day one with Series 4 watches. We showed you how to reduce code size with new compiler optimizations and language features and how to run the static analyzer on your code. For more information, please check out our session website, and we really look forward to talking with you in the labs. Thank you. [ Applause ]