---
title: WWDC2004 Session 211
framework: wwdc
role: article
path: wwdc/wwdc2004-211
---

# WWDC2004 Session 211

## Transcript

Session 211, OpenGL Optimization Live. I'm Dave Springer, and this is Chris Niederauer, who's going to be running the demos. So it's three-thirty on Thursday, nearly the last day of the conference, right? You guys have been here a long time and you're feeling a little conference burn. Here's a thought that keeps me going: in the whole of human history there's been only one Dave Springer, and for my entire life I get to be him. Okay.

All right, we're going to talk about a couple of tools that we developed at Apple, OpenGL Profiler and OpenGL Driver Monitor, and we're going to have live demos, because we like to live on the edge. The software you'll see is pretty fresh, so anything could happen at any time. What we're going to do during these demos is show some of the performance bottlenecks that we've seen that are common among OpenGL apps. We get a lot of applications coming through the shop, and we see things like immediate mode when display lists might be more appropriate, texture upload usage that could be a little better, and state changes that aren't always necessary. We'll also show you how to debug your OpenGL applications using these tools.

Now, why did we build this tool? Really, those were the issues we were running into all the time. We'd see a lot of performance problems and we noticed common themes, so we built this tool to quickly identify the areas where you may be losing performance in your OpenGL apps. There were also a lot of misconceptions about why performance was being lost, and you'd get a lot of finger-pointing. Well, now we have a tool that will measure exactly and identify precisely where the performance is going.

So what does it do? Profiler will show your usage of the OpenGL engine library and collect a lot of data and statistics. You can also control your application, kind of at a debugger level, and
we'll also show the graphics state that your application has in it. We'll get into what all that really means in the demos.

Okay, here's how Profiler works. I've got to refer to my cheat sheet here, because my memory's like a sieve, and besides, this makes it look like you prepared beforehand. Profiler is a runtime system, and what happens is it gets in between your app and the OpenGL engine. So you don't have to recompile your application, run in a special mode, or anything like that. Profiler really does work like a debugger in that sense: you can just run your app under the Profiler environment. How it does that is that Profiler gets into the OpenGL library at the library level and wraps all those functions. Imagine, like in the old days, you have jump tables for the loaded libraries; we get in there and masquerade each one of those function calls to go into Profiler, and from there into the engine or the CGL shim. It wraps both CGL, which is our OS-dependent layer of OpenGL, and OpenGL itself. A quick note: under the X Window System, if you're using that platform, you actually end up profiling the server, not your client app.

Because it's a runtime system like this, you can launch an app under Profiler and then, using gdb or another favorite debugger, attach to that same app. So you can have a full debugger at the same time that you're running Profiler. There are some little tricks to keeping them synchronized, and you'd have to experiment with that; that's an exercise for the reader.

Okay, here's a screenshot. This is all new for Tiger: Profiler 3.0. The first thing we did was take the two-panel startup approach that we used to have and compress it into one panel, so now there's no start, and then start, and then really start. You select your app at the top. Let me get out of these fancy builds. There we go. First of all, you set up the app that you want to profile in this top table here. If you click
Attach, that table is automatically populated with all the running applications on your system. Then, in the new 3.0 Profiler, you can set environment variables. This window, by the way, is opened large; it doesn't default to this size. Normally this part of it is hidden when you run Profiler. You can set a custom pixel format, which I'll talk about more later. You can emulate, sort of, graphics drivers; again, I'll talk about that in more detail later. And new for Tiger, you can set environment variables. If you're launching from a shell in UNIX, you can set environment variables that you can getenv inside your app; now you can do it right from Profiler. I do this all the time to tell my target apps to load different dynamic libraries, like debug versions of a framework, for example. You can do that with your environment variables.

The third part of this panel is down here, and the most important part of this panel, really, is the frame rate. This is an out-of-the-gate estimation of your app's performance. It's not a really precise, nailed-down "this is where my performance really is," but it's going to give you a general idea: oh yeah, I'm getting about 200 frames per second and I'm expecting 500. That's what it gives you.

Now let me get into some of the things that we collect. This is the data that Profiler gathers out of your app. It takes the amount of time that you actually spend in the OpenGL engine. When you make a call, we start a timer, go into the engine, come out of the engine, stop the timer, and add all those up. So this is a pretty precise measurement of your actual usage of the engine per call. These are cumulative values, and they are either per context or global. If you have a lot of threads and a lot of contexts, you can see globally how much time you're spending in functions, or you can look at it per context as well. Now, one of the really important numbers on this panel is the
estimated percent time spent in the OpenGL engine. Here we're spending about a quarter of the time: if the app is a hundred percent, we're spending about a quarter of the time actually in GL, in other words on the GPU. What this number is going to tell you is whether Profiler is the tool you want to keep using to measure performance, or whether you want to move on to something like Shark and the CHUD tools in order to work on your performance on the CPU side. We'll go into more detail later about how to balance those two off. Okay, with that I'm going to turn it over to Chris. He's going to show us the demo.

Okay, so we're on computer number three. Okay, that's good. So here's a new Profiler window, and as we see, it's actually a little bit smaller than Dave's screenshot; this is how it starts up by default. We can start an application that shows some of the common pitfalls that we see with a lot of applications today that use OpenGL on Mac OS X. This application, we call it... what did we call it? [inaudible] Tri-Tips Bandit, that was its name. We wrote it, and we'll use these tools in OpenGL Profiler to improve its performance.

One of the first things that you generally want to do is figure out what percentage of the time your application is spending in OpenGL and what it's doing with that time. We've got the application right here, ready to launch, so let's launch that. We've got it showing up here, and it's just running as it normally would if I were to start it from the Finder. One thing you may notice is that the frame rate is already showing at the bottom of the Profiler panel, because it's able to non-invasively capture the calls to OpenGL, and it's displaying this information. We're seeing that we're getting around 33 frames per second right now. So let's go and find out where the time is going to make that 33 frames per second. I'm going to check Collect Statistics right here, and this brings up a window of all the
functions, and for each of those functions we have the total time spent in that particular OpenGL function, the average time of each of those calls, the percentage of all the OpenGL usage in that application, and then the percentage of time in the overall application. One of the things I like to do is sort by percent of time in OpenGL. This gives us a good gauge to start with, to see which calls are actually taking up the most time.

We look at this list and we see glVertex3f and glTexCoord2f; those are the top two. But I'm also noticing here that there's a glFinish. Despite finish only being called fourteen hundred times, we see it's taking quite a bit of time for each finish call to execute. If you attended John Stauffer's talk earlier on OpenGL optimization, finish isn't always a necessary command to use. Flush is sometimes useful, and there are cases where finish is useful, but in this particular application it's not. So I'm going to show you part of the control functionality of OpenGL Profiler: I'm going to disable that function.

I'm going to open up the Breakpoints window. This window gives a list of all the functions of OpenGL, and you can control them in different ways. If you don't understand all of it yet, don't worry; we're going to go over how to use this window in depth a little later on. I'm going to look for glFinish. We see the command here, and we see this column, Execute, so I'm simply going to turn it off in that column. We see already in our statistics that it has turned red, which means that we're no longer calling that function. I'm going to clear this and recheck, and we can confirm that glFinish is no longer in the statistics. So we're already about five percent faster, from what I've measured,
just by taking out the function, and it allows you to do this all on the fly. So that's some of that. Back to you, Dave.

Thanks, Chris. Okay, I want to mention here that Chris and I work about 200 miles apart or so, and I had never seen this demo until now, and it was awesome. So thanks. [inaudible] [Applause]

Okay, let's go on to another section of usage data that we harvest with Profiler. This is a call trace. In the last demo you saw that we capture every function and time it. In this usage data capture, we grab every function as you call it and store it, so you've got a whole trace of all the GL function calls you make, as you make them. You can see here that the output is kind of C-style. You couldn't actually just grab this and compile it, you'd get errors, but it does print out the symbolic names of the parameters, which makes it a little easier to read.

I want to point out a couple of new features for Tiger. One is that you can apply a filter to this trace. A couple of you developers out there had this excellent idea of taking this text and running it through various Python and Perl scripts to come up with statistical analysis. Now it's built right in: you just say Enable Filter, pick the filter, and it'll push the trace right through there and show you the output.

The other thing we have on here that's new for Tiger is timing information per function. Before, you saw in the statistics window that the timing information is cumulative for all the function calls. When you read glVertex3f, for example, that timing information is the sum of all your glVertex3f calls. In this case you get the time for just the individual call that's on this list. What that's really useful for is finding hot spots, because you might have instances of, for example, CGLFlushDrawable (I'm just pulling this out of the air) that take a short amount of time, and one or two that are super long because of state changes and things like
that. Well, on the stats window it's going to show up as taking a long time cumulatively, but what you really want to do is narrow down to those one or two that are really soaking up all the time and figure out why that is. This timing information that is coming in Tiger will let you find those really fast. Plus, attached to each one of these lines will be a reveal for a full backtrace, so you can click on the function and find out where in your code that specific call was made, and narrow it down. Again, this is all per context or global, so if you have a bunch of contexts you can narrow down to looking at function calls in just one context. Okay, and I think with that we're going to turn it back over to Chris. Go back to computer three.

Thanks. So I have a second application here, which I've already taken the finish out of, and I'm going to demonstrate how to pass vertices down from your application to OpenGL more effectively. Let's launch this application up again. As Dave was showing, OpenGL Profiler allows you to get a trace of all the OpenGL functions that are being called, and I'm going to go ahead and do that. Let's click this button, Create Trace, and I'm going to stop it, because otherwise it might fill up the hard drive.

Looking through this trace, we see a lot of begin and vertex calls: glBegin, glVertex, glTexCoord, glEnd. And this is actually for static data. It would be more efficient to use a display list in this particular case. For instance, the land here is all static, yet I'm passing it down through immediate mode. So what I'm going to change is this: you can use vertex array range, vertex buffer objects, or display lists. You can pick the type that you feel is most appropriate for the kind of data you're trying to draw, and use that to more efficiently take
advantage of the video card and of the bandwidth of the system. One thing to note is that immediate mode requires a lot of calls, with all the begins and ends, versus, say, a display list, which is a single call, glCallList. Also, as John Stauffer went over in his talk earlier today on optimization in general in OpenGL, you can use the CGL macros to cut down on the overhead of each function call, basically the overhead of making the function call itself. In this particular application, simply switching to the CGL macros will definitely get us a pretty good gain.

But I've already written this to use display lists. First, let's check out the statistics again. Sorting by GL time, we can see that we've got glVertex3f, glTexCoord2f, glVertex3d, glColor4f, and glBegin, and all these calls are immediate-mode calls. We're getting about 35 frames per second. So I'm going to stop this application and start up one using display lists and vertex array range. With that alone we're already up to 160, 170 frames per second, 165, simply by passing the data using display lists and vertex array range. Let's go back to the statistics. Now we see that most of our time is being spent in glCallLists, and most of the rest of the time is being spent in CGLFlushDrawable, which basically means it's waiting for the video card so it can stick more data on it. So we're using OpenGL pretty effectively on the CPU side right now. Back to you.

All right, thanks, Chris. Let's move on to application control and some of the features in Profiler for this. One of the ways you can control your app is by setting a custom pixel format. What we do here is inject a different pixel format from the one you asked for in your code. Again, this is without recompiling your application or changing any code
like that. We can do things like change the depth buffer size. If you want to see whether your app will run with a 16-bit Z-buffer instead of a 32-bit one, you can do that through Profiler without having to recompile your app. I mean, yes, you rerun it.

The other thing you can do is what we call driver emulation. We don't actually fully emulate the graphics drivers, because we can't; it's hardware, there are caches, there's all kinds of stuff involved in there. But what we can do in Profiler, as a runtime system, is get in the way of the glGet calls and make it seem like you're running on another card. This is useful if you want to make sure your app is following the correct code paths. For example, you've got different code paths depending on the return from a glGetString, because you're looking at different card features and you're going to enable or disable certain functions. You see this in games all the time: you're going to change a menu that allows you to turn on certain features of a card. That's what this really lets you do. Now, when you use this feature and change the driver strings that are getting returned, your graphics may not show up on the card, because it's not the right card. Your app thinks it's something else, an NVIDIA card, and really it's an ATI. So use it with care, but it does have value.

Another way we can control the application, which Chris already showed in his demo with glFinish, is that you can enable and disable GL calls. If you want to see what your app looks like without ever calling glFinish, you turn it off, and you saw that the app not only looked the same but ran way faster. You can also attach scripts at breakpoints. What that means is you can write little pieces of GL code, and Profiler will take those and inject them into your application at breakpoints while you're running. We're going to see an example of that later on. Okay, this is the Breakpoints window, and Chris showed this earlier.
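The capability-dependent code paths mentioned under driver emulation can be sketched in C. This is only a minimal illustration, not code from the session: `has_extension`, `choose_draw_path`, and the `draw_path` enum are hypothetical names, and the extension list is passed in as a plain string so the example runs without a GL context. In a real app that string would typically come from `glGetString(GL_EXTENSIONS)` with a current context, which is the kind of return value driver emulation can change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical set of code paths an app might pick between. */
typedef enum { PATH_DISPLAY_LIST, PATH_VAR, PATH_VBO } draw_path;

/* Returns true if `name` appears as a whole, space-separated token in
 * `ext_list`. A plain strstr() is not enough, because one extension
 * name can be a prefix of another. */
static bool has_extension(const char *ext_list, const char *name)
{
    /* Extension names never contain spaces; passing one is a caller bug. */
    assert(name != NULL && *name != '\0' && strchr(name, ' ') == NULL);

    if (ext_list == NULL)          /* e.g. no current context */
        return false;

    size_t len = strlen(name);
    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p += len;                  /* keep scanning after a partial match */
    }
    return false;
}

/* Picks a vertex submission path from the reported extensions,
 * falling back to display lists, which are core OpenGL 1.x. */
static draw_path choose_draw_path(const char *ext_list)
{
    if (has_extension(ext_list, "GL_ARB_vertex_buffer_object"))
        return PATH_VBO;
    if (has_extension(ext_list, "GL_APPLE_vertex_array_range"))
        return PATH_VAR;
    return PATH_DISPLAY_LIST;
}
```

Keeping the decision in a function that takes the string as a parameter makes each path easy to exercise by feeding it different extension lists, which is roughly what emulating another card's driver strings does to a running app.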
In the Breakpoints window, like in a debugger, you can set breakpoints, but unlike a debugger you can only set them on certain functions, which are all the GL calls. This is not a general debugger feature; it's just a way to stop your app on certain GL calls. The interesting thing is that you can stop just before the call goes into the engine, or right after it comes back from the engine. Why this is useful is that along with the backtrace, your standard debugger backtrace, you also get a full snapshot of the OpenGL state. So you can see what kind of state changes are going on in your GL calls and verify for yourself that they really are happening [inaudible]. And again, this handles the multiple-contexts case. In a debugger, if you have a bunch of threads running and you put a breakpoint on function foo, it's going to stop in every thread. Well, it's the same here: if you put a breakpoint on glFlush, it's going to stop in every context in every thread that calls it. Okay, and with that I'm going to go back to Chris.

[inaudible] So OpenGL Profiler is good for breakpoints, and one of the types of breakpoint you can set is a breakpoint on any GL error. What I'm going to do is run my application, this time with a breakpoint set for any time a GL error might occur. I'm going to go back up to Views, to Breakpoints, and we can see here that we've got a list of different types of errors we can break on. I'm going to choose Break on Error, which refers to the normal GL error, and let's start this program up.

So already it's caught a GL error, and it says that I'm calling glEnable with GL_BLEND_EQUATION, when in fact blend equation is not something that's supposed to be enabled and disabled. GL_BLEND is actually what's supposed to be there. So we get the error GL_INVALID_ENUM, and we can also see the backtrace and the actual line of code. Somewhere here
you can see the line of code where this error is occurring. Using this, I was able to quickly realize where the error was and fix it: I corrected it to GL_BLEND instead of GL_BLEND_EQUATION, and I'll show you the result. We got one taker, yeah. Let's see. If you noticed the function in the backtrace, that was in draw sky, so blending basically was not enabled for that. Let's start up the version without errors. I'll set that breakpoint again, Break on Error, and start it up. As we can see, it's not breaking on any errors, and now we've got the clouds blending pretty well. So let's go back to you, Dave, and you can explain some of the other types of errors that you can break on.

Thanks, Chris. Okay, we saw breaking on OpenGL errors, which is very useful. Another way that you can track errors in your app is on thread conflicts. John talked earlier about multiple contexts and multiple threads, and what's legal, what's okay, and what's not. Well, you can have more than one thread talking to one single GL context, but it's up to you to make sure that the threads are locking correctly and that there's not more than one thread in the context at the same time. If you end up with more than one thread talking to the same context at once, you can get all kinds of funky data corruption problems, and bad things can happen to your computer, so you don't want that.

What's really happening with this thread conflict breakpoint is this: you are supposed to have mutex locks on the threads if they're going to talk to one context, and Profiler has a mode where it applies the locks that you're supposed to have. So if you get into a case where threads are going to conflict, it'll trip over one of those locks, stop, and say, hey, there's an error here. Then you can go back into your app, again by using the backtrace, apply the locks, and clean it up. Personally, I recommend that you have one context per thread, but that's just my personal opinion.
Now, this threading violation stuff is only detected in the OpenGL APIs. We don't detect it in the CGL layer, so you're on your own there.

Another way we can detect errors, in the panel there, is Break on VAR Error. It stops on vertex array range and vertex array object errors. Essentially these breakpoints say: any time an index that you're using to draw with veers outside of an array range that you've specified, or if you hand in a pointer that is outside one of those ranges or into one that you haven't properly set up, then we'll stop and break, and again show you the backtrace, the full GL state, everything you need to see. We validate your vertex array range on any of the functions that you see up there.

Okay, we talked about the full snapshot of OpenGL state. It is a full snapshot: every glGet call that you can make is done right here, and we put the results in this reveal list. The state is gathered, harvested, every time you stop at a breakpoint, and the changes in the state are shown in red, the changes being those since the last breakpoint. To show that, what I did in this screenshot is put a stop on glEnable before it goes into the engine, and then another stop as soon as it comes out. Since it's glEnable, I would expect the state that I'm changing to be turned on, and that's in fact what happens. You can see down here at the bottom that it says it broke after glEnable; in other words, it's gone into the engine, come back out, and stopped again. And you can see there that cull face is now enabled. This is really useful for detecting errors where you think you have state set up that may not be, or state that's set with incorrect values. You can watch the change. We had another taker for that. Awesome.

Okay, just a couple of quick points on this window. Under that Actions pull-down there are just shortcut menu
options to stop everywhere before, stop everywhere after, or stop nowhere; it turns all those buttons on or turns them all off. You can also execute no GL functions. We've had examples of this in the lab, where people say, "Your graphics are slow, and my app runs slow because your graphics are just not up to par." So we said, okay, we'll take your app, we'll turn off all graphics, and notice: it goes the same speed. Guess what? So you can run your app open loop and decide, oh well, maybe I'd better get Shark and the CHUD tools out and make that go a little faster first, before I start blaming people randomly. Not that I've ever done it. And of course you can ignore all the breakpoints too, if you just want to run your app without stopping anywhere. Okay, and with that, I'll turn it back over to Chris, and we'll talk about unnecessary state changes.

So I already talked about immediate mode, and about how making a lot of calls results in function call overhead. The same holds true for setting state, except that the actual setting of the state can itself take up time. And even if the setting of the state doesn't take up time, you may not know that some of the state changes will be deferred until your draw command, and that will cause your draw commands to go a little slower. In the statistics, for instance, you'll see glDrawArrays taking longer than usual because you've accidentally turned something on, or turned it on multiple times, or just switched some state that you didn't need to switch. So one thing developers should try to do is avoid state changes when they can. But they should also keep in mind that OpenGL is a state machine that keeps track of what you're doing, and depending on the type of state you're setting, it may be more efficient for you as the developer, with the semantics of
the application, to decide whether or not to do the state change yourself.

So I'm going to launch an application here and go look at the statistics. One thing I wanted to reiterate that Dave said earlier is that the estimated percent time in GL is a really useful number to look at. Here we see that ninety-one percent of the application's time is going to OpenGL. Depending on your application, that percentage will be different, but in this particular application I'm just pounding on the graphics hardware. I'm not doing any CPU calculations such as physics or anything similar, so I have a pretty high percentage of time in OpenGL. Sometimes it's better to have a higher percentage of OpenGL time, because that means you're giving more data to OpenGL in general.

I wanted to look in particular at the number of function calls for glEnable and glDisable. If you look at the number of calls for those two, you'll notice they're very different: there are 180,000, 190,000 disable calls, while there are only 125 enable calls. Obviously that's not necessary; it means there's some sort of imbalance there, and that's just one example of a state change which is unnecessary. By taking that state change out, you don't just gain the time in the actual function itself. The average time here is very small for the enable and disable; however, that time might actually be showing up in your other calls, such as glBegin and other similar drawing commands. So back to you. This is on an ATI 9600 card, on a dual 2 [inaudible].

All right, let's talk about some of the graphics state that your application keeps. Now, there's a distinction between state in OpenGL, which is a state machine, and graphics state that
your application owns. The difference is that your app is going to own things like textures, vertex programs, and, as this slide shows, a depth buffer, a back buffer, things like that. It's not, strictly speaking, GL engine state. But because it's important, especially when you're debugging and in performance analysis, to know what's going on with that state, Profiler captures it all too.

This view is the depth buffer. What Profiler does here is grab the Z-buffer, the depth buffer, and grayscale it, so that on the gray scale here the black pixels are minimum Z and the white pixels are maximum Z. The slider at the top is showing you your Z range. When you bring the depth buffer up and click that magnifying glass, Profiler will automatically analyze the image and say, all right, your minimum Z value in the depth buffer is such-and-such, in this case 0.34, and your maximum is one. In OpenGL the default is that the Z values in the depth buffer are always between 0 and 1. There is a way to change that, but generally speaking the floating-point range of Z in a depth buffer is 0 to 1.

The idea here is to show how much of the precision you're using. One of the common problems you run into, which Chris is going to demo later on, is something we call Z-fighting; that's our colloquial term for it. How it manifests itself is that you get these little flashing polygons, because there's not enough Z precision to tell consistently which part is in front and which part is behind. So you don't have enough precision in your depth buffer. You can see whether you have enough precision by using this view: if that orange bar at the top is really tiny, then you've got almost no Z precision. The wider that bar is, the more Z precision you have, so that's what you're striving for. The way you affect the precision is by changing the near and far planes in your glFrustum call. This has sometimes been
a point of confusion, especially on the OpenGL list: the values of Z in the Z-buffer, the range of those values I should say, is not affected by glFrustum. They always go from 0 to 1. What changes with glFrustum is how many of those bits you're going to use for the Z compares in the Z-buffer. Make sense? In other words, you don't want to have just the top two bits being used for all your Z compares. You want to try to use all 32, or all 16, or whatever your depth is, and your near and far planes are going to be the determining factors for how much precision you get. The actual values always go from 0 to 1; the range is always 0 to 1.

Okay, another kind of buffer you can look at with Profiler is the stencil buffer. What Profiler does is pseudocolor the stencil planes. The way you use a stencil buffer is that you set individual bit planes, so Profiler can pseudocolor those. In this example we've got three bit planes being used in the stencil buffer, and Profiler has pseudocolored them blue, green, and red. You can see that it's black where there's no stencil bit set at all, and then which planes have stencil bits set in the red, the green, and the blue. Where it's purple, Profiler composites the bit planes together and gives the result another pseudocolor, so the purple areas are where you have both red and blue set; both of those bit planes have been set in your rendering there.

Other buffer views you can get to: the back buffer, which is pretty straightforward; it just looks like the front buffer before it got swapped. You can look at the alpha buffer, which is also grayscale colored, and you can look at all your auxiliary buffers, so however many you asked for in your pixel format, or however many the engine or card supports, that's how many you can look at. The buffer views are all static; they're just what your app put in there. You can't edit them and then shove them back in and say, oh well, what happens if I really had a Z precision range of,
you know, much bigger than I really do. You can't do that; it's just reporting what you did. They're just static images. Okay, now with that, I'll turn it over to Chris. So I'm going to show an example of being able to look at those buffers. Looking closely at this application, we can see in the background, where the water and the land are meeting, that the land doesn't quite look right; there's not a smooth line there. What I think this is, is z-fighting. So the way that I would check on this, the first thing I would do is take a look at the depth buffer. To do this I have to set a breakpoint in order to specify exactly where I want to look at the buffer. So I go up to Views and select Breakpoints, and since I want to look at the depth buffer basically when everything's been drawn, I'm going to set a breakpoint right before glClear is called. That means everything's done and it's going on to the next frame, but since I set it before, it won't actually execute the clear yet. So I set my breakpoint, and I go up to Views; let's look at the depth buffer. You can set the slider here, but I'm just going to use the auto find min/max, and we can see that the range between the min and the max z value that we're using is actually very small. We have a 32-bit depth buffer in this case, but we're only using a very small amount of it, from 0.996 to 1.0. What we'd like is for this range to be a lot bigger; we'd like to use a lot more of that 0-to-1 range. So let's go ahead and figure out why we're using so little of the z-buffer. I'm going to set a breakpoint on glFrustum, and I think that I don't actually call glFrustum unless I resize the window. Here we go, glFrustum, and we see the frustum is being set with the x and y values [unclear], basically, and then
these two values are the z-min and the z-max. We're going from one to a hundred thousand or a million or something like that, which is pretty large considering that I'm only really drawing from zero to forty. So because of that, that's making our depth buffer look incorrect. So I'm going to show this same application with the frustum modified so it will clip between zero and 40, and again I'm going to look at the depth buffer. Well, we can see already that the water is looking much nicer; we've got clear lines where we used to be having some z-fighting issues. So let's set a breakpoint on glClear, and you can see the depth buffer automatically updated, and we're actually using a lot more of that range. So in effect we've gotten rid of those issues. And back to you. All right, thanks Chris. Z-fighting, in the past, has been a real hard one to find. We've got a lot of chatter on the OpenGL list about "my polygons keep flashing in and out," and it is not obvious that your Z precision is related to the glFrustum call. glFrustum has nothing to do with the depth buffer, right? So there's not that instant correlation. Okay, more of the application graphics state: Profiler will capture all the textures that you're uploading, vertex programs, and fragment programs, and you can look at those and verify for yourself with Profiler that you really did upload what you think you uploaded. One place where this is really useful, and Chris is going to show it later, is in your mipmaps, because you can get a lot of weird texturing errors when you think you've got a mipmap up there that you really don't. This screenshot up here is showing a cube map, and what Profiler does there is capture each of the individual six faces that go on the cube map and stick them on a cube. You can rotate that around and it will show you which map is being applied to which face of the cube. So again, verify that you've uploaded the right
texture to the right face. Plus there's a bunch of information up there that talks about the internal format and the source format, so when you're looking at performance issues in terms of what kind of texture formats the card's going to perform best with, you can see, oh well, if I change the internal format and ask for a different internal format, you can maybe get better performance. Also the texture dimensions. There's a mipmap slider down there, which Chris is going to get into more detail on as well, and other little buttons and things like that, just so you can flip the texture upside down. And so with that, I'll turn it over to Chris. Okay, so we've got Profiler running here. This time when we launched the application, just like normal, I set Collect Resources, however, and so this brings up the Resources window right here. Right now I'm viewing the textures. Looking at this application, we see it's a nice sunny day; the sun's out and it looks pretty warm here, but looking down at the ground, it looks kind of cold, like snow. That's actually because one of my textures isn't uploading correctly. By default, when a texture image is not specified correctly, it defaults to a white texture. So let's go and see why this texture is white. Well, here's the grass texture; we don't see that grass texture in here. [unclear] I'll show you these other resources, like the cloud. The mipmap slider down here actually serves a dual purpose, in that when you have mipmapping enabled, it will let you slide between all the mipmaps and see each one, and when it's disabled, this slider here will be disabled. So let's go look at one of the textures we know is uploaded correctly and go look through these mipmaps; it looks like they're all specified correctly. So let's go back to the grass texture, and we notice that these mipmap levels have not been updated. So to fix this
either we could turn off mipmapping for this particular texture, or what I'm going to do is specify the mipmap levels for all of those. Actually, this is a great way to show off the scripting ability in Profiler, so I'm going to disable mipmapping on the fly for the textures, so that hopefully we can verify that this is why this texture is not showing up. I'm going to go to the Breakpoints window, which is where you can set up your script, and I'm going to have a script that turns off mipmapping. A logical place for me to do this is after every bind texture call. So after each bind texture call I'm going to call glTexParameteri with the target texture 2D, and it sets the min filter to linear as opposed to linear mipmap linear. So let's go ahead and do that. I'm going to attach the script through the actions here, so I open up my script... You know, Chris, while you're doing that, I want to jump in here. Sure. You'll notice that whenever Chris is up there looking for a function, he's not moving the mouse around; he's typing, because it finds it. Yeah, Chris is a real keyboard-oriented guy, and so we put in these ways to find functions fast by just typing. Okay, so let's attach that script. For this particular script, I have some mipmap scripts I just specified, and I'm going to have it execute after the bind texture call. You can have it execute either before or after, so I'm going to have it execute after, and after it does execute the script I'm going to have it continue. You can otherwise have it pause and show you the state after the script's done. So let's watch this attach, and as you can see, on the fly we've corrected that, and everything looks much better than it did before. Well, this is a live demo, ladies and gentlemen, that just really worked. Awesome. Yeah. So, back to you. All right, thanks Chris.
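The white grass texture in the demo is what an incomplete mipmap chain looks like: the min filter asks for mipmaps, but not every level was supplied. As a rough illustration only, the sketch below models that rule in plain Python without calling OpenGL. The function names `mip_level_count` and `is_texture_complete` are hypothetical, and the model assumes the default base and max levels, so a full chain runs from the base image down to 1x1.

```python
def mip_level_count(width, height):
    """Number of levels in a full mipmap chain, from the base image down to 1x1."""
    levels = 1
    while width > 1 or height > 1:
        # Each level halves both dimensions, clamping at 1.
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels += 1
    return levels


def is_texture_complete(width, height, uploaded_levels, uses_mipmap_filter):
    """Simplified completeness check (hypothetical helper, not an OpenGL call).

    uploaded_levels: set of mipmap level indices the app actually specified.
    uses_mipmap_filter: True if the min filter samples mipmaps
    (for example linear mipmap linear), False for plain linear or nearest.
    """
    if not uses_mipmap_filter:
        # Without a mipmapping filter, only the base level is needed.
        return 0 in uploaded_levels
    required = set(range(mip_level_count(width, height)))
    return required.issubset(uploaded_levels)


# A 256x256 texture with only level 0 uploaded:
print(is_texture_complete(256, 256, {0}, uses_mipmap_filter=True))
print(is_texture_complete(256, 256, {0}, uses_mipmap_filter=False))
```

Under this model, the script Chris attaches (calling glTexParameteri after each bind to set the min filter to linear) corresponds to passing `uses_mipmap_filter=False`, which is why the texture shows up even though only the base level was specified. The other fix he mentions, specifying every level, corresponds to adding the missing indices to `uploaded_levels`.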
Okay, so let's move on to the OpenGL Driver Monitor, the second tool in our suite. We can call it a suite since it has more than one tool: it has two. Okay, where Profiler attaches to your software and shows how your software's interacting with OpenGL, Driver Monitor attaches to the hardware and shows you what's going on in the GPU. Now, earlier versions of Driver Monitor had these really bizarre, obscure parameter names like "dart wait time" and stuff like that, one of my favorites, and we got a lot of questions like, what does that mean? So we developed a decoder ring to say, well, when you look at these arcane, cryptic parameter names, this is what's really going on, and you had to go to this URL to get that. Well, for Tiger we built all that into Driver Monitor. So not only are the parameter names text that's even sort of human readable, you can roll over one and it'll pop up the decoder ring for that particular parameter and tell you what it is. You're welcome. Driver Monitor does remote monitoring too, which means, if you have a full-screen app, it's pretty hard to run another app on top of it and see it; well, you actually can't. So what you can do is run your full-screen app on one computer, and then as long as you're connected on a LAN with a second computer, you can run Driver Monitor on that second one and monitor the other GPU over the network. Okay, let's have a demo of Driver Monitor. So I'm going to show Driver Monitor in use. I'm going to start up my application using Profiler, just because it's handy; I've got my list of applications, and I've launched it and everything looks good. Let's bring up Driver Monitor, and so we've got the list of all the parameters. And, whoops, you can set Use Descriptive Names; by default it will be like this, and we've got actual English, but you can change back to the old names if you like those, if you're weird like me, I guess. Nerd. Yeah. And you also have mouse-overs, which explain everything that
you'd want to know about these things. So here, right now I'm viewing on the graph the current free video memory and the texture page-off and page-on data. Let's switch this to linear, and we see we've got about seven megabytes of VRAM, eight megabytes of VRAM free, and we can see that there's only about one or two megabytes of texture data being paged on each frame. If you don't understand what any of these things are, you can always just go over these in your free time, set it up, figure it out, and pick the ones that you think are going to work for what you want to figure out. So let's actually make this window pretty large, and... oh, this video card must have more VRAM than I expected. Looks like they really beefed up these demo machines. That's to make us look good. I don't know. Let's see. So we can see that the current free video memory is bobbing up and down and is low. Let's go up to 1 gigabyte. We see that the texture page-on data has reached about 400 megabytes per second, simply because I've made the window so large and I've got multisampling and all those nifty features on, so it's taking up a lot of VRAM. So because I've used Driver Monitor to figure this out, I can see that it's the VRAM issue that's causing it to slow down so much when I'm at full screen. I decided to fix this by using compressed textures, which allow me to stay at the same resolution while using only a quarter of the memory in VRAM for these textures, using some OpenGL extensions which allow you to do this. Let's do that again and look at Driver Monitor. We can see that the VRAM has flattened out. Well, it's a little bit faster, but in more extreme cases you'd see a huge benefit from doing things such as compressing textures and saving your VRAM. And there are so many other things that you can check; you can see what
where your time is being spent using Driver Monitor. So, all right, that's it, thanks. All right, quick words on what's new for Tiger in Profiler and Driver Monitor. You saw the single control panel, you saw the decoder ring built in, and the new trace info stuff for the call trace. We've also worked on better integration between OpenGL and Shark. A lot of you have said, all my time in the Shark trace is being spent in gldGetString; what is that? Well, it's not really there. We fixed it. Also coming in Tiger: remote profiling. So as with Driver Monitor, where you can hook up across the network and monitor the GPU of a full-screen app, you can do the same thing with OpenGL Profiler, so you can run your full-screen app really in full screen and get the full OpenGL Profiler benefit. You're welcome. Quick note: you need to have the same OS and Profiler versions running on both computers to make that work. Okay, now to wrap up, let's talk about how performance tuning is really a balancing act. We've seen here a lot of talk about how you can improve your GPU usage, and you use Profiler and Driver Monitor to do that, but you also have a CPU in the computer, so you need to be sure that you're on top of its performance too. So your performance improvement cycle is going to work like this. First of all, your GPU usage might be very, very high and your CPU usage low. As a ratio, if the whole app is one hundred percent, GPU to start with might be way up in the high 90s and CPU down in the tens. Then you use Profiler and improve your performance on the graphics card. Well, now what's going to happen is your GPU usage as a percentage is going to drop, maybe driving your CPU usage higher. Switch over to the CHUD tools and Shark and start driving that CPU usage back down. Well, because of the ratio, that's going to start pulling your GPU usage up, and this is a cycle. What you want to ultimately get towards is where they're just about fifty-fifty balanced. You're never going
to be, you know, that's a perfect ideal you may not reach, but that's how you would use these tools in conjunction with each other.
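Returning to the z-fighting demo, the relationship between glFrustum's near and far planes and the depth values Profiler displays can be sketched numerically. This is an illustrative calculation, not something shown in the session: it assumes a standard perspective projection as set up by glFrustum and the default depth range of 0 to 1, and the function names and the sample distances (a scene between 10 and 40 units from the eye) are made up for the example.

```python
def window_depth(distance, near, far):
    """Depth-buffer value in [0, 1] for a point `distance` units in front of the eye.

    Assumes a glFrustum-style perspective projection and the default
    0-to-1 depth range. Hypothetical helper for illustration only.
    """
    if not (0 < near < far):
        raise ValueError("a perspective frustum needs 0 < near < far")
    return (far / (far - near)) * (1.0 - near / distance)


def depth_range_used(scene_min, scene_max, near, far):
    """Fraction of the 0-to-1 depth range covered by the visible scene."""
    return window_depth(scene_max, near, far) - window_depth(scene_min, near, far)


# Same scene (10 to 40 units away), two different frustums.
loose = depth_range_used(10.0, 40.0, near=1.0, far=100000.0)
tight = depth_range_used(10.0, 40.0, near=10.0, far=40.0)
print(f"loose frustum uses {loose:.3f} of the depth range")
print(f"tight frustum uses {tight:.3f} of the depth range")
```

With these sample numbers the loose frustum squeezes the whole scene into well under a tenth of the 0-to-1 range, while the fitted frustum spreads it across all of it. The depth values themselves always stay between 0 and 1; only how much of that range the scene occupies changes.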
