WWDC2000 Session 182
Transcript
Good afternoon, everyone. Welcome to session 182, which is Java: Getting the Best Performance. Here to begin our presentation this afternoon, please welcome Jim Laskey.

Good afternoon. We're going to talk a little bit about performance. Yesterday we talked a little bit about some of the new technology that was being introduced on Mac OS X, in particular HotSpot and some of Java 2. We're going to focus on different aspects of these two technologies, but we're trying to zero in on performance. We're going to talk about how we've made improvements to performance in the Java VM, and we're going to give you some hints or ideas about how you can modify your code to gain performance.

We've broken the talk up into three parts because we wanted to get three different categories of things to talk about. The first part will be done by Ivan Posva of the VM team, who's going to talk about the VM: the new memory management and thread synchronization. Then I'll come back for part two and talk about code execution and how we boosted the performance of code. Finally, John Berkey will come up and discuss how we can get some performance out of the AWT and Swing and the new Java classes. So I guess I'll get Ivan to come up now.

[Applause]

Well, thanks, Jim, for the introduction. So let's look first at what the factors of Java performance are. First of all, it's the design of your application; that's the most important part. Second is the speed of bytecode execution inside the VM, the speed at which we execute the bytecodes of your program. Next, and that's what I will focus on, is the speed of VM operations: speed of class loading, garbage collection, threading, synchronization, and the associated libraries. And last is the speed of the hardware you're running on and the underlying operating system. There's not much we can do about that, but we are pressing on the kernel team in that area. So the
two main attractions of the Java language are automatic memory management and language-level support for threading and synchronization. Unfortunately, these two points are also associated with having the most negative impact on performance, and we are here to clear up some of those misconceptions and show you what you can do to give the VM a hint in that area.

So first I will deal with memory management. In the HotSpot VM we have direct object references to objects; we don't go through handles as the classic VM does. That means we have C-speed access to fields of instances, and we also have basically C-speed access to static fields in your classes.

What we also have is a lower per-object overhead. We have two words per object for use by the VM. When you compare that to the classic VM, which has three words per object, this doesn't seem significant, but this is actually where we get some of the memory advantages, because Java tends to allocate a lot of small objects. Studies have shown that for the mtrt benchmark in the SPEC JVM suite, each additional word in the object header costs you twenty percent more memory; for javac this additional cost is twelve percent. So the lower the per-object overhead is, the less memory you use, and it can be a considerable gain.

The HotSpot VM uses a generational copying garbage collector, and on the next slide I will dive into a bit more detail. The garbage collector is accurate, which means we know at all times in the executing Java program where we have live references to objects. We are not conservative when we walk through the stack, so we know which words on the stack are actual integers even though they look like object references, and we can collect those objects. In contrast to conservative collectors, this can really make a big difference in memory
usage, in the sense that you're not keeping objects alive that are only apparently referenced, which would actually lead to leaks in that area.

It is generational. The majority of objects that you allocate in Java, when you actually study it, are very short-lived. So what we do is allocate all the objects in this new object heap, or what we call the nursery, and once we have exhausted this nursery we copy the surviving objects out of there into an old space and can start from scratch in the nursery. That means we have fast allocation in the nursery, and we don't have to deal with the garbage in there. We don't do anything with the garbage; we just copy out the objects that survive from the new generation to the old generation. You could say that garbage collection is actually the wrong term. It's more like a search-and-rescue operation where you rescue the few survivors out of the nursery into the old space. It is also a fast search: due to the accurate nature we know exactly who's surviving, and so the search is pretty fast.

I mentioned copying: we actually move those objects out of the nursery into a separate memory area. Studies have shown that five to ten percent of allocated objects survive from the nursery into the old generation. So if we say this nursery is half a megabyte big, we copy 25 to 50 kilobytes' worth of objects, which is not very much.

Also, since we have this copying infrastructure already in place, for the rather big old object space we compact regularly. Every time we run the old space collector we can compact the heap and keep the memory usage to a minimum. The old space collector, since it has to deal with a much bigger memory area, is also incremental, so it works on a chunk at a time every time it's invoked. That reduces user-perceivable pauses, so you don't have this big stop in the middle where
you're collecting the whole heap, but you have these many, many small pauses, which makes your UI applications or server applications respond much faster.

So the benefit to you as a programmer is very fast allocation. Since we always allocate in the nursery, and we allocate in a stack-like fashion, all we do is increment a pointer and this is our new object. All we have to do after we increment the pointer is check whether we exhausted the nursery space, and then we have to trigger this new space collection. This allocation code is actually inlined in the compiled code, so all we have to do to allocate a new object when you have the new operation is equivalent to 11 instructions. Compare that to a C-style malloc function call: you have to go through the cross-library function glue and then you have to do the C prologue, so after those 11 instructions you're not even allocating the object in the C language yet, whereas in Java you have your object and are ready to go.

As I mentioned before, we have an accurate collector, so we do aggressive reclamation of this nursery as well as of the old generation. So, for example, you don't keep objects around that are not accessible anymore. Due to these two factors, you have essentially free temporary objects. Temporary objects are by definition short-lived, which means we don't have to deal with them. You have no overhead for short-lived objects, because all we do is copy the few survivors out of there.

So I claim you have essentially free temporary objects, especially since Java is multi-threaded. If you have an allocation cache, you would have to lock that allocation cache, then you would have to take the object out of the allocation cache, and then unlock. But by that time you would have already allocated the object in the nursery.

So what do you need to do? Well, as I mentioned, please do not build allocation caches. But what you also have to do is you have to tell the VM that you're done with an
object. So what you have to do is, if you have an object that you don't need anymore, or even worse, if you have an object hierarchy, especially if you have it in a static field, you have to set that reference to null, so we know this object hierarchy or this object is not reachable anymore. That way we don't copy it out into the old generation, or even worse, if it's in the old generation, keep compacting it and so on and keep that memory alive. So you have to make sure that you null out the objects that you're not using anymore.

And while I mentioned allocation caches, what you also have to do is this. If you have native code via stub libraries, based on the old native call conventions from JDK 1.0.2 or Java 1.1, or if you have JDirect 2 code in your project, you have to convert those projects to use JDirect 3. There is a talk tomorrow, Mac OS X Java in depth, that talks about how to convert projects to JDirect 3. Or you have to convert your native stub libraries to use JNI native calls. And if you're using the Objective-C to Java bridge for wrapping Objective-C frameworks to Java, then you have to make sure that you recompile your wrapper project with the new JNI-based bridge that comes with DP4 on the CD.

Before I go into synchronization, I wanted to mention that Java threads are mapped one-to-one to pthreads and therefore are also mapped one-to-one to kernel threads. They are fully preemptive, which means that inside the VM we do not have to deal with scheduling; the kernel does that for us. We are multiprocessing-ready: as we've seen in the hardware keynote this morning, if we have something that the kernel can use to schedule more threads onto a second CPU, we in the VM will make use of that, and therefore your applications will naturally run faster as well. We also integrated native and Java invocation stacks into
one memory area, so you have better locality of reference. This is VM-internal; this is not something you will notice, but it is one of the reasons why you have to go to JNI, for example. The accurate garbage collector also makes it necessary to go to JNI.

So my last slide is on synchronization. First, I wanted to explain, when we talk about synchronization, what a contended case is. The contended case is when you're executing a synchronized block or a synchronized method in one thread, or you execute the synchronized statement on a particular object, and while you're inside that block a different thread comes in and tries to synchronize on the same object. At that time you have contention on that object. The uncontended case, therefore, is when one thread goes in, synchronizes on a particular object, executes the whole block, and exits the synchronized block without any other thread trying to synchronize on the same object.

Studies have shown that the contended case is very rare. In those instances we use pthread primitives and kernel primitives to make sure that we do the right thing, blocking the thread so it doesn't use any CPU cycles from then on. But what is much more important is that we have very fast synchronization in the uncontended case. What you have, basically, is a constant-time overhead. You have a couple of instructions to set up that synchronized block. We do not allocate memory on the heap or anything; the memory for that synchronized block is all stack-allocated, inside the invocation, on the stack. And most important of all, we don't use any OS resources. We don't use any kernel resources, we don't make any pthread calls, and we don't make any kernel calls, which is important since we want this constant-time overhead.

So I will hand the podium back to Jim to
talk about code execution.

I want to try to focus on what we've done to improve performance in code generation, but I want to give you some background. So we're going to go into a little bit of history first and then talk about optimization, and then at the end I have ten slides which are what I'm calling code generator hints: things that you can do in your code which the code generator can look at and say, oh, this means I can do this optimization. Hopefully you can pick up a few of those things and put them in your own code.

I would say, and I guess most of you would say, that at this point Java has matured fairly well. The first versions of the Java VM that came out were simple interpreters that took bytecodes: take the opcode off the bytecode, go through this humongous C switch statement, figure out what instructions were needed to implement the instruction, and go through this process one opcode at a time. That worked out, and it initially got you interested in Java because it was actually a real thing, but it turned out to be a very slow performer. So once the initial interest in Java sort of passed, we had to find different ways of speeding up the performance.

The next generation of interpreters usually ended up being a hand-coded assembler program, so that it was implemented in the native language of the platform. It would go through the bytecodes and interpret them a little bit faster, in a little bit more optimal way than the code that would have been generated for the original interpreters, and that way we got maybe two or three times improvement in performance.

The third generation, which is where I would classify HotSpot, came about because we found that the native code we created with the assembler was hard to manage. Assembler code is always hard to manage; that's why we program in higher-level languages to begin with. So a lot of these newer interpreters are
actually using templates, which makes it easy for us to just go in and insert the instructions that we want for particular sequences of codes. Then the particular VM that you're working on, in this case HotSpot, gathers all those templates together and produces an interpreter engine that allows us to interpret the code. So in HotSpot we gained another two or three times improvement in speed just by using this implementation.

Now, interpretation gets you only so far, but it's really not the end game. What we need to be able to do at that point is get into some kind of native code generation, so that we get down to the point where we're actually running at the same sort of speed that you would see in C++. So we get into code generators. The first JITs that came out used the JIT compiler interface, which is a plug-in to the classic VM. What they did was intercept the execution of methods and go through and convert the bytecodes into native machine code. With a lot of the earlier JITs that came out, basically all they really did was take the bytecode sequences and convert them one-to-one into sequences of native code. That worked; again you got another boost in performance, probably about a five times boost. But the problem is it wasn't really utilizing the true performance of the machine. It wasn't scheduling the instructions, and it wasn't looking at the sequences of instructions and whether you could get any kind of optimization.

So then we got a round of static code generators, where people would take the back end off of a C compiler, put a Java front end on it, and produce static executable code. That sort of works for some types of applications, but the problem is that Java is a very rich and dynamic environment to work in, and static applications don't really fit into what the spirit of Java is about. You need that dynamic environment to run in. So then we get into the
high-performance JITs. You're familiar with, say, the Symantec JIT that we've been using in MRJ. There are a couple of other high-performance JITs, like C2, the server compiler that's on HotSpot. What these JITs would do is basically optimize the heck out of the bytecodes and try to reduce them down to something quite close to what you would expect from a C or C++ compiler.

Now, the positive part of these high-performance JITs is that we're getting really good performance. But the problem was that there was a lot of competition between the JIT companies, and they were trying to squeeze out as much performance as they possibly could and get the best CaffeineMark they could. Working with the Symantec JIT, we were getting infinite scores on CaffeineMark because we were optimizing methods right down to the point where they were just simple return statements. The flip side, the negative side, is that it was taking more and more time to compile these things, and more and more memory, and that's where the cost was. So we have to find some kind of balance between getting the optimization done and keeping the compile time and memory requirements down to a minimum, and that's where we are with HotSpot's client version.

The traditional types of optimizations you would expect would be expression reduction, CSE, loop unfolding or loop optimization, and dataflow analysis. These are all the standard sorts of things you would see in the dragon book, I guess the standard compiler book. Typically what would happen is that you would go through a round of these optimizations and they would reduce the application down to something a little bit smaller, and then you'd have to go back and repeat them again, because then it would bring it down
again. So we have a heuristic-type algorithm to actually do the optimization, and this is really where a lot of these high-end optimizers got into trouble, because they loop and loop and loop, and could loop several hundred times before they give up and say, well, this is the best I can do with this method, and then actually execute it.

Now, one of the great things about a runtime just-in-time compiler is that you can do optimizations that you wouldn't be able to do in a static situation. One of the most important ones is being able to determine whether a virtual method is monomorphic or not. What monomorphism is about is this: in Java we have the ability to create subclasses of a particular class and to override its methods. So in order to call a particular method with a particular object, a dispatch goes on that says, which method is associated with this object? It's basically virtual dispatch. But it turns out that in most applications a little bit over eighty percent of the classes that you have in your environment are not overridden; they're basically leaf classes. So there's really no need to go through this dispatch mechanism. You can make a direct call to that method and not worry about hitting the wrong method.

The just-in-time compilers try to exploit this. One of the things that you can do, besides just simplifying the actual call to the method, is inline the code for the method that you're dispatching to. If you take a look at some examples in your own code, you'll probably find lots of places where you're calling a method that's rather simple, maybe a few lines of code. A good example would be the getters and setters for your class. And isn't it a shame I have to go off and call this method, because this code could fit right
in line and be very inexpensive. But this is what the JIT does for you. It goes off and says, oh, that's a really simple method, I've determined that it's monomorphic, there's nobody overriding this, so I'm just going to inline its code, and the code becomes very efficient.

Now, there's always the possibility that that method may get overridden (not overloaded, but overridden) at some other time. So what will happen in the just-in-time compiler environment is that it may create several flavors of the same method. You may have several different versions of that method: one for the case where the call is overridden and another for the case where it isn't. Then the version that needs to be executed is chosen at run time, whichever one suits the situation.

I should also make a point, and this is another great advantage of these types of compilers, that we can do processor-specific optimizations at run time. The great beauty of Java is that I can take this bytecode and port it to any machine and have it run on that machine. Well, even on the same general architecture, like the PowerPC, I can get different kinds of optimization on the G3 than I can on a G4, because of scheduling and so forth. So the types of things that you could do would be to say, on the PowerPC I can use mask and shift operations, or if I'm running on the G4 I can use the Velocity Engine instructions. I can do that on the fly, and then I can do instruction scheduling. Once I've done that on the particular machine that I'm running on, I can cache the code that was generated, so the next time I go and execute it I'm going to use that cached code. I don't have to recompile it, and that code has already been tailored for the machine it's running on.

OK, so what gets compiled? Now, there are probably all kinds of myths about
all these magic heuristics that we use to figure out which methods get compiled, and there is some truth to the rumors. But generally the things that do get compiled into native machine code are primarily methods that have loops and methods that have been interpreted n number of times. Those are the two primary triggers for whether something gets converted to native code or not. A method that has a loop may get executed once in the interpreter, but then reach such a time that it gets converted to native code and run as native code. As for the trigger on number of times, I said n number of times because different JITs trigger at different levels. HotSpot triggers at around 1500 executions before it actually goes and converts to native, but that fluctuates depending on different kinds of criteria.

What doesn't get compiled? Typically, say you have a method that's currently running, and it's looping and calling other things and seems to be looping for a long time. If it didn't meet the original loop criterion and didn't get compiled, it may sit there and continue to run in the interpreter. This is something that would require on-stack replacement, and in the current version of HotSpot we don't have that in place yet. Eventually we will be able to replace something that's currently being interpreted with something that's been compiled. But that doesn't prevent that method from being compiled. What happens is that if that method is called from any other point in your application, the call will use the compiled version, because it has already been triggered to be compiled.

Class initializers typically don't get compiled, because they're only run once. They usually do the initialization of their statics and create whatever things they need, and they don't need to go beyond that, so they typically don't get compiled. And finally, things are
written in Java assembler that are very convoluted, with goto structures or whatnot, where it's really hard to do the analysis of that code. We can't generate native code for them, or we may try, but it's typically not worth the trouble. The only time I've ever run into that, really, is with the JCK. A lot of the JCK tests try to see what they can do to trip up the JIT, so you don't really have to worry about them.

OK, so now I have ten hints, things that you can provide to the code generator that will actually help your performance. There are various degrees of performance improvement here; some may be a little bit more dramatic than others. You don't necessarily have to go out and feel that you have to use them all. They're just ideas that you can keep in the back of your mind when you're trying to tune your application at the end of your cycle.

The first, and probably the most important, is to write small and concise methods. Try to avoid methods that have two thousand lines of code in them, because when the just-in-time compiler kicks in it's going to compile the whole thing, and maybe you're only going to use a couple of lines of it. You've got this big case statement, and maybe two lines of it get executed most of the time and the other ones run very rarely. So what you should try to do is keep methods small so that they compile quickly, and if you've got some code that's not going to be used very often, move that code into a separate routine, so that if it's necessary it will get compiled, but otherwise it's not. Don't worry about your methods being too small, because method inlining will take care of that. The JIT will figure out a nice balance to get a nice size routine
and what inlines nicely and so on. So don't worry about the size of a single method, that's not crucial, or whether the method is too simple. And finally, you should always remember that accessor methods are almost always inlined. Even in the classic interpreter the accessor methods were often inlined. So it's good to actually use accessor methods instead of accessing the fields directly.

The next hint is to trust the supplied classes. What we try to do is look for hot spots in your code, things that take a long time, and try to gain performance. One of the things we do in our analysis is identify methods that get executed a lot and try to tune those so that they execute very quickly, and often we tune them by writing directly in assembler. So for methods in the classes String, StringBuffer, and Vector, which are used a lot, we actually have intrinsics, or built-in methods, to deal with a lot of those situations so you get better performance. So if you think you can write it better than Sun did, well, just remember that maybe we're going to do a little bit better for you in the background. That's something to keep in mind. Array copy is something we gain performance on. If you're running an array copy on a G4, hopefully (this is not currently in place) we'll be able to use the Velocity Engine to help the copy. For sine and cosine, where on Intel you would call the hardware directly, we call the library directly, so you don't actually go through the glue, and that's a bit of a performance gain.

We don't have a 64-bit architecture, so there is a cost in using long. If you don't really need long, and you're just using it because you think you might need the precision later, then maybe rethink it a little bit and go back to using straight ints. A long multiply takes five
instructions, a long divide has to call a subroutine, and a shift operation may take several instructions, so it's not as simple as it seems with long. And if things are going to overflow, there are lots of techniques to get around some of the problems you might have. I did a class library that handles the situation when you're trying to use longs to deal with the unsigned integer problem. When you want to do unsigned compares, there's a way of actually doing that without having to resort to long.

Floats versus doubles. Floats obviously are smaller and take up less memory, and in most circumstances floats and doubles are equivalent in execution. But there are some circumstances, like divide, where a divide of a double takes twice as long, or almost twice as long, as a float. So if you don't really need the precision, stick with float. The other reason why I'm recommending float is that as we progress to the Velocity Engine, the Velocity Engine doesn't support doubles; it only supports floats. So if you're thinking about declaring an array of doubles, see if you can use an array of floats instead, because that's most likely where we'll be able to apply the Velocity Engine. There's no commitment to that project; just keep that in mind.

Try to avoid the use of generic types. It actually costs to use these generic types, especially when you're doing assignment between generic types and specific types, because the VM has to do a type check to make sure that it's valid to do that. It does that check at run time, and it may have to search up the class hierarchy in order to determine whether that's a member or not. So that's just something that you should keep in mind, especially when you're doing assignment from a generic type of array to a specific array, because that means that even if you're doing an array copy, it has to validate everything that's being moved from that array over, and it has to go through a class check.
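The run-time checks described for generic types can be sketched in Java roughly as follows. This is a minimal illustration and not code from the session's slides: the class name TypeCheckSketch and its method names are hypothetical, and the sketch assumes "generic type" means a reference declared as Object.

```java
// Sketch only: class and method names are hypothetical, chosen for illustration.
final class TypeCheckSketch {

    // Assigning from a generic (Object) reference to a specific type needs a
    // cast, and the VM verifies that cast at run time.
    static String asString(Object generic) {
        return (String) generic; // throws ClassCastException if not a String
    }

    // Copying from a generic array into a specific array: System.arraycopy
    // has to validate every element it stores into the String[].
    static String[] narrow(Object[] generic) {
        String[] specific = new String[generic.length];
        // Throws ArrayStoreException if any element is not a String.
        System.arraycopy(generic, 0, specific, 0, generic.length);
        return specific;
    }

    // Same copy between two String[] arrays: the element types already match,
    // so no per-element validation is required.
    static String[] copy(String[] source) {
        String[] target = new String[source.length];
        System.arraycopy(source, 0, target, 0, source.length);
        return target;
    }

    public static void main(String[] args) {
        Object[] generic = { "alpha", "beta" };
        String[] specific = narrow(generic);
        System.out.println(asString(generic[0]) + " " + specific[1]);
    }
}
```

Declaring the array with its specific element type from the start, as in copy, keeps the validation out of the copy path, which is the point the hint is making.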
Try to use subclassing and method overloading as much as you can, because that will actually be better in the long run than writing one routine that has a generic type and then doing an instanceof check inside of it. It's better to do the overloading.

Next, copy values into locals. Some of the optimizers in the high-end JITs will actually do this optimization for you, but it's better for the interpreter, it's better for the lower-end JITs, and so on and so forth, to have you move the value out, work with that copy, and then stuff it back in if you need to. In this particular example, on the left-hand side, we're extracting a value from the table, incrementing the value, checking to see if the value has exceeded 100, and then resetting it to zero if it has. Every time you use that index, it's going to have to do an array bounds check. Again, the higher-level optimizers will take care of that and move it out, but you can't rely on that. So it's probably a good idea to move it into a separate entity. The other thing is that there is a semantic issue there: if there's another thread that goes and changes that field or changes that array entry, you don't know which copy you're going to get. So if you extract the copy, work with that copy, and stuff it back in again, you know exactly which value you're working with.

In the situation where you have multiple threads that are accessing something, you should use volatile if you're not using synchronization. Volatile is a little bit cheaper than synchronization, because what it says is that you need to reload that value every time you access it. If volatile is not there, then what will happen is a highly optimizing JIT will say, oh well, I've got this value, I don't have to reload it. But then meanwhile another thread changes the value, and you're sitting there in your loop waiting
for it to change, and it won't change. So use the word volatile when you've got values like that.

Final. We've had a lot of discussion internally about the use of final, but it's one of my favorite words as far as just-in-time compiling is concerned, because it gives me a lot of hints about what the class can be or what kinds of optimizations I can do in the class. But it's something that you don't need to overuse. Write your application, and if you feel that a class is not ever going to be overridden for any reason, for instance your application class, declare it as final. What this wins you is that it says all of the methods in this class can now be monomorphic. They'll never be overridden, so I can make direct calls, and that improves performance. It also says that if I do an instanceof on that class, all I have to do is compare to see if it is equal to the class. So for the class String, which is declared as final, the check to see if something is a String is a simple compare to see if the classes are equal. I don't have to search up the class hierarchy in order to find out what's going on.

The other use of final, of course, is on statics, and this says this value is constant; it's not going to change. So once the just-in-time compiler knows it's a constant, it will just grab it and say, OK, I've got this, I can apply it to optimizations in the code. In this code sequence, what I can gain is that I know the allocation of that character array is a fixed size, so all I have to do is increment the allocation pointer by a fixed size for my array. And in the loop, I know that the loop is going to iterate a fixed number of times [unclear], so I can actually get rid of the loop and maybe do a blanket initialization of that array to spaces.

So if you have a choice
between declaring a class hierarchy, a virtual class hierarchy, or using interfaces, you'll get better performance from virtual calls than you will from interface calls. That's because a virtual call requires a simple index into an array to get the address of the method that you want to call, whereas an interface call requires an actual search of the class to make sure we find the implementer, and then it does the indexing. So there's a little bit of overhead. Now, HotSpot is very clever, in that it actually caches the last method that you called from a particular call point, so it's a little bit better in HotSpot, but it still has to go through a verification to make sure that what is being passed through really is an instance of that class.

Limit the use of JNI and JDirect. What I wanted to convey to you is that if you feel you can get better performance out of C, you should probably rethink it a little bit, and think that Java is the way to actually write your code now. Try to avoid going off and doing things in C as much as you can, because the optimization levels that you're going to get in Java will be pretty close to C, if not better, depending on which level of optimization you're doing. When you're using JNI, there's a translation layer that has to take place: you translate into the C side of the system, and then coming back again you have to actually do a lookup for the methods in order to find out which method is called back into the VM. So use Java as much as you can.

And so in conclusion, I just want to repeat what Ivan said earlier: the best thing, first of all, is to make sure that your application has a good design. Okay, and once you have a good design, then go back and look at places where you need to improve performance. We didn't talk about performance tools here, because we're not really finished with them yet, but HotSpot is actually
released with an option, -Xprof, which will give you a profile of the methods that have been executing and what percentage of time you spend in them. As we go on, we're going to have better tools, such as hprof tools, that will give you much more detailed reports on performance. So get a good design for your application, then go back and start tweaking it, and maybe apply a few of these hints so that you can get the just-in-time compiler to produce better code for you. Okay. [Applause]

So hi, I'm John Berkey, and I'm on the AWT team, and I'm going to talk about a slightly different side of performance, and that's how we can work together to improve performance, specifically from the framework level. A lot of what you just heard is about how to write good methods and good classes yourself, but from the AWT guys' perspective, we're more concerned with just highlighting a few things that will help you use our frameworks. So anyway, there are five major areas I'm going to cover. They're here for you to read, but basically you'll see a lot of usage-pattern kind of stuff.

So for image creation, the main thing is there's a new call in Java 2 for getting a compatible image. For all the normal usage, specifically Swing and the kinds of things that are in the Toolkit class, a lot of that will take care of this for you. What this will do is, depending on the decisions we make based on device depths and stuff, make sure that it's the optimum image. Most users of images don't need to do anything other than this. If you do need to dive into bit-level manipulation of the image, number one, make sure you really want to do that, and then go into the imaging classes and be real careful about the ones you pick. One of the cases here is that there are some image types on Windows, for example, that aren't as common on the Mac, and so they may not perform as expected. So again, the compatible-image stuff should be your first choice.

The next
thing is a thing called rendering hints. If you have a graphics object, you can both get the list of default hints and also set your own. The basic idea is that with Java 2 there's an ability to do a lot of really nice graphics: you can do anti-aliased text and anti-aliased primitives, you can do image blitting with different kinds of convolutions, and do really high-quality work. But the fact is that for a lot of what we do today, including most of our normal GUI framework operations, we don't need quite that quality. That's why, for example, a lot of these will be defaulted to lower quality for things like what Swing uses. That's also because a lot of times in GUI framework building, anti-aliasing can get in the way and cause fuzziness and other artifacts that you experience. The key here is that we haven't actually fine-tuned all that stuff yet in our implementation, but what I recommend is that first you get familiar with these and experiment with them, understand the different ones (I will explain them in a sec here), and then, as we move towards shipping, you try these again with our final candidates, because then you will start to experience differences. This is really important for us, because we can make serious changes in our implementation when we're in fast mode.

So rendering is a key. I've actually deleted the KEY_ off the front of these, but they're in the AWT RenderingHints class. Rendering is a key that you can pass into this little hash map, and you can say basically quality or fast. Anti-aliasing you can turn on and off for both primitives and text, and you can turn on fractional metrics to specify sub-pixel positions for text. One thing I'll point out here too is that there's a new way to do text in Java 2, which is glyph vectors, and it is the highest-performance way to do text. Don't assume that you can do those kinds of things better yourself. If you really are doing text stuff and you want to see high quality and speed, look at the way
the Swing examples do this stuff before you go and write your own, because they're using glyph vectors, and that is the fastest way to do text. So fractional metrics comes up because there's an additional cost with doing sub-pixel positioning of your letters in your text rendering. Dithering: there are a couple of different choices there; again, it's quality versus speed. Interpolation: there's bilinear and bicubic, and that's for your image blitting. Basically, your image will look a little better when scaled to different sizes if you use the higher-quality ones, but it's slower, so keep that in mind. Alpha interpolation: same thing, quality versus speed. And color rendering: same thing, quality versus speed.

So, bitmap image manipulation. This is pretty key for our platform. We're optimizing first to make Swing apps and normal usage fast. What that means is we actually are not going to use the same data buffer types internally that are used in the Windows implementation, and that's because then we can do hardware-accelerated blits between our offscreens and the screen. In order for us to do that, we have a different implementation class under there. So don't assume that you can typecast down to specific data buffer types; they won't be down there. If you're doing this yourselves, you can create the other types and they will work, but they will be slower, so just be careful.

Again, this next one is kind of obvious, but there's a whole bunch of APIs in the new imaging stuff that are useful for some high-quality advanced imaging work, and they were developed with that in mind, but they're not as useful for the typical case. So I'm calling it low-call-frequency methods. The point is that on WritableRaster and BufferedImage there are two basic ways to do things. These are the ones that I think are good for the typical case: basically, you can pass in a whole rect, of whatever size you want, of pixels to push between your array and your bitmap or raster, and this is better because you can
control the frequency of the operation for the number of pixels you want to do. So for example, if you've got the RAM for it, you can basically have a full copy that gets pushed across and make one call to copy. Or if you want to go the other way, I would recommend at the lowest per scanline: you can make a method call per scanline. That's much better for speed than these other ones, which are existing API but which you want to beware of, and that's where you basically make a method call per pixel. That's shooting yourself in the foot in most cases.

So, double buffering. This is kind of interesting. On Mac OS X we double buffer all Carbon windows right now, in most cases. What that means is your window is already double buffered, and we'll take care of flushing that buffer efficiently. If you want to have the effect of double buffering, use the Swing stuff, because on Windows that stuff will be double buffered, and on our platform you'll still just be double buffered. Whereas if you create your own Java image and draw into that, which is really easy to do in Java, as most of you know I'm sure, and then blit that image, you'll actually be triple buffered on Mac OS X, which is kind of wasteful. So again, we take care of it in Swing. What I do in the case where I want to implement this is I just, again, use the Swing stuff, new JPanel etc., and then it will all just take care of itself.

So there's another issue for Mac OS X that's performance related. Basically, the hard work of doing live window resizing is yours as developers. We'll of course make all the primitive stuff fast, but when you do live window resizing, suddenly your component's paint method has to be fast, right? It can be called every time you move the little bottom corner of the window. So if your code can't handle that, you'll either have chunky performance when you resize that window, and it'll look funny, you know, like you have a low frame rate or something, or you'll just cause the mouse to become unresponsive. What you can do is some kind of threaded rendering. So you know the case:
typical cases are like JPEG loading, where you show what you've got and don't wait. Or if you've got just a real complicated image, maybe you just do one pass and draw some kind of simple version of it, and queue up a thread that does the rest of the work. At that point, since you're rendering concurrently, maybe you would use another buffer (in fact you'd be triple buffered), but draw to an image and then blit it later. That gets a little more complicated, because then if the size changes, maybe you want to resample and draw the image scaled. I can talk to people afterwards if there are specific examples. The main thing I want to point out is that with live resizing, your paint method can be called a lot. So that's all from my side. Jim?

I guess I really don't have much to say as far as the summary is concerned, but the environment that you'll be working in is a little bit different than MRJ, and there are going to be some things that you may have spent a lot of time trying to get performance out of in MRJ that are going to be a little bit different when you get to Mac OS X. So hopefully we've covered a broad enough area for you to keep in mind in your apps as you're working. So I guess we're going to get Alan to come up and coordinate a Q&A session. We've got time.
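
Two of the hints from the talk can be sketched in Java. This is a minimal illustration, not the code from the session slides, which the transcript does not preserve. The class and method names are hypothetical. The first pair of methods shows the table example as described (increment an entry, reset it to zero once it exceeds 100), written once with repeated indexed access and once with a local copy. The last method shows the per-scanline style of pixel access, here using the array overloads of `BufferedImage.getRGB` and `BufferedImage.setRGB`; whether the slides used these or the `Raster` equivalents is an assumption.

```java
import java.awt.image.BufferedImage;

// Hypothetical names throughout; a sketch of the hints, not the speakers' code.
class PerformanceHintsSketch {

    // Repeated indexed access: each table[index] may carry its own
    // bounds check, and another thread could change the entry between reads.
    static void bumpInPlace(int[] table, int index) {
        table[index]++;
        if (table[index] > 100) {
            table[index] = 0;
        }
    }

    // Suggested form: read the entry once, work on the local copy,
    // then store it back with a single indexed write.
    static void bumpViaLocal(int[] table, int index) {
        int value = table[index];
        value++;
        if (value > 100) {
            value = 0;
        }
        table[index] = value;
    }

    // Inverts the color channels one scanline at a time: two method calls
    // per row instead of two per pixel.
    static void invertByScanline(BufferedImage image) {
        int width = image.getWidth();
        int[] row = new int[width];
        for (int y = 0; y < image.getHeight(); y++) {
            image.getRGB(0, y, width, 1, row, 0, width);
            for (int x = 0; x < width; x++) {
                row[x] ^= 0x00FFFFFF; // flip R, G and B; leave alpha untouched
            }
            image.setRGB(0, y, width, 1, row, 0, width);
        }
    }

    public static void main(String[] args) {
        int[] table = {99, 100};
        bumpViaLocal(table, 0);
        bumpViaLocal(table, 1);
        System.out.println(table[0] + " " + table[1]);

        BufferedImage image = new BufferedImage(4, 2, BufferedImage.TYPE_INT_ARGB);
        image.setRGB(1, 1, 0xFF102030);
        invertByScanline(image);
        System.out.println(Integer.toHexString(image.getRGB(1, 1)));
    }
}
```

Both `bump` methods produce the same result on a single thread; the local-copy form just does not depend on the optimizer to hoist the repeated array access. A one-row buffer is the smallest batch the talk recommends; passing the whole image height in one call trades memory for fewer calls.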