WWDC2000 Session 410

Transcript

Good morning, everyone. Please take a seat; we're about ready to get started. If you are in the overflow room, we have plenty of room still here in the main hall, probably the only time that's true today, so come and take advantage of it. So glad to see you all here this morning. How many of you made it to the community bash last night? Did you have a good time? Meet some people, learn a few things? Excellent. So as you can see, we got tired of having Bill preaching at us from the aisles, so we decided to just bring him up on stage and make it a lot easier on all of us. We're pleased to have Alex Cone and Bill Bumgarner here from CodeFab. Between them, these people probably have a good fraction of the WebObjects experience on the planet, so we're really pleased to have them here to talk about how to tune the last drop of performance out of your applications. Alex and Bill.

Good morning. Thank you all for getting up at the crack of dawn and getting down here. Next time we'll make sure that this session is later in the day so everyone can sleep in. That's the first trick to optimizing: make sure you get a good night's sleep.

OK, so basically this is about the good stuff. You've got an opportunity to do killer high-performance applications when somebody wants to do the big, high-transaction-rate e-store that's going to make a bazillion dollars, and this thing has actually got to work. It's got to work well, with a lot of people doing a lot of transactions through it. In truth, you have to do a little bit of thinking and planning before you can get something that works exactly the way you want it to: ultra-high performance, and also reliable, also scalable, and so forth. We're going to try to give you some good pointers on how to make a WebObjects app that can take all that punishment and make it look easy, and we want to help you guys build an
application like that and make it look easy.

OK, so this is what you're going to learn: how to do leaner, meaner, faster, nastier WebObjects applications that will kick everyone's butt. Most importantly, how to optimize them before you build them, and then what to do when the application turns out not to be quite as fast as you wanted it to be, and how to get that last ounce of performance out of it.

So what can go wrong with an application? What will stop an application from running perfectly with billions of transactions and not blowing up? Well, it can work too hard: the application server is cranking away at 100% and there are no spare CPU cycles, so response times start to slow down because you're competing for actual cycles on the server. Similarly, you can run out of memory, and the machine works too hard getting stuff in and out of swap space. You can be bound by the network: literally have trouble getting enough packets in and out of the machine. You can be bound by the actual code you've written: you're doing too much per request/response loop and it takes too long to get those responses out. And of course a common one is that the database is working too hard, and it takes too long for your database calls to come back.

The first three can be fixed by spending more money. CPU bound? Buy bigger computers. Memory bound? Pack those things full of RAM. Network bound? Buy a bigger pipe. In general, when you're building a big site like this, you have to have enough computer power to do this kind of thing. It's not going to run on some Windows box in the corner; you're talking big Solaris machines with lots of CPUs in them. Similarly, you're talking about gigabytes of RAM. The basic idea is that you want to
get as close as possible to not swapping at all. Similarly, you want to buy CPUs until you've got a lot of idle time left on them, because when you get that Slashdot effect going on, you get mentioned in the popular press and suddenly you have ten times the people hitting your site as you used to. When it's running at its normal rate, you want the CPU usage down where you've got some headroom. But the last two problems require optimization, and that's what we're here to talk about. Should I start from here? Sure.

OK, so one of the things with optimization is that you don't want to optimize something before it works. However, there are some things you can do before you start coding that will lead to an application that is both relatively optimal, or at least passable, and something that can be optimized. One of the philosophies we work by is: make it work, then make it work right, and then make it fast. It's very important to make it fast last. The big mistake people make is trying to optimize something before it exists. What we're going to talk about is what you can do in the design, and then in the initial coding process, to really lead to an application or server solution that can be scaled, managed, and extended.

Within that, understand usage patterns. You want to optimize the most-used areas first, so focus your design around the areas that you think the users are going to use. If you're talking about a large site and you have the budget, do some use testing, do some focus testing, where you take some of your potential user audience, run them through the design, and get their feedback. There are companies out there that do this.

The other thing is: make your entry page fast. One of the things you'll see on the web, especially when you get nailed by the Slashdot effect or get touted on AOL or something like that, is that you'll get hundreds and
hundreds of users that will hit your entry page and go no deeper. Kind of depressing, but it's the reality. To that end, one of the things that Dave Newman mentioned in his security session yesterday was very interesting: if you're talking about a site (not even just a site with logins, but in general) where you don't need to create a user session when the user hits the entry page, you can avoid a tremendous amount of overhead by not creating an individual user session at that point, and that can be a great boon to performance.

Plan your business logic around response generation. We commonly find ourselves building back-end applications like entry tools, and while those are more complex, what you really want to do is design your business logic around how the front end, the high-traffic piece, is going to be used. You want to avoid repeating expensive calculations: use caching, or just avoid the expensive ones altogether. Use less precision, don't provide as much information, or provide user interfaces where, if the user wants the expensive information, they have to drill down a little bit and ask for it.

Retain and reuse data, and know when it is out of date. That's a huge issue. There's an EOF caching session; I recommend everyone go to that. And manage your cached data carefully. This is a huge issue as well. Think carefully about how often you really need to refresh that data and how you're going to go about doing it. One common mistake is invalidating your cache simultaneously across all your applications, so every single app hits the database at the same time. Bad idea.

You also want to minimize your memory footprint. By doing that you can run more instances, which gives you more opportunity for scaling by spreading your traffic out across instances. Share data across your sessions. What that means is that when the application starts up, you pre-cache
information. You want to clean up thoroughly, and you want to clear transient instance variables you no longer need. What that means is that if you're doing Java programming, just because you have a garbage collector, don't be a lazy programmer. When you're done with something, set the pointers to null. Sorry, not pointers: set the variables to null. Not only will that make your code cleaner, it also means that you're going to avoid issues like using a variable that you really didn't mean to use anymore. It also means that when the garbage collector does run, it has less of an object graph to traverse.

Use stateless components. A stateless component is a component that literally has no state within it; it's cached by the application, not by the session. Use shared sessions if appropriate. If you're talking about a news site, maybe you don't need user sessions at all. And set the session timeout value to something appropriate; you don't want these things sitting on your server forever.

You want to plan your data access: your queries, caching, and cache updating, and understand your data latency. You really want to try for zero data requests per response. Now obviously you're never going to achieve zero, or else your app isn't going to display anything, but you really want to minimize those, because if you can minimize the trips to the database you can really improve app response time. It also leads to better scaling if you have fewer applications talking to the database simultaneously as you hit that huge flood of traffic. If you're doing your caching and you're sharing that data across sessions, you can increase your scalability, because as your traffic peaks you don't have sudden bursts of activity against your database.

Use in-memory searches where possible. Obviously, any time you can avoid traffic across the network in your server environment, you're going to get a huge boost in efficiency. You want to manage your faulting and manage
your caching. Again, this is just about making sure your data is up to date while at the same time making sure you're not expending huge amounts of processor time, network bandwidth, etc., updating these caches. And use the shared editing context for reference data. The shared editing context functionality was new in 4.5, and it allows you to share data across sessions relatively easily. If you've got to go to the database and read the upcoming calendar events, every session doesn't need a copy of that.

You can use time outside your request/response loop for housekeeping, or you can very carefully manage the time you spend on housekeeping inside the request/response loop. For example, you can load reference data at application startup. Instead of forcing the first person who hits the site to fill your caches, use the ApplicationWillFinishLaunching notification to pre-fill them. You can use timers or perform-after-delay to do database access, cache invalidation, cache updating, etc. You've got to be a little careful with that, because of the way WebObjects works and the way threading works; you can end up with a threading issue. But there are discussions of that on the Omni Group mailing list, which everyone should be subscribed to. I'm going to try to keep this high-level.

Serialize and lock request handling. That's very important when you get into really advanced WebObjects programming and you start doing things like multi-threaded cache updates or cache invalidation on timers. This is kind of a warning; we have scars. You want to make sure that you're locking your request handling when your caches are being updated, because you could find yourself in multi-threaded situations that can lead to some serious data destruction. These are, again, things that have been discussed on the Omni Group list, and we'll
not really go into them.

You want to partition your functionality into multiple applications. One of the temptations with WebObjects, given how easy it is to add functionality to applications, is to make a monolithic application that just does everything. Part of the appeal is that it's very convenient: if you have a single session for the user, and that session contains everything in the world, it's very efficient. Well, yes it is, but at the same time it also greatly limits your scalability. Suppose you spread the user session across multiple applications, so that, say, the user comes into your site and browses in one application, and drilling down into a product or doing searches happens in a different one. Then you have much greater opportunities for scalability: if the search tool proves to be a bottleneck, you can just run more instances of it, versus having to run more of your monolithic application.

Move expensive operations from the live site to data entry. That's just about building administration tools. When you're building larger sites, think about building a front-end application that the users see, optimized and oriented to performance, and a back-end application that the administrators see, optimized for flexibility and power in manipulating the business. By doing that, you can move your expensive operations into the back end.

If you're doing something like storing images off on the server, you can keep track of information like how big the images are. If you've got to do other sorts of processing that's necessary to prepare the user interface for the person visiting your site, move those calculations to data entry time. When I upload the image, let's calculate the height and width of the image and store that information in the database, rather than grabbing it at runtime. There's a variety of things like that. If you
can compose images by compositing and store them when you enter the data, rather than when the user comes to view the data, you can save time. Likewise if you construct cached HTML pages when you're doing the data entry and thus have information predigested, ready for the site to use. Because a small number of people use the administrative application, relatively infrequently, moving functionality from presentation time to data entry time will significantly increase the speed of your presentation-time tool, which is actually the thing that your speed is all about. Nobody really cares if the admin tool is fast or not. Yes, they'll complain a little bit if it takes too long to save something, but the real throughput that you're looking to optimize is in the thing that the customers see.

One of the real benefits of the WebObjects environment is the ease with which you can create modules and assemble them, and that's really the last three points here. One of the things you can do is create a different view of your data for the front-end application versus the back end. The front-end application is generally read-only; if you're in a store, it's not like the customer can edit the price. If that's the case, the front-end application doesn't need the business logic, or the expense of supporting it. So you can create, say, two EOModels: one that has a very simple view of the data and is optimized for speed, and a second EOModel, used by the entry tool or the administration application suite, that is optimized for the functionality and power required by the business managers. All of this can be leveraged through frameworks. Generally what we're finding is that our applications end up being extremely thin. There'll be almost no code in the
applications themselves; all they do is load a bunch of frameworks. Everything is in the frameworks, and by doing that you can reuse those frameworks across as many applications as you need to realize the site.

Minimize the use of frames in the user interface. It's just an optimization; clearly, where frames are necessary, they're necessary. But frames can cause a lot of issues. They can cause a lot of extra traffic against your site as well. When you're doing dynamic applications where you have to update content across multiple frames, you'll find situations where you end up having to reload the whole page, which means you get one hit to load the frameset and one hit to load each frame, and that can be very, very expensive. It can also lead to a lot of bugs, because you don't know what order the browser is going to load the frames in. It's going to request them simultaneously, and whichever one gets there first is going to be the first one to load. And if the user hits the stop button, then one frame loaded and the other one didn't. How do you know? You don't. So there can be a lot of confusion there.

Use direct actions wherever you can. Direct actions are wonderful. Not only do they allow for bookmarkable sections within your application, they can also be very, very fast, because they don't go through the full request/response handling. They don't have to do form processing or things like that. You can certainly use form values, which work with the direct action request handler, but you don't have to.

And beware of mixing Java and Objective-C. Yes, the environment does support mixing these things in just about any way you want to, though there are a couple of subtle limitations you can run into. However, there are some serious performance issues with going across that bridge between Java and Objective-C, and it should be avoided. It's also very difficult to debug. OK, so I'm
going to turn it back over to Alex here.

OK. So those are some good pointers up front. Think about it when you're structuring your application, taking apart your problem and figuring out how you're going to go about building a solution. So we've organized things well, we've done some good planning about database access and about caching our data in order to minimize our round trips to the database, we've thought through our framework design and everything else, and the app is done and it's up and running. But, OK, it's a bit of a pig. Or maybe the opposite situation: you've inherited a pig that somebody else built, and now it's time to figure out a way to make that pig fly.

All the planning up front is well and good, but as we all know, no good plan survives contact with the enemy, or reality, or your customers, or what have you. There's a limit to how much you can get right by planning up front, because you get halfway through this thing and the design changes are remarkable: your client calls you up and says, look, our business model has changed. So by the time you've got the app written, it doesn't always end up exactly the way you thought when you started out.

So, OK, it's a pig. It's a little too slow. It takes more memory. Damn, that's a big application instance size, isn't it? 50 megabytes, and nobody's even started a session yet. You make this request, and on my machine that's 2 seconds before it even starts loading in my browser. Now you're testing this with a hundred users, and the CPUs on your multiprocessor Sun are pegged. All right, now what do you do?

First of all, don't be foolish. This seems like trivial advice, but it's actually good. We've had some situations where we've had clients who got stuck with
WOCachingEnabled turned off for debugging purposes, and they managed to get into production with the big site and caching turned off, so every time somebody loaded any WebObjects component, it was loaded from disk. In fact, the particular client I'm thinking of had actually coded the application to turn this off explicitly in eight places. So everybody kept saying, "Ah, I found it," they'd take that line out, and it still sucked. This sort of thing doesn't really show up when you're doing desktop development; the pages load fast because the app's right there anyway. But once everybody starts hitting it, reading all that stuff off disk uses up a lot of resources on the server.

Make sure WODebuggingEnabled is off. So you go to Monitor, you add the application, and it's running in Monitor. Well, Monitor doesn't automatically turn off WODebugging. You have to actually go in there and edit the command-line arguments and say we don't need all those debugging messages spewing to the log file while the application is running in production. So turn that off. And of course the corollary is that when you're actually doing logging, use debugWithFormat as opposed to, say, logWithFormat, which doesn't get turned off when you turn WODebugging off.

In general you can achieve some speed pickup by having your action methods that return the same page return the context's page as opposed to returning nil. What that does is short-circuit the action processing within WebObjects. Basically, what happens is that WebObjects invokes the invokeActionForRequest method; that's the thing that says, OK, which button did the user click on? Oh, it was this button; I've got to do something. Well, when you've got to do
something, if you return nil, WebObjects doesn't know that you actually did anything, so it's going to keep searching for whatever user interface element was clicked on. Returning something, in this case the context's page, simply reloads the page you're already on, which does exactly the same thing as returning nil. But what it does do is give WebObjects a signal: you can stop searching for whatever the user clicked on or whatever the user did. Depending on how complex your page is and how many nested components you have, this can save significant amounts of CPU processing.

OK, so you've got to start cleaning this stuff up. Where do we start? Well, we've got to start with the most frequently used bits. This is the classic thing about all optimization: you could have this one page that totally sucks, but nobody goes there, so don't bother worrying about that one for now. Start off with the stuff that they usually do. You've got a store; they're doing a certain amount of browsing, they've got the whole checkout process, and so on. Handle that. If you're dealing with the lost-your-password section of the site and trying to optimize that, you're optimizing the wrong thing. People don't spend their time there, and they don't complain that the little form to email me my lost password takes too long.

So know what your users are actually doing. Use the WOStatisticsStore logging. This is a great thing, and it's only gotten better in 4.5. It gives you a tremendous amount of information on which pages the users are using, how long they're taking, and what the average response time is. You look right down the list: look, this one's got an average response time of ten seconds. OK, sometimes it gets by with half a second, but we've got thirty-five seconds here and there. That gives you a good indication of where to go to start cleaning stuff up. Capture
your direct action activity. Most of the WOStatisticsStore logging deals with component actions. It'll also keep track of your direct actions, but you can code your direct actions in such a way that you're always going to the same action method, with some other arguments to tell it what to do. The statistics will show that, OK, the default direct action has 50,000 hits on it, but that doesn't really tell you much if you go to the same method and then branch based on other conditions. So put in some logging so you can tell which direct actions are doing the most work. And then tune the most visited areas first.

All right, this is generally where your butt gets bit the most. By and large, WebObjects applications are big database applications. After you've cleared up the fact that you were running this on too small a computer, or you cheaped out when it came to putting RAM in the machine, or what have you, generally what it comes down to is that your bottleneck is talking to the database. So you need to start out by making sure that the app doesn't do amazingly stupid things with the database. A common one is the search page: if nobody fills in anything in any of the fields and hits Search, don't go off and search and return all records. Say, hello, you've got to put in at least one qualifier, something like that. That's a tremendous help, because the big search with the big result set is going to take time to do the search, move the data across the wire, and instantiate all those objects, and the user is just going to look at the first page, say, "Oh, that's way too much information," and type something in anyway. So a little bit of smart modification of the user's behavior can go a long
way.

Use fetch limits. This simplifies a bunch of things. Net-net, the database is mostly doing the same work as for a large query, since it has to process the request in the first place, but you can choke off returning tens of thousands of records by putting in a fetch limit. Nobody wants to look at more than a hundred items in a result anyway, except in very rare circumstances. If you put a fetch limit on there, bring back the first hundred or the first twenty records, and then check whether the user wants to see more, you can limit the amount of data moved across the wire, the number of objects you instantiate, the size of your cache, and so on.

It's often useful to cache search results. This is kind of an interesting design pattern. Say you go to the search page and search for all blue t-shirts for men that are medium or larger. You pick one, go down and look at that t-shirt, and say, "I didn't really want that." It turns out to be very nice if, when you go back to the search page, the results of your previous search are still there, so you can go down to the second one in the list. This may involve writing code to keep that search and its results around on a per-session basis, as opposed to clearing everything out when you go back to the search page so that the user has to do the search again. One of the most expensive operations on your site is generally these big database searches. It's the one place where the user is expecting a long response time, because you're connecting to the database and returning a bunch of stuff, but if you can minimize that, so much the better. In the last half-dozen apps I've done, the page that turned out to have the worst performance was the big unlimited database search
page. So just having the results be there automatically when they come back drastically reduces the number of searches an individual user will do.

Last but not least on this particular subject: if you have a small enough set of objects, say you're doing a store but you have a hundred or 200 products, maybe it makes sense to have a read-only set of product data in memory that you initialize when the app starts up, and do the searches against this cache using in-memory searches without bothering to go to the database. If you've got a CD store with 400,000 records in it, maybe it doesn't make sense to bring all of that into memory. But if the searches are on relatively small sets, definitely use in-memory searches. They're fast, and all sorts of precious resources, such as network bandwidth going over to the database server and the database server's own resources, suddenly disappear from the equation.

When you've got to do some fetching, let's optimize this a bit. It seems obvious, but these are definitely good things to do. You've got pop-ups, you've got reference data, you've got stuff that is constant across everybody. Fetch it at the application level in a shared editing context and keep it there. It's really easy to start coding using the default editing context for a session, and you can end up with copies of data in every session, in every editing context. You just don't need to do that. Use the session's editing context only, and I mean only, for data that the session's user will actually edit. If he's not actually changing the values in it, you can share the data with everybody else. You can have a session-specific list of things I'm interested in, or what have you, but that doesn't mean you need a
session-specific copy of the actual data. That's just a good general rule of thumb: is the user going to edit this piece of data in their session? No? Then we don't need it in the session's editing context.

OK, you get to the stage where you want to avoid doing fetches in order to draw the pages the users are looking at. A good idea is to cache and share the data that's used to draw those pages, and try to keep that cached data up to date. You can end up with a situation like the one we've had on some financial sites, where people are putting in bids and offers and doing trades: app instance A is going to put some data into the database that everybody else needs to see, so you need a good way to make sure everybody's information is up to date in a timely fashion. There's some really neat stuff you can do with inter-application messaging so that the individual applications don't have to fetch from the database every time. There was some good work that Dave Newman originally posted on doing snapshot updating, and we've done a bunch of work to modify that, but it just saves you from having to go to the database to update your cached data.

You can also use the time between request/response loops, as we mentioned in the design section. When nobody is actually requesting something, go in and update the cached data. It's a little less efficient than notifying the various app instances that the data has changed, but when it comes time to handle a request, you've got up-to-date data in your application and you don't have to go to the database.

And of course, if you've got to get some sort of non-object-based data out of the database, go ahead and use the raw rows stuff. It's quite fast, and it doesn't involve
instantiating objects. Don't try to get around the whole object mechanism using this stuff, but if you want to know whether there have been any changes to the database within a particular time frame, or something like that, you can use raw rows for certain specialized stuff, and it's quite fast.

Right, the thing that really bites you in the butt is that you have this picture of what the application is doing, and you think it's being very efficient because you optimized the design before you wrote the whole thing, and it's still slow and the database is still cranking away. So obviously you're doing fetching where you didn't expect to. EOAdaptorDebugEnabled is your friend. Turn it on and you'll see all the SQL that's being generated. You go to a page that you think involves no SQL queries, you hit the page, and it's query, query, query, query, query, and you're like, where's this all coming from? It's very easy to discover that in your WOD file you're referencing object.relationship.relationship.value, and despite your smart caching, or preloading the data up front, you didn't use any prefetching or anything like that, so all the stuff on the other ends of these relationships hasn't been fetched yet. So you visit the page, you've got some binding there, and that forces several fetches in order to get the data to answer the binding.

Especially bad is when you're just testing to see whether or not to show a component: does this object have one of these things on the end of this relationship? So you fault in the relationship only to find out that no, it doesn't have anything, and your app isn't going to display anything anyway. Be very careful about what you bind to and how you answer some of these questions.

This actually
raises an interesting point about WebObjects in general. I know virtually nothing about databases. I'm lost when I hit a relational database; you give me a raw SQL window and I don't know what to do. But with EOModeler even I can set up a really complex database, generate it, use it, and do very useful things. Of course, I can't make it go fast. I mean, the power of these tools can be intoxicating, and it can lead you into some trouble. These things here, using EOAdaptorDebugEnabled, looking at the database plans, things like that, are critical, because it's very likely you're going to have someone on the project like me who can make this thing work at the object level, and it's going to be a real application, and it's just going to be a pig in production at the database level. So, yeah, there you are cleaning up after the pig. Now, one of the good things you can do to avoid excess faulting goes back to what we were talking about before, having separate data models for data entry and data display. You can have instances of, say, the product you're going to display on the screen, or the article that you display on this article page, that you've tuned for the runtime application, and you do things like flatten relationships in. So take testing to see if you have a picture: if the article has a picture, then we need to put in the picture component or what have you. If it's been flattened in, we can check the value without causing faulting. If you have this thing as a separate relationship, article.image, then in order to say "if article.image is not equal to null" you have to fire a fault. So you can optimize your fetching behavior by tuning the EO model and flattening relationships in for presentation. One of the most common mistakes that we've had to clean up (I'm sure none of you would do this, you're all very good) is this: you're
building the components up one at a time. In this component I'm going to need this pop-up list of all the states in the country, and so in your init method you write a little thing that does a fetch of all the State objects so that you can populate the pop-up, because you're just writing this one component at this moment and not thinking about it. And six other developers on the project, for six other pages that have a list of all the states, also write the same thing. And so every time these components are initialized, they go off and do the database fetch to bring this stuff in. Components come out of the cache and they are recreated, and they fetch again; components are cached in different sessions, and each one in its init is fetching in this list of 50 states. You only need one copy. It's not like the states change all that often. So you go through and you clean all this stuff up and move it off to the application, into the shared editing context. Everybody who has pop-ups or browsers or things like that (valid regions that we ship to, and so on and so forth) can get this common reference data out of one place and not try to do this stuff in each component. If need be, fetch all the objects you need, and then you can use filtering to produce the stuff that you need for each individual page. The other common mistake that involves excess fetching is, okay, you've got this shared thing in the shared editing context, and you accidentally cause stuff to be fetched into the session's editing context by not managing which objects are in which editing context. Now, if you've got to the point where the user is going to edit some object, you use localInstanceOfObject to get local copies of the object without doing faulting, I mean, basically without going to the
database to get this. Basically it's creating a new instance from the snapshot data in a particular editing context, and it doesn't require a round trip to the database. So you carefully manage when you move things across the boundary between the shared editing context and the session's editing context, and follow this stuff all around. Have a policy (these objects all live here, we'll only have this object when we do this, or whatever) and stick to it. That'll soup things up a bunch. Okay, optimize your EO models. Again, there's a tendency to go batshit on your EO model and come up with the perfect normalized, abstracted EO model where everything is an object and so on and so forth. I mean, we had one client who had a table for gender objects, with a row for male and a row for female, so that all the people who'd signed up for their site could have a reference to either the male object or the female object. It's like, oh, please. Use flags. Simplify some of this stuff down. It may not make everything an object, but they were faulting these things in all over the place, and it's like, no. Okay, the other cool thing is EOF inheritance. It's cool, you can do just amazing things with this, and I know I've been given the EOF inheritance abuse award a few times. Think seriously about how much of the inheritance stuff you absolutely need to have in your model. If worst comes to worst, you can do a complex hierarchy for the editorial tool and simplify it for your application. But there are a lot of cases in which having a complex inheritance hierarchy hurts, especially when you're doing deep fetches: now I've got 15 kinds of users, and I want to select all users who haven't been here since last week, and I've got to do a fetch against each one of the 15 tables. Even when you're doing something like single-table inheritance, that's
going to do a fetch against table A where flag equals one, and a fetch against table A where flag equals two. So each round trip is expensive. What you're trying to minimize is not the data that's pulled across but the actual number of round trips to the data server, and so the complexity of your inheritance hierarchy, especially when you're using deep fetches, can cause a lot of round trips to the data server. Now this again brings up another point where a person like me can get you in a lot of trouble, because I think in objects. I look at a bunch of users and I think, an inheritance hierarchy, yeah, that makes total sense. But EOF provides a brilliant object-oriented interface to a relational database, and a relational database doesn't do inheritance well. Object-oriented databases do, but there are other issues there. So keep the object model simple, not because the object model being simple is great, but because it's going to make the database that much faster. And you can overload these tables, too. I mean, you can have complex objects that you use for editing, and then slap over on top of the same table a simplified object, with flattened attributes and what have you, that you use for presentation. Maybe we know we're doing some simple piece of information processing on this page, and we can create a new user entity that spans the important shared part of all the other user entities, and we'll just do a query against that. It doesn't give us the whole complex hierarchy, but it gives us enough information to answer the questions we need to answer, and it doesn't have an inheritance hierarchy at all. There are tricks you can play like that which will simplify things. Again, think about what you're going to use these things for. Use batch faulting where appropriate. What this does is, you set batch faulting in your EO model to say,
when you're going to fetch this object, fetch the next ten at once, because we might need them. Basically what you're doing is pre-populating the cache that's stored by EOF, the snapshot dictionary of your objects. But then you need to make sure that you're using that appropriately. If you've got to-one relationships in the same editing context, and the object on the other side of the to-one relationship is already in your cache, it'll go find that without faulting against the database. But if you've got a to-many relationship, it's going to have to go to the database anyway. It can't tell whether it's got all the children in there for the parent, because even though you know and I know that all three children have already been brought into the snapshot dictionary, it doesn't really have anything that can tell it the list is complete. So it has to go to the database, even if the result is that it can satisfy the to-many relationship out of the cache. Use prefetching. This has transmogrified from the earlier days to the current days, from hints to actual directives, and you can just say, when you're going to populate this object, populate these things on the relationship. This is useful when you're building up your cache, to make sure that later, when people start using the objects and following the relationships, the objects on the other end of the relationship are already there. And just beware of excess complexity in your model in general. You end up with extra pointers to various objects in there that can then cause further fetching activity, or in certain cases excess back pointers can prevent prefetching from working the way it's supposed to. So once you've set up all this prefetching, you've got to actually watch the stuff with EOAdaptorDebugEnabled to make sure that the right objects
are being fetched when you expect them to be. All right, so EOF: there's all this great stuff. It'll build your tables, it'll build your database, and so on and so forth. Really nice stuff. The one thing it doesn't do for you out of the box is create indexes. So you've gone off and created all these objects that have unique primary keys, and you're doing all this fetching based on unique primary keys: create indexes on those things. Also look at your queries and see what you're doing. People are doing these sorts of queries where they've got fields they can type values into and do a search. What are they searching on? Create indexes on those values. You can speed up your database activity tremendously by properly indexing things. If you're not quite sure how the database is using stuff, this is a great thing, and everybody who's not a database geek doesn't know about this, but this is the database propellerhead thing. In Sybase it's showplan; in Oracle it's EXPLAIN PLAN. You turn this on and run your query, and it says, well, I was going to check in this table and that table, and then I was going to gather this information here, and then I'm going to process it and do that stuff there, and it tells you exactly how it's going to go about giving you back the three rows of data you would actually get from your complex query. And one of the useful things this will tell you is, "and then, since you asked the question in just this way, I decided not to use your index and to do a table scan instead" to get the results out. By doing explain plans, fiddling with your indexes and what have you, you're going to actually make sure the database is doing what you want it to do, not what it thinks it has to do. Now, are we on time? Yeah, we go over. This was a two-hour session, right? Okay, no problem. We're getting close to the end. The other good tuning thing is to know that the database is running
exactly the way it's supposed to. It puts most of the information that you're going to access on a regular basis into the memory cache, and you can check the database statistics to find out whether it's doing that or just going to the disk every time for your data, and you can tune that. And also, just more silliness, and you may have to get somebody who's a database whiz to come in and do some tuning: on the operating system side, if you've got a multiprocessor machine, oftentimes you have to actually tell the database to use all the processors. Similarly, databases often have a bunch of parameters about how much memory they use, how much data they put in there, and how many stored procedures they put into memory. Tune that appropriately. You can have a big piece of iron basically sitting there idle because the database is trying to run in a little tiny slice of memory on one processor, which doesn't do anyone any good. Speaking from experience, databases run really, really slow when they're tuned for, say, 512 megs of RAM but you only have 256. It turns out not to work quite as well as you'd like. And last but not least, actually look at the generated SQL. It'll suggest additional indexes. You shouldn't ever need to do hand-optimized SQL and put that into the EO model; it's definitely a last resort. But once in a blue moon, based on the way you've constructed your object model and what have you, EOF may construct SQL that is less than completely optimal. And there may occasionally be special purposes where you need stored procedures. It's basically compiled SQL; it runs on the server faster than on-the-fly SQL, and it can be useful in certain circumstances. All right, now that we've gotten out of the scary database part, I'll give this back to Bill. Okay. Once you get the database going fast, because that is generally where most of the bottlenecks are, you need to start looking at optimizing your application itself and optimizing your
components. One of the first things is, there's a great temptation to componentize everything, to make everything a reusable component. That's actually a significant performance hit. Do it carefully. Simplify your component nesting. Don't make every image in the navbar an individual WebObjects component; make the navbar itself a component, things like that. Define your own compiled subclass of WOComponent and put your common functionality there. What this does is simplify your overall component hierarchy. What we always do is have a subclass of WOComponent, and every single component, be it Java, Objective-C, or WebScript, it doesn't matter, inherits from that specific subclass. By doing that, not only do we gain the benefits of sharing all this functionality across the component hierarchy, we can also push some debugging information in there, some little debugging triggers, do some logging things. It's a great place, when you start to get into debugging and performance optimization, to be able to put breakpoints and print information, et cetera. Also consider caching pages or using the new stateless components. On any page where you're not displaying information specific to the user, or even if you are to a limited degree, there's no reason not to use a stateless component. Stateless components are great: they're cached at the application, not in the session. The other thing is caching your pages. Again, if you're talking about a page where you're, say, selecting a region for some store application or something, well, regions in the United States aren't going to change that often, so cache that information. And finally, make static content static. This is one thing that a lot of people miss. There's a great temptation to serve everything from WebObjects, everything from the dynamic content generator. If you use static content you get an order of magnitude performance improvement. Static content comes straight off the disk, straight out of the web server. There's no
state associated with it. It's blazingly fast because it's exactly what the web was designed to do. I mean, in effect all these middleware things, all this WebObjects stuff, is doing something with the web that it was never designed to do, and there's a big performance penalty for making something do what it was not designed to do. Refactor your software. Once you get the thing built, once it's working right and you've found where your bottlenecks are, start to compile anything that does serious calculations. Look at optimizing your calculation engines; look at generalizing that and moving it out of the application layer and into the back-end layer, and really treat it like a serious calculation engine whose performance you want to maximize, and then use it from the upper layers. Simplify your application and session objects. This isn't really about optimization as much as facilitating optimization. Say you have something that does region management, going back to the store thing, where you've got multiple regions at the application level and there's some complex product selection or product availability by region. As an example, we did a record store. With record stores, there are certain records you can't sell in certain countries in the world, so we have a region manager. We pushed that region manager into an object of its own that you can access through the application. Doing that moves that functionality out of the application level, and it means that as we're optimizing that, or optimizing other things, we're modifying something that's relatively isolated. And finally, don't forget about the web server. Is the web server optimized for the environment? A classic example of this: okay, so you're running against Apache, you've got the WebObjects adaptor in there, you've tuned your application to the nth degree. Oops, you're only running five Apache servers. I did that once. Tune it. Make sure it has the appropriate configuration
for the amount of load you expect to have. Use a mixture of static and dynamic content. Wherever you can use static content, again, that's just going to boost performance. Direct actions allow you to integrate the static content with the dynamic content if you enable cookies in WebObjects. One of the great things about HTTP, since it's totally stateless, is that when you have a user having a user experience with your site, you have to pass around a user ID, a session identifier. Normally, by default, that session identifier goes back and forth in every single URL, so every time a hyperlink is generated in the dynamic content, that hyperlink has to have this big long nasty number that identifies that user, so that when the user clicks on it, WebObjects can figure out what session to associate that hit with. Well, you can move that to a cookie. Now, cookies have their own problems, but pretty much they're supported everywhere now. By doing that, pretty much all the URLs in your content no longer have to have user-specific information in them. This allows you to integrate your dynamic content with your static content. So, for example, again using the record store example, we may have static pages that describe albums, static pages that describe artists. Well, those don't change very often. Leave them on disk as static. Let the WebObjects application navigate over to them, and have direct actions in those pages that bring the users back into the WebObjects application. The more hits you can get against the static stuff, the better off you are. And with that, quick Q&A? Okay, do we have ten minutes after to do Q&A? Fifteen? Great, thank you. Optimize for fast browser display. This is another little war story here. We had a client, and the content generation, the content delivery, was really, really slow. This was back in the days when tables didn't really quite work right and you couldn't really specify image sizes quite right, and to do layout you had a spacer.gif everywhere. I mean, if
anyone's been on the web for more than two years, you probably remember this. Well, the path to the spacer.gif was like /WebObjects/SomeApp.woa/WebServerResources/images/spacer.gif, and all we did was put spacer.gif as s.gif at the root level of the web server, and we reduced the amount of HTML generated by about forty percent across the entire site. Smaller pages display faster. The less HTML you generate, the faster it goes out; the less dynamic content you're generating, the faster it goes out. You want to batch your displays of long sets of data. I mean, not only does the user not want to see three thousand products all at once, this makes things go faster. Show them ten at a time, show them fifteen at a time. Generate short URLs. This again gets back to the spacer.gif thing: instead of /images use /i. Do this with images as well, and with just everything surrounding the static resources that are associated with every web site. Split installs in WebObjects are very, very convenient, they're very useful. We never do them, and it's not because they don't work or anything like that. We never do them because we put all of our static resources as close to the top level of the web server as we can, we leave them there, and it reduces the amount of HTML we generate. You also want to improve the structure of your HTML. Now, this isn't as much an optimization toward a fast application as an optimization toward a working one. Use an HTML code checker such as Weblint, which everyone should do. There's the WebObjects mailing list at Omni Group, www.omnigroup.com. If you're ever planning on doing any development with WebObjects, or you're even interested, immediately sign up for that list. The reason I mention that as well is because we're going to be throwing a bunch of code out there next week when we get a chance to go back, and one of them is this thing called Weblint. What it does is look through your WebObjects HTML or your generated HTML and check
the structure of it, make sure everything lines up. Simplify your table structures. It's tempting to nest tables deeper and deeper and deeper, especially when you have an object hierarchy or component hierarchy and you want to reuse all those components. Every component needs to guarantee that it displays correctly, so it has its own little table. That's really slow. It's really, really slow in Netscape; it's just very slow in Internet Explorer. And watch for nesting problems, especially things like nested forms. If you open a tag, don't close the tag until you've closed every other tag inside of it, and always make sure you close the tag. In the HTML standard, which seems to have come out of the early browser implementations, closing a table cell, closing a table row, even closing a table, closing forms, is pretty much optional. That doesn't work when you're talking about dynamic content generation, and it's going to break things. And also, when it gets back to actual forms, one of the risks there, especially if you have forms that are misstructured, is that you can get incomplete data back to your application, or you can get broken data, and you can get a performance hit as your application goes, oops, exception, and has to go and deal with the maintenance stuff associated with an error condition, an exceptional state. The classic one is the overlap problem. I can't tell you how many times we've worked with HTML producers, or have been doing HTML ourselves, and hit just a simple HTML overlap, where you open a form, you open the table, then you close the form, then close the table. There's a problem there, and this can cause some serious problems. "The forms don't work, the processing must be broken, the code must be broken." It's like, no, it's actually in the HTML. And one of the things to keep in mind is that, especially when you're developing, you focus on a single component: I'm doing this component. Well, sometimes problems can span across multiple components, and what we like to do is we
check, we use Weblint on the components themselves as well as on the entire generated content. And for more information on this as well, please sign up to the Omni Group WebObjects mailing list, and there will be a lot more information coming out after WWDC. There are always discussions on that mailing list that follow up from the sessions, et cetera. I'm sure Dave Newman, who's making a bunch of code available, will post information there as well. Well, we hope this stuff was a good start at optimizing your application. So we've got some time for a bunch of Q&A now. The usual who-to-contact, and let's do a little question and answer. First of all, a big hand for our presenters.
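The round-trip point from the faulting and prefetching part of the session can be sketched in a few lines. This is a minimal illustration in plain Python, not EOF or WebObjects code: FakeDatabase, render_with_faults, render_with_prefetch, and the article and image tuples are all hypothetical names made up for the example, and the "database" only counts simulated round trips. It compares checking "does this article have a picture?" by following the relationship once per row with loading the far side of the relationship in one extra query up front.

```python
from collections import defaultdict


class FakeDatabase:
    """Hypothetical stand-in for a database server; it only counts round trips."""

    def __init__(self, articles, images):
        self._articles = articles    # list of (article_id, title)
        self._images = images        # list of (image_id, article_id, path)
        self.round_trips = 0

    def fetch_articles(self):
        self.round_trips += 1
        return list(self._articles)

    def fetch_images_for(self, article_id):
        # One query per article: roughly what a fault fired by a binding costs.
        self.round_trips += 1
        return [img for img in self._images if img[1] == article_id]

    def fetch_images_for_many(self, article_ids):
        # One query for the whole batch: roughly what a prefetch costs.
        self.round_trips += 1
        wanted = set(article_ids)
        return [img for img in self._images if img[1] in wanted]


def render_with_faults(db):
    """Follow the relationship lazily while rendering each row."""
    rows = []
    for article_id, title in db.fetch_articles():
        has_picture = bool(db.fetch_images_for(article_id))
        rows.append((title, has_picture))
    return rows


def render_with_prefetch(db):
    """Load the far side of the relationship up front, then render from memory."""
    articles = db.fetch_articles()
    images_by_article = defaultdict(list)
    for img in db.fetch_images_for_many([a[0] for a in articles]):
        images_by_article[img[1]].append(img)
    return [(title, bool(images_by_article[aid])) for aid, title in articles]


def demo():
    """Render ten articles both ways and return (lazy trips, prefetch trips)."""
    articles = [(i, f"Article {i}") for i in range(1, 11)]
    images = [(100 + i, i, f"/i/{i}.gif") for i in (2, 5, 7)]
    lazy_db = FakeDatabase(articles, images)
    eager_db = FakeDatabase(articles, images)
    # Both strategies must produce the same page content.
    assert render_with_faults(lazy_db) == render_with_prefetch(eager_db)
    return lazy_db.round_trips, eager_db.round_trips


if __name__ == "__main__":
    print(demo())
```

With ten articles, the lazy version makes eleven simulated round trips (one for the articles plus one per article) and the prefetching version makes two, while both produce the same rows. The sketch says nothing about how EOF implements faulting; it only shows why the number of round trips, rather than the amount of data, is the thing to watch.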