WWDC2004 Session 622
Transcript
So, good afternoon, and welcome to the session. I'm Bill Bumgarner. I'm a manager of Enterprise Objects Framework and the Core Data framework, and I've done WebObjects development ever since 2.0. I'll also be joined by Max Mueller, lead engineer on the iTunes Music Store, which is probably the highest-throughput WebObjects application ever built. If you know of another example, I'd love to hear it. I can think of a few that were pretty high throughput, but nothing like that. So between us we've done some optimization, and we'd like to share some wisdom with you. We hope it's wisdom, anyway.

So, as an introduction: scalability and low latency are critical when you're doing web-based applications, or any kind of server-based application, and building one successfully is a challenge. It requires a lot of analysis, a lot of tuning, a lot of planning. One of the critical areas is security. Security always complicates these things; security is kind of the antithesis of efficiency and convenience, so that will challenge things. And it's not just about the code. It's about the entire development environment and deployment environment, the whole process, and we're going to cover a bunch of those points.

This is going to be a very information-rich session. You're not going to see a lot of pretty pictures. We'd also like to get through the content fairly quickly, because we've always found that talking with you, the developers who are living this stuff day to day, and answering questions about your specific problems always brings up new things and is always interesting to everyone.

So, what you'll learn a little bit about (and these are all guidelines; unfortunately I don't have the perfect formula for everyone) is how to architect apps to be fast and scalable; some of the patterns and pitfalls for success and failure in achieving a fast application that works; how you can analyze existing applications to determine where the
bottlenecks are and then address them; and finally, it's not just about the code, it's about getting it out there and into production. Once you get it into production, there are a lot more services available to you than just the WebObjects application itself. There are other tools and resources you can use, depending on your budget and your traffic rates, to keep an app up and running.

We're also going to cover things like when good apps go bad (there'll be a Fox special this fall). If you deploy on, say, your eight-proc, eight-CPU Solaris box, you can see your CPU usage hit eight hundred percent. That's always fun. You can even see it go to nine hundred percent, and then it takes down the machine next to it. There are also the problems when your working set gets larger than physical memory: disk is obviously a lot slower than RAM, and when you start hitting the disk for memory, that's bad. There's also a situation a lot of people seem to miss, which is network saturation, when packets or connections are being dropped. I know one example of a very large company, who will remain unnamed, who called the FBI because they thought they were being hacked. It turned out that they had two Windows boxes, and their traffic rates were so high that the TCP/IP stack was falling over and dropping connections. So that gets back to the analysis thing: calling the FBI to analyze the performance problems of your apps is probably not a good idea.

There's also the basic situation where your responses require too much computation. If you're going off and calculating pi to 100,000 digits to bring up your welcome page, that's going to be a problem. Your database can obviously be overwhelmed: you're hitting it way too often, or you're pulling too much data from it, or pushing too much data into it. That can be bad. And a great one is external services, because no external service ever wants to admit they're wrong.
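One way to gather evidence about an external service is to wrap every call to it in a small timer that counts calls, failures, and latency. The following is only a minimal sketch in plain Java: `ExternalCallTimer`, its methods, and the `payment-gateway` name are hypothetical and are not part of WebObjects or of anything shown in the session.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;

// Hypothetical helper (not a WebObjects class): keeps per-service call
// counts, failure counts, and latency so that a slow or failing external
// service can be shown with numbers.
class ExternalCallTimer {

    // Running totals for one named service.
    private static final class Stats {
        long calls;
        long failures;
        long totalNanos;
        long maxNanos;
    }

    private final Map<String, Stats> statsByService = new LinkedHashMap<>();

    // Runs the call, records how long it took, and counts an exception
    // as a failure before letting it propagate to the caller.
    public <T> T time(String service, Callable<T> call) throws Exception {
        long start = System.nanoTime();
        boolean failed = true;
        try {
            T result = call.call();
            failed = false;
            return result;
        } finally {
            record(service, System.nanoTime() - start, failed);
        }
    }

    private synchronized void record(String service, long elapsedNanos, boolean failed) {
        Stats s = statsByService.computeIfAbsent(service, k -> new Stats());
        s.calls++;
        if (failed) {
            s.failures++;
        }
        s.totalNanos += elapsedNanos;
        s.maxNanos = Math.max(s.maxNanos, elapsedNanos);
    }

    public synchronized long calls(String service) {
        Stats s = statsByService.get(service);
        return s == null ? 0 : s.calls;
    }

    public synchronized long failures(String service) {
        Stats s = statsByService.get(service);
        return s == null ? 0 : s.failures;
    }

    // One line per service: calls, failures, average and worst latency in ms.
    public synchronized String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Stats> e : statsByService.entrySet()) {
            Stats s = e.getValue();
            sb.append(String.format("%s calls=%d failures=%d avgMs=%.2f maxMs=%.2f%n",
                    e.getKey(), s.calls, s.failures,
                    s.totalNanos / 1e6 / s.calls, s.maxNanos / 1e6));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        ExternalCallTimer timer = new ExternalCallTimer();
        timer.time("payment-gateway", () -> "approved"); // stand-in for a real remote call
        try {
            timer.time("payment-gateway", () -> { throw new IllegalStateException("timeout"); });
        } catch (IllegalStateException expected) {
            // The failure is already counted; a real app would handle it here.
        }
        System.out.print(timer.report());
    }
}
```

Recording in a `finally` block means a call that throws is still timed and counted, which is usually the case you most need to be able to show to the service's owner.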
So instrumenting your apps such that, when they're wrong, you can prove it will make your life a lot easier.

So how can we fix these problems? Well, unfortunately, dollars are always involved. For the basic problems you can just kind of throw money at it: get more CPUs, get more memory, get more network, get a faster database, that kind of thing. But that won't always work, and that's what we're going to focus on. When that doesn't work, you need to throw engineering at it. What you're really looking to do in that case is do less work to generate the response, or make more efficient use of the database (or, ideally, don't hit the database at all), or optimize your external service integration. That becomes especially critical as your site grows in size and you need to rely on those services more and more.

So, good rules to code by. These are somewhat obvious, but they bear repeating, because when you get into the thick of things it's easy to forget. Test-driven development: this is something that's really become popular in recent years, and I can't emphasize enough how valuable it is. Put the unit tests together, and do it before you write the code. They test the capabilities and the requirements of the code. Then run them every single time you build, and when you push to deployment, run them there, and run many of them in parallel if you can. They will uncover so many obvious problems and save you a huge amount of time.

Another obvious one that everyone misses, including myself, is: make it work, make it right, make it fast. As developers we often like to make problems a lot harder than they really are, because, I don't know, our egos are boosted when we solve a hard problem. It's like, look, I've sorted something 20 times faster than the next guy. Never mind that it's only 10 elements.

Don't optimize without analysis. I can't say this enough times. I've walked into so many situations where someone's got the world's most optimized means of writing out the HTML page
that's only used once, when the user signs up the first time. It's like, come on, go fix the problems that are really there. Also, optimize in small steps and test your results after each of those steps. This one gets back to unit testing: if you've got the unit tests in place, optimization is a lot easier, because you can go off, do the optimization, and know immediately if you've broken things. The last point is an obvious one, but everyone does it: if it ain't broke, don't fix it. For object-oriented developers, that means if it works now, don't generalize it, because that's one we commonly do.

There's also a lot of optimization you can do at design time if you understand your problem well, and any optimizations you can do before the first lines of code are written are always good, keeping in mind that you shouldn't optimize things prematurely. So you really want to understand the usage patterns of your application. Know how users are going to use it; know how it's going to be administered. Clearly you can't be omniscient, but you can do a lot of upfront work there.

And of course, make your entry page fast. I can't tell you the number of times I've walked into a client, hit the entry page, and it did 600 SQL queries. No, no, no. Make it static. Even if you have to have an entry page that comes up just long enough to get them into the site, that's fine, but make it fast.

Also, the business logic in your application should be designed around response generation. It should be designed around what the customers are going to be doing with the application, even if that makes the administrative tools less convenient. The administrative tools need to be powerful, they need to be intuitive, but if they take a little bit of time, that's okay, because it's all about administering the content for the purposes of the customer. And if your
business is like any business I've ever heard of, it's the customer that pays the bills. I've seen a lot of people lose sight of that.

You also want to make sure you're retaining and reusing data, and know when it's out of date. This will come up again, but it bears repeating. If you go into the database to pull out the list of states in the country, it's more than likely not going to change in the next five minutes, so keep it around. That introduces some complexity in working with enterprise objects, because of course you can't create relationships between entities that are in different editing contexts, so you have to be a little bit careful. And you want to manage that cached data carefully, because if you end up caching everything all the time and just leaving it in memory, you get back to the part where you run out of memory, your machine starts swapping, and then your performance goes to hell.

You also want to let the database server share in the work. Databases have been around a long time, and they do things like sorting really fast. They also do indexes and all this other stuff, and you really want to leverage those wherever possible.

One of the optimizations is to minimize your memory footprint. If you can minimize the footprint of the application, you can have more instances running and balance the load more effectively. That means sharing data across sessions, which is not something WebObjects does naturally out of the box. You're going to have to play some games to get that kind of thing to work. It's not hard, but it is detail-oriented.

You also want to clean up thoroughly. It annoys me to see a developer make the statement that because of garbage collection they don't have to care about cleaning up. Not every object that's going to be collected by the garbage collector is just using memory. It may be using scarce resources like connections to servers, or it may be
keeping a file descriptor open or something like that, and you want to let the code know that it's done and over with. Also, any nulled-out reference is one the garbage collector doesn't have to traverse when collecting, so that's another good way to ensure cleanup happens quickly.

You also want to clear your transient instance variables when they're no longer in use. This is not just an optimization issue; it's also a debugging issue. If you have stale state in your object graph and you don't clean up when you're done, and then at some later point you come along through some code path you didn't really think about and run across the transient data, and now it's out of date and you didn't know it, you're hosed. Speaking from experience: traders on trading desks get really irritated when they start seeing the wrong prices. So, some scars.

You also want to do things like set the right session timeout value. Sessions will stick around and then automatically go away. Unfortunately, on the web there's no quit button, so it's hard to know when the session should go away. This gets back to looking at the usage patterns: look at the usage patterns of your app, understand them, and understand when it's safe to make the session go away.

Then there's also instrumentation. I can't say this enough either: instrument your applications. We've got some wonderful tools built in for instrumenting them, which Max is going to demonstrate shortly, and they do a great job of letting you detect when resources are being used or when things are getting out of control. You want to review those results often, and ideally you want the instrumentation to be something you can turn on dynamically in production, so that if a customer calls up and says your app's not working right, you can turn this stuff on and figure out what's going on. Unfortunately, because we're building applications where it's guaranteed that our development environment is about
as different from our deployment environment as is possible, there's going to be a whole series of problems that will only come up in production, which makes life an adventure. So instrument, collect, and then analyze.

You also really want to plan the data access: plan when your queries, when your caching, when your cache updating is going to happen, and understand the data latency issues. Data latency is all about looking at your application and understanding when the data becomes stale, and how often you really need to let app A know that app B's state has been updated. A great example: I ran into a client site that needed some optimization. Their site would just grind to a halt when it had a bunch of users. What was happening was that they had all this user-specific state, a shopping cart, and every time the shopping cart got updated for user one, all the other 30 app instances got notified that that user's shopping cart was updated, even though that user's shopping cart was only in one session in one app instance. It was a case where they used the generic notification method, and by simply removing that, all of a sudden their app was stable even under higher loads, and they could have more shopping carts. That's always good.

You also want to do things in memory wherever possible and try to get to zero queries per response. The fewer times you go to the database, the fewer times you go to disk, the better off you are, especially when you get into high-throughput sites like the music store. Any database hit is going to be orders of magnitude more expensive than being able to just blast back some static thing from memory.

You want to manage your faulting and manage your caching so that you can explicitly update your stale data. You generally want to avoid situations where WebObjects is deciding to populate relationships on its own, or populate caches on its own, because it will choose a very general-purpose solution that is
probably not optimized to your actual use patterns.

You also want to use a shared, read-only editing context for reference data. (It's called the shared editing context; it should really just be called a read-only editing context.) It's a little bit tricky because, again, you get into the situation where you have to be careful about how you make relationships to and from objects that were fetched into that editing context. But that way you can have that single shared editing context, and it's read-only, so it never pays the penalty of doing updates or inserts or anything like that.

Another one, which is more of a modern optimization, something that's become much easier with recent releases of WebObjects, is to partition your functionality across multiple applications. What that means is that if you've got, say, a site with an extensive search operation, plus shopping cart management, plus a library of information, plus a couple of other things like an administrative tool, then you partition those features into different applications. Then you can control the number of application instances individually and control the configuration of those applications individually, to optimize each application for the use it encounters. That's very important. With direct actions, and with putting the session ID in cookies and being able to reconnect across the different apps, you can achieve a lot of efficiency.

What you can also do is use optimized object models per application. You go into EOModeler and bring up your object model. You've got your full object model for the administrative tool: it's got read, it's got write, it's got everything. But then, for all those entities where you don't need to, say, edit particular fields in the application your customers see, you can create a second EOModel that has just the fields you need for that particular application. This will reduce the memory
footprint size and reduce the amount of data going to and from the database. Overall it will just make the app faster. It can be a little tricky, because of course then you've got to keep the two models in sync, and that's a pain, but if you're faced with this problem it can really help a lot.

And of course, obviously, maximize reuse through frameworks. I've got to point it out because some people forget about that; I've run into that a number of times.

You also want to partition between sessionful and sessionless, and threaded and non-threaded, because threading is always a very complex issue. There will be certain apps where threading is an obvious optimization and other apps where it's not so obvious. In particular, where you have multiple writers to the same database, you probably don't want to thread that, because bad things can happen when you cross commits or do partial transactions. As for sessionful versus sessionless: a number of apps, search apps for example, often don't need per-user state. If you can get rid of the session in the search app and make it so that it has one big cache for all the search data, you can make things extremely fast. Max can talk about some of the optimizations they did in the music store that go even beyond that.

Okay, so you've done all the right things at development time. It's been the world's most perfect development schedule, you even delivered early, and you've got the app in production. And now it's too slow, or it's using too much memory, or it's thrashing the disks, or the CPUs are just big space heaters, or occasionally it just crawls to its knees, really, really slow. So now what do you do?

Well, the first thing is: don't be silly. This comes from years of experience. I've been very silly myself, and I've seen many developers do really silly things. Turn on or off the obvious flags. WOCachingEnabled: yeah, that's a good one to
have on. WODebuggingEnabled: that's a really good one to turn off. It'll bring your app to its knees. You go into production, all of a sudden a hundred thousand users hit it, and it's trying to log every SQL query. Yeah, good idea. There's the built-in NSLog facility, plus of course there's log4j, which is wonderful. They can both be dynamically configured, so you can hit a production server and pay the logging penalty on only the areas you know are problematic. That gets back to instrumenting, and making sure you have dynamic instrumentation so you can actually catch problems in production.

Put indices on your database tables. It sounds obvious, but it's one that people often miss. Then they go into production, their data sizes grow a couple of orders of magnitude bigger than they were in development, and all of a sudden they're left scratching their heads as to why the heck it's so slow just to bring up a simple page.

This is another one that's funny: minimize the size of the generated content. We ran into a number of sites where the initial page load was a nice, big, beautiful page with a couple hundred images on it and a bunch of text and all this other stuff. It was about 160K of HTML, which is way too much. 40K of that was comments, and another 30K was because the image URLs were all /images/clients/site/codename/foobar.jpg or .gif. By simply erasing all of that and making it /i/ for the images, we were able to reduce the page from 160K down to something like 50, 60, 70K, which is still too much.

And of course, again: analyze, analyze, analyze. One of the challenges with building a WebObjects-based app, or other dynamic applications, is that as a user goes through the site, unless you specifically do the engineering work, you're not going to get a good record of what the heck they did. So what you need to do is leverage the tools available to you. There's WOEvent and there's the WOStatisticsStore, both of which
Max will show parts of. You can also capture the direct action activity. Direct actions are the one exception to the rule: because the URL has the name of the action that was fired in it, direct actions leave a mark in the log as to what the user has been doing. If you also pass the session ID in the URL, which is an option, you can differentiate between different users in your logs.

Obviously, you want to tune the most-used areas first. If your copyright page is slow, probably nobody cares, so go for the high-traffic areas. Also, as a recommendation, I would make sure that your checkout process or other payment-gathering process actually runs fast. I had a client that was wondering why no one was paying for anything on their site. It was because when you clicked the button to go to the page where you filled out your credit card information, it took like a minute and a half to load, and everyone went away and bought their stuff elsewhere. So optimize those kinds of things. It's not always the most-used pages that need the optimization; it's the user experience that needs optimization.

There's also a wealth of third-party tools available, which a lot of people aren't necessarily aware of. Both OptimizeIt and JProbe work really well. For web server log analysis, there's a slew of free tools available, plus commercial products like Urchin that work very, very well. And every major database server out there has SQL query analysis tools built in. Use them: show plan, that kind of thing. The database server will actually come back and tell you why your queries took so long, and Enterprise Objects of course provides you with tools for optimizing those queries, or even executing SQL by hand, which is something to be avoided, but sometimes it can't be.

Now I'd like to bring Max up, who's going to demonstrate some of this stuff.

[Applause]

Thanks, Bill. So what I
did here is build two very simple little applications. One is a Cocoa web services app. It just makes a simple query to a WebObjects app I also have running on this machine, brings back a bunch of SOAP objects, and lists them here. Then you can click through them and, if it's yours, you can choose to update it, and that basically makes a SOAP call back. So it's just a very, very simple little client. On the server side we have a very simple WebObjects app, where I threw in a bit of Direct to Web so you can see the current users. This is something you can do out of the box very, very quickly.

So the question then becomes: if things are slow, what do you do? The first thing you can look at is the stats page, which lets you look at the overall statistics for your pages and ask, what am I doing wrong? It can also give you a really good idea of where the high-traffic pages are. This one has had 66 pages rendered so far, and what you can see here is the number of times each page has been served, the averages, and the outliers. So obviously, given this, I should optimize the WOEventDisplayPage, because that's what's getting hit the most. But you can also see that the first query page is taking 1.2 seconds to come up, the first inspect page comes up relatively quickly, and the list page has been rendered eight times, but the first time it took 3.66 seconds, whereas the average is 0.6 seconds. That can sometimes tell you that you have something that is rather slow to come up. To then drill down and figure out what exactly am I doing wrong
with that, you can go over to something called the events. The stats basically just give you an overall look. In the music store we'll log in to various apps and just check what the current apps are looking like in terms of their averages and their outliers, because sometimes we've got some very long ones that we need to start looking into.

So, for the events: this is actually already set up, but let's just go to the setup page. This is all stuff that just comes with WebObjects, very, very simple stuff, as long as you specify the password. There's nothing worse than trying a million passwords and realizing, oh, it's not quite in the properties file. So we're turning all of the events on, setting everything, and then we're going back to our app and hitting "list all users". Now we can go right back to this and show the event log.

This has many different options, and it can be rather non-intuitive exactly how they're organized. The view I've always liked is the events grouped by page and by component, as this can show that, off of the main page here, Main is obviously the one that's hurting the most, and the query page is coming up slow as well, relative to everything else. So when you look at this you can say, well, geez, the main page, that's the pig. When you turn event logging on, it pretty much covers your whole application, so anything that's going on is getting logged with events. So then you'd say, okay, it's not the list page, it must be something else. It just gives you the ability to drill down. So you can see this, and whoa, listUsers, what's going on here? objectsWithFetchSpecification. I think I've seen one of those before. So you can see it on listUsers on
the page; that's the action, that's what I have bound to the action link, so when you click on the link it says "list my users". So you can see that what's hurting here is the fetch right here, pulling up the users. Later on we'll come back and show some more fine-grained tools as we go through some of the database optimization, to try to discover what's going on. So we can go back to slides.

So, as Bill mentioned, my name is Max Mueller. I am one of the lead engineers on the iTunes Music Store, and I've been on this since the very beginning. We worked very hard, and we came across all the issues that Bill just went over, and made some of the mistakes, even though most of us on the team have been doing this since the early days. So this is more of the stuff that we've stumbled across and had to optimize. Being one of the lead engineers, I pretty much get to eat, breathe, and live optimizing these applications.

When we launched in Europe, the average from our original launch up to then was 5.86 songs per second; that's how many we'd sold since we launched. So if you're doing that many things per second on average, and obviously the peak is much higher and the trough lower, small little problems can very quickly turn into very large problems. That's one thing we really found. We had to spend a lot of time tuning the database, because WebObjects itself is very fast. If you've got a pure WebObjects app that's not doing any database work, and you've somehow made it slow, you've really done something wrong, because out of the box it's very fast. You're able to generate responses very quickly, and the request-response loop is very quick.

So, in terms of getting down to the database work: a lot of the stuff that you can have happening in kind of
your administration apps can also very quickly affect things that are taking place in your production applications. We have a content management application that our content team is constantly in, working on building the new storefront, and we would notice that sometimes during the day the store would get slow. It turns out they were doing all these queries that were bringing out very large sets of results, and it was actually causing the database a lot of pain. So the store itself was starting to get slow because the database was having to service all the content requests rather than the requests from people who want to buy music, which is not a good thing. So putting in place fetch limits, and putting in place restrictions on certain queries, can significantly reduce the amount of work your database is doing, work whose results your users might never actually see.

There are also a number of tools from database vendors. Taking queries and handing them off to your DBAs is very, very handy. One little tidbit that we did: when we open a new connection to our database, there are actually stored procedures you can call in Oracle to put in information about the connection. By default, with all the JDBC stuff, when it connects, all the DBA is going to be able to see is that it's a Java process. Well, that doesn't really help you if you've got a whole heterogeneous mix of back-end processing apps and store apps in a very large environment of Java applications. So you can put information into the connection, so that when they see some query that's running amok, they can actually look at the connection that's causing that query. We put in the application name, the host it's on, even when it started up, because sometimes we found that we'd forget to stop an instance and it
would be off in la-la land when we had rolled a new version of the software, and we'd be like, wow, what's going on, we thought we'd fixed this problem. And lo and behold, that guy's actually been running for a week, kind of thing.

Binary data in the database is a definite no-no, especially if you accidentally check the locking column, so that EOF thinks it needs to lock on that attribute and will issue it in the where clause. If you do need it, move it out to a to-one relationship. This also goes back to what Bill was saying about having certain models, or certain attributes, only for your administration work and different ones for the consumers. Maybe you have a large CLOB field where people are entering a bunch of notes about an album: this came from this, blah blah blah. Well, that CLOB could get very large, and we don't want the store app pulling that thing up. So you can either take the approach of creating a separate model, or at run time turn off that attribute and say, really, you don't need to worry about that one; leave it in the database when we're in this kind of read-only mode, because nobody's going to need the CLOB.

We use the shared editing context for pretty much just reference data, enumerated-type data, in the sense that it's like a type-safe enum, where key one is this and key two is that, instead of having a lookup. That's pretty much what we use it for. When your app starts up, all the shared editing context information is loaded, and then it doesn't have to be refetched. Likewise, when you trip relationships to objects that are in the shared context, it doesn't require a database trip.

There is also inter-application messaging, if you need to synchronize state
between applications for critical pieces of data. With snapshots, a lot of the time just telling the other side that it's no longer a valid snapshot is good enough. It's not that you really want to move all the snapshots over and say, here's the new snapshot. It's more along the lines of saying, hey, the snapshot for this guy got updated, so the next time you need it you'd better go get it from the database. Raw rows are a useful technique for pulling back large amounts of content where you don't need all the snapshot machinery, stuff you're not going to be editing. Within the store we have all these popularity caches that get rebuilt, the "if you like this, you'll like that" kind of thing. As the number of items you can buy expands, we pull that stuff in with raw-row fetches, actually in a separate thread, so the thread can sit in the background and every so often determine that it needs to go out to the database and pull in a new set of recommendations. When we're rendering some of the pages we sync up with it, so it basically refreshes its cache in the background thread using raw-row fetches. Caching in memory is good. A lot of the time, if you have a kind of read-only application, you can look at using one editing context that you pull stuff into and then hold on to, with the application holding the reference to it. It's not the shared editing context, but it's a shared editing context. Adapter debugging enabled: obviously that's where you can just turn it on and say, what SQL is going on here? This next one is a godsend: java.lang.Throwable. Being able to generate backtraces anywhere is very handy. I'll be showing you how we can change the logging pattern at runtime to
start throwing in backtraces anywhere we want, which can really help. A lot of the time, if you're just looking at the SQL, you're like, whoa, hold on, where's that coming from? One of the very common mistakes a lot of people make is they'll fetch an object, maybe they prefetch everything, and then on the next request they'll say, well, we need to go ahead and invalidate the editing context here to make sure we have fresh data. The problem is you've got the object graph there. You've got an object, and you've already spent the time to prefetch all the stuff, you got it in two or three or four fetches, but then you've invalidated all the data underneath yourself. You start to trip over these things again, and it's like, oh, that's actually been turned back into a fault, I need to go out to the database again. And you're like, well, jeez, I prefetched everything and now I've got SQL going out the ying-yang. Turning on adapter debugging can help you see it, but being able to actually see the backtrace of where the faults are firing is better. I'll be showing you in the demo what we use in the music store quite a bit: there's actually a delegate hook you can turn on, and we throw backtraces whenever a fault is fired. A lot of times we'll turn that on and then go render a page that has somehow started to get slow, and a lot of the time it isn't because that page itself has gotten slow, it's because something else is triggering something that's causing the snapshots to get old or wiped out. So yeah, excess faulting, that's a hot one. Another trick you can do with java.lang.Throwable: in the constructor of a Java object, if you have some debugging flag turned on, you can actually create a Throwable object in the constructor and stash that away in an ivar, at
which point, at any point later on, you can always ask the object, what's the stack trace where you were created? That can be very handy for, say, WOSessions. We have several apps that are completely sessionless, and all of a sudden we'll start to see sessions popping up and we'll be like, what's going on? So we'll set this flag, and then at a later point we can go to the WOSessionStore and basically say, dump out all your sessions and give me the backtraces, because somebody's doing something bad. More often than not it's a WOActiveImage: somebody put a WOActiveImage in, and if you don't bind it up correctly it'll go ahead and create a session and a component action for you. Handy, but not what you want when somebody just accidentally forgot to do a binding. And then all of a sudden you've got these pages that are generating lots of sessions, because a lot of the time the session won't be referenced again and so you'll get a lot of them created. You have one request coming in and you can somehow get multiple sessions created, which just causes all sorts of nightmares. Fetching EOs for pop-ups, yes. And localInstanceOfObject, absolutely. One thing also to be aware of is your fetch timestamp, which says how new a snapshot you care about. Oftentimes what people do is they'll create a new editing context, set the fetch timestamp to right now, saying I want everything fresh, and then start doing localInstanceOfObject. Of course, when they then touch the object, it goes to the database. So you could have fetched all the stuff and then be like, oh, I need a local instance now, let me create a fresh editing context. With that usage pattern, all of a sudden you're like, well, I thought I was doing something good, but it turns out that localInstanceOfObject can actually cause a lot of trips to the database.
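The creation-trace trick described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not code from the session or from Project Wonder: `TrackedSession`, `traceCreation`, and `forgottenBinding` are hypothetical names, and a real application would hang the same idea off its own session class.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class CreationTraceDemo {
    /** Hypothetical debugging flag; leave it off in normal operation to avoid the overhead. */
    static volatile boolean traceCreation = true;

    /** Stand-in for any object whose origin you may need to explain later. */
    static class TrackedSession {
        // Captured once, in the constructor, and only when the flag is on.
        private final Throwable creationTrace;

        TrackedSession() {
            creationTrace = traceCreation ? new Throwable("created here") : null;
        }

        /** Returns the stack trace recorded at construction time, if any. */
        String creationStackTrace() {
            if (creationTrace == null) {
                return "(creation tracing was off)";
            }
            StringWriter buffer = new StringWriter();
            creationTrace.printStackTrace(new PrintWriter(buffer, true));
            return buffer.toString();
        }
    }

    /** Plays the role of the code path that creates a session by accident. */
    static TrackedSession forgottenBinding() {
        return new TrackedSession();
    }

    public static void main(String[] args) {
        TrackedSession session = forgottenBinding();
        // Much later, ask the object where it came from.
        System.out.println(session.creationStackTrace());
    }
}
```

The printed trace includes the `forgottenBinding` frame, which is the piece of information the session dump is after. The `Throwable` is never thrown; it is only used as a cheap way to record the call stack.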
Simplify the object model if you can. We're at three or four hundred entities right now in the music store and it's growing each week. The deep inheritance, the vertical inheritance, is the only efficient form of inheritance in EOF. I've used it for many years now and it works rather well. It allows you to have a user and then a person user and all these kinds of things mapped onto the same table, and you can still have relationships to the top abstract entity, so when you trip a relationship you could be getting all the different sub-entities, but EOF handles that gracefully for you under the covers. The other form of inheritance is across multiple tables, and that one is rather inefficient, because any time you trip a fault it's going to be like, is it in this table? Is it in this table? So if you do have to use that type of inheritance, the best way is to always model relationships down to the non-abstract sub-entities. Database views for queries: if you can get an efficient one, a lot of the time, if you don't need all the bind variables coming through, a view can be efficient. Excess back pointers is a really hot-ticket one. A lot of the time you'll have a situation like the one in the music store, where a user has many purchases. When somebody clicks Buy, you're going to be creating a purchase for them, and the tendency might be to just create a purchase, call addObjectToBothSidesOfRelationshipWithKey to get the user on it, and save it. Well, if you recall Steve's keynote a while back, the number one purchaser in the music store had bought 27,115 songs at that point. That's a whole lot of purchases. What happens when you add the object to both sides of the relationship is that if that relationship hasn't been faulted in, you're going to
trip it, which means you're going to be pulling in all those things. So all of a sudden you're like, geez, why is it slow? Why is my app getting slow at random intervals? Why did one buy take three minutes in production? What the heck happened there? Not only that, but the memory goes through the roof: we just pulled in twenty-seven thousand things just because they're trying to purchase one more thing. So don't trip the relationship if you don't have to. If you just create the purchase, set the user, and save it to the database, that's fine. For the back relationships, you can also just not model them: remove the relationship from the model completely. I hope this isn't too advanced; we're just trying to cover a bunch of stuff that we've found. So, a little-known database technique: databases are built for these kinds of things. That's what you pay the big bucks for with the big tools, that they provide all the tooling. If you've got a good DBA, or if you can get in there yourself and look, you can identify the top queries in your database, and then you can start to go back and see where they're coming from in your application. About once a month our DBAs will send us a spreadsheet, like, all right, here's the top ten, go for it. Then we can start hunting around: okay, where's this one coming from, and who did this? So, very useful. Generated SQL is obviously one. We basically focus on optimizing the parts that are going to get hit the most. We don't optimize the copyright page; the copyright page is as slow as it is. And stored procedures: there are some places where they are very useful, but
in other places, if you're using a stored procedure to update rows that you're also modeling, you can wind up in a bad state very quickly: you just executed a stored procedure call, it updated something underneath you, and now your snapshot's out of date. There are techniques you can use to keep your snapshots up to date, but it's a pain. So if we can go back over to the demo, I'll just show a few bits here. We're okay on time. Coming back here, we can see that List Users is inefficient. So what do we do? The first thing is you check your logs. Okay, nothing in the logs. So I'll bring this up and show just a few tricks that we use. All of this stuff is in Project Wonder, which we've contributed back to, so nothing I'm doing here is proprietary or anything like that. The first thing we can do, without restarting the app by the way, is start looking at our database traffic. Let's see what's going on for our little app here. So: List Users. All right, whoa, a whole lot of SQL there, huh? We can see that we're fetching stuff from the user table, but then all of a sudden we've got user infos all over here, a whole lot of user info fetches. So you're like, what's going on here? There are many more queries than you'd expect, and the SQL alone is not really that useful here. It looks like faults firing, so let's see where we're actually firing faults. Let's turn that on and go here. Yeehaw. Then we can look down here: here's Main, here's my listUsers method, and the setDataSource call. Let me show you the code; it's very simple. I'm just creating a database data source of users, setting the data source on the list page, and handing it off. That's all the code that's going on
here. And we have one web service that the other app is using to talk to it. This is the plain vanilla, out-of-the-box WebObjects handling of all the web services. I wrote a bit of code here myself: this is a user service, and I expose the findUsers method. If you saw Bob's talk, the introduction to WebObjects, there's this WSMakeStubs command-line app that you can run. All I did was write findUsers and updateUser; updateUser takes the user ID, the first name, and the last name, and findUsers takes a first name and a last name. Then in my application I told the WOWebServiceRegistrar to register this class, and I ran WSMakeStubs on it. It generated all these stubs for the Cocoa side, you have the WSGeneratedObj, and then I just wrapped it in a little user services class. It took all of a few hours to do, nothing complex here at all. Actually, not even a few hours; half an hour. Going back here, now we can see the backtrace. What's happening in setDataSource is our fetch specification: we can see we're fetching users, no qualifiers, no prefetching keys. Then next we're fetching user info. So here's Main, here's setDataSource, here's the fetch. Well, what's going on here? awakeFromFetch. Just walking up the trace: User, line 33. So what's going on at User line 33? If the first name is Max, I've got this test user info, and I'm fetching it ten times. Not so good. So let me turn this off. Look, I left it on in production too. Oops. Now I can clear out the console, go back to the app, hit List Users, go back to this guy, and lo and behold, a little bit better. These are just a few techniques you can use to quickly get your head around what's going on. I'll show you one more. We use log4j for
pretty much everything we do, and one of the nice features it has is the conversion pattern. In this current one right here, I'm specifying the conversion pattern I want to use: a date; my memory stats, so this is used versus free memory; which category is logging; the line number it's calling from; the priority level (this is all log4j stuff, a priority of debug or fatal); I forget what x is; then the message; and then a newline. When we go back and look at one of these getting logged, sorry, I deleted all my stuff, so we're back here. Let's look at the first part of this line. We have the date. So far this app is using 11 megabytes and it's got 22.953. This is being called from the class ERXDatabaseContextDelegate, line 149. This is a debug, and then this is the message, which is printing the stack trace itself. If I go back to this and turn the fault-firing logging off, because that guy is going ahead and putting its own backtraces in there, and save this, let's go back here one more time: back to the home page, List Users. Here we have all the same information coming in. This is the log4j bridge that's just capturing NSLog events and routing them to log4j, so again we're getting messages coming through here. But now, lo and behold, in production something is going wrong for some reason: you've got some random SQL coming out of this application. So you connect in and you change the pattern to this pattern. This one gives you the WebObjects application name, the number of sessions, the port it's bound on, the pid of the process, your VM stats, and the priority, but then I also put %@ at the end, which says, go ahead and dump that backtrace, I want to see where this is coming from.
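The demo relies on log4j and a Project Wonder pattern layout, neither of which is reproduced here. As a rough stand-in, the sketch below uses only `java.util.logging` from the JDK to show the same idea: a log format that carries memory statistics and can be switched at runtime to append a backtrace to every message. `ToggleableFormatter`, `setIncludeBacktrace`, and `describe` are hypothetical names chosen for illustration.

```java
import java.util.logging.Formatter;
import java.util.logging.Level;
import java.util.logging.LogRecord;

class BacktraceFormatterDemo {
    /** Formatter whose output can be switched while the application keeps running. */
    static class ToggleableFormatter extends Formatter {
        private volatile boolean includeBacktrace = false;

        void setIncludeBacktrace(boolean on) {
            includeBacktrace = on;
        }

        @Override
        public String format(LogRecord record) {
            Runtime rt = Runtime.getRuntime();
            long usedKb = (rt.totalMemory() - rt.freeMemory()) / 1024;
            long freeKb = rt.freeMemory() / 1024;
            StringBuilder line = new StringBuilder();
            // date, time, used/free memory, level, logger name, message, newline
            line.append(String.format("%tF %<tT [%dK used/%dK free] %s %s - %s%n",
                    record.getMillis(), usedKb, freeKb,
                    record.getLevel(), record.getLoggerName(), formatMessage(record)));
            if (includeBacktrace) {
                // Taken on the formatting thread; with a synchronous handler this
                // is the thread that logged, so the caller's frames are included.
                for (StackTraceElement frame : new Throwable().getStackTrace()) {
                    line.append("    at ").append(frame).append(System.lineSeparator());
                }
            }
            return line.toString();
        }
    }

    /** Formats one message directly, standing in for a real log call. */
    static String describe(ToggleableFormatter formatter, String message) {
        LogRecord record = new LogRecord(Level.INFO, message);
        record.setLoggerName("demo.sql");
        return formatter.format(record);
    }

    public static void main(String[] args) {
        ToggleableFormatter formatter = new ToggleableFormatter();
        System.out.print(describe(formatter, "fetching user info"));
        formatter.setIncludeBacktrace(true); // flipped at runtime, no restart
        System.out.print(describe(formatter, "fetching user info"));
    }
}
```

In a running application the formatter would be installed on a handler with `Handler.setFormatter`, and the flag would be flipped from whatever remote administration hook the application already exposes.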
When we turn the firehose on one more time, lo and behold, nothing. Okay, I had turned it off. So now you can see that this does have all the information here: the name of the application, its pid, the port, that so far I've created 10 active sessions, the memory used, all that kind of stuff, as well as the stack trace of where that log message is coming from. These are just a few of the techniques we use to hunt down performance problems, looking at where the faults are firing and where the database traffic is. You can also look at the WOEvents if you want to get more fine-grained and look at where your components are potentially causing you problems. So, back to slides. Thank you. Thank you, Max. As you can see, there's a tremendous wealth of debugging opportunities, both offered by WebObjects and in third-party tools. Everything Max demonstrated is, like he said, available. Project Wonder has a tremendous wealth of stuff in it, and even if you don't want to take on the full project, going there and reading some of the code is a wonderful way to learn about some of the optimization opportunities available, as well as a number of other things. So, moving on. Okay, you've got your database fast; now we need to start making the actual application fast. There are some tricks to optimizing components. WOComponents are a wonderful thing. They're really reusable, and you can plug and play and all that, except that plugging and playing WOComponents has a price. If you have pages that are getting hit a lot and you have a deep nesting hierarchy of WOComponents, you'll find there's a lot of overhead there, and a lot of the time you can reduce that overhead by simplifying the component nesting. It's kind of unfortunate, because it moves away from reuse a little bit, but it will yield a lot of efficiency at times. The next point is, you
know, use object-oriented programming. Like, why are you pointing that out? Well, a lot of people, when they're doing the component side, sort of lose sight of the fact that WOComponents are really just a hierarchy of objects. There's no reason why you can't define an abstract superclass for your WOComponents to encapsulate the common functionality and then make all the other components in your app subclasses of that. It's also a really good place to stick in debugging hooks and other annotations that will help you during your analysis phase. And to reiterate, make sure the debugging hooks can be toggled easily, as Max demonstrated; you don't want to pay that overhead all the time. There's even, I think, a WOComponent floating around somewhere that will embed a stack trace in your page. It's kind of fun. You also want to consider caching your pages and using stateless components. The less state an individual component has, the more it can be reused across the rest of the application, and the less cost there is associated with bringing it back into play as the user moves across the application. Also, make static content static. If you have pages in your application that aren't changing that often, or aren't changing at all, push them out as static web pages, and push the session ID into a cookie, so that when the user navigates through the static content and comes back into the site they get reconnected to their session and see their session state again. You can use direct actions for this too, so it's very easy to embed URLs in that static content that will bring the user back into the application wherever they left, or wherever you want them to be. The same thing goes for multiple applications. Using the multiple-application approach to optimization, you can use direct actions again, both to navigate the user between different apps while preserving state and to control how they enter those individual
applications. Also, static content doesn't really have to be static. You can play some fun tricks, like using proxy servers, to cause content to be generated once dynamically and then cached for the next set of users. This is another one of those areas that a lot of people forget about. You're spending all your time developing WebObjects applications, writing code, writing EOModels, testing databases, etc. You still need to go and optimize for fast browser display, and that again means checking the total size of the generated page. Smaller pages display faster, they parse faster, they render faster; they're just faster. You want to batch the display of longer sets of data. As was mentioned on a previous slide, no user out there is probably going to scroll past the first 30 to 50 hits when they're doing a search or a query, so batch it up and don't even fetch those other ones, much less generate the HTML. And again, generate short URLs. Anything you can do to reduce content size is going to make it go across the wire faster. You also want to do better things with images, and by better I mean use smaller images, compress them more, or use an appropriate format for the kind of content that's being presented. You also want to use common image names. If you have the same image repeated across the site, or if you have, say, a graph that's rendered from time series data that only updates every five minutes, generate a static image, put it somewhere, and make sure everything uses that same image name. And of course use fewer images. Be more intelligent about the use of images, because every image is not only the cost of pushing the bits of the image across the wire, it's also the cost of a whole new decoding session in the browser for dealing with that image and a whole new connection through to the web server. And whatever you do, don't stick the images in the database. It just doesn't make any sense. Web
servers are really good at caching static binary data and serving it up. As soon as you stick an image in a database, you go from "the web server reads the file system, throws the data up the wire to the client, and you're done" to "web server, WebObjects adaptor, dispatch to an application, the application calls the database, the database pulls the data back", and you end up with something like five copies of the data in memory and a whole bunch of other overhead. Also, pretty much all the browsers out there now support compressed content. So if you have situations where you're producing fairly large pages, with modern CPUs it's almost always more efficient to go ahead and compress the data on the server side, send it over the wire in compressed form, and let the browser uncompress it. Apache has mod_gzip, which is easy to install. Another area of optimization: if you generate crap HTML, it takes the browser longer to figure out what it should do with it. If you generate well-structured HTML, browsers render faster. This is an interesting one, because of course in the early days of HTML it didn't matter if you closed your paragraph tags or your table tags or anything else, because the browser would figure it out. Thanks, Microsoft. As things evolved, it not only affects the HTML processing and parsing, because now the browser has to look ahead and go, oh, that tag over there probably means this one over here needs to be closed, but it also confuses things on the WebObjects side. WebObjects components really want to generate a hierarchy of tags that are nested in a sane fashion. So look for things like overlap problems, using an HTML tool like Weblint to check the generated content. And you've really got to check the generated content, because a WebObjects page is generated from many components all spewing forth HTML that then gets serialized into one big response, and you need to check the content in the
context of that whole response, because there can be overlap problems that are caused by component mis-nestings and things like that. Simplifying table structures is another great way to reduce content size. Moving to CSS, or having a site-wide common CSS document (CSS being Cascading Style Sheets, which fortunately browsers seem to support, though inconsistently), is another great way to reduce the content size, speed up the rendering time, and make your site more flexible. There's also optimizing Direct To. The Direct To stack is incredibly powerful. The whole notion of having rule-based content generation and data management and navigation management and user management and everything else: I don't know of any other tool out there that compares with WebObjects when it comes to this. But it is also overhead, and there's a different approach to optimizing it, and Max can certainly answer any questions in this regard. In the context of Direct To, the rules engine has this notion of significant keys and unbound keys. Significant keys are the focal points, the ones that will be cached, etc. The unbound keys are the ones that require calculation, a lot of faulting through the rule system to figure out their values. That's very expensive, and you want to avoid it. You also want to optimize the data being accessed by property keys for a given task or page. The Direct To stack has this very strong notion that the user is doing something somewhere, and you can optimize all of your data access around that notion. It gives you a lot of hints about what the user is doing at any given time. There are a number of debugging hooks, both in EOF and Direct To and also down at the lower layers, a lot of which you can find in Project Wonder. And there are warm-up techniques you can use to cause the rule caching system to warm up its state, such that subsequent evaluations of those rules will be much more efficient.
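The warm-up idea can be illustrated with a generic memoizing cache. This is only a sketch of the pattern, assuming rule results are static once computed; it does not use or imitate the Direct To rule engine API, and `RuleCache`, `valueFor`, and `warmUp` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class RuleCacheWarmupDemo {
    /** Caches the result of an expensive, deterministic evaluation per key. */
    static class RuleCache {
        private final Map<String, String> results = new ConcurrentHashMap<>();
        private final Function<String, String> evaluator;
        /** Counts real evaluations, so the effect of warming up is visible. */
        final AtomicInteger evaluations = new AtomicInteger();

        RuleCache(Function<String, String> evaluator) {
            this.evaluator = evaluator;
        }

        /** Evaluates the rule for a key at most once; later calls hit the cache. */
        String valueFor(String key) {
            return results.computeIfAbsent(key, k -> {
                evaluations.incrementAndGet();
                return evaluator.apply(k);
            });
        }

        /** Run at application startup so no user request pays for the first evaluation. */
        void warmUp(List<String> keys) {
            for (String key : keys) {
                valueFor(key);
            }
        }
    }

    public static void main(String[] args) {
        // The evaluator stands in for an expensive rule resolution.
        RuleCache cache = new RuleCache(key -> "component-for-" + key);
        cache.warmUp(Arrays.asList("list/User", "edit/User", "inspect/Purchase"));
        int afterWarmUp = cache.evaluations.get();

        cache.valueFor("list/User"); // served from the cache
        System.out.println(afterWarmUp == cache.evaluations.get());
    }
}
```

The design choice is to pay the evaluation cost once, before traffic arrives, for the keys that are known in advance; anything not in the warm-up list is still evaluated lazily on first use.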
One of the most common complaints we see about Direct to Web or Direct to Java Client is that the first hit always takes a long time, because the rule cache is empty and it has to go off and evaluate all these rules to fill the rule cache. Well, most of the rule cache is actually going to be static results, so there are entire huge sets of rules that never need to be evaluated again because the results never change, and you don't want your first user to have to pay that penalty. When you're building custom components, and this is true of Direct To as well as everything else, go for stateless. Stateless means that there's no session-specific data. It means the component can be shared across the app, and it doesn't have to be archived, unarchived, and reconstituted during the request-response loop. It's just a lot more efficient. Then you've also got to look beyond the WebObjects application itself. Make sure your web server is doing its share of the work, and that means tuning the configuration. Apache has mod_status, and one other module which I can't remember the name of, that out of the box can give you a lot of information about what your web server is doing. Plus, especially as your site grows, you'll want to look to your web server to be able to farm out content across multiple web servers, multiple boxes, and even up to the level of farming out to, say, Akamai or the other content aggregators, because of course once you do that, any hit that doesn't hit your web server leaves more CPU cycles for the primary content generation. Offloading the serving of all the content you can, like images, files, and multimedia, to other servers is a great thing. One of the challenges, as always, is if you have a site that's secure. As soon as you go into HTTPS, that means all the images on that page have to be encrypted as well, because web browsers don't like mixing encrypted and unencrypted content. This gets back to security being
the antithesis of efficiency and convenience. That makes for quite the adventure, because now you're going to have to figure out how to pay the price of encrypting the content that's related to that page, including the static content. And of course encrypted downloads are a really bad idea. There's nothing like encrypting, say, a 45 megabyte download for one user, because everything has to be encrypted per individual user. Caching proxy servers: this is a really neat technology. You can use something like Squid or the caching proxy server in Apache. The first user that hits your site will pay the price of the dynamic generation, but then that HTML page gets stuck in a caching proxy server that sits between the front-line web server and your WebObjects application. Once that item is in there, you can put timeouts on it, or you could have an external interface for invalidating it, or the easiest thing to do is to simply have a dynamic page which has the set of URLs that lead to what will be cached, and just change those URLs once the content is invalidated. That way, since it's a URL that hasn't been cached, the caching proxy server will go, oh, I've got to go get it, go get it, and cache it, and the next user will be really fast. Optimizing deployment. Deployment is just such an adventure; it's just different from development, and Max is coming on stage because he's done a lot of deployment. How large a memory footprint can you live with for each of your applications? You've got to make sure that when you get this thing into production and you start hitting high loads, you don't hit the physical memory limit, because physical memory is so much faster than swapping to the hard drive that as soon as your app starts swapping, or your server starts swapping, it's done for. It's circling the bowl, and you're looking at a reboot pretty soon. You want to try to spread a heterogeneous mixture of applications across servers, so if you're running multiple
application types, mix it up a bit. That way, if one application goes pathological, I mean, it's still going to take out your whole site, but you have some more wiggle room before it does. And to be honest about it, if the search component in, say, the iTunes Music Store goes down and it was all on one server, the site's unusable anyway, so having a little component of the site up and running is probably not that advantageous if the site is unusable. There's also tuning the adaptor timeout values and making sure your worker thread settings are all set up correctly, because as is the case with most things, the generic out-of-the-box configuration is pretty much guaranteed to be wrong for your application. This is also why WebObjects doesn't do synchronization of data between instances. We could do a generic solution for that, but it would be guaranteed to be inefficient for your specific business problem. You also want to determine ahead of time how you're going to monitor the system for problems. Every component in the system needs to be monitored as you add more machines, more complexity, firewalls, and everything else, and you need to plan ahead for a catastrophe. Max has got some great images on that, I'm sure, and I'm going to turn it over to Max now to talk about this particularly fun issue. So I just wanted to finish up with production-quality deadlocks. For any of you who have run a high-traffic site, it's always one of those things: if you've got a recipe for one in there, it's definitely going to get baked in production, and you're going to find it in production. Just a few topics. One is kill -QUIT. In the Java world that will basically give you full stack traces for the running app, for all the different threads that are currently running. One of the most common places
is having initialization happen in your dispatchRequest, because dispatchRequest is completely threaded: it will have multiple threads coming through it at any given point. So even if you have your app set to single-threaded mode, where it's not doing concurrent request handling, your dispatchRequest still has to be thread-safe. Likewise for any of the code in there: if you have one method in there that's like, let me go fetch something from an editing context and cache that value, and that value will then be used for any request that comes in, you can guarantee that when you start up, two threads are immediately going to get in there and start doing EOF stuff, and you will run into serious problems. Most of the deadlocks we've had to track down are because of EOF, where we're not locking things correctly, or because of multiple EOF stacks and a single shared editing context. That one will kill you every single time. If you want to use multiple different EOF stacks, you definitely have to create a new shared editing context for each one of the stacks you want to use. If you don't do anything by default, just new EOObjectStoreCoordinator, new EOEditingContext, fetch an EO, you're going to be dead in the water. Monitoring to detect wedged instances is a big one if you're having this problem. It's also very important to start your load test before you start your application. We ran into several issues where we didn't detect a certain deadlock condition because we had all of our apps up and running and then said, all right, now turn on the load test. Whereas if we'd had the load test already running, that's what you have in production when you bounce your apps: you have users and they're clicking on everything, and then the app has some
initialization deadlock, and it starts up, and it's like a switch. The dead time interval in the adapter can also be a killer, because that interval basically says: if we try an instance and it doesn't respond, how long should we wait until we try it again? And so you get this nice ripple effect where the wave crashes down and all your apps basically get registered as dead while they're starting up. That wedges all your web servers. Then the web servers finally come back, your apps say, "Now we're ready," the web servers say, "Well, here you go," and the wave comes sweeping over the apps, which say, "No, no, no more." So they wedge, the dead timeouts get set, and you get this nice seesaw effect where all of a sudden things are really fast, then really slow, then really fast, then really slow. So, just one last slide of these things. Users are funny, because they pay your bills but they hate you: as soon as your app starts misbehaving, how do they respond? By clicking like spastic monkeys.

Fine. So, a quick summary here. Start thinking fast from the beginning, but don't overdo it. You want to instrument, you want to analyze, you want to track your performance over time, but invariably you want to stay calm. That's a point worth repeating: stay calm. When things start going wrong, the worst thing you can do is throw your hands up in the air and start rebooting things at random without understanding the problem, without getting your analysis tools up and running, and without gathering metrics or evidence. Doing the spastic monkey routine on the reboot button is not going to fix anything; the problem is just going to happen again sometime later. There's a tremendous wealth of tools available. It's easy to forget exactly
how much stuff is out there, but the industry as a whole has been doing web-based deployments for more than a decade, and web application deployments for a decade too, so there's just a boatload of free and commercial products out there to do a lot of management and analysis, some of which are better than others. Always be aware of the security implications. There are a lot of really obvious optimizations that one can perform on a site that will make it completely insecure. The direct actions thing is one to watch out for: you've got these direct actions in place, and you carry too much state in the URL. There was one case where someone decided to separate their shopping cart out from their main application, and when they put the products in the shopping cart, they put the price in the URL, and they believed it. Yeah, it wasn't good. With all of us having done this for so long, there's a tremendous number of community resources. There's Google. There are the Apple and Omni lists. There's, again, Google, which searches a lot of those lists and indexes everything else. There's Project Wonder and other random community projects that are out there, including a wealth of various free Java projects that you can leverage. And finally, again, Google: if you get an error message coming back from something, you can almost always put that error message into quote marks in Google, hit return, and find ten other people who are experiencing the same thing, one of whom might have the answer. So with that, for more information (that slide should have just said Google) there are sources of documentation and sample code. The documentation has been updated. I was reminded of one other thing as far as performance analysis is concerned: Shark and the CHUD tools now do Java as well, and that works with WebObjects. [Applause]
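
The shopping cart story in the summary (a price carried in the URL and then believed) can be illustrated with a minimal sketch. This is plain Java, not WebObjects API: the class name, the in-memory catalog, the product ids, and the query-string layout are all hypothetical and stand in for whatever the real application would look up in its database. The only point shown is that the server takes the product id from the URL and ignores any price the client sends.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from WebObjects or the session.
final class CartPricing {
    // Server-side prices in cents, keyed by product id (stand-in for a database lookup).
    private static final Map<String, Long> CATALOG_CENTS = new HashMap<>();
    static {
        CATALOG_CENTS.put("song-1", 99L);
        CATALOG_CENTS.put("album-7", 999L);
    }

    /** Parses "a=b&c=d" into a map. URL decoding is omitted for brevity. */
    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    /** Returns the price to charge, trusting only the product id from the URL. */
    static long priceToChargeCents(String query) {
        Map<String, String> params = parseQuery(query);
        String productId = params.get("product");
        Long price = (productId == null) ? null : CATALOG_CENTS.get(productId);
        if (price == null) {
            throw new IllegalArgumentException("unknown product: " + productId);
        }
        // Any "price" parameter in the URL is deliberately ignored.
        return price;
    }
}
```

With this shape, a request such as `product=song-1&price=1` is still charged the catalog price, and a request for a product the server does not know is rejected instead of being priced from the URL.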