WWDC2004 Session 622

Transcript

so good afternoon and welcome to session
I'm Bill Bumgarner I'm a manager of
Enterprise Objects Framework and the
Core Data framework I've also done
WebObjects development ever since two
point oh I'll also be joined by Max
Mueller lead engineer on the iTunes
Music Store which is probably the
highest throughput WebObjects
application ever built if you know of
another example I'd love to hear it I
can think of a few that were pretty high
throughput but nothing like that so
between us we've done some optimization
and would like to share some wisdom with
you we hope it's wisdom anyway so as an
introduction here scalability and low
latency is critical when you're doing
web-based applications or when you're
doing any kind of server based
application and building one
successfully is a challenge it requires
a lot of analysis a lot of tuning
planning and also one of the sort of
critical areas is security security
always complicates these things security
is kind of the antithesis of efficiency
and convenience so you know that will
challenge things and it's not just about
the code it's about the entire
development environment and deployment
environment the whole process and we're
going to cover a bunch of those
points this is going to be a very
information-rich session you're not
going to see a lot of pretty pictures
we'd also like to get through the
content fairly quickly because we've
always found that talking with you the
developers who are living this stuff day
to day and answering questions about
your specific problems always brings up
new things and is always interesting to
everyone so what you'll learn a little
bit about and these are all guidelines
unfortunately I don't have the perfect
formula for everyone is how to architect
apps to be fast and scalable some of the
patterns and pitfalls for success and
failure to achieve a fast application
that works
how you can analyze existing
applications to determine where the
bottlenecks are and then address them
and finally you know it's not just about
the code it's about getting it out there
and into production once you get it into
production there's a lot more services
available to you than just the WebObjects
application itself there's other
tools and resources you can use
depending on your budget and your
traffic rates to be able to keep an app
up and running and we're also going to
cover things like when good apps go bad
and there'll be a fox special this fall
like if you'll notice if you deploy on
say your eight-proc or say your eight-CPU
Solaris box you can see your CPU usage
get eight hundred percent that's always
fun you can even see it go to nine
hundred percent and then it takes down
the machine next to it there's also the
problems like when your working
set gets larger than physical memory and
then disk is obviously a lot slower than
RAM and when you start hitting the disk
for memory that's bad also there's the
situations which a lot of people seem
to miss which is like network saturation
when packets or connections are being
dropped I know one example of a very
large company who will remain unnamed
who called the FBI because they thought
they were being hacked and it turned out
that they had two Windows boxes and
their traffic rates were so high that
the TCP/IP stack was falling over and
dropping connections so you know that
gets back to the analysis thing and
calling the FBI to analyze the
performance problems of your apps
probably not a good idea there's also
just the basic situation of like when
your responses require too much
computation if you're going off and
calculating pi to 100,000 digits to
bring up your welcome page that's going
to be a problem your database can be
obviously overwhelmed you're hitting it
way too often or you're just pulling too
much data from it or pushing too much
data into it that can be bad and a great
one is external services because no
external service ever wants to admit
they're wrong so instrumenting your apps
such that when they're wrong you can
prove it will make
your life a lot easier so how can we fix
these problems well unfortunately
dollars are always involved the basic
problems you can just kind of throw
money at it and get more cpus get more
memory get more network get a faster
database that kind of thing but that
won't always work and that's what we're
going to focus on is when that doesn't
work you need to throw engineering at it
and what you're really looking to do in that
case is do less work to generate the
response or to make more efficient use
of the database or ideally don't hit the
database at all or optimize your
external service integration and that
becomes especially critical as your site
grows in size and you need to start
relying on those services more and more
so good rules to code by these are
somewhat obvious but they bear repeating
because when you get into the thick of things
it's easy to forget test driven
development now this is something that's
really become popular in the recent
years and I can't emphasize how valuable
this is put the unit tests together you
do it before you write the code that
tests the capabilities and the
requirements of the code and then run
them every single time you build and
when you push to deployment run them
there and run many of them in parallel
if you can they will uncover so many
obvious problems and save you a huge
amount of time the you know another
obvious one that everyone misses
including myself is make it work make it
right make it fast as developers we
often like to make problems a lot harder
than they really are because then we're
you know I don't know our egos are
boosted when we solve a hard problem
it's like look I've sorted something 20
times faster than the next guy never
mind it's only 10 elements don't
optimize without analysis and I can't
say this enough times I've walked into so
many situations where someone's got the
world's most optimized means of writing
out the HTML page that's only used once
when the user signs up the first time
it's like come on you know
go fix the problems that are really
there and also optimize in small steps
and test your results after each of
those steps and this gets back to
the unit testing if you've got the unit
tests in place then this makes
optimization a lot easier because you
can go off you can do the optimization
and know immediately if you've broken
things and you know the last point is
just it's an obvious one but everyone
does it if it ain't broke don't fix it
which as object-oriented developers that
means if it works now don't generalize
it because that's one that we commonly
do also there's a lot of optimization
that you can do at design time if you
understand your problem well and any
optimizations you can do before the
first lines of code are
written are always good keeping in mind
that you shouldn't optimize things too
prematurely so you really want to
understand the usage patterns of your
applications know how users are going to
use it know how it's going to be
administrated now clearly you know you
can't be omniscient but you know you can
do a lot of upfront work there and
of course make your entry page fast I
can't tell you the number of times I
walked into a client and I hit the entry
page and it did 600 SQL queries like
no no no make it static you know even if
you have to have an entry page that just
comes up that's just long enough to get
them into the site that's fine but make
that fast also your business logic
in your application should be designed
around the the response generation it
should be designed around what the
customers are going to be doing with the
application even if that makes the
administrative tools less convenient you
know the administrative tools they need
to be powerful they need to be intuitive
but if they take a little bit of time okay
that's okay because it's all about
administrating the content for the
purposes of the customer and if your
business is like any business I've
ever heard of it's the customer that
pays the bills and you know I've seen a
lot of people lose sight of that you
also want to make sure you're retaining
and reusing data and know when it's
out of date this will come up again but
it's something that bears repeating
you know if you go into the database to
pull out the list of states in the
country it's more than likely not going to
change in the next five minutes so keep
it around that introduces some
complexity in working with enterprise
objects because of course you can't
create relationships between entities
that are in different editing contexts so you
have to be a little bit careful and you
want to manage that cached data carefully
because if you end up caching everything
all the time and just leaving it in
memory then you're going to get back to
the part where you run out of memory
your machine starts swapping and then
your performance goes to hell so you
also want to let the database
server share in the work databases have
been around a long time and they do
things like sorting stuff really fast they
also do indexes and all this other stuff
and you really want to leverage those
wherever possible so one of the
optimizations is to minimize your memory
footprint if you can minimize the
footprint of the application then you
can have more instances running you can
balance the load more effectively and
this means sharing data across sessions
which is not something WebObjects
does naturally out of the box you're
going to have to play some games there
to get that kind of thing to work it's
not hard it is detail-oriented you also
want to clean up thoroughly by the way
it annoys me to see a developer make a
statement that because of garbage
collection they don't have to care about
cleaning up the code okay not every
object that's going to be collected by the
garbage collector is just using memory
it may be using scarce resources like
connections to close servers or it may
be keeping a file descriptor open or
something like that and you want to let
the code know that it's done and over
with also any nulled-out reference the
garbage collector doesn't have to
traverse to deal with collecting so
that's another good way to ensure
cleaning up happens quickly you want to
also clear your transient instance
variables when they're no longer in use
and this is not just an optimization
issue this is also a debugging issue if
you have stale state in your object
graph
and you don't clean up when you're done
and then you come along some later point
in time through some code paths that you
didn't really think about and you run
across the transient data and now it's
out of date and you didn't know it you're
hosed speaking from experience traders on
trading desks get really irritated when
they start seeing the wrong prices so
some scars you also want to do things
like set the right session timeout value
sessions will stick around and they'll
automatically go away unfortunately on
the web there's no quit button so it's
hard to know when the session should go
away so you need this gets back to
looking at the usage patterns look at
the use patterns of your app understand
how they are and understand when it's
safe to make the session go away then
there's also instrument I can't say this
enough either is that you know
instrument your applications we've got
some wonderful tools built in for
instrumenting them which Max is going
to demonstrate shortly and they do a
great job of allowing you to detect when
resources are being used or when things
are getting out of control and you want
to review those results often and
ideally you want the instrumentation to
be something you can turn on dynamically
in production so that if any customer
calls up and goes you know your app's not
working right you can turn this stuff on
and figure out what's going on and
unfortunately because we're building
applications where it's guaranteed that
our development environment is about as
different from our deployment
environment as is possible there's going
to be a whole series of problems that
will only come up in production which
makes life an adventure so instrument
and collect and then analyze also you
want to really you know plan the data
access plan when your queries when your
caches when your cache updating is going
to happen and understand the data
latency issues now data latency is all
about looking at your application and
understanding when the data becomes
stale and how often do you really need
to know to let app A know that app B's
state is updated and a great example of
this is
I ran into a client site and they
needed some optimization and their
site would just grind to a halt when it
had a bunch of users and what was
happening they had all this user
specific state a shopping cart and every
time the shopping cart got updated for
user one all other 30 app instances got
notified that that user's shopping cart
was updated even though that user's
shopping cart was only in one session in
one app instance you know it was a case
where they used the generic notification
method and by simply removing that all
of a sudden their app was stable even
under higher loads and they could have
more shopping carts that's always good
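The points above about retaining reference data, knowing when it is out of date, and planning when cache updates happen can be sketched in plain Java. This is only a minimal illustration under assumptions, not WebObjects or EOF API: the class name ReferenceDataCache, the loader callback standing in for a database fetch, the injected clock, and the five-minute lifetime are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical helper, not WebObjects or EOF API: keeps slow-changing
// reference data in memory and goes back to the loader only when stale.
final class ReferenceDataCache<T> {
    private final Supplier<T> loader;   // stands in for the real database fetch
    private final long maxAgeMillis;    // how old the data may get before a refetch
    private final LongSupplier clock;   // injectable time source, handy for testing
    private T value;
    private long loadedAt;
    private boolean valid = false;
    private int loads = 0;              // simple instrumentation: loader calls so far

    ReferenceDataCache(Supplier<T> loader, long maxAgeMillis, LongSupplier clock) {
        this.loader = loader;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    // Returns the cached value, refetching only if it was invalidated or aged out.
    synchronized T get() {
        long now = clock.getAsLong();
        if (!valid || now - loadedAt >= maxAgeMillis) {
            value = loader.get();
            loadedAt = now;
            valid = true;
            loads++;
        }
        return value;
    }

    // Explicit update path: call this when something reports the data changed.
    synchronized void invalidate() {
        valid = false;
    }

    synchronized int loadCount() {
        return loads;
    }
}

final class ReferenceDataCacheDemo {
    public static void main(String[] args) {
        long[] now = {0L};  // fake clock so the demo does not have to sleep
        ReferenceDataCache<List<String>> states = new ReferenceDataCache<>(
                () -> Arrays.asList("CA", "NY", "TX"),  // pretend this hits the database
                5 * 60 * 1000L,
                () -> now[0]);

        states.get();
        states.get();
        System.out.println("loads after two reads: " + states.loadCount());

        now[0] += 6 * 60 * 1000L;  // let the data age past its lifetime
        states.get();
        System.out.println("loads after the data aged out: " + states.loadCount());

        states.invalidate();       // someone told us the data changed
        states.get();
        System.out.println("loads after explicit invalidation: " + states.loadCount());
    }
}
```

The load counter is there in the spirit of the instrumentation advice: it makes it cheap to check how often a page really goes back to the database. One cache like this would be shared across sessions rather than created per user.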
you also want to do things in memory
wherever possible and to try to get zero
queries per response I mean the fewer
times you go to the database the fewer
times you go to disk the better off you
are and especially when you get into the
high throughput sites like the music
store you know any database hit is going
to be orders of magnitude more expensive
than being able to just blast back some
static thing from memory you want to
manage your faulting and manage your
caching so that you can explicitly
update your stale data you generally
want to avoid situations where
WebObjects is either deciding to populate
relationships on its own or populate
caches on its own because it will choose
a very general purpose solution that is
probably not optimized to your actual
use patterns and you also want to use a
shared read-only well it's a read-only
shared it should really just be called a
read-only editing context for reference
data it's a little bit tricky because
again you get into the situation where
you've got to be careful about how you
make relationships to and from objects
that were fetched into that editing
context but you know that way you can
have that single shared editing context
it's read-only so it never pays the
penalty of doing updates or inserts or
anything like that and then
another one that's really something
that's more of a modern optimization
this is something that's become much
easier with recent releases of
WebObjects is to partition your
functionality across multiple
applications and what that means is that
if you've got say a site where it has an
extensive search operation plus shopping
cart management plus say a library of
information plus a couple of other
different things an administrative tool
then you partition those different
features into different applications and
then you can control the number of
application instances individually and
control the configuration of those
applications individually to optimize
those particular applications for the
use they encounter that's very
important and with direct actions and
with putting the session ID in cookies
and then being able to reconnect across
the different apps you can achieve a lot
of efficiency and what you can also do
is use optimized object models per
application so you go in the Enterprise
Objects modeler and you can bring up
your object model and you've got
your full object model for the
administrative tool it's got read it's
got write it's got everything but then
all those entities where you don't need
to say edit particular fields in the
application your customers see you can
create a second EO model that has just
the fields you need for that particular
application this will reduce the memory
footprint size and reduce the amount of
data going to and from the database it
just overall will make the app faster
it gets a little tricky because of
course then you've got to keep the two
things in sync that's a pain but you
know it if you're faced with this
problem it can really help a lot and of
course obviously maximize reuse through
frameworks again got to point it out
because some people forget about that
I've run into that a number of times and
you also want to partition between
sessionful and sessionless and
threaded and non-threaded because
threading is always a very complex issue
there will be certain apps where
threading is an obvious optimization and
certain other apps where it's not so
obvious in particular things like where
you have multiple writers to the same
database you probably don't want to
thread that because bad things can
happen when you cross commits or do
partial transactions the sessionful
versus sessionless there may be a number
of apps like search apps are often
things that don't need to have per user
state and so if you can get rid of the
session in the search app and then make
it such that it's only caching has one
big cache for all the search data you can
make things extremely fast and Max can
talk about some of the optimizations
they did in the music store that go even
beyond that okay so you've done all the
right things at development time you
know it's been the world's most perfect
development schedule and you even
delivered early and you got the app in
production and now it's too slow or it's
using too much memory it's thrashing the
disks or you know the CPUs are just like
big space heaters or it
occasionally just crawls to its
knees just really really slow so now
what do you do well the first thing is
don't be silly and this comes from years
of experience I've been very silly
myself and I've seen many developers do
really silly things turn on or off the
obvious flags WOCachingEnabled yeah
that's a good one to have on
WODebuggingEnabled that's a really good one to
turn off you know it'll bring an app
to its knees go into production all of a
sudden a hundred thousand users hit it and
it's trying to log every SQL query
yeah not a good idea there's the built-in
NSLog facility plus there's of
course log4j which is wonderful because
they can both be dynamically configured
such that you can
hit a production server and pay the
penalty on logging on only the areas
that you know are problematic and that
gets back to instrumenting and making
sure that you have dynamic
instrumentation so you can actually
catch problems in production put indices
on your database tables sounds obvious
but you know that's one that people
often miss and then once they're in
production their data sizes grow to a
couple of orders of magnitude bigger
than they are in development and all of a sudden
they're left scratching their heads as
to why the heck it's so slow just to bring
up a simple page and this is another one
that is funny minimize the size of the
generated content we ran into a number
of sites where the initial page load
which was a nice big beautiful page with a
couple hundred images on it and a bunch
of text and all this other stuff it was
about a hundred and sixty k of HTML
which is way too much of which 40k of that
was comments and another 30 k of that
was because the image URLs were all
/images/clients/site/codename/
fubar.jpg or gif and by simply
erasing all of that making it /i/ for
the images we were able to reduce the
page from 160 k down to like 50 60 70 k
which is still too much and of course again
analyze analyze analyze one of the
challenges with building a WebObjects
based app or other dynamic applications
is that as a user goes through the site
unless you specifically do the
engineering work you're not going to get
a good record of what the heck they did
so what you need to do is leverage the
tools available to you there's WOEvent
there's the WOStatisticsStore both
of which Max will show parts of
you can also capture the direct action
activity direct actions are the one
exception to the rule direct actions
because the URL has the name of
the action that was fired in it the
direct actions leave a mark in the log
as to what the user's been doing if you
also pass the session ID in the URL
which
is an option then that means that you
can differentiate between different
users in your logs obviously you want to
tune the most used areas first if your
copyright page is slow probably nobody
cares so go for the high-traffic areas
also just as a recommendation I would
ensure that your checkout process or
other payment gathering process actually
runs fast had a client that was
wondering why no one was paying for
anything on their site it was because
when you click the button to go to the
thing where you filled out your credit
card information it's like a minute and
a half to load and everyone went away
and bought their stuff elsewhere so you
know optimize those kinds of things it's
not even always the most used pages that
are the ones that need the optimization
it's the user experience that needs
optimization and there's also just a
wealth of third-party tools available
which a lot of people aren't necessarily
aware of both OptimizeIt and JProbe work
really well web
server log analysis tools there is a
slew of free ones available plus
there's also commercial products like
Urchin that work very very well and
there's also every major database server
out there has SQL query analysis
tools built in use them show plan that
kind of thing the database server will
actually come back and tell you why your
queries took so long and enterprise
objects of course will provide you with
tools for optimizing those queries or
even running explicit SQL by hand
something to be avoided but sometimes it
can't be now I'd like to bring Max up
who's going to demonstrate some of the
stuff
[Applause]
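The advice earlier in the session about logging that can be switched on dynamically in production, so that the cost is paid only in the problem areas, can be illustrated with a small sketch. It uses the standard java.util.logging package rather than NSLog or log4j, and the class DynamicSqlLogging, the logger name app.sql, the CapturingHandler, and the runQuery method are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical example class; not part of WebObjects, NSLog, or log4j.
final class DynamicSqlLogging {
    // Logger for one problem area; other areas would get their own loggers.
    static final Logger SQL_LOG = Logger.getLogger("app.sql");

    // Counts how often a log message was actually built.
    static int buildCount = 0;

    // Keeps logged messages in memory so the demo can inspect them.
    static final class CapturingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            messages.add(record.getMessage());
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    // Stand-in for code that would run a query; only the logging is real here.
    static void runQuery(String sql) {
        // The supplier runs only when FINE is enabled for this logger,
        // so a disabled logger does not pay for building the message.
        SQL_LOG.fine(() -> {
            buildCount++;
            return "SQL: " + sql;
        });
    }

    public static void main(String[] args) {
        CapturingHandler handler = new CapturingHandler();
        SQL_LOG.setUseParentHandlers(false);  // keep the demo off the console handler
        SQL_LOG.addHandler(handler);

        SQL_LOG.setLevel(Level.OFF);          // normal production setting
        runQuery("select * from users");

        SQL_LOG.setLevel(Level.FINE);         // flipped at runtime while chasing a problem
        runQuery("select * from users where id = 1");

        SQL_LOG.setLevel(Level.OFF);          // and switched back off afterwards
        runQuery("select * from users where id = 2");

        System.out.println("messages captured: " + handler.messages);
        System.out.println("messages built: " + buildCount);
    }
}
```

In a real deployment the level change would come from some administrative hook rather than from main, and each subsystem would have its own logger so that only the suspect area is turned up.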
thanks Bill so what I did here is built
two very simple little applications one
is a Cocoa web services app it just
makes a simple query to a WebObjects app I
have also running on this machine and it
just brings back a bunch of SOAP objects
and lists them here and then you know
you can click through them and either
choose to then you know if it's two
hours you know you can choose to update
them and that basically makes a SOAP
call back so it's just a very very simple
little app and on the server
side we have a very simple WebObjects
app that I threw in a bit of Direct to Web
so you can basically see the current
users I mean this is something that you
can do out-of-the-box very very quickly
so the question then becomes if
things are slow what do you do so the
first thing you can look at is
the stats page which will kind
of just look at your overall statistics
of all your pages and say you know what
what am I doing wrong and it also
can give you a really good idea just
where the high traffic pages that you're
hitting are the pages that are coming
up so you can see that this has had 66
pages rendered so far and what you can
see here is you know this will
give me the number that has been served
and kind of the
averages and the outliers so
obviously given this app I should
optimize the WOEventDisplayPage
because this is what's
getting hit the most but it also can
give you a good idea you can see
the first query you know it's taking 1.2
seconds to come up and the first inspect
page you know relatively quickly and the
list page you can see that it says
it's rendered eight times
but the first time it took 3.66
seconds whereas the average is
0.6 seconds so that can sometimes
tell you that you have something that's
coming up that is rather slow and
so to then drill down and figure out
what you know what exactly am I doing
wrong with that you can go over to
something called the events so you
know the stats basically just gives you
kind of an overall look and
you know in the music store we'll log in to
various different apps and just kind of
check to see what the current
you know apps are
looking like in terms of what their
averages are what the outliers
are because sometimes we've got some
very long ones we'll need to start
looking into so for the events so this
is it's actually already but let's just
go to the setup so this is all stuff
that just comes right with
WebObjects very very simple stuff as long
as you specify the password nothing
worse than trying a million passwords
and realizing oh it's not quite in the
properties file and those are right
there so we're turning all of the
events on so we're setting it to
everything and then we're going back to
our app and we're hitting the list all
users so we can go right back to this
and show the event log now this has many
different options and it can be
rather non-intuitive exactly how these
are organized the one that I always
like is just to look at the
events grouped by the page and by the
component as this can show that off of
the main page here that the
main is obviously the one that's hurting
the most and you know the query page is
coming up slow as well relative to
everything else so pretty
much when you look at this you can say
well jeez you know the main page that's
the pig and so when you turn event
logging on it pretty much just covers
all your application and so anything
that's going on is getting logged with
events so then you'd say well jeez ok
it's not the new list page it
must be something else so it just kind of
gives you an ability to drill down so
you can see that this list
users
what's going on here
objectsWithFetchSpecification I think I've seen one of
those before so you can see it on list
users on the pole that's the action
that's what I have bound up to the
action link so when you're clicking on
the link it says list my users so you
can see that what's
hurting here is the fetch right here
is the fetch stack of pulling out
pulling up the users so later on we'll
come back and basically show some more
fine-grained tools as we go through some
of the database optimization to then try
to discover what's going on so we can go
back to slides so as Bill mentioned my
name is Max Mueller I am one of the lead
engineers on the iTunes Music Store and
I've been on this now since the very
beginning so we worked very hard and
came across all the issues that
Bill just went over and made some of
the mistakes even though we've all been
you know pretty much most of us on
the team have been doing this since the
WebObjects 3 days so this is kind of
just more stuff that
we've stumbled across that we've had
to optimize being one of the lead engineers
I pretty much get to eat breathe and
live optimizing these applications when
we launched in Europe the average
from when we launched to Europe we were
selling on average 5.86
songs per second that's how many we'd
sold since we launched so
if you're doing that many things per second
on average obviously the peak's
much higher and the trough lower small
little problems can very quickly turn
into very large problems and that's one
thing that we really found you know
we had to spend a lot of time tuning the
database because WebObjects itself is
very fast if you've got a pure web
app that's not doing any database
work and you've somehow made it slow
you've really done something wrong
because out of the box
it's very fast I mean you're able
to generate responses very quickly and
the request responses are very quick
so in terms of getting down to the
database work
a lot of the stuff that you can have
happening in kind of your administration
apps can also very quickly affect things
that are taking place in your production
applications so we have you know a
content management application that
our content team is constantly in they're
working on building the new
storefront and you know we would
notice that sometimes during the day the
store would get slow and it turns out
they're just doing all these queries
that are bringing out very large sets of
results and it's actually causing
the database a lot of
pain and so the store itself is starting
to get slow because the database is
having to service all the content
requests rather than actually the
requests from people that want to buy music
which you know is not a good thing so
basically putting in place fetch
limits putting in place requiring certain
queries can significantly reduce the
amount of database work that your
database is doing which
your users actually might not be
seeing so there's also a number of tools
from database vendors taking queries and
handing them off to your DBAs is
very very handy one tidbit that's
out there that we did was when
we open a new
connection to our database there's actually
stored procedures you can call in Oracle
to put in information about the
connection because by default
for all the JDBC stuff when it connects
all that the DBA is going to be
able to see is that you know it's a Java
process yeah well that doesn't really
help you if you've got a whole
heterogeneous mix of you know back-end
processing apps store apps and a very
large environment of Java applications
so you can put information into the
connections so that if they see some query
that's running amok they
can actually look at the connection
that's causing that query so we put in
the application name the host it's on
even when it started up because
sometimes we found that you know we'd
forget to stop an instance and it would
be off in la-la land when we would have
rolled a new version of software and be
like wow what's going on we thought we
fixed this problem and lo and
behold it's like well that guy's
actually been running for a week kind of
thing so binary data in the database is
a definite
no-no especially if you accidentally
check the locking column to where EOF
thinks it needs to lock on that
attribute and so it will issue it in the where
clause if you do need it move it out to a to-one
relationship also this goes back to what
Bill was saying about you know
having certain models or certain
attributes only for what you're
working on for your administration work
and different ones for maybe the
consumers so maybe you have a large CLOB
field that people are entering a bunch of
notes about you know about an album that
says this album you know this came from
this blah blah blah blah blah well that
CLOB which could get very large you know
we don't want the store app pulling that
thing up I mean so you can
either you know take the approach of
creating another model or at run time
you can turn off that attribute say you
know really EOF you know you don't need
to worry about that one leave that in
the database when we're in this kind of
read only mode because nobody's going to
be you don't need the CLOB so we use
the shared editing context for pretty
much just reference data kind of
enumerated types in the sense that it's
kind of like just a type-safe enum
that you know where it says you know
key one is this key and key two is
this key instead of kind of having a
lookup that's pretty much what we use
it for so when your app starts up
all the shared editing context
information is loaded and then it
doesn't have to be refetched and
likewise when you trip the relationships
to the shared info that you're using in
the shared context it doesn't
require a database trip so there is
inter-app messaging if you
need to synchronize state between
applications for you know critical
pieces but with snapshots a lot
of the times just telling it that
it's no longer a valid
snapshot is good enough so it's not
that you really want to move all the
snapshots over and say you know here's
the new snapshot it's more just along
the lines of saying hey you know
the snapshot for this guy got
updated forget about this one so the next time
you need it you better go get it from the
database rah Rosa is is a useful
technique for pulling back
large content where you don't need all
the snapshot stuff that you're not going
to be editing now within the
store we have all these popularity
caches that are going to be rebuilt you
know it's like if you like this you'll
like that well as the number of
items that you can buy expands we
will pull that stuff in with
raw row fetches actually in a separate
thread so the thread can just kind of
sit in the background and every so often
determine it needs to go out to the
database and pull in a new
set of the recommendations
so when we're rendering some of the
pages you can sync it up so it
basically refreshes its cache you know
in the background thread
using the raw row fetches
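
A minimal sketch of the background refresh idea just described, in plain Java rather than EOF. The class name BackgroundCache and the Supplier that stands in for the raw row fetch are assumptions made for illustration; the talk does not show the store's actual code.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical cache: request threads read an immutable snapshot without
// locking while a single background thread swaps in freshly loaded data.
class BackgroundCache<T> {
    private final AtomicReference<List<T>> current =
        new AtomicReference<>(List.of());
    private final Supplier<List<T>> loader;   // stands in for the raw row fetch
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "cache-refresh");
            t.setDaemon(true);                // never keeps the JVM alive
            return t;
        });

    BackgroundCache(Supplier<List<T>> loader) {
        this.loader = loader;
    }

    // Load once synchronously, then reload in the background on a fixed delay.
    void start(long periodMillis) {
        refresh();
        timer.scheduleWithFixedDelay(this::refresh, periodMillis, periodMillis,
                                     TimeUnit.MILLISECONDS);
    }

    void refresh() {
        try {
            current.set(List.copyOf(loader.get()));
        } catch (RuntimeException e) {
            // Keep serving the previous snapshot if a reload fails.
        }
    }

    List<T> get() {
        return current.get();
    }

    void stop() {
        timer.shutdownNow();
    }
}
```

Page rendering code would only ever call get(), so a slow reload never blocks a request; it just sees the older snapshot a little longer.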
caching in memory is good a lot of the
times if you have
kind of a read only application you can
look at instead using one kind of
shared editing context that you'll
then pull stuff into and then
hold on to and so the application will
hold the reference to
this it's not the shared editing
context but it's a shared editing
context adapter debugging enabled
obviously that's when you can just turn
it on and say what SQL is
going on here this one is a
godsend java.lang.Throwable being able
to generate back traces anywhere is very
handy I'll be showing you how we can
change the logging pattern at runtime
to start throwing in back traces
anywhere we want which can really help a
lot of the times if you're just looking
at the SQL you'll be like whoa hold on
where's that coming from one of the
very common mistakes that a lot of
people will make is they'll fetch an
object and then they're like oh maybe
they prefetch everything
and then on the next
request they'll say well we need to go
ahead and invalidate the editing context
here so we make sure we have fresh data
the problem is you've got the object
graph there and so
you've got an object and you've already
spent the time to prefetch
all the stuff that you got in two
or three or four fetches but then you've
invalidated all the data
underneath yourself and you start
to trip over these things again it's
like oh that's actually been turned back
into a fault I need to go out to the
database again and you're like well jeez
I prefetched everything and now I've got
SQL going out the ying-yang so
turning on adapter debugging can
help you see it but being able to
actually see the back trace of where the
faults are firing so I'll be showing you
in the demo what we use in the music
store quite a bit we turn on there's
actually a delegate hook and we throw
back traces whenever a fault is
fired so a lot of times we'll turn that
on and then go to render a page that has
somehow started to get slow for some
reason and a lot of the times it isn't
because that page itself has gotten slow
it's because something else is
triggering something that's causing you
know the snapshots to get old or wiped
out so yeah excess faulting
that's a hot one also
another trick that you can do with
java.lang.Throwable is in the
constructor of a Java object if you have
you know debugging turned on some
debugging flag you can actually
create a Throwable object in the
constructor and stash that away in an
ivar at which point then at any point
later on you can always ask the object
you know what's the stack trace where
you were created which can be very handy
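
The constructor trick just described can be sketched in a few lines of plain Java. The class name TrackedObject and the debugCreation flag are hypothetical names for illustration; a real app would hang this off its own session or component superclass and its own configuration.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical base class: when the debug flag is on, each instance remembers
// the stack trace of the code that constructed it.
class TrackedObject {
    // Assumed toggle; a real app would read this from its own configuration.
    static volatile boolean debugCreation = false;

    private final Throwable creationTrace;

    TrackedObject() {
        // A Throwable captures the current stack when it is constructed.
        // It is only stored here, never thrown.
        creationTrace = debugCreation ? new Throwable("created here") : null;
    }

    // Returns the creation stack trace as text, or a note if tracking was off.
    String creationStackTrace() {
        if (creationTrace == null) {
            return "(creation tracking was off)";
        }
        StringWriter out = new StringWriter();
        creationTrace.printStackTrace(new PrintWriter(out, true));
        return out.toString();
    }
}
```

Because the Throwable is only created when the flag is set, instances built with the flag off pay nothing beyond one null field.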
say with sessions where sometimes
we have several apps that are
completely sessionless and all of a
sudden we'll start to see sessions
popping up and we'll be like what's
going on and so
we'll set this value and then we can at
a later point just ask the
session store basically say dump
out all your sessions and give me the
back trace because there's somebody
who's doing something bad and more often
than not it's a WOActiveImage somebody
put a WOActiveImage in and if you don't
bind it up correctly it'll go ahead and
create a session and create a component
action for you
handy but you know not what you want
when somebody just accidentally forgot
to do a binding and then all of a sudden
you've got these pages that are
generating lots of sessions because a
lot of times that won't be referenced
and so you'll get a
lot of them created so you have one
request coming in and you can somehow
get multiple sessions created which just
causes all sorts of nightmares fetching
EOs for pop-ups yes yeah the
localInstanceOfObject absolutely yes
also be aware if your fetch
timestamp is set the fetch
timestamp lag is saying you know how new
a snapshot do you care about so
oftentimes what people will do is
they'll create a new editing context
they'll set the fetch timestamp
to right now and say I want everything
fresh and then they'll start doing the
localInstanceOfObject and of course
then when they touch the object that
goes to the database so you could have
fetched all the stuff and then you're
like oh I need to do a local instance
now let me create a
fresh editing context and so with that
usage pattern all of a sudden you'd be
like well I thought I was doing
something good but it turns out that
that localInstanceOfObject can
actually cause a lot of trips
to the database simplified object model
and you know if you can you know we're
at three or four hundred entities right
now in the music store and it's growing
more each week the deep inheritance
the vertical inheritance is the only
efficient form of inheritance
in EOF I've used it for many years now
works rather well it allows you to you
know have a user and then a person
user and all these kind of things mapped
onto the same table and you can still
have relationships to the top abstract
entity so when you trip a relationship
you could be getting all the different
sub entities
but EOF handles that gracefully for you
underneath the covers you know the
other form of inheritance is across
multiple
tables and that one is rather
inefficient because anytime you trip a
fault it's going to be like is it in
this table is it in this table is it
in this table so if you
do have to use that type of inheritance
the best way is to always
model relationships down to all the
non abstract sub entities views on the
database queries if you can get an
efficient one a
lot of times if you
don't need all the bind variables coming
through a view can be efficient
the excess back pointers is a really hot
ticket one because a lot of the times
you'll have a situation
where a user you know in the music store
for example so you have a user that's
got many purchases so when somebody
clicks buy you
know you're going to be creating a
purchase for them and so the tendency
might be just to say you know create a
purchase addObjectToBothSidesOfRelationship
you know to get the user on
and save it well if you recall Steve's
keynote a little while back you know the
number one person in the music store
27,000 115 songs at that point that's a
whole lot of purchases and so what
happens is when you
addObjectToBothSidesOfRelationship if
that relationship
hasn't been faulted you're going to trip
it which means you're going to be
pulling in all those things so all of a
sudden you're like well geez why is it
slow why is my app getting slow at
random intervals it was fine for you
know three minutes in production what
the heck happened there not only that
but the memory goes through
the roof well we just pulled in twenty
seven thousand things just because
they're trying to purchase one more
thing so you don't have to trip the
relationship whereas you know if you
just create the purchase set the user
save it to the database that's fine so
for the back relationships
you can just not model it just you
know remove it from
the model completely so I mean yeah hope
this isn't too advanced just trying to
cover a bunch of stuff
that we found so right
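
The back pointer cost described above can be sketched with plain Java stand-ins. LazyUser and LazyPurchase are hypothetical classes, not EOF ones, and rowsLoaded only simulates the database rows that tripping the to-many fault would pull in.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for a user with a lazily loaded to-many relationship.
// rowsLoaded counts simulated database rows fetched.
class LazyUser {
    static int rowsLoaded = 0;

    private final int storedPurchaseCount;   // rows that exist in the "database"
    private List<LazyPurchase> purchases;    // null until the fault is tripped

    LazyUser(int storedPurchaseCount) {
        this.storedPurchaseCount = storedPurchaseCount;
    }

    // Touching the relationship trips the fault and loads every row.
    List<LazyPurchase> purchases() {
        if (purchases == null) {
            purchases = new ArrayList<>();
            for (int i = 0; i < storedPurchaseCount; i++) {
                purchases.add(new LazyPurchase(this));
                rowsLoaded++;
            }
        }
        return purchases;
    }
}

class LazyPurchase {
    final LazyUser user;

    LazyPurchase(LazyUser user) {
        this.user = user;
    }

    // Maintains both sides: it has to touch user.purchases(), which loads
    // the user's whole purchase history first.
    static LazyPurchase createOnBothSides(LazyUser user) {
        LazyPurchase p = new LazyPurchase(user);
        user.purchases().add(p);
        return p;
    }

    // Sets only the to-one side: the to-many list is never touched.
    static LazyPurchase createWithUserOnly(LazyUser user) {
        return new LazyPurchase(user);
    }
}
```

With a user holding roughly 27,000 stored purchases, createOnBothSides loads all of them to add one more, while createWithUserOnly loads none.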
that's a little known technique
little-known database technique so
you know databases are
built for these kind of things
that's
what you pay the big bucks for for the
big tools is that they you know they
provide all the tools and you know
if you've got a good DBA you can
get in there or you can get in there
yourself and look if you
can identify your top queries in
your database then you can start to go
back and look to see where they're
coming from in your application about
once a month our DBAs will send us a
spreadsheet and be like all right here's
the top ten go for it kind of thing so
yeah then we can start hunting around
like okay where's this one coming from
and okay who did this kind of thing so
very useful yes so generated
SQL that's obviously one you know
we basically focus on optimizing
you know the parts
that are going to be hit the most
we don't optimize the copyright
page the copyright page is slow and
stored procedures they're useful
there are some places where
they are very useful but
other places if you're using a
stored procedure to update rows that
you're also modeling you can really wind
up in a bad state very quickly where you
just execute a stored procedure call
it's updated something underneath and
now your snapshot's out of date there
are techniques that you can use to keep
your snapshots up to date but it's yeah
a pain so let's if we can go back over to
demo four so I'll just show a few
bits here we're okay on time so coming
back here we can see that
the list users is inefficient so then
it's well what do we do
so the first thing is you check your
logs okay nothing in the logs so I'll
bring up this and I'll show you just a
few tricks that we use and
all this stuff is in Project Wonder
that we've contributed back so nothing
I'm doing here is proprietary or
something like that so the first thing
we can start to look at and be like
well geez now let's look at this
without restarting the app by the way so
let's start looking at our
database traffic let's see what's going
on for our little app here so list
users all right whoa a whole
lot of SQL there huh so we can see
that we're fetching stuff
from the user table but then all of a
sudden we've got user infos all over
here a whole lot of user info columns so
you're like what's going on here so has
user infos you know there's six there
well there's more than six queries
sorry so
the SQL's not really
that useful here so it looks
like a fault firing let's see when we're
actually firing faults
so all of this thing and so
let's go here yeehaw so then we can look
down here and so here's Main here's my
list users method so the setDataSource
so let me show you the code very
simple so I'm just creating a database
data source of users setting the data
source on the list page handing it off
that's all the code that's going on here
and we have one kind of web service
that the other app is using to
talk to it this is the plain
vanilla out of the box WebObjects
just handling all the web services
so I wrote a bit of code here I did it
myself this is a user
service and so I expose the find users
method if you saw Bob's talk on
the first part of the
introduction to WebObjects
there's this WSMakeStubs command
line app that you can run so all I did
was I wrote the find users and an update
user update user takes the user ID the
first name last name and the find users
takes a first name and last name and
then in my application I said
WOWebServiceRegistrar register the
class of this guy and I ran the
WSMakeStubs on
this it generated for the Cocoa side all
these stubs you have the WS generated
object and then I just wrapped it in a
little bit of user services so I mean
you know it took all of a few hours to
do nothing complex here at all
actually not even a few hours half
an hour so going back here now
we can see basically the back trace
and so that's happening on the
setDataSource here's our fetch
specification so we
can see we're fetching users no
qualifiers no prefetching keys
then next we're fetching user info so
here's Main here's setDataSource here's
the fetch well what's going on here
awakeFromFetch so I'm just walking up
the trace to User line 33 so say well
what's going on User line 33 so User
line 33 if
first name is Max and I've got this test
user info then I'm fetching it 10 times
not so good so let me set this oh
look left it on in production too oops
so now I can clear out the console go
back to the app now hit list users go
back to this guy and lo and behold a
little bit better so you know these are
just a few techniques
that you can use to kind of
quickly get your head around kind of
what's going on I'll show you one more
which is so we use log4j for
pretty much everything we do and one of
the nice features that it has is you can
see this current one right here I'm
saying my conversion pattern I want to
use and I'm saying a date I want
to have my memory stats and so this is
used versus free memory what category is
logging the line number it's calling
from the priority level which this is
all the log4j stuff like a priority of
debug or fatal I forget what X is
then the message and then a new line so
when we go back and look at when one of
these gets called
see sorry I deleted all my stuff so
we're back here so let's go back and
look at the first part of this line you
know we have the date so far this app
is using 11 megabytes and it's got
22.9 53 this is being called
from the
class ERXDatabaseContextDelegate line
149 this is a debug and then this is the
message so it's printing the stack trace
itself trying to think so if I
go back to this if I turn the fault
firing off because that guy is going
ahead and putting its own back traces in
there let's save this let's go back here
one more time back to the home page so
list
users so here we have
all the same information coming in this
is the log4j bridge that's just
capturing NSLog events and routing them
to log4j and so again we're getting
messages coming through here but now
what happens is lo and behold in
production now for some reason there's
something going wrong you've got
some random SQL coming out from this
application so you connect in you change
the pattern to this pattern so this
gives you the WebObjects stuff it will
give you the name it'll give you the
number of sessions it'll give you the
port it's bound on it'll give you the
pid of the process
it'll give you your VM stats
the priority but then I also put %@
at the end which says you know go ahead
and dump that back trace I want to see
where this is coming from now when we
turn the fire hose on clear the console
one more time
lo and behold I didn't wait long
enough okay I turned it off so
you can see that this does have all the
information here so it has the
name of the application its pid
the port so far I've created
10 active sessions the memory used all
that kind of stuff as well as you know
the stack trace of where that log
message is coming from so these are just
a few of the techniques that you know
we'll use to hunt down performance
problems we're looking at where the
faults are firing where the database
traffic is and then you can also look at
the WOEvents if you want to get more
fine-grained and look at where your
components are you know potentially
causing problems so back to five
thank you thank you max so as you can
see there's a tremendous wealth of
debugging opportunities both offered by
WebObjects and also in third-party
tools everything max demonstrated is
like he said you know available Project
Wonder has a tremendous wealth of stuff
in it even if you don't want to take on
the full project going there and reading
some of the code and learning from there
is a wonderful way to learn about
some of the optimization opportunities
available as well as a number of other
things so now moving on okay so you got
your database fast now we need to start
making the actual application fast
there's some tricks to optimizing
components WOComponents are a
wonderful thing I mean they're really
reusable and you can plug and play and
all that except plugging and playing
WOComponents has a price and if you
have pages that are getting hit a lot if
you have a deep nesting hierarchy of
WOComponents you'll find that there's a
lot of overhead there and a lot of times
you can reduce the overhead by
simplifying the component nesting kind
of unfortunate because it moves away
from reuse a little bit but it will
yield a lot of efficiency at times the
next point is you know use
object-oriented programming like why are
you pointing that out well a lot of
people when they're doing the component
side they sort of lose sight of the fact
that WOComponents are really just a
hierarchy of objects
so there's no reason why you can't
define an abstract superclass for your
WOComponents to encapsulate the
common functionality and then make all
the other components in your app
subclasses of that it's also a really
good place to stick in debugging hooks
and other annotations that will help you
during your analysis phase and to
reiterate make sure the debugging hooks
can be toggled easily as max
demonstrated you don't want to pay that
overhead all the time there's even I
think a WOComponent floating around
somewhere that'll embed a stack trace in
your page it's kind of fun you also want
to consider caching your pages and
using stateless components the less
state an individual component has the
more it can be reused across the rest of
the application and the less cost there
is associated with bringing it back
into play when the user's moving across
the application and also you know make
static content static if you have pages
in your application that just
aren't changing that often or
aren't changing at all push them out as
static web pages and push the
session ID into a cookie such
that when the user navigates through the
static content and comes back into the
site they get reconnected to their
session and see their session state
again you can use direct actions for
this too so it's very easy to embed URLs
in that static content that will bring
the user back into the application
wherever they left or wherever you want
them to and the same thing goes for
multiple applications using the multiple
application approach to optimization you
can use the direct actions again to both
navigate the user between different apps
while preserving state as well as to
control how they enter those individual
applications also static content doesn't
really have to be static you can play
some fun tricks like using proxy servers
and things like that to cause content to
be generated once dynamically and then
cached for the next set of users you
know this is another one of those areas
that a lot of people forget about you're
spending all your time developing
WebObjects applications writing code
writing EO models testing databases etc
you still need to go and optimize for
fast browser display and that again
means checking the total size of the
generated page
smaller pages display faster they parse
faster they render faster they're just
faster you want to batch display of
longer sets of data as was mentioned on
a previous slide no user out there is
probably going to scroll past the first
30 to 50 hits when they're doing a
search or a query so batch it up and
don't even fetch those other ones much
less generate the HTML and again
generate those short URLs look for
opportunities anything you
can do to reduce content size is going
to make it go across the wire faster you
also want to do better things with
images and by better I mean use smaller
images compress them more or use an
appropriate format for the kind of
content that's being presented you also
want to use common image names so if you
have the same image repeated across the
site or if you have say a graph that's
being rendered based on say time series
data where it only updates every five
minutes generate a static image put it
somewhere making sure everything uses
that same image name and of course use
less images you know be more intelligent
about the use of the image because every
image is not only the cost of pushing
the bits of the image across the wire
it's also the cost of a whole new
decoding session in the browser for
dealing with that image and a whole new
connection through to the web server and
whatever you do don't stick the images
in the database it just doesn't
make any sense web servers are really
good at caching binary data and serving
it up caching static data and serving it
up as soon as you stick an image in a
database you go from web server file
system throws the data up the wire to
the client you're done to web server
WOAdapter dispatch to an application
application calls a database database
pulls the data back you end up with like
five copies of the data in memory and a
whole bunch of other overhead also
now pretty much all the browsers out
there support dealing with compressed
content so if you have situations where
you're just producing fairly large pages
with modern CPUs it's almost always more
efficient just to go ahead and compress
the data on the server side send it over
the wire in the compressed form and let
the browser uncompress it Apache has
mod_gzip which is
very easy to install
another area of optimization is if you
generate crap HTML it takes the browser
longer to figure out what it should do
with it so if you generate
well-structured HTML the browsers render
faster and this is an interesting one
because of course in the early days of
HTML it didn't matter if you closed your
paragraph tags or your table tags or
anything else because the browser would
figure it out thanks Microsoft and as
you know as things evolved it not only
affects the HTML processing and
parsing because now the browser has to
look ahead and then go oh well that
tag over there probably means this
one over here needs to be closed but it
also confuses things on the WebObjects
side WebObjects components really want
to generate a hierarchy of tags that are
nested in a sane fashion so look for
things like overlap problems using an
HTML tool like Weblint to check the
generated content and you really got to
check that generated content
because you know a WebObjects page is
generated from many components all
spewing forth HTML that then gets
serialized into one big response and you
need to check the content in the context
of that whole response because there can
be overlap problems that are caused by
component mis-nestings and things like
that simplifying table structures is
another great way to reduce content size
and moving to CSS or having a site-wide
common CSS document CSS being cascading
style sheets which fortunately browsers
seem to support though inconsistently
is another great way to both reduce the
content size speed up the rendering time
and make your site more flexible so
there's also optimizing Direct To the
Direct To stack is incredibly powerful
the whole notion of having rule-based
content generation and data management
and navigation management and user
management and everything else I don't
know of any other tool out there that
compares with WebObjects when it comes
to this but it is also overhead and
there's a different approach to
optimizing it
and Max can certainly answer any
questions in this regard so in the
context of Direct To the rules engine
has this notion of significant keys and
unbound keys significant keys are the
ones that are the focal points and
the ones that will be cached etc the
unbound keys are the ones that will
require calculation a lot of faulting
through the rule system to figure out
the values of those things that's very
expensive you want to avoid that you
also want to optimize the data being
accessed by property keys to a given
task or page so the WebObjects Direct To
stack has this very strong notion that
the user is doing something somewhere
and you can optimize all of your data
access around that notion it gives you a
lot of hints about what the user's doing
at any given time there's a number of
debugging hooks in both EOF and
Direct To and also down at the lower
layers a lot of which you can find in
Project Wonder and there's warm-up
techniques you can do to cause the rule
caching system to warm up its state such
that subsequent evaluations of those
rules will be much more efficient like
one of the most common complaints we see
about Direct to Web or Direct to Java
Client is the first hit always takes a
long time because the
rule cache is empty and it has to go off
and evaluate all these rules to fill the
rule cache well the rule cache most of
it is actually going to be static
results so there's like entire huge sets
of rules that just never need to be
evaluated again because the results
never change and so you don't want your
first user to have to pay that penalty
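
The warm-up idea can be sketched generically as a memoizing cache that is filled at startup. WarmableCache is a hypothetical class and the evaluate function only stands in for rule evaluation; this is not the Direct To rule cache API.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical memoizing cache standing in for a rule cache: the expensive
// evaluation runs once per key and the result is reused afterwards.
class WarmableCache<K, V> {
    private final Map<K, V> results = new ConcurrentHashMap<>();
    private final Function<K, V> evaluate;   // the slow rule evaluation

    WarmableCache(Function<K, V> evaluate) {
        this.evaluate = evaluate;
    }

    V get(K key) {
        return results.computeIfAbsent(key, evaluate);
    }

    // Called at startup, before the first request, with the keys that are
    // known to produce static results.
    void warmUp(List<K> keys) {
        for (K key : keys) {
            get(key);
        }
    }

    int size() {
        return results.size();
    }
}
```

Calling warmUp from application startup moves the evaluation cost off the first user's request; later get calls for the same keys are map lookups.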
and when you're building custom
components and this is true of both
Direct To as well as everything else go
for stateless stateless means that
there's no session specific data it
means the component can be shared across
the app it doesn't have to be archived
and unarchived and reconstituted during
request response it's just a lot more
efficient then also you got to look
beyond the WebObjects application
itself you know make sure your web
server is doing its share of the work
and that means tuning the
configuration like Apache has
mod_status and one other module which I
can't remember the name of anyway that
out of the box can give you a lot of
information about what your web server's
doing plus look to your web server
especially as your site grows you'll
want to look to your web server to be
able to farm out content across multiple
web servers multiple boxes and even up
to the level of farming out to say an
Akamai or the other content aggregators
because of course once you do that then
any hit that doesn't hit your web server
is more CPU cycles for the primary
content generation offloading all the
content serving you can like images
files multimedia to other servers is a
great thing one of the challenges as
always if you have a site that's secure
as soon as you go into the HTTPS then
that means all the images that are on
that page have to be encrypted as well
because web browsers don't like mixing
encrypted and unencrypted content this
gets back to security being the
antithesis of efficiency and convenience
so that makes for quite the adventure
because now you're going to have to
figure out how to pay the price of
encrypting the content that's related to
that page including the static content
and of course you know encrypted
downloads is a really bad idea nothing
like encrypting say a 45 megabyte
download for one user because everything
has to be encrypted per individual user
caching proxy servers this is a really
neat technology you can use something
like Squid or the caching proxy server
in Apache so the first user that hits
your site will pay the price of the
dynamic generation but then that HTML
page gets stuck in a caching proxy
server that's in between the web server
front line and your WebObjects
application once that item is in there
you can then put timeouts on it or you
could have an external interface for
invalidating it or the easiest thing to
do is to just simply have a dynamic page
which has the set of URLs that lead to
what will be cached and just change
those URLs once it's invalidated and
that way you know since it's a URL
that hasn't been cached the caching
proxy server will go oh I got to go get
it go get it cache it next user will be
really fast
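
The change-the-URL approach to invalidation can be sketched as a small helper used by the dynamic page. VersionedUrls and the /cached/ path layout are assumptions for illustration only; any scheme that yields a URL the proxy has not seen before would do.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: builds the URLs that the dynamic page links to.
// Bumping a page's version changes its URL, so the caching proxy sees a URL
// it has never stored and goes back to the application for it.
class VersionedUrls {
    private final Map<String, Integer> versions = new ConcurrentHashMap<>();

    String urlFor(String page) {
        int version = versions.getOrDefault(page, 1);
        return "/cached/v" + version + "/" + page;
    }

    // Call when the underlying data changes: version 1 becomes 2, then 3...
    void invalidate(String page) {
        versions.merge(page, 2, (old, ignored) -> old + 1);
    }
}
```

The old entry is never purged explicitly; it simply stops being linked to and ages out of the proxy under whatever expiry policy is configured.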
optimizing deployment deployment is just
such an adventure it is just
different than development and max is
coming on stage because he's done a lot
of deployment you know it's how large
a memory footprint can you live with
for each of your applications because
you got to make sure that when you get
this thing into production and you start
hitting high loads you don't hit the
physical memory limit because physical
memory is so much faster than swapping
to hard drive that as soon as your app
starts swapping or your server starts
swapping it's done for it's circling the
bowl you're you know you're looking to
reboot pretty soon you want to try
for a heterogeneous mixture
of applications across servers so
if you're running multiple application
types mix it up a bit that way if
one application goes pathological I mean
it's going to take out your whole site
but you have some more wiggle room
before it does so and you know to be
honest about it like if the search
component in say iTunes Music Store goes
down and it was all on one server the
site's unusable anyway so you know
having a little component of the site up
and running is probably really not that
advantageous if it's going to cause the
site to be unusable there's also tuning
the adapter timeout values and making
sure your worker threads settings
are all you know set up correctly
because as is the case with
most things the generic out of the
box configuration is pretty much
guaranteed to be wrong for your
application this is also why
WebObjects doesn't do synchronization of
data between instances we could do
a generic solution for that but it'd be
guaranteed that it would be inefficient
for your specific business problem and
you also want to determine ahead of time
how you're going to monitor the system
for problems I mean every component in
the system as you add more machines as
you add more complexity you add
firewalls and everything these things
need to be monitored and you need to
plan ahead
for a catastrophe and max has got some
great images on that I'm sure and I'm
going to turn it over to max now to talk
about this particularly fun issue so
i just wanted to finish up with the the
production quality deadlocks which any
of you have been in high traffic site
it's always one of those things that if
you've got something if you've got a
recipe in it it's definitely gonna get
baked in production and you're going to
find in production so if you just a few
topics one is that kill minus quit you
know within the Java world that would
basically view full full stack traces to
to every for the running ass for all the
different threads that are currently
running one of the one of the most
common places is having
initialization things that are happening
in your dispatchRequest because
dispatchRequest is completely
threaded it will have multiple
threads coming through there at any
given point so even if you have your
app set in kind of single threaded
mode where it's not doing concurrent requests
your dispatchRequest is still
threaded likewise any of the code in
there if you you know if you
have one method there it's like oh let
me go out fetch something from an editing
context cache that value and then
that value will then be used for any
request that comes in you can
guarantee that when you start
up two threads are
immediately going to get in there and
start doing EOF stuff which
you will run into serious problems
with most of the deadlocks we
have to track down are because of EOF or
because we're not locking things correctly or
the multiple EOF stacks and a single
shared editing context that one will
kill you every single time so you have
to if you want to use full-blown
EOs and multiple different EOF stacks
you definitely have to create a new shared
editing context for each one of the
stacks that you want to use if
by default you don't do anything just new
EOObjectStoreCoordinator new
EOEditingContext fetch an EO and you're
going to be dead in the
water so the monitoring to detect wedged
instances that's a big
one if you are having this
problem and also it's very important to
start your load testing before you
start your application a lot of the
time we ran into several issues
where we didn't detect a certain
deadlock condition because we had all of
our apps up and running and then we were
like all right now turn on the load test
whereas if we had had the load
testing up which is what you have in
production if you bounce your apps
because you have users and
they're clicking on everything and then
the app has some initialization deadlock
when it starts up it's like a switch and
the dead time interval in the adapters
can be a killer because
that interval basically says
if we try an
instance and it doesn't respond back how
long should we wait until we try it
again and so you get this nice you know
ripple effect where the wave will crash
down and all your apps will basically
be registered as dead if they're
starting up and so then it will
basically wedge all your web servers and
your web servers will finally
come back and your apps will be like now
we're ready and then the web servers are
going well here you go and the wave will
come sweeping over the apps like no no no
no no more no more and so they'll wedge
and the dead timeouts will set and so
you get this nice you know seesaw
effect where all of a sudden things will be
really fast and then things will be really slow
really fast and really slow so
that's yeah so it's just one last slide
of these things so yeah users are funny
because they pay your bills but they hate
you because as soon as your app starts
misbehaving how do they respond by
clicking like spastic monkeys
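
The startup race described in this part of the session (several request threads arriving at once and each trying to fetch and cache the same value) can be sketched in plain Java. This is only an illustration under stated assumptions, not code from the session: CachedLookup and expensiveLoad are hypothetical names, and the string returned stands in for whatever would really be fetched. It uses the initialization-on-demand holder idiom, where the JVM guarantees the holder class is initialized once and other threads wait until that finishes. It does not replace the editing context locking the speakers talk about; any real fetch inside the loader would still need that.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a value that is expensive to fetch once at startup.
final class CachedLookup {
    // Counts how many times the expensive load actually ran.
    static final AtomicInteger loads = new AtomicInteger();

    // Initialization-on-demand holder: the JVM runs Holder's static
    // initializer exactly once, the first time Holder.VALUE is touched,
    // and makes any other thread wait until it has finished.
    private static final class Holder {
        static final String VALUE = expensiveLoad();
    }

    // Placeholder for the real fetch (assumption: it returns a String).
    private static String expensiveLoad() {
        loads.incrementAndGet();
        return "cached-value";
    }

    // Safe to call from any number of request threads.
    static String value() {
        return Holder.VALUE;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate several request threads hitting the cache right at startup.
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(CachedLookup::value);
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println("loads = " + loads.get());
    }
}
```

The point of the design is that the request path never checks a "have I initialized yet" flag by hand, which is where two threads arriving together usually get into trouble.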
fine so you know quick summary here so
start thinking fast from the beginning
but don't overdo it I mean you want to
instrument you want to analyze
you want to track your performance
over time but invariably you want to
stay calm and that's a point to make
again stay calm because when things
start going wrong the worst thing you
can do is to just you know throw
your hands up in the air and start
rebooting things at random without
understanding the problem or getting
your analysis tools up and running or
gathering metrics or gathering evidence
because simply you know doing the
spastic monkey routine on the reboot
button is not going to fix anything it's
just going to make it happen again
sometime later there's a tremendous
wealth of tools available it's easy to
forget exactly how much stuff is out
there but you know the industry as a
whole has been doing web based
deployments now for more than a decade
and web application deployments now for
a decade too and so there's just a
boatload of free and commercial products
out there to do a lot of management and
analysis some of which are better than
others you know always be aware of that
security implication there's a lot of
really obvious optimizations that one
can perform on a site that will make it
completely insecure like
the direct actions thing is one to watch
out for you've got these direct actions in
place if you carry too much state in
that URL or the URLs like there was
one case where someone decided to
separate their shopping cart out from
their main application and when they put
the products in the shopping cart they
put the price in the URL and they
believed it yeah it wasn't good
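
The shopping cart story above comes down to one rule: take identifiers from the URL, never prices. A minimal Java sketch of that rule follows. Everything in it is hypothetical and for illustration only (the CartPricing class, the product ids, the prices, and the in-memory catalog that stands in for a database lookup); it is not how any real store discussed in the session is implemented.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

final class CartPricing {
    // Hypothetical server-side catalog; a real app would look this up in its database.
    private static final Map<String, BigDecimal> CATALOG = new HashMap<>();
    static {
        CATALOG.put("song-1", new BigDecimal("0.99"));
        CATALOG.put("album-1", new BigDecimal("9.99"));
    }

    // Prices one cart line from the product id and quantity only.
    // priceFromUrl is whatever the client sent; it is deliberately ignored.
    static BigDecimal linePrice(String productId, int quantity, String priceFromUrl) {
        BigDecimal unit = CATALOG.get(productId);
        if (unit == null) {
            throw new IllegalArgumentException("unknown product: " + productId);
        }
        if (quantity < 1) {
            throw new IllegalArgumentException("quantity must be at least 1");
        }
        return unit.multiply(BigDecimal.valueOf(quantity));
    }

    public static void main(String[] args) {
        // A tampered URL claims the song costs one cent; the server-side price still applies.
        System.out.println(linePrice("song-1", 2, "0.01"));
    }
}
```

Carrying only the product id in the URL keeps the direct action stateless while leaving the price under the server's control.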
having done this for so long all of
us having done this for so long there's
a tremendous number of community
resources there's Google there's the Apple
and the Omni lists there's again Google
which searches a lot of those lists and
indexes everything else there's Project
Wonder and other random community
projects that are out there including a
wealth of various random free Java
projects that you can leverage and
finally again Google if you get an error
message coming back from something
almost always you can put that error
message into quote marks in Google hit
return and find ten other people that
are experiencing the same thing one of
which might have the answer so with that
for more information that should
have just said Google there are sources of
documentation and sample code the
documentation has been updated I was
reminded of one other thing as far as
performance analysis is concerned Shark
and the CHUD tools now do Java as well
and that works with WebObjects
[Applause]