WWDC2000 Session 412
Transcript
Kind: captions
Language: en
good afternoon everyone welcome back you
have a good lunch yeah see a lot of
empty seats that usually means either
lunch was really good or really bad but
what we have here is the session that was
hinted at several times yesterday how
many of you remember the EJB session how
many of you have digested all of
it yet absolutely yeah so you might also
think about retitling the session
cashing in on synchronization because as
you learned yesterday one of the real
advantages WebObjects has is
incredibly powerful mechanisms for
caching and synchronizing data across
multiple object stores running on
multiple systems not just the sort of
heavyweight who am I
where am I kind of things you get with
EJB and to tell you all about that here
is Daniel Abrams please give a warm
welcome thanks it's really great
to see everyone I'm frankly amazed at
how many WebObjects people we're
getting into these sessions and a
little intimidated too [inaudible]
the name of the session is caching and
synchronization and at a very high level
it's a very simple concept caching in
the sense of grabbing data from an
external data store like a relational
database and displaying it to users and
the real issue is how fresh that data is
that your users are seeing and a balance
between hitting that external store and
caching that data so that you're not
hitting it too often but as a
consequence users might not be seeing
the freshest data and synchronization
which is taking changes that your users
make to objects and applying them back to
the database in an orderly organized way
so we'll jump right into it and get
started I want to divide the
presentation into essentially three
parts the first part at a high level
will use the slides to go over some of
the caching and synchronization issues
and solutions we'll jump into a little
demo app that I've prepared to
demonstrate some of these things and
then we'll get into the Q&A and hopefully
get a constructive discussion going
about how to deal with some of these
issues before I get started I did want to sort
of get a show of hands just to see what
level we should go at how many
people here have a good sense of what
an EO fetch specification is how to deal
with it and have used that before okay so
so maybe half I would say which means
that we're probably going to lose some
of you at least in the PowerPoint part
of the presentation and hopefully we'll
bring you back in with the demo but
let's get right to it
so I'm Daniel Abrams I work out in the
field as a consulting engineer and I've
been building web applications with
WebObjects for about three years now
large scale small scale all sorts of
different types so I've run into these
issues over and over again and I
think I have a pretty good and broad
perspective on how to fix these things
and I think if you've built WebObjects
applications out in the field at all
you've probably run into these issues as
well so this is what I want to cover I
want to go over the default WebObjects
EOF deployment scenario that is out
of the box when you go to deploy a
WebObjects application what does that look
like and what effect does that have on
caching and synchronization I want to go
over fetching and snapshotting
snapshotting is sort of at the heart of
these synchronization and caching issues
committing and receiving changes and
coordinating updates so this is what the
default deployment scenario looks like
you essentially have a whole bunch of
client browsers out in the real world
you could have one or more web servers
and WebObjects adapters but let's take
the simple situation where you have one
and it's very likely that if you have
any sort of volume at all coming into
your application you have multiple
application instances so if you look at
this slide we'll see we have instance
one and instance two and within each
instance we have multiple editing
contexts sitting on top of the shared
EOF stack so that's essentially what you
get out of the box with WebObjects
without making any changes to the
default deployment scenario so I wanted
to spend a little time talking about
snapshots because I think that if you
understand how snapshots work how
broadcasts occur and what's
going on behind the scenes with your
data then you can figure out any
given caching or synchronization issue
they're the master repository for
coordinating all data retrieval and
updates and by default in a single
instance you have editing contexts
sharing a set of snapshots so these are
some of the issues that when I go in and
I see clients who have built WebObjects
applications they talk about or they say
why is it that I'm not seeing the
freshest data or why am I seeing the
application hit the database but my
users aren't seeing that data show up so
let's talk a little bit about how
fetching occurs in WebObjects so one
thing before we get into this that I
wanted to go over was that when you do
have multiple application instances in a
deployment scenario by default they're
not communicating with each other at all
so this leads into the first level of
complexity of what would seem to be a
relatively simple situation we're about
to see a number of behaviors where
within a single application instance
editing contexts are communicating with
one another because they share a shared
stack and a shared set of snapshots but
when we're in multiple instances there's
no communication going on whatsoever
between those instances so if an editing
context in one instance makes an update
to the database the editing contexts
that sit in the other instance in no way
recognize that and it's essentially the
equivalent of another application going
in and making a change to the database
there's no real difference from
within the instance so let's talk about
a fetch within a single application
instance for those who don't know
or those who aren't familiar with the
concept of fetch specifications in EOF
you essentially can programmatically
construct a fetch specification and
there's a series of parameters when we
go over the demo app I'll show you how
you do that that allows you to query a
database get back a group of objects and
display them to the user if you want to
so after you've constructed this fetch
specification and triggered the fetch we
obviously query the database to get the
data we'll snapshot that object within
that shared stack so even though
a given editing context in this case
editing context 1 has triggered a
fetch it's actually snapshotted in
a shared location so editing
context 2 and editing context 3 will
eventually become aware of that object
as we'll see as they either fetch it or
manipulate it and finally we create an
object and pull it into the first
editing context so now I want to talk
about what happens when an additional
editing context
attempts to display the same object so
in this particular case we have editing
context 2 initiating a fetch on the
same object an object with global ID 1
global ID is just EOF's way of
displaying and packaging up a primary key
so it's just a way of uniquely
identifying an object so in this case we
do the exact same thing editing
context 2 triggers a fetch on object
with global ID 1 we query the database
and what's significant here is that when
that data comes back we actually ignore
any updates to the data so if an
external instance or even some other
external application has changed that
data and we simply do a fetch the
out-of-the-box behavior is that you're
not going to see it so this is the first
thing that actually throws people off
they're not aware of this they construct
a fetch spec they fetch their data and
they don't see updates but they do see
the application hitting the database and
I'll show you that in the demo but it's
important to be aware that that's the
default behavior so we've ignored
updates we create an object in editing
context 2 but that object is actually
created off the snapshot because we
haven't specified otherwise so I want to
talk about what you can do to change
that situation if you want to let's say
that you always want users to get fresh
data or you want users to get fresh data
in this particular case what are some of
the things that you can do to make that
happen well one of the things you can do
is on your fetch spec you can use a
method called setRefreshesRefetchedObjects
and if you do that when you
query the database and the data comes
back it updates the snapshot so in this
case editing context 3 has triggered a
fetch and they've set that flag for
refreshes refetched
objects to on so we query
the database and we can see that the
snapshot gets updated in that shared
stack now there's an additional wrinkle
when this happens when we have an update
to a snapshot all of the editing
contexts that share that stack they're
referred to as peer editing contexts
receive that change receive a broadcast
of that change
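The fetch behavior described so far can be sketched as a toy model in Python. This is not the EOF API: `SharedStack`, `EditingContext`, and the dictionary standing in for the database are hypothetical names chosen for illustration, and the merge of uncommitted peer changes is left out. The sketch only shows that a plain fetch hits the database but builds the object from the existing snapshot, while a refreshing fetch replaces the snapshot and broadcasts to peers.

```python
class SharedStack:
    """Hypothetical stand-in for the snapshot store shared inside one instance."""

    def __init__(self, database):
        self.database = database   # dict: global ID -> row (the external store)
        self.snapshots = {}        # dict: global ID -> last recorded row
        self.contexts = []         # peer editing contexts sharing this stack
        self.queries = 0           # how many times the database was hit

    def fetch(self, context, gid, refresh=False):
        self.queries += 1                      # every fetch queries the database
        row = dict(self.database[gid])
        if gid not in self.snapshots:
            self.snapshots[gid] = row          # first fetch records the snapshot
        elif refresh and row != self.snapshots[gid]:
            self.snapshots[gid] = row          # refresh replaces the snapshot
            for peer in self.contexts:         # and broadcasts to peers holding it
                if peer is not context and gid in peer.objects:
                    peer.objects[gid] = dict(row)
        # Without refresh, newer database values are ignored at this point.
        context.objects[gid] = dict(self.snapshots[gid])
        return context.objects[gid]


class EditingContext:
    """Hypothetical stand-in for one user's editing context."""

    def __init__(self, stack):
        self.stack = stack
        self.objects = {}
        stack.contexts.append(self)

    def fetch(self, gid, refresh=False):
        return self.stack.fetch(self, gid, refresh)


database = {1: {"title": "labor union history"}}
stack = SharedStack(database)
ec1, ec2, ec3 = EditingContext(stack), EditingContext(stack), EditingContext(stack)

ec1.fetch(1)                                      # snapshot recorded
database[1]["title"] = "labor union movie"        # change made outside the instance
plain = ec2.fetch(1)["title"]                     # hits the database, sees old value
refreshed = ec3.fetch(1, refresh=True)["title"]   # updates snapshot and peers
```

In this model `plain` still holds the old title even though the query counter went up, which is the surprise the talk describes.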
so we can see that when this snapshot
gets updated
we broadcast out to those editing
contexts that are peers that already have
an instance of that object and then we
pull that object into editing context 3
now we'll get into more detail later on
what happens if editing context 1 or
editing context 2 have made changes to
that object before that broadcast has
occurred and what actually happens is
they get merged and we'll talk
more about that when we get more into
synchronization there's an additional
wrinkle I wanted to talk about which is
specific to 4.5 so 4.5
has added a number of different
ways that you can manipulate the
way objects are refreshed or updated and
they sort of add an additional level of
flexibility to what you can do but also
an additional layer of complexity that
you have to be aware of so in 4.5
when editing context 3 triggers a
fetch the query database and
check timestamp steps are actually inverted so
what will happen is editing context 3
triggers a fetch we'll check the
timestamp if that timestamp has expired
we'll actually go back to the database
we'll do a query we'll update the snapshot
and again because the snapshot has
been updated we'll broadcast those changes
out so I wanted to talk about one of the
alternative methods that you can use to
to update snapshots and that's
invalidating objects so I've added a
little bit of complexity here to the
diagram what you see in each of the
editing context is the object that we
were talking about before and it now has a
to-many relationship
because relationships are actually
treated a little bit differently and
again we're sort of layering
on top of what we were doing before so
one thing that happens when you do a
fetch with refresh turned on
is you actually don't update that
object's to-many relationships so you
might see changes to that object but
you're not going to see changes to the
to-many relationship or the other
relationships that it has even to-one
relationships so I want to spend a
little time dealing with both
alternative ways of invalidating as well
as methods for dealing with updating
relationships so in this particular case
we have editing context 1
invalidating the given object that we were
talking about and there's two ways to do
invalidation one is on an individual
object and one is to invalidate every
object in either the editing context or
within the stack we'll do it
individually first and then we'll move
on to invalidating globally so when
editing context 1 triggers this
invalidation we refault the object we
refault its to-many relationships but we
preserve its to-one relationship when
editing context 1 trips that fault
we see a series of actions occur as a
result we first query the database
against that object so we now have a
fault for that object and it
specifically queries for that particular
object we update the snapshot and update
that object and then we broadcast those
changes out because the snapshot has
been updated now there's one additional
wrinkle here which is that the broadcast
actually occurs a little differently
than it does when you have refresh
turned on in the case of refresh when
you broadcast out in editing context 2
or editing context 3 if you had a dirty
object that is an object that someone
has modified that broadcast would merge
in those changes when you invalidate an
object by default if you don't do
anything else it will overwrite those
changes and the users in the other
editing context will lose their changes
so in some ways invalidating is a more
powerful tool but in some ways you have
to be careful because you could
potentially overwrite other users'
changes so right now we have this object
with global ID 1 pulled into each of the
editing contexts
but we still have the to-many
relationship faulted and I want to go
over what happens when that fault is
tripped as well so when the
to-many relationship fault is tripped
we will query the database for that
to-many relationship just like it did the
first time but it will actually discard
the changes so you will not see any
updates to the to-many relationship when
you invalidate an object like that
and then it will pull that relationship
in from the existing snapshot so in
effect invalidating objects
individually is an effective way to
update the given object but it will not
work for updating changes to a
to-many relationship and actually as an
additional wrinkle it will query the
database against that relationship so
I'll show you the demo app and users are
sometimes confused because they see that
query occur but you don't see changes so
finally the most drastic thing you can
do is invalidate all the objects so when
you invalidate all the objects
essentially everything either in the
editing context or the shared stack is
refaulted so every object is refaulted
every relationship is refaulted
every time you trip one of those
faults you're going to have a new query
every snapshot is updated all those
changes are broadcast including to-manys
so when you invalidate all the objects
you will update the to-many that's an
effective way of pulling in new data for
your to-manys but there's a lot of really
significant issues associated with
invalidating all objects one is that
it's very expensive period to try and
pull every single object back from
the database as you trip them and two is
it can actually be more expensive than
your original queries so if you pull
a bunch of objects into an editing context
through a series of queries you pull
those objects in in sets at a time right
so you might pull in objects five or ten
at a time unless you recreate every
single one of those original queries
when you do an invalidate all
it's going to trigger a fetch
individually on each of those objects as
you trip the faults
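The cost argument can be made concrete with a small query counter. This is a minimal sketch in Python, not EOF code: `CountingDatabase`, `batch_fetch`, and `refault_each` are hypothetical helpers, and the batch size of five is just the figure mentioned in the talk.

```python
class CountingDatabase:
    """Hypothetical database stand-in that counts round trips."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = 0

    def select(self, ids):
        self.queries += 1                       # one round trip per select
        return {i: dict(self.rows[i]) for i in ids}


def batch_fetch(db, ids, batch_size):
    """Original fetches: objects arrive several at a time."""
    objects = {}
    for start in range(0, len(ids), batch_size):
        objects.update(db.select(ids[start:start + batch_size]))
    return objects


def refault_each(db, ids):
    """After invalidating everything, each tripped fault fetches one object."""
    objects = {}
    for gid in ids:
        objects.update(db.select([gid]))
    return objects


database = CountingDatabase({i: {"id": i} for i in range(10)})
ids = list(range(10))

batch_fetch(database, ids, batch_size=5)
queries_for_original_fetch = database.queries

refault_each(database, ids)
queries_after_invalidate_all = database.queries - queries_for_original_fetch
```

With ten objects, the original batched fetch costs two round trips in this model, while touching every refaulted object afterwards costs ten.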
the other issue is that it will wipe out
any changes to any editing context that
share those objects so if editing
context one invalidates all objects and
editing context 2 or editing context 3
happen to be making changes to those
objects or deleted those objects but
haven't committed those changes those
changes will be lost
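The difference between the two kinds of broadcast can be sketched as two small functions. This is a simplified Python model under stated assumptions: objects are plain dictionaries, `receive_refresh` and `receive_invalidation` are hypothetical names, and the merge rule is the one described in the talk, where the receiving editing context reapplies whatever it had changed locally.

```python
def receive_refresh(old_snapshot, new_snapshot, local_object):
    """Refresh broadcast: take new values, then reapply local uncommitted edits."""
    merged = dict(new_snapshot)
    for key, value in local_object.items():
        if value != old_snapshot.get(key):   # the user changed this attribute
            merged[key] = value              # so the local change is kept
    return merged


def receive_invalidation(new_snapshot, local_object):
    """Invalidation: the local object is rebuilt from the new snapshot."""
    return dict(new_snapshot)                # local uncommitted edits are dropped


old_snapshot = {"title": "A", "overview": "original"}
new_snapshot = {"title": "B", "overview": "original"}
dirty_object = {"title": "A", "overview": "my unsaved edit"}

after_refresh = receive_refresh(old_snapshot, new_snapshot, dirty_object)
after_invalidation = receive_invalidation(new_snapshot, dirty_object)
```

After the refresh the user keeps the unsaved overview and also sees the new title; after the invalidation the unsaved overview is gone.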
so one user will have the freshest data
but another user may simply lose
data behind the scenes without really
realizing what's going on so you have to
be very careful when you do that to
ensure that your users don't lose data
and then the other thing to be aware of
is that with every single one of these
mechanisms when you actually go to
deploy an application a user may end up
on one instance or multiple instances
actually a better way to say
that is users may end up on a
shared instance or they may end up across
application instances so if they end up
on a shared application instance and
you're doing things like updating
fetches with refresh or invalidating
objects then those two users who are
editing the same object are going to see
changes as a result of this broadcast if
by random chance they happen to end up
on two different application instances
even if your code is exactly the same
they're going to see a different set of
behaviors so potentially users are going
to be very confused unless you're very
careful about the way you're doing this
because from their perspective they're
doing the exact same thing but from the
perspective of the WebObjects
application instances that are running
they're either not communicating with
each other or they are just depending on
where those users ended up and you
particularly start to get into these
issues when you talk about coordinating
changes and different users having the
ability to edit the same object at the
same time so I want to start going over
some of those things talk a bit about
the locking behavior in EOF and how it
works
explain why sometimes users see that
locking and sometimes they don't and
explain why relationships can change so
let's talk about committing changes
within a single application instance so
what we're looking at here
is editing context 1 modifying and
committing changes to an
object it has a to-many relationship we
won't worry about that right now
and as you can see all of the other
objects are in line with what's in the
snapshot so what I mean is in editing
context 2 you can see that there
haven't been any changes to the object
the data in editing context 2 is the
same data that's in the snapshot and the
same with editing context 3 so right
now the only user who's committed
changes to an object is in editing
context one he modifies the object and
commits he goes to save to the database
so we see an update to the database the
snapshot is updated and again every time
the snapshot is updated we're going to
broadcast out those changes to other
editing contexts that is other users who
are sharing a stack so okay good before
it was getting cut off but I
think it's okay so in this particular
case I want to talk about what happens
when two users within the same
application instance modify and attempt
to commit changes to an object at
essentially the same time so editing
context 1 and editing context 2 modify
an object so editing context 1 and
editing context 2 are now out of sync
with what's in the snapshot they haven't
committed their changes they're
carrying around locally dirty versions
of this object with global ID 1 editing
context 1 goes to commit its change we
update the database checking to make
sure that we don't have a locking
failure which in this case we don't
the snapshot was in sync with what
was in the database the snapshot is then
updated and the changes are broadcast
out now you notice editing context 3 has
received that broadcast while editing
context 2 receives that broadcast but
essentially maintains its own changes so
we'll attempt to merge in those changes
and where there's discrepancies editing
context 2 will reapply the changes that
it's already made and and maintain those
changes and the other thing to note is
that right now the snapshot is in sync
with what's in the database so editing
context 1 has committed its
change we updated the database we
updated the snapshot so those two are in
sync and that's important because this
is essentially EOF's mechanism for doing
optimistic locking right so what happens
when you go to save a change is we
compare what's in the snapshot with
what's in the database and if they're out
of sync we have an optimistic locking
failure and as long as those two are in
sync we're not going to get an optimistic
locking failure and we're going to be
allowed to update those changes so let's
look at what happens when editing
context 2 goes to commit its changes the
database is updated because like I
said the snapshot was in sync with
what's in the database and we broadcast
those changes out to the other objects
so the important thing to note here is
that the out-of-the-box behavior is even
when you have an attribute on a given
object marked for locking within the same
application instance editing contexts share the
same stack and by default they share the
same set of snapshots so you're not
going to see one editing context that's
a peer of another attempt to lock
against each other now I want to talk
about the exact same behavior across
multiple application instances so in this case
we have editing context 1 and editing
context 3 modifying an object you can
see that they're in different
application instances and attempting to
commit those changes so editing
context 1 and 3 modify the object you
can see that editing context one updates
the database and we lock against the
snapshot so in this particular case the
database is in sync with the snapshot we
don't have any problems with locking the
snapshot is updated and we broadcast out
to the other editing contexts in the shared instance so editing
context 2 is now aware of the fact that
we've committed this update to the
database whereas editing context 3 and
editing context 4 are not and just to be
perfectly clear from a user's
perspective there's really no difference
right they could have ended up in the
first application instance and they
could have ended up in the second they
could be editing context 2 or 3 or 4
they really don't know what editing
context they're going to be in so let's see
what happens when editing context 3
commits changes to the database we go to
lock against the snapshot and in this
case we fail right because we have a
snapshot that has been updated or rather
a database that has been updated since
the last time we've updated the snapshot
so editing context 1 or rather
application instance 1 updated the
database and when instance 2 goes to
update that database it's going to fail
with an optimistic locking failure so I
wanted to go right to the demo and show
you some of these behaviors and there's
actually some additional angles that
come into play when you're
doing this in a real-world situation so
from a high level or from an application
instance level this is exactly what
happens but because the web is a
stateless medium there's some additional
wrinkles that are introduced when you
have a web browser that has a chance to
get out of sync with what's actually in
your application so if we could cut over
to the demo machine that'd be good
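The locking behavior walked through on the slides can be sketched as a toy model. This Python sketch is an illustration under stated assumptions, not EOF's implementation: `Instance` and `OptimisticLockingFailure` are hypothetical names, each instance keeps one snapshot table shared by all of its editing contexts, and a save compares the locking attributes in the snapshot with the current database row, as described in the talk.

```python
class OptimisticLockingFailure(Exception):
    """Raised when the snapshot no longer matches the database row."""


class Instance:
    """Hypothetical application instance with one shared snapshot table."""

    def __init__(self, database):
        self.database = database      # dict shared by every instance
        self.snapshots = {}           # private to this instance

    def fetch(self, gid):
        self.snapshots.setdefault(gid, dict(self.database[gid]))
        return dict(self.snapshots[gid])

    def save(self, gid, changes, lock_keys):
        snapshot, row = self.snapshots[gid], self.database[gid]
        for key in lock_keys:
            if row[key] != snapshot[key]:      # someone else updated the row
                raise OptimisticLockingFailure(key)
        row.update(changes)                    # commit to the database
        snapshot.update(changes)               # keep the snapshot in sync


database = {1: {"overview": "original"}}
instance1, instance2 = Instance(database), Instance(database)
instance1.fetch(1)
instance2.fetch(1)

# Two editing contexts in the same instance: the shared snapshot stays in
# sync with the database, so the second save silently overwrites the first.
instance1.save(1, {"overview": "edit from context 1"}, ["overview"])
instance1.save(1, {"overview": "edit from context 2"}, ["overview"])

# An editing context in the other instance still holds the old snapshot.
try:
    instance2.save(1, {"overview": "edit from context 3"}, ["overview"])
    second_instance_failed = False
except OptimisticLockingFailure:
    second_instance_failed = True
```

The same pair of edits therefore ends in an overwrite inside one instance and in a locking failure across two instances, depending only on where the users landed.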
so the demo is about as simple as you
can get this is the EO model for the
demo you can see we have an object
called movie here with a to-many
relationship to role and a to-one
relationship to studio and essentially
I've constructed a demo that has
one component that allows you to edit
any of these EOs or their relationships
and it actually oversimplifies the case
somewhat because in the real world you
have cases where you're going through a
workflow and on any given page you'll see
some objects and you won't see other
objects but in this particular case you
see everything on one screen it makes it
a little simpler but as you'll see in a
second it's still very complicated so I
have two different browsers here IE and
Netscape and we all know how well they
like to communicate with each other so
these are two different sessions and I'm
going to start to make some changes to
the objects so I've done a fetch
I pulled all the objects into each of
the associated editing contexts right
now they're on the same application
instance and I'm going to make a change
so I'm going to go behind the scenes and
I'm going to edit one of the objects
directly so I'm going to edit the movie's
description and change it from labor union
history let's say to labor union movie
so right now you would expect that if I
did a fetch I probably wouldn't see that
change according to what I told you so
let's go ahead and do that and before we
do that I just want to pull up what's
going on so you can see so we do a fetch
against those movies you can see that
we've hit the database we've actually
pulled back all three movies but we
don't see that change right so I want to
do the same thing but this time I'll do
a fetch and refresh the snapshot so
according to what I've told you
if you fetch and refresh the snapshot
you should pull back from the database
update that snapshot broadcast out to
the other instances and you should see
the change so let's do that
sure enough we see the change but I want
to introduce an additional wrinkle so
let's actually go to the other
session and let's do
something similar let's
do fetch and refresh and you would
expect that you might see the change
right but you're not so in this
particular case we're still seeing labor
union movie whereas in this particular
case we see the old value labor union
history and if you go into the console
you can see that sure enough I'll
demonstrate it just to be absolutely
sure that when we go in and fetch sure
enough we're hitting the database but
we're not seeing any updates so
what's going on here well what's going
on is that when this first session went
and refreshed the snapshot it broadcast
out those changes to all the other
editing contexts so it pulled that data into
its own editing context so we saw the
change and it broadcast that change
out to the editing context that's
sitting on the server that is represented
by this particular session but when
this session went back to do a fetch and
refresh that synchronized the bindings so
it took the value
labor union history that was
saved within this overview field compared it
to the value which had been
broadcast out from the other editing context and
noticed they were different and assumed
that this particular user was actually
making changes
so from this user's perspective he
hasn't made any changes at all and not
only has he not made any changes at all
but he's committed an action
that you would think and that he would
think would send him to the latest
data but in fact it hasn't done that at
all and in fact committing this action
has cemented this older version of
the object right back into the editing
context so if we were to hit Save
Changes which does nothing more than
call saveChanges on the editing context so it
merely saves the uncommitted
changed objects within his
editing context it will actually update
the database at this point so let's go ahead
and do that
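The stale form problem shown in the demo can be sketched as follows. This is a toy Python model, not WebObjects code: `Session`, `submit_form`, and `follow_hyperlink` are hypothetical names, and the only assumption carried over from the talk is that submitting a form pushes the browser's values through the bindings before the action runs, while following a hyperlink does not.

```python
class Session:
    """Hypothetical session: one editing-context object plus the rendered form."""

    def __init__(self, shared_snapshot):
        self.shared = shared_snapshot            # snapshot shared with peers
        self.object = dict(shared_snapshot)      # object in this editing context
        self.form_values = dict(self.object)     # values sitting in the browser

    def receive_broadcast(self):
        self.object = dict(self.shared)          # peer refreshed the snapshot

    def render(self):
        self.form_values = dict(self.object)     # page is regenerated
        return dict(self.object)

    def submit_form(self):
        self.object.update(self.form_values)     # old browser values pushed back
        return self.render()

    def follow_hyperlink(self):
        return self.render()                     # no form values are submitted

    def has_changes(self):
        return self.object != self.shared        # would saveChanges write anything


def demo(action):
    shared = {"overview": "labor union history"}
    second = Session(shared)                     # second user's page is rendered
    shared["overview"] = "labor union movie"     # first user's refresh lands
    second.receive_broadcast()
    page = getattr(second, action)()
    return page["overview"], second.has_changes()


after_submit = demo("submit_form")
after_link = demo("follow_hyperlink")
```

In this model the form submit leaves the second user holding the old value as an apparent edit, which a later save would write back, while the hyperlink leaves the object clean and up to date.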
so we can see that it's updated the
database so without either user being
aware of it we've actually managed to
overwrite the commit that the first
user's done and replaced it with an
older value so this isn't even a case
where two users are attempting to edit
the same data but there's still one user
fighting against another and overwriting
the data it's actually sort of
interesting I want to do the same thing
but rather than fetch and refresh
the snapshot which in this case is a
button which submits the form I want to
follow this hyperlink and it says do
nothing which is essentially a no op
action it simply returns the same page
but before I do that I want to
clear out all the changes so in both
cases I'll invalidate all the objects so
we're completely up to date right now I
will commit a change to the database from
the back end so we'll change this back
to labor union or just get rid of the
word altogether so we're up to date
and in this case we'll fetch and
refresh snapshots exactly like
we did before and we can see that it's
gone but in this case we're going to
follow the do-nothing hyperlink and you
can see we're up to date so what's the
difference here the difference is that
when we follow a hyperlink we don't
actually submit any of the data that's
in the form we don't update those
bindings and so we see the update
that's broadcast to us in the editing
context so the other thing I want to
demonstrate is a very similar behavior
but with regards to to-many
relationships so let's clear everything
out again and this time I'm going to
make a change to an EO but I'm going to
make the change directly to one of the
relationships
so I've now made a change to this role
right here and in this particular case
why don't we start by doing a fetch so
I'll do a fetch and you can see that we
don't see the change in the relationship
which is probably what you'd expect and
if we go to the console you can see that
even though we don't see that change
we've actually gone out and hit the
database again so let's do the same
thing but this time let's do a
fetch on movies and refresh the
snapshot well what do we have in this
case we again go out and hit the
database we again pull back those three
rows but we again haven't seen that
update okay why don't we try
invalidating that movie so we invalidate
that movie and again we don't see that
change let's look at what's going on
behind the scenes you can see that we do
a fetch against the movie so we pull
back that particular movie the one that
we've invalidated
we've also refaulted its to-many
relationship so we actually pull
those new roles right here but you're
still not seeing it so the point is that
even when you invalidate you're not
necessarily getting updates to to-many
relationships so this time let's
invalidate all so we do an invalidate
all you can see that we've actually
gotten that update to that object but I
mean we really have a flurry of database
activity right we've hit the
database once for every single movie
whereas before we were pulling back
three movies at a time and that's
because we're essentially iterating over
an array with these movies in them and
rather than fetching we're simply
pulling all those movies back and then
we're also fetching back each of the
relationships so the to-manys and the
to-ones but in this particular case
when we pull back the to-many
we can see that it's updated
so the other thing I want to show you
was the difference in behavior between
when you update within a single instance
versus when you update across separate
instances so let's make a change
actually I think we're invalidated but
just to be sure let's clear out the
entire cache let's make a change so I
have overview I don't know if you
noticed in the model but I have overview
designated as a locking attribute so you
would expect that if two different
users make changes to an overview behind
the scenes you should lock on that
attribute but when they're
within the same editing context or sorry
when they have the same shared stack
so you have editing contexts that
share a stack and we commit saves we're
just going to see that they can
overwrite each other so let's take a
look at that in the console so we have
two sets of updates right there and we
haven't detected any conflicts because
the snapshots are always in sync
with the database so even though that
particular attribute is marked for locking
we're able to update but let's simulate
the exact same behavior within separate
editing contexts so we'll start over
pull the plug
make a change behind the scenes
and now attempt to make a change so this
is like I said analogous to a user in
another application instance committing
a change to the database and we'll also
commit a change in this editing context
and attempt to save it and what you can
see is that we've gotten an optimistic
locking failure so the exact
same behavior could result in the
user seeing either an update to the data
or a locking failure depending on which
application instance they end up in so
there's actually a couple other
things we could demo but I think I'd
rather just jump right to questions and
then as people have questions maybe
we'll demo that behavior in there so I'd
like to bring Steve Miner and [name unclear]
out to the stage they're both part of
the WebObjects engineering team and
open it up for questions
[Applause]