WWDC2000 Session 412

Transcript

Kind: captions Language: en

Good afternoon, everyone. Welcome back. Did you have a good lunch? Yeah? I see a lot of empty seats; that usually means either lunch was really good or really bad. But what we have here is the session that was hinted at several times yesterday. How many of you remember the EJB session? How many of you have digested all of it yet? Absolutely, yeah. So you might also think about retitling this session "Cashing In on Synchronization", because as you learned yesterday, one of the real advantages WebObjects has is incredibly powerful mechanisms for caching and synchronizing data across multiple object stores running on multiple systems, not just the sort of heavyweight "who am I, here are my" kind of things you get with EJB. And to tell you all about that, here is Daniel Abrams. Please give him a warm welcome.

Thanks. It's really great to see everyone. I'm frankly amazed at how many WebObjects people were getting into these sessions, and a little intimidated too. [inaudible]

The name of the session is caching and synchronization, and at a very high level it's a very simple concept: caching, in the sense of grabbing data from an external data store like a relational database and displaying it to users, where the real issue is how fresh that data is that your users are seeing, and the balance between hitting that external store and caching that data so that you're not hitting it too often, but as a consequence users might not be seeing the freshest data; and synchronization, which is taking changes that your users make to objects and applying them back to the database in an orderly, organized way. So we'll jump right into it and get started.

I want to divide the presentation into essentially three parts. In the first part, at a high level, we'll use the slides to go over some of the caching and synchronization issues and solutions. We'll jump into a little demo app that I've prepared to demonstrate some of these things. And then we'll get into the
Q&A and hopefully get a constructive discussion going about how to deal with some of these. Before I get started, I did want to get a show of hands, just to see what level we should go at. How many people here have a good sense of what an EO fetch specification is, how to deal with it, and have used that before? OK, so maybe half, I would say, which means that we're probably going to lose some of you, at least in the PowerPoint part of the presentation, and hopefully we'll bring you back in with the demo. But let's get right to it.

So I'm Daniel Abrams. I work out in the field as a consulting engineer, and I've been building web applications with WebObjects for about three years now: large scale, small scale, all sorts of different types. So I've run into these issues over and over again, and I think I have a pretty good, broad perspective on how to fix these things. And I think if you've built WebObjects applications out in the field at all, you've probably run into these issues as well.

So this is what I want to cover. I want to go over the default WebObjects and EOF deployment scenario: out of the box, when you go to deploy a WebObjects application, what does that look like, and what effect does that have on caching and synchronization? I want to go over fetching and snapshotting (snapshotting is sort of at the heart of these synchronization and caching issues), committing and receiving changes, and coordinating updates.

So this is what the default deployment scenario looks like. You essentially have a whole bunch of client browsers out in the real world. You could have one or more web servers and WebObjects adaptors, but let's take the simple situation where you have one. And it's very likely that if you have any sort of volume at all coming into your application, you have multiple application instances. So if you look at this slide, we'll see we have instance 1 and instance 2, and within each instance we have multiple editing contexts sitting on top of the shared EOF stack. So
that's essentially what you get out of the box with WebObjects, without making any changes to the default deployment scenario.

So I wanted to spend a little time talking about snapshots, because I think that if you understand how snapshots work, how broadcasts occur, and what's going on behind the scenes with your data, then you can figure out any given caching or synchronization issue. They're the master repository for coordinating all data retrieval and updates, and by default, in a single instance, you have editing contexts sharing a set of snapshots. So these are some of the issues that clients who have built WebObjects applications talk about when I go in and see them. They say, "Why is it that I'm not seeing the freshest data?" or "Why am I seeing the application hit the database, but my users aren't seeing that data show up?" So let's talk a little bit about how fetching occurs in WebObjects.

One thing I wanted to go over before we get into this is that when you do have multiple application instances in a deployment scenario, by default they're not communicating with each other at all. This leads into the first level of complexity of what would seem to be a relatively simple situation. We're about to see a number of behaviors where, within a single application instance, editing contexts are communicating with one another because they share a stack and a shared set of snapshots. But when we're in multiple instances, there's no communication going on whatsoever between those instances. So if an editing context in one instance makes an update to the database, the editing contexts that sit in the other instance in no way recognize that, and it's essentially the equivalent of another application going in and making a change to the database; there's no real difference from within the instance.

So let's talk about a fetch within a single application instance. For those who aren't familiar with the concept of fetch specifications in EOF, you essentially can
programmatically construct a fetch specification, and there's a series of parameters (when we go over the demo app I'll show you how you do that) that allows you to query a database, get back a group of objects, and display them to the user if you want to. So after you've constructed this fetch specification and triggered a fetch, we obviously query the database to get the data. We'll snapshot that object within that shared stack. So even though a given editing context, in this case editing context 1, has triggered a fetch, it's actually snapshotted in a shared location, so editing context 2 and editing context 3 will eventually become aware of that object, as we'll see, as they either fetch it or manipulate it. And finally we create an object and pull it into the first editing context.

So now I want to talk about what happens when an additional editing context attempts to display the same object. In this particular case we have editing context 2 initiating a fetch on the same object, an object with global ID 1. A global ID is just EOF's way of packaging up a primary key, so it's just a way of uniquely identifying an object. So in this case we do the exact same thing: editing context 2 triggers a fetch on the object with global ID 1, we query the database, and what's significant here is that when that data comes back we actually ignore any updates to the data. So if an external instance, or even some other external application, has changed that data and we simply do a fetch, the out-of-the-box behavior is that you're not going to see it. This is the first thing that actually throws people off. They're not aware of this: they construct a fetch spec, they fetch their data, and they don't see updates, but they do see the application hitting the database. I'll show you that in the demo, but it's important to be aware that that's the default behavior. So we've ignored updates; we create an object in editing context 2, but that object is actually created off the
snapshot, because we haven't specified otherwise.

So I want to talk about what you can do to change that situation if you want to. Let's say that you always want users to get fresh data, or you want users to get fresh data in this particular case. What are some of the things that you can do to make that happen? Well, one of the things you can do is on your fetch spec: you can use a method called refreshes refetched objects, and if you do that, when you query the database and the data comes back, it updates the snapshot. So in this case editing context 3 has triggered a fetch and has set that refreshes-refetched-objects flag to on, so we query the database and we can see that the snapshot gets updated in that shared stack.

Now there's an additional wrinkle when this happens. When we have an update to a snapshot, all of the editing contexts that share that stack, which are referred to as peer editing contexts, receive a broadcast of that change. So we can see that when this snapshot gets updated, we broadcast out to those editing contexts that are peers and already have an instance of that object, and then we pull that object into editing context 3. We'll get into more detail later on what happens if editing context 1 or editing context 2 has made changes to that object before that broadcast has occurred; what actually happens is you get a merge, and we'll talk more about that when we get more into synchronization.

There's an additional wrinkle I wanted to talk about which is specific to 4.5. 4.5 has added a number of different ways that you can manipulate the way objects are refreshed or updated, and they add an additional level of flexibility to what you can do, but also an additional layer of complexity that you have to be aware of. So in 4.5, when editing context 3 triggers a fetch, the steps to query the database and check the timestamp are actually inverted. What will happen is editing context 3 triggers
a fetch, we'll check the timestamp, and if that timestamp has expired we'll actually go back to the database, do a query, and update the snapshot, and again, because the snapshot has been updated, we broadcast those changes out.

So I wanted to talk about one of the alternative methods that you can use to update snapshots, and that's invalidating objects. I've added a little bit of complexity here to the diagram. What you see in each of the editing contexts is the object that we were talking about before, and it now has a to-many, because relationships are actually treated a little bit differently, and again we're layering on top of what we were doing before. One thing that happens when you do a fetch with refresh turned on is that you actually don't update that object's to-many relationships. So you might see changes to that object, but you're not going to see changes to the to-many relationship or the other relationships that it has, even to-one relationships. So I want to spend a little time dealing with both alternative ways of invalidating as well as methods for dealing with updating relationships.

In this particular case we have editing context 1 invalidating the given object that we were talking about. There are two ways to do invalidation: one is on an individual object, and one is to invalidate every object in either the editing context or the stack. We'll do it individually first, and then we'll move on to invalidating globally. When editing context 1 triggers this invalidation, we refault the object and we refault its to-many relationships, but we preserve its to-one relationships. When editing context 1 trips that fault, we see a series of actions occur as a result. We first query the database for that object (we now have a fault for that object, and it specifically queries for that particular object), we update the snapshot and update that object, and then we broadcast those changes out, because the snapshot has been updated.
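The fetch behavior described up to this point (a plain fetch queries the database but keeps the existing snapshot, while a refreshing fetch replaces the snapshot and broadcasts to peer editing contexts) can be sketched with a small toy model. This is only an illustration under stated assumptions: `SnapshotModel`, `Database`, `SharedStack`, `EditingContext`, and every method on them are hypothetical names invented for this sketch and are not EOF classes or calls. Timestamps, merging of dirty objects, and relationships are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Toy model only. None of these classes or methods are EOF API. */
public class SnapshotModel {

    /** Stands in for the relational database: global ID -> row value. */
    static class Database {
        final Map<Integer, String> rows = new HashMap<>();
        int queryCount = 0;

        String select(int gid) {
            queryCount++;
            return rows.get(gid);
        }
    }

    /** One per application instance: snapshots shared by peer editing contexts. */
    static class SharedStack {
        final Database db;
        final Map<Integer, String> snapshots = new HashMap<>();
        final List<EditingContext> peers = new ArrayList<>();

        SharedStack(Database db) { this.db = db; }

        String fetch(int gid, boolean refresh) {
            String row = db.select(gid);               // the database is always queried
            boolean known = snapshots.containsKey(gid);
            if (!known) {
                snapshots.put(gid, row);               // first fetch records a snapshot
            } else if (refresh && !Objects.equals(row, snapshots.get(gid))) {
                snapshots.put(gid, row);               // refreshing fetch replaces it
                for (EditingContext ec : peers) {
                    ec.snapshotChanged(gid, row);      // broadcast to peer contexts
                }
            }
            return snapshots.get(gid);                 // objects are built from the snapshot
        }
    }

    static class EditingContext {
        final SharedStack stack;
        final Map<Integer, String> objects = new HashMap<>();

        EditingContext(SharedStack stack) {
            this.stack = stack;
            stack.peers.add(this);
        }

        String fetch(int gid, boolean refresh) {
            String value = stack.fetch(gid, refresh);
            objects.put(gid, value);
            return value;
        }

        /** Only contexts that already hold the object take the new value. */
        void snapshotChanged(int gid, String value) {
            if (objects.containsKey(gid)) {
                objects.put(gid, value);
            }
        }
    }

    public static void main(String[] args) {
        Database db = new Database();
        db.rows.put(1, "labor union history");
        SharedStack stack = new SharedStack(db);
        EditingContext ec1 = new EditingContext(stack);
        EditingContext ec2 = new EditingContext(stack);
        EditingContext ec3 = new EditingContext(stack);

        ec1.fetch(1, false);
        db.rows.put(1, "labor union movie");           // change made behind the scenes
        System.out.println(ec2.fetch(1, false));       // stale value from the snapshot
        System.out.println(ec3.fetch(1, true));        // fresh value, snapshot replaced
        System.out.println(ec1.objects.get(1));        // ec1 received the broadcast
        System.out.println(db.queryCount);             // every fetch hit the database
    }
}
```

In this sketch the query counter goes up on every fetch, including the one that returns stale data, which is the combination the talk says confuses people: the database is hit, but the update is not shown.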
Now there's one additional wrinkle here, which is that the broadcast actually occurs a little differently than it does when you have refresh turned on. In the case of refresh, when you broadcast out, if editing context 2 or editing context 3 had a dirty object, that is, an object that someone has modified, that broadcast would merge in those changes. When you invalidate an object, by default, if you don't do anything else, it will overwrite those changes, and the users in the other editing contexts will lose their changes. So in some ways invalidating is a more powerful tool, but you have to be careful, because you could potentially overwrite other users' changes.

So right now we have this object with global ID 1 pulled into each of the editing contexts, but we still have the to-many relationship faulted, and I want to go over what happens when that fault is tripped as well. When the to-many relationship fault is tripped, it will query the database for that to-many relationship, just like it did the first time, but it will actually discard the changes. So you will not see any updates to the to-many relationship when you invalidate an object like that, and then it will pull that relationship in from the existing snapshot. So invalidating objects individually is an effective way to update the given object, but it will not work for picking up changes to a to-many relationship, and as an additional wrinkle, it will still query the database for that relationship. I'll show you in the demo app; users are sometimes confused because they see that query occur but they don't see changes.

So finally, the most drastic thing you can do is invalidate all the objects. When you invalidate all the objects, essentially everything in either the editing context or the shared stack is refaulted. So every object is refaulted, every relationship is refaulted, every time you trip one of those faults you're going to have a new query,
every snapshot is updated, and all those changes are broadcast, including to-manys. So when you invalidate all the objects you will update the to-manys; that's an effective way of pulling in new data for your to-manys. But there are a lot of really significant issues associated with invalidating all objects. One is that it's very expensive, period, to pull every single object back in from the database as you trip the faults. And two is that it can actually be more expensive than your original queries. If you pull a bunch of objects into an editing context through a series of queries, you pull those objects in in sets, right? You might pull in objects five or ten at a time. Unless you recreate every single one of those original queries when you invalidate all, it's going to trigger a fetch individually on each of those objects as you trip the faults.

The other issue is that it will wipe out any changes in any editing contexts that share those objects. So if editing context 1 invalidates all objects, and editing context 2 and editing context 3 happen to be making changes to those objects, or have deleted those objects, but haven't committed those changes, those changes will be lost. So one user will have the freshest data, but another user may simply lose data behind the scenes without really realizing what's going on. You have to be very careful when you do that to ensure that your users don't lose data.

And then the other thing to be aware of is that with every single one of these mechanisms, when you actually go to deploy an application, users may end up on a shared instance or they may end up across application instances. If they end up on a shared application instance and you're doing things like fetches with refresh or invalidating objects, then two users who are editing the same object are going to see changes as a result of this broadcast.
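The two costs of invalidating everything that were just described (one query per object as faults are tripped, and the loss of uncommitted edits) can be illustrated with a toy model that counts database round trips. This is a sketch under assumptions, not EOF code: `InvalidateCost`, `Database`, `EditingContext`, and their methods are hypothetical names, and for brevity a single editing context stands in for every context sharing the stack.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model only. None of these classes or methods are EOF API. */
public class InvalidateCost {

    static class Database {
        final Map<Integer, String> rows = new HashMap<>();
        int queryCount = 0;

        /** One query that returns a whole set of rows. */
        Map<Integer, String> selectAll() {
            queryCount++;
            return new HashMap<>(rows);
        }

        /** One query for a single row, as when a fault is tripped. */
        String selectOne(int gid) {
            queryCount++;
            return rows.get(gid);
        }
    }

    static class EditingContext {
        final Database db;
        final Map<Integer, String> objects = new HashMap<>();
        final Set<Integer> faults = new HashSet<>();

        EditingContext(Database db) { this.db = db; }

        void fetchAll() {
            objects.putAll(db.selectAll());
            faults.clear();
        }

        /** An uncommitted, in-memory edit. */
        void edit(int gid, String value) {
            objects.put(gid, value);
        }

        /** Every object becomes a fault again, so uncommitted edits are dropped. */
        void invalidateAll() {
            faults.addAll(objects.keySet());
        }

        /** Reading a faulted object trips the fault: one query for that object. */
        String value(int gid) {
            if (faults.remove(gid)) {
                objects.put(gid, db.selectOne(gid));
            }
            return objects.get(gid);
        }
    }

    public static void main(String[] args) {
        Database db = new Database();
        db.rows.put(1, "Movie A");
        db.rows.put(2, "Movie B");
        db.rows.put(3, "Movie C");

        EditingContext ec = new EditingContext(db);
        ec.fetchAll();
        System.out.println(db.queryCount);   // one query brought in all three movies

        ec.edit(2, "Unsaved title");
        ec.invalidateAll();
        for (int gid : db.rows.keySet()) {
            ec.value(gid);                   // each read trips a fault
        }
        System.out.println(db.queryCount);   // three more queries, one per movie
        System.out.println(ec.value(2));     // the unsaved edit is gone
    }
}
```

The design point the sketch tries to make visible is that the original batch query is not replayed after invalidation; each object is refetched on its own when it is next touched.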
If by random chance they happen to end up on two different application instances, even if your code is exactly the same, they're going to see a different set of behaviors. So potentially users are going to be very confused unless you're very careful about the way you're doing this, because from their perspective they're doing the exact same thing, but from the perspective of the WebObjects application instances that are running, they're either not communicating with each other or they are, just depending on where those users ended up. And you particularly start to get into these issues when you talk about coordinating changes and different users having the ability to edit the same object at the same time.

So I want to start going over some of those things: talk a bit about the locking behavior in EOF and how it works, explain why sometimes users see that locking and sometimes they don't, and explain why relationships can change. So let's talk about committing changes within a single application instance. What we're looking at here is editing context 1 modifying and committing changes to an object. It has a to-many relationship; we won't worry about that right now. And as you can see, all of the other objects are in line with what's in the snapshot. What I mean is, in editing context 2 you can see that there haven't been any changes to the object; the data in editing context 2 is the same data that's in the snapshot, and the same with editing context 3. So right now the only user who's committing changes to an object is in editing context 1. He modifies the object and commits; he goes to save to the database, so we see an update to the database, the snapshot is updated, and again, every time the snapshot is updated we're going to broadcast out those changes to the other editing contexts, that is, the other users who are sharing the stack.

So, OK, good. Before, it was getting cut off, but I think it's OK. In this particular case I want to talk about what happens when
two users within the same application instance modify and attempt to commit changes to an object at essentially the same time. So editing context 1 and editing context 2 modify an object. Editing context 1 and editing context 2 are now out of sync with what's in the snapshot. They haven't committed their changes; they're carrying around locally dirty versions of this object with global ID 1. Editing context 1 goes to commit its change. We update the database, checking to make sure that we don't have a locking failure, which in this case we don't, since the snapshot was in sync with what was in the database. The snapshot is then updated and the changes are broadcast out. Now you'll notice editing context 3 has received that broadcast, while editing context 2 receives that broadcast but essentially maintains its own changes. So we'll attempt to merge in those changes, and where there are discrepancies, editing context 2 will reapply the changes that it's already made and maintain those changes.

The other thing to note is that right now the snapshot is in sync with what's in the database. Editing context 1 has committed its change, we updated the database, we updated the snapshot, so those two are in sync. And that's important, because this is essentially EOF's mechanism for doing optimistic locking, right? What happens when you go to save a change is we compare what's in the snapshot with what's in the database, and if they're out of sync we have an optimistic locking failure. As long as those two are in sync, we're not going to get an optimistic locking failure and we're going to be allowed to commit those changes. So let's look at what happens when editing context 2 goes to commit its changes. The database is updated, because like I said the snapshot was in sync with what's in the database, and we broadcast those changes out to the other objects. So the important thing to note here is that the out-of-the-box behavior is that, even when you have an attribute on a given object marked for
locking, editing contexts within the same application instance that share the same stack by default share the same set of snapshots, so you're not going to see editing contexts that are peers of one another attempt to lock against each other.

Now I want to talk about the exact same behavior with editing contexts in multiple application instances. In this case we have editing context 1 and editing context 3 modifying an object; you can see that they're in different application instances, and attempting to commit those changes. So editing contexts 1 and 3 modify the object. You can see that editing context 1 updates the database and we lock against the snapshot. In this particular case the database is in sync with the snapshot, so we don't have any problems with locking. The snapshot is updated and we broadcast out to the other editing contexts sharing that stack, so editing context 2 is now aware of the fact that we've committed this update to the database, whereas editing context 3 and editing context 4 are not. And just to be perfectly clear, from a user's perspective there's really no difference, right? They could have ended up in the first application instance and they could have ended up in the second; they could be in editing context 2 or 3 or 4. They really don't know what editing context they're going to be in.

So let's see what happens when editing context 3 commits changes to the database. We go to lock against the snapshot, and in this case we fail, right? Because we have a snapshot that has been updated, or rather a database that has been updated since the last time we updated the snapshot. So application instance 1 updated the database, and then when instance 2 goes to update that database, it's going to fail with an optimistic locking failure.

So I wanted to go right to the demo and show you some of these behaviors, and there are actually some additional angles that come into play when you're doing this in a real-world situation. From a high level, or from an application instance level,
this is exactly what happens, but because the web is a stateless medium, there are some additional wrinkles that are introduced when you have a web browser that has a chance to get out of sync with what's actually in your application. So if we could cut over to the demo machine, that'd be good.

The demo is about as simple as you can get. This is the EO model for the demo. You can see we have an object called Movie here, with a to-many relationship to Role and a to-one relationship to Studio. Essentially I've constructed a demo that has one component that allows you to edit any of these EOs or their relationships. It actually oversimplifies the case somewhat, because in the real world you have cases where you're going through a workflow, and on any given page you'll see some objects and you won't see other objects. In this particular case you see everything on one screen. It makes it a little simpler, but as you'll see in a second, it's still very complicated.

So I have two different browsers here, IE and Netscape, and we all know how well they like to communicate with each other. These are two different sessions, and I'm going to start to make some changes to the objects. I've done a fetch, so I've pulled all the objects into each of the associated editing contexts. Right now they're on the same application instance. I'm going to go behind the scenes and edit one of the objects directly: I'm going to edit the movie's description and change it from "labor union history", let's say, to "labor union movie". So right now you would expect that if I did a fetch, I probably wouldn't see that change, according to what I told you. Let's go ahead and do that, and before we do, I just want to pull up what's going on so you can see. So we do a fetch against those movies. You can see that we've hit the database, we've actually pulled back all three movies, but we don't see that change, right? So I want to do the same thing, but this time I'll do a
fetch and refresh the snapshot. According to what I've told you, if you fetch and refresh the snapshot, you should pull back from the database, update that snapshot, broadcast out to the other editing contexts, and you should see the change. So let's do that. Sure enough, we see the change.

But I want to introduce an additional wrinkle. Let's actually go to the other session and do something similar. Let's do fetch and refresh, and you would expect that you might see the change, right? But you're not. In this one we're still seeing "labor union movie", whereas in this one we see the old value, "labor union history". And if you go into the console (I'll demonstrate it just to be absolutely sure), when we go in and fetch, sure enough, we're hitting the database, but we're not seeing any updates. So what's going on here?

Well, what's going on is that when this first session went and refreshed the snapshot, it broadcast out those changes to all the other editing contexts. It pulled that data into its own editing context, so we saw the change. It then broadcast that change out to the editing context that's sitting on the server and is represented by this particular session. But when this session went back to do a fetch and refresh, that synchronized the bindings. So it took the value "labor union history" that was saved in this overview field, compared it to the value which had been broadcast out, noticed they were different, and assumed that this particular user was actually making changes. From this user's perspective, he hasn't made any changes at all. And not only has he not made any changes, but he's committed an action that you would think, and that he would think, would sync him to the latest data. In fact it hasn't done that at all, and committing this action has cemented this older version of the object right back into
the editing context. So if we were to hit Save Changes, which does nothing more than call save changes on the editing context, so it merely saves the changed, uncommitted objects within his editing context, it will actually update the database at this point. So let's go ahead and do that. We can see that it's updated the database. So without either user being aware of it, we've actually managed to overwrite the commit that the first user has done and replace it with an older value. This isn't even a case where two users are attempting to edit the same data, but there's still one user fighting against another and overwriting the data.

It's actually sort of interesting. I want to do the same thing, but rather than fetch and refresh the snapshot, which in this case is a button which submits the form, I want to follow this hyperlink. It says "do nothing", which is essentially a no-op action; it simply returns the same page. But before I do that, I want to clear out all the changes, so in both cases I'll invalidate all the objects. We're completely up to date right now. I will commit a change to the database from the back end, so we'll change this back to "labor union", or just get rid of the word altogether. So we're up to date, and in this one we'll fetch and refresh snapshots exactly like we did before, and we can see that it's gone. But in this one we're going to follow the do-nothing hyperlink, and you can see we're up to date. So what's the difference here? The difference is that when we follow a hyperlink, we don't actually submit any of the data that's in the form; we don't update those bindings, and so we see the update that's broadcast to us in the editing context.

The other thing I want to demonstrate is a very similar behavior, but with regard to to-many relationships. So let's clear everything out again, and this time I'm going to make a change to an EO, but I'm going to make the change directly to one of the relationships. So I've now made a change to
this role right here, and in this particular case why don't we start by doing a fetch. So I'll do a fetch, and you can see that we don't see the change in the relationship, which is probably what you'd expect. And if we go to the console, you can see that even though we don't see that change, we've actually gone out and hit the database again. So let's do the same thing, but let's do a fetch on the movies and refresh the snapshot. What do we have in this case? We again go out and hit the database, we again pull back those three rows, but we again haven't seen that update.

OK, why don't we try invalidating that movie. So we invalidate that movie, and again we don't see that change. Let's look at what's going on behind the scenes. You can see that we do a fetch against the movie, so we pull back that particular movie, the one that we've invalidated. We've also refaulted its to-many relationship, so we actually pull back those new roles right here, but you're still not seeing it. So the point is that even when you invalidate, you're not necessarily getting updates to a to-many relationship.

So this time let's invalidate all. We do an invalidate all, and you can see that we've actually gotten that update to that object. But we really have a flurry of database activity, right? We've hit the database once for every single movie, whereas before we were pulling back three movies at a time. That's because we're essentially iterating over an array with these movies in it, and rather than fetching, we're simply pulling all those movies back, and then we're also fetching back each of the relationships, so the to-manys and the to-ones. But in this particular case, when we pull back the to-many, we can see that it's updated.

The other thing I want to show you is the difference in behavior between when you update within a single instance versus when you update across separate instances. So let's make a change. Actually, I think we're invalidated, but just to
be sure, let's clear out the entire cache. Let's make a change. I don't know if you noticed in the model, but I have overview designated as a locking attribute, so you would expect that if two different users make changes to an overview behind the scenes, you should lock on that attribute. But when they're within the same editing context, or sorry, when they have the same shared stack, so you have editing contexts that share a stack, and we commit saves, we're just going to see that they can overwrite each other. Let's take a look at that in the console. We have two sets of updates right there, and we haven't detected any conflicts, because the snapshots are always in sync with the database. So even though those particular attributes are marked as locking, we're able to update.

But let's simulate the exact same behavior with separate editing contexts. We'll start over, pull the plug, make a change behind the scenes, and now attempt to make a change. This is, like I said, analogous to a user in another application instance committing a change to the database. We'll also commit a change in this editing context and attempt to save it, and what you can see is that we've gotten an optimistic locking failure. So the exact same behavior could result in the user seeing either an update to the data or a locking failure, depending on which application instance they end up in.

There are actually a couple of other things we could demo, but I think I'd rather just jump right to questions, and then as people have questions maybe we'll demo that behavior then. So I'd like to bring Steve Miner and [name unclear] out to the stage. They're both part of the WebObjects engineering team, and we'll open it up for questions. [Applause]
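The locking behavior shown in the demo (two editing contexts on one instance can overwrite each other silently, while a save from a second instance fails) follows from comparing the instance's snapshot with the database row at save time. The toy model below is a minimal sketch of that comparison, assuming one shared snapshot table per application instance. `OptimisticLocking`, `Database`, `Instance`, `LockingFailure`, and their methods are hypothetical names for this illustration and are not EOF API; the real framework performs the check as part of its database update rather than with a separate read.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Toy model only. None of these classes or methods are EOF API. */
public class OptimisticLocking {

    static class Database {
        final Map<Integer, String> rows = new HashMap<>();
    }

    static class LockingFailure extends RuntimeException {
        LockingFailure(String message) { super(message); }
    }

    /** One application instance; all of its editing contexts share these snapshots. */
    static class Instance {
        final Database db;
        final Map<Integer, String> snapshots = new HashMap<>();

        Instance(Database db) { this.db = db; }

        void fetch(int gid) {
            snapshots.put(gid, db.rows.get(gid));
        }

        /** A save succeeds only if the row still matches this instance's snapshot. */
        void save(int gid, String newValue) {
            if (!Objects.equals(db.rows.get(gid), snapshots.get(gid))) {
                throw new LockingFailure("row " + gid + " changed since it was fetched");
            }
            db.rows.put(gid, newValue);
            snapshots.put(gid, newValue);    // snapshot and database stay in sync
        }
    }

    public static void main(String[] args) {
        Database db = new Database();
        db.rows.put(1, "original overview");
        Instance one = new Instance(db);
        Instance two = new Instance(db);
        one.fetch(1);
        two.fetch(1);

        // Two editing contexts on instance one save in turn: no conflict is detected.
        one.save(1, "edit from user A");
        one.save(1, "edit from user B");

        // A user on instance two saves against a stale snapshot.
        try {
            two.save(1, "edit from user C");
        } catch (LockingFailure e) {
            System.out.println("optimistic locking failure: " + e.getMessage());
        }
        System.out.println(db.rows.get(1));
    }
}
```

Because every successful save on an instance also updates that instance's snapshot, later saves from the same instance always pass the comparison, which is why the outcome for a user depends on which instance the request happens to land on.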