WWDC2004 Session 427

Transcript

We're going to be talking today about data synchronization. Hello? Sorry. Okay, this is the second talk on data synchronization. The first talk covered the architecture, the core concepts, and the API. It talked a lot about what Sync Services is, all the different parts, and what it means. This talk is a bit more of a how-to: how do you take an application and get it to synchronize data? Now, there's going to be a little bit of overlap with the first talk. I don't know if anybody was at the previous talk; if you were, feel free to doze off a little bit as I go over things, but I do want to make sure that we frame all the context and all the concepts so that it all makes sense.

So let's talk about Sync Services and what it is. Initially we had three applications that you could synchronize, for three data types: contacts, calendars, and bookmarks. You could also synchronize these to devices, phones and iPods, and you could synchronize them up to .Mac. Now what we've added is the ability for you to synchronize your own custom applications. So you can take your app and synchronize your data. You can either synchronize with the existing types of data we already provide, so you can join in on the party, or you can sync your own data, and you can get it up to .Mac and across to other machines.

So what are we going to show you today? We're going to go over a review of the syncing concepts. This will be a bit of a refresher if you saw the previous talk or read about some of this before, but I want to make sure that I cover the basic concepts so you know what I'm talking about. I'm going to talk about what you need to do in an application; I want to make sure I cover the what: what is it that you're doing? Then we'll give you a demo that will highlight and illustrate what we just told you. So first I'll tell you what to do, then we'll show you what that looks like, and then I'll cover how to do it. We'll show some small code
snippets, and we'll talk about exactly how you do all these things.

When you walk away from here, this is what I want you to take with you. I want to make sure that I bring in enough of the overall architecture that you understand what Sync Services is and what it means for your application to be synchronizing. That consists of three main things. First, how to work with schemas, if I can get that to highlight. Data schemas are used to identify the different types of data that you'll be synchronizing, so we're going to cover how you can make your own schema and how to use schemas that are already in the system. Second, how do you manage the sync session? This is the core of the talk: what do you do when you're actually synchronizing your data? We'll go into that in pretty good detail. Third, we're going to talk a lot today (today, tonight? It seems dark in here) about how you trickle sync. Trickle syncing is a way to push small bits of data very frequently, so that from the user's perspective their data is just constantly trickling out. When you make changes in your application, you want to trickle the changes out as soon as you can, and when changes are made in other applications that you're syncing with, you want to be able to pull those changes in transparently to the user.

I'm going to cover a little bit of terminology. Clearly we need to define our terms, so we'll introduce some of that as we go through the talk. And I'm also going to talk a bit about best practices. By that I mean things that you can do in an application to provide the user with the best experience: things that you can do so that syncing goes smoothly and fairly transparently, and so that when problems occur the user has choices about what they can do.

So let's talk a bit about the types of sync clients and the types of syncing that you can do. There are four main types of clients: applications, frameworks, devices, and servers. So let's look at that first diagram again, but from the
perspective of the types of clients. Here we're showing applications. iCal and Safari are both applications; they sync data from within the application. This is what we'll be talking about today for your custom applications: from within the application, you'll be interacting with Sync Services. You might also have a framework that's used by multiple applications. Address Book does this: the Address Book app, iChat, Mail, and other applications can all access a common data store, so there really isn't one process running on the system, or one application, that owns this data. In that case the framework must provide some mechanism to sync, usually by having a daemon that can get invoked when you want to sync. Similarly, when you're syncing to a device, you'll have some proxy to that device, something that runs on the computer and represents that device for syncing. Finally, with a server, you'll also have a process that can talk to that server. Now, although I'm going to focus today on talking about applications, much of what I say is going to be applicable to all these types of clients.

One thing that they all share in common is that they... you know, I really can't find this thing. Toby made it look so easy before. All the data is pushed into something we call the truth. Each user on each machine will have a truth database. That means if you, as a specific Mac OS user, have two different machines that you're syncing, you'll have a truth database on each machine, and if you have more than one user on a machine, each one of those users has their own truth database. And the truth is an aggregate: it's an aggregate of all the data that's synced by every client. So although your client might be syncing some data, you might not sync every field or attribute that's in a certain set of record types. You might sync contacts, but you only sync a few of the fields; there might be a richer client that syncs many more fields. So the aggregate is the combination of everything, even if your client isn't synchronizing
it.

We have three main sync modes: fast sync, slow sync, and refresh. Fast sync is the most desirable. That's done when what you want to push is just the changes since the last sync. If you're able to keep track of deltas between sync operations, then when you synchronize you can just push up changes: added this record, deleted this record, made a mod to this record. So you push up little changes, and then you pull down from the engine only what changed there.

If you don't keep track of the deltas, or if it's the very first time you're syncing, you'll need to do a slow sync. When you do a slow sync, you push up all of your records. What the engine does is look at all the records you're pushing up and compare them to the last set of records that your client had pushed. If you don't push up a record that you had previously sent, that's treated as a delete. It also compares all the fields, so it can tell what records have changed. This is clearly a fairly expensive operation, especially if you have thousands of records, so it's always more desirable to do a fast sync when you can. But there are a few cases where you'll have to do a slow sync, and I'll talk about that in a little more detail as we go through the slides.

Finally, you can do a refresh sync. This is typically done when you've lost all the data on your machine, or you've lost all semblance of state regarding your sync sessions. If this happens, you can tell the sync server: forget anything you ever knew about me, I'm just going to refresh, we're going to start over as if we've never synced before. Typically you'll be doing that to pull everything down from the engine so you can reset your state, but there are cases where you might have added a few records and you also want to push those up to the engine, but still do a refresh. For instance, you may have lost the battery or power in a phone and lost all the data, but you're traveling and you've added a couple of new contacts, so you've just got two contacts on this phone. You tell the engine
to do a refresh sync, you just push up your two contacts, they're added, and all of the contacts that had initially been in the sync engine are pushed back down to your phone. So with these different modes of syncing we give you the performance when you need it, but we also provide security that you'll always be able to recover all your data and that you'll always be able to sync every record. The most important thing when you're syncing is correctness; second most important is speed. Most users are going to think the most important thing is speed, especially when things are slow and take a long time, but trust me, the minute you lose some of their data they're going to realize what was most important, and that's being correct. I'm sorry, I was supposed to show you that before. I'll be doing that throughout the talk, so get used to it.

There's one other mode of syncing, which we call trickle sync, and it's not really a mode in the same sense as the other three. It's more like a way of life for your application. If you trickle sync, you routinely fast sync, and there are a number of different places, opportune moments, when your application can do a fast sync. You try to do it in the background and make it transparent to the user. One important point: an application probably shouldn't ever synchronize unless the user has told it to. So when your application starts for the first time and the user starts to use it, you should provide them some configuration so that they can say, yes, I want to sync my data. At that point they've given you carte blanche to sync whenever you want, so you should go off and sync small changes frequently. We talked in the previous presentation about how the more frequently you can sync, the less data you have to sync each time. So it's more like a continuous flow that doesn't put much load on the system, doesn't seem to take a long time for the user, and keeps things synchronized more expediently. If you make changes in your app and push them out right
away, they're available to other applications and available to get pushed up to .Mac right then and there. Similarly, when changes are made in other applications, you'll be notified, and you want to pull them in as soon as you can so you always have the most up-to-date representation for the user.

So let's talk about what you need to do in an application to sync it. What are the five most important things, the five main things? I've grouped what I think are the core, fundamental pieces that you're going to have to put into an application to sync.

The first one is setting up a data schema. A data schema represents the types of records that you'll be syncing in a canonical form that the engine can understand. This is important for a few reasons. Most applications have their own way of representing data. You could use objects all the way down; you might archive them or store them in a graph; you might just have a simple text file of comma-separated values; you could be using a database. All of these things are going to be different for your application from other applications, but the engine needs to know one specific way of representing this data. So in the data schema you'll define the set of entity types that you're going to be synchronizing. Entity is to record kind of like class is to instance for objects: an entity is the type of object, and it'll have a set of attributes. So when you sync contacts, you'll define a contact entity and a phone number entity, and you'll define all of this in a schema that can be used by the engine.

Another important point about this: when you sync, you're not just syncing your application to the sync engine, you're syncing with every other client that synchronizes. Those clients don't want to know how you represent your data, and you don't want to have to know how all these other clients represent their data. You don't want to have to know how data is represented on a phone, you don't want to know how it's represented on a server, and you don't need to. You just need to
know one way to represent it, and that's the way that's defined in the schema.

The second thing you have to do is configure your application. You could have the greatest sync client in the world, but if you don't configure it, nobody's going to know about it and it's never going to run. Configuration is actually fairly simple. It's a combination of a static file that defines some properties and a very small bit of API that you use. By configuring your client you're registering it with the engine, making sure that the data schemas that you want to use are registered, and specifying any alerts that you might want to get when other clients synchronize. It's pretty straightforward.

The fun stuff is the actual syncing control flow. We have a syncing state machine API that you use. We have a fairly object-oriented API in general, but the actual act of data synchronization is a state machine: it's a procedural API with methods you invoke on a sync session object. We went with a state machine and a procedural API for flexibility. By providing an API like this, there are a lot of different junctures and points where you can opt in and out of the sync, and you have a lot of flexibility in how you want to do the sync. If you get involved in a sync operation that starts to take too long because other clients are also syncing and they're taking a long time, you can opt out of the sync. You can cancel the sync if something goes wrong at any time. You can complete it in the middle and take up where you left off on the next sync. You have to do the sync steps in the order we're showing you here, but other than that, you can leave after you push your changes, you can stop syncing before you pull anything down, or you can go all the way through. It's up to your application.

I talked before about having a data schema to represent your data in a canonical format. That's kind of the class; this is the object. When you take any of your data and want to transfer it over to the sync engine, you'll be creating essentially instances, which
we call records, of the schema types that you've defined. When you are pushing records up, you take the records or the objects in your application and transform them into records based on the data schema; these are essentially NSDictionaries. You push these dictionaries into the engine, and when you get changes back, you'll be taking dictionaries, or sets of field changes that are represented in something we call an ISyncChange, and you'll be transforming that back into your own record types for your app.

Finally, for fast syncing you need to do some state management. If you want to fast sync, you have to keep track of the differences between the previous sync you did and the current state of your data. This is for both adds and deletes, modifies being somewhat of a special case of add. If you've modified a record or added a record, you can put it on a list. When you delete a record, you can put the ID in a list. When you go to sync, you just consult these lists, and that's all you'll have to hand to the engine: I added this, I modified this, I deleted this. If you don't do that, you'll have to give the engine every record. There are a few different ways that you can maintain state. You could keep timestamps on all your records, you could keep a dirty bit on the records, or you could have a list of what records you've changed. What you do is going to be whatever is most appropriate for your application's data model. I find that keeping a list of what's changed is typically very straightforward. You'll also want to save that list in a file if you quit out of your application without syncing. That way, when you restart your application the next time, you'll still remember the deltas, so you'll know what you can sync.

There are a few other things that you can do too when you're syncing your application. Whenever an application does anything, it shouldn't just go off and do something without the user knowing what's going on, so you need to provide some sort of
feedback. Most sync operations will hopefully be very brief. Some of them might be so fast that the user couldn't even detect them, but others may take a few seconds or longer. So if you're engaged in any kind of sync operation, make sure that you put some animation up so the user knows what's going on. You can use the spinning bar animation, you could put up a status pane, or you could just put some status text somewhere so the user can look and see what's happening. Users tend to get very annoyed if an app becomes unresponsive. If you're syncing in the main loop of your application, the user might be clicking in a text field trying to do something, and even if it's just a second or two, it'll be confounding to someone who's trying to use your application if it doesn't respond. But if they see something spinning, that's kind of a cue: okay, something's going on here. They'll get used to seeing that, and it'll be less troublesome for them. I mentioned trickle syncing before: sync often, and try to sync fast. If you can do that in an application, it's going to be a much smoother experience for the user.

So I just told you what you have to do, but there are a lot of things you don't have to do. You're probably thinking, boy, I have to do all these things, what did you guys do? Well, we actually did quite a lot. I think we did a lot of the heavy lifting, so that it's fairly straightforward for an application to synchronize, and these things that you don't have to do are fairly complex.

The first one is conflict management. You don't have to worry about conflicts between different sources. In fact, you can treat syncing almost as if you're in isolation: your application just syncs into the engine. If there are conflicts with records from other clients, we'll take care of that. We'll notice the conflicts, we'll keep track of them, we'll present UI to the user to resolve them, and then we'll handle merging in the conflicts correctly. Also, you don't need to present what will change
to a user. If you're going to make a hundred changes, or if you're going to remove a thousand records, we'll pop up something we call an airbag. It's an opportunity for the user to opt out of that sync: we'll tell them these changes are about to be made by this sync client. So you won't have to worry about doing anything to notify the user about changes that you're making due to a sync.

You don't have to detect duplicate records. Sometimes when you sync two sources for the first time (a very common case is a phone and Address Book, or a phone with calendar data and a calendar), you'll have the same record on both devices or both clients. We'll detect if those are duplicates, and we'll handle merging them together to present them as one unified record. So you won't end up duplicating every contact you have just because you synchronized a phone that happened to have all the same contacts on it already.

You don't really have to pay attention to other clients or worry about .Mac. We've got a decoupled architecture regarding all the clients: your client just syncs to the engine, and you worry about that. If you get that right, we'll fan all of your changes out to the other clients and we'll get the changes from them into you. We'll take care of everything. You don't have to do anything special to sync to .Mac; .Mac will automatically be able to sync your data types up and down to other machines, as long as you've defined a schema for them.

Okay, now I'm going to bring Nancy Craighill up to do a demo. She's going to illustrate many of the things I just talked about. You wired up? Clicker, please.

Sorry, I'd like the slides first. So we want you to feel confident when you leave the session that you too can write syncable applications. We selected a slightly more sophisticated example, not a trivial one, so you can get the most out of this session. Also, when you're watching the demo and the rest of Gordie's slides, I think it's time to begin to think about how you might modify your
existing apps to sync and how you might create a new application that's syncable.

Okay, so what you're going to learn from the demo, and the rest of the talk really, is how to sync your custom objects. You can sync, like Toby and Gordie said, all the contacts and calendars and bookmarks, but we think it's a lot more exciting if you create your own object models and you sync those objects. You're going to learn how to sync relationships in your object model, and you're going to learn how to sync your applications simultaneously. If you combine that with syncing your applications often, then you'll learn how to trickle sync. And the good news is that the demos you're seeing today are available now on your Tiger seed DVD, so you can go to Developer, Examples, Sync Services, and if you're gung ho you can open up your Xcode project now and follow along, because Gordie's going to show a lot of the details later that relate to the schema files, the client description, etc.

Because it's a sophisticated example, I'm going to take a moment to tell you what the architecture is and the object-oriented app. Of course you have the sync engine and the truth database at the center. We have one app we call Events; it's just going to import iCal files and create custom event objects. The second application is Media Assets, and it's just going to go to any old iPhoto library year folder, parse those files, and create custom media objects. Each of these applications has its own local database store.

So what does the object model look like? There's an event object, which corresponds to a wedding or a birthday party, and a media object, which corresponds to a photograph that was taken at an event. So naturally there's a to-one relationship from media to event and a to-many relationship from event to media. This is a special relationship, because if you set the to-one relationship from media to event, you would expect that media object to be added as one of the
destination objects of the to-many from that event to the media. So we call this an inverse relationship, and the good news is that Sync Services supports inverse relationships in the sync engine and will maintain the integrity of the inverse relationship even if you don't. Okay, and this is an example of an event: we went to Mendocino, to the beach, and here are the photographs. In retrospect this slide is probably not needed, but I just wanted to show off my photography. Can we go to demo one?

All right, so this is the Events application. It's a simple master-detail interface. I've already loaded the iCal file, by the way. For those of you in the back row, I'm going to zoom in. Okay, you see that an event has the title attribute, start date, and end date. I'm going to point out a couple of other things. Down here you see the record ID, and below that is the client ID, so each application has its own client ID. Another area I want you to look at is over here. Now, when you do your applications, you're not going to have a big old ugly Sync button here and a Trickle checkbox, but we added those because we're going to first show this demo slowly, step by step, to show you the process of what's happening between the two apps, and then we'll speed up later by turning trickle syncing on. We also implemented a calendar view, because you typically want to view your events on a calendar.

So at this point the Events application has the local event objects. I'm going to push the Sync button. There's going to be a progress indicator that runs here, and it's going to push the local event objects out to the truth database. Well, that's the end of the demo. No. So this must be a Tiger bug, because I can't hide it. Hide it like that. Let's bring up the Media Assets app. Go away. Okay, the Media Assets app: same thing, it's a master-detail interface. I'll import an iPhoto library of 2004 photos. And down here you see (it kind of helps if I zoom in) that basically this title is one of the attributes, the
date of the media object; the image is just a URL, and it's being shown here below. Now there's an Event pull-down menu. You can't tell that I'm actually pushing the mouse on there; nothing is appearing, because this application doesn't know anything about event objects right now. Again, it has a record ID under here for each record, and it has a client ID. Okay, so if I now push the Import button, it's going to push the media objects to the truth database and pull over the event objects. I do that all the time, sorry: Sync. There we go. So the events just got pulled over and populated into the menu.

So I happen to know that's Chinatown, and this picture was taken at Chinatown, but I'm not going to make you sit there while I set all of these relationships. What we did was create a Smart Events button, and it's just going to run down these media objects and assign them to the most logical event, matching the date. So if you push that button, now all my media objects have events. Again, just to review: we've just created to-one relationships between the media and the event objects, and to-many relationships from all the event objects to the media. Again, it's just local and I haven't pushed it yet. So if I push Sync, it's going to push it out. Then we'll bring up the Events application and the calendar view, and when I push Sync here in the Events app, it's going to pull the media objects and both the to-one and the to-many relationships, and hopefully we'll see them on the calendar. There.

So let's turn on trickle syncing in the Events app, and then we go back to Media Assets. We'll import some more photos for February, March. Okay, they're down at the bottom. Push Smart Events to create some more relationships. Let's move this up to February. Okay, now when I push the Sync button, the Events app is set to trickle sync, so it's going to get an alert that Media Assets is syncing and it will begin syncing simultaneously. So you have to look quick: there will be a progress indicator over here, and there will be another
progress indicator over here. Huh. No? So... Okay, now we'll go back to Events. This is a multi-day event, the Tahoe skiing trip. Let's say we want to change the name of that. But I want to show this off, so I'm going to find Tahoe skiing over here too, and let's change that to Winter Trip. Now when I hit Tab, it's going to modify the local objects. It's set to sync about every five seconds, so there will be a moment's delay. It will sync: when it updates the local changes, it will update the Tahoe skiing entry down here, and then when it pushes the changes out, it'll update Media Assets, if I turn trickle syncing on. Ha ha. All right, I haven't hit the Tab key yet. Now I'm going to hit Tab. There it goes. And it didn't work... there it goes. Right, it works.

Okay, so let's have a little more fun. This is a multi-day event, so let's move some of the pictures. Let's say that this picture was actually taken on the sixteenth, and it should appear over here on the date of the sixteenth. Hit Tab. I'm waiting for the sync. There it goes. For those of you who missed it, we'll do it one more time with this photo, to the 18th. Hit Tab. It should sync up. Here we go: 18. Okay, two applications sharing the same content and syncing together. That's it. So let's go back to the slides for a minute.

Okay, you're all probably wondering a little bit how it's implemented, so I'm just going to cover that briefly, especially if you're looking at the code. It does use the model-view-controller paradigm, and the models are the syncable objects in the design. There we go. Okay. We use Cocoa bindings, of course, to update all of the changes that are made locally in the app, and in addition, when you're pulling all the changes and applying them to the local objects, that's how the displays are being updated. We also used key-value observing, which is the underpinning of Cocoa bindings, and we use that to record all of the local changes. As Toby and Gordie were saying, that's one of your jobs: to record all the changes you make locally for
pushing later. We also found transformers, that is, the NSValueTransformer class, useful for converting your models to records before you push. Then when you pull the changes in from the sync engine, you need to apply them to your models, and sometimes when you get additions you need to create models, so we use transformers there. We also use transformers for resolving the relationships. Gordie is going to show that in more detail, but the relationships that come from the sync engine are not what you might expect; you have to convert them to actual references to your objects. And I think that's it. We can go back to Gordie for more details.

A little bit about how we did many of the things you saw in the demo. Nancy actually wrote the demo, so she gets all the credit, but if I get it wrong you've got to forgive me. I do promise to do better with the clicker now. Huh. Okay. I talked about the five main things you need to do to sync. Let's just recap quickly. You'll set up a data schema. You'll have configuration for your application. Then you'll have the main sync loop; before, I mentioned that's the meat of syncing. For you vegetarians out there, that's the tofu of syncing. You'll have data transformation, and Nancy just touched on that a little bit. And then, of course, keeping track of your data so that you can fast sync. Let's go over that now.

Let's look at the schema. So what goes into a schema? Essentially you're defining entities. In the example we just showed you, we had two entity types: a media asset object, which had a picture, a title, and a date associated with it, and an event object, which had a title and a date. We mapped the media objects to the event objects. We have attributes, such as the title and the date. We also have the relationships, such as the relationship from the media asset to an event and the relationship back from the event to the media asset. Now, I'm being a little bit redundant, but I want to make sure that we didn't skip over anything. Here, this is kind of a pictorial
representation. I'm just going to run through it from the top to tell you all the different parts of a schema. You start off with a data class. A data class is actually somewhat of an informal construct. It's used to present what you're syncing to the user. So if your application has a number of different entities that you want to sync, you can group them together in one data class. Examples of that are contacts and calendars: rather than specifying every entity type to the user and providing them with way too much information, you can summarize it by naming it in the data class. As I mentioned before, you're syncing entities, so a data class consists of a number of entities. Deconstructing further, we can see attributes. These are primitive types, basically what you would put in an NSDictionary, and they represent the different attributes of each entity. In a contact you would have a name: first name, last name. We just saw an example in what we showed you. You also have the relationships.

One other thing that we didn't mention before is identity properties. The first time you synchronize a new object or a new record from one source, what we'll do in the engine is compare it to all the records we have from existing sources. If we see that it's the same record, we won't duplicate it. That way, and I mentioned this before, when you sync your phone for the first time with Address Book, you're not going to duplicate every entry that you've entered dutifully into both. The way we do that is by having the schema specify the identity properties. These can be attributes or relationships. You might want to scope the identity of something through a relationship. For instance, for a phone number, you might scope its identity through the enclosing contact by specifying the relationship from that phone number back to a contact, as well as the type and the value, which would both be attributes. So you tell us dynamically. This isn't something that you're stuck with; it's not a static
description, but something that you put in your schema that we can use, and it can be different for each data class or each entity type. You tell us what the identity of an object is and how to notice that, and we'll take care of mapping duplicates.

So let's look a little bit more at the sync schema. It's a plist, straight up; it's very straightforward. You'll have a name for your schema. That way any introspection tools, any UI that can be used to look at it, will be able to determine the exact name of the schema. Also, the engine has to be able to identify schemas uniquely; you don't want two schemas with the same name, so we recommend that you use a DNS-style name. Here we have com.apple.syncexamples as our name. You'll have a set of data classes. Usually you'll just have one data class; for instance, with the contacts schema we just have the contacts data class. But depending on the complexity of your application and the choices you make about how you want to organize your schema, you could put more than one data class in one schema. It's up to you. You have a list of entities, and that's really the main thing that you're going to be putting inside of the data schema.

So let's look at an entity. Each entity also has a name. It's DNS-qualified as well, so that it doesn't conflict with other entities. Entity names are in a global namespace; they're not just mapped within the schema that they exist inside of. The entities are treated as global. That way you can refer to an entity in another schema if you want to extend something, or refer to a data class in another schema that you're adding an entity type to. So you have to make sure that you use a unique name. You specify the data class that your entity is in. You give it a display name. The display name would be used, again, by any user interface, so that the user doesn't get stuck looking at really long, weird, disambiguated names; in this case, Media makes a lot more sense to the user. Then you have your attributes, relationships, and
identity properties, and let's look at those.

Attributes are very simple. I've just included two here. I put an ellipsis at the bottom because this isn't everything; it's kind of hard to fit things, and I hope you can see this. Actually, I've noticed in some of the presentations it's hard from the back to be able to see when we put code up, or any kind of text like this. But in this case I'm specifying two of the attributes, the date and the title; this is for an event object. You specify the name and the type. The name is the field that you're going to use in a record dictionary that represents one of these entity types, and then the type is just simply what it is.

Here's a list of the attribute types that you can use; I just put it here quickly for completeness. Standard stuff that you can put into a property list. Also, you can use an array or a dictionary as a primitive type, but you need to be careful. We're doing field-level differencing: if you have a record and it's got five different fields, we'll difference those fields independently. So if one record from one source changed field one, and another record from another source changed field two, that's not a conflict; we'll merge it together. But if one of your fields is an array or a dictionary, that entire collection is going to be considered the atomic unit for that field. So if you make one small change to it in one source and another change in another, that's going to cause a conflict. There are some cases, though, where it's very convenient to be able to use a collection, but wherever you can, it's best to use separate attributes for each one of your semantic fields.

There's a few additional types. Calendar date, just because it's so useful; you can't put calendar dates into property lists, but you can put them inside of a record. We have NSData, in case you want to take something like an image, or something that's your own object type that's not represented by one of these; you can just sort of
stuff it into an NSData. You can also specify an enumeration of strings. This is useful to have a bounded set of strings, and the engine will actually do some consistency checking for you. So if you want to have weekdays, Monday, Tuesday, Wednesday, and so on, you could specify those rather than just saying string and then possibly mistyping something. You can also have a URL, which is very useful to reference things elsewhere.

So let's look a little bit at a relationship. A relationship starts off with a name, just like an attribute does. It has a display name. Now, I didn't show this for the attribute because it wouldn't fit on the slide, but both attributes and relationships have a display name. That way, as tools are developed that can do introspection into these things, you can display something a little bit more meaningful than the normal name that you'll pick. Notice that the names of the attributes and relationships don't need to be DNS-qualified; their scoping is local to the entity that they reside in.

For a relationship, you specify whether it's one-to-one or one-to-many. We have a one-to-one relationship in our example, from a media asset object back to an event: each media object corresponds to one event. However, the events can have many media objects. So in one direction we're specifying a one-to-one relationship, and in the other, a one-to-many. We're showing the media right here. I probably should have mentioned this: this is the relationship in a media record back to an event. You specify the target type; this is the fully qualified target type of event. So here we're just simply specifying that we've got a one-to-one relationship to an event. I did a lot of talking about what's actually something fairly simple.

You can also specify an inverse relationship. These are very useful when you want the engine to do some consistency checking for you. If you've set up a media asset to point back to an event, you want that event to contain that media
asset. Similarly, if you move a media asset's relationship from one event to another, you want to make sure that it's unwired from that first event and wired into the second one. You can specify an inverse relationship. Now, this is a little tricky to look at outside of context; if you look at the examples that we've provided and you look at the entire schema, you'll be able to see how this is wired up a little more clearly. Clearly, if I could say that clearly; there's a metaphor in there somewhere. We have the entity name for the inverse relationship and the name of the relationship that points back. So we're saying a media object has a relationship to an event, and then the inverse relationship is from the event's media relationship field.

This is just how you specify identity properties. It's very simple: it's just a list of attributes and relationships that are being used for the identity of that record. In this case we're using the date and the title of an event to identify it uniquely.

Okay, so let's talk about what you're syncing. I just described the classes; now let's talk about the instances of those, or the records. When you're syncing an object, you have to push up two things: a record dictionary and a unique identifier for it. Now, the identifier must be unique across all of the entity types that you're synchronizing. So if you have contacts and you have phone numbers, you can't use the same identifier for a contact that you use for a phone number just because they're different entity types. You always have to make sure that all of your identifiers are completely unique. However, you don't have to worry about other clients; your client has its own namespace for all of its identifiers. You need to put the entity name in a record. That's essentially like the isa pointer back to a class inside of an object: by specifying that, the engine knows what kind of record it's dealing with. If you didn't put that in the record, we'd look at this NSDictionary and it'd be filled with all kinds of great fields,
but we wouldn't know what it was. So you always have to make sure that you put the entity name in. Everything else that goes in that record is just a set of key-value properties. For an attribute, the value is one of the types I showed you before, so it's just set straight in the dictionary. For a relationship, if it's a one-to-one relationship you'll have an array with one element in it; if it's a one-to-many relationship you'll have an array with zero or more elements. The reason that a one-to-one relationship still uses an array is for consistency, so you don't have to have code that's doing isKindOf checks all over the place to see if this is an array or just a singleton object. Relationships are specified by using the unique record identifier of the target.

Now, one thing about record identifiers. Often, when you're using a relational database, you can construct a unique identifier with some combination of your primary key and your record type in the database. So you might want to use what you have for a primary key as part of the record identifier. If you do that, you may not want to put those fields into the dictionary for the record or into your schema, because it's redundant. You'll be using them for the record identifier; there's no reason for you to also put them inside of the record itself.

So let's look at what an application sees. A user will look at an application and they'll be presented with some kind of visual representation of your objects. In the application you've got your own objects internally. Like I said before, these could be structs, they can be objects, they can be constructed out of strings, whatever you want. When you're syncing, you need to transform those into records. These are probably a little hard to see from the back, but these are just straight-up NSDictionaries, and that's what you're going to be syncing back and forth to the engine. So let's look at one in more detail. This is a media record. In white I have the actual name of the entity, and then I'm just highlighting the
relationship, to separate it from the attributes. Very straightforward: it's just a dictionary. We have an array for the event, which is a relationship back to the enclosing event. Hey Lou. Okay, this is what an event looks like, and the only difference here is that it has a list of media objects, since it's a one-to-many. Otherwise very similar. So we have a very regular way of specifying all the records when you're syncing. You don't have to worry about pushing up different objects in different ways; everything eventually grounds down to just being a dictionary.

So now let's talk about configuration. We know how to describe the schema for the data in our client. Now what we're going to do is set up a client description property list. This is also a plist file; it statically describes the characteristics of your client. So what does it have? We've got a list of the entities and the properties in those entities. Now, you might have a data schema that you're sharing with other applications, and there could be a whole slew of entities in there and a lot of properties, because you're trying to cover all the bases. This particular sync client that you're writing may not use all those entities; it might not use all of those fields. That's fine. In your client description property list you'll specify the subset that you use. That way the engine knows which entities and which fields to be giving to your client, and it also knows what to expect from your client, so it won't erroneously delete things just because your client doesn't pass up certain attributes.

You can specify whether entities are push-only or pull-only. Most of the time you'll be both pushing and pulling entities: you'll be contributing to the pool of data and you'll be pulling in changes. But sometimes, for instance in the case of an iPod, you'll only be pulling things down. The engine can make certain optimizations in its data store when you give it that information. You also specify what type of clients you want to sync with. So if you're
an application, typically when other applications sync the same data types that you do, you want to get notified so you can sync. When .Mac syncs you'll want to start up; when devices sync you'll want to start up. And we'll show that for our examples: here we specify that we wanted to sync when .Mac syncs and also when each of the other applications sync.

Here's a look at a property list. Very straightforward. We have the display name for the client, and you can also specify (if I can get this to go... here) an image path. When you have a user interface that presents a list of clients, which we provide for you, it's nice not only to have the name of the client but to have some kind of an icon that represents it. Often it will be the same as your application icon, but in some cases you might choose something different, something that sort of illustrates that this is data that you're synchronizing. So you can specify an icon relative to the path of this property list, and that way we'll present something nicer to the user than just simply text. We also have a list of the entities, as I described. Here we're specifying the event entity and the fields that we're going to synchronize. I'm kind of going through these fast because this is pretty straightforward stuff.

Okay, let's talk about syncing. Now this is the interesting part. When you're going to synchronize, the first thing you need to do is register your data schema. Now, it's not that expensive to reregister the schema every time your application starts. You don't want to do it every time you sync if you can avoid it, but what you can do is start your application and just register the schema without worrying if it was already registered. Typically your data schema file isn't changing, so all this amounts to is a quick stat by the sync server: it checks to see if there are any differences, and if there aren't, it actually doesn't do anything. So when your application starts up, make sure that your schemas are registered
that you're using. Then you need to register your client. Now, I'm sorry, I screwed up the slide; I've done this so many times. Let's talk a little bit more about registering the schema. I'm just going to show you the code now. I don't know if you can see this, and if you can't I'll talk through it a little bit more, but it's very straightforward. What you'll do is keep your schema in a bundle in your application. You might as well keep the schema localized if it corresponds to your application. Now, we mentioned that sometimes data schemas are decoupled from applications; certainly from the engine's point of view, it doesn't make any assumption that a schema corresponds to any one given application. If you're just providing a schema in your app for your own use, you can keep it inside of your resources. If you're not, you might want to put it inside of a framework somewhere so other applications can access it. Once you have it, it's a simple path, and you make a call into the sync manager.

Now I'm introducing the sync manager, an API, here. There's just a few objects that you'll need to use in order to effect the sync, and the sync manager, as the name implies, does mostly management-type functions. You use it to register your schema, and you use it to register your client, as you'll see. Very straightforward call: you just pass in the path and you're done.

Now, the second thing I started to talk about is registering your client. You only need to do this if you haven't registered it before. So when we look at the code, if I can get to it... I'm convinced somebody's got a voodoo doll for this clicker somewhere. Sure, there we go. Okay, so when you're registering a client, you check to see if the client is already registered. What you do is, by specifying the client's identifier, you ask the sync manager for your client object. If you get it back, great, you're done; you can just return. If not, then you'll need to register it. Registering it is very similar to registering a schema: you just tell the sync manager
you want to register a client. Now, the one difference is that when you register a client you provide an identifier: you provide both the client description file and the identifier, so that you can refer to it again in the future, for instance to start a sync operation. And the other difference is that you'll get a client object back, so you can then use that to proceed through the sync operation. This is really hard. Okay, okay.

The second thing you'll do when you register a client is specify an alert handler. This is pretty straightforward; we're just doing this so that we can sync when other applications or servers sync. In the example that we had, we specified programmatically to the engine that we wanted to synchronize when applications or when servers synced. We also specified an alert handler that gets called inside of our application. So you've got a running application; while it's executing, if some other server or application goes off to sync, you want to get notified so you can join in and sync at the same time. That'll happen in the main run loop of your application, and it will happen just with a simple callback that we're specifying here. I should note that you could also specify what types of clients you want to sync with in your client description property list. It didn't really fit on the screen when I made an example of that before, and I also wanted to highlight that you can do it programmatically.

Okay, once you've got all that done, you're ready to sync data. So let's see what we have to do for that. The very first time you sync, you need to do a slow sync. You do a slow sync because you don't really have any basis to compare to, so you're going to push up every record you have. If you've done that, you'll be able to fast sync the next time. And we've pointed out that you want to try to trickle sync as often as possible, and when you trickle sync you only want to push up deltas, to make it fast. When your application is launched, you'll want to sync. Now, when we showed the demo
before, we had a checkbox for trickle sync and a button for syncing. In a real application you wouldn't have a sync button, nor would you have that checkbox; you would just have trickle sync behavior all the time. When your application starts up, it would make sure that it had the most current set of changes, so it would sync immediately and pull changes down. Similarly, before you exit, you want to synchronize. If you synchronize before you terminate your application, that ensures that changes the user has made are not only flushed to a data file and saved, but they're also synchronized out to the rest of the world. I've really got to figure this thing out; it's like if I point it at somebody over there, it seems to go.

Okay, so we mentioned before that a sync session is a finite state machine, so you have a certain set of steps you go through. I want to point out that pulling changes down is optional; if you want to, you can just push changes up. So when would you want to do that? Possibly when your application is exiting: you'll just push your changes up and quit. When users quit an application, they don't want the application to sit there forever while they're waiting for it to sync. They want it to just get its business done and exit quickly. So you want to make sure you don't spend a lot of time, when an application is terminating, going through a full sync operation. You can opt out of this at any point, and in this case you would just push your changes up and then exit. You do have to execute these steps in order, though.

So let's look at them in more detail. When you start a sync session, you can specify a blocking call or a non-blocking call. A blocking call would typically take a timeout; you don't want to call into a blocking method and then just wait forever and have your user locked out. If you call the blocking call from the main run loop, you typically want to specify a timeout of around two seconds. After two seconds you're going to get the little spinning beach ball, so you can specify an extra
second and hope it doesn't happen, but typically you want the sync operation to start quickly or you're going to bail out of it. With a non-blocking call, you give a callback to the engine and make a call that returns immediately, and then at some point in the future the sync operation will start.

Now, there's a couple of issues you have to be careful about. One is responsiveness, which I just mentioned: you don't want to go off forever waiting for a sync to start. Secondly, if a user goes and makes modifications to data, you want to make sure that you use the data as of the point the session actually starts. So if your application decides to sync and makes that call to start a sync session, don't collect any data to use in the sync yet; wait until the sync session actually starts, and then you can use it. If you have a blocking call, don't loop. If you call this blocking method and it returns without being able to sync (it actually returns yes or no, whether or not you've got a session), don't immediately call it again. First of all, you would just be banging on the engine. Typically the reason it returned no is because another client is syncing the same entity types as you. So if you have a phone that's synchronizing, and then Address Book decides it wants to trickle sync but the phone is already synchronizing, it could take a while for that device to finish. So if you ask the engine to start a session and it returns no, you probably want to wait a sufficient amount of time before you try again. Or an even better approach is to just use the non-blocking call all the time.

So here's some code to begin a session. Notice at the very top, the first thing that we do is save our file. Now, the code snippets I'm showing you are actually taken from the demo. So here we save our data before we start a sync. You don't want to synchronize data you haven't saved in a file, because the next time your app starts, if it was unable to save, you're going to be out of sync, no pun intended, with the
engine. So the first thing to do when you're going to synchronize: save all your data. And then at the bottom here you can see that we're starting a session. Now, I actually committed an egregious crime here: I specified five seconds. I did that just for testing and I ended up leaving it in the slide by mistake. I'm using a blocking call and specifying a time long enough that the beach ball is going to come up, as it takes more than two seconds. That's going to annoy a user, so you typically want to keep that limited to two seconds or less when you use the blocking call.

The next thing you need to do, once you've actually established the sync session, is negotiate: do you want to do a slow sync or a fast sync? So I'm going to show a little code here from our app. Even though we mentioned that you need to do negotiation up front, and we sort of show it at the beginning of the sync operation, you can spread a little bit of it out if it's more natural for your app, and you'll see how we actually did spread that out. Now, in the previous talk we discussed the syncing modes, and then talked about how we don't actually have a call into the engine saying "I want to fast sync" or "I want to slow sync"; there's no callback to your client asking it what it wants to do. We actually have a set of methods that you can use, because it's much more flexible.

So in this case we're checking to see if we want to do a refresh sync. We would do a refresh sync in our example if we lost our data file. So if we start our example code up and our data file's gone, we'll refresh sync; that way we'll restore everything from the engine. We also do a slow sync in some cases. We do a slow sync the very first time we ever synchronize. Also, we catch errors with exceptions during the sync operation; if anything goes wrong during the sync operation, on the next sync we force ourselves to do a slow sync. So in this case we're telling the session whether or not we've actually reset all the entity names so that we can refresh; that's the first line. Or, in
the second if clause, we're actually checking, and we're telling the engine that we want to push all of our records, which essentially amounts to a slow sync. But besides what your client wants to do, you also have to ask the engine what it wants you to do. A user might have gone and said, "I want to reset every client from .Mac." At that point, when you sync, the engine won't want you to push any records; you're going to be reset. Also, something might have gone wrong during the sync operation from the engine's perspective, so the next time you sync it will want you to do a slow sync. We'll see where we ask those questions in the API as we proceed.

Now we're going to go push changes. Very simple flow chart here, just to show you what we're doing. If we want to push all of our records, then we're going to get every record that we have, convert it, and push it up to the engine. Otherwise we'll just push up the deltas. I want to point out one thing: when you're pushing records, you don't have to push an entire NSDictionary record. You can push something we call an ISyncChange. In this case we've got one change to a record: we've changed the title of an event to "Sean's birthday". So I've got a very small ISyncChange object. But if I wanted to push up the entire record, I'd have to push up all the relationships and all the other fields, and you can see that's a lot more information. That can add up if you're doing it for every record and you have a large amount, especially when you're doing an initial sync. So if you push up an ISyncChange, you'll save the engine a lot of time walking through an entire record trying to figure out what exactly has changed in it.

Now here's our code. This is a little bit dense. I pointed out that negotiation is actually split up. I showed you a previous slide where we told the engine our intentions, but you also have to ask the engine what it wants you to do. So we're asking it if we should actually push changes for this entity type. We're walking through
all of our entity types, so in this case media asset objects and event objects. We're going to ask the engine, should we push them? There's a few reasons why it might tell you you shouldn't. It might be resetting you from state somewhere else, or you might have, in configuration, told the engine "I'm not going to actually sync this entity type." A user should be offered the choice: you might only want to sync events and not media assets. So if you turn one of them off, you don't have to remember it; you don't have to keep track anywhere. The engine knows that you've disabled it, so when you ask it if you should push changes, it'll say no. If it wants you to push changes, then you need to ask it if it wants you to push all changes. I just mentioned before that sometimes the engine needs you to push everything and do a slow sync; this is where you ask it.

Now, the rest of this code is pretty straightforward. We're just walking through all of the records, either every record if we're doing a slow sync and pushing them all, or just the changed records. And at the bottom you can see the call to the session to push the changes from that record. So we're just simply pushing it up, specifying a unique identifier, at the very bottom.

I wasn't able to fit this on this slide, so I'm going to show you another slide that's the continuation. If you're doing a fast sync, you need to explicitly delete your records. So what we do here is check to see if we've kept track of state and we have a list of any deleted records. If we do, and the engine isn't forcing us to do a slow sync, then we push the record deletes up. Now, why wouldn't we push the deletes up if we're doing a slow sync? The reason is the engine knows what your previous state was, so if you push up your entire set of records, anything you don't push up it treats as a delete.

Now we come to the fun part: mingling. Mingling is actually pretty simple. You're going to tell the engine that you're ready to pull changes, and then the
session is going to enter into the mingling state. If other clients are syncing at the same time, the engine is not going to return until they've finished pushing all their changes and it's been able to mingle from all sources, so this can take a while. The engine is going to be doing field-level differencing here, so it's going to be walking through all the changes that come in from every source that's syncing at this time, and it's going to compare them on a field-by-field basis.

You can call this blocking or non-blocking. If you call it blocking, again you have the same set of issues: you want to make sure that you're being responsive, so you don't want to go in and start mingling and keep the user waiting too long in a blocking call. If you call it non-blocking, though, you have to be careful. If the user makes modifications to data while you're syncing, the engine is going to be pushing changes down to you that are predicated on the state you had before you started mingling. If the user has made changes to any of the records that the engine also has changes for, you typically want to let the user win. So you need to have some mechanism in your application to keep track of changes that are being made during a sync operation, and make sure they override any changes that come down from the engine.

So here's our call. We're going to make a blocking call. First we're going to build a filtered list of all the entities. This is similar to what we did when we pushed entities: we ask the engine, should I pull these entity types? If they're disabled, or if the engine wants to be reset from your client, it won't tell you to pull anything, so you may end up with an empty list. Then you simply ask the engine to prepare to pull changes; that's the call that goes into the mingling state. Now, I committed an even worse sin here. Can anybody spot it? If you can read this, you'll see I specified the date distantFuture, so this application is going to wait forever. Now, we actually didn't do that in
the example; I think I took this slide from an older version of it. This is an example of something to be very careful about. Here your application is just going to go and wait for the engine forever, so if something takes an incredibly long time in another client, your client is going to be penalized.

Finally we get to pulling. When you pull records down, you need to ask the engine if you're supposed to replace every record. If the engine is trying to replace everything, because the user said "I want to pull everything down from .Mac and clear everything and replace it from there on my machine," it's going to tell you to replace all of your records. If that's true, don't go and whack your data store, because you don't know if the sync is going to be successful. Anything could go wrong: the user could pull the plug on the machine, or something could happen that could cause your application to crash. Instead, just internally set all your data aside and take everything you get from the engine, and only when you're able to save that completely should you then throw out the other data.

This is pretty straightforward. You get an enumerator when you're pulling changes. The enumerator is pretty much like an array enumerator, except it contains ISyncChange objects. I talked a little bit about these before: this is an object that just contains a list of the changes for a record. It's often more efficient, particularly when a record has a lot of attributes and there's only a few changes. However, sometimes with your client the logic might be simpler if you could just get the entire record from the engine and map it right onto what you have in your client. You can do that by pulling the complete record out of an ISyncChange. So you have your choice as to whether you want to use the fields that are indicated in the change or pull out the entire record. We walk through the change enumerator, we pull out each ISyncChange with nextObject, and then we apply each one based on the
change type. So let's look at that.

When we're applying the changes, we're going to use a two-phase commit. For each change that we get from the engine, we'll either accept, reject, or ignore it. If we accept it, the engine will assume that we've taken it and won't give it to us again, assuming we successfully complete the sync session. If we reject it, the engine will assume for some reason we don't want it. Perhaps you have a device or an application that doesn't fit every field on it, so some of the fields that are handed to you, you just can't use. So you reject that change so you don't keep getting hit with it. If you don't do anything, the next time you sync the engine is going to push that record at you again. Now, after you've gotten every record and successfully applied them, or rejected them if that's what you want to do, you tell the engine to commit. At that point the engine says, "Great, whatever this client just told me I'm going to believe is the truth, and I'm going to keep track of this." So the next time you sync, everything you've accepted doesn't get pushed to you again, and everything you rejected doesn't get pushed to you again. But if you don't make it to commit, if your application blows up before that or something goes wrong, the engine will then try to push all those changes back to you on the next sync. And that's good, because if something goes wrong you don't want to lose all that data.

Let's look at the code for it. When we sync, we look at the change type and we switch on it. If we have an add or a modify, we simply try to apply it. What we're using is a transformer here. If we're able to successfully transform it, then we tell the engine that we've accepted the record; otherwise we tell the engine that we're refusing the record. So in our example, if there was something malformed in a record, something wrong with it, we would reject it; otherwise, if the record looked good, we would accept it. You do the same thing for deletes: you take the delete down, check for consistency, and if you've got that ID
for that record it's referring to, you just delete it. Once you're done going through all of that, make sure that you save all the data to your data store before you commit to the engine. You don't want to tell the engine "Sure, I've taken all your changes" because you've got them in memory, and then crash and not actually write them out to a file. So we're just showing that here, and then at the very end we tell the session that we're finished, and we're all done.

I talked a little bit about state management before; I'll just reiterate. You need to keep track of the adds and the deletes and the modifies that you do. That way, each time you sync you can just push up deltas, which is much faster. You need to save this info if your application exits without synchronizing. If you don't save it, the next time you start up, if you try to fast sync, you'll have forgotten about the deltas from before. If you lose the delta information, make sure that you slow sync. Remember, the most important thing is correctness; after that comes speed.

So I've just got a few best practices to describe to you before we close. Sync quickly and often. We keep talking about trickle syncing; I keep banging on this and saying it over and over again. It's really the best thing to do for the user. The more frequently and the quicker you can sync, the more transparent it becomes, the less load on the system, and the better the experience for the user. Your data is getting moved around, and it's where people want it at the time that it's changed.

You want to be responsive. It's very important that you don't lose responsiveness in an application. Users hate it when they're banging in text, or trying to do something or get a menu to come down, and they don't know why the application is hung. You want to provide user feedback. By that I'm talking about something like a progress indicator or a progress bar, possibly a status line with some text telling the user what's going on. Depending on how much data your
application syncs what type of application it is if it's a pro app or if it's just a simple application that users don't really pay much attention to you might want to give them more or less information but make sure that you show them something you need to offer choices to the user if you go to quit an application and start syncing and you can't get a session within a reasonable amount of time you might want to pop a panel up to the user saying do you want me to synchronize your data before I exit that way the user has more of a choice similarly when you start up if you're not able to get a sync session right away you probably want to start the app immediately and wait a minute to get the sync session you might want to notify the user that you don't have all the data but you certainly don't want to make a user wait 10 15 30 seconds for another client to finish syncing before your application starts now I put this here almost as a joke when I first wrote the slides and it sounds kind of ridiculous don't corrupt the user's data I might as well tell you write an application that links and doesn't crash when it runs but this is actually a critical point keep in mind that the data that you're synchronizing is not only being synchronized with your application but it's being pushed out to other applications so if you corrupt some data it's going to fan out to other applications that might fan out to devices it will fan out across dot mac and it'll have catastrophic effects so it's very important to test that last one percent test all the boundary conditions make sure that you handle errors it's really critical when you write a sync client that you get all of that right it's very important to have correctness a couple people you can contact we have developer relations we also have a technology evangelist the names are here please also we're looking forward to getting suggestions from people file radars when you encounter problems and stay in touch with us and don't hesitate
to contact these two people if you have additional questions there's a couple more things you can look at for information but get to it okay we've got a lot of reference documentation we've got some concept documentation that's only available online the reference is actually also available on the Tiger DVD that we gave you also on the Tiger DVD the examples that we showed you all the source code is there in a project it's in developer example syncservices have a look at it you'll see everything I talked about in as much detail as you want
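
The accept / refuse / commit flow described in the talk can be illustrated with a small, self-contained Python sketch. This is not the Sync Services API and not the sample code shown on the slides: `ToyEngine`, `Change`, `transform`, and `apply_changes` are hypothetical names invented here, and the engine is a toy model that only remembers which changes it still owes the client. The point is the ordering: answer each change, save the data store, and only then commit, so that a failure before commit makes the engine offer every change again.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ChangeKind(Enum):
    ADD = "add"
    MODIFY = "modify"
    DELETE = "delete"


@dataclass
class Change:
    record_id: str
    kind: ChangeKind
    record: Optional[dict] = None  # deletes carry no record body


class ToyEngine:
    """Hypothetical stand-in for the sync engine (not the Sync Services API)."""

    def __init__(self, changes):
        self.pending = {c.record_id: c for c in changes}
        self._answered = set()

    def begin_session(self):
        # Answers from a session that never committed are forgotten,
        # so every uncommitted change is offered again.
        self._answered = set()
        return list(self.pending.values())

    def accept(self, record_id):
        self._answered.add(record_id)

    def refuse(self, record_id):
        self._answered.add(record_id)

    def commit(self):
        # Only now does the engine believe the client's answers.
        # Changes the client ignored were never answered and stay pending.
        for record_id in self._answered:
            self.pending.pop(record_id, None)
        self._answered = set()


def transform(record):
    """Hypothetical transformer: return a local record, or None if malformed."""
    if not record or not isinstance(record.get("name"), str):
        return None
    return {"name": record["name"].strip()}


def apply_changes(engine, store, save):
    """Apply every pending change, persist the store, then commit."""
    for change in engine.begin_session():
        if change.kind in (ChangeKind.ADD, ChangeKind.MODIFY):
            local = transform(change.record)
            if local is None:
                engine.refuse(change.record_id)  # malformed: don't offer it again
            else:
                store[change.record_id] = local
                engine.accept(change.record_id)
        elif change.kind is ChangeKind.DELETE:
            # Consistency check: delete only if we hold that ID.
            # Accepting a delete for an unknown ID is an assumption of this sketch.
            store.pop(change.record_id, None)
            engine.accept(change.record_id)
    save(store)      # write to the data store first ...
    engine.commit()  # ... and only then tell the engine to commit
```

If `save` raises, `commit` is never reached and the toy engine still lists every change as pending, which mirrors the behavior the talk describes for an application that crashes before committing.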