WWDC2004 Session 432

Transcript

I work with the Sync Services development team. Two years ago we introduced iSync. iSync was an end-user application to let you synchronize your contacts and your calendars with a phone, a PDA, and an iPod. With .Mac you could synchronize your contacts and your calendars to other computers, and we soon introduced Safari bookmarks. Today it's my pleasure to be able to tell you about Sync Services, with which you can add sync to your applications.

That kind of begs the question: why would I want to sync? Well, for those of us with more than one computer, it's a great way of never having to worry about "did I leave the changes on this computer, or that computer, or over here?" Portable devices are becoming more and more powerful, and so it's convenient to be able to have my information available on my phone or my Palm or wherever. But there's another reason too, and it's not one that's immediately obvious when you think about synchronization. People have got different applications that kind of do the same thing, but some apps are better suited to one particular kind of task than another. Sync can be used to share data between different applications.

What we're going to talk about today: I'm going to tell you what Sync Services is, get into the architecture and a bit of the data model, and just give you a road map to the APIs. I'm not going to get too deeply into what the APIs are. We've got a lot of great documentation in the SDK that's on the Tiger DVD that you've got with you in your bags, and it's also available on the web. There's a second session that directly follows this one that's going to take a much more hands-on approach to how you actually incorporate Sync Services into your application.

So let's jump in and answer the question: what is Sync Services? It's a system service built directly into Tiger for data synchronization. The basic gist is you provide us with access to your data and we do the rest. We take care of all of
the syncing and all of the sync smarts for that. It's a single solution for everyone. Today, if you want to get your contacts from your phone onto a Palm or some other PDA, you often need two, three, sometimes more different solutions. We want to provide one solution for everyone.

The problem with the existing solutions today is that they operate outside of your application, and this leads to a number of different problems. There's the multiple-writers issue, meaning if you've got your application and there's a separate sync process both trying to access your data store, you have to solve the problem of multi-process concurrent access to your data, and that's a hard problem to solve. If you're going for a file sync solution, say you want to keep your Word documents in sync with your laptop, there's a granularity issue. What happens if you change the document in both places? What happens if you change just part of the document on one and part of the document on the other? It'd be great if you could merge those two. Then there are proprietary formats. If you're a device developer and you want to sync your device to some third-party application, you have to go in, reverse engineer their format, and figure out how they store their things, and that doesn't really create a lasting solution.

What we're doing with Sync Services is giving you the ability to build sync directly into your application. You focus on syncing your app and we'll take care of the rest: getting it to interoperate with other applications and other devices, and syncing between computers.

Sync is not just about applications; it's for devices too. We've worked a lot with devices with various kinds of memory limitations: devices that can only store a certain number of records, devices that can only store fields of a certain length. We've got some good solutions to help you manage that filtering and that formatting. You provide the application to manage and configure your device, you incorporate
sync into that application, and you sync the same way that you would any other kind of application. You synchronize your devices the same way you synchronize your applications. We're going for one solution for everyone here.

I want to talk a little bit about the design goals, because if you understand what we were trying to do when we set out, I think it will really help you get into the mode of what we're trying to do. We started with a really simple premise: your data, where you want it, when you want it. From that we derived three basic design goals: sync should be decoupled, sync should be extensible, and sync should be invisible. Let's jump into what these mean a little bit more.

Sync should be decoupled. There are two things that we mean by this. First, applications and devices should be able to sync independently of each other. Sometimes when I want to sync with .Mac my phone's not turned on; later, when my phone is available, I don't have a network connection. Yet I still want my changes to flow back and forth between the two of them, so we want to be able to synchronize these things independently of each other. Now, that being said, we also want to try to synchronize these things at the same time. If I'm syncing two devices independently, I have to do three things: one to get the changes from, say, my phone into my computer, then from my computer into Address Book and from Address Book back into my computer, and then I have to sync my phone again to get the changes back onto the phone. So we want a way to be able to try and sync both of these things at the same time. It uses fewer system resources, it makes things go faster, it's easier all around.

The second kind of decoupling that we're talking about is that schemas are independent of applications and devices. The bookmark data model is not owned by Safari; it's not owned by Mozilla. Address Book does not own the contact schema. There's a clear
separation between the two.

The data model needs to be extensible. We provide some standard schemas for contacts, for bookmarks, for calendar entries. If you want to be able to exchange data between different applications and devices, you need to conform to that schema. At the same time, we want you to be able to achieve perfect synchronization. We're not going to be able to think ahead of time of everything that you're going to want to synchronize in your application, and so you need a way to extend the standard data types to add your own fields. You need to be able to add new data types if you want to synchronize photos, for example.

And the most important point is that sync should be invisible. The user should not have to explicitly think about synchronization. The idea is that changes should just flow back and forth naturally. To that end, we put your application in control: you sync when you want to sync. Now, it's important that you try to maintain application responsiveness during this. Sometimes syncs can take a while: a device might be slow in responding, there might be a lot of changes going on. You don't want to give users the spinning beach ball of death. The model that we've come up with is what we call trickle syncing. There's nothing that we do to provide trickle syncing; it's the way that you write your applications to synchronize. The idea is that we do lots of frequent syncs. The more we sync, the fewer changes we have to do on each sync; the fewer changes we have to do on each sync, the faster the syncs go; the faster the syncs go, the more frequently you can sync; and so on and so forth. It creates a kind of virtuous cycle.

So let's come back to the question of what Sync Services is and answer it again from a bit more of a geek perspective. It's a public framework. This framework provides administration: you register to synchronize, you register your schemas. It provides API with which you can give changes
to the sync engine; the sync engine processes those changes and gives changes back to you. It's a daemon: the sync engine runs inside the sync server. It's a field-level differencing engine, and I'll come back to that point a little bit later to explain exactly what we mean by that. But we're also coordinating syncs between multiple applications, devices, and servers, and the sync server takes care of coordinating all of those. And finally, Sync Services is a UI. We provide a standard UI for conflict resolution. We provide an airbag panel so that the user can protect their data from rogue devices. We provide an API for you to reset a specific device or an application. What I'm going to be focusing on in this talk is mostly the sync server and the sync framework.

So let's get into some of the details of sync. There are three things I'm going to cover here: I'm going to talk about schemas, I'm going to talk about clients, and then we're actually going to get into the meat of how synchronization works.

So what can you synchronize? I don't know how many of you went to the Core Data talk a couple of days ago, but for those of you who weren't there, we're going to do a quick recap of what we can synchronize. We use the entity-relationship model. This is the same model used by the Core Data framework, and it's sort of based on the premise that if you can save it, you can sync it. On top of that we've added some extras just for sync.

Now, the entity-relationship model is an industry-standard way of decomposing data. An entity describes a single discrete thing: a contact, a phone number, a bookmark, an MP3, etc. An entity has a name. In our case the name must be unique across the whole space; there's a global namespace for entities. So we recommend using a DNS-style syntax, com.apple.contacts.Phone Number for example, to minimize the risk of collisions.

Entities are composed of properties. An attribute is a property which describes a single characteristic of the entity: a first name, a last name,
something along those lines. Attributes are strongly typed. You see on the monitor here the list of types that we support. It's a fairly rich set, and we can look at adding new types in the future. There are the basic types: strings, numbers, data, dates, etc. We've got some aggregate types: you can create arrays of these primitive types, you can create dictionaries of these primitive types. And we have a new enum type. An enum is basically a string with a fixed set of values that are allowed, and the engine will enforce a certain level of data typing on this.

Relationships are very interesting. Often data by itself is not all that useful; what's interesting are the relationships between the data. A relationship describes the particular role between two entities. For example, a contact has one or more phone numbers; a bookmark has a parent. A relationship is directional. There are two kinds of ordinalities for a relationship: you can have a to-one relationship or you can have a to-many relationship. A to-one relationship is a one-to-one association between two entities: a bookmark has a single parent. With a to-many relationship you've got multiple targets: a contact can have multiple phone numbers, for example. There's also the concept of an inverse relationship. If you have a relationship from a bookmark to its parent folder, there's also going to be an interesting relationship from a folder down to the bookmarks that it contains, and we'll come back a little bit later to why that's important to maintain.

So what happens when an entity is deleted? This is again where relationships come into play. When you delete a particular entity, we find all of the relationships that point to that entity and we nullify them. You may also optionally say, "when this entity is deleted, I want you to traverse this relationship and delete all of the guys that it points to." That's what we call a cascading delete rule.

Some of the extra stuff that we've added for sync is the notion of an
identity property. When you add a new record into the Sync Services engine, we want to try to match that record up against existing records already provided by other clients; otherwise we'll end up with duplicates. In the case of a contact, we want to find a contact that looks the same as this guy, and we use the identity properties for that kind of matching. For a contact, your identity properties will most likely be the first name and the last name. Any record in the database that has the same first name and last name is probably going to be the same record, and so the sync engine will merge those two records together.

An identity property can be a relationship or an attribute. For example, a phone number is bound to a specific contact. You don't want to match the phone number from one person to the phone number of another person, even if they're the same phone number. They might be two roommates, and when one roommate moves away and you change the phone number, you don't want the other person's phone number to change at the same time. If you put the to-contact relationship into the identity set, what it's saying is that we will limit the set of records we look at to the records matched on that relationship.

There's also the notion of dependent properties. For example, let's say that I have a calendar event. It's got a start date, and it also has a bit specifying whether or not it's an all-day event. These are two separate fields, and so I don't want to try to merge them together in the sync engine, but there is a semantic relationship between the two. If I change the start date on one client and I change the fact that it's an all-day event on a different client, I want the sync engine to generate a conflict for that. By marking those two as dependent properties, the sync engine can catch and generate conflicts for those even though they're two different fields.

I mentioned
before the ability to extend existing entities, to add your own fields, to add your own attributes. These are what we call entity extensions. You can even create a new relationship on an existing entity to another entity or to a brand new entity. But be very careful that you don't change the fundamental properties of the entity that you're extending. If you add a new cascading delete rule or something like that, you may end up causing bugs in some other clients that were depending on the original behavior.

Finally, we've introduced the notion of a data class. A data class is just an informal association of entities, an informal grouping of entities. There tend to be a lot of entities that fall out of your schema: you've got contacts, phone numbers, URLs, AIM addresses, all of these kinds of things. And yet in the mind of the user, what they're thinking about for all of those things is the notion of a contact. So the data class gives you a way of presenting a user-friendly name for your collection of entities. The sync engine itself doesn't actually use data classes for anything intrinsic to the sync operations.

Your schema is described in a file. This file contains the description of all of your entities, your extensions, the attributes on the entities, and the relationships. It's a standard plist file; the format is well documented in our documentation. It's contained in a sync schema bundle. That bundle can be located anywhere: you can include it in a framework, you can include it in your app wrapper, you can put it in the standard system location. The key point to remember here is that your schema is decoupled from your application. Even if you're writing both your application and your schema, and yours is the only application that's going to be syncing that schema, in the eyes of the sync engine the two are not related to each other. Your app does not own the schema.

We provide three standard schemas, for contacts, calendars, and bookmarks, and those are located in /System/Library/SyncServices/Schemas. And I
encourage you to go in, open up the schema bundle, find the plist, and have a look through it to see the standard format. Unfortunately we don't have any documentation on that yet, but we hope to be addressing that at some point in the near future.

So let's come back again to the "what can you synchronize" question. We talked about entities, relationships, and attributes. This defines the data model that you can synchronize. What you actually synchronize are records. A record is the basic unit of exchange. In terms of the API it's expressed as an NSDictionary. The keys are the attribute and relationship names; the values are of the types that correspond to the associated attribute or relationship. A record has an identifier, and that identifier must be unique across your entire entity space. If you have a contact entity and you have a phone number entity, the record identifiers must be unique across both, and this is where we differ a little bit from your standard database terminology. One key point is that your record dictionary must contain an entity name. You have to tell us what kind of entity your record is. The sync engine depends on being able to know what kind of record it is so it can do the right kind of matching and do the right things with the field values.

So we've talked a lot about applications and devices. We sort of mentioned that the sync engine is a little bit agnostic: it doesn't much care whether your client is an application or a device. And so we came up with a generic name for them: we call them sync clients. A sync client has a unique identifier; that is how your client is identified to Sync Services. You can also give it a user-friendly display name and an image for display in some sync UI. There's generally a one-to-one correspondence between a client and a data source. In the case of an application the association is pretty clear. But in the case of a device, you've got to think: I'm writing a client for a specific kind of device, but the user may end up with multiple copies, with
multiple kinds of devices. I've got many different kinds of phones, I've got a couple of PDAs, two iPods. Each of those corresponds to a unique client that you register with the sync engine.

A client description file provides a template description for your client. It generally contains just the static details: the type of your client (is it an application or a device?), the list of entities that you synchronize, and the specific properties on those entities that you synchronize. That's important to note. Just because an entity defines a set of fields doesn't mean that your client is going to want to synchronize all of those fields, and so you can specify "I'm only interested in synchronizing, say, the first name and the last name on the contact entity." Some clients only push changes; some clients only pull changes. The iPod is basically a pull-only device: you only pull information onto the iPod, and the iPod is never going to change that information. By specifying that in the client description file you can help the sync engine optimize some of its processes.

Now, a lot of the time there's going to be a one-to-one association between the client and the client description, and so you can also include a display name and an image directly in the client description file. If you're writing an application, that's probably what you're going to be using most of the time. You can also specify that information dynamically using the Sync Services API. So when a phone is registered, when the user decides "I want to synchronize this phone," your client can figure out what kind of phone it is and register the appropriate name and image for that phone.

Let's get now into the actual meat of how sync works. There are basically five phases of sync. I'm going to cover them briefly here so we have a framework for the conversation, and then we'll start diving into the nitty-gritty. The first thing you do is create a sync session. You then negotiate how you're going to sync. You push your changes into Sync Services, we
process all of those changes, and then you pull the changes that are due to you back out of Sync Services.

Now, there are a couple of things that you have to know first. The first one I'm going to cover here is the truth database. The truth database is an aggregate of all of the information from all of the clients. If you have a client that is synchronizing contacts and it pushes in just the first name and the last name, and you've got another client that is synchronizing the first name, the last name, and the company name, what we store in the truth is the aggregate of all of those fields: the first name, the last name, and the company name. The truth is what the clients sync to; they don't sync with each other. If you remember, I mentioned earlier that clients are decoupled from each other, and this is how we accomplish it. A client can sync into the truth, another client can come along and sync into the truth, and then they can pull their changes directly out of the truth. Now, we are storing a copy of the data here, and that's worth keeping in mind. If you're going to be synchronizing photos, large data files, or things like that, you probably don't want to push that information into the truth, because then you're going to end up with multiple gigabytes of data lying around on the user's disk. We're going to come up with a solution for that at some point in the future.

Then there's the client state database. What this contains is a snapshot of all of the records on a device. We need this information for a couple of reasons. When we know a record is on a device, either because the device gave the record to us or because we pushed the record to the client, we store a copy of that record in the client state. The reason we do this is so that on the next sync, if the client gives that record back to us, we can pull what we knew was on the client before out of the client state and compare the two. From that we can figure out whether this record changed at all, and we can figure out specifically what fields on that
record have changed. What we push into the sync server are just the field-level differences. If you change just the first name, we're not going to push the whole record across; we're going to push just the first name across into the mingler.

The other place where this is used is when we're formatting records. I mentioned earlier that some devices have limitations on the length of the fields that they can store. A phone may truncate names at 20 characters, for example. So if we take a really long name and we push it onto the phone, the phone is going to truncate it. If the phone then gives that record back to us, what we would do is look at the shortened name, compare it to the longer name in the client state, and say "hey, this has changed," and we'd end up propagating the truncated name everywhere, and that would make people generally pretty unhappy. So what we store in the client state is the formatted record. We're going to store the truncated name in the client state, so that when the client gives that record back to us we'll compare the two fields and say "those are the same, it hasn't changed," unless of course the user has actually changed the name on the device.

Now you're probably thinking, "oh great, they're storing yet another copy of all of my photos and my contacts and things." Well, it's not that bad. What we store in the client state is really just a hash of the information that we pushed to the device, just enough information so that we can do that comparison successfully.

One important thing to understand is that record identifiers are scoped to a particular namespace. Each client has its own namespace, the truth database has a namespace, and there is no correlation whatsoever between any of these namespaces. So if a client has a record called "foo," another client may have a record called "foo," and those can be two completely different records. There is no association between the two of them.

So, putting everything together, the
way things work is this. A client is going to take a record and give it to Sync Services. Sync Services is going to pull the record out of the client state and compare them. If the record isn't in the client state, then we know that this must be a new record, and we push what we call an add into the sync server. If the record exists in the client state, we compare the two and push just the field-level differences into the sync server. The sync server processes those changes and merges them into the truth, and clients then pull all of the changes out of the truth.

So, coming back to the start of the process: creating a sync session. The first thing you have to understand is that you may not be allowed to synchronize, and there are many reasons for this. It could be that some other client is already synchronizing. Because the sync server is writing into a common database, we can't allow multiple people to all synchronize at the same time. We need to maintain the integrity of the truth database, and so we can only process one set of clients at a time. So if a client is already in the middle of synchronizing, other clients must wait until that guy has finished. It's important to be able to maintain application responsiveness throughout this, so while we provide blocking APIs for convenience, we also provide non-blocking APIs, so that you can basically say to Sync Services, "I'd like to start a sync session now, please." Generally you'll probably be able to go straight away, but if you can't, we'll call you back when you can go.

Now, that being said, I also mentioned earlier that we want to synchronize clients simultaneously. This is not a contradiction. What Sync Services provides is the notion of a sync alert. When you register your client, you can specify the kinds of clients that you want to synchronize with. Address Book pretty much wants to synchronize with anything, so it says "I'll sync with apps, I'll sync with devices, I'll sync with servers." A server
would probably only synchronize when other servers are synchronizing. A phone would synchronize when a server is syncing or when another phone is syncing. So you specify at registration time who you want to sync with. When one of those guys then starts syncing, Sync Services delivers an advisory notice to the clients that have registered an interest in that guy. This is an advisory notice only. You're free to ignore it if you're not ready to sync, though we definitely encourage you to sync if you can.

There are two ways that the notice can be delivered. We can launch a tool that you've registered. We specify on the command line to that tool the ID of the client that's being synchronized and the list of entities that are being synchronized with that client. Alternatively, you can register a callback directly in your application, an object and a selector, and we will invoke that selector, saying "hey, now's a good time to sync if you like." If you don't want to sync, simply return without doing anything and we'll pass you by this time. You can always sync later.

Now, why would you want to choose one method over another? Something like a server or a device is probably going to register a tool to actually do the synchronization. They don't necessarily have any multiple-writers issues to worry about or anything like that. When they want to synchronize, we just launch the tool, and all of the logic is embedded in that tool. An application like iCal, on the other hand, will register a dynamic callback when it launches. While iCal is running, we can call that callback to tell iCal to synchronize. When iCal quits, the callback is deregistered implicitly, and iCal won't sync anymore until the next time it launches.

After you've successfully created your sync session, you go through and negotiate the sync mode. Now, there are four basic sync modes that we need to talk about here. The first of these is fast syncing. Fast syncing is the preferred mode
of synchronization. When you're fast syncing, you're basically just telling the engine what changes have happened since the last time you synchronized. You tell the engine "these are the records that were added since I last synced, these are the records that were modified since I last synced, these records have been deleted since I last synchronized." That kind of implies that you can maintain all of that state information, and not all applications and not all devices are set up to do that.

Sometimes, even when you can maintain that information, you may not trust it. If you synchronized the device with another machine, that information may be out of date. In that case you will want to slow sync. When you slow sync, you basically give all of your records to Sync Services and we figure out what's changed. You remember that in the client state we store a complete copy of all of the records that we knew to be on your device or in your application the last time we synchronized. When you give us all of the records, we basically go through and check off the records in the client store one by one. Anything left in the client store afterwards is a record that used to be in your client and isn't anymore, and we will generate a delete for those records. That's a very important point to keep in mind: when you slow sync, you tell us about everything, we figure out what the changes are, and we delete the records that you didn't tell us about.

Sometimes bad things happen. A device can be reset; the user can accidentally delete your data store. If you were to slow sync at that point, what would happen is this: we knew you had all of these records before, now you tell us you've got nothing, so you must have deleted everything. So we delete everything in the truth, .Mac synchronizes, we delete everything on .Mac, and by the time you get home all your data is gone. I'm sure this has happened to some of you in the past. In this case what you want to do is a refresh sync. When you do a refresh sync, we throw away
everything in the client store. We forget everything we ever knew about you and we go through a process of rediscovery. You give us all of your records and we pass those to the sync server. It's going to take each of those records and compare it to existing records in the truth to try to find a match. No deletes will be generated, but anything in the truth that you didn't give us is going to come back to you at that point. So if your data store has been reset, or if your device has been erased, you need to be able to tell us that so that we can do a refresh sync.

We also have this notion of pushing and pulling the truth. There are times when a user just wants to erase everything on a device or an application or a computer and say "replace it with the contents of this computer. I've got all of these contacts in Address Book, I know they're in a good state, I want all of those on my phone." That's the mode that we call pulling the truth. When you pull the truth, we expect you to delete all of the records in your client's data store and replace them with the records that Sync Services gives you. The converse of this is pushing the truth. When you push the truth, what you're saying is "I've got a known good state in this specific client here that I want to replicate everywhere: through .Mac to all my other computers, to all of my other devices, and into all of my other applications." When one client is pushing the truth, Sync Services will tell all other clients to pull the truth. This is a very destructive operation, so you only want the user to initiate it. Clients themselves should never try to push the truth.

Now, that being said, none of these sync modes is actually reflected directly in the API. Instead, what we did was look at these and say there's a lot of commonality between all of these different sync modes. When you're slow syncing or refresh syncing or pushing the truth, we want you to push all of your
records out into Sync Services. In some cases, when you're pulling the truth, we don't want you to push any records. Sometimes we don't want you to pull any records at all. So what we've done in the API is focus in on those specific actions, and we've oriented our API around those actions. Don't be surprised if you go looking in the API and you don't see fast sync or slow sync or refresh sync mentioned anywhere. It's the concepts that are important.

So let's come back to pushing changes. Now, you've got a choice when you push your changes. The first thing you're going to do is ask Sync Services, "Should I push all of my records, or can we fast sync here?" If you can fast sync, then we only want you to tell us about the records that have been added, modified, or removed since the last time you synchronized. You've got a choice here too. You can do the hard work: if you know what specific fields have changed, you can tell us, "We just changed the first name on this guy, we deleted this record, and we changed the company name on this guy." Alternatively, if you don't want to go to all of that extra effort, you can just give us the whole record and we'll figure out what's happened by pulling the information out of the client state. We package those things up and push the field-level differences over to the sync server.

So what happens if something goes wrong? You're in the middle of pushing all of your changes and the device runs out of battery, or your application crashes, or, God forbid, Sync Services crashes and takes you down with it. What happens at this point? When you first start pushing changes, we create an implicit transaction scope. All of the changes that you push fall into that transaction scope, which is closed when you tell us, "I'm done, I've got no more changes for you," and we ship the whole thing off to the sync server. If something goes wrong in the middle of that transaction scope, we're going to unwind the whole thing.
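The push behavior described here (whole records diffed against the last-synced client state, all inside one implicit transaction) can be pictured with a small sketch. This is a conceptual model only, not the Sync Services API: the `PushSession` class and its method names are hypothetical, invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PushSession:
    """Toy model of the push phase (hypothetical; not the real API)."""
    last_synced: dict                              # record id -> fields at last sync
    pending: list = field(default_factory=list)    # the implicit transaction scope
    shipped: list = field(default_factory=list)    # changes handed to the sync server

    def push_record(self, record_id, fields):
        """Accept a whole record and derive the field-level differences."""
        old = self.last_synced.get(record_id, {})
        # A field removed from the record shows up in the delta with value None.
        delta = {
            key: fields.get(key)
            for key in old.keys() | fields.keys()
            if old.get(key) != fields.get(key)
        }
        if delta:
            self.pending.append((record_id, delta))

    def push_delete(self, record_id):
        """Record a deletion; None stands in for 'record removed'."""
        self.pending.append((record_id, None))

    def finish(self):
        """Client says it has no more changes: ship the whole scope at once."""
        self.shipped.extend(self.pending)
        self.pending.clear()

    def fail(self):
        """Something went wrong mid-push: unwind everything pushed so far."""
        self.pending.clear()
```

In this model nothing reaches `shipped` unless `finish()` is called, which mirrors the all-or-nothing behavior the speaker describes; after `fail()`, the client simply pushes again on the next sync.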
We're going to roll them back; we're going to forget all of the changes you made. The next time you synchronize, if you're smart enough to be able to figure out, "I was halfway through pushing at that point, so I need to re-push all of those changes again," then by all means go ahead and fast sync. It might be safer to slow sync at that point, however: you can tell the engine, "I'm just going to give you everything; you figure out what's changed."

Now, some of you might be wondering, "Why do they roll back all of the changes that we've already given them? Why don't they just take what we've given them, process them at that point, and we'll pick up where we left off?" The problem is that when you introduce relationships into the question, there's a whole set of data integrity issues that come into play. You might have pushed a couple of records that refer to some records that haven't been pushed yet, because the device ran out of batteries and you couldn't get those records. So we erred on the side of safety and said we're going to replay the whole thing to get back to a known good state. What we want to do is protect the data in the truth.

Mingling is the heart of sync. This is where we take all of the changes from all of the clients and merge them into the truth. We process the changes on a client-by-client basis: we take all of the changes from Address Book and merge them into the truth, take all of the changes from .Mac and merge them in, then all the changes from the phone and merge them in. It's here that we do our conflict detection. If a phone has changed the first name, and the first name has also changed on .Mac since the last time it synchronized, we need to generate a conflict at that point.

Again, let's ask the question: what happens if something goes bad here? The answer is you don't have to worry about it. That's our problem. Once those changes have been handed off to us, we're responsible for them. We will make sure they get into the truth, or we will take steps to recover by
asking for all records from all of the clients again.

I want to talk a little bit more about conflict handling. Once a conflict has been detected, what do we do with it? Typically we're going to go off and ask the user, "We've got a conflict between this guy and this guy. What do you want to do about it?" That's not always appropriate. There are some fields and entities that the user isn't going to know what to do with. "iCal's got a sequence number? I don't know what that does." We can't expect users to figure that out. So we're going to add the ability for a schema bundle to specify some code that gets loaded into the sync server to handle those conflicts. He gets first crack at it, though. When we detect the conflict, we ask this code, "Can you deal with this?" If he says yes, we merge the response into the truth. If he says no, we store the conflict off on the side.

We don't want to pop a panel up in the user's face right in the middle of a sync. Remember, applications and things can be synchronizing at any time. Having panels popping up saying, "What do you want to do about this? What do you want to do about this?" is going to get really stale really quickly. Instead, we save the conflicted records off to the side and notify the user through a little UI element that he's got some conflicts that need his attention. When the user is ready, they can pull those conflicts up and resolve them, and they'll be merged in the next time they synchronize.

Pulling changes is pretty much the easy part. Clients pull changes directly from the truth database; they don't pull them from the sync server. Once the sync server has finished mingling, he's done, he's off, and someone else can come in and synchronize at that point. What we do is maintain a snapshot of the truth database for some self-consistency there. It's held as long as clients are pulling the truth out of it. When you're getting changes out of the truth, you have a choice. We give you the deltas, and we also give
you the full record, so you can go in and look, "Did I change just the first name or the last name?" or you can take the whole record and push it onto the device or into your application.

You can filter out records that you don't want. There are two ways this can be done. We can give you all of the records and you can tell us, "I want this one, I want this one, I don't want this one." Sometimes it's much easier if you just write a little filter independently that's loaded into Sync Services and does that filtering for you, so that you only get the records that you're concerned with. For example, in the phone device configuration UI, I might want to specify that I'm only going to synchronize contacts in this one specific group. We've got some standard filters for that kind of thing, so your UI can say, "Let's use this filter for this client." That gets loaded into Sync Services; he gets rid of the records that we don't want and only gives the client the records that pass through that filter.

When the engine gives you a new record, we're going to make up an ID for that record. The reason is that there may be relationships referring to that guy, and we need to know what to call him. We use a UUID for that, but that may not always be convenient for you. If you're going to push a record onto a phone or store it in your own database, you probably want to use your own identifier for it. So the record identifier can be changed. Any earlier references in a relationship that we've already given to you, we can't change, of course, but once you change a record identifier, any references that we give you after that will use the new record identifier.

Sync Services uses a two-phase commit process at this point, again to answer the question, "What happens if something goes wrong?" By two-phase commit in this case I mean that when the engine gives you a record, you tell the engine, "Yes, I want this record," or "No, I don't want this record." It's up to you to decide, but you've got to tell us one way or another. If you
don't tell us that you accepted or rejected a record, we're going to give it to you on the next sync, and the next sync, until you tell us what to do with that record.

Now, if you're talking to a high-latency device, over a USB connection to a phone or over a dial-up connection to a server, it's not going to be terribly efficient if you have to push the record and then tell us you accepted it, then push the next record and tell us, "Yes, it made it there OK." So we allow you to do this batching process. You just tell us you accepted a record, "Got this one, got this one, thanks," and then you tell us, "I committed the records that I told you I accepted or rejected." This allows you to batch in memory the changes that we give you. You can pull a hundred changes out of Sync Services, push them en masse up to your server or over to your phone, and when you know they're there safely, you tell us it's done. We're golden.

What happens if something goes bad is that we unwind that implicit transaction scope. When you first start pulling changes, we create a new transaction. As you accept or reject changes, we write them into the transaction, and when you commit those acknowledgments, we commit and close that transaction and implicitly create a new one. So if something bad happens, we roll back to the last time you told us you committed those changes, and we give them to you again on the next sync.

So, the five phases of sync: you can think of them as a finite state machine. The phases must be traversed in order, but they can be cancelled or finished at any time.

The typical application sync model that we recommend is this. When you first launch, do a sync to pick up any changes that have been made since the last time your application was run. Do it in the background; again, remember to maintain application responsiveness. We give you the methods to query whether you need to slow sync, or we can tell you whether we think a sync
is going to take a long time. If so, pop up a panel to the user and say, "Hey, this may take a long time. Do you want to do this now?" Or, if you're going to be even more sophisticated, just do the sync in the background and let the user carry on with the app normally. As they make changes throughout the course of using the application, trickle sync periodically to push those changes out. When you save to disk, do a sync to get the changes out. When you quit, what we want is to get those changes out to the sync engine. Again, you don't necessarily want to wait for the whole sync to complete at that point. The user is quitting, and you want to get out of there as quickly as possible. What you can do is create a session, push your changes, and then finish it. You're done. You don't have to wait for the mingling, and you're definitely not going to pull any changes at that point, because they're quitting the application and they don't need them.

The device has got a much simpler model. Typically a user initiates a sync; there's an explicit action on the part of the user. Their device is plugged in, they hit the HotSync button, or a sync alert comes in because some other device is synchronizing. Just go through the whole sync at that point.

So let's talk a little bit about the API. I'm not going to get into the details; as I said, we've got some great reference documentation on the Tiger DVD. What I want to do is just give you a road map to some of the more important classes. The API is Cocoa-based, but it's procedurally oriented, for a number of reasons. What this means is it's easily wrapped in Java. I've got a lot of experience doing that, so we kept that in mind while designing this API. You can also use it easily from C. We support almost all of the Core Foundation toll-free bridged types; most of the more important data types are toll-free bridged types, so again, using this from Carbon is no problem.

There are five classes that we're interested in: ISyncManager, ISyncClient, ISyncSession, ISyncChange, and the record snapshot.

ISyncManager is the singleton object. It's your basic administrative point of contact. This is where you go to register your schemas, where you go to register your clients, and where you go to look up the clients that have been registered. There's not a lot to him.

ISyncClient represents a registered device or application. This is where you can get information: the identifier, the display name, the image of the guy, what entities he supports, and how he's going to sync. If you want him to reset the next time he syncs, you can specify how he's going to synchronize. You also use ISyncClient to set up sync alerts, to specify, "I want this tool to be launched when these kinds of clients start synchronizing."

ISyncSession encapsulates the whole sync process that we just talked about. He's got all the methods to walk you through the state machine. He's got the methods that you can use to query, "How should I sync?" He's got the methods to allow you to pull the changes out, to accept them, and to commit them. The key point to know here is that there is only one sync session per client per machine allowed. We've got a sync server that's going to gate that, and we will not let the same client sync multiple times at once, so you don't have to worry about preserving that kind of semantics.

ISyncChange encapsulates all of the changes to a single record. The change specifies whether it's a new record, an existing record being modified, or a record being deleted, and it contains all of the field-level changes to that specific record. The changes that you get from the sync server will contain those field deltas and will also contain a full copy of the record. If you're smart enough to be able to tell the sync server what the field-level changes are to your records, you can create one of these. You only need to specify the field-level deltas; you don't need to give us the whole record.

The record snapshot gives you an immutable copy of the truth database.
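One way to picture an immutable copy of the truth is a deep copy wrapped in read-only views. The sketch below is a conceptual model in Python, not how Sync Services implements its snapshot; `take_snapshot` and the dictionary layout of the truth are assumptions made purely for illustration.

```python
import copy
from types import MappingProxyType


def take_snapshot(truth):
    """Return a read-only copy of `truth`, frozen at the time of the call.

    `truth` is assumed to map record ids to dicts of field values.
    """
    frozen = copy.deepcopy(truth)  # later edits to `truth` cannot leak in
    return MappingProxyType(
        {record_id: MappingProxyType(fields) for record_id, fields in frozen.items()}
    )
```

Reading from the result works like reading from a dictionary, while any attempt to assign into it raises `TypeError`, and later changes to the original data are not visible through it.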
Using this, you can go in and introspect everything that's in the truth, even stuff that your specific client may not be synchronizing. The snapshot is frozen at the time of creation. What that means is that if the truth database changes after the snapshot is created, the change will not be reflected in that snapshot. You've got a self-consistent view of the truth at that point. You can choose what record identifier namespace you want these things represented in. We could use the truth record identifiers, or you can tell us, "I want to use the identifiers for this client." Again, you can even get stuff out that that client may not be synchronizing.

To give you an example of where you might want to use the snapshot: a phone that's synchronizing calendar events doesn't actually synchronize the calendar lists themselves, and yet when you create an event on the phone, it has to be filed in some calendar. So it's a bit of a paradox here. What you can do is use the snapshot to get the list of calendars out of the truth. You can let the user choose a specific calendar in the configuration UI and remember the ID of that guy. When your client is pushing those new calendar events into Sync Services, you just set up a relationship saying, "This guy belongs in this calendar," even though you're not syncing him.

One thing to note is that the truth database is organized to be efficient for sync, not necessarily efficient for you. So don't use this too often as a general-purpose database API. You'll find the results a bit disappointing in that respect.

So let's have a quick recap. What do you do? You register your schemas, you register your clients, you push data into Sync Services, you pull your changes out of Sync Services, and you provide the UI to configure your client. We take care of all of the rest. We synchronize your data. We detect conflicts, and we provide a standard UI to resolve those conflicts. We give you an airbag
to preserve the data integrity, and we provide a .Mac client to synchronize data between multiple machines. The design goals that underlie everything that we've done with sync: decoupled clients, schemas separate from the applications, extensible schemas, and sync must be invisible.

Now, if you have questions, you can contact Patrick Collins or Xavier, and we definitely encourage you to file bugs. Radar is our friend in that respect. For more information, we've got a lot of great documentation. The reference documentation is on your Tiger DVD; it's also available on connect.apple.com. The concept documentation, which I highly recommend you read (some great docs there), is only available on the web at this point. We didn't manage to get it onto the DVD. We've got some sample code for some sample applications. It's in the usual place, in Developer/Examples/Sync Services.