WWDC2004 Session 432

Transcript

Kind: captions
Language: en
I work with the sync services
development team two years ago we
introduced iSync iSync was an end
user application to let you synchronize
your contacts your calendars with the
phone a PDA an iPod with .Mac you
could synchronize your contacts and your
calendars to other computers and we soon
introduced Safari bookmarks today it's
my pleasure to be able to tell you about
sync services with which you can add
sync to your applications that kind of
begs the question why would I want to
sync well for those of us with more than
one computer it's a great way of never
having to worry about did i leave the
changes on this computer or that
computer or over here portable devices
are becoming more and more powerful and
so it's convenient to be able to have my
information available on my phone or my
Palm or wherever but there's another
reason too and it's not one that's
immediately obvious when you think about
synchronization people have got
different applications that kind of do
the same thing but some apps are better
suited to one particular kind of task
than another sync can be used to share
data between different applications what
we're going to talk about today is I'm
going to tell you what sync services is
going to get into the architecture a
bit of the data model and just give you
a road map to the APIs I'm not going to
get too deeply into what the APIs are
we've got a lot of great documentation
in the SDK that's on the Tiger DVD that
you've got with you in your bags and
also available on the web there's a
second session that directly follows
this one that's going to be a much more
hands-on oriented approach into getting
actually into how you
incorporate sync services into your
application so let's jump in and answer
the question what is sync services it's
a system service built directly into
Tiger for data synchronization the basic
gist is you provide us with access to
your data and we do the rest we take
care of all of the syncing and all of
the sync smarts for that it's a single
solution for everyone today if you want
to get your contacts from your phone
onto a Palm or some other PDA you often
need two three sometimes more different
solutions we want to provide one
solution for everyone the problem with
the existing solutions today is that
they operate outside of your application
and this leads to a number of different
problems there's the multiple writers
issue meaning if you've got your
application and there's a separate sync
process both trying to access your data
store you have to solve the problem of
multi-process concurrent access to your
data and that's a hard problem to
solve if you're going for a file sync
solution you want to keep your word
documents in sync with your laptop so
you get some kind of file sync solution
there's a granularity issue there what
happens if you change the documents in
both places what happens if you change
just part of the documents on one and
part of the documents on the other it'd
be great if you could merge those two
proprietary formats if you're a device
developer and you want to sync your
device to some third party application
you have to go in reverse engineer their
format and figure out how they store
their things and that doesn't really
create a lasting solution there what
we're doing with sync services is giving
you the ability to build sync directly
into your application you focus on
syncing your app and we'll take care of
the rest getting it to interoperate with
other applications other devices syncing
between computers sync is not just
about applications it's for devices too
we've worked a lot with devices with
various kinds of memory limitations
devices that can only store a certain
number of Records devices that can only
store fields of a certain length we've
got some good solutions to help you do
that manage that filtering and that
formatting you provide the application
to manage and configure your device you
incorporate sync into that application
and it's the same way you sync the same
way that you would
any other kind of application you
synchronize your devices the same way
you synchronize your applications we're
going for one solution for everyone here
I want to talk a little bit about the
design goals because if you understand
what we were trying to do when we
set out to accomplish things I think it
will really help you get into the mode
of what we're trying to do we
started with a really simple premise your
data where you want it when you want it
and from that we derive three basic
design goals sync should be decoupled
sync should be extensible and sync
should be invisible let's jump into what
these mean a little bit more sync
should be decoupled now there's two
things that we mean by this applications
and devices should be able to sync
independently of each other sometimes
when I want to sync with .Mac my
phone's not turned on later when my
phone is available I don't have a
network connection yet I still want my
changes to flow back and forth between
the two of them so we want to be able to
synchronize these things independently
of each other now that being said we
also want to try to synchronize these
things at the same time if I'm syncing
two devices independently I have to do
three things one to get the changes from
say my phone into my computer and then
from my phone or my computer into
address book and then from address book
back into my computer and then I have to
sync my phone again to get the changes
back onto the phone so we want a way to
be able to try and sync both of these
things at the same time it reduces
system resources it makes things go
faster it's easier all around the second
kind of thing the second kind of
decoupling that we're talking about is
that schemas are independent of
applications and devices the bookmark
data model is not owned by Safari it's
not owned by Mozilla address book does
not own the contact schema there's a
clear separation between the two there
the data model needs to be extensible we
provide some standard schemas for
contacts for bookmarks for calendar
entries if you want to be able to
exchange data between different
applications and devices you need to
conform to that schema at the same time
we want you to be able to achieve
perfect synchronization we're not going
to be able to think of everything
that you're going to want to
synchronize in your application we can't
think of that ahead of time and so you
need a way to extend the standard data
types to add your fields you need to be
able to add new data types if you want
to synchronize photos for example and
the most important point is that sync
should be invisible the user should not
have to explicitly think about
synchronization changes the idea is that
changes should just flow back and forth
naturally between the two to that end we
put your application in control you
sync when you want to sync now it's
important that you try and maintain
application responsiveness during this
sometimes things can take a while a
device might be slow in responding there
might be a lot of changes going on you
don't want to give users the spinning
beachball of death there the model that
we've come up with is what we call
trickle syncing there's nothing that we
do to provide trickle syncing it's the
way that you write your applications
to synchronize the idea is that we do
lots of frequent syncs the more we sync
the fewer changes we do on each sync the
fewer changes we have to do on each sync
the faster the syncs go the faster the
syncs go the more frequently you can
sync and so on and so forth it creates
kind of a virtuous cycle there so let's
come back to the question of what is
sync services and answer it again from a
bit more of a geek perspective it's
a public framework this framework
provides administration you register to
synchronize you register your schemas it
provides APIs with which you can give
changes to the sync engine the sync
engine processes those changes and gives
changes back to you it's a daemon the
sync engine runs inside the sync server
there it's a field level differencing
engine and I'll come back to that point
a little bit later to explain exactly
what we mean by that but we're also
coordinating syncs between multiple
applications devices servers and the
sync server takes care of coordinating
all of those and finally sync services
is a UI we provide a standard UI for
conflict resolution we provide an airbag
panel so that the user can protect their
data from rogue devices we provide an
API for you to reset a specific device
or an application what I'm going to be
focusing on in this talk is mostly
the sync server and the sync framework
there so let's get into some of the
details of sync there's three things I'm
going to cover here we're going to talk
about schemas I'm going to talk about
clients and then we're actually going to
get into the meat of how synchronization
works so what can you synchronize I
don't know how many of you went to the
core data talk a couple of days ago but
for those of you who were there we're
going to do a quick recap over what we
can synchronize we use the entity
relationship model this is the same
model used by the core data framework
and it's sort of based on the premise
that if you can save it you can sync it
on top of that we've added some extras
just for sync now the entity
relationship model is an industry
standard way of decomposing data an
entity describes a single discrete thing
a contact a phone number a bookmark an
MP3 etc an entity has a name in our case
the name must be unique across the whole
space there's a global naming space for
entities there so we recommend using a
DNS style syntax com.apple.contacts
phone number for example to
minimize the risk of collisions
entities are composed of properties
a property is an attribute which
describes a single characteristic of the
entity a first name a last name
something along those lines attributes
are strongly typed you see on the
monitor here the list of types that we
support it's a fairly rich set and we
can look at adding new types in the
future the basic types strings numbers
data date etc we've got some aggregate
types you can create arrays of these
primitive types you can create
dictionaries of these primitive types
and we have a new enum type an enum is
basically a string with a fixed set of
values that are allowed and the engine
will enforce a certain level of data
typing on this relationships are very
interesting often data by itself is not
all that useful what's interesting are
the roles the relationships between the
data a relationship describes the
particular role between two entities for
example a contact has one or more phone
numbers a bookmark has a parent so a
relationship is directional
there are two kinds of ordinalities
for a relationship you can have a to one
relationship or you can have a to many
relationship a to one relationship is a
one-to-one association between two
entities a bookmark has a single parent
in a to many relationship you've got
multiple relationships the contact can
have multiple phone numbers for example
there's the concept also of an inverse
relationship if you have a relationship
from a bookmark to its parent folder
there's also going to be an inverse
relationship from a folder down to the
bookmarks that it contains and we'll
come back a little bit later as to why
that's important to maintain so what
happens when an entity is deleted this
is again where relationships come into
play when you delete a particular entity
we find all of the relationships that
point to that entity and we nullify them
out you may also optionally say when
this entity is deleted I want you to
traverse through this relationship and
delete
all of the guys that it points to this is
what we call a cascading delete rule
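The two delete behaviors described above can be sketched in a few lines. This is a minimal in-memory model in Python, not the Sync Services API: the function name `delete_record`, the store layout, and the convention that any list value is a relationship holding record identifiers are all assumptions made for illustration.

```python
# Hypothetical in-memory model of delete rules; not the Sync Services API.
# Assumption: a relationship is stored as a list of record identifiers.

def delete_record(store, record_id, cascade_relationships=()):
    """Delete a record, nullify references to it, and cascade where asked."""
    record = store.pop(record_id, None)
    if record is None:
        return
    # Nullify: drop the deleted id from every relationship that points to it.
    for other in store.values():
        for key, value in other.items():
            if isinstance(value, list) and record_id in value:
                other[key] = [rid for rid in value if rid != record_id]
    # Cascade: follow the named relationships and delete their targets too.
    for relationship in cascade_relationships:
        for target_id in record.get(relationship, []):
            delete_record(store, target_id, cascade_relationships)


store = {
    "c1": {"first name": "Ann", "phone numbers": ["p1", "p2"]},
    "p1": {"value": "555-0100", "contact": ["c1"]},
    "p2": {"value": "555-0101", "contact": ["c1"]},
    "g1": {"name": "Friends", "members": ["c1"]},
}
delete_record(store, "c1", cascade_relationships=("phone numbers",))
```

In this example the group keeps existing with an emptied members relationship (nullify), while both phone numbers go away with their contact (cascade).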
some of the extra stuff that we've added
for sync is the notion of an identity
property when you add a new record into
the sync services engine there we want
to try and match that record up against
existing records already provided
by other clients otherwise we'll
end up with duplicates in the case of a
contact we want to find a contact that
looks the same as this guy and we use
the identity properties for that kind of
matching for a contact your identity
properties will most likely be the first
name and the last name any object any
record in the database that has the same
first name and last name is probably
going to be the same
record and so the sync engine
will merge those two records together an
identity property can be a relationship
or an attribute for example a phone
number is bound to a specific contact
you don't want to match the phone number
from one person to the phone number of
another person even if they're the same
phone number they might be two roommates
and when one roommate moves away and you
change the phone number you don't want
the other person's phone number to
change at the same time if you put the
contact relationship into the
identity set there what it's saying is
that we will limit the set of records we
look at to all of the records matched on
that relationship there's the notion of
dependent properties for example let's
say that I have a calendar event he's
got a start date and he also has a bit
specifying whether or not he's an all
day event these are two separate fields
and so I don't want to try and merge
them together in the sync engine but
there is a semantic relationship between
the two if I change the start date on
one event and I change the fact that
it's an all day event on another on a
different client I want the sync engine
to generate a conflict for that by
putting those two in the dependent
property set there by marking them as
dependent properties the sync engine can
catch and generate conflicts for those
even though they're two different fields
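A small sketch may help make the dependent-property rule concrete. The talk does not spell out the engine's exact conflict rules, so the function `find_conflicts` and its policy below are assumptions: two clients conflict when they set the same field to different values, or when they touch different fields that belong to one dependent set.

```python
# Hypothetical conflict check; the real engine's rules are not given in the talk.

def find_conflicts(changes_a, changes_b, dependent_sets=()):
    """Return the property names on which two clients' changes conflict.

    changes_a and changes_b map property name -> new value for one record.
    dependent_sets is an iterable of sets of names treated as one unit.
    """
    conflicts = set()
    # Same field changed to different values on both clients.
    for name in set(changes_a) & set(changes_b):
        if changes_a[name] != changes_b[name]:
            conflicts.add(name)
    # Different fields changed, but they belong to the same dependent set.
    for group in dependent_sets:
        touched_a = set(group) & set(changes_a)
        touched_b = set(group) & set(changes_b)
        if touched_a and touched_b and touched_a != touched_b:
            conflicts |= touched_a | touched_b
    return conflicts


client_a = {"start date": "2004-07-01"}
client_b = {"all day": True}
independent = find_conflicts(client_a, client_b)
dependent = find_conflicts(client_a, client_b, [{"start date", "all day"}])
```

Without the dependent set the two edits merge silently; with it, both fields are reported so a conflict can be raised.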
I mentioned before the ability to extend
existing entities to add your own fields
to add your own attributes these are
what we call entity extensions you can
even create a new relationship on an
existing entity to another entity or to
a brand new entity but be very careful
you don't change the fundamental
properties of the entity that you're
extending if you add a new cascading
delete rule or something like that you
may end up causing bugs in some other
clients that were depending on the
original behavior finally we've
introduced the notion of a data class a
data class is just an informal
association of entities an informal
grouping of entities there tends to be a
lot of entities that are going to fall
out in your schema you've got contacts
phone numbers URLs AIM addresses all of
these kinds of things and yet in the
mind of the user what they're thinking
about for all of those things is the
notion of a contact and so the data
class gives you a way of presenting a
user friendly name to your collection of
classes there the sync engine itself
doesn't actually use data classes for
anything intrinsic to the sync
operations your schema is described in a
file this file contains the
description of all of your entities your
extensions the attributes on the
entities the relationships it's a
standard plist file the format is well
documented in our documentation and it's
contained in a sync schema bundle that
bundle can be located anywhere you can
include it in a framework you can
include it in your app wrapper you can
put it in the standard system location
the key point to remember here is that
your schema is decoupled from your
application even if you're writing your
application and your schema and that's all
that's going to be syncing that schema
in the eyes of the sync engine the two
are not related to each other your app
does not own the schema
we provide three standard schemas for
contacts calendars and bookmarks and
those are located in
/System/Library/SyncServices/Schemas and I encourage you to
go in open up the schema bundle find the
plist and have a look through it to see
the standard format unfortunately we
don't have any documentation on that yet
but we hope to be addressing that at
some point in the near future so let's
come back again to the what can you
synchronize question we talked about
entities relationships attributes this
defines the data model that you can
synchronize what you actually
synchronize are records a record is the
basic unit of exchange in terms of the
API it's expressed as an NSDictionary
the keys are the attribute and
relationship names the values are the
types that correspond to the associated
attribute or relationship a record has
an identifier and that identifier must
be unique across your entire entity
space if you have a contact entity and
you have a phone number entity the
record identifiers must be
unique and this is where we differ a
little bit from your standard database
terminology one key point is that your
record dictionary must contain an entity
name you have to tell us what kind of
entity your record is the sink engine
depends on being able to know what kind
of record it is so it can do the right kind
of matching the right things with the
field values so we've talked a lot about
applications devices we sort of
mentioned that the sink engine is a
little bit agnostic it doesn't much care
whether your client is an application or
a device and so we came up with this
generic name for them we call them sync
clients a sync client has a unique
identifier that is how your client is
identified to sync services you can also
give it a user-friendly display name and
an image for display in some sync UI
there's a one-to-one correspondence
generally between a client and a data
source in the case of an application
the association is pretty clear but
in the case of the device you've got to
think that I'm writing a client for a
specific kind of device but the user may
end up with multiple copies with
multiple kinds of devices I've got many
different kinds of phones I've got a
couple of PDAs two iPods each of those
corresponds to a unique client that you
register with the sync engine a client
description file provides a template
description for your client it generally
contains just the static details the
type of your client is it an application
or a device the list of entities that
you synchronize and the specific
properties on those entities that you
synchronize that's important to note
just because an entity defines a set of
fields doesn't mean that your client is
going to want to synchronize all of
those fields and so you can specify I'm
only interested in synchronizing say
the first name and the last name on the
contact entity some clients only push
changes some clients only pull changes
the iPod is basically a pull only device
you only pull information onto the iPod
the iPod is never going to change that
information by specifying that in the
schema in the client description file
you can help the sync engine optimize
some of its processes now a lot of the
time there's going to be a one-to-one
association between the clients and the
client description and so you can also
include a display name and an image
directly in the client description file
and if you're writing an application
that's probably what you're going to be
using most of the time you can also
specify that information dynamically
using the sync services API so when a
phone is registered when the user
decides I want to synchronize the phone
your client can figure out what kind of
phone it is and register the appropriate
name and image for that phone let's get
now into the actual meat of how sync
works there's basically five phases of
sync I'm going to cover them briefly
here so we have a framework for the
conversation
when we start diving into the nitty-gritty
the first thing you do is you create a
sync session you then negotiate how
you're going to sync you push your
changes into sync services we process
all of those changes and then you pull
what changes are due to you back out of
sync services now there's a couple of
things that you have to know first and
what I'm going to cover here is the
truth database the truth database is an
aggregate of all of the information from
all of the clients if you have a client
that is synchronizing contacts and he
pushes in just the first name and the
last name you've got another client who
is synchronizing the first name the last
name and the company name what we store
in the truth is the aggregate of all of
those fields the first name the last
name and the company name the truth is
what the clients sync to not with each
other if you remember I mentioned
earlier that clients are decoupled from
each other and this is how we accomplish
it a client can sync into the truth
another client can come along and
sync into the truth and then they
can pull their changes directly out of
the truth now we are storing a copy of
the data here and that's worth keeping
in mind if you're going to be
synchronizing photos if you're going to
be synchronizing large data files or
things like that you probably don't want
to push that information into the truth
because then you're going to end up with
multi gigabytes of data lying around on
the user's disk we're going to come up
with a solution for that at some point
in the future the client state database
what this contains is a snapshot of all
of the records on a device we need this
information for a couple of reasons what
we do when we know a record is on a
device when the device gives the record
to us or when we push a record to the
client we store a copy of that record in
the client state the reason we do this
is so that on the next sync if the
client gives that record back to us we
can pull what we knew was on the
client before out of the client state
and compare the two of them from that we
can figure out whether this record changed
at all we can figure out specifically
what fields on that record have changed
what we push into the sync server are
just the field level differences if you
change just the first name we're not
going to push the whole record across
we're going to push just the first name
across into the mingler the other place
where this is used is when we're
formatting records I mentioned earlier
that some devices have limitations on
the length of the fields that they can
store a phone may truncate names at 20
characters for example so if we take a
really long name and we push it on to
the phone the phone is going to truncate
it if the phone then gives that record
back to us what we would do is we would
look at the shortened name we'd compare
it to the longer name in the client
state we'd say hey this has changed and
we'd end up propagating the truncated
name everywhere and that would make
people generally pretty unhappy so what
we do is we store in the client state
the formatted record we're going to
store the truncated name in the database
there in the client state so that when
the client gives that record back to us
we'll compare the two fields we'll say
those are the same it hasn't changed
unless of course the user has actually
physically changed the name on the
device now you're probably thinking oh
great they're storing yet another copy
of all of my photos and my contacts and
things well it's not that bad actually
what we store in the client state is
really just a hash of the information
that we push to the device just enough
information so that we can do that
comparison successfully one important
thing to understand is that record
identifiers are scoped to a particular
namespace each client has its own
namespace the truth database has a
namespace and there is no correlation
whatsoever between any of these
namespaces so if a client has a record
called foo another client
may have a record called foo and those
can be two completely different records
there is no association between the two
of them there so putting everything
together the way things work is this a
client is going to take a record give it
to sync services sync services
is going to pull the record out of the
client state and compare them if the
record isn't in the client state then we
know that this must be a new record and
we push what we call an add into the sync
server if the record exists in the
client state we compare the two and push
just the field level differences into
the sync server the sync server
processes those changes merges
them into the truth and clients then
pull all of the changes out of the truth
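The flow just described (compare with the client state, push an add or only the field-level differences, merge into the truth) can be modeled roughly as follows. This is a Python sketch under stated assumptions, not the real engine: the names `diff_against_client_state` and `merge_into_truth`, the `id_map` that links a client's record identifiers to truth identifiers, and the "truth-N" identifier scheme are hypothetical. The talk says the client state holds only enough information (a hash) to do the comparison; the sketch keeps a full copy for readability, and it skips identity matching.

```python
# Hypothetical model of field-level differencing; not the Sync Services API.

def diff_against_client_state(client_state, record_id, record):
    """Compare a pushed record with the snapshot and return one change.

    Returns ("add", id, fields), ("modify", id, changed_fields) or None.
    A value of None inside changed_fields means the field was removed.
    """
    previous = client_state.get(record_id)
    if previous is None:
        change = ("add", record_id, dict(record))
    else:
        changed = {k: v for k, v in record.items() if previous.get(k) != v}
        removed = {k: None for k in previous if k not in record}
        delta = {**changed, **removed}
        change = ("modify", record_id, delta) if delta else None
    client_state[record_id] = dict(record)  # remember what the client has
    return change


def merge_into_truth(truth, id_map, change):
    """Apply one change to the truth; id_map maps client ids to truth ids."""
    kind, client_id, fields = change
    if kind == "add":
        id_map[client_id] = f"truth-{len(truth) + 1}"
    target = truth.setdefault(id_map[client_id], {})
    for name, value in fields.items():
        if value is None:
            target.pop(name, None)
        else:
            target[name] = value


client_state, truth, id_map = {}, {}, {}
first = diff_against_client_state(
    client_state, "foo", {"first name": "Ann", "last name": "Lee"})
merge_into_truth(truth, id_map, first)
second = diff_against_client_state(
    client_state, "foo", {"first name": "Anne", "last name": "Lee"})
merge_into_truth(truth, id_map, second)
```

The second push carries only the changed first name, which is the point of field-level differencing: another client that edited the last name in the meantime would not be overwritten.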
so coming back to the start of the
process creating a sync session the
first thing you have to understand is
you may not be allowed to synchronize
and there's many reasons for this it
could be that some other client is
already synchronizing now because the
sync server is writing into a common
database we can't allow multiple people
to all synchronize at the same time we
need to maintain a certain state of
integrity of the truth database there
and so we can only process one set of
clients at a time so if a client is
already in the middle of synchronizing
other clients must wait until that guy
has finished it's important to be able
to maintain application responsiveness
throughout this so we provide
blocking APIs for convenience we also
provide non-blocking APIs so that you
can basically request of sync services
I'd like to start a sync session now
please generally you'll probably be able
to go straight away but if you can't
we'll call you back when you're ready to
go now that being said I also mentioned
earlier that we want to synchronize
clients simultaneously this is not a
contradiction what sync services
provides is the notion of a sync alert
when you register your client
you can specify the kinds of clients
that you want to synchronize with
address book pretty much wants to
synchronize with anything so he says
I'll synchronize with apps I'll sync
with devices I'll sync with servers a
server would probably only synchronize
when other servers are synchronizing a
phone would synchronize when a server is
syncing or when another phone is
syncing so you specify at registration
time who you want to sync with when one
of those guys then starts syncing
sync services delivers an advisory notice
to the clients that have registered an
interest with that guy this is an
advisory notice only you're free to
ignore it if you're not ready to sync we
definitely encourage you to sync if you
can there are two ways that the notice
can be delivered we can launch a tool
that you've registered we specify on the
command line to that tool the ID of the
entity that's being synchronized the
ID of the client that is being synchronized
excuse me and the list of entities that
are being synchronized with that client
alternatively you can register a callback
directly in your application an
object and a selector and we will invoke
that selector saying hey now's a good
time to sync if you like if you don't
want to sync simply return without doing
anything and we'll pass you by this time
you can always sync later now why
would you want to choose one method over
another something like a server or a
device is probably going to register a
tool to actually do the synchronization
they don't necessarily have any multiple
writers issues to worry about or
anything like that when they want to
synchronize we just launch the tool
and all of the logic is embedded in that
tool an application like iCal on the
other hand when it launches will
register a dynamic callback while iCal
is running we can call that callback to
tell iCal to synchronize when iCal
quits the callback is deregistered
implicitly and iCal won't sync anymore
until the next time it launches
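The sync alert rule described above, where each client declares at registration time which kinds of clients it wants to sync alongside, can be sketched as a simple lookup. The function `clients_to_alert`, the registration dictionary layout, and the client names are hypothetical and only model the idea; they are not the Sync Services API, and the notice they stand for is advisory.

```python
# Hypothetical model of sync alerts; not the Sync Services API.

def clients_to_alert(registrations, syncing_client):
    """Return the clients that asked to be told when this kind of client syncs."""
    syncing_type = registrations[syncing_client]["type"]
    return sorted(
        name
        for name, registration in registrations.items()
        if name != syncing_client and syncing_type in registration["syncs_with"]
    )


registrations = {
    "address-book": {"type": "app", "syncs_with": {"app", "device", "server"}},
    "dotmac": {"type": "server", "syncs_with": {"server"}},
    "phone": {"type": "device", "syncs_with": {"server", "device"}},
}
when_server_syncs = clients_to_alert(registrations, "dotmac")
when_phone_syncs = clients_to_alert(registrations, "phone")
```

A client that receives the alert is still free to do nothing and sync later.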
after you've successfully created your sync
session you go through and you negotiate
the sync modes now there are four basic
sync modes that we need to talk about
here the first of these is fast syncing
fast syncing is the preferred mode of
synchronization when you're fast syncing
you're basically just telling the engine
what changes have happened since the
last time you synchronized you tell the
engine these are the records that were
added since I last synced these are the
records that were modified since I
last synced these records have been
deleted since I last synchronized that
kind of implies that you can maintain
all of that state information and not
all applications not all devices are set
up to do that sometimes even when you
can maintain that information you may
not trust it if you synchronize the
device with another machine that
information may be out of date in that
case you will want to slow sync when you
slow sync you basically give all of your
records to sync services and we figure
out what's changed you remember in the
client state we store a complete copy of
all of the records that we knew to be on
your device or in your application the
last time we synchronized when you give
us all of the records we basically go
through and we check off the records in
the client store one by one and anything
left in the client store afterwards is
a record that used to be in your client
that isn't anymore and we will generate
a delete for those records and that's a
very important point to keep in mind
when you slow sync you tell us about
everything we figure out what the
changes are and delete the records that
you didn't tell us about anymore
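The check-off process for a slow sync can be sketched like this. It is a minimal Python model, not the real engine: the function name `slow_sync` and the plain-dictionary client state are assumptions for illustration.

```python
# Hypothetical model of slow-sync change detection; not the Sync Services API.

def slow_sync(client_state, all_records):
    """Work out changes when the client hands over every record it has.

    Returns (changed_ids, deleted_ids). A record is "changed" if it is new
    or differs from the snapshot; it is "deleted" if the snapshot has it
    and the client no longer reported it.
    """
    changed = sorted(
        record_id
        for record_id, record in all_records.items()
        if client_state.get(record_id) != record
    )
    deleted = sorted(
        record_id for record_id in client_state if record_id not in all_records
    )
    return changed, deleted


client_state = {
    "a": {"first name": "Ann"},
    "b": {"first name": "Bob"},
    "c": {"first name": "Cy"},
}
all_records = {
    "a": {"first name": "Ann"},     # unchanged
    "b": {"first name": "Robert"},  # modified
    "d": {"first name": "Dee"},     # new
}
changed, deleted = slow_sync(client_state, all_records)
```

Record "c" was in the snapshot but not in what the client reported, so it is the one that turns into a delete.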
sometimes bad things happen a device can
be reset the user can accidentally
delete your data store if you were to
slow sync at that point what would
happen is this we knew you had all of
these records before now you tell us
you've got nothing you must have deleted
everything so we delete everything in
the truth
.Mac synchronizes we delete
everything on .Mac by the time you
get home all your data is gone I'm sure
this has happened to some of you in the
past in this case what you want to do is
do a refresh sync when you do a refresh
sync we throw away everything in the
client store we forget everything we
ever knew about you we go through this
process of rediscovery you give us all
of your records we're going to pass
those into the sync server we're going
to let him figure out he's going to take
each of those records compare them to
existing records in the truth to try to
find a match no deletes will be
generated but what will happen is
anything in the truth that you didn't
give us is going to come back to you at
that point so if your data store has
been reset if your device has been
erased you need to be able to tell us
that so that we can do a refresh sync
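A refresh sync, as described, forgets the client state and rediscovers records by matching them against the truth. The sketch below is a rough Python model under assumptions: the function `refresh_sync`, matching on a tuple of identity property values, and the return shape are hypothetical, and the real matching rules may differ.

```python
# Hypothetical model of a refresh sync; not the Sync Services API.

def refresh_sync(truth, client_records, identity_keys):
    """Match client records to the truth by identity; never generate deletes.

    Returns (matches, adds, to_pull):
      matches  maps client record id -> truth record id
      adds     lists client record ids with no match in the truth
      to_pull  lists truth record ids the client did not give us
    """
    def identity(record):
        return tuple(record.get(key) for key in identity_keys)

    truth_by_identity = {identity(rec): tid for tid, rec in truth.items()}
    matches, adds = {}, []
    for client_id, record in client_records.items():
        truth_id = truth_by_identity.get(identity(record))
        if truth_id is None:
            adds.append(client_id)
        else:
            matches[client_id] = truth_id
    matched = set(matches.values())
    to_pull = sorted(tid for tid in truth if tid not in matched)
    return matches, adds, to_pull


truth = {
    "t1": {"first name": "Ann", "last name": "Lee"},
    "t2": {"first name": "Bob", "last name": "Ray"},
}
keys = ("first name", "last name")
after_reset = refresh_sync(truth, {}, keys)
partial = refresh_sync(
    truth, {"x": {"first name": "Ann", "last name": "Lee"}}, keys)
```

With an empty data store nothing is deleted; every truth record simply comes back to the client, which is the behavior wanted after a device reset.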
we also have this notion of pushing and
pulling the truth there are times where
a user just wants to erase everything on
a device or an application or a computer
and say replace it with the contents of
this computer I've got all of these
contacts in address book I know they're
in a good state I want all of those on
my phone that's the mode that we call
pulling the truth what happens when you
pull the truth is we expect you to
delete all of the records in your
clients data store and replace them with
the records that sync services gives you
the converse to this is pushing the
truth when you push the truth what
you're saying is I've got a known good
state in this specific client here that
I want to replicate everywhere through
dot Mac to all my other computers to all
of my other devices into all of my other
applications when one client is pushing
the truth sync services will tell all
other clients to pull the truth this is
a very destructive operation so you only
want the user to initiate this operation
clients themselves should never try to
push the truth now that being said none
of these sync modes are actually
reflected directly in the API instead
what we did was we looked at these and
said there's a lot of
commonality between all of these
different sync modes when you're slow
syncing or refresh syncing or pushing
the truth we want you to push all of
your records out into sync services in
some cases when you're pulling the truth
we don't want you to push any records
sometimes we don't want you to pull any
records at all and so what we've done in
the API is we focused in on those
specific actions and we've oriented our
API around those actions so don't be
surprised if you go looking in the API
and you don't see fast sync or slow
sync or refresh sync mentioned anywhere
it's the concepts that are important so
let's come back to pushing changes now
you've got a choice when you push your
changes first thing you're going to do
is you're going to ask sync services
should I push all of my records or
can we fast sync here if you can fast
sync then we only want you to tell us
about the records that have been added
modified or removed since the last time
you synchronized you've got a choice
here too you can do the hard work if you
know what specific fields have changed
you can tell us we just changed the
first name on this guy we deleted this
record and we changed the company name
on this guy alternatively if you don't
want to go to all of that extra effort
you can just give us the whole record
and we'll figure out what's happened by
pulling the information out of the
client state there we're going to
package those things up and push
the field level differences over to the
sync server so what happens if something
goes wrong right now you're in the
middle of pushing all of your changes
and the device runs out of battery or
your application crashes or god forbid
sync services crashes and takes you down
with it what happens at this point when
you first start pushing changes we
create an implicit transaction scope all
of the changes that you push are going
to fall into that transaction scope
which is closed when you tell us I'm
done I've got no more changes for you
and we ship the whole thing off to the
sync server if something goes wrong
in the middle of that transaction scope
we're going to unwind the whole thing
we're going to roll them back we're
going to forget all of the changes you
made the next time you synchronize if
you're smart enough to be able to figure
out well I was halfway through pushing
at that point so I need to re-push all
of those changes again then by all means
go ahead and fast sync it might be safer to
slow sync at that point however you can
tell the engine I'm just going to give
you everything you figure out what's
changed now some of you might be
wondering why do they roll back all of
the changes that we've already given
them why don't they just take what we've
given them process them at that point
and we'll pick up where we left off the
problem is that when you introduce
relationships into the question there's
a whole set of data integrity issues
that come into play you
might have pushed a couple of records in
that refer to some records that haven't
been pushed yet because the device ran
out of batteries so you couldn't get
those records and so we erred on the
side of safety just said we're going to
replay the whole thing to get back to a
known good state what we want to do is
protect the data in the truth mingling
is the heart of sync this is where we
take all of the changes from all of the
clients and we merge them into the truth
we process the changes on a client by
client basis so we take all of the
changes from addressbook we merge them
into the truth take all of the changes
from dot Mac merge them in all the changes
from the phone and merge them in it's
here that we do our conflict detection
if a phone has changed the first name and
the first name has also changed on dot
Mac since the last time it
synchronized we need to generate a
conflict at that point again let's ask
the question what happens if something
goes bad here the answer is you don't
have to worry about it that's our
problem once those changes have been
handed off to us we're responsible for
them we will make sure they get into the
truth or we will take steps to recover
by
asking for all records from all of the
clients again I want to talk a little
bit more about the conflict handling
once the conflict has been detected what
do we do with it well typically we're
going to go off and ask the user we've
got a conflict between this guy and this
guy what do you want to do about it
that's not always appropriate there are
some fields and entities which the user
isn't going to know what to do with
iCal's got a sequence number I don't know
what that does we can't expect users to
figure that out and so we're going to
add the ability for a schema bundle to
specify some code that gets loaded into
the sync server to handle those
conflicts he gets first crack at them though
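
To make the conflict handling concrete, here is a small Python sketch: a field changed to different values by two clients since the last sync counts as a conflict, schema-supplied code gets first crack at it, and anything that code declines is set aside for the user. The function names, the resolver signature, and the idea of resolving a sequence number by taking the larger value are all assumptions for illustration; the real conflict resolver interface is not shown in this session.

```python
def find_conflicts(changes_a, changes_b):
    """Fields that both clients changed, to different values, since last sync."""
    return {name: (changes_a[name], changes_b[name])
            for name in changes_a
            if name in changes_b and changes_a[name] != changes_b[name]}


def sequence_resolver(name, value_a, value_b):
    """Hypothetical schema-supplied code. Returns None when it cannot decide."""
    if name == "sequence":
        # Assumption for this sketch: the larger sequence number wins.
        return max(value_a, value_b)
    return None


def handle_conflicts(truth_record, conflicts, resolver, pending):
    """Give the resolver first crack; park anything it declines for the user."""
    for name, (value_a, value_b) in conflicts.items():
        resolved = resolver(name, value_a, value_b)
        if resolved is not None:
            truth_record[name] = resolved                # merge the response
        else:
            pending.append((name, value_a, value_b))     # ask the user later
```

Using `None` to mean "I can't deal with this" keeps the sketch short; a real resolver would need a way to distinguish that from a legitimately empty value.
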
when we detect the conflict we're going
to ask this code can you deal with this
if he says yes we'll merge the response
into the truth if he says no we're going
to store the conflict off on the side we
don't want to pop a panel up in the user's
face right in the middle of sync
remember applications and things can be
synchronizing at any time having panels
popping up saying what do you want to do
about this what do you want to do about
this it's going to get really stale
really quickly instead what we do is we
save the conflicted records off to the
side we notify the user through a little
UI element that he's got some conflicts
that need his attention and when
the user is ready they can pull those
conflicts up and resolve them and
they'll be merged in the next time they
synchronize pulling changes is pretty
much the easy part clients pull changes
directly from the truth database they
don't pull them from the sync server
once the sync server has finished mingling
he's done he's off and someone else can
come in and synchronize at that point
what we do is we maintain a snapshot of
the truth database for some
self-consistency there it's held as long
as clients are pulling changes out of it
when you're getting changes out of the
truth you have a choice we give you
the deltas and we also give you the full
record so you can go in and look did I
change just the first name or the last
name or you can take the whole record
and push it onto the device
or into your application you can filter
out records that you don't want there's
two ways this can be done we can give
you all of the records and you can tell
us I want this one I want this one I
don't want this one sometimes it's much
easier if you just write a little filter
independently that's loaded into sync
services that does that filtering for
you so that you only get the records
that you're concerned with for example
in the phone device configuration UI
I might want to specify I'm only going
to synchronize contacts in this one
specific group we've got some standard
filters for that kind of thing and so
your UI can say let's use this
filter for this client that gets loaded
into sync services he gets rid of the
records that we don't want and only
gives to the client the records that
pass through that filter when the engine
gives you a new record we're going to
make up an ID for that record the reason
is that there may be relationships
referring to that guy we need to know
what to call him we're going to use a
uuid for that but that may not always be
convenient for you if you're going to
push a record onto a phone or store it
in your own database you probably want
to use your own identifier for that so
the record identifier can be changed any
earlier references in a relationship
that we've already given to you we can't
change those of course but once you
change a record identifier any
references that we give you after that
will use the new record identifier sync
services uses a two-phase commit process
at this point again to answer the
question what happens if something goes
wrong by two phase commit in this case I
mean when the engine gives you a record
you tell the engine yes I want this
record or no I don't want this record up
to you to decide but you got to tell us
one way or another if you don't tell us
that you accepted or rejected a
record we're going to give it to you on
the next sync and the next sync until
you tell us what to do with that record
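
The accept-or-reject rule can be sketched in a few lines of Python. The `PullQueue` class and its method names are hypothetical stand-ins for the engine side of pulling changes, not the real session API, and acknowledgements take effect immediately here to keep the example short.

```python
class PullQueue:
    """Hypothetical stand-in for the changes the engine is holding for a client."""

    def __init__(self, changes):
        # record id -> change that the client has not acknowledged yet
        self._unacknowledged = dict(changes)

    def changes_for_next_sync(self):
        """Everything not yet accepted or rejected is offered again."""
        return dict(self._unacknowledged)

    def accept(self, record_id):
        """The client took the record, so stop offering it."""
        self._unacknowledged.pop(record_id, None)

    def reject(self, record_id):
        """The client does not want the record, so stop offering it."""
        self._unacknowledged.pop(record_id, None)
```

A record that is neither accepted nor rejected stays in the queue, which is why it shows up again on the next sync and the one after that.
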
now if you're talking to a high latency
device over a USB connection to a phone or
over a dial-up connection to a server
it's not going to be terribly efficient
if you have to push the record and then
tell us you accepted it and push the
record and tell us yes it made it there
ok and so we allow you to do this
batching process what you can do is just
tell us you accepted a record got this
one got this one thanks and then you
tell us I committed the records that I
told you I accepted or rejected what
this allows you to do is to batch in
memory the changes that we give you so
you can pull a hundred changes out of
sync services push them en masse up to
your server or over to your phone and
when you know they're there safely you
tell us it's done we're golden what
happens if something goes bad is we
unwind that implicit transaction scope
when you first start pulling changes we
create a new transaction as you accept
or reject changes we write them into
the transaction and when you commit
those acknowledgments we commit and
close that transaction and implicitly
create a new one so if something bad
happens we're going to roll back to
the last time you told us you
committed those changes and we're going
to give them to you on the next sync so
the five phases of sync you can think of
them as a finite state machine the
phases must be traversed in order but
they can be cancelled or finished at any
time the typical application sync model
that we recommend for people is this
when you first launch do a sync to pick
up any changes that have been made since
the last time your application was run
do it in the background again remember
to maintain application responsiveness
we give you the methods to query whether
you need to slow sync or we can tell you
whether we think a sync is going to
take a long time if so pop up a panel to
the user and say hey this may take a
long time do you want to do this now
or if you're going to be even more
sophisticated just do the sync in the
background and let the user carry on
with the app normally as they make changes
throughout the course of the
application trickle sync periodically
to push those changes out when you save
to disk do a sync to get the changes
out when you quit what we want is to get
those changes out to the sync engine
again you don't necessarily at that
point want to wait for the whole sync to
complete the user is quitting you want
to get out of there as quickly as
possible what you can do is create a
session push your changes and then
finish it you're done you don't have to
wait for the mingling you're definitely
not going to pull any changes at that
point because they're quitting the
application they don't need it the
device has got a much simpler model
typically when a user initiates a sync
there's an explicit action on the part
of the user their device is plugged in
they hit the hot sync button or a sync alert
comes in because some other device is
synchronizing just go through the whole
sync at that point so let's talk a
little bit about the API what I'm going
to do here is not actually get into the
details as I said we've got some great
reference documentation on the tiger DVD
what I want to do is just give you a
road map to some of the more important
classes now the API is Cocoa based but
it's procedurally oriented for a number
of reasons what this means is it's
easily wrapped in Java I've got a lot of
experience doing that so we kept that in
mind while designing this API you can
also use it easily from C we support
almost all of the core foundation types
the core foundation toll-free bridged
types most of the more important data
types are toll-free bridged types and so
again using this from carbon is no
problem there are five classes that
we're interested in ISyncManager
ISyncClient ISyncSession ISyncChange
and the record snapshot ISyncManager
is the singleton object it's your basic
administrative
point of contact this is where you go
to register your schemas it's where you
go to register your clients it's where you
go to look up the clients that have been
registered not a lot to him there
ISyncClient represents a registered
device or application this is where you
can get information the identifier the
display name the image of the guy what
entities does he support how is he going
to sync I want him to reset the next
time he syncs you can specify how he's
going to synchronize and use ISyncClient
to set up sync alerts to specify
I want this tool to be launched when
these kinds of clients start
synchronizing ISyncSession
encapsulates the whole sync process that
we just talked about he's got all the
methods to walk you through the state
machine in there he's got the methods
that you can use to query how should I
sync he's got the methods to allow you
to pull the changes out to accept them
and to commit them and the key point to
know here is that there is only one sync
session per client per machine allowed
we've got a sync server that's going to
gate that and we will not let the same
client sync multiple times so you don't
have to worry about preserving that kind
of semantics ISyncChange
encapsulates all of the changes to a
single record the change specifies
whether it's a new record an existing
record being modified or a record being
deleted and he contains all of the field
level changes to that specific record
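
One way to picture a change object of this kind is the Python sketch below: a change names one record, says whether it is an add, a modify, or a delete, and carries the field-level changes for that record. The names `ChangeKind`, `RecordChange`, and `apply_change` are invented for this illustration and are not the ISyncChange interface.

```python
from dataclasses import dataclass, field
from enum import Enum


class ChangeKind(Enum):
    ADD = "add"
    MODIFY = "modify"
    DELETE = "delete"


@dataclass
class RecordChange:
    """All of the changes to a single record (hypothetical shape)."""
    record_id: str
    kind: ChangeKind
    field_changes: dict = field(default_factory=dict)  # field name -> new value


def apply_change(store, change):
    """Apply one change to a client store shaped as {record id: record dict}."""
    if change.kind is ChangeKind.DELETE:
        store.pop(change.record_id, None)
    elif change.kind is ChangeKind.ADD:
        store[change.record_id] = dict(change.field_changes)
    else:
        # MODIFY: only the fields named in the change are touched.
        store.setdefault(change.record_id, {}).update(change.field_changes)
```

Because a modify carries only the fields that changed, a client that tracks its own edits can describe them without resending the whole record.
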
the changes that you get from the sync
server will contain those field deltas
and will also contain a full copy of the
record if you're smart enough to be able
to tell the sync server what the field
level changes are to your records you
can create one of these you only need to
specify the field level deltas you don't
need to give us the whole record the
record snapshot gives you an immutable
copy of the truth database using this
you can go in and introspect everything
that's in the truth
even stuff that your specific client may
not be synchronizing the snapshot is
frozen at the time of creation what that
means is if the truth database changes
after the snapshot is created it will
not be reflected in that snapshot you've
got a self-consistent view of the truth
at that point and you can choose what
record identifier namespace you want
these things represented in it could be
the truth we could use the truth record
identifier or you can tell us I want to
use the identifier for this client again
you can even get stuff out that that
client may not be synchronizing the
times where you might want to use the
snapshot for example are to give you an
example in the case of a phone that's
synchronizing calendar events a phone
doesn't actually synchronize the
calendar lists themselves and yet when
you create an event on the phone it has
to be filed in some calendar so it's a
bit of a paradox here what you can do is
you can use the snapshot to get the list
of calendars out of the truth you can
let the user in the configuration UI
choose a specific calendar and remember
the ID of that guy and when your device
when your client is pushing those new
calendar events into sync services you
just set up a relation saying this guy
belongs in this calendar even though
you're not syncing him one thing to note
is that the truth database is organized
to be efficient for sync not efficient
necessarily for you so don't use this
too often as a general-purpose database
API you'll find the results a bit
disappointing in that respect so let's
have a quick recap what do you do you
register your schemas you register your
clients you push data into sync services
you pull your changes out of sync
services and you provide the UI to
configure your client we take care of
all of the rest we synchronize your data
we detect conflicts and we provide a
standard UI so users can
use it to resolve those conflicts we
give you an air bag to preserve the data
integrity and we provide a dot mac
client to synchronize data between
multiple machines the design goals that
underlie everything that we've done with
sync decoupled clients schemas
separate from the applications
extensible schemas and sync must be
invisible now if you have questions you
can contact Patrick Collins or Xavi a
and we definitely encourage you to file
bugs radar is our friend in that respect
for more information we've got a lot of
great documentation the reference
documentation is on your tiger DVD it's
also available on connect.apple.com the
concept documentation which I highly
recommend you read some great docs there
is only available on the web at this
point we didn't manage to get it onto
the DVD we've got some sample code for
some sample applications it's in the
usual place in developer examples sync
services