WWDC2004 Session 427

Transcript

Kind: captions
Language: en
what Apple we're going to be talking
today about data synchronization hello
sorry okay this is the second talk on
data synchronization the first talk
covered the architecture kind of the
core concepts and the API talked a lot
about basically what sync services is
all the different parts and what it
means this talk's a bit more of a how-to
how do you take an application and get
it to synchronize data now there's going
to be a little bit of overlap from the
first talk I don't know if anybody was at
the previous talk if you were feel free
to doze off a little bit as i go over
things but i do want to make sure that
we frame all the context and all the
concepts so that it all makes sense so
let's talk about sync services what it is
initially we had three applications that
you could synchronize for three data
types contacts calendars and bookmarks
you could also synchronize these to devices
phones iPod and you could synchronize
them up to .Mac now what we've added
is the ability for you to synchronize
your own custom applications so you can
take your app and you can synchronize
your data you can either synchronize
with the existing types of data we
already provide so you can kind of join
in on the party or you can sync your own
data and you can get it up to .Mac
and across to other machines so what are
we going to show you today we're going
to go over a review of the syncing
concepts this will be a bit of a
refresher if you saw the previous talk
or read about some of this before but I
want to make sure that I cover the basic
concepts so you know basically what I'm
talking about I'm going to talk about
what you need to do in an application I
want to make sure I cover the what what
is it that you're doing and we'll give
you a demo that will highlight and
illustrate what we just told you so
first I'll tell you what to do then
we'll show you what that looks like and
then what I'm going to do is cover how
to do it we'll show some small code
snippets and we'll talk about exactly
how you do all these things when you
walk away from here this is what I want
you to take with you I want to make sure
that I
bring enough of the overall architecture
that you understand what sync services is
that you understand what it means for
your application to be synchronizing
that consists of three main things how
to work with schemas if I can get that
to highlight data schemas are used to
identify the different types of data
that you'll be synchronizing so we're
going to cover how you can make your own
schema and how to use schemas that are
already in the system how do you manage
the sync session this is the core of the
talk what do you do when you're actually
synchronizing your data we'll go into
that in pretty good detail we're going
to talk a lot today today tonight seems
dark in here we're going to talk a lot
about how you trickle sync
trickle syncing is a way to basically
push small bits of data very frequently
so that from the user's perspective their
data is just constantly trickling out
when you make changes in your
application you want to trickle the
changes out as soon as you can and when
changes are made in other applications
that you're syncing with you want to be
able to pull those changes in
transparently to the user I'm going to
cover a little bit of terminology
clearly we need to define our terms so
we'll introduce some of that as we go
through the talk and I'm also going to
talk a bit about best practices by that
I mean things that you can do in an
application in order to provide the user
with the best experience things that you
can do so that syncing goes smoothly
fairly transparently and so that when
problems occur the user has choices on
what they can do so let's talk a bit
about the types of sync clients and the
types of syncing that you can do
there's four main types of clients we
have applications frameworks devices and
servers so let's look at that first
diagram again but from the perspective
of the types of clients here we're
showing applications iCal and Safari are
both applications they sync data from
within the application this is what
we'll be talking about today for your
custom applications from within the
application you'll be interacting with
sync services you might also have a
framework that's used by multiple
applications Address Book does this the
Address Book app
iChat Mail and other applications can all
access a common data store so there
really isn't one process running on the
system or one application that owns this
data and in that case the framework must
provide some mechanism to sync usually
by having a daemon that can get invoked
when you want to sync similarly when
you're syncing to a device you'll have
some proxy to that device something that
runs on the computer that represents
that device for syncing finally with the
server you'll also have a process that
can talk to that server now although I'm
going to focus today on talking about
applications much of what I say is going
to be applicable to all these types of
clients and one thing that they all
share in common is that they you know I
really can't find this thing Toby made
it look so easy before all the data is
pushed into something we call the truth
each user on each machine will have a
truth database that means if you as a
specific mac OS user have two different
machines that you're syncing you'll have
a truth database on each machine and if
you have more than one user on a machine
each one of those users has their own
truth database and the truth is an
aggregate it's an aggregate of all the
data that's synced by every client so
although your client might be syncing
some data you might not sync every
field or attribute that's in a certain set
of record types you might sync contacts
but you only sync a few of the fields
there might be a richer client that
syncs many more fields so the aggregate
is the combination of everything even if
your client isn't synchronizing it we
have three main sync modes fast sync
slow sync and refresh fast sync is the
most desirable that's done when what you
want to push is just the changes from
the last sync if you're able to keep
track of deltas between sync operations
then when you synchronize you can just
push up changes added this record
deleted this record made a mod to this
record so you push up little changes and
then you pull down from the engine only
what
changed there if you don't keep track of
the deltas or if it's the very first
time you're syncing you'll need to do a
slow sync when you do a slow sync you
push up all of your records what the
engine does is it looks at all the
records you're pushing up and compares
them to the last set of records that
your client had pushed if you don't push
a record up that you had previously sent
that's treated as a delete it also
compares all the fields so it can tell
what records have changed this is
clearly a fairly expensive operation
especially if you have thousands of
records so it's always more desirable to
do a fast sync when you can but there
are a few cases where you'll have to do
a slow sync and I'll talk about that in
a little more detail as we go through
the slides finally you can do a refresh
sync this is typically done when you've
lost all the data on your machine or
you've lost all semblance of state
regarding your sync sessions if this
happens you can tell the sync server
forget anything you ever knew about me
I'm just going to refresh we're going to
start over as if we've never synced
before typically you'll be doing that to
pull everything down from the engine so
you can reset your state but there are
cases where you might have added a few
records and you want to also push those
up to the engine but still do a refresh
for instance you may have lost the
battery or power in a phone and lost all
the data but you're traveling and you've
added a couple of new contacts so you've
just got two contacts on this phone you
tell the engine to do a refresh sync
you just push up your two contacts
they're added and all of the contacts
that had initially been in the sync
engine are pushed back down to your
phone so with these different modes of
syncing we give you the performance
when you need it but we also provide
security that you'll always be able to
recover all your data and that you'll
always be able to sync every record most
important thing when you're syncing is
correctness second most important is
speed and most users are going to think
the most important thing is speed
especially when things are slow and take
a long time but trust me the minute you
lose some of their data they're going to
realize what was most important and
that's being correct I'm sorry I was
supposed to show you that before
I'll be doing that throughout the talk
so get used to it there's one other
mode of syncing which we call trickle
sync and it's not really a mode in the
same sense as the other three it's more
like a way of life for your application
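To make the slow sync comparison described earlier concrete, here is a minimal sketch in Python. It is not the Sync Services API and not how the engine is actually implemented; the function name `diff_slow_sync` and the record layout (dictionaries keyed by record ID) are assumptions made for illustration. It shows the idea from the talk: the full push is compared against the last set of records the client pushed, a record that is missing is treated as a delete, and fields are compared to find modifications.

```python
def diff_slow_sync(previous, pushed):
    """Compare a full push against the client's last pushed snapshot.

    Both arguments map record ID -> dict of fields (hypothetical layout).
    Returns (added, modified, deleted).
    """
    # Records never seen before are adds.
    added = {rid: rec for rid, rec in pushed.items() if rid not in previous}
    # Records pushed last time but absent now are treated as deletes.
    deleted = sorted(rid for rid in previous if rid not in pushed)
    # Records present in both are compared field by field.
    modified = {}
    for rid, rec in pushed.items():
        old = previous.get(rid)
        if old is not None and old != rec:
            modified[rid] = {
                key: rec.get(key)
                for key in set(old) | set(rec)
                if old.get(key) != rec.get(key)
            }
    return added, modified, deleted


previous = {"1": {"name": "Ann", "phone": "555-0100"}, "2": {"name": "Bob"}}
pushed = {"1": {"name": "Ann", "phone": "555-0199"}, "3": {"name": "Cy"}}
added, modified, deleted = diff_slow_sync(previous, pushed)
```

Every record has to be walked and compared, which is why a slow sync gets expensive with thousands of records and why pushing only the deltas is preferable when the client can track them.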
if you trickle sync you routinely fast
sync and there's a number of different
places opportune moments when your
application can do a fast sync you try
to do it in the background make it
transparent to the user and one
important point an application probably
shouldn't ever synchronize unless the
user's told it to so when your
application starts for the first time
and the user starts to use it you should
provide them some configuration so that
they can say yes I want to sync my data
now at that point when they do they've
given you carte blanche to sync it
whenever you want so you should go off
and sync small changes frequently we
talked in the previous presentation
about how the more frequently you can
sync the less data you have to sync
each time so it's more like a continuous
flow that doesn't put much load on the
system doesn't seem to take a long time
for the user and it keeps things
synchronized more expediently if you
make changes in your app and push them
out right away they're available to
other applications and available to get
pushed up to .Mac right then and there
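Trickle syncing depends on the application remembering what changed since the last sync so that each fast sync only hands over the deltas. The sketch below is one hedged way to do that in Python: a list of changed record IDs, a list of deleted IDs, and a file so the deltas survive quitting without syncing. The class name `ChangeTracker`, its methods, and the JSON file format are all hypothetical; none of this is Sync Services API.

```python
import json
import os
import tempfile


class ChangeTracker:
    """Remembers which records changed since the last sync (hypothetical helper)."""

    def __init__(self, path):
        self.path = path
        self.changed = set()   # IDs of added or modified records
        self.deleted = set()   # IDs of deleted records
        if os.path.exists(path):
            with open(path) as f:
                state = json.load(f)
            self.changed = set(state["changed"])
            self.deleted = set(state["deleted"])

    def _save(self):
        # Persist after every change so nothing is lost on quit.
        with open(self.path, "w") as f:
            json.dump({"changed": sorted(self.changed),
                       "deleted": sorted(self.deleted)}, f)

    def record_change(self, record_id):
        self.deleted.discard(record_id)
        self.changed.add(record_id)
        self._save()

    def record_delete(self, record_id):
        self.changed.discard(record_id)
        self.deleted.add(record_id)
        self._save()

    def take_delta(self):
        """Return (changed IDs, deleted IDs) for a fast sync and reset the lists."""
        delta = (sorted(self.changed), sorted(self.deleted))
        self.changed.clear()
        self.deleted.clear()
        self._save()
        return delta


path = os.path.join(tempfile.mkdtemp(), "pending-changes.json")
tracker = ChangeTracker(path)
tracker.record_change("a")
tracker.record_change("b")
tracker.record_delete("b")

# Simulate quitting and relaunching before a sync happened.
restarted = ChangeTracker(path)
delta = restarted.take_delta()
```

Timestamps or a dirty bit on each record would work as well; a plain list is simply the easiest to show.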
similarly when changes are made in other
applications you'll be notified and you
want to pull them in as soon as you can
so you always have the most up-to-date
representation for the user so let's
talk about what you need to do in an
application to sync it what are the five
most important things the five main
things I've grouped what I think are the
core fundamental pieces that you're
going to have to put into an application
to sync the first one is setting up a
data schema a data schema represents the
type of records that you'll be syncing
in a canonical form that the engine can
understand this is important for a few
reasons most applications have their own
way of representing data you could use
objects all the way down you might
archive them or store them in a graph
you might just have a simple text file
comma separated values you could be
using a database and all of these things
are going to be different for your
application from other applications but
the engine needs to know one specific
way of representing this data so you'll
define in the data schema the set of
entity types that you're going to be
synchronizing entity is kind of like
class for objects and instances of
classes an entity is the type of object
and it'll have a set of attributes so
when you sync contacts you'll define a
contact entity a phone number entity
and you'll define all of this in a
schema that can be used by the engine
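As an illustration of moving from an application's private representation to a canonical one, here is a small Python sketch that turns comma separated values into records for a contact entity and a phone number entity. The entity names, field names, and record ID scheme are invented for this example and are not taken from any real schema.

```python
import csv
import io

# Hypothetical entity names, DNS-style so they would not collide with others.
CONTACT_ENTITY = "com.example.contact"
PHONE_ENTITY = "com.example.phone-number"


def rows_to_records(csv_text):
    """Convert the app's CSV rows into canonical records keyed by record ID."""
    records = {}
    for index, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        contact_id = f"contact-{index}"
        records[contact_id] = {
            "entity": CONTACT_ENTITY,
            "first name": row["first"],
            "last name": row["last"],
        }
        if row["phone"]:
            # The phone number is its own record that points back at its contact.
            records[f"phone-{index}"] = {
                "entity": PHONE_ENTITY,
                "value": row["phone"],
                "contact": [contact_id],
            }
    return records


csv_text = "first,last,phone\nAnn,Lee,555-0100\nBob,Ray,\n"
records = rows_to_records(csv_text)
```

Another client could store the same contacts in a database or an object graph; as long as both map to the same entities, neither needs to know about the other's format.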
another important point about this when
you sync you're not just syncing your
application to the sync engine you're
syncing with every other client that
synchronizes those clients don't want to
know how you represent your data and you
don't want to have to know how all these
other clients represent their data you
don't want to have to know how data is
represented on a phone you don't want to
know how it's represented on a server and
you don't need to you just need to know
one way to represent it and that's the
way that's defined in the schema the
second thing you have to do is configure
your application you could have the
greatest sync client in the world if you
don't configure it nobody's going to
know about it it's never going to run
configuration is actually fairly simple
it's a combination of a static file that
defines some properties and a very small
bit of API that you use by configuring
your client you're registering it with
the engine making sure that the data
schemas that you want to use are
registered and specifying any alerts
that you might want to get when other
clients synchronize it's pretty
straightforward the fun stuff is the
actual syncing control flow we have a
syncing state machine API that you use
we have a fairly object-oriented API in
general but the actual act of data
synchronization is a state machine it's
a procedural API with methods you invoke
on a sync session object we went with a
state machine with the procedural API
for flexibility by providing an API like
this there's a lot of different
junctures and points where you can opt
in and out of the sync you have a lot of
flexibility on how you want to do the
sync if you get involved in a sync
operation that starts to take too long
because other clients are also syncing and
they're taking a long time you can opt
out of the sync you can cancel the sync
if something goes wrong at any time you
can complete it in the middle and take
up where you left off on the next sync
you have to do the sync steps in the
order we're showing you here but other
than that you can leave after you push
your changes you can stop syncing before
you pull anything down or you can go all
the way through it's up to your
application I talked before about having
a data schema to represent your data in
a canonical format that's kind of the
class this is the object when you take
any of your data and want to transform
it over to the sync engine you'll be
creating essentially instances which we
call records of the schema types that
you've defined when you are pushing
records up you take the records or the
objects in your application you
transform them into records based on the
data schema and these are essentially NS
dictionaries you push these dictionaries
into the engine and when you get changes
back you'll be taking dictionaries or
sets of field changes that are
represented in something we call an
ISyncChange and you'll be transforming
that back into your own record types for
your app finally for fast syncing you
need to do some state management if you
want to fast sync you have to keep
track of the differences between the
first sync or the previous sync you did
and the current state of your data this
is for both adds and deletes modifies
being somewhat of a special case of ADD
if you've modified a record or added a
record you can put it on a list when you
delete a record you can put the ID in a
list when you go to sync you just
consult these lists and that's all
you'll have to hand to the engine I
added this i modified this I deleted
this if you don't do that you'll have to
give the engine every record so there's
a few different ways that you can
maintain state you could keep timestamps
on all your records you can keep a dirty
bit on the records you could have a list
of what records you've changed what you
do is going to be something that you'll
choose that's most appropriate for your
application's data model I find typically
keeping a list of what's changed is very
straightforward you'll also want to save
that list in a file
if you quit out of your application
without syncing that way when you
restart your application the next time
you'll still remember the deltas so
you'll know what you can sync there's a
few other things that you can do too when
you're syncing you're an application
whenever an application does anything
you shouldn't just go off and do
something without the user knowing
what's going on so you need to provide
some sort of feedback most sync
operations will hopefully be very brief
some of them might be so fast that the
user couldn't even detect them but
others may take a few seconds or longer
so if you're engaged in any kind of sync
operation make sure that you put some
animation up so the user knows what's
going on you can use the spinning bar
animation you could put up a status pane
you could just put some status text
somewhere so the user can look and see
what's happening users tend to get very
annoyed if an app becomes unresponsive
if you're syncing in the main loop of
your application the user might be
clicking in a text field trying to do
something even if it's just a second or
two seconds it'll get kind of
confounding to someone that's trying to
use your application if it doesn't
respond but if they see something
spinning that's kind of a cue ok
something's going on here and they'll get
used to seeing that and it'll be less
troublesome for them I mentioned before
about trickle syncing sync often and
then try to sync fast if you can do that
in an application it's going to be a
much smoother experience for the user so
I just told you what you have to do but
there's a lot of things you don't have
to do you're probably thinking boy I
have to do all these things what did you
guys do well we actually did quite a lot
and I think we did a lot of the heavy
lifting so that we're going to make it
fairly straightforward for an
application to synchronize and these
things that you don't have to do are
fairly complex first one is conflict
management you don't have to worry about
conflicts between different sources in
fact you can treat syncing almost as if
you're in isolation your application
just syncs into the engine if there's
conflicts with records from other clients
we'll take care of that we'll notice the
conflicts we'll keep track of them we'll
present UI to the user to resolve
them and then we'll handle merging in
the conflicts correctly
also you don't need to present what will
change to a user if you're going to make
a hundred changes if you're going to
remove a thousand records we'll pop up
something we call an airbag it's an
opportunity for the user to opt out of
that sync we'll tell them these changes
are about to be made by this sync client
so you won't have to worry about doing
anything to notify the user about
changes that you're making due to a sync
you don't have to detect duplicate
records sometimes when you sync two
sources for the first time and a very
common case is a phone and address book
or a phone with calendar data and a
calendar you'll have the same record on
both devices or both clients we'll detect
if those are duplicates and we'll handle
merging them together to present them as
one unified record so you won't end up
duplicating every contact you have just
because you synchronize the phone that
happened to have all the same contacts
on it already you don't really have to
pay attention to other clients or worry
about .Mac we've got a decoupled
architecture regarding all the clients
your client just syncs to the engine
you worry about that if you get that
right we'll fan all of your changes out
to the other clients we'll get the
changes from them into you we'll take
care of everything you don't have to do
anything special to sync to .Mac
.Mac will automatically be able
to sync your data types up and down to
other machines as long as you've defined
a schema for them ok now I'm going to
bring Nancy Craighill up to do a demo
she's going to illustrate many of the
things I just talked about you wired up
iclicker please sorry I like it and
there was on the slides first so we want
you to feel confident when you leave the
session that you too can write syncable
applications so we did select a
little more sophisticated example not a
trivial one so you can get the most out
of this session and also when you're
watching the demo and the rest of Gordy's
slides I think it's time to begin to
think about how you might modify your
existing apps to sync and how you might
create a new application that's syncable
okay so what you're gonna learn from the
demo and the rest of the talk really is
how to sync your custom objects you can
sync like Toby and Gordy said you can
sync all the contacts and calendars and
bookmarks but we think it's a lot
more exciting if you create your own
object models and you sync those
objects you're going to learn how to
sync relationships in your object model
and you're going to learn how to sync
your applications simultaneously if you
combine that with syncing your
applications often then you'll learn how
to trickle sync and the good news is
that these demos that you're seeing
today are available now on your Tiger
seed DVD so you can go to developer
example sync services and if you're gung
ho you can open up your Xcode project
now and you can follow along because
Gordy's going to actually show a lot of the
details later that relate to the schema
files and the client description etc so
because it's a sophisticated example I'm
just going to take a moment to just tell
you what the architecture is and the
object oriented at so of course you have
the sync engine and the truth database
at the center we have one app we call
events it's just going to import iCal
files and create custom event objects
the second application is media assets
and it's just going to go to any old
iPhoto library year folder and parse
those files and create custom media
objects and each of these applications
has their own local database store so
what does the object model look like
there's an event object corresponds to a
wedding or a birthday party and a media
object and it corresponds to a
photograph that was taken at an event so
naturally there's a to-one relationship
from media to events and a to-many
relationship from event to media so this
is a special relationship because if you
set the to-one relationship from media to
event you would expect that media
object to be added as one of the
destination objects of the to-many from
that event to the media so we call this
an inverse relationship
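A rough sketch of what keeping an inverse relationship consistent means, written in Python with hypothetical `Event` and `Media` classes. This is not the demo's code and not Sync Services API; it only shows that setting the to-one side should also update the to-many side.

```python
class Event:
    def __init__(self, title):
        self.title = title
        self.media = []          # to-many: event -> media


class Media:
    def __init__(self, title):
        self.title = title
        self._event = None       # to-one: media -> event

    @property
    def event(self):
        return self._event

    @event.setter
    def event(self, new_event):
        if self._event is new_event:
            return
        # Remove this media object from the old event's to-many side.
        if self._event is not None:
            self._event.media.remove(self)
        self._event = new_event
        # Add it to the new event's to-many side.
        if new_event is not None:
            new_event.media.append(self)


beach = Event("Mendocino")
party = Event("Birthday party")
photo = Media("Beach photo")

photo.event = beach     # beach.media now contains photo
photo.event = party     # photo moves from beach to party
```

Setting only one side and letting the other follow is the behavior you would expect from an inverse pair, whichever component ends up enforcing it.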
and the good news is that sync services
is supporting inverse relationships in
the sync engine and will maintain the
integrity of inverse relationships even
if you don't okay and then this is an
example of an event we went to Mendocino
to the beach and here's the photographs
in retrospect this slide is probably not
needed but i just wanted to show off my
photography but can we go to demo one
alright so this is the events
application it's a simple master-detail
interface I've already loaded the iCal
file by the way so for those of you in
the back row I'm going to zoom in okay
you see that event has the title
attribute start date and end date I'm
going to point out a couple of other
things down here you see the record ID
and below that is the client ID so each
application has its own client ID okay
another area I want you to look at over
here now when you do your
applications you're not going to have a
big old ugly sync button here and a
trickle check box but we added that
there because we're going to first show
this demo slowly step by step to show
you the process of what's happening
between the two apps and then we'll speed
up later by turning trickle syncing on
and we also implemented a calendar view
because you typically want to do your
events on a calendar so at this point
the events application has the local
event objects I'm going to push the sync
button there's going to be a progress
indicator that runs here and it's going
to push the local event objects out to
the truth database well that's the end
of the demo no
so this must be a tiger boat because I
can't hide it hide it like that let's
bring up the media assets app
go away okay the media assets app same
thing it's a master-detail interface I'll
import an iPhoto library of 2004
photos and so just down here you see
kind of it helps if i zoom in but
basically this title is one of the
attributes the date of the media objects
the image is just a URL and it's being
shown here below now there's an event
pulldown menu you can't tell that I'm
actually pushing the mouse on there
nothing is appearing because this
application doesn't know anything about
event objects right now again it has a
record ID under here for each record and
has a client ID okay so if I now push
the import button it's going to push the
media objects to the truth database and it's
going to pull over the event objects I
do that all the time sorry sync there
we go so the events just got pulled over
and populated into the menu so I have to
know that's Chinatown and this picture
was taken at Chinatown but I'm not
going to make you sit there while I set
all of these relationships so what we
did was we created a smart events
button and it's just going to run down
these media objects and assign them to
the most logical event matching the date
sub so if you push that button now all
my media objects have events so again
just to review we've just created to-one
relationships between the media and the
event objects and to-many relationships
from all the event objects to the media
again it's just local and I haven't
pushed it yet so if i push sync going to
push it out and then we'll bring up the
events application and the calendar view
and when I push sync here on the events
app it's going to pull both the media
objects and the to-one and the to-many
relationships and hopefully we'll see
them on the calendar there so let's turn
on trickle syncing on the events app and
then we go back to media we'll import
some more photos for February March
okay they're down at the bottom push
smart events create some more
relationships let's move this up to
February okay now when I push the sync
button the events app is set to trickle
sync so it's going to get an alert that
media assets is syncing and it will
begin syncing simultaneously so you have
to look quick there will be a progress
indicator over here will be another
progress indicator over here huh no soda
okay now we'll go back to go back to
events let's this is a multi-day event
Tahoe skiing trip let's say we want to
change the name of that but I want to
show this off so I'm going to find Tahoe
skiing over here too and let's change
that to winter trip now when I hit the
tab it's going to modify the local
objects it's set to sync about every
five seconds so it'll be a moment delay
it will sync when it updates the local
changes will update the Tahoe skiing
down here and then when it pushes the
changes out it'll update media assets if
I turn trickle syncing on haha alright I
haven't hit the tab button yet now I'm going to
hit the tab there goes and it didn't
work
there goes right it works okay so let's
have a little more fun this is a
multi-day event so let's move some of
the pictures let's say that this picture
was actually taken on the sixteenth and
it should appear over here on the date
of the sixteenth hit tab I'm waiting for
the sync there it goes for those
of you who missed it we'll do it one more time
we'll move this photo to the 18th hit tab
should sync up here go 18 ok two
applications sharing the same content
and syncing together that's it so we
just go back to the slides for a minute
okay you're all probably wondering a
little bit how its implemented so I'm
just going to cover that briefly
especially if you're looking at the code
so it does use the model-view-controller
paradigm and the models are the syncable
objects in the design there we go there
we go okay and we use Cocoa bindings of
course to update all of those changes
that are done locally in the app but in
addition when you're pulling all the
changes and applying them to the local
objects that's how the displays are
being updated we also used key value
observing which is the underpinnings to
Cocoa bindings and we use that to record
all of the local changes as Toby and
Gordy were saying you need to that's one
of your jobs to record all the changes
you make locally for pushing later and
we also found transformers that is the
NSValueTransformer class useful for
converting your models to records before
you push and then when you pull the
changes in from the sync engine you need
to apply them to your models and
sometimes when you get additions you
need to create models so use
transformers there we also use
transformers for resolving the
relationships so Gordy is going to show
that in more detail but the relationships
that come from the sync engine are not
what you expect you have to convert that
to actual references to your objects
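Here is a hedged sketch of that last step: applying pulled changes to local models and turning relationship values into real object references. It assumes, for illustration only, that a relationship arrives as a list of record IDs and that local models are plain dictionaries; the function name `apply_changes` and the `event` key are hypothetical, and the demo itself used transformer classes rather than this code.

```python
def apply_changes(changes, objects_by_id):
    """Apply pulled changes to local models and resolve relationships.

    changes: list of (record_id, changed_fields) pairs (assumed shape).
    objects_by_id: local models keyed by record ID, updated in place.
    """
    pending = []
    for record_id, fields in changes:
        # An addition has no local model yet, so create one.
        model = objects_by_id.setdefault(record_id, {})
        for key, value in fields.items():
            if key == "event":
                # Relationship values are record IDs, resolved in a second pass.
                pending.append((model, value))
            else:
                model[key] = value
    # Second pass: a record may refer to one that arrived later in the batch.
    # An ID that is unknown locally would raise KeyError in this sketch.
    for model, target_ids in pending:
        model["event"] = objects_by_id[target_ids[0]] if target_ids else None
    return objects_by_id


changes = [
    ("m1", {"title": "Beach photo", "event": ["e1"]}),
    ("e1", {"title": "Mendocino"}),
]
models = apply_changes(changes, {})
```

Resolving in a second pass avoids depending on the order in which related records show up.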
and i think that's it can go back to
Gordy for more details on how to a
little bit about how we did many of the
things you saw in the demo and Nancy
actually wrote the demo so she gets all
the credit but if I get it wrong you got
to forgive me but I do promise to do
better with the clicker now huh okay I
talked about the five main things you
need to do to sync let's just recap
quickly you'll set up a data schema
you'll have configuration for your
application then we'll have the main
sync loop before I mentioned that's
the meat of syncing for you vegetarians
out there that's the tofu of syncing
you'll have data transformation and
Nancy just touched on that a little bit
and then of course keeping track of your
data so that you can fast sync let's go
over that now let's look at the schema
so what goes into schema essentially
you're defining entities so in the
example we just showed you we had two
entity types media assets object which
had a picture a title and a date
associated with it and we also had an
event object which had a title and a
date we mapped the media objects to the
event objects we have attributes such as
the title and the date we also have the
relationships such as the relationship
from the media asset to an event and the
relationship back from the event to the
media asset now being a little bit
redundant but I want to make sure that
we didn't skip over anything here this
is kind of a pictorial representation
I'm just going to run through it from
the top just to tell you all the
different parts of a schema you start
off with a data class a data class is
actually somewhat of an informal
construct it's used to present what
you're syncing to the user so if your
application has a number of different
entities that you want to sync you can
group them together in one data class an
example of that is contacts and
calendars rather than specifying every
entity type to the user providing them
with way too much information you can
sort of summarize it by naming it in the
data class as I mentioned before you're
syncing entities so a data class
consists of a number of entities
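To picture how a data class, its entities, their attributes, and their relationships nest, here is a sketch that builds such a structure in Python and serializes it as a property list. Every key name ("Name", "DataClasses", "Entities", "Attributes", "Relationships", and so on) and every identifier under com.example is an assumption for illustration; the real schema format should be taken from the Sync Services documentation and the sample schema files.

```python
import plistlib

# Hypothetical schema layout: one data class grouping two entities.
schema = {
    "Name": "com.example.syncexamples",
    "DataClasses": [{"Name": "com.example.media"}],
    "Entities": [
        {
            "Name": "com.example.syncexamples.event",
            "DataClass": "com.example.media",
            "Attributes": [
                {"Name": "title", "Type": "string"},
                {"Name": "date", "Type": "date"},
            ],
            "Relationships": [
                # to-many from event to media, inverse of media's "event"
                {"Name": "media", "ToMany": True, "Inverse": "event",
                 "Target": "com.example.syncexamples.media"},
            ],
        },
        {
            "Name": "com.example.syncexamples.media",
            "DataClass": "com.example.media",
            "Attributes": [
                {"Name": "title", "Type": "string"},
                {"Name": "date", "Type": "date"},
                {"Name": "image URL", "Type": "string"},
            ],
            "Relationships": [
                # to-one from media back to its event
                {"Name": "event", "ToMany": False, "Inverse": "media",
                 "Target": "com.example.syncexamples.event"},
            ],
        },
    ],
}

xml_bytes = plistlib.dumps(schema)       # XML property list
round_trip = plistlib.loads(xml_bytes)   # parses back to the same structure
```

The user would only ever see the data class; the entities underneath are detail for the engine and for other clients.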
and then deconstructing further we can
see attributes these are primitive types
that you use basically what you would
put in an nsdictionary they represent
the different attributes of each entity
in a contact you would have name first
name last name include we just saw an
example in what we showed you you also
have the relationships and one other
thing that we didn't mention before
identity properties the first time you
synchronize a new object or a new record
from one source what we'll do in the
engine is compare it to all the records
we have from existing sources if we see
that it's the same record we won't
duplicate it that way and I mentioned
this before when you sync your phone
for the first time with address book
you're not going to duplicate every
entry that you've entered dutifully into
both the way we do that is by having
schema specify the identity properties
these can be attributes or relationships
you might want to scope the identity of
something through a relationship for
instance for a phone number you might
scope its identity through the enclosing
contact by specifying the relationship
from that phone number back to a contact
as well as the type and the value which
would both be attributes so you tell us
dynamically this isn't something that
you're stuck with it's not a static
description but it's something that you
put in your schema that we can use that
can be different for each data class or
each entity type you tell us what the
identity of an object is how to notice
that and we'll take care of mapping
duplicates so let's look a little bit
more at the sync schema it's a plist
straight up it's very straightforward
you'll have a name for your schema that
way any introspection tools any UI that can
be used to look at it will be able to
determine the exact name of the schema
also the engine has to be able to
identify schemas uniquely you don't want
two schemas with the same name so we
recommend that you use a DNS style name
here we have com.apple.syncexamples
as our name you'll have a set of data
classes usually you'll just have one
data class for instance with the contact
schema we just have the contact data
class but depending on the complexity of
your application and the choices you
make
to how you want to organize your schema
you could put more than one data class
in one schema it's up to you you have a
list of entities and that's really the
main thing that you're going to be
putting inside of the data schema so
let's look at an entity each entity also
has a name it's DNS qualified as well so
that it doesn't conflict with other
entities entity names are in a global
namespace they're not just mapped within
the schema that they exist inside of the
entities are treated as global that way
you can refer to an entity in another
schema if you want to extend something
or if you wanted to refer to a data
class in another schema that you're
adding an entity type to so you have to
make sure that you use a unique name you
specify the data class that your
entity is in you give it a display name
the display name would be used again by
any user interface something so that the
user doesn't get stuck looking at really
long weird disambiguated names in this
case media makes a lot more sense to the
user then you have your attributes
relationships and identity properties
and let's look at those attributes are
very simple I've just included two here I
put an ellipsis at the bottom because
this isn't everything it's kind of hard
to fit things and I hope you can see
this actually I've noticed in some of
the presentations it's hard from the back
to be able to see when we put code up or
any kind of text like this but in this
case i'm specifying two of the
attributes the date and the title this
is for an event object you specify the
name and the type the name is the field
that you're going to use in a record
dictionary that represents one of these
entity types and then the type is just
simply what it is here's a list of the
attribute types that you can use I just
put it here quickly for completeness
standard stuff that you can put into a
property list also you can use an array
or a dictionary as a primitive type but
you need to be careful we're doing field
level differencing if you have a record
and it's got five different fields will
difference those fields independently so
if one record from one source changed
field one another record from another
source changed field two that's not a
conflict will merge it together but if
one of your fields is an array
or a dictionary that entire collection
is going to be considered the atomic
unit for that field so if you make one
small change in that in one source and
another change in another that's going
to cause a conflict there are some cases
though where it's very convenient to be able
to use a collection but wherever you can
it's best to split up your attributes
into separate or you know use separate
attributes for each one of your semantic
fields there's a few additional types
calendar date just because it's so
useful you can't put calendar dates into
property lists but you can put them
inside of a record we have NSData in
case you want to take something like an
image or something that's your own you
know object type that's not represented
by one of these you can just sort of
stuff it into an NSData you can
also specify an enumeration of strings
this is useful to have a bounded set of
strings and the engine will actually do
some consistency checking for you so if
you want to have weekdays Monday Tuesday
Wednesday and so on you could specify
those rather than just saying string and
then possibly mistyping something you
can also have a URL which is very useful
to reference things elsewhere so let's
look a little bit at a relationship
relationship starts off with a name just
like an attribute does it has a display
name now I didn't show this for the
attribute because it wouldn't fit on the
slide but both attributes and
relationships have a display name that
way as tools are developed that can do
introspection into these things you can
display something a little bit more
meaningful than the normal name that
you'll pick notice that the names of the
attributes and relationships don't need
to be DNS qualified their scoping is
local to the entity that they reside in
for a relationship you specify whether
it's one-to-one or one-to-many we have
a one-to-one relationship in our example
from a media asset object back to an
event each media object corresponds to
one event however the events can have
many objects they can have many media
objects so in one direction we're
specifying a one-to-one relationship and
the other it's one-to-many we're showing
the media right here this is the
relationship I probably should have
mentioned this this is the relationship
in a media record back to an event
you specify the target type this is the
fully qualified target type of event so
here we're just simply specifying we've
got a one-to-one relationship to an
event I did a lot of talking that was
actually something that's fairly simple
you can also specify an inverse
relationship these are very useful when
you want the engine to do some
consistency checking for you if you've
set up a media asset to point back to an
event you want that event to contain
that media asset similarly if you move a
media asset's relationship from one event
to the other you want to make sure that
it's unwired from that first event and
wired into the second one you can
specify an inverse relationship now this
is a little tricky to look at outside of
context if you look at the examples that
we've provided and you look at the
entire schema you'll be able to see how
this is wired up a little more clearly
clearly if I could say that clearly
there's a metaphor in there somewhere
we have the entity name for the inverse
relationship and the name of the
relationship that's back so we're saying
a media object has a relationship to an
event and then the inverse relationship
is from the event's media relationship
field this is just how you specify
identity properties it's very simple
it's just a list of attributes and
relationships that are being used for
the identity for that record in this
case we're using the date and the title
of an event to identify it uniquely okay
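To make the identity-property idea concrete, here is a minimal Python sketch of a schema fragment and of how an engine might use its identity properties to recognize a duplicate. It is only a model of the behavior described in the talk: the plist key names (`Name`, `DataClasses`, `Entities`, `IdentityProperties`, and so on), the `com.example` names, and the functions `identity_key` and `find_duplicate` are all assumptions made for illustration, not the actual Sync Services schema format or API.

```python
import plistlib

# Hypothetical schema fragment. Key names are illustrative assumptions,
# not the documented Sync Services schema keys.
SCHEMA = {
    "Name": "com.example.syncexamples",
    "DataClasses": [{"Name": "com.example.media"}],
    "Entities": [
        {
            "Name": "com.example.syncexamples.Event",
            "DataClass": "com.example.media",
            "DisplayName": "Event",
            "Attributes": [
                {"Name": "date", "Type": "date"},
                {"Name": "title", "Type": "string"},
            ],
            "Relationships": [
                {"Name": "media", "Ordinality": "many",
                 "Target": "com.example.syncexamples.Media"},
            ],
            # The event is identified by its date and title.
            "IdentityProperties": ["date", "title"],
        }
    ],
}

# A schema is a property list, so it can be serialized as one.
SCHEMA_PLIST = plistlib.dumps(SCHEMA)


def _freeze(value):
    """Make relationship arrays hashable so they can be part of a key."""
    return tuple(value) if isinstance(value, list) else value


def identity_key(schema, record):
    """Build a comparable key from the entity's identity properties."""
    entity = next(e for e in schema["Entities"]
                  if e["Name"] == record["EntityName"])
    values = tuple(_freeze(record.get(p)) for p in entity["IdentityProperties"])
    return (record["EntityName"],) + values


def find_duplicate(schema, new_record, existing):
    """Return the id of an existing record with the same identity, or None."""
    wanted = identity_key(schema, new_record)
    for record_id, record in existing.items():
        if (record["EntityName"] == new_record["EntityName"]
                and identity_key(schema, record) == wanted):
            return record_id
    return None
```

In this model a record arriving from a new source is matched to an existing one whenever every identity property agrees, even if other fields differ, which is the behavior that keeps a first phone sync from duplicating the address book.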
so let's talk about what you're syncing
I just described the classes now let's
talk about the instances of those or the
records when you're syncing an object
you have to push up two things a record
dictionary and a unique identifier for
it now the identifier must be unique
across all of the entity types that
you're synchronizing so if you have
contacts and you have phone numbers you
can't use the same identifier for a
contact that you use for a phone number
just because they're different entity
types you always have to make sure that
all of your identifiers are completely
unique however you don't have to worry
about other clients your client has its
own namespace for all of its identifiers
you need to put the entity name in a
record that's essentially like an isa
pointer back to a
class inside of an object by specifying
that the engine now knows what kind of
record it's dealing with if you didn't put
that in the record we'd look at this
nsdictionary it'd be filled with all
kinds of great fields but we wouldn't
know what it was so you always have to
make sure that you put the entity name
in everything that goes in that record
is just a set of key value properties
for an attribute it's one of the types i
showed you before so it's just a
straight up dictionary for a
relationship if it's a one-to-one
relationship you'll have an array with
one element in it if it's a one-to-many
relationship you'll have an array with
zero or more elements the reason that a
one-to-one relationship still uses an
array is for consistency so you don't
have to have code that's doing isKindOf
all over the place to see if this is
an array or just a singleton object
relationships are specified by using the
unique record identifier of the target
now one thing about record identifiers
often when you're using a relational
database you can construct a unique
identifier with some combination of your
primary key and your record type in the
database so you might want to use what
you have for a primary key as part of
the record identifier if you do that you
may not want to put those fields into
the dictionary for the record or into
your schema because it's redundant
you'll be using them for the record
identifier there's no reason for you to
also put them inside of the record
itself so let's look at what an
application sees a user will look at an
application and they'll be presented
with some kind of visual representation
of your objects in the application
you've got your own objects internally
like I said before these could be structs
they can be objects it can be
constructed out of strings could be
whatever you want when you're syncing
you need to transform those into records
so these are probably a little hard to
see from the back but these are just
straight up NSDictionaries and
that's what you're going to be syncing
back and forth to the engine so let's
look at one in more detail this is a
media record in white I have the actual
name of the entity and then I'm just
highlighting the relationship to
separate it from the attributes very
straightforward it's just a dictionary
we have
an array for the event which is a
relationship back for the enclosing
event hey Lou okay this is what an event
looks like and the only difference here
is that it has a list of media objects
since it's a one-to-many otherwise very
similar so we have a very regular way of
specifying all the records when you're
syncing you don't have to worry about
pushing up different objects in
different ways everything eventually
grounds down to just being a dictionary
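As a rough illustration of the record shape just described, the following Python sketch builds a media record and an event record as plain dictionaries. The key name `EntityName`, the `com.example` entity names, the `Type/primary-key` identifier scheme, and the helper functions are assumptions for the example; the talk only establishes that each record carries its entity name, that relationships are arrays of target record identifiers (even for to-one relationships), and that identifiers must be unique across all of a client's entity types.

```python
ENTITY_KEY = "EntityName"  # hypothetical key for the entity-name field


def record_id(record_type, primary_key):
    """Combine record type and primary key so ids never collide across types."""
    return f"{record_type}/{primary_key}"


def make_media_record(title, event_pk):
    return {
        ENTITY_KEY: "com.example.syncexamples.Media",
        "title": title,
        # To-one relationship: still an array, holding a single target id.
        "event": [record_id("Event", event_pk)],
    }


def make_event_record(title, date, media_pks):
    return {
        ENTITY_KEY: "com.example.syncexamples.Event",
        "title": title,
        "date": date,
        # To-many relationship: an array with zero or more target ids.
        "media": [record_id("Media", pk) for pk in media_pks],
    }


def check_record(record, to_one=(), to_many=()):
    """Return a list of shape problems; an empty list means the record looks fine."""
    problems = []
    if ENTITY_KEY not in record:
        problems.append("missing entity name")
    for name in tuple(to_one) + tuple(to_many):
        value = record.get(name, [])
        if not isinstance(value, list):
            problems.append(f"{name} must be an array")
        elif name in to_one and len(value) > 1:
            problems.append(f"{name} is to-one but has {len(value)} targets")
    return problems
```

Because the primary key is already part of the identifier, it is deliberately left out of the record dictionary itself, following the advice above about avoiding redundant fields.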
so now let's talk about configuration we
know how to describe the schema for the
data in our client now what we're going
to do is we're going to set up a client
description property list this is also a
plist file it statically describes the
characteristics of your client so what
does it have we've got a list of the
entities and the properties in those
entities now you might have a data
schema that you're sharing with other
applications and there could be a whole
slew of entities in there and a lot of
properties because you're trying to
cover all the bases this particular sync
client that you're writing may not use
all those entities it might not use all
of those fields that's fine in your
client description property list you'll
specify the subset that you use that way
the engine knows which entities and
which fields to be giving to your client
and it also knows what to expect from
your client so it won't erroneously
delete things just because your client
doesn't pass up certain attributes you
can specify whether entities are push or
pull only most of the time you'll be
both pushing and pulling entities you'll
be contributing to the pool of data
you'll be pulling in changes but
sometimes for instance in the case of an
iPod you'll only be pulling things down
the engine can make certain
optimizations in its data store when you
give it that information you also
specify what type of clients you want to
sync with so if you're an application
typically when other applications sync
the same data types that you do you want
to get notified so you can sync when dot
mac syncs you'll want to start up when
devices sync you'll want to start up
and we'll show that for our examples
here we specify that we wanted to sync
when dot mac syncs and also when each of the
other applications sync
here's a look at a property list very
straightforward we have the display name
for the client and you can also specify
if I can get this to go here an image path
when you have a user interface that
presents a list of clients which we
provide for you it's nice not only to
have the name of the client but to have
some kind of an icon that represents it
often it will be the same as your
application icon but in some cases you
might choose something different
something that sort of illustrates that
this is data that you're synchronizing
so you can specify an icon relative to
the path of this property list and that
way we'll present something nicer to the
user than just simply text we also have
a list of the entities as I described
here we're specifying the event entity
and the fields that we're going to
synchronize I'm kind of going through
these fast because this is pretty
straightforward stuff okay let's talk
about syncing now this is the
interesting part when you're going to
synchronize the first thing you need to
do is register your data schema now it's
not that expensive to re-register the
schema every time your application
starts you don't want to do it every
time you sync if you can avoid it but
what you can do is start your
application and just register the schema
without worrying if it was already
registered typically your data schema
file isn't changing so all this amounts
to is a quick stat by the sync server it
checks to see if there's any differences
and if there's not it actually doesn't
do anything so when your application
starts up make sure that your schemas
are registered that you're using then
you need to register your client now I'm
sorry I screwed up the slide I've done
this so many times let's talk a little
bit more about registering the schema
I'm just going to show you the code now
I don't know if you can see this
and if you can't i'll talk through it a
little bit more but it's very
straightforward what you'll do is you'll
keep your schema in a bundle in your
application you might as well keep the
schema localized if it corresponds to
your application now we mentioned that
sometimes data schemas are decoupled
from applications certainly from the
engine's point of view it doesn't make
any assumption that a schema corresponds
to any one given application if you're
just providing a schema in your app
for your own use you can keep it inside
of your resources if you're not you
might want to put it inside of a
framework somewhere so other
applications can access it once you have
it it's a simple path and you make a
call into the sync manager now I'm
introducing the sync manager API here
there's just a few objects that you'll
need to use in order to effect the sync
and the sync manager as the name implies
does mostly management type of functions
you use it to register your schema you
use it to register your client as you'll
see very straightforward call you just
pass in the path and you're done now the
second thing I started to talk about is
registering your client now you only
need to do this if you haven't
registered it before so when we look at
the code if I can get to it I'm convinced
somebody's got a voodoo doll for this
clicker somewhere sure there we go okay
so when you're registering a client you
check to see if the client is already
registered so what you do is by
specifying the client's identifier you
ask the sync manager for your client
object if you get it back great you're
done you could just return if not then
you'll need to register it to register
it is very similar to registering a
schema you just tell the sync manager
you want to register a client now the
one difference is when you register a
client you provide an identifier provide
both the client description file and the
identifier so that you can refer to it
again in the future for instance to
start a sync operation and the other
difference is that you'll get a client
object back so you can then use that to
proceed through the sync operation this
is really hard okay okay the second
thing you'll do when you register a
client is specify an alert handler this
is pretty straightforward we're just
doing this so that we can sync when
other applications or servers sync in
the example that we had we specified
programmatically to the engine that we
wanted to synchronize when applications
or when servers synced we also specified
an alert handler that gets called inside
of our application so you've got a
running application while it's executing
if some other sync
server or application goes off to sync
you want to get notified so you can join
in and sync at the same time that'll
happen in the main run loop of your
application and it will happen just with
a simple callback that we're specifying
here I should note that you could also
specify what types of clients you want
to sync with in your client description
property list it didn't really fit on
the screen when I made an example of
that before and I also wanted to
highlight that you can do it
programmatically ok once you've got all
that done you're ready to sync data so
let's see what we have to do for that
the very first time you sync you need
to do a slow sync you do a slow sync
because you don't really have any basis
to compare to so you're going to push up
every record you have if you've done
that you'll be able to fast sync the
next time and we've pointed out that you
want to try to trickle sync as often as
possible and when you trickle sync
you only want to push up deltas to make
it fast when your application is
launched you'll want to sync now when we
showed the demo before we had a checkbox
for trickle sync and a button for
syncing in a real application you
wouldn't have a sync button nor would
you have that check box you would just
have trickle sync behavior all the time
when your application starts up it would
make sure that it had the most current
set of changes so it would sync
immediately and pull changes down
similarly before you exit you want to
synchronize if you synchronize before
you terminate your application that
ensures that changes the user has made
are not only flushed to a data file and
saved but they're also synchronized out
to the rest of the world I've really got
to figure this thing out it's like if I
pointed somebody over there it seems to
go ok so we mentioned before that a sync
session is a finite state machine so you
have a certain set of steps you go
through i want to point out that pulling
changes down is optional if you want to
you can just push changes up so when
would you want to do that possibly when
your application is exiting you'll just
push your changes up and quit when users
quit an application they don't want the
application to sit there
forever while they're waiting for it to
sync they want it to just get its
business done and exit quickly so you
want to make sure you don't spend a lot
of time when an application is
terminating doing a full sync
operation so you can opt out of this at
any point and in this case you would
just push your changes up and then exit
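One way to picture the session as a finite state machine is the small Python model below. The state names and the `SessionModel` class are invented for this sketch and are not part of the Sync Services API. It encodes only the two rules stated in the talk: the steps run in a fixed order, and a client may stop early, for example after pushing when the application is quitting.

```python
from enum import Enum, auto


class State(Enum):
    NEGOTIATING = auto()
    PUSHING = auto()
    MINGLING = auto()
    PULLING = auto()
    FINISHED = auto()


ORDER = [State.NEGOTIATING, State.PUSHING, State.MINGLING,
         State.PULLING, State.FINISHED]


class SessionModel:
    """Toy session: steps must run in order, but finishing early is allowed."""

    def __init__(self):
        self.state = State.NEGOTIATING
        self.history = [self.state]

    def advance(self, target):
        if self.state is State.FINISHED:
            raise ValueError("session already finished")
        position = ORDER.index(self.state)
        # Either move to the very next step or opt out by finishing.
        if target is not State.FINISHED and ORDER.index(target) != position + 1:
            raise ValueError(f"cannot go from {self.state.name} to {target.name}")
        self.state = target
        self.history.append(target)


def push_only_sync(session):
    """Quit-time sync: push local changes, skip mingling and pulling."""
    session.advance(State.PUSHING)
    session.advance(State.FINISHED)
    return session.history
```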
you do have to execute these steps in
order though so let's look at them in
more detail when you start a sync
session you can specify a blocking call
or a non-blocking call a blocking call
would typically take a timeout you don't
want to call into a blocking method and
then just wait forever and have your
user locked out if you call the blocking
call from the main run loop you
typically want to specify the time out
around two seconds after 2 seconds
you're going to get the little spinning
beachball so you can specify an extra
second and hope it doesn't happen but
typically you want the sync operation to
start quickly or you're going to bail
out of it with a non-blocking call you
give a call back into the engine make a
call that returns immediately and then
at some point in the future the sync
operation will start now there's a
couple of issues you have to be careful
about one is responsiveness that I just
mentioned you don't want to go off
forever waiting for a sync to start
secondly if a user goes and makes
modifications to data you want to make
sure that you use the data at the point
the session actually starts so if your
application decides to sync and makes
that call to start a sync session don't
collect any data to use in the sync wait
until the sync session actually starts
and then you can use it if you have a
blocking call don't loop if you call
this blocking method and it returns
without being able to sync it actually
returns yes or no whether or not you've
got a session don't immediately call it
again first of all you would just be
banging on the engine typically the
reason it returned no is because another
client is syncing the same entity types
as you so if you have a phone that's
synchronizing and then address book
decides it wants to trickle sync but
the phone is already synchronizing it
could take a while for that device to
finish so if you ask the question to the
engine to start a session and it returns
no you probably want to wait a
sufficient amount of time before you try
again
or an even better approach is to just
use the non-blocking call all the time
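The advice above can be modeled in a few lines of Python. Here the engine is replaced by a hypothetical `EngineModel` that allows one session at a time using a lock; none of these names come from Sync Services, and the 30 second retry delay is an arbitrary placeholder. The sketch shows three points from the talk: a blocking start takes a short timeout, a failed start is retried later instead of in a tight loop, and data is collected only once the session has actually started.

```python
import threading


class EngineModel:
    """Hypothetical stand-in for the sync engine: one session at a time."""

    def __init__(self):
        self._busy = threading.Lock()

    def begin_session(self, timeout):
        """Blocking start: True if a session began within `timeout` seconds."""
        return self._busy.acquire(timeout=timeout)

    def begin_session_async(self, on_start):
        """Non-blocking start: returns at once, calls `on_start` when ready."""
        def wait_then_start():
            self._busy.acquire()
            on_start()
        worker = threading.Thread(target=wait_then_start, daemon=True)
        worker.start()
        return worker

    def finish_session(self):
        self._busy.release()


def try_sync(engine, collect_and_sync, timeout=2.0, retry_delay=30.0,
             schedule=None):
    """Attempt one sync; on failure schedule a later retry instead of looping."""
    if engine.begin_session(timeout):
        try:
            # Gather data only now, so edits made while waiting are included.
            collect_and_sync()
        finally:
            engine.finish_session()
        return True
    if schedule is not None:
        schedule(retry_delay, lambda: try_sync(
            engine, collect_and_sync, timeout, retry_delay, schedule))
    return False
```

The `schedule` parameter stands in for whatever timer facility the application already has, so the retry happens from the run loop rather than by spinning on the engine.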
so here's some code to begin a session
and notice at the very top the first
thing that we do is we save our file now
the code snippets i'm showing you are
actually taken from the demo so
here we save our data before we start a
sync you don't want to synchronize data
you haven't saved in a file because the
next time your app starts if it was
unable to save you're going to be out of
sync no pun intended with the engine so
the first thing to do when you're going
to synchronize save all your data and
then at the bottom here you can see that
we're starting a session now I actually
committed an egregious crime here I
specified five seconds I did that just
for testing and I ended up leaving it in
the slide by mistake i'm using a
blocking call and specifying a time long
enough that the beach ball is going to
come up as it takes more than two
seconds that's going to annoy a user so
you typically want to keep that limited
to two seconds or less when you use the
blocking call the next thing you need
to do once you've actually established
the sync session is negotiate do you
want to do a slow sync or a fast sync
so i'm going to show a little code here
from our app even though we mentioned
that you need to do negotiation up front
and we sort of show it at the beginning
of the sync operation you can spread a
little bit of it out if it's more
natural for your app and you'll see how
we actually did spread that out now in
the previous talk we discussed the
syncing modes and then talked about how
we don't actually have a call into the
engine i want to fast sync or i want to
slow sync there's no call back to your
client asking it what it wants to do we
actually have a set of methods that you
can use because it's much more flexible
so in this case we're checking to see if
we want to do a refresh sync so we
would do a refresh sync in our example
if we lost our data file so if we start
our example code up and our data file's
gone we'll refresh sync that way we'll
restore everything from the engine we
also do a slow sync in some cases we do
a slow sync the very first time we ever
synchronize also we catch errors with
exceptions during the sync operation
if anything goes wrong during a sync
operation on the next sync we force
ourselves to do a slow sync
so in this case we're telling the
session whether or not we've actually
reset all the entity names so that we
can refresh that's the first line or in
the second if clause we're actually
checking and we're telling the engine
that we want to push all of our records
which essentially amounts to a slow sync
but besides what your client wants to do
you also have to ask the engine what it
wants you to do a user might have gone
and said I want to reset every client
from dot mac at that point when you
sync the engine won't want you to push
any records you're going to be reset
also something might have gone wrong
during the sync operation from the
engine's perspective so the next time you
sync it will want you to do a slow sync
and we'll see where we ask those
questions and then the API as we proceed
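The negotiation described here combines what the client wants with what the engine wants. A minimal Python sketch of that decision follows; the function name, its parameters, and the mode strings are all assumptions for illustration, and treating an engine-requested reset as a refresh is a simplification made for this example rather than something the talk states.

```python
def choose_sync_mode(data_file_exists, has_synced_before, last_sync_failed,
                     engine_wants_reset=False, engine_wants_slow=False):
    """Pick a mode from the client's own state plus the engine's requests.

    Returns "refresh" (take everything from the engine, push nothing),
    "slow" (push every record), or "fast" (push only deltas).
    """
    # The engine's reset request wins: the client is about to be replaced.
    if engine_wants_reset:
        return "refresh"
    # Client lost its data file: restore everything from the engine.
    if not data_file_exists:
        return "refresh"
    # No basis for comparison, or something went wrong last time.
    if not has_synced_before or last_sync_failed or engine_wants_slow:
        return "slow"
    return "fast"
```

A client following the talk's example would record `last_sync_failed` whenever an exception interrupts a sync, so the next session falls back to a slow sync automatically.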
now we're going to go push changes very
simple flow chart here just to show you
what we're doing if we want to push all
of our records then we're going to get
every record that we have convert it and
push it up to the engine otherwise we'll
just push up the deltas I want to point
out one thing when you're pushing
records you don't have to push an entire
nsdictionary record you can push
something we call an ISyncChange in
this case we've got one change to a
record we've changed the title of an
event to sean's birthday so I've got a
very small ISyncChange object but if I
wanted to push up the entire record I'd
have to push up all the relationships
and all the other fields and you can see
that's a lot more information and that
can add up if you're doing that for
every record and you have a large amount
especially when you're doing an initial
sync so if you push up an ISyncChange
you'll save the engine a lot of time
walking through an entire record trying
to figure out what exactly has changed
in it now here's our code this is a
little bit dense I pointed out that
negotiation is actually split up I
showed you a previous slide where we
told the engine our intentions but you
also have to ask the engine what it
wants you to do so we're asking it if we
should actually push changes for this
entity type we're walking through all of
our entity types so in this case media
asset objects and event objects we're
going to ask the engine should we push
them there's a few reasons why it might
tell you you shouldn't it might be
resetting you from
state somewhere else or you might have
in configuration told the engine I'm not
going to actually sync this entity type
a user should be offered the choice you
might only want to sync events and not
media assets so if you turn one of them
off you don't have to remember it you
don't have to keep track anywhere the
engine knows that you've disabled it so
when you ask it if you should push
changes it'll say no if it wants you to
push changes then you need to ask it if
it wants you to push all changes so I
just mentioned before sometimes the
engine needs you to push everything and
do a slow sync this is where you ask it
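The push logic described so far can be sketched in Python as follows. `EngineStub`, its methods, and `push_changes` are hypothetical names standing in for the session object and the demo code; they are not the real API. The sketch asks the engine, per entity, whether to push at all and whether to push everything, then sends either whole records or only the changed fields.

```python
class EngineStub:
    """Hypothetical engine: answers the two questions and collects pushes."""

    def __init__(self, disabled=(), demand_all=()):
        self.disabled = set(disabled)      # entities turned off or being reset
        self.demand_all = set(demand_all)  # entities the engine wants in full
        self.pushed = []

    def should_push(self, entity):
        return entity not in self.disabled

    def should_push_all(self, entity):
        return entity in self.demand_all

    def push(self, record_id, payload):
        self.pushed.append((record_id, payload))


def changed_fields(old, new):
    """Delta between two versions of a record: only fields whose value differs."""
    keys = set(old) | set(new)
    return {k: new.get(k) for k in keys if old.get(k) != new.get(k)}


def push_changes(engine, records, snapshot, client_wants_all=False):
    """`records` and `snapshot` map record id to record dict (current vs last sync)."""
    for entity in sorted({r["EntityName"] for r in records.values()}):
        if not engine.should_push(entity):
            continue
        push_all = client_wants_all or engine.should_push_all(entity)
        for record_id, record in sorted(records.items()):
            if record["EntityName"] != entity:
                continue
            if push_all or record_id not in snapshot:
                engine.push(record_id, dict(record))    # whole record
            else:
                delta = changed_fields(snapshot[record_id], record)
                if delta:
                    engine.push(record_id, delta)       # just the change
```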
now the rest of this code is pretty
straightforward we're just walking
through all of the records either every
record if we're doing a slow sync and
pushing them all or just the changed
records and we're going through and at
the bottom you can see the call to
session to push the changes from that
record so we're just simply pushing it
up specifying a unique identifier at the
very bottom I didn't I wasn't able to
fit this on this slide so i'm going to
show you another slide that's the
continuation here if you're doing a fast
sync you need to explicitly delete your
records so what we do here is check to
see if we've kept track of state and we
have a list of any deleted records if we
do and the engine isn't forcing us to do
a slow sync then we push the record
deletes up now why wouldn't we push
the deletes up if we're doing a slow sync
the reasoning is the engine knows what your
previous state was so if you push up
your entire set of records anything you
don't push up it treats as a delete now
we come to the fun part mingling
mingling is actually pretty simple
you're going to tell the engine that
you're ready to pull changes and then
the session is going to enter into the
mingling state if other clients are
syncing at the same time the engine is
not going to return until they've
finished pushing all their changes and
it's been able to mingle from all
sources so this can take a while the
engine is going to be doing field level
differencing here so it's going to be
walking through all the changes that
come in from every source that's
syncing at this time and it's going to
compare them on a field by field basis
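Field-level differencing can be illustrated with a small three-way merge in Python. This is a sketch of the general technique, not the engine's actual algorithm, and `merge_fields` is a name made up for the example. Each field is compared independently against a common base: changes to different fields merge cleanly, while two different changes to the same field are reported as a conflict. A collection stored in one field is compared as a whole, which is why array or dictionary attributes conflict more easily.

```python
def merge_fields(base, source_a, source_b):
    """Three-way merge of one record, field by field.

    Returns (merged_record, conflicting_field_names). A field missing
    from a source reads as None in this simplified model.
    """
    merged = dict(base)
    conflicts = []
    for key in sorted(set(base) | set(source_a) | set(source_b)):
        base_value = base.get(key)
        a_value = source_a.get(key)
        b_value = source_b.get(key)
        a_changed = a_value != base_value
        b_changed = b_value != base_value
        if a_changed and b_changed and a_value != b_value:
            # Same field changed two different ways: leave the base value
            # in place and report the conflict.
            conflicts.append(key)
        elif a_changed:
            merged[key] = a_value
        elif b_changed:
            merged[key] = b_value
    return merged, conflicts
```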
you can call this blocking or non
blocking if you call it blocking again
you have the same set of issues you want
to make sure that
you're being responsive so you don't
want to go in and start mingling and
waiting too long for the user in a
blocking call if you call it non
blocking though you have to be careful
if the user makes modifications to data
while you're syncing the engine is going
to be pushing changes down to you that
are predicated on the state you had
before you started mingling if the user
has made changes to any of the records
that the engine also has changes for you
typically want to let the user win so
you need to have some mechanism in your
application to keep track of changes
that are being made during a sync
operation and make sure they override
any changes that come down from the
engine so here's our call we're going to
make a blocking call and first we're
going to build a filtered list of all
the entities this is similar to what we
did when we pushed entities we asked the
engine should I pull these entity types
if they're disabled or if the engine
wants to be reset from your client it
won't tell you to pull anything so you
may end up with an empty list then you
simply ask the engine to prepare to pull
changes that's the call that goes into
the mingling state now I committed an
even worse sin here can anybody spot it
and if you can read this you'll see I
specified the date distantFuture so
this application is going to wait
forever now we actually didn't do that
in the example I think I took this slide
from an older version of it this is an
example of something to be very careful
about here your application is just
going to go and wait for the engine
forever so if something takes an
incredibly long time in another client
your client is going to be penalized
finally we get to pulling when you pull
records down you need to ask the engine
if you're supposed to replace every
record if the engine is trying to
replace everything because the user said
I want to pull everything down from dot mac
and clear everything and replace it from
there on my machine it's going to tell
you to replace all of your records if
that's true don't go and whack your data
store because you don't know if the sync
is going to be successful anything could
go wrong the user could pull the plug on
the machine something could happen that
could cause your application to crash
instead just internally set all your
data aside and take everything you get
from the engine and only when you're
able to save that completely should you
then throw out
other data this is pretty
straightforward you get an enumerator
when you're pulling changes the
enumerator is pretty much like an array
enumerator except it contains ISyncChange
objects I talked a little bit
about these before this is an object
that just contains a list of the changes
for a record it's often more efficient
particularly when a record has a lot of
attributes and there's only a few
changes however sometimes with your
client the logic might be simpler if you
could just get the entire record from
the engine and just map it right on to
what you have in your client you can do
that by pulling the complete record out
of an ISyncChange so you have your
choice as to whether or not you want to
use the fields that are indicated in the
change or whether you want to pull out
the entire record we walk through the
change enumerator we pull out each
ISyncChange with nextObject and then
we apply each one based on the change
type so let's look at that when we're
applying the changes we're going to use
the two-phase commit for each change
that we get from the engine will either
accept reject or ignore it if we accept
it the engine will assume that we've
taken it and won't give it to us again
assuming we successfully complete the
sync session if we reject it the engine
will assume for some reason we don't
want it perhaps you have a device or an
application that doesn't fit every field
on it so some of the fields that are
handed to you you just can't use so you
reject that change so you don't keep
getting hit with it if you don't do
anything the next time you sync the
engine is going to push that record at
you again now after you've gotten every
record and successfully applied them or
rejected them if that's what you want to
do you tell the engine to commit at that
point the engine says great whatever
this client just told me I'm going to
believe is the truth and i'm going to
keep track of this so the next time you
sync everything you've accepted doesn't
get pushed to you again everything you
rejected doesn't get pushed to you again
but if you don't make it to commit if
your application blows up before that or
something goes wrong the engine will
then try to push all those changes back
to you on the next sync and that's good
because if something goes wrong you
don't want to lose all that data
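
The accept, reject, ignore, and commit behavior described above can be modeled with a small toy. This is a sketch in Python under stated assumptions, not the engine's real implementation: `ToyEngine` and its methods (`changes`, `accept`, `reject`, `commit`, `abandon`) are hypothetical names, and records are reduced to bare ids.

```python
class ToyEngine:
    """Hypothetical model of the engine's bookkeeping for one client."""

    def __init__(self, pending):
        self.pending = set(pending)   # record ids the engine still owes the client
        self._accepted = set()
        self._rejected = set()

    def changes(self):
        return sorted(self.pending)

    def accept(self, record_id):
        self._accepted.add(record_id)

    def reject(self, record_id):
        self._rejected.add(record_id)

    def commit(self):
        # Only at commit does the engine believe the client's answers.
        self.pending -= self._accepted | self._rejected
        self._accepted.clear()
        self._rejected.clear()

    def abandon(self):
        # Session ended without a commit (crash, error): forget the answers.
        self._accepted.clear()
        self._rejected.clear()


engine = ToyEngine(["a", "b", "c"])

# First session: the client answers for "a" and "b" but dies before commit.
engine.accept("a")
engine.reject("b")
engine.abandon()
after_crash = engine.changes()

# Second session: same answers, "c" is ignored, and this time we commit.
engine.accept("a")
engine.reject("b")
engine.commit()
after_commit = engine.changes()

print(after_crash, after_commit)
```

In this model everything comes back after the failed session, and after the committed one only the ignored record is pushed again.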
let's look at the code for it when we sync
we look at the change type we switch on
it if we have an add or a modify you
simply try to apply it what we're using
is a transformer here if we're able to
successfully transform it then we tell
the engine that we've accepted the
record otherwise we tell the engine that
we're refusing the record so in our
example if there was something malformed
in a record something wrong with it we
would reject it otherwise if the record
looked good we would accept it you do
the same thing for deletes you take the
delete down check for consistency if
you've got that ID for that record it's
referring to you just delete it once
you're done going through all of that
make sure that you save all the data to
your data store before you commit to the
engine you don't want to tell the engine
sure I've taken all your changes because
you've got them in memory and then crash
and not actually write them out to a
file so we're just showing that here and
then at the very end we tell the session
that we're finished and we're all done
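
The slide code is not in the transcript, so here is a minimal sketch of the same flow in Python rather than the session's own code. Everything named here is an assumption for illustration: `apply_changes`, `RecordingSession`, the dict-shaped changes, and the JSON file store. It also folds in the earlier advice about replace-all: new data is staged separately and the old store is only swapped out once the save succeeds.

```python
import json
import os
import tempfile


class RecordingSession:
    """Hypothetical session that just logs what the client tells the engine."""

    def __init__(self):
        self.log = []

    def accept(self, rid):
        self.log.append(("accept", rid))

    def reject(self, rid):
        self.log.append(("reject", rid))

    def commit(self):
        self.log.append(("commit", None))

    def finish(self):
        self.log.append(("finish", None))


def apply_changes(store_path, records, changes, replace_all, session):
    # Stage into a copy; on replace-all start empty but leave the old data alone.
    staged = {} if replace_all else dict(records)
    for change in changes:
        kind, rid = change["type"], change["id"]
        if kind in ("add", "modify"):
            fields = change.get("fields")
            if isinstance(fields, dict):      # stand-in for the transformer check
                staged[rid] = {**staged.get(rid, {}), **fields}
                session.accept(rid)
            else:
                session.reject(rid)           # malformed record
        elif kind == "delete":
            staged.pop(rid, None)             # delete only if we have that id
            session.accept(rid)
    # Save first: write a temp file, then atomically swap it into place.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(store_path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(staged, f)
    os.replace(tmp, store_path)
    session.commit()                          # only after the data is on disk
    session.finish()
    return staged


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "store.json")
    session = RecordingSession()
    pulled = [
        {"type": "modify", "id": "1", "fields": {"phone": "555"}},
        {"type": "add", "id": "2", "fields": None},
        {"type": "delete", "id": "3"},
    ]
    apply_changes(path, {"1": {"name": "Ann"}}, pulled, False, session)
    with open(path) as f:
        saved = json.load(f)

print(saved, session.log)
```

The ordering is the point of the sketch: a crash before `os.replace` leaves the old store intact, and a crash before `commit` means the engine pushes the same changes again.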
so I talked a little bit about state
management before I'll just reiterate
you need to keep track of the adds and
the deletes and the modifies that you do
that way each time you sync you can
just push up deltas which is much faster
you need to save this info if your
application exits without synchronizing
if you don't save it the next time you
start up if you try to fast sync you'll
have forgotten about the deltas from
before if you lose the delta information
make sure that you slow sync remember
the most important thing is correctness
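
The state-management rules above (record deltas, persist them across launches, fall back to a slow sync when they are lost) can be sketched as follows. This is a toy in Python under stated assumptions: `DeltaLog`, its methods, and the JSON file format are hypothetical and not part of Sync Services.

```python
import json
import os
import tempfile


class DeltaLog:
    """Hypothetical on-disk log of local adds, modifies, and deletes."""

    def __init__(self, path):
        self.path = path
        self.deltas = []
        self.trusted = False          # True only if a readable log was found
        if os.path.exists(path):
            try:
                with open(path) as f:
                    self.deltas = json.load(f)
                self.trusted = True
            except ValueError:
                pass                  # unreadable log: the deltas are lost

    def _write(self):
        with open(self.path, "w") as f:
            json.dump(self.deltas, f)

    def record(self, kind, record_id):
        # Without a trusted baseline a slow sync is coming anyway.
        if self.trusted:
            self.deltas.append([kind, record_id])
            self._write()             # persist now, in case we exit before syncing

    def mark_synced(self):
        # Call after a successful sync: start a fresh, trusted log.
        self.deltas = []
        self.trusted = True
        self._write()

    def sync_mode(self):
        return "fast" if self.trusted else "slow"


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "deltas.json")

    first = DeltaLog(path)
    first_mode = first.sync_mode()        # no log yet
    first.mark_synced()
    first.record("add", "42")             # app then exits without syncing

    relaunched = DeltaLog(path)
    relaunch_mode = relaunched.sync_mode()
    relaunch_deltas = relaunched.deltas

    with open(path, "w") as f:
        f.write("{not json")              # simulate a damaged log
    damaged_mode = DeltaLog(path).sync_mode()

print(first_mode, relaunch_mode, relaunch_deltas, damaged_mode)
```

The design choice is to prefer the slow path whenever the log cannot be trusted, which trades speed for correctness.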
after that comes speed so I've just got
a few best practices to describe to you
before we close sync quickly and often
we keep talking about trickle syncing I
keep banging on this and saying it over
and over again it's really the best
thing to do for the user the more often
you sync the more frequently and the
quicker you can sync the more
transparent it becomes the less load on
the system and the better the experience
for the user your data is getting moved
around and it's where people want it at
the time that it's changed you want to
be responsive it's very important that
you don't lose responsiveness in an
application users hate it when they're
banging in the text
or trying to do something or get a menu
to come down and they don't know why
the application is hung you want to provide
user feedback by that I'm talking about
something like a progress indicator or a
progress bar possibly a status line with
some text telling the user what's going
on depending on how much data your
application syncs what type of
application it is if it's a pro app or
if it's just a simple application that
users don't really pay much attention to
you might want to give them more or less
information but make sure that you show
them something you need to offer choices
to the user if you go to quit an
application and start syncing and you
can't get a session within a reasonable
amount of time you might want to pop a
panel up to the user saying do you want
me to synchronize your data before I
exit that way the user has more of a
choice similarly when you start up if
you're not able to get a sync session
right away you probably want to start
the app immediately and wait a minute to
get the sync session you might want to
notify the user that you don't have all
the data but you certainly don't want to
make a user wait 10 15 or 30 seconds for
another client to finish syncing before
your application starts now I put this
here almost as a joke when I first wrote
the slides and it sounds kind of
ridiculous don't corrupt the user's data
I might as well tell you write an
application that links and doesn't crash
when it runs but this is actually a
critical point keep in mind that the
data that you're synchronizing is not
only being synchronized with your
application but it's being pushed out to
other applications so if you corrupt
some data it's going to fan out to other
applications that might fan out to
devices it will fan out across dot mac and
it'll have catastrophic effects so it's
very important to test that last one
percent test all the boundary conditions
make sure that you handle errors it's
really critical when you write a sync
client that you get all of that right
it's very important to have correctness
a couple people you can contact we have
developer relations we also have a
technology evangelist the names are here
please also we're looking forward to
getting suggestions from people file
radars when you encounter problems and
stay in touch with us and don't hesitate
to contact these two people
if you have additional questions there's a
couple more things you can look at for
information but get to it okay we've got
a lot of reference documentation we've
got some concept documentation that's
only available online the reference is
actually also available on the Tiger DVD
that we gave you also on the Tiger DVD
the examples that we showed you all the
source code is there in a project it's
in developer example syncservices have a
look at it you'll see everything I
talked about in as much detail as you
want