Transcript
>> Good afternoon.
Welcome to Session 118, Mastering Core Data.
Please turn off your WiFi devices; they make me nervous.
Much better.
I'm Miguel Sanchez, I'll be doing the first half of the
talk, and then pass it on to Adam Swift to wrap it up.
So Core Data has been around on the desktop since
10.4 Tiger, on iPhone OS since 3.0 last year,
and obviously on the iPad earlier this spring.
The purpose of this session is to help you
become more proficient with our technology.
This is not an introductory session.
We assume a basic level of knowledge with the framework.
You could have attained that by reading the
documentation, or by working through an example.
You certainly don't have to be a super expert.
But we will not be covering the
most basic concepts of Core Data.
This is our to do list for the session.
And let's jump right in.
So let's start out by talking about
some tips and tricks for modeling.
Quick recap.
The model is how you describe
your data to us, and we help you manage it.
So the better job you do at describing
that data, the more we're able to help you.
The vocabulary you'll be using to describe your
model is entities, which more or less correspond
to the tables in your
store, attributes, and relationships.
And you want to make sure that you design your model, not
around an ideal representation of what your data could be,
but more around a practical implementation of how you're
going to be accessing that data with your application.
Once you give us a model the data will be
presented to you as instances of NSManagedObject.
Now you don't need to generate any code.
You can just use the standard class that we provide.
But we actually do want to encourage you
to start using NSManagedObject subclasses.
Why? First, by moving away from the KVC access patterns,
you gain a lot more support from the compiler to tell you
when you're accessing something incorrectly.
It's better to use a more direct setName:
accessor method than setValue:forKey:,
where you might mistype name or something like that.
This improves your code readability.
And it's also faster at execution time, because
KVC doesn't have to do the dereferencing
of the state that you want to access.
Now when I say, please generate subclasses,
I don't mean write a lot of code.
The modeling tool
allows you to generate the stub for the class.
And it's really just a declaration of the properties, so
that the compiler can check them when you're accessing them.
Here we have three properties up on the screen: a first
name, a simple string property; a to-one relationship, manager;
and a to-many relationship, direct reports.
Remember that to-many relationships are
represented as sets within Core Data.
You don't even have to write the code for
the implementation for these properties.
If you use the @dynamic compiler directive, we inject
the actual implementations of these methods
at run time for you, so it's all
there for you; you don't have to write the accessors.
Another thing we've seen people have questions around in the
Core Data forums is, once you do the initial implementation,
the initial generation of your classes, what happens
when you have additional properties
that you want to add to your entity?
Do you need to regenerate your whole
classes and stomp over what you had before?
The answer is no.
Please be aware that the modeling tool
has these menu items, under Design, Data Model:
the four items where, if you select
a particular property in your model,
we generate either the implementation
or the header declaration
for that property, and leave it on the pasteboard.
Then you can go into your specific
file that you've pregenerated
and paste the appropriate piece of code that you have.
So you don't
have to stomp over your preexisting file.
A couple more tips for working with managed objects.
Remember that you are inheriting
from NSManagedObject and NSObject.
So be careful not to use property names that might
conflict with properties that you're inheriting
from these classes, such as description and deleted.
Those are the two most common ones
that we see people have conflicts with.
And also remember that, because these are KVC compliant
classes, it's not enough to avoid the property deleted,
but also isDeleted, getDeleted, and setDeleted.
So all the KVC resolutions for a particular name.
So the initial generation of your object is pretty
straightforward: you declare the properties.
But now let's start writing some more interesting code.
A type of property that's
available to you with Core Data is the transient property.
Transient properties are declared in the model.
You get the benefit of change tracking in memory
while Core Data is managing the objects.
But their state is not persisted to the store file.
So you're independent from the store schema.
You're able to add transient properties
without incurring a potential migration step,
because you wouldn't be changing the store schema.
Here are two examples; I'll walk you through two slides
showing what people mostly do with transient properties.
The first is the simplest.
You're simply computing a full name
value from a first name and a last name.
So the first name and the last
name are stored in the database,
but your application requires the
display of a full name.
So you would write an accessor like the following:
you would see if you've already precomputed that data
by getting the primitive value of full name.
And if you don't have that value yet, you simply concatenate
the first name and last name, and you return that.
So this is a transient property.
You now have access to the full name, even
though full name does not exist in your database.
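The compute-on-first-access pattern described here can be sketched outside Core Data. This is a minimal Python illustration, not Core Data API: the Person class, its attribute names, and to_row are all made up for the example, and the cached field simply stands in for the primitive value of the transient property.

```python
class Person:
    """Toy stand-in for a managed object; not a Core Data class."""

    def __init__(self, first_name, last_name):
        self.first_name = first_name   # persisted in the talk's example
        self.last_name = last_name     # persisted in the talk's example
        self._full_name = None         # transient: cached in memory only

    @property
    def full_name(self):
        # Compute on first access, then reuse the cached value.
        if self._full_name is None:
            self._full_name = f"{self.first_name} {self.last_name}"
        return self._full_name

    def to_row(self):
        # Only persisted attributes are written out; the cache is not.
        return {"first_name": self.first_name, "last_name": self.last_name}


person = Person("Ada", "Lovelace")
```

In this sketch the cache is never invalidated, so a fuller version would also clear it whenever the first or last name changes.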
A more interesting use of transient properties, and
we got some of these questions in the lab yesterday, is
to reference external resources, which you are
building yourself within your Core Data instance.
So let's say that you have a particular type that
Core Data doesn't handle, some document object.
You can declare a transient property document object that
depends on a persisted property, persisted document path.
So you are using Core Data to store the string
value of the path to access this resource.
And then you do whatever you need to do
to fetch the actual object from that path,
and you return that in your Core Data object.
So you're using a persisted property as a locator for
something that you're going to go out into the file system
and load yourself with whatever mechanism you choose to use.
Another type of property that we have
available are transformable attributes.
You can see the list of types that
we support for the attributes
in Core Data, from integers through strings and booleans.
If there's a particular type that you don't
see there that you want to use in your attributes,
you can always pick Transformable at the very bottom there.
A Transformable attribute is one
that is handled by Core Data.
We are doing the storing of your data.
But we're converting it into an instance of NSData,
where it's basically binary data
that we're putting in the store file.
But that data to you behaves as
whatever type you declare it to be.
So here, for example, we're declaring a property that's an
NSColor, and we don't have NSColor support in Core Data.
So you do the declaration of your property.
Color is declared as a Transformable property.
And we do the right thing behind the scenes of
archiving and unarchiving the color data for you.
But from your code you're dealing with colors.
When using Transformables we are using
a default class of NSValueTransformer.
When would you want to use your own subclass of a value
transformer and put it into your transformable attribute?
Core Data doesn't do any encryption of data.
So if you want to archive something out in binary
format, and you want to add an encryption step,
subclassing NSValueTransformer might
be a good place to put that code.
You might not want the default behavior
that we use with keyed archiving.
You might just want to write the bytes
directly to the stream that you're writing out to.
And if you do that, make sure that you take into account
endianness, the byte ordering of your data,
because your data might be stored on one
platform and fetched on a different platform.
So make sure that if you are
writing your own transformer
you are taking into account the byte ordering issues.
The last thing I want to talk about
with Transformable attributes is,
if you do subclass NSValueTransformer, this is the
convention that Core Data uses when calling the two
primary methods in NSValueTransformer.
When we call transformedValue:, we
are going from your type to NSData.
And we're using the reverse transformation,
reverseTransformedValue:, to go from data to your type.
People get this backwards sometimes.
There's an example on how we do
this, the photo locations example,
it's associated with this session
if you go to the attendee website.
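To make the direction convention and the byte ordering point concrete, here is a small Python sketch. It is not an NSValueTransformer subclass: the ColorTransformer class, its method names, and the choice of four 32-bit floats for a color are assumptions made for illustration. The forward method goes from your type to bytes, the reverse method goes from bytes back to your type, and the byte order is pinned explicitly so the stored bytes do not depend on the platform that wrote them.

```python
import struct


class ColorTransformer:
    """Hypothetical transformer; mirrors the direction convention only."""

    # "!" forces network (big-endian) order, so the bytes do not depend
    # on the platform that wrote them. Four 32-bit floats: r, g, b, a.
    _FORMAT = "!4f"

    def transformed_value(self, color):
        # Forward: your type -> bytes for the store.
        return struct.pack(self._FORMAT, *color)

    def reverse_transformed_value(self, data):
        # Reverse: bytes from the store -> your type.
        return struct.unpack(self._FORMAT, data)


transformer = ColorTransformer()
stored = transformer.transformed_value((1.0, 0.5, 0.25, 1.0))
restored = transformer.reverse_transformed_value(stored)
```

Had the format string used native byte order instead, data written on a little-endian machine would not read back correctly on a big-endian one, which is the pitfall the talk warns about.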
The last thing I want to touch on in this section
is adapting the model to your access patterns.
So let's say that you have a problem:
you want to search on titles of books.
So your initial inclination might be to define an entity
Book with a simple title property that's a string.
And you want to search on titles.
So searching means CONTAINS, right, in database terms.
And you'll also want to ignore
diacritical marks and case,
so you do CONTAINS with the brackets,
[cd], that's the case and diacritic insensitivity.
It turns out that CONTAINS can
be pretty heavyweight.
This is fully ICU compliant, localization aware,
basically regex matching at the database level.
So it's a pretty hefty operation.
Not only that, you're doing it for each row
in your database each time you're searching.
So if you have a large dataset you might
not get the performance that you want.
Here's a secret for you guys.
At Apple, on the phone specifically, on
mobile devices, we don't really do full text searching
in some of the searches that you're doing.
We're doing prefix matching.
So this is actually very good for most
of the functionality that you want.
So there are actually two tricks that you can use here.
First, you can take care of the normalization of the
data right when you're setting the title.
So you add a secondary attribute, normalized
title, where you do all of the stripping
of the diacritical marks and standardizing on a case.
And then you put your normalized title into
your book instance.
Second, you also index the normalized title property.
And now you can write a much more efficient predicate
by searching on any normalized title that's greater than
or equal to the prefix that you're matching
on, and less than the subsequent prefix.
For example, if your user is typing S T A R, star,
you can use the following predicate to
take advantage of the index,
and you get back a much quicker result
than if you're doing a regular CONTAINS predicate.
We also show you how to do this with the DerivedProperty
example, also associated with the website.
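The normalize-then-range-match idea can be tried directly against SQLite. The sketch below uses Python's sqlite3 module as a stand-in for the store, not Core Data itself; the table and column names, the normalize helper, and the sample titles are all assumptions made for illustration. It normalizes once at write time, indexes the normalized column, and then matches the prefix with a lower and upper bound instead of a substring search.

```python
import sqlite3
import unicodedata


def normalize(text):
    # Strip diacritical marks and standardize case once, at write time.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def prefix_bounds(prefix):
    # Lower bound is the prefix; upper bound bumps the last character,
    # e.g. "star" -> "stas". Assumes a non-empty prefix whose last
    # character is not the maximum code point.
    low = normalize(prefix)
    high = low[:-1] + chr(ord(low[-1]) + 1)
    return low, high


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE book (title TEXT, normalized_title TEXT)")
db.execute("CREATE INDEX book_normalized ON book (normalized_title)")
for title in ["Stardust", "Stárgate", "Stone Soup", "A Star Is Born"]:
    db.execute("INSERT INTO book VALUES (?, ?)", (title, normalize(title)))

low, high = prefix_bounds("STAR")
matches = [row[0] for row in db.execute(
    "SELECT title FROM book "
    "WHERE normalized_title >= ? AND normalized_title < ? "
    "ORDER BY normalized_title", (low, high))]
```

Here the two bounds are "star" and "stas", so only titles that begin with the prefix fall in the range; "A Star Is Born" is not matched because the comparison only looks at the start of the title.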
One final step here: you'll notice that in the previous
slide we're now doing prefix matching on the title,
but we're only matching on
the beginning of the title, basically.
So what if you want to do the match
on any word in the title?
One thing you can do is split all the words in
your title into a separate title words entity.
And it's a one-to-many relationship.
Do the prenormalization.
So you're storing the title words already normalized.
You're still using the same predicate as before.
But now the predicate is searching on title words.
So you're matching on any word in the title.
And whenever you find a match on the title word,
you can just use the relationship back
to point to the book that it belongs to.
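The per-word variant can be sketched the same way. Again this uses Python's sqlite3 module as a stand-in rather than Core Data, and the table names, the add_book and books_with_word_prefix helpers, and the sample titles are hypothetical. Each title is split into words that are stored already normalized in their own indexed table, and a match on a word is followed back to the book it belongs to.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
db.execute(
    "CREATE TABLE title_word (word TEXT, book_id INTEGER REFERENCES book(id))")
db.execute("CREATE INDEX title_word_idx ON title_word (word)")


def add_book(title):
    book_id = db.execute(
        "INSERT INTO book (title) VALUES (?)", (title,)).lastrowid
    # Store each word already normalized (lowercased here for brevity).
    db.executemany(
        "INSERT INTO title_word VALUES (?, ?)",
        [(word.lower(), book_id) for word in title.split()])


for title in ["A Star Is Born", "Stardust", "Stone Soup"]:
    add_book(title)


def books_with_word_prefix(prefix):
    # Same range predicate as before, applied to words instead of titles.
    low = prefix.lower()
    high = low[:-1] + chr(ord(low[-1]) + 1)
    rows = db.execute(
        "SELECT DISTINCT book.title FROM title_word "
        "JOIN book ON book.id = title_word.book_id "
        "WHERE title_word.word >= ? AND title_word.word < ? "
        "ORDER BY book.title", (low, high))
    return [row[0] for row in rows]


found = books_with_word_prefix("star")
```

With the words split out, "A Star Is Born" is now found by the prefix "star" even though the title does not begin with it.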
Okay? So, this has been an example of what I meant by
determining what the actual usage
pattern of your application is.
Where is the bottleneck?
Where do you need the fastest turnaround
time? Adapt the model to do that.
Let's move on to talk about the Managed Object Life Cycle.
Remember Managed Objects are the instances
of your data that we're providing for you.
They don't exist in a vacuum.
They always exist with regards to
the context that is managing them.
So anytime we talk about the life cycle of an object,
it's always something that's happening
with regards to that context.
So if we're inserting an object,
we're inserting it into the context.
If we're fetching an object, we're
fetching it from the store into the context.
As we're updating our data, setting state on these objects,
the context is keeping track of what's going on,
so that if we ask it to undo a change,
the context is doing that undo.
When we decide to clean up, there can be, at a very high
level, two kinds of clean up: either a direct deletion
that you are asking us to perform, because you're
messaging the context, please delete this object;
or a more memory level clean up, where
we realize that there are no more references
to your instances, so we turn them back into faults,
a thinner shell of those objects that
doesn't use the full memory footprint.
So what are your options for hooking
into this lifecycle of an object?
You have three high level ones.
First, if you're dealing with per-instance
actions, your best bet is probably to override methods
from NSManagedObject, and I'll be talking about those.
If you want to react to graph level changes, you probably
want to register with the Managed Object Context and listen
for the notifications that it's posting
and react to those notifications.
And thirdly, remember that as you're asking the context
to perform certain things, some
of those methods return errors.
So make sure that you're inspecting those return values
and reacting to whatever the context is telling you.
So let's go one by one in these.
If you're going to override methods in NSManagedObject,
let's say that you want to add
additional initialization code.
The awake set of methods that you
see on the screen here is a good place to do that.
Let's walk through each one of them independently.
awakeFromInsert.
awakeFromInsert will be called once
during the lifetime of your object
when you do the initial insertion
of that object into a context.
This is a good place to set baseline state for your object.
Remember that when you're creating
the Managed Object Model we have text fields for each one
of the properties that allow you to
set default values for your properties.
There will be times when this is not enough for you.
For example, if you are creating employees,
when you're doing the model you can't tell
what the next employee ID is going to be.
So the awakeFromInsert is a good place to put that
code where you're initializing the employee ID.
This is the awakeFromInsert in your own
NSManagedObject subclass that represents an employee, right?
So this is something you can't put in a
model, so this is where you would put it.
awakeFromFetch is very similar to awakeFromInsert, except
that it's called each time that your data is fetched
from the database and an object is created in a context.
What do you want to put here if
you ever override this method?
This is another good place to initialize
transient properties.
Some slides back I showed you the full name method.
That's sort of like an on demand
initialization of your full name property.
But you could also choose to set that
value up once you know that you have all
of your database state and you're awaking from a fetch.
Finally, during the lifetime of your objects
there will be situations that require us
to revert the state of your object from a snapshot,
specifically if you're asking us
to refresh the object or to do an undo.
So there will be times when the object will be
kind of reawakened with state that it had before.
You will be notified of this with
the awakeFromSnapshotEvents: method.
This is a good place to put code that
might reset some of your cached data,
so that you can compute it on demand again later on.
So now we move on to the mechanisms that you have
to react to state changes for the whole graph.
NSManagedObjectContextObjectsDidChangeNotification
is how the context tells you
what's going on with your edits.
It posts this notification, and the
user info dictionary gives you
the list of inserted objects,
updated objects, deleted objects.
Please note that we're not telling you these are the
changes that have already been saved to your store.
We're only telling you this is what's going
to happen the next time you do a save.
Right? We've processed the changes in memory and this is
what we know about, but you still have to do the save.
When will you be getting this notification?
When is it posted?
It's posted if you explicitly tell the
context to process all the pending changes.
It will also be posted right before a save.
But it's also posted, very frequently, at the end
of the event loop and anytime we're doing a fetch.
So you're getting this notification
throughout the lifetime of your application.
Other notifications that are posted
by the ManagedObjectContext,
which are interesting for you to hook into.
When we actually do a save you will get the
NSManagedObjectContextWillSaveNotification
and DidSaveNotification.
Some of you have asked on the forums
about time stamping your objects.
So you want to put a timestamp on the
object right when it was last changed.
This is a good place to do that.
If you register to receive the
WillSaveNotification, you know that your graph
of objects that's being communicated
to you is about to be saved.
You can set the timestamp for each one
of those objects, and then we save it.
This is also a good place to set up relationships.
If you remember in the previous slides in the awake methods,
those were meant mostly for single instance manipulation.
So by this point more of the objects are set up,
so you can set up relationships between them.
Once the save does happen, you
will get the DidSaveNotification.
This is a good place to put your code that needs to notify
others in your application that the save has happened.
Let's say that you're managing
multiple contexts, multiple peer contexts.
Once the save happens and you want another
secondary context to know about that save,
you can start that messaging
in response to this notification.
How would you do that?
Let's say that you have more than
one context in your application.
You do a save in one context and you want to communicate
all of those edits that just happened onto another context.
We have a method,
mergeChangesFromContextDidSaveNotification:,
and the notification you're passing, by the way,
is the one you got in the DidSaveNotification.
You simply take that notification, and then
you send this message to the other contexts that you want
to have the exact same changes that were
just saved, and we do the merging for you.
Now while we're in the topic of saving, don't forget
that the save method returns a boolean,
and it also has an error parameter.
So what kinds of things can go wrong when a save happens?
The first is validation errors.
Remember that as you were defining your model, you're able
to declare certain boundary conditions about the maximum
and minimum value of your properties,
or whether certain values are optional.
So if there are any validation issues that we detect
while we're doing the save, we will fail the save
and communicate this back to you in the error parameter.
If there's more than one validation issue, they
will be chained in the NSDetailedErrorsKey.
So be sure to inspect the User Info
Dictionary inside of each error
and see if there's more than one validation problem.
The second type of issue that can cause the
save to fail are optimistic locking failures.
This is a mechanism that we use
to detect multi-writer conflicts.
When we're doing a fetch from the store we
keep around a snapshot of the last value
that we saw for a specific property in the store.
So if anybody else changes that value underneath us in the
store, and that somebody else could be you.
It could be another thread in your application.
It could be another peer context that you're managing.
It could be another application that's
still dealing with the same store.
Whatever it is, there's a change
in the store, you tell us to save,
and we detect that somebody's changed the value underneath
us, so we will raise an error and we will notify you.
The default policy that Core Data has
is to raise an error when this happens.
If this is not what you want to do, you can change
the merge policy on the Managed Object Context.
And you have here the set of policies that you can set.
If you want us to try to merge the changes from
the store and from changes that you have in memory,
you can use the second or the third policy.
And with each one of those you're
telling us who do we give priority
to when we detect a conflict on
a property-by-property level.
Do we take state from the store and stomp your memory state,
or do we take state from the object and stomp your store state?
Or do you just want us to take the whole object
from memory and override whatever was in the store?
Or do you want us to take whatever was in the
store and override whatever was in memory?
So, you get to pick whatever policy that you want.
But please be aware that our default policy
is just to say this save doesn't work,
because somebody changed the data underneath.
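The snapshot comparison behind optimistic locking can be modeled in a few lines. The following Python sketch is a toy, not Core Data's implementation: the save function, the ConflictError class, and the policy names "error", "store_wins", and "memory_wins" are invented for illustration and do not correspond to the framework's merge policy constants. The store, the snapshot, and the edited object are plain dictionaries of property to value.

```python
class ConflictError(Exception):
    pass


def save(store, snapshot, edited, policy="error"):
    """Write `edited` to `store`, detecting changes made since `snapshot`.

    All three arguments are plain dicts of property -> value. The policy
    names are made up for this sketch.
    """
    conflict = store != snapshot
    if conflict and policy == "error":
        raise ConflictError("store changed since it was fetched")
    merged = dict(store)
    for key, value in edited.items():
        changed_here = value != snapshot.get(key)
        changed_there = store.get(key) != snapshot.get(key)
        if not changed_here:
            continue                      # nothing to write for this property
        if changed_there and policy == "store_wins":
            continue                      # keep the store's newer value
        merged[key] = value               # memory wins, or no conflict
    store.clear()
    store.update(merged)
    return store


store = {"name": "Ann", "title": "Engineer"}
snapshot = dict(store)                 # what this context last saw
store["title"] = "Manager"             # someone else saves underneath us
edited = {"name": "Anne", "title": "Director"}

try:
    save(store, snapshot, edited)
    outcome = "saved"
except ConflictError:
    outcome = "conflict"

store_wins = save(dict(store), snapshot, edited, policy="store_wins")
memory_wins = save(dict(store), snapshot, edited, policy="memory_wins")
```

With the default policy the save is refused and the store is left untouched. With the two property-by-property policies the non-conflicting name change goes through either way, and only the conflicting title is decided by who has priority.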
Cleaning up.
Like I said at the intro to this section, there
are two types of clean up; one is deletion.
You're telling the context deleteObject:.
Remember that the delete doesn't happen
until you do the subsequent save.
So you can tell a context, please delete this
object, but it's just marked for deletion,
it's not actually deleted until you do the save.
When you do the save you will be notified that the save
happened with the WillSave and DidSave notifications.
By the time you're getting those notifications, we've
already done the delete propagation in your graph.
So you can't access the relationships
in your objects within those notifications.
So if you want
to kind of take notes about what's going
to be deleted, for you to do secondary processing,
for example, if you're managing your own resources,
like that document path example I
showed with the transient properties,
a good place to plug in that code is
in the prepareForDeletion method.
This is something you would override
in your NSManagedObject Subclass.
So this is when we're telling you, hey,
we're going to eventually delete this object.
This is where you would say, oh, this object is going away
and I'm managing an external reference
for this object myself.
So let me just keep track of this
path that will eventually be deleted.
So that when I'm notified that the deletion actually
happened, I will go ahead and remove the data.
The second kind of clean up that can
happen is kind of memory level cleanup.
You're not telling us to delete
objects; we're simply cleaning up,
because we don't see that you have references to them.
So please don't override the dealloc method.
We don't quite guarantee at what
point that's going to be called.
The equivalent for you Core Data developers
should be willTurnIntoFault.
That's the equivalent
of dealloc for you guys.
This is where you want to clear
out your caches or any dependencies
that you've registered for key-value observing.
Now turning something back into a fault happens when
either Core Data detects that we don't have any references
and nobody has references to those objects, or
you're explicitly telling us to refresh an object
by calling the following two methods
on the ManagedObjectContext.
You're telling us to refresh the
object, or you're telling us
to reset the context and turn everything back into a fault.
Please don't call the refreshObject method and
tell us to ignore the merging of the changes
when you have a dirty object, because the
consistency of your graph will get out of sync.
The final thing I want to talk to you
about is multithreading with Core Data.
Most of the time you will be considering
introducing multithreading into your application
to improve its UI responsiveness.
So you want to make Core Data
operations asynchronous by pushing them
into a background thread, so that your UI is
free to continue to interact with the user.
Be aware that there are always pitfalls when
working with multithreaded applications.
It doesn't come for free.
So just because you spawn off numerous threads,
all of these have a little bit of a cost
as you're doing the context switching back and forth.
Make sure that, even though you have multiple
threads executing, they're not all contending
for the same resource, which defeats the purpose.
And you're also introducing a little bit of complexity
into your application, specifically with the debugging.
So this is not a free solution.
But if you do decide to go down this path, the
golden rule that we want you to always remember is
to give each
thread its own Managed Object Context.
And I say "thread" in quotes here, because you
could be using Grand Central Dispatch.
Basically each concurrent unit of execution
should get its own Managed Object Context.
Managed Objects are not thread safe.
You can't pass them between threads
and expect them to work properly.
What is thread safe is the object ID
that each one of those has.
So let me illustrate this.
You have a UI thread interacting with
the user and a background thread fetching.
The background thread is doing
the fetching of three objects.
You're ready.
You've warmed up the application.
You don't just pass those instances
over into the main thread.
You actually take the object IDs
from the objects that you fetched.
You pass those across the thread boundary.
And then you use a method on the context, such as
objectWithID:, where we construct a local copy of that object.
Now fear not, you're not doing a fetch from scratch here.
Because the background thread was already warming up the
row caches that Core Data uses to create the objects.
So creating an object in the first
context is actually very, very quick.
You are taking advantage of the work that
you're doing with the background fetching.
If your background threads are inserting new
objects, please remember to first save them
to the store before you pass the object ID across
the thread boundary and ask us to fetch it.
When you create an object, we assign it a temporary ID.
You can't pass a temporary ID to another context
and expect it to find it unless it's been saved.
So once you do the save the ID becomes permanent
and we can get it from the other context.
If you're doing this with Grand Central
Dispatch, the pattern is the same.
Let's say that you have a serial queue.
You know that blocks inside of a serial
queue will execute serially by definition.
So all of your blocks can potentially share
the same context, instance of context 1 here.
But, and this is a very important but, just because we have
serial queues doesn't mean that you don't have concurrency.
Right? You might have more than one serial
queue executing at the same time.
So blocks within different serial queues
could potentially execute concurrently.
So make sure that if you have more than one serial queue,
the blocks in that second queue are
using a different instance of a context
from the blocks in the first serial queue.
And that's how you maintain the golden rule.
Of course if you're using a concurrent
queue, you know that by definition block 4
and block 6 could potentially execute concurrently,
so you take care of giving them different
instances of a context to manipulate.
The last thing I want to talk about
in this section is what happens
when you're doing edits to your data in multiple threads?
Well, what happens is that you
better know what you're doing.
I mean, a lot of this has to do with defining
a good workflow in your application.
Okay? This isn't so much Core Data's problem; it's
what does it mean for somebody to be editing an object
that was deleted in the background, or vice versa, right?
So a lot of that work is your work.
Once you figure out what it means in your
application, Core Data gives you two mechanisms,
and we've seen these methods before. First, if you want to
refresh the state of an object, turn it back into a fault,
you use the refreshObject:mergeChanges: method.
Or if you did a lot of processing in a background thread,
in a background context, and now you want to push all
of those changes over into another
context in another thread,
remember that I mentioned
mergeChangesFromContextDidSaveNotification:.
So you're passing all of the objects.
Here, this is the one place where
objects are crossing the thread boundary.
But because this is being handled by Core Data,
we take care of doing the right thing
behind the scenes so that nothing goes wrong.
So this is what happens with multi-party edits and deletes.
And now I'll pass it onto Adam to conclude the session.
[ Applause ]
>> Adam: Thank you Miguel.
And now I'd like to dig a little
deeper into fetching and performance.
So you know how critical performance is to providing
a great user experience for your application.
You want your user interface to stay responsive,
even as you scale to dealing with a lot of data.
And the two key strategies for achieving these performance
goals are limiting memory usage by only fetching the data
that you actually are going to show in your user interface,
and amortizing your database I/O by fetching in batches.
So keep in mind fetching is performing disk I/O.
So you will want to avoid the extremes of fetching
too much data all at once, and on the other hand,
frequently fetching a little bit of
data and repeatedly calling out to I/O.
You want to find that right middle balance where you're
fetching your data and objects in reasonable batches.
And we can do that by leveraging
the strength of the database to do
as much work as possible at the database layer.
So we can use predicates and sort descriptors
to work across your entire data set
at the database level and keep memory and I/O under control.
So I want to walk you through a few examples
of how you can use predicates to do the work
at the database level, and keep
your memory and I/O needs low.
Let's start with an example of how you can avoid
fetching objects from a to-many relationship,
when all you really want to know is how
many objects are in the relationship.
You can use the @count expression to avoid fetching those
objects, when all you want to know is how many there are.
For example, if you have a list of music playlists and
you want to find all the playlists without any songs,
you can use the @count expression, and a predicate
like this to look up those playlists without any songs.
And you won't be fetching any of the song data back,
you're just fetching the playlists that match that query.
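At the database level, a count test like this comes down to counting related rows without returning them. The sketch below uses Python's sqlite3 module with made-up playlist and song tables. It is not the SQL that Core Data generates, and the predicate on the slide is not reproduced in the transcript; assuming a relationship named songs, it would presumably read something like songs.@count == 0.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE playlist (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE song (id INTEGER PRIMARY KEY, title TEXT,
                       playlist_id INTEGER REFERENCES playlist(id));
    INSERT INTO playlist (id, name)
        VALUES (1, 'Road Trip'), (2, 'Empty'), (3, 'Workout');
    INSERT INTO song (title, playlist_id)
        VALUES ('One', 1), ('Two', 1), ('Three', 3);
""")

# Only playlist rows come back; the songs are counted inside the database.
empty_playlists = [row[0] for row in db.execute(
    "SELECT name FROM playlist "
    "WHERE (SELECT COUNT(*) FROM song "
    "       WHERE song.playlist_id = playlist.id) = 0")]
```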
If you want to work with the attribute values from
objects related through a to-many relationship,
you can use a SUBQUERY expression to access the attributes
owned by the objects in the to-many relationship.
And this gives you a powerful way to
test those attributes without, again,
fetching any of the objects from the relationship.
So in this example we want to fetch all the
artists with songs longer than 10 minutes.
And we do that with a SUBQUERY expression that takes the
songs, the name of the relationship as its first argument,
and then tests if the song length
is greater than 10 minutes.
If the count of the songs returned by that SUBQUERY
is greater than zero, then we're going to return
that artist in the fetch.
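As a rough sketch of the predicate being described, assuming an "Artist" entity with a to-many "songs" relationship and a numeric "length" attribute on each song measured in seconds (all of these names and the unit are assumptions):

```objc
#import <CoreData/CoreData.h>

// Assumed model: "Artist" has a to-many "songs" relationship, and each
// song has a numeric "length" attribute in seconds (10 minutes = 600).
static NSFetchRequest *ArtistsWithLongSongsRequest(void) {
    NSFetchRequest *request =
        [NSFetchRequest fetchRequestWithEntityName:@"Artist"];
    // SUBQUERY(collection, variable, condition) yields the matching songs;
    // the artist qualifies when at least one song matches.
    request.predicate = [NSPredicate predicateWithFormat:
        @"SUBQUERY(songs, $s, $s.length > %d).@count > 0", 600];
    return request;
}
```

The variable name $s is arbitrary; it only has meaning inside the SUBQUERY.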
You can also work with and fetch attribute values directly.
So if you're only interested in fetching back
unique attributes from one of your entities,
you can fetch back those unique
attributes as read only dictionaries,
only fetching back the attribute
value without anything else.
You're evaluating that work in the database, and
only fetching back the results you're interested in.
So let's look at an example where we want
to fetch all of the unique album names.
So we tell the request we want it to return distinct
results, we want the results returned as dictionaries,
and all we want is the unique names from our album entity.
And when you execute this fetch, you'll get back an
array of dictionaries with all of those names in it.
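A minimal sketch of the three settings just mentioned, assuming an "Album" entity with a "name" attribute (both names are assumptions):

```objc
#import <CoreData/CoreData.h>

// Assumed model: an "Album" entity with a string attribute "name".
static NSFetchRequest *UniqueAlbumNamesRequest(void) {
    NSFetchRequest *request =
        [NSFetchRequest fetchRequestWithEntityName:@"Album"];
    request.resultType = NSDictionaryResultType;   // dictionaries, not objects
    request.returnsDistinctResults = YES;          // unique values only
    request.propertiesToFetch = @[@"name"];        // just the name column
    return request;
}
```

Each dictionary in the fetched array would then hold a single "name" key.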
You can go even further working with attribute
values directly in the database by calculating
and evaluating aggregate data on those
attributes and returning dictionaries
without fetching all of those objects into memory.
This is a powerful technique for performing
a lot of work at the database level.
So let's look at an example where we want to calculate
the total length of all of the songs in our music library.
So the first thing we need to do is create an expression
that represents the function we want to evaluate.
And in this case we want to take the
sum of the length of all of our songs.
Then we need to wrap that expression in an expression
description that tells our fetch how to encode
that information back in the dictionaries
that we're going to be returning.
And in this case we want the results
back as a double with the name totalTime.
The last thing we need to do is we need to configure our
fetch request to perform the fetch on the song entity,
only search for the property that we've constructed here,
which is the function to calculate the sum of the length,
and return those results as dictionaries.
Again, we're doing an incredible
amount of work at the database level,
and only fetching back the single
result we're interested in.
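The three steps above can be sketched roughly as follows. The "Song" entity and its "length" attribute are assumed names; "totalTime" is the result key mentioned in the talk.

```objc
#import <CoreData/CoreData.h>

// Assumed model: a "Song" entity with a numeric attribute "length".
static NSFetchRequest *TotalSongLengthRequest(void) {
    // 1. The function to evaluate: sum of the length key path.
    NSExpression *lengthKeyPath =
        [NSExpression expressionForKeyPath:@"length"];
    NSExpression *sum =
        [NSExpression expressionForFunction:@"sum:"
                                  arguments:@[lengthKeyPath]];

    // 2. Describe how the value is returned in the result dictionary.
    NSExpressionDescription *totalTime =
        [[NSExpressionDescription alloc] init];
    totalTime.name = @"totalTime";
    totalTime.expression = sum;
    totalTime.expressionResultType = NSDoubleAttributeType;

    // 3. Fetch only that computed property, as dictionaries.
    NSFetchRequest *request =
        [NSFetchRequest fetchRequestWithEntityName:@"Song"];
    request.resultType = NSDictionaryResultType;
    request.propertiesToFetch = @[totalTime];
    return request;
}
```

Executing this request should give back an array holding one dictionary, with the sum stored under the "totalTime" key.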
Sometimes you just want to know how many
objects are going to be returned by a fetch.
Either to display that number on screen, or to make space
for the objects that you're going to be fetching back later.
And you can take any fetch request and ask the context
for the count for that fetch request to get that value.
So in this case I'm showing a table that lists playlist
names, and we can use countForFetchRequest to look
up the number of songs for each playlist.
But then we can go a little bit further with
working in the database by using a sort descriptor
and setting a fetch limit to fetch the first three songs.
So now alongside the number of songs that
we've got, we can show our users a preview
of the first three songs in each playlist.
It kind of improves the user experience, but
you're only fetching back a little bit more data,
even though you've got a lot of data under the hood.
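Here is one way the count and the three-song preview might be written. This is a sketch under assumptions: a "Song" entity with a "title" attribute and an inverse to-many relationship "playlists", plus hypothetical helper names.

```objc
#import <CoreData/CoreData.h>

// Assumed model: "Song" has a "title" attribute and a to-many inverse
// relationship "playlists". All helper names here are hypothetical.
static NSFetchRequest *SongsInPlaylistRequest(NSManagedObject *playlist) {
    NSFetchRequest *request =
        [NSFetchRequest fetchRequestWithEntityName:@"Song"];
    request.predicate =
        [NSPredicate predicateWithFormat:@"ANY playlists == %@", playlist];
    return request;
}

// Counts matching songs in the store; returns NSNotFound on error.
static NSUInteger SongCount(NSManagedObject *playlist,
                            NSManagedObjectContext *context,
                            NSError **error) {
    return [context countForFetchRequest:SongsInPlaylistRequest(playlist)
                                   error:error];
}

// Same query, but sorted and limited to the first three songs.
static NSFetchRequest *PreviewSongsRequest(NSManagedObject *playlist) {
    NSFetchRequest *request = SongsInPlaylistRequest(playlist);
    request.sortDescriptors =
        @[[NSSortDescriptor sortDescriptorWithKey:@"title" ascending:YES]];
    request.fetchLimit = 3;
    return request;
}
```

The count never materializes song objects, and the preview fetch brings back at most three.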
So now let's take a closer look at what you're
doing when you're fetching managed objects.
There are a lot of options available
to you on a fetch request
for how you're fetching objects and
what you're actually getting back.
In the case that you're fetching objects
that you want to use in your working set,
and you want to access the attribute data right away,
you want to fetch back fully faulted managed objects
with all of their attribute values pre-populated.
But you're not going to have to
fetch back all of the relationships,
even though they're fully faulted managed objects,
the relationships are still represented as faults.
So you're not paying the memory cost
for traversing to-many relationships.
To get back fully faulted managed objects,
you need to tell your fetch request
that you want setReturnsObjectsAsFaults:NO.
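In code, that setting is a single property on the request. A small sketch, with "Song" as an assumed entity name:

```objc
#import <CoreData/CoreData.h>

// Assumed entity name: "Song".
static NSFetchRequest *WorkingSetSongsRequest(void) {
    NSFetchRequest *request =
        [NSFetchRequest fetchRequestWithEntityName:@"Song"];
    // Populate attribute values up front instead of returning faults.
    // Relationships are still left as faults.
    request.returnsObjectsAsFaults = NO;
    return request;
}
```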
So why would you want to fetch back faults?
Well faults are a very useful tool.
They're a very lightweight placeholder for managed objects.
And their attributes are fetched on demand.
And when you turn a fault into a managed object,
and it fetches its attribute values on demand,
it doesn't change its pointer address, so you're
still working with the same object in memory,
so you can keep it in the array that you
had before; where it used to be lightweight,
now it contains all of the information
from the managed object.
And there's also a middle ground called partial faults,
where you can fetch faults, but specify that those
faults should be prepopulated with some subset
of the properties from your managed objects.
So if we wanted to show a listing of song titles, but
we didn't want to fetch another heavier weight attribute
from the song entity, we could tell the fetch request that
we want properties to fetch to include only the title.
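A sketch of that partial-fault request, assuming a "Song" entity with a "title" attribute:

```objc
#import <CoreData/CoreData.h>

// Assumed model: a "Song" entity with a "title" attribute plus other,
// heavier attributes that we do not want to load yet.
static NSFetchRequest *SongTitlesRequest(void) {
    NSFetchRequest *request =
        [NSFetchRequest fetchRequestWithEntityName:@"Song"];
    // With the default managed object result type, this yields faults
    // that are pre-populated with only the listed properties.
    request.propertiesToFetch = @[@"title"];
    return request;
}
```

Accessing any other attribute on one of these objects would fire the fault and fetch the rest on demand.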
And the smallest representation for a
Managed Object is the Managed Object ID.
These things are really small.
Each Managed Object ID is only 16 bytes.
So it's actually possible to work with
a large set of Managed Object IDs
that represent the Managed Objects
without taking up a lot of space.
As Miguel mentioned before, the Managed Object IDs
are also thread safe, so it's a great way to pass
that information between different threads.
They're also perfectly suited for using in predicates.
So any time you've got a predicate where you would
be supplying a Managed Object in the predicate,
you can supply a Managed Object ID,
and Core Data handles it just fine.
To get back Managed Object ID's you need to tell your fetch
request that you want the Managed Object ID result type.
And then you want to tell the request
not to include the property values.
And you might be thinking, wait a minute, I'm fetching
back Object ID's, they don't have property values.
But by default, the fetch request assumes that
if you're fetching back Managed Object ID's,
then you're probably going to want to use those
Managed Object values some time in the future.
So even though the Managed Object ID doesn't store
the property values, the property values are fetched
into the row cache, so if you later look up the
Managed Object for, or create a Managed Object
for that Managed Object ID, it doesn't need to
do a fetch from the database to get those values.
But if you really want to work with a large number
of Managed Object ID's and minimize the amount
of memory you're using, you want to make sure to
tell the request not to include the property values.
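Those two settings might look like this in code (the entity name "Song" is an assumption):

```objc
#import <CoreData/CoreData.h>

// Assumed entity name: "Song".
static NSFetchRequest *SongIDsRequest(void) {
    NSFetchRequest *request =
        [NSFetchRequest fetchRequestWithEntityName:@"Song"];
    // Return NSManagedObjectID instances rather than managed objects.
    request.resultType = NSManagedObjectIDResultType;
    // Skip loading the property values into the row cache.
    request.includesPropertyValues = NO;
    return request;
}
```

The trade-off is that turning one of these IDs into a populated object later will require another trip to the database.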
So back to talking about fully faulted managed objects.
If you're not just working with the
attribute values in your working set,
if you're not just displaying the attribute values in your
working set, but you also want to display related values
on screen right now, then you want to prefetch
that relationship. That way, as you're displaying
all of the managed objects in your working set, you're
not having to execute an individual fetch to fault
in those relationships, which, getting
back to amortizing database I/O,
incurs a round trip to the database every single time.
So you want to take advantage of prefetching to get
those related values ready for your working set of data.
And I'll show you an example of how to do this.
Say you want to show a list of playlist songs, and
alongside each song show the album name for the song.
You can tell the request that you
want to set the relationship key paths
for prefetching to include the album relationship.
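A sketch of that prefetching setup, assuming a "Song" entity with a to-one "album" relationship (assumed names):

```objc
#import <CoreData/CoreData.h>

// Assumed model: "Song" has a to-one relationship "album".
static NSFetchRequest *SongsWithAlbumsRequest(void) {
    NSFetchRequest *request =
        [NSFetchRequest fetchRequestWithEntityName:@"Song"];
    request.returnsObjectsAsFaults = NO;
    // Load the related albums along with the songs, instead of
    // firing one fault per song while the list is displayed.
    request.relationshipKeyPathsForPrefetching = @[@"album"];
    return request;
}
```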
So I've talked about a number of techniques you can use
to keep your memory usage low and amortize your I/O.
But what about the times where you
can't control the access pattern?
What about when you're trying to work with some sort of API
that takes the entire array of
objects that you want to fetch?
How can you fetch your objects in
batches, when you're handing over the entire array?
You might think that you're either handing over an
array of faults, in which case they'll be fetched one
at a time and hit that frequent fetching pattern.
Or you're fetching everything all
at once and handing it over.
In which case you're doing that big
upfront fetch that you wanted to avoid.
Well the fetch request can do this for you automatically.
You can set the batch size, and when you execute your fetch
request, it will return an array subclass that's configured
to automatically fetch your objects
in batches as they're accessed.
And the way you do that is you tell your request,
set the fetch batch size to the size you want.
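For example, a request configured this way might look like the sketch below. The entity name "Song" and the batch size of 20 are arbitrary, illustrative choices.

```objc
#import <CoreData/CoreData.h>

// Assumed entity name: "Song". The batch size of 20 is arbitrary;
// something near the number of rows visible on screen is a common pick.
static NSFetchRequest *BatchedSongsRequest(void) {
    NSFetchRequest *request =
        [NSFetchRequest fetchRequestWithEntityName:@"Song"];
    // The returned array loads objects in groups of 20 as they are accessed.
    request.fetchBatchSize = 20;
    return request;
}
```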
So I hope you've gotten some good ideas about
things you can do to improve your performance,
your fetching performance with Core Data.
But before you dig in and start making changes
to your code, I want you to use the tools
that are available to you to focus your efforts.
The Core Data instruments in Instruments can pinpoint
exactly where in your code you're hitting those fetching
and faulting hotspots, so you can focus your efforts
on the spots where you need to put the time.
And I also encourage you to absolutely take a look at the
header files and class documentation for NSFetchRequest
and NSExpression, and the Predicate Programming Guide.
And make use of the developer forums as well.
There's also, just searching on the net, you can come
up with all kinds of great information and resources.
So I'm going to wrap up this session today
by looking at the topic of migration.
First of all, why do you need to bother with migration?
Well, think back to the beginning of this session where
Miguel was describing that the data model is your contract
with Core Data that describes how your data
will be saved, and structured, and accessed.
So any time you go to make a change to your data model,
that's going to change how that data is saved and accessed.
So if you want access to your old
data, you need to adapt that old data
to a new structure, and you do that with migration.
In Leopard we introduced versioning and migration using
a custom mapping model that you could hand construct
with flexible logic to translate objects from your old
data model to objects in your new data model, in memory,
by fetching data from your old store with the old
data model, transforming them with the mapping model,
and then saving them to the new
store with the new data model.
In Snow Leopard and in iPhone OS 3
we introduced lightweight migration.
And lightweight migration works by looking at your old data
model and your new data model, analyzing the differences,
and inferring a mapping model automatically to
translate data from your old format to the new one.
An enormous benefit of this is
that lightweight migration is able
to perform this migration entirely in
the database using nothing but SQL.
So what kind of changes are supported
with lightweight migration?
Well you can add, or remove, or rename just about anything:
attributes, relationships, entities, all supported.
You can also change the numerical type of
attributes, so you can change an int to a float.
You can promote a relationship from a to-one
to a to-many, and preserve the
related objects in the new data model.
You can't go the other direction, however, because
from a to-many to a to-one, there's no way to infer
which objects should be saved and which ones to let go of.
You can even make changes to the
entity inheritance hierarchy.
So you can add a child entity, or a new parent, or you can
even take two peer entities, create a common new parent,
and move properties up from each of the child
entities into the new parent, and in the migration
all the data from those entities will be preserved.
So what do you have to do to take
advantage of lightweight migration?
First, you need to make sure you keep the old data models.
We need this for two reasons.
First, we need the old data model, so that we can compare
the old model to the new one to infer the changes.
Second, we can't read the old data
without the old data model.
So before you go to make any changes, go to
the Design menu in Xcode, choose Data Model,
add Model Version, and start making changes on the new one.
The second thing you need to do is set the migration
options when you load your persistent store.
That's the migrate persistent stores automatically option
(NSMigratePersistentStoresAutomaticallyOption) and the
infer mapping model automatically option
(NSInferMappingModelAutomaticallyOption).
Now if you've skipped over step
one you'll see an error like this,
Cocoa error 134130, "Can't find model for source store."
And that means Core Data couldn't find your
source model, so we can't do the migration.
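Setting those two options when adding the store could be sketched like this. The helper names are hypothetical, and the SQLite store type is an assumption.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: the two options that enable lightweight migration.
static NSDictionary *LightweightMigrationOptions(void) {
    return @{
        NSMigratePersistentStoresAutomaticallyOption: @YES,
        NSInferMappingModelAutomaticallyOption: @YES,
    };
}

// Hypothetical helper: opens a SQLite store, migrating it if needed.
// Returns nil and fills in *error on failure.
static NSPersistentStore *OpenStore(NSPersistentStoreCoordinator *coordinator,
                                    NSURL *storeURL,
                                    NSError **error) {
    return [coordinator addPersistentStoreWithType:NSSQLiteStoreType
                                     configuration:nil
                                               URL:storeURL
                                           options:LightweightMigrationOptions()
                                             error:error];
}
```

If the old model version is missing from your bundle, this call is where the error described above would surface.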
So I said you can rename just about anything.
And I meant it, but you have to give us a hint.
You have to give us a hint in your data model, so
that we can tell when you're renaming something,
as opposed to when you've deleted
one attribute and added a new one.
So I'll show you an example of how this works.
You need to set the renaming identifier, and
we'll do that here to change a song's name
in our old model to its title in our new model.
So you can see I've got the Xcode data modeling design
tool here, and I'm looking at the song title attribute.
And this is version 2 of our data model.
And I've highlighted where the renaming
identifier appears in the inspector.
So all we need to do to preserve the data that used to be
stored as the song name in our new model as the song title,
is put name in as the renaming identifier.
A couple of tips to keep in mind when you're dealing with
lightweight migration: changing a transient attribute
to a persistent attribute is the same to
lightweight migration as adding a new one.
A transient doesn't exist in the persistent store.
So all of the same rules apply as
when you're creating a new attribute,
that it needs to be optional or have a default value.
Or for a new relationship it must be optional.
There's a lot more information available about migration
in the Core Data Model Versioning
and Data Migration Programming Guide.
It covers both lightweight migration and
the custom mapping style of migration.
Before I let you go there's one
more thing I wanted to highlight.
This is a technique that's incredibly useful
for adding back some of that custom flexibility
that you might miss, but doing it in lightweight migration.
And the way it works is with a post-processing
step that you use after migration.
So you open your store with
the migration options, and check the metadata for a custom key
that you've chosen, like DonePostProcessing.
If the key isn't set, then you do your post-processing to
populate derived attributes, or insert or delete objects
that you want present in or removed from
your new data model.
And then set the store metadata, so that
you don't wind up post-processing again.
Then save the changes in metadata and you're good to go.
Now I'll show you a code sample,
so you can see exactly how this works.
First we open the store with the migration options enabled.
Then we check the store metadata for
our custom key, DonePostProcessing.
And we check to see if the value for
DonePostProcessing is less than 2.
If it's less than 2, then it's time for us
to update our normalized titles for books.
So we go ahead and do that work to
populate the derived attributes.
And then we make a copy of the metadata to update
our custom key, but preserve the other keys
in the metadata, and set it back on the store.
And finally we save.
A really useful technique for adding back
some of that flexibility that you get
with custom migrations in a lightweight migration form.
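The slide code is not part of the transcript, so here is only a rough sketch of the steps just described. It assumes the store has already been opened with the migration options; the helper name, the updateDerived block, and the exact handling of the version number 2 are illustrative assumptions.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: runs one-time fix-ups after a lightweight migration.
// "DonePostProcessing" is the custom metadata key from the talk; the
// updateDerived block stands in for your own work, such as filling in
// normalized titles.
static BOOL PostProcessIfNeeded(NSPersistentStoreCoordinator *coordinator,
                                NSPersistentStore *store,
                                NSManagedObjectContext *context,
                                void (^updateDerived)(NSManagedObjectContext *),
                                NSError **error) {
    NSDictionary *metadata = [coordinator metadataForPersistentStore:store];
    // A missing key reads as 0, so a never-processed store qualifies too.
    if ([metadata[@"DonePostProcessing"] integerValue] >= 2) {
        return YES;
    }

    // Populate derived attributes, insert or delete objects, and so on.
    updateDerived(context);

    // Copy the metadata so the other keys are preserved, then record
    // that post-processing for version 2 is done.
    NSMutableDictionary *updated =
        [NSMutableDictionary dictionaryWithDictionary:metadata ?: @{}];
    updated[@"DonePostProcessing"] = @2;
    [coordinator setMetadata:updated forPersistentStore:store];

    // Save the changes made by updateDerived along with the metadata.
    return [context save:error];
}
```

Using a number rather than a flag leaves room for later model versions to add their own post-processing steps by comparing against 3, 4, and so on.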
So I hope this session has given you some
ideas on the many different ways
that you can use Core Data to mature your application.
And I want to stress that you want to invest the time
to come up with a good initial model for your data.
And then Core Data will help you out,
as you need to adapt your application
to evolving access patterns and
incremental changes over time.
And if you do find yourself wanting
for some feature or encountering a bug,
please use the bugreport.apple.com
website to report those to us.
We read them.
And the more information you can provide to help
us understand and reproduce the problem you see,
or feature you'd like, the better
chance it has to be dealt with quickly.
For more information please contact Michael Jurewitz,
our Developer Tools Evangelist, jurewitz@apple.com.
And take a look at the Core Data documentation.
There are great programming guides, examples, and
tutorials, and they're always being updated.
And also take a look at the Apple Developer Forums.
And if you want even more focus on performance on
iPhone applications, come to tomorrow's session at 4:30
where Melissa will talk about Optimizing Core
Data Performance on iPhone OS in Presidio.