WWDC2016 Session 720

Transcript

[ Music ]
[ Applause ]
>> Good afternoon.
So I'm going to talk to you
this afternoon about how
to structure your programs
with concurrent programming
and what we've done this year
that's new in GCD in Swift 3.
My name is Matt.
I'm going to be joined
later by Pierre.
We're both on the Darwin
Runtime team here at Apple.
So when you create a new
project, you're going
to have something that looks
a little bit like this.
You have your application.
That application
gets its main thread.
That main thread is responsible
for running all of the code
that powers your user interface.
As you start to add code to
your application, you're going
to find the performance of
your application changes
quite drastically.
For instance, if you start to
introduce large items of work,
say data transforms or image
processing, on your main thread,
you're going to find
that your user interface
suffers drastically.
On macOS, this can be the
spinning wheel appearing.
On iOS, it could be
something more subtle.
Your user interface
will slow down,
or maybe even stop entirely.
So I'm going to take you
through some basics on how
to structure your application to
avoid this kind of problem.
And later on in the talk, Pierre
is going to come and take you
through some more
advanced topics.
So how do we deal with
this kind of problem?
We have to start by
introducing the idea
of concurrency to
your application.
Concurrency allows multiple
parts of your application
to run at the same time.
On our system, you achieve
concurrency by creating threads.
A CPU core can execute one of
your threads at any given time,
but the penalty for introducing
concurrency
to your application is
that it's much more difficult
to maintain your thread safety.
The other threads that you've
introduced can observe the
effects of you breaking your
code invariants while you're
performing operations
on other threads.
This becomes a bit of a problem.
So how can we help?
Well, GCD is the concurrency
library on our platform.
It helps you write
multi-threaded code that works
on everything from
an Apple Watch,
through all of our iOS devices
and Apple TVs, all
the way up to a Mac.
So in order to help you
with your concurrency,
we introduce some abstractions
on top of threads themselves.
That is, dispatch
queues and run loops.
A dispatch queue is a construct
that allows you to submit items
of work to that queue.
In Swift, these are closures.
And dispatch will
bring up a thread
and service that work for you.
And when dispatch is finished
running all of the work
on that thread, it can tear
that worker thread
down for you itself.
As I said before, you can
also create your own threads,
and on those threads,
you might run run loops.
And then finally, on the first
slide we saw you get your main
thread, and the main
thread is special.
It gets both a main run
loop and a main queue.
So dispatch queues have two main
ways that you can submit work
to them, the first of which
is asynchronous execution.
This is where you can queue
up multiple items of work
to your dispatch queue,
and then dispatch, again,
will bring up a thread
to execute that work.
Dispatch will one by
one take items off
that queue and execute them.
And then when it's finished
with all items on the queue,
the system will reclaim
the thread
that it brought up for you.
The second mode of execution
is synchronous execution.
This is where, for instance,
if we have the same setup
as we had before,
the dispatch queue
with some asynchronous work.
But you have your own
thread, and that thread wants
to run code on that queue
and wait for it to happen.
You can submit that work
to the dispatch queue,
and then that's where
it will block.
It will wait there until
the item that you've asked
to execute has completed.
We might add some more
asynchronous work to that queue,
and then dispatch will
bring up a thread in order
to service the items
on that queue.
Again, the asynchronous
items will be executed there,
and then, when it comes time
to run the synchronous item
that you've asked to run,
the dispatch queue will pass
control over to the thread
that was waiting, execute
that item, and then control
of the dispatch queue
will return back
to a worker thread
controlled by dispatch.
It will continue to drain the
rest of the items on that queue,
and then also reclaim the
thread that it was using.
So now I've shown you how
you actually submit work
to dispatch.
How do we use that to help
us solve the problem we
had earlier?
What we want to do is get the
work off your main thread that's
causing you to block
your user interface,
and we do that by taking
the transform that we had
on that main thread and
running it on a different queue.
So you can take the
transform, and you can back it
with a dispatch queue.
And now when you want
to transform data,
you can move the value of that
data to your transform code
on the other queue,
transform it,
and then send it back
to your main thread.
This allows you to perform
that work while the main thread
is idle and servicing events.
So what does that look
like in real code?
Well, it's really simple.
So first of all, you can
create the dispatch queue
to submit your work to by just
creating a DispatchQueue object.
It takes a label, and that
label is visible in the debugger
as you're writing
your application.
Dispatch queues execute
the work that you give them
in first in, first out order.
That is, the order that
they were submitted
to the queue is the order that
dispatch will run them in.
And then you can use the async
method on your dispatch queue
to submit work to that queue.
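Here's a minimal sketch of that
in Swift 3 (the queue label and
the work in the closure are
illustrative, not from the
session's slides):

    import Dispatch

    // The label is what you'll see in the debugger.
    // A queue like this runs its items one at a time,
    // in first-in, first-out order.
    let queue = DispatchQueue(label: "com.example.imagetransform")

    queue.async {
        // Long-running work, such as resizing an image,
        // runs here, off the main thread.
    }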
So now that we've actually
submitted our resize operation
here to a different
queue, well how do we get
that back to the main thread?
That's very simple too.
The dispatch main queue
services all of the items
that you execute on it on
the main thread itself.
This means you can just
call DispatchQueue main
and then call async
on that main queue,
and that code will execute,
and you can update your
user interface from there.
As you can see, it's very simple
to chain work from one queue
to another queue and back
to your main queue again.
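A sketch of that chaining,
reusing the queue from above
(the resize and update steps
stand in for your real code):

    queue.async {
        // 1. Do the expensive transform off the main thread,
        //    for example producing a resized image.
        DispatchQueue.main.async {
            // 2. Back on the main thread: update the
            //    user interface with the result.
        }
    }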
So now we've seen how to
get control of your code
and put it on different threads.
It comes at somewhat of a cost.
You have to control the
concurrency in your application.
The thread pool that dispatch
uses will limit the concurrency
you achieve in order to use all
of the cores in your device.
However, when you
block those threads,
if you wait for other parts of
your application or you wait
in system calls, those
worker threads
that are blocked can cause
more worker threads to spawn.
Dispatch is trying to give you
the concurrency you deserve
by giving you a new thread to
continue executing code on.
This means it's very important
to choose the right number
of dispatch queues to
use to execute code.
Otherwise, you can
block one thread.
Another thread will come up,
and you will block
another one, and so on.
And this pattern is something
that we call thread explosion.
We covered thread
explosion and its problems
in last year's talk,
Building Responsive
and Efficient Apps with GCD.
So I recommend you go back
and watch that from last year.
So now we've seen how
you do simple things,
like getting work off of the
main thread onto another queue.
But how do we actually apply
this to your application?
Well, if we go back to
the system we had before,
what you want to do is identify
areas of your application
with independent data flow.
So as we've seen, it
might be image transform,
or you might have a database.
You want to take those
areas, and split them
into distinct subsystems,
and then you want
to back those subsystems
with a dispatch queue each.
This will give each subsystem
a queue to execute work
on independently without
suffering from the problems
of having too many queues
and too many threads.
And we saw a couple of
slides ago how easy it is
to chain work together.
That is where you can
async one block to another,
and then to another queue,
and then back to,
say, the main queue.
But there's a second
pattern I want to show you
that we feel is equally useful.
That is grouping work and
waiting for that work to finish.
If you have a single
task that wants
to spawn multiple different
work items, and you only want
to make progress when
those work items are
finished, well, you can do that.
And in order to do that,
dispatch can help you.
So if we go back to the
diagram we had before,
if the user interface spawns off
three separate work items here,
you can create a dispatch group.
Dispatch groups are here
to help you track work,
and they're very simple
to create in Swift.
You just create a
DispatchGroup object.
And now when you submit work to
dispatch, you can add the group
as an optional parameter
to your async call.
You can add more work to
that group, and you can do it
to different queues,
but associate it
with the same group.
And each time you
submit work to the group,
the group will increment
the counter of items
that it's expecting to complete.
Then finally, once you've
submitted all your work,
you can ask the group
to notify you when all
that work is finished,
and you can tell it to do
so on a queue that
you've chosen.
So now one by one, these
items will start to execute,
and as they execute the count
in the group will
decrement every time a work
item completes.
And when, finally, the last
work item has finished,
the group will go ahead and
submit your notification block
to the queue that
you've requested.
So here, we submitted
the notification block back
to the main queue, and it
will run on the main thread.
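Putting those pieces together,
a minimal sketch (the queue
names are illustrative):

    let group = DispatchGroup()
    let transformQueue = DispatchQueue(label: "com.example.transform")
    let databaseQueue = DispatchQueue(label: "com.example.database")

    // Each async(group:) call increments the group's count
    // of items it is expecting to complete.
    transformQueue.async(group: group) { /* work item one */ }
    databaseQueue.async(group: group) { /* work item two */ }

    // Runs on the main queue once every grouped item has finished.
    group.notify(queue: DispatchQueue.main) {
        // Safe to make progress and update the UI here.
    }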
Now there's a third pattern
that I feel we should show.
These two have been asynchronous
execution, and the third one is
to deal with synchronous
execution.
You can use synchronous
execution
to help you serialize
state between subsystems.
Dispatch queues
are serial by nature,
and you can use this for their
mutual exclusion properties.
That is, when you submit work
synchronously to that queue,
you know the work that
the subsystem is running
on that queue isn't
running at the same time.
You can use this to build
very simple thread-safe
accessors to properties
of your subsystem
from other places.
So for instance, here
you can call queue sync,
and you can return a
value out of queue sync,
and we will capture that value
on the queue, and then return it
to you as that work
item completes.
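For instance, a minimal sketch
(the protected value here is
illustrative):

    let subsystemQueue = DispatchQueue(label: "com.example.subsystem")
    var count = 0  // only ever touched on subsystemQueue

    // sync blocks the caller until the closure has run
    // on the queue, then hands back the captured value.
    let current = subsystemQueue.sync { count }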
However, you have to be
careful when you start
to introduce this
pattern, because you start
to introduce a lock ordering
graph between your subsystems.
Now, what does that mean?
Well, if you have the subsystems
that we had before, and you sync
from one place to another
and then to another,
and then finally, you end up
syncing back to the first one.
Well, now we have a deadlock.
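In code, that anti-pattern looks
roughly like this (queue names
are illustrative):

    let queueA = DispatchQueue(label: "com.example.subsystemA")
    let queueB = DispatchQueue(label: "com.example.subsystemB")

    queueA.async {
        queueB.sync {
            queueA.sync {
                // Deadlock: queueA is still busy running the
                // outer block and can never reach this item.
            }
        }
    }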
Pierre's going to come and talk
about deadlocks later
on in this talk.
So now we've seen a bit how
to structure dispatch use
in your application as a whole.
How can we also apply it to
your usage inside the subsystem?
Well, you can use dispatch
to classify the work
that you submit, and to do so,
we need to introduce
quality of service classes.
These are classes that provide
an explicit classification
of the work that you are
submitting to dispatch.
So it allows you as the
developer to indicate the intent
of the code that you are
submitting to dispatch.
And dispatch can use that to
affect how it executes the code
that you've given us.
That is, the code
could be executed
at a different CPU priority,
different IO scheduling
priority, and so on.
And we covered QoS in detail in
the same talk from last year,
Building Responsive and
Efficient Apps with GCD.
So how do we actually
use the QoS classes?
Well, it's as simple
as it was before.
You can pass the
QoS class to async
as another optional parameter.
So here, we're submitting
background work to our queue.
And if you come along later
and submit work to that queue
at a higher QoS, dispatch will
help you resolve the priority
inversion that's created.
That is, it will raise the
items in front of your work
on the dispatch queue
to the higher QoS just
so that they execute quicker
and you actually get
your item executed
as quickly
as you expected.
However, it's important
to note at this point,
this doesn't help your
work jump the line.
All that does here is elevate
all the work in front of you,
so it executes as quickly as
the work that you've submitted.
And then you can also
create dispatch queues
that have a specific QoS class.
This is very helpful,
for instance,
if you have background work
that you always want to execute
at background, you
can create a queue
that executes all
that as background.
And when you submit
work to that queue,
that's the QoS that we'll get.
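A short sketch of both forms
(labels and QoS choices are
illustrative):

    // A queue whose work always runs at background QoS.
    let maintenanceQueue = DispatchQueue(label: "com.example.maintenance",
                                         qos: .background)
    maintenanceQueue.async {
        // Runs at .background.
    }

    // Per-item QoS on an existing queue.
    maintenanceQueue.async(qos: .userInitiated) {
        // Dispatch raises the .background items ahead of this
        // one so it isn't stuck behind lower-priority work.
    }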
So on a more granular
level, when you async
to a dispatch queue, it
captures the execution context
at the point where you async.
Now, execution context
means things like QoS.
It also means the logging
context that you currently have.
But if you want more
control over this,
you can use DispatchWorkItem
to create items
where you have more control
over how they execute.
For instance, here we're
creating a work item
with assignCurrentContext,
and that takes the QoS
of the execution context at the
time you create the work item,
rather than the time
that you submit it
to your dispatch queue.
This means you can create
that item, save it for later,
and when you do finally
execute it, we will submit it
to dispatch with the properties
of when you created it.
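A sketch of that, assuming a
queue like the ones created
earlier (the closure body is
illustrative):

    // .assignCurrentContext captures QoS and other execution
    // context at creation time, not at submission time.
    let item = DispatchWorkItem(flags: .assignCurrentContext) {
        // deferred work
    }

    // Later, possibly from a very different context:
    queue.async(execute: item)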
And now while we're
talking about work item,
there's another part that's very
useful for DispatchWorkItem,
and that is waiting
for them to complete.
You can use the wait method on
DispatchWorkItem to indicate
to dispatch that you
need that work item
to complete before
you can make progress.
Dispatch will respond by
elevating the priority
of work ahead of
it up to that QoS,
like it did with priority
inversions before.
And it can do this because
the DispatchWorkItem knows
where it was submitted, which
queue you want to run it on.
And therefore, dispatch knows
which queue it has to elevate
in order to get your
work item completed.
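For instance (a sketch, reusing
a queue from earlier):

    let prerequisite = DispatchWorkItem {
        // work we cannot proceed without
    }
    queue.async(execute: prerequisite)

    // Blocks this thread; because the work item knows which
    // queue it was submitted to, dispatch can boost the
    // items ahead of it on that queue.
    prerequisite.wait()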
And it's very important to
note that, because waits
on semaphores and
groups don't store this
ownership information.
This means if you wait on
a semaphore, it isn't going
to cause the things in front
of your semaphore signal
to execute any quicker.
And now I'd like to invite
Pierre on to the stage.
He's going to take you through
more about synchronization.
[ Applause ]
>> Thank you, Matt.
So with Matt, we've
seen how to use dispatch
from the perspective
of your app.
I will walk you through more
details about what it means
from the perspective
of your objects now.
But first, a note about Swift.
Synchronization is not part
of the language as of Swift 3.
You only get one guarantee
from the language today,
which is that your
global variables are
initialized atomically.
But what you don't have is
atomic class properties,
and lazy properties
of your classes aren't
atomic either.
What that means is that if
you're accessing these properties
from multiple threads
at the same time,
your lazy initializer
may actually run twice.
So you have to synchronize.
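One way to do that with a queue
(a sketch; the class and
property names are
hypothetical):

    class Model {
        private let queue = DispatchQueue(label: "com.example.model")
        // Not atomic by itself: the initializer could run
        // twice if two threads raced to touch it directly.
        private lazy var items: [String] = self.loadItems()

        // Funnel all access through the queue so the lazy
        // initializer can only ever run once.
        var allItems: [String] {
            return queue.sync { items }
        }

        private func loadItems() -> [String] { return [] }
    }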
The language doesn't give us
really a lot of tools today,
but that doesn't mean that
races aren't a problem.
There is no such thing
as a benign race.
What that means for you is
that if you forget a
synchronization point,
you will end up with crashes
or corrupted data
for the users of your apps.
I invite you to go
watch the talk
that happened earlier this week
about TSan, the Thread
Sanitizer, which is a tool that
helps you find out where
in your app you're missing
proper synchronization.
So what do we use
for synchronization?
Traditionally you
would use a lock.
And in Swift, since you have
the entire Darwin module
at your disposal, you will
actually see the traditional
struct-based C locks.
However, Swift assumes
that anything
that is a struct can be
moved, and that doesn't work
with a mutex or with a lock.
So we really discourage you
from using these kinds
of locks from Swift.
If you want a traditional
lock, what you can use, however,
is Foundation.Lock, because
unlike the traditional struct
based C locks, it's a class,
and it's not prone to any
of the problems I
mentioned earlier.
However, that means
that you're allocating
an object,
which may be undesirable
for you.
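For reference, a minimal sketch
with NSLock, the lock class
Foundation exposes to Swift 3
(the shared state here is
hypothetical):

    import Foundation

    let lock = NSLock()
    var sharedValue = 0  // protected by `lock`

    func increment() {
        lock.lock()
        defer { lock.unlock() }  // defer makes the unlock hard to forget
        sharedValue += 1
    }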
And if you want something
that's smaller and that looks
like the locks that you have
in C, then you have to call
into Objective-C and introduce
a base class in Objective-C
that has your lock as an ivar.
And then you will expose
lock and unlock methods,
and a tryLock if you need it
as well, that you will be able
to call from Swift when you
subclass this class.
You will notice on that slide
we're using unfair lock.
It's a new API that
we introduced.
It's not prone to
priority inversions.
It doesn't spin, unlike the
spin lock that we deprecated.
That being said, this is a GCD
talk, so what we encourage you
to do is to use dispatch queues
for synchronization purposes.
The first reason why is that
these are significantly harder
to misuse than a
traditional lock.
Your code will run
in a scoped way,
which means that you
cannot forget to unlock.
The other thing is that
queues actually are way better
integrated with the runtime,
Xcode, and the debugging tools.
So how do we use
queues to synchronize?
I will walk you through
a problem
of implementing an
atomic property.
So here we have this object
that has an internal state
that we want to access
in a safe way.
We will use a queue
to synchronize.
How do we write our
getter and our setter?
The getter is just
about returning this
internal state with sync.
It gives us mutual exclusion.
Matt talked about this earlier.
And the setter is just as simple.
You will just set your new
state under the protection
of sync on your queue.
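Put together, the pattern looks
roughly like this (names are
illustrative):

    class MyObject {
        private let internalQueue = DispatchQueue(label: "com.example.internal")
        private var internalState: Int = 0

        // An "atomic" property: both accessors get mutual
        // exclusion by running on the internal queue.
        var state: Int {
            get {
                return internalQueue.sync { internalState }
            }
            set {
                internalQueue.sync { internalState = newValue }
            }
        }
    }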
This pattern is pretty simple,
and you can actually extend it
to protect significantly
more complex invariants.
I told you that queues
are better integrated
with your debugging tools.
They also have more features.
And new in this release, we
let you express preconditions.
It lets you express that you
have invariants in your code
that really need to hold
on that given queue,
and you assert that this way:
a dispatch precondition
that you're on that queue.
Sometimes the opposite
is actually useful.
You want to make sure that a
given piece of code never runs
on that queue, because you
know you may synchronize
with that queue, and you express
it this way: a precondition
that you're not on the queue.
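Both checks in context (a
sketch; the surrounding class is
hypothetical):

    class Subsystem {
        private let queue = DispatchQueue(label: "com.example.subsystem")
        private var value = 0

        private func checkInvariants() {
            // Must only be called from code already on `queue`.
            dispatchPrecondition(condition: .onQueue(queue))
        }

        func currentValue() -> Int {
            // This will queue.sync, so it must never already
            // be running on `queue`, or it would deadlock.
            dispatchPrecondition(condition: .notOnQueue(queue))
            return queue.sync { value }
        }
    }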
So that's about synchronization,
synchronizing your state.
And as Matt said
earlier, it's way better
if you can just organize
your application in a way
that you're passing
values around, which
don't need synchronization
in the first place.
However, in real-life code, you
need some objects to be accessed
from several of your subsystems.
What that means is that
all these subsystems have a
reference to these
objects, and getting rid
of them actually
can be a challenge.
I will now walk you through
a four-step state machine
that will help you get
this right and not end
up with weird crashes that
are hard to reproduce.
Your state machine starts
with the first step, setup.
Setup is about creating
the object
and giving it the properties you
need it to have for its purpose.
Second, you will want
to activate this object.
What that means is that you
actually make this object be
known to other subsystems.
You start using it in
a more concurrent world
to perform its duties.
And then the hard part starts.
You want to get rid
of that object.
And so the third
step is invalidation.
Invalidation is about making
sure that all the parts,
all your subsystems know that
this object is going away
so that, fourth, it
gets deallocated.
So let's look back.
Setup, activation,
invalidation, deallocation.
This is quite abstract,
so we will walk
through a more concrete example
that I hope you will relate to.
Let's go back to the application
that Matt introduced earlier
and focus on two
of the subsystems.
First, we have our
user interface,
which will handle stuff such
as the title bar of your app.
And I will assume that you
are able to observe some kind
of state changes in your
subsystems so that here,
for example, for our
data transform subsystem,
when it starts performing
some work,
we present a visual
indication to the user.
And then when the data transform
subsystem stops doing any work,
that visual indication
goes away.
So how do we implement
that BusyController?
So we remember the
first step is setup.
Setup is about you picking
the properties that you want
for your code, and the
animation, and all that.
That's really up to you.
Then we'll want to
start using that object,
and that's activation.
We activate it.
What does that mean
for our controller?
What that means is that we want
to start receiving these
state notifications,
state change notifications,
so we will register
with that subsystem, and
ask for the notifications
to be received on
the main queue.
We're doing UI, so we want to
handle the logic there.
Now that it's activated,
well, that's your code
that you want to run.
That's your animation.
That's your very nice
UI for your application.
But then, there are
some parts of your app
that maybe don't need
that visual indication,
or that maybe don't use the
data transform subsystem,
and you want to reclaim the
resources of that controller,
and you want to get rid of it.
It's very tempting
to say, "Okay,
the main thread is
the only subsystem
that really owns
this BusyController,
so I will get rid of it
like this," and just deinit,
and register it from
the subsystem,
and hope for the best.
That doesn't work,
and I will walk you
through two examples of why.
So let's step back a bit.
Our BusyController is
referenced from your UI,
from the main queue and
your user interface.
However, when we registered it
with the data transform
subsystem, it was very likely
that a reference was taken
from the data structure
onto this object, which means
that when we're getting rid
of the reference that
the main thread has,
there's still one left, which
means that deinit doesn't run,
which means that it
doesn't get unregistered,
it doesn't get collected,
and you end up with
abandoned memory.
However, you're very
skilled developers,
and you know how to fix that.
Weak references, and I
will say yes, you're right.
However, that's not
the end of that story,
because this is not really
what an app looks like.
The object graph is
significantly more complex.
It's not uncommon to have
objects that hold references
to a whole bunch
of other objects,
such as this octopus
object right here.
We will continue getting rid
of that reference
from the main thread.
And unlike before, this
is not abandoned memory,
because this octopus object
knows it has this reference.
But then, if we get rid of that
octopus object from the context
of the data transform
subsystem, what will happen?
It will get rid of that
reference that it has
on the BusyController,
which, remember,
will run the deregistration,
because that's what deinit does.
And then you have a problem,
because it's very likely
that to do that you
need to synchronize
with the dispatch queue that
owns that data structure.
And you guessed it, we
end up with a deadlock.
Actually, that bug is so common
that we've made it an
assertion, new in this release.
If you run that code on this
release, it will assert,
and on OS X or in the simulator,
the crash report you get
actually will have
application-specific information
that points you toward
the actual problem you have,
so that you can fix it easily.
Okay, so now we know
we really don't want
to unregister from deinit.
How do we fix that?
We fix that by having our
third step, invalidation,
be an explicit function call.
And in this invalidation,
we do the unregistration.
Also, since we have
preconditions, let's use them,
because this object,
this BusyController,
really should be managed from
the main thread, and you want
to make sure that users of
your API do that properly.
So you will want to
have a precondition
that this only happens
on the main thread,
or the main queue, even.
But that's not quite it.
We have a last problem.
Remember, this all
happens on the main thread,
and you have this subsystem,
the data transform subsystem,
which is still sending you
state transitions,
and you may have some
that are still happening
at the time you're invalidating.
How do we resolve that?
Well, you want to track
invalidation as a real state.
And what does that mean?
Well, just what it is.
You want to track invalidation,
for example, here, as a Boolean
in your object, and you
remember when you did.
At the same time, let's
throw in more preconditions,
and make sure, enforce,
that before your
object is deallocated,
it has been properly
invalidated.
It will help you find bugs.
Why is it interesting?
Because now, in your code
that handles the notification
for the state transitions,
we can observe
that the object was invalidated
and actually drop the
notification on the floor
and update the UI in the
way that is appropriate.
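Pulled together, the
BusyController skeleton might
look like this (the subsystem
registration and notification
details are hypothetical
stand-ins):

    class BusyController {
        private var invalidated = false

        func invalidate() {
            // Must happen on the main thread, and before deinit.
            dispatchPrecondition(condition: .onQueue(DispatchQueue.main))
            invalidated = true
            // unregister from the data transform subsystem here
        }

        func handleStateChange() {
            dispatchPrecondition(condition: .onQueue(DispatchQueue.main))
            // Drop notifications that race with invalidation.
            if invalidated { return }
            // update the UI as appropriate
        }

        deinit {
            precondition(invalidated,
                         "BusyController must be invalidated before deallocation")
        }
    }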
Okay, that was quite
the complex example.
However, I hope that you
will now go look back
at your applications and your
code, and try to find places
where this pattern will help
you reduce the complexity
of your code, and
maybe remove bugs.
It should also not
surprise you,
given that we're giving you
this advice, that GCD objects,
whose purpose in life is
to be used concurrently,
follow exactly the same pattern.
So let's look at the GCD
object with that in mind.
So we remember the
first step is setup.
Setup for dispatch objects
is all the things you can do
when you build the object and
all the attributes you can pass.
Matt already showed labels
and queue attributes earlier.
And here, we also have
a dispatch source
where we monitor all the
[inaudible] attributes.
Sources also have handlers,
and the event handler
particularly is the code
that will run when the resource
that you're monitoring fires,
that is, has events pending.
For the source that
we have here, well,
that's when there is
data available to read.
Once you've set up your
object and it's ready to go,
you want to use it
and activate it.
New in this release, we've made
that step an API
[inaudible] called activate.
It used to be that
for dispatch sources,
the initial resume had
exactly that meaning.
We've actually now
made suspension
and activation be two
separate concepts.
Also, unlike resume, activate can
be called several times,
and it only acts once.
The contract however is that
once you've called activate,
you won't mutate the properties
of your objects anymore.
We've also found that creating
queues, the way you do sources,
initially
inactive is useful,
and we've added a new attribute
that lets you do exactly that,
which is actually named
initiallyInactive.
Once the queue is created, you
can pass it around, finish
configuring it the way you like,
and finally, activate it.
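For example (the label is
illustrative):

    let inactiveQueue = DispatchQueue(label: "com.example.worker",
                                      attributes: .initiallyInactive)

    // Work can be queued, but nothing runs yet.
    inactiveQueue.async {
        // executes only after activate()
    }

    // Finish configuring, hand the queue around, then:
    inactiveQueue.activate()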
Many of the dispatch objects
don't really need explicit
invalidation, such
as groups or queues,
because they become
inactive by the sheer fact
of you stopping using them.
However, the story is quite
different for sources,
and sources have an
explicit invalidation.
It's called Cancel.
Cancellation in sources does
the thing that you'd expect,
which is that you
stop getting events
for the thing you're monitoring.
But that's not all it does.
The second thing it
does is that if you set
up a cancellation handler on
your source, such as here,
it will run on the target queue
of your source at cancellation time.
It is actually where you want
to get rid of the resource
that you're monitoring,
such as closing [inaudible]
or freeing memory.
Last, but not least,
cancellation for sources is
when your handlers
are destroyed.
Handlers are closures.
They capture objects,
maybe even the source itself.
They can be part of
a retain cycle.
Calling cancel is how you
can break that retain cycle.
It's why it's very important
to always cancel your sources.
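A compact sketch of a source's
full lifecycle (the file
descriptor and what it monitors
are illustrative):

    import Dispatch
    import Darwin

    let fd = open("/dev/null", O_RDONLY)
    let source = DispatchSource.makeReadSource(fileDescriptor: fd,
                                               queue: DispatchQueue.main)
    source.setEventHandler {
        // Runs when there is data available to read.
    }
    source.setCancelHandler {
        // The safe point to dispose of the monitored resource.
        close(fd)
    }
    source.activate()

    // Later, when done: stops events, runs the cancel handler,
    // and releases the handlers, breaking any retain cycles.
    source.cancel()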
So you remember a bit earlier,
we added a lot of preconditions
in our code, because
we want to make sure
that concurrently used
objects are always used
in a way that you can expect.
Dispatch is no different,
and expects that at the time
your object gets deallocated,
it's in a defined state,
and dispatch expects
two things from you.
First, that your
objects are active.
And second, that they
are not suspended.
The reason why is that being
suspended or inactive means
that you as a developer
don't think that it's safe
to run the code associated with
it, but we need to run code
to get rid of the object.
Okay. So we've seen today how
you can think about your app
in terms of data flows and how
you should use that to divide it
into fairly independent
subsystems that use value types
for communication purposes.
If you need to synchronize
state,
we also showed you how you can
use Dispatch Queues to do that.
And finally, when you
have objects that are used
in a very heavily concurrent
world, how to use activation
and invalidation to
get this pattern right.
Here is a link that will show
you more resources associated
with this talk.
And a few related sessions
that, if you're interested
in dispatch, you
should probably check out,
because they are
really interesting.
And that's it.
It was Concurrency with GCD.
[ Applause ]