WWDC2010 Session 208

Transcript

>> Quinn "The Eskimo!": Greetings.
My name is Quinn "The Eskimo!"
I work for Apple Developer Technical Support,
answering questions from people like you, developers,
about networking, file systems, threads,
and other sort of Core OS little things.
Welcome to Network Apps for iPhone OS Part Two.
If you were at Part One, thank you for coming back.
I admire your persistence.
[ Laughter ]
If you didn't make part one, that's probably okay.
Most of this talk sort of stands alone.
I'll make some references back to part one and you can
catch it on the videos, but mostly you should be fine.
I'm going to start today -- well, not today
because I'm already an hour in -- with a quote.
This is from my friend at DTS who was there when I joined.
And he's since moved on to debug kernel panics for a
living, which sounds like fun to me but I'm going to --
and it's relevant to this talk because the
previous talk was all about architecture.
It was about problems and architecting
your application to solve those problems.
This talk is all about practical matters.
And so the practical matters we're going to cover are,
number one, the big one -- asynchronous programming --
and then shorter sections on debugging and common mistakes.
I have certain religious objections to some
terminology, "anti-patterns" is one of them,
"Cloud" is another so we'll avoid that term.
So to start.
[ Applause ]
As most of you know -- well, can tell
from my accent -- I'm Australian.
And -- thank you.
And I was looking for an icon for the practical
and I wanted sort of an image of filling developers
up with knowledge or something like that.
So I thought, "What's more practical than beer?"
And the answer is: Nothing.
[ Laughter ]
So I took this and I turned it into
my cheesy graphic for the talk.
And you know I'm a fan -- if you came to
Part 1, you won't have missed this --
you will know that I'm a fan of the cheesy graphics.
And I also sort of wanted a filler bar so you
guys could tell your progress along the talk.
And for me -- for you guys it's a
filler bar and for me it's a goal.
When we get to the end, after two hours of talking
about networking, I'll be well-ready for a beer.
So to start -- asynchronous programming: glass is empty.
Bummer. Asynchronous programming is a big
topic so we're going to break it into three.
There's the basics.
There's a discussion of run loops and then
a short discussion of state management,
which is how to connect the complicated
model -- state of your model objects --
to the hopefully simple state of your front end objects.
We're going to kick off with the basics
and what is more basic than a definition?
Now, I had a hard time coming up
with a definition of asynchronous
and synchronous programming in
words so I've defined it in code.
People kept saying my slides were
short on code so I keep adding it.
I was helped.
So here you see on the left is a classic synchronous network
program: the start method runs; it runs the first request;
gets the results from the first request,
processes them; runs the second request;
gets the results, processes them; and so on.
And it does all of that before
returning from the start method.
This is called synchronous.
On the right is an asynchronous program: the start
method runs; it starts the first request and then immediately
returns; and when the first request is done,
another method is called -- request1Done it's called;
and that processes the results from the first request,
starts the second request; and the
whole process continues on from there.
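As a sketch, the two shapes just described look something like this in Objective-C (the two start methods belong to two separate classes, shown together for comparison; method names like runRequest1 and request1Done: are illustrative, not from the slides):

```objc
// Synchronous: -start blocks until every request is done.
- (void)start {
    NSData *result1 = [self runRequest1];   // blocks on the network
    [self processResults1:result1];
    NSData *result2 = [self runRequest2];   // blocks again
    [self processResults2:result2];
}

// Asynchronous: -start returns immediately; results arrive via callbacks.
- (void)start {
    [self startRequest1];                   // returns at once
}

- (void)request1Done:(NSData *)result1 {
    [self processResults1:result1];
    [self startRequest2];
}

- (void)request2Done:(NSData *)result2 {
    [self processResults2:result2];
}
```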
Now, people often ask us "why?"
If you've been to any network sessions this week,
you'll hear constantly "program
asynchronously," and there's good reasons for that.
First of all -- well, the obvious resistance to that is that
the synchronous programming on the left clearly is easier --
there are fewer lines of code, there are fewer methods,
everything was shorter and simpler or it appeared to be.
But yet we network programmers keep
nagging you about asynchronous.
And why is that?
And it's because really fundamentally, the network is
asynchronous; you can't control the order that things happen
on the network, you just have to adapt to it.
And if you do your network programming synchronously,
what happens is that you have a mismatch between your code,
which is running synchronously, and the
underlying network code, which is running asynchronously.
And that causes all sorts of problems.
And in addition to that, what we find
is that synchronous programs are fine
for when you're writing your third-year paper at university
and you want to write a simple test network program.
But when it turns into writing a real program that works on
the real network where it has to deal with errors and it has
to deal with latencies and it has to deal
with cancellation, then the balance tips
and synchronous networking is no longer easier.
Doing cancellation in synchronous networking is a nightmare.
Doing cancellation in asynchronous networking is trivial.
And so it turns out that asynchronous
networking becomes easier
as you move along, as you deal with all the edge cases.
And that's why we recommend it.
There's another critical equation
in iPhone OS, which is this thing.
If you do synchronous networking programming on the
main thread, your application will eventually be killed.
You will get crash reports from the user
saying that your application has died.
No matter how you do your synchronous
networking, this is inevitable.
And so you have to avoid this.
And the reason why this happens is
because of this guy: this is the watchdog.
The watchdog kills applications that have gone bad.
And one definition of the application going bad
is it not responding to user interface events.
If it doesn't respond to user interface events, the watchdog
gives it a certain period of time and then it shoots it.
Now -- well, I guess it would chew off a limb
or something really, I'm mixing my metaphors.
[ Laughter ]
This is not just a network thing.
If you take your main thread and you start calculating
Mandelbrot sets in your main thread and you do it
for a minute, the watchdog will kill you as well.
But it's especially bad for networking because
of these -- oh, actually before we get there.
Oops, slide misordering here.
When you do get killed by the watchdog,
this is what it looks like.
You get this "ate bad food."
If you look at the numbers there it says, "8badf00d."
[ Laughter ]
[ Applause ]
Don't clap me, I didn't come up with that idea.
[ Laughter ]
Anyway, you can tell the crash reporting people on
Friday because there's a great crash reporting session.
So if you want to know more about crash reports, it's
a fabulous session -- I highly recommend you go to it.
But if you just want a quick summary,
you can get this from the tech note 2151.
And it's an indication of how often we reference
this tech note by the fact that I know the number,
because it has a good summary of
all these watchdog crash reports.
And the watchdog crash report that says
"ate bad food" means that you were killed
because you weren't responding to user interface events.
And the reason why this is an issue is because of
this: The watchdog timeout is roughly about 20 seconds,
that's not an API, that's a current implementation detail.
And all the networking timeouts of
synchronous networking are all longer
than that, in some cases much longer than that.
So if you do a synchronous request and
the network drops out from underneath you,
then you're stuck, waiting for the network to respond.
You can't get out of that; you're just stuck there.
And then at some point, 20 seconds later, the
watchdog comes along and kills your application.
And the answer to this is not to lower the timeouts.
An application that's unresponsive for
5 seconds is still a broken application.
You need the application to be responsive all the time
and that means the main thread can't be doing things
that take long periods of time, and
especially can't be doing things
that take unbounded periods of time like networking.
So we return to this: Synchronous
networking on the main thread is death.
Now there's another "gotcha" here and that
relates to this: Hidden synchronous networking.
There's a whole bunch of utility methods in the
operating system that do networking behind your back,
things like the NSArray initWithContentsOfURL.
If you pass it a filesystem URL, it will read a plist off
the filesystem and that will be fast; it won't be a problem.
But if you pass an http URL, it will go to the network.
And if the network's not working properly, it can't
return because it hasn't got the whole results yet.
It can't error because it's not got
to the point where it's timed out.
It just sits there and waits and the watchdog kills you.
And in addition to that there's the DNS.
Lots of people make these traditional BSD
DNS calls -- gethostbyname and gethostbyaddr.
Again, fully synchronous -- don't
call them ever on the main thread.
And another one is this NSURLConnection method
-- sendSynchronousRequest:returningResponse:error:.
Now that's not really hidden in the sense
that it's got "synchronous" in the name.
But it is one common case where we see people tripping up.
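The hidden-synchronous calls just mentioned look innocent at the call site; here they are together (the URLs are placeholders, and every one of these can block for longer than the watchdog timeout if it has to touch the network):

```objc
#include <netdb.h>

// Hidden networking: fine for file URLs, deadly for http URLs
// on the main thread.
NSArray *list = [[NSArray alloc] initWithContentsOfURL:
    [NSURL URLWithString:@"http://example.com/list.plist"]];

// Synchronous DNS -- never on the main thread.
struct hostent *host = gethostbyname("example.com");

// Synchronous by name, but people still trip over it.
NSURLRequest *request = [NSURLRequest requestWithURL:
    [NSURL URLWithString:@"http://example.com/"]];
NSURLResponse *response = nil;
NSError *error = nil;
NSData *body = [NSURLConnection sendSynchronousRequest:request
                                     returningResponse:&response
                                                 error:&error];
```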
And finally, there's this notion of synthetic synchronous.
And synthetic synchronous is where you call an API
asynchronously and then you wait for the results.
So here's some pseudocode -- this guy.
I get this from folks.
They say, "But I called the API asynchronously."
And it's like, "No, if you're waiting
for the results, it's synchronous."
Now synthetic synchronous is a
good idea in some circumstances.
I'm going to have some examples of where you might
use it later and where the operating system uses it.
But it's not a miracle cure for this equation.
Synchronous networking on the main thread will kill you.
So how do we break this?
How do we break that equation?
And the first thing you might think is, "Well, if
synchronous networking on the main thread is bad,
then I'll just put it on a secondary
thread and that will be good."
And that's rarely the answer.
Sometimes it's the answer in some circumstances
but for typical iPhone OS application,
it's not the answer because threads are evil.
Now that was a contentious statement a few years ago
but now it's fairly well-understood
that threads are reasonably evil.
And I'm going to talk about that
specifically in a slide or two.
But for the moment, you just have to take my word for it.
Now another option is Grand Central Dispatch.
Grand Central Dispatch is all about
asynchronous programming.
It's potentially the best thing coming to
networking ever in terms of programming model.
Unfortunately, today it's not a great choice and this
ties back to something I had in my previous talk,
which is this: here's the iPhone networking stack.
We want people to be working at the Foundation layer.
Now Foundation has been revved to
take advantage of GCD in some places
but the networking part of Foundation hasn't been.
And that means you have a choice: you can
either not use Foundation and use GCD,
or you can use Foundation and not use GCD.
And our recommendation is you stick with
Foundation; it gives you a lot of benefits above
and beyond what you can get from just using
GCD directly, which means that for the moment,
for a typical network programmer, GCD is the future.
One day it will be great, but for
the moment we have to stick
with the third option here, which is run loop programming.
Now I'm going to talk about run loop in a lot
of depth in the next section but first I wanted
to return back to this idea of threads being evil.
Why are threads evil?
The first point for why threads
are evil is this notion of locking.
If you have multiple threads in the same program,
then they may share data and if they share data,
you have to lock the data before you access it.
And then you introduce this notion of accessing the data
outside of the lock, which causes random corruption.
Or if you have more than one lock in
your program, you end up in deadlocks.
And in general, that sort of thing is just
a mess and it's better to try and avoid it.
In addition to that, this is this idea of cancellation.
If you're doing synchronous networking on the main thread
and the network goes away, then you can be blocked.
Or even if the network's just waiting for data, you
can be blocked inside the kernel waiting for the data.
The problem here is if the user hits the cancel button,
how do you get out of the kernel to say you're done?
And it turns out that's very hard to fix.
You can fix it.
You can use inter-thread signals and other crazy
things but the reality is it's very hard to get right.
And it's one of these cases where
making a request asynchronously --
it starts to be a big win because you have an asynchronous
request, then you're just waiting to be called back.
And if you want to cancel, you just invalidate your own
loop sources and release everything and you're done.
It's a big win for asynchronous.
Timeouts are a similar issue.
If you use synchronous programming, you're at the mercy of
the timeouts provided by the underlying API you're using.
But if you use asynchronous programming,
you can time out just by using an NSTimer
and when it fires, cancel the request and you're done.
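A minimal sketch of that timeout-and-cancel pattern (the connection and timeoutTimer properties, and the 30-second interval, are illustrative assumptions):

```objc
// Start the request and arm our own timeout timer alongside it.
- (void)startRequest {
    self.connection = [[NSURLConnection alloc] initWithRequest:self.request
                                                      delegate:self];
    self.timeoutTimer = [NSTimer scheduledTimerWithTimeInterval:30.0
                                                         target:self
                                                       selector:@selector(timeoutFired:)
                                                       userInfo:nil
                                                        repeats:NO];
}

- (void)timeoutFired:(NSTimer *)timer {
    // Cancellation is just a method call when you're asynchronous.
    [self.connection cancel];
    self.connection = nil;
    // ... report the timeout to the client ...
}
```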
Also, bidirectionality -- TCP is inherently bidirectional.
And if you use a protocol that is bidirectional,
it's a big win doing it that way.
You get advantages on the wire but it doesn't
work well with synchronous networking on threads.
If you're reading, waiting for data, you're stuck.
You can't write data as well.
Now, again, there are ways around that.
You can simultaneously use two
different threads to read and write.
But that sort of undermines the whole
benefit of using synchronous programming.
And finally, there's this issue of resource use.
If you have 10 threads, they each have 8K of
stack and they might be blocked in the kernel
and they're consuming 24K of kernel stack.
And on the iPhone, all of that memory is
wired down -- it's consumed permanently.
So you're looking at large amounts of memory that can't
be reused for anything else and they're just stuck
and they're literally doing nothing;
it's no benefit to user at all.
In contrast, if you use asynchronous programming,
that really -- that's not an issue anymore.
You don't have threads sitting there,
consuming memory doing nothing.
So in my experience, threads are evil.
But of course, there's a "but."
There's always a "but," and that is if
you're doing CPU-intensive operations.
Threads are evil for networking but for operations
that need the CPU, they're a really good thing --
they're the best way to get the CPU to run concurrently
with the user interface or the CPU task you're trying to do
to run concurrently with the user interface.
They're also fine for doing I/O operations that are
both fast and reliable, like accessing the disk drive --
at least on iPhone OS the disk
drive is both fast and reliable.
And so it's fine to use threads for those things.
And that raises the question of
how do you mix and match them?
How do you do threads for one thing
and run loops for the other?
And my recommended way of doing that is with NSOperation.
NSOperation is this abstraction for
dealing with asynchronous operations.
It's like start the operation and when
it's done, you hear about it being done.
And it turns out that you can use NSOperation -- what
we call standard NSOperations for CPU-bound tasks
where the NSOperation queue starts a thread for you and
you run on a thread and that's good for CPU-bound tasks.
And then for networking tasks, you can
use what's called a concurrent NSOperation
where the operation queue doesn't start it
on a thread but instead arranges for it to run
and expects it to continue running by itself.
So NSOperation is a really good way to model a mix of
CPU-bound and network-bound operations simultaneously.
It's a really neat technique.
The only issue with NSOperation is that doing the
concurrent NSOperations for networking is a bit tricky.
There's a bit of fiddly things you've
got to do to get it work properly.
So I've been working on a sample that shows how to do
this and it's called the LinkedImageFetcher sample.
It's not quite ready for prime time -- I didn't really have
time to get it properly reviewed before the conference.
But I've put it on the attendee site
so you can go and grab it from there.
And it shows how to do this mix and match of
CPU and network operations in the same program.
And I do intend -- I fully intend -- to get
that made public soon after the conference.
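A much-simplified sketch of the concurrent NSOperation shape (the KVO dance around isExecuting/isFinished is the fiddly part just mentioned; networkRunLoopThread and startOnRunLoopThread are hypothetical helpers, not from the talk):

```objc
@interface QNetworkOperation : NSOperation
@end

@implementation QNetworkOperation {
    BOOL _executing;
    BOOL _finished;
}

// Tells the queue not to spin up a thread for us.
- (BOOL)isConcurrent { return YES; }
- (BOOL)isExecuting  { return _executing; }
- (BOOL)isFinished   { return _finished; }

- (void)start {
    [self willChangeValueForKey:@"isExecuting"];
    _executing = YES;
    [self didChangeValueForKey:@"isExecuting"];
    // Kick off the async networking on a run loop thread;
    // -start itself returns at once.
    [self performSelector:@selector(startOnRunLoopThread)
                 onThread:[[self class] networkRunLoopThread]
               withObject:nil
            waitUntilDone:NO];
}

// Called from a delegate callback when the networking completes.
- (void)finishOperation {
    [self willChangeValueForKey:@"isExecuting"];
    [self willChangeValueForKey:@"isFinished"];
    _executing = NO;
    _finished  = YES;
    [self didChangeValueForKey:@"isFinished"];
    [self didChangeValueForKey:@"isExecuting"];
}

@end
```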
Finally, before we leave the subject of
threads, this is the idea of hidden threads.
NSOperation will start threads behind your back.
Similarly, Grand Central Dispatch will do the same thing.
And also this Cocoa method,
performSelectorInBackground:withObject:,
will also start threads behind your back.
And if you mix and match threads and
run loops, you can fall into one big pitfall
and I'll give a good example of that later in the talk.
But just for the moment, the lesson here
is: Watch out for these hidden threads.
And that wraps up for the basics.
We're going to dive straight into run loops.
And that's definitions to start off with.
There's one run loop per thread, always.
A run loop is an event dispatch mechanism --
it monitors a set of event sources and each
event source has a callback associated with it.
And when the event source fires, it calls the callback.
Now, the run loop has to be explicitly run
by the thread that it's associated with.
So the thread runs the run loop and
while it's running inside the run loop,
it monitors these event sources and
calls the callbacks as they fire.
And if no event sources fire, it
blocks, waiting for one of them to fire.
In general, you must explicitly run your run loops.
But as one special case, the user
interface frameworks like UIKit
on iPhone OS will automatically
run the main thread's run loop.
Now, to look at this graphically, I
have a series of less-cheesy diagrams.
Here's a bunch of threads -- the main
thread and a couple of secondary threads
and each of them has an associated run loop.
The run loop sort of owns that thread.
Now, if we focus on one of these
threads, we can zoom into it.
And here we see the run loop and the run loop
is associated with all of the run loop sources.
Now these run loop sources aren't abstract
notions, they're related to what you've done.
So for example, here on the left, if you start a timer, we
create a timer event source that's attached to the run loop.
And on the right there, we started an
NSURLConnection and it's created a connection source.
So it's attached to the run loop.
So these run loop sources don't come from out of just thin
air, they come because of operations that you've done.
Here's an example of actually scheduling
something on the run loop.
Here's -- we started with a net service, which is
a reference to a service that we found on Bonjour.
We created an input stream for that service.
Now that input stream by itself is not scheduled on
the run loop so we explicitly scheduled on the run loop
with this scheduleInRunLoop:forMode: method.
And you pass in the current run loop
and you pass in the run loop mode.
Now, run loop modes are a source of some confusion and
I'm going to cover those in detail in a few slides.
But for the moment we're just going to ignore
it and choose the default run loop mode.
Then the other thing you do is you set the delegate
and the delegate is effectively the callback.
The real callback is internal to NSInputStream.
But for the moment from your perspective
the callback is the delegate.
And then once the source is set up on
the run loop, you kick off the open.
And then the open proceeds asynchronously in the background.
At some point in the future when the open is
complete, you'll start getting events to your delegate.
And we'll call the
stream:handleEvent: method on your delegate.
And the question is, "Well, what thread is that running on?"
It's running on the thread associated with the run loop that
you passed in when you scheduled the stream on the run loop.
Those two are tightly-bound together and this also
means that in order for this delegate callback
to be called, you have to be running this run loop.
Now, if you're in the main thread, that's
really easy -- the UIKit does it for you.
But if you're in other threads, you have
to go out of your way to make sure it runs.
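Putting that explicit stream scheduling together (the service resolution and delegate body are abbreviated; the event handling shown is just the usual shape of an NSStream delegate):

```objc
// netService is an NSNetService we resolved via Bonjour.
NSInputStream *inputStream = nil;
[netService getInputStream:&inputStream outputStream:NULL];

[inputStream scheduleInRunLoop:[NSRunLoop currentRunLoop]
                       forMode:NSDefaultRunLoopMode];
[inputStream setDelegate:self];
[inputStream open];     // proceeds asynchronously from here

// Later, delivered on the thread whose run loop we scheduled on:
- (void)stream:(NSStream *)aStream handleEvent:(NSStreamEvent)eventCode {
    switch (eventCode) {
        case NSStreamEventOpenCompleted:     /* open finished */        break;
        case NSStreamEventHasBytesAvailable: /* read from the stream */ break;
        case NSStreamEventErrorOccurred:     /* handle the error */     break;
        default:                                                        break;
    }
}
```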
Now this is explicit scheduling where you explicitly tell
the frameworks what run loop you want to schedule on.
In addition to this, you get implicit scheduling.
And here's an example of this, where the frameworks
sort of decide for themselves what to schedule on.
And this is NSURLConnection, it's a utility
method called connectionWithRequest:delegate:.
And that automatically schedules on the
current run loop in the default mode.
And so that's the context that this callback will run in.
Every time we have an implicit scheduling, we almost
always have an equivalent method that's explicit.
So here's the explicit version.
You allocate the connection with the request
and the callback, which is the delegate
and you pass NO to this startImmediately parameter.
So it doesn't start, it doesn't
schedule in the run loop automatically.
Then in the next step, you schedule
it on the run loop that you want
to schedule it on and then you call the start method.
And from then on, at some point in the future, you'll get
the delegate callbacks associated with this operation.
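In code, that explicit version is short (request is whatever NSURLRequest you've built):

```objc
NSURLConnection *connection =
    [[NSURLConnection alloc] initWithRequest:request
                                    delegate:self
                            startImmediately:NO];   // don't schedule yet
[connection scheduleInRunLoop:[NSRunLoop currentRunLoop]
                      forMode:NSDefaultRunLoopMode];
[connection start];
// Delegate callbacks such as -connection:didReceiveData: now
// arrive on the thread whose run loop we scheduled on.
```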
So that's pretty much how you schedule
things on the run loops.
What about these run loop modes?
Whenever you add an event source to a run loop mode -- to
a run loop -- you actually add it in a particular mode.
And whenever a run loop runs, it always runs in a specific mode.
And when it runs in that mode, it only runs -- it only
monitors the event sources associated with that mode.
All the other event sources are ignored.
So this is just the basic facts.
Here it is graphically -- well, not quite yet.
This is where we left off our run loop model.
I'm going to insert the modes in there.
So now we have two run loop modes in this blue layer:
the default run loop mode and a tracking run loop mode.
The default mode has all the event
sources associated with it,
and the tracking mode only has a subset.
So when you run the run loop in the default
mode, we monitor all of these event sources.
In contrast, when you run the run loop in this tracking
mode, we only monitor a subset of the event sources.
In this case, we ignore the timer.
So if the timer fires, the callback
for that timer won't be called.
And that's useful in a variety of circumstances.
But the real question is: why do we
have this whole run loop mode mess?
And it's associated with recursion.
Sometimes you're in a run loop callback
and you want to run the run loop again.
You might want to call an API using this synthetic
synchronous model; sometimes it's important to do so.
And an example of doing that is the user
interface tracking that's done by UIKit.
And I'll talk about in a little more depth in a few slides.
But I just want to give an example
of this synthetic synchronous model.
You're in a run loop callback and you want
to run an async API, synthetic synchronous.
So the way you do that is you set up the async
call and you schedule it in a custom run loop mode.
Run loop modes are just strings;
you can pull them out of thin air.
And generally we recommend that you use reverse DNS notation
just to keep away from other people's run loop modes.
So you schedule your event source
in this custom run loop mode
and then you run the run loop in that custom run loop mode.
And what that means is only your event source will run.
All other event sources are held off until
the run loop returns to running in the other modes.
So here's an example of this.
Here's where we left our run loop
off with two run loop modes.
What we do is we create a custom mode and
we run the run loop in that custom mode.
And we add our file descriptor -- we've
created a CFFileDescriptor just as an example --
and we add its event source
to the run loop in that custom mode.
And when we run the run loop in that custom mode,
only that event source is looked at;
all other event sources are ignored.
It's really useful.
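Here's a sketch of that synthetic synchronous technique with an NSURLConnection (the done and result properties are hypothetical flags that the delegate callbacks would set; as the next point says, doing this on the main thread is asking for trouble):

```objc
// A custom mode, named reverse-DNS style so it can't collide.
static NSString * kSyncMode = @"com.example.myapp.syncMode";

- (NSData *)fetchSynchronously:(NSURLRequest *)request {
    NSURLConnection *conn =
        [[NSURLConnection alloc] initWithRequest:request
                                        delegate:self
                                startImmediately:NO];
    [conn scheduleInRunLoop:[NSRunLoop currentRunLoop] forMode:kSyncMode];
    [conn start];
    // Run the run loop in our private mode; only our connection's
    // event source fires, everything else is held off.
    while ( ! self.done ) {
        [[NSRunLoop currentRunLoop] runMode:kSyncMode
                                 beforeDate:[NSDate distantFuture]];
    }
    return self.result;
}
```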
In this case, if you're doing it on the main thread,
this might be a bad idea because it's effectively synchronous:
if you block forever, you'll be killed by the watchdog.
If you're doing it on a secondary
thread, it's perfectly reasonable.
You can even do it on the main thread if you can limit
the amount of time that you'll spend -- if there's
some upper bound to the amount of time you'll spend
running the run loop in the custom mode.
Now example of where this is used in
practice is user interface tracking.
If you have a scroll view on screen and the user
taps down on the scroll view and drags up and down,
the UIScrollView class wants to run the run
loop in order to track the user's finger.
And it doesn't want to return to the main run
loop in order to do that, to the top level,
so it uses a form of synthetic synchronous.
It gets all of the run loop sources -- event sources
-- that are associated with tracking touches,
such as the input event sources and the compositing
sources required to composite out to the screen
so you can see things and it adds those to a custom
run loop mode, which is UITrackingRunLoopMode.
And then it runs the run loop in that mode.
And so all of the event sources required
for tracking run, and other event sources --
such as maybe your openURL event sources or
ones related to push notifications -- don't run.
This is a hugely important technique.
Now, it's a "gotcha" for you guys because if you take
an object and you schedule it in the default mode,
then the run loop isn't running in that mode at this point.
So you might have created an NSURLConnection and
it's receiving data, the user puts their finger
down on the scroll view, and it stops receiving
data because its event source isn't being monitored.
And you might work around this
by scheduling the event source,
not only in the default mode but
also in the UITrackingRunLoopMode.
But there's actually a better solution to that, and
that is to schedule using this common modes concept.
The common modes are a meta-mode.
You can't run the run loop in the common modes but
you can schedule event sources in the common mode.
And when you do so, the run loop automatically
schedules those event sources in all the likely places
that you'll need to be run, which
are these modes called common modes.
Now on iPhone OS, the common modes consist of the
default run loop mode and the UI tracking run loop mode.
But that could be extended.
So for example in Mac OS X, there's a run
loop mode for tracking across the menu bar
when the user puts the mouse down in the menu bar.
And so the key thing about using the
common modes is that you run in all
of these modes where you're likely to need to run.
And it's a good abstraction layer
for getting your code running,
even though the user's interacting with the user interface.
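In practice, scheduling in the common modes is a one-line change from the earlier examples (the timer callback name is illustrative):

```objc
// Create the timer without scheduling it, then add it in the common
// modes so it keeps firing even while UIKit is tracking a touch.
NSTimer *timer = [NSTimer timerWithTimeInterval:1.0
                                         target:self
                                       selector:@selector(timerFired:)
                                       userInfo:nil
                                        repeats:YES];
[[NSRunLoop currentRunLoop] addTimer:timer forMode:NSRunLoopCommonModes];

// The same idea for a connection created with startImmediately:NO.
[connection scheduleInRunLoop:[NSRunLoop currentRunLoop]
                      forMode:NSRunLoopCommonModes];
```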
The "gotcha" with using the common modes is that
if the run loop is running in the default mode,
then you can do all sorts of things
-- you can do pretty much anything.
If you get a network error and
you put up an alert, that's fine.
But if the user is tracking their finger across the scroll
view and you're running on the UITrackingRunLoopMode,
then if you get a network error -- because you're
running now because you scheduled in the common modes --
if you get a network error and you put up an
error alert, that's going to do bad things.
The user's going to be hopelessly confused.
It may -- it probably won't crash the
frameworks but it's not going to look good.
So if you use this common mode concept, make sure
you understand the context you're running in.
And for example, you can use other mechanisms like a
short timer that's scheduled only in the default mode
to defer these sorts of user interface operations.
As you're using run loops, keep in mind the following:
There's never any need to create or destroy run loops.
Run loops are created on demand per thread and
they're destroyed when the thread's destroyed,
so you don't need to mess with the run loop itself.
In contrast, run loop sources -- it's
vitally important that you invalidate them.
If you think about those previous diagrams, they are a mass
of pointers, with one object pointing to the next object,
which is pointing back to the other object, and so on.
And so it produces a massive number of
retain loops between all these objects.
And if you fail to invalidate your run loop sources,
then what happens is those retain loops
are never broken and you just leak memory.
Whenever you schedule an event source in a run loop,
you typically sort of have an owning
object, which sort of owns that scheduling.
And it's vitally important that you invalidate the run
loop source before you release your last reference
to it, otherwise you'll just leak.
And in some cases -- I had a developer today
who was leaking sockets because he was failing
to invalidate his socket run loop sources.
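A sketch of tearing things down properly before dropping the last references (the property names are illustrative; the point is that every scheduled source gets invalidated):

```objc
// Break the retain loops before dropping the last reference.
- (void)stopWithError:(NSError *)error {
    [self.timeoutTimer invalidate];     // timer source goes away
    self.timeoutTimer = nil;

    [self.inputStream setDelegate:nil];
    [self.inputStream removeFromRunLoop:[NSRunLoop currentRunLoop]
                                forMode:NSDefaultRunLoopMode];
    [self.inputStream close];           // tears down the stream's source
    self.inputStream = nil;
}
```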
Try to avoid cross-thread scheduling,
where you're running on thread A and you're trying
to schedule an event source on thread B's run loop.
In general, it's meant to work.
It works at least 99% of the time.
But sometimes it just blows up.
But worse than that, you know, we can fix those
bugs; we know about them, we are fixing them.
But the real issue here is that inside your
own code you can get into these race conditions
where the run loop sources are or aren't
scheduled and it just gets very confusing.
So always try and schedule on the current thread's run loop.
And if you need to, use performSelector:onThread:withObject:waitUntilDone:
to get over to the thread that you want to be on
and then schedule on the current run loop from there.
Don't run the run loop recursively in
the default mode on the main thread.
The UI frameworks have run loop sources that are only
meant to run in the default mode at the top level
of the framework where you're nearest to main.
If you run the main thread's run loop recursively in the default
mode, those sources will fire in the wrong context
and bad things will happen on both iPhone OS and Mac OS X.
Run loops are a serialization mechanism.
This is vitally useful in most cases.
If you think about a run loop, it monitors
event sources and then calls the callback.
And when the callback returns, it returns
to monitoring the next event source.
So these callbacks are inherently serialized, which
makes your network programming very much easier --
it radically reduces the amount of
race conditions you have to deal with.
But the issue is, of course, that this
serialization can give you latency.
If your main thread is off doing user interface
compositing somewhere or calculating Mandelbrot sets
or whatever it's doing, then while it's doing that,
your network event sources aren't firing
because it's in a run loop callback.
And so you really want to either keep the main
thread doing very short operations, i.e.,
always returning to the run loop quickly, or in some cases
it's a good idea to create a single, secondary thread
and put all of your network event sources on that thread.
And so they will never be held off
due to latency on the main thread.
And finally, there's this problem with hidden threads.
I'm going to go into that in a little more detail.
Here you see me doing performSelectorInBackground
to call the doStuff method.
Now when the doStuff method runs, it's running
on a secondary thread -- that's the whole point.
It does its stuff and then when it's
finished, it wants to schedule a timer
to continue doing more stuff in about a second from now.
Now the thing here is that that doMoreStuff method
that it's trying to call can never possibly execute.
And the reason is that the scheduledTimerWithTimeInterval:
method always targets the current run loop,
which is the run loop associated with the
current thread -- a secondary thread, because we ran
doStuff using performSelectorInBackground:withObject:.
Now when that secondary thread is created by
performSelectorInBackground:withObject:, it's created, it calls doStuff,
and when doStuff returns, it's destroyed.
And so any event source that you schedule on it will never
run because the secondary thread never runs the run loop.
It's a real "gotcha" that confuses a lot
of people, so watch out for this one.
And this is why hidden threads are a danger if
you're mixing and matching threads and run loops.
And that wraps it up for run loops.
It's been a long haul.
Glass is almost full to the fifth stage, which will be good.
The last thing I wanted to deal with is state management.
And this is the idea of how you connect the states of the
front end of your application -- the user visible states --
to the states of the back end of your application,
the states associated with the networking.
Doing operations on the network typically requires
lots of states, as you get the data in pieces
and then deal with the results and so on.
But the user interface hopefully has
very simple states because you don't need
to display a lot of state information to the user.
Now, a really easy example of this
is this placeholder mechanism.
The only piece of state shared
between the user interface code
and the model objects is whether
the real image has been fetched yet.
That's one piece of state: Are we busy or what's the image?
In fact, in many cases, if you follow the advice from my
previous talk, you don't really need to share state here
at all; all you need to do is get
the object -- which is the image --
and listen for notifications for changes to that object.
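As a rough sketch of that idea -- in Python rather than Objective-C, and with made-up names -- the model object owns the image and notifies its observers when it changes, so the only thing the UI ever asks is "has the real image arrived yet?":

```python
class PlaceholderImage:
    """Model object that starts out holding a placeholder and notifies
    observers when the real image arrives. The UI shares no state with
    the networking beyond 'has the real image been fetched yet?'."""

    def __init__(self, placeholder):
        self.image = placeholder       # the placeholder, for now
        self._observers = []

    def add_observer(self, callback):
        """UI code registers to hear about changes to the object."""
        self._observers.append(callback)

    def image_did_arrive(self, image):
        # Called by the networking code when the download finishes;
        # the UI learns about it via the notification, not by polling.
        self.image = image
        for callback in self._observers:
            callback(self)
```

On iPhone OS the equivalent would be key-value observing or an NSNotification posted by the model; the mechanism is hypothetical here, the shape is the point.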
In contrast, solicited operations -- ones that the
user is expecting, specifically requested,
and wants progress on -- involve a little more state.
Obviously you need to know whether the back end is busy --
here's an example in Safari where if the back end is busy,
we get a Stop button rather than a Refresh button.
And similarly, if the back end is busy, you get
a progress bar, so that's two pieces of state.
And if the back end fails, if the
model objects can't fetch the data,
then you get another piece of state, which is the error.
And in addition to that, there's one piece of
control flow that goes down, which is the cancel.
But the reality is the state sharing is really small here.
Safari is doing a lot of weird
things to get data off the network.
It's going through probably hundreds of network states to
get the primary URL's data, get all these images, and so on.
But the front end is seeing one state -- you know,
effectively one state -- which is: Are we busy or not?
And so it's relatively easy to map your
back end states to your front end states in a way
that the user can understand and that's reasonably
easy to wrangle in terms of your user interface.
Typically it involves: Are we busy?
What's the progress?
And was there an error?
So the take-home message here is: Asynchronous
programming means you will have to do state management
in your user interface; it's really that simple.
But it's not as hard as you might think.
It doesn't require a huge reworking of your user interface.
In most cases, it requires a very simple rework.
And the real trick is to hide all of the irrelevant states
down in the model so that the model goes through a bunch
of complicated states and the front end only says
"busy" or "not busy," "error" or "not error."
And then, of course, for the front end
to know about changes to the state,
you have to have some sort of model notification.
And that is something that I talked about in Part 1 of this
talk and something that's too complicated to recap here.
So if you missed Part 1, you might want
to go back and look at it on the video.
So that wraps up state management and it fills the first
part of the glass -- we're happy about that I think.
Well done.
[ Applause ]
>> And I'm running really fast,
which is probably a good thing.
Okay. Next point: debugging.
Network debugging is traditionally hard.
This is the guy with the network debugging program.
Now why is it hard?
It's hard because the network is asynchronous and
it's hard because network behavior is dependent
on environmental factors that you can't control, such
as how many MiFi base stations there are out there.
And so if you get these problems coming in
from the field, it's very hard to debug them.
And similarly, you may encounter this problem that
only happens once in every 10,000 executions and
yet it still crashes on a number of users and annoys them.
My first tip here.
[ Laughter ]
Now, you're laughing because you think
it's a joke but it's not really a joke.
There are lots of things you can do to
minimize the bug count in your program.
The number one most important thing is design.
If I'm writing a user interface application, I'll often sit
down and write a line of code, put up a table view, run it,
change the code, run it again, see whether it
works, get a bug, fix the bug, and so on --
sort of this incremental implementation approach.
That works fine for user interface code where
everything tends to run in a deterministic fashion;
it's a real disaster for networking code.
The networking code you have to plan in
advance, you have to understand the states
that the network can be in, and how those states change.
And you have to understand how
those states affect your model
and how those states affect the
front end of your application.
Plan that out in advance so that you're not
building it from scratch and keep changing it
because if you change it, that will introduce bugs.
So try and design it in advance and stick with it.
And if you have to change it, think
very carefully about how you change it.
My other tips are a little bit more prosaic.
You know, I get projects from developers
and they're like, "This crashes."
And you go, "Okay."
And you build it and it's got compiler
warnings and it's like well, that's step one.
And then you run a static analyzer
and it's got static analyzer warnings.
It's like, life is too short to debug network
problems and these other silly problems at the same time.
Get the silly problems out of the way first:
that means fixing compiler warnings,
that means running the static analyzer,
and that means adding asserts to your code.
Asserts are really useful for tracking down bizarre
networking problems because if things happen
in the wrong order and the program's not executing
the way you expect, your assert fires and instead
of corrupting some vitally important
state, you end up straight in the debugger.
So that's a huge win for network programmers.
And similarly, memory management warnings.
Memory management errors are very
nondeterministic as well, just like network errors.
So don't try and debug your memory management
problems and your network problems at the same time;
use zombies to flush out the memory
management problems first.
In terms of real network debugging, the
first point I want to talk about is logging.
Now logging has a bad reputation in debugging circles.
People disparagingly refer to it as "printf debugging."
And it's considered sort of a 1970s technology.
I love logging.
Logging is the first thing I add to
my network programs and the reason is
because in my opinion, logging is like a TARDIS.
Okay, if you're not -- okay, we have some Dr.
Who fans out here.
[ Applause ]
This TARDIS is about five minutes' walk away
from my flat in Glasgow; it's pretty cool.
So if you're not a Dr.
Who fan, a TARDIS is a vehicle that can travel anywhere
in time and space and that's what logging does for you.
These network bugs are nondeterministic
and sometimes they happen in real time.
If you stop in the debugger, then
the service stops sending you data
and it eventually times out and gives up on the connection.
And so you can't just stop in the debugger.
But if you've got good logging in there, you can play the
thing out in real time or wait for the error to happen.
And then when it does, go back through the log
and replay time at a speed that you can understand.
And that's a critical piece of network debugging technology.
And as I say, it gets a bad rep but it's good stuff.
In addition, logging lets you travel in space.
You've got some user in Uzbekistan whose network always
reproduces this problem but you can't reproduce it ever.
You've got an app reviewer that always reproduces
this problem and you can't reproduce it ever.
What do you do?
You can't go to Uzbekistan or indeed, the
app review offices to debug the problem.
So what do you do?
And the answer is you have that user
turn on the logging and send you the log.
So don't skimp on the logging
when you're writing a network program;
it's a critical part of making a network
program that can be debugged in the real world.
So when you do it -- many people
disable the logging when they're done.
You know, in the release build they
remove the logging so it can't be enabled
because they think it will slow
down their program or something.
My experience is you want to leave it in there -- leave
it in there, leave it disabled, and provide a way
for the user to turn it on so that
they can get you reports from the field.
Try to make the logging persistent.
If the application crashes, having all the logging
information in memory is not going to help you; it's gone.
And similarly, if the user has a problem and then quits
your application, launches mail, say, "I've got a problem."
And you say, "Well, what did the log say?"
That's not helping really, is it?
Because it's gone.
Also make it easy to retrieve.
I really like the in-app email feature of iPhone OS 3
because you can automatically create
an email and attach the log to it.
It makes it very easy for the user to get the log to you.
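Here's a rough sketch of that kind of logger in Python -- the names are made up, but it shows the three properties just described: shipped but disabled by default, flushed to disk on every line so a crash doesn't eat it, and easy to retrieve so the user can send it to you:

```python
import os

class FieldLogger:
    """Logging that ships in the release build: disabled by default,
    user-togglable, and persistent so it survives a crash or quit."""

    def __init__(self, path, enabled=False):
        self.path = path
        self.enabled = enabled   # off unless the user turns it on

    def log(self, message):
        if not self.enabled:
            return
        # Open, append, close on every line: the log is on disk the
        # moment it's written, so a crash doesn't lose it.
        with open(self.path, "a") as f:
            f.write(message + "\n")

    def retrieve(self):
        """Everything logged so far, ready to attach to an email."""
        if not os.path.exists(self.path):
            return ""
        with open(self.path) as f:
            return f.read()
```

Reopening the file per line is deliberately naive -- in a real app you'd buffer and flush -- but it makes the persistence guarantee obvious.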
Packet traces are a way to log what's going on on the wire;
it's another form of logging but it's a very specialist form
of logging -- it allows you to see all
the packets traveling over the network.
Packet traces are another critical
network debugging technology.
They allow you to do divide and conquer.
You have a problem with a server and a
client and you can never really tell:
is it the client sending the wrong request
or the server responding incorrectly?
Well, what do you do?
You run a packet trace, see what went
over the wire, and then you know for sure.
Similarly, you can use packet traces for comparison.
If client A works and client B doesn't, you packet
trace both of them and see what's the difference
between the requests that were sent and then fix that.
It's also great for verification.
Every now and again you run a packet trace over
your program and see what it's sending on the wire.
I like to think of it as leaks for the network.
Every now and again, you're sending
things you weren't expecting.
You might have typed "http" instead of "https" by accident
and now you're sending all the user's
confidential information in plain text.
And the only people who are going to find out -- if
you're lucky -- are app review or the black hats.
Similarly, you might have a game update timer.
And so you're sending your game state every
tenth of a second and you forget to invalidate it
and now you're sending your game state twice,
every 20th of a second, and that's bad.
So look at the packets on the wire with the packet
analyzer and see what's going on, just every now and again.
Also, if you do this, make sure you have some
feature in your debug build to turn off your
on-the-wire privacy features -- typically TLS --
because the packet trace works a lot better
if you can actually see inside the packets.
Now there's a Q&A -- packet traces aren't
available on iPhone OS, but if you have a Mac handy,
you can take a packet trace with
the tools referenced in this Q&A.
And finally, there's the simulator.
The simulator is a great tool, especially since iPhone
OS 3 for doing network debugging -- surprisingly so.
The simulator and the iPhone behave quite similarly if
you're using typical application debugging technologies.
So it's fine to do a lot of network
debugging in the simulator
and if you do that, you get access to Mac OS X tools.
And my favorite tool of all time is DTrace, it's
like super-logging that you don't have to write.
Of course, if you're running on the simulator,
you're running on the Mac OS X kernel
so nothing really behaves exactly the same as an iPhone.
So you have to do your real testing on an iPhone.
And that's my talk on debugging and network debugging.
My next step is the common mistakes,
notably putting foam at the top
of an empty beer glass apparently is a common
mistake, too -- I have to fix that one.
Number one common mistake: main thread synchronous.
We see it all the time.
Some significant proportion of app review rejections
have occurred because people turn on airplane mode
on the iPhone, they launch the app, it crashes.
And we see millions of these -- probably not millions --
but probably hundreds of thousands of these come
in every day through the crash reporting mechanism.
Just don't do it.
Number two: threads.
As I like to say, "Networking is hard
enough without involving threads as well."
I see applications like this all the time.
They think, "Let's do something asynchronously."
Of course, asynchronously means "threads" and they have
these threads running through all of their model objects,
all through their view controllers, all through their views.
And at some point, somewhere in that code, they call
UIKit on one of those threads and it doesn't blow up most
of the time but every now and again it just goes boom.
And they go, "How do I fix that?"
And the answer is: get in your TARDIS and go back 6 months
and design your applications properly
and then it won't happen.
[ Laughter ]
So be very careful with the threads.
It's not the threads that are evil -- well, they are kind
of evil -- it's not that threads don't have their place,
it's just their place is deep in
your model and very well-contained.
The interface life cycle -- this can get
very confusing so I've put up a chart.
There are a variety of interfaces on
iPhone OS: Bluetooth, cellular, Wi-Fi.
They come and go on different life cycles.
Bluetooth is easy -- when you resolve a Bonjour
service, the Bluetooth interface comes up.
We talked about that earlier today.
And so if you're confused by it, take a
look at the video for the Bonjour session.
Bluetooth interfaces go down on idle.
They don't go down when you disconnect; they go
down when the interface hasn't had
many packets over it for a few minutes.
So a few minutes after you stop talking on
the interface, that's when it goes down.
The cellular interface often will be
pinned up by various services, for example,
push notification will typically pin
the cellular interface up all the time.
If the cellular interface isn't being pinned up
by one of those services, then its life cycle
is controlled by the APIs you call.
It comes up when you connect to a network service --
and that assumes you're using CFSocketStream
or higher, which means CFNetwork or Foundation.
If you use BSD sockets, the cellular interface
won't automatically come up based on connect.
One day we'll fix that but today is not that day.
The cellular interface goes down based on idleness.
So again, if it's going to go down, it will go
down about two minutes after you stop using it --
I think it's two minutes, some short
number of minutes after you stop using it.
The Wi-Fi is substantially more complex, so another slide.
The Wi-Fi comes up based on a number of criteria.
The first one is if it sees a network that it's
seen before -- it's one of your known networks --
then the Wi-Fi will come up automatically.
In addition to that, there are these two
controls: there's a user-level control,
which is Ask to Join Networks in Settings.
And if that's on and there's a Wi-Fi app in the foreground,
then the Wi-Fi will put up the Wi-Fi chooser dialogue
and let the user choose a Wi-Fi network and login to it.
And a Wi-Fi app is one that has
UIRequiresPersistentWiFi set in its Info.plist.
Wi-Fi going down is also somewhat complex.
Wi-Fi goes down 30 minutes after the
last Wi-Fi app has left the foreground.
Now the definition of "foreground"
is quite complicated there.
If you ScreenLock your device, then that
pushes your Wi-Fi app to the background --
not the background in the multitasking
sense, but it makes it inactive.
And that means that the ScreenLock
is the front-most application.
If you think like a Mac, it actually makes more sense here.
So the ScreenLock becomes the front-most
application, which means --
and the ScreenLock doesn't have
UIRequiresPersistentWiFi set.
So if you ScreenLock, then your
UIRequiresPersistentWiFi setting is no longer relevant
and the Wi-Fi will go down in about 30 minutes after that.
But it's worse than that because if you're running
on batteries, then when you ScreenLock the device,
unless something else is keeping the
CPU awake, the CPU will shut down.
So it will go to sleep like a Mac would go to sleep.
And when that happens, then the Wi-Fi goes down as well.
The whole story is also complicated
by this notion of captive networks.
Now -- I'm not quite sure if I talked
about captive networks; I know Brett did,
and I think Stritt did as well.
Captive networks is a complicated topic and you'll have
to catch the information on those in the previous talks
and last year's iPhone networking talk
also talks a lot about captive networks.
Reachability.
Now, you never thought that I would complain about
this but it's an API that people use too much.
Don't use Reachability to pre-flight your connections --
we've said that in all three sessions this week and it's true.
Reachability is about user interface, it's
about telling the user what's gone wrong;
it's not about telling you whether you can connect or not.
So if you want to connect, connect -- it's just that simple.
If it fails, you can then use Reachability
to get some idea of why it failed
and maybe guide the user as to how to fix it.
Or you could start a Reachability operation in parallel with
your connection operation so that you minimize the amount
of time that you displayed imprecise state to the user.
But don't use Reachability before
you connect, just try and connect.
Similarly, don't use Reachability to determine
the interface type for the sake of speed.
We're going to talk about that in the next slides.
What you want to use Reachability for is to guide your user
interface, provide useful feedback to the user to say, "No,
this is never going to work; you
need to fix your networking."
It's also a good idea to trigger retries.
If you've got a whole bunch of queued operations and they've
all failed and you've retried them once and they don't seem
to be working, then you can queue them
all up behind a Reachability change.
And when the Reachability changes, then
retry some of them and see whether they work.
There's no point retrying them until Reachability
changes, because chances are the reason for the failure
was a problem with the local network state.
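A sketch of that retry-behind-a-Reachability-change idea, in Python with hypothetical names -- failed operations get parked, and the reachability-changed callback is what releases them:

```python
class RetryQueue:
    """Operations that failed get parked until reachability changes;
    retrying before then is pointless, because the local network
    state that caused the failure probably hasn't changed."""

    def __init__(self):
        self._parked = []

    def operation_failed(self, operation):
        """Park a failed operation (a callable) for later retry."""
        self._parked.append(operation)

    def reachability_changed(self):
        # Wired up to the reachability-change notification in a real
        # app; here it just releases and runs the parked operations.
        to_retry, self._parked = self._parked, []
        for operation in to_retry:
            operation()
```

On iPhone OS the trigger would be an SCNetworkReachability callback; the queue itself is the illustrative part.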
Another example of Reachability where it's
useful is interface-specific connectivity.
I see developers who have licensed content that
can only be displayed when the application is running
on a particular interface type -- movie content and so on.
And that's a good reason to use
Reachability to determine the interface type.
A bad reason is to estimate the speed.
In terms of using Reachability, use it asynchronously.
It's like any other API: if you use it
synchronously -- on the main thread at least --
you'll be killed by the watchdog.
There's a Reachability sample.
Make sure you get version 2.
If you got a previous version of the sample,
make sure you update to version 2,
because the previous version was -- how best to put this? --
"severely suboptimal," shall we say.
Interface type.
And this is the interface type question:
What type of interface am I on?
And what type of cellular am I on?
Which really means: What speed is this network?
This is a fundamentally broken question.
The network speed is independent of the
link layer speed -- it's really that simple.
If you're talking across a MiFi, which is going to
be my example for this talk, your backhaul is 3G.
So if you're talking to anything on the wider Internet,
knowing that you're connected by Wi-Fi isn't going to help.
If you're talking on cellular -- on 3G -- in this room,
you're probably going slower than EDGE,
because every other device in this room is talking on 3G.
Whereas very few people in this room still have
first generation iPhones, so very few people will be using EDGE.
So if you need to know the speed of the
network, measure the speed of the network --
get a small file, download it, see how fast it goes.
And also adapt to changes because
the speed can change over time.
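As a sketch of "measure, don't guess," in Python with a stand-in for the actual transfer -- the `download_small_file` callable is an assumption for the example, standing in for fetching a known small file over the real connection:

```python
import time

def measure_throughput(download_small_file):
    """Estimate the network speed empirically: time a small, real
    transfer and divide, instead of guessing from the interface
    type (which says nothing about the backhaul)."""
    start = time.monotonic()
    data = download_small_file()            # e.g. fetch a ~50 KB file
    elapsed = time.monotonic() - start
    return len(data) / max(elapsed, 1e-9)   # bytes per second
```

And because the speed changes over time, you'd re-run this periodically and adapt, rather than measuring once at launch.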
So just be careful with this one.
Don't ask what sort of interface you're on
purely so you can get an estimate of the speed.
Timeouts -- another thing we see in common
questions, and it's generally one of those flawed questions.
The network timeouts are set to the default values for a
reason, specifically on cellular it can often take tens
of seconds for the cellular network to come up, which means
that if you lower the network timeout to a few seconds,
you may have started connecting to the network,
caused the cellular interface to start coming up,
and then timed out before it finished coming up, which
means you're wasting power and not getting connected.
So make sure -- well, as a rule you want to
leave the timeouts set to their default values.
Now, if you're worried about what
impact that has on your user interface,
then solve that problem at the user interface level.
If it's a solicited operation -- something that the user
specifically requested -- then put up a progress dialogue,
put up a cancellation button, keep retrying until it works.
And if the user can then walk in range
of their base station, it will all work.
The last thing you really want -- and I
see this all the time on my Mac --
is where I open the lid and then
try to connect with my VPN.
And my VPN software says "connection
failed" because my Wi-Fi hasn't yet bound --
it took a second or two for my Wi-Fi to bind to the network.
That's really annoying.
If the VPN software had just kept retrying,
then when the Wi-Fi bound, it would have worked.
Or if the Wi-Fi wasn't working, I would have walked
in range of my base station and then it would work.
And then it would automatically connect.
So don't just timeout for user interface
operations; let the user cancel and just keep trying.
For unsolicited operations where there's no real
user interface, then yes, you will need timeouts
but just use the default timeouts -- that's
fine, the user isn't waiting for them.
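A sketch of that keep-retrying-until-cancel pattern, in Python with invented names -- note there's no timeout of our own; the sketch assumes the system's default timeouts live inside `connect`, and we only stop when it works or the user cancels:

```python
def keep_trying(connect, cancelled):
    """For a solicited operation: leave the system timeouts alone
    and keep retrying until the connect succeeds or the user hits
    Cancel. `connect` and `cancelled` are hypothetical callables."""
    while not cancelled():
        try:
            return connect()        # system default timeout applies here
        except ConnectionError:
            pass                    # e.g. Wi-Fi not bound yet; try again
    return None                     # user cancelled
```

So when the Wi-Fi finally binds, or the user walks into range of their base station, the very next retry succeeds -- no spurious "connection failed" dialog needed.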
That wraps up my common mistakes, which wraps
up my talk as a whole: two hours of networking.
Sorry, guys.
I need a beer, I don't know about you.
But first, a summary.
Networking is hard.
We can't fix this at the API level.
It's a fundamental issue of networking and the best
way to make it easy is to design your project properly.
Good architecture is how to make it easy.
In addition to that, what I talked about
in this talk is asynchronous programming,
how using run loops is really the key to solving
the problem of doing the networking without blocking the main thread,
and how you can use NSOperation to structure your
high-level application into asynchronous operations
so that the network details don't
leak out into your application.
Plan for debugging -- adding logging
once you've got a bug is painful.
You really want to add the logging in advance, then deploy
to the users, then get the logs -- not the other way around.
And try to avoid the common mistakes.
I have lots more common mistakes, I could have kept
going all day but those were the real tough ones.
So for the moment that's it.
I'm Quinn "The Eskimo!"
That's my email address.
Paul Danbold is doing our network evangelism at the moment.
There's tons of documentation on our website.
Apple developer forums, the Core
OS section is where I hang out.
So feel free to come and ask us a question.
You're more likely to get an answer on dev forums
than you are if you send me a personal email
because if I answer you on dev
forums, everyone sees the answer.
Sample code -- use the iPhone sample code but
don't be afraid to use the Mac OS X sample code.
Mac OS X and iPhone -- the networking is architecturally
very similar and most samples work on both.
In addition, WWDC attendees, as a special one-time offer,
can get the Linked Image Fetcher sample, which shows how
to use NSOperation to do asynchronous
operations and use the networking.
And last but not least -- well, actually
not quite last -- related sessions.
Obviously my first session has already passed, so
you'll have to catch that on video if you missed it.
All the other sessions have passed
except for Understanding Crash Reports.
It's a really good session, I highly recommend it.