Transcript
>>David Chan: Good afternoon.
My name is David Chan.
I'm on the iOS Performance Team and I'll be joined later by
my colleague Peter Handel, who's on the iOS Power Team.
So today, we're going to talk about Advanced
Performance Optimization on iPhone OS.
And in particular this is Part 1 of two.
We're going to be talking about animations,
scrolling, responsiveness, and battery life today.
"The iPad is a far slower machine than a modern
MacBook in terms of raw hardware performance,
but feels faster in many ways, because
you never have to wait for it."
This is a great quote.
Basically, what this is saying
is this is why you're here today.
People love using the iPad, the iPhone and
iPod Touch because it's a magical experience.
And a large part of that experience
is you never have to wait for it.
So, great performance is all about
creating an outstanding user experience.
So today, we're going to be covering the most
advanced topics, for our most advanced developers.
We assume that you've already
written an iPhone application,
you've played with all the different aspects
of it, and you've seen the challenges.
Today, we're going to be covering animation and
scrolling, responsiveness, and battery life.
Tomorrow, we're going to be covering
memory, databases, and I/O.
Be sure to show up for that one as well.
So, across both of these talks, we're
going to be trying to give you a framework
to solve your application performance challenges.
Now what do we mean by that?
First, we want you to learn as
much about the system as possible.
You're going to use that knowledge as
a mental model so that when you run
into performance issues you can
think creatively about solving them.
And finally, we want you to really measure progress.
We built great tools into the iPhone SDK and we want you
to use them to see exactly what's happening on your system.
Don't guess; be able to see that the changes
that you're making are making real progress.
So let's jump in.
Animation and scrolling is first up, so let's begin.
So what we're going to be covering today is we're
going to go behind the scenes of the animation.
Now you're all familiar with how to create an animation.
We're going to see what happens
after you commit that animation.
We're going to go into how to keep your
animations responsive, how to keep them smooth
and how to keep all these nice scroll views
in your system very, very nice and smooth.
So, here's a timeline diagram of a typical animation.
Now it has three stages and we'll
walk through them one by one.
First, you create your animation.
Now this is pretty simple.
You've probably seen this before.
You use UIView to create animations.
You change some part of your view hierarchy.
You maybe change some properties.
And then you commit it.
So that's step two.
Now this is where the system calls your
layoutSubviews and drawRect methods.
Now this is when the system is ready to take
your animation, ship it over to the render server,
and have that animation show up for the user.
And that's the third process.
Every single frame is rendered by the render
server for the length of your animation.
So let's start with step one.
So, this should be pretty simple code
for you guys, pretty familiar stuff.
We're creating a view hierarchy
starting with that inside view.
We're putting a little scale on to that.
So that starts out very small.
We're going to begin the animation and add that
new view hierarchy to our existing view hierarchy
and then bring the transform up so it will become full size.
So let's see what that looks like, great.
So, this is what we just saw.
We saw the existing view hierarchy there
and then we add a new part to that.
So that's the card with image and that great label.
So as you can see, they're not quite filled in yet.
The next thing that happens in stage two is
that the animation is prepared for commit
by calling layoutSubviews and drawRect
on each of those new views.
So they just got filled in and now they're ready for commit.
So, we have the new view hierarchy on the left here and what
the render server thinks is going to be displayed on screen.
And a major part of commit is that this gets sent over.
So now these two sync up and the transaction
of the animation is committed to the render server
like you would commit a transaction to a database.
So now we're on to stage three.
Now for every single frame of your animation, for
the length of the animation, the render
server goes through these four steps.
It takes the current time and looks over
the tree that it has for your application
and sees what animations it needs
to update for the current time.
So in our example, we have that scale starting
very, very small and coming up to full size.
Now, every 1/60th of a second, that
scale gets interpolated to a new value
so it gets just a little bit bigger,
a little bigger on every frame.
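To make that per-frame interpolation concrete, here is a small
Python model (not UIKit code). It assumes linear timing, a
hypothetical starting scale of 0.1, and a 0.3 second duration;
the real render server applies whatever timing curve the
animation specifies.

```python
FRAME_RATE = 60  # target frames per second, from the talk


def scale_at(t, duration, start=0.1, end=1.0):
    """Linearly interpolate the scale at time t (seconds) into the animation."""
    progress = min(max(t / duration, 0.0), 1.0)  # clamp to [0, 1]
    return start + (end - start) * progress


def frames(duration, start=0.1, end=1.0):
    """Scale value for each frame, sampled once per 1/60th of a second."""
    count = round(duration * FRAME_RATE)
    return [scale_at(i / FRAME_RATE, duration, start, end) for i in range(count + 1)]


values = frames(0.3)
print(len(values), values[0], values[-1])
```

Each entry in `values` is the scale the renderer would use for
one frame, so the view gets a little bigger on every step.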
So the second part is we calculate the
screen region that needs to be updated.
This is important because our render server is
really, really great at figuring out, "Well,
we don't need to update the whole screen, we
just need to update the things that have changed."
So it can walk over the tree and figure
this out in a very, very nice way.
The third thing it does is it takes the whole view hierarchy
and constructs a scene using quads, using the images
that you drew and your image assets as textures, and creates
a series of graphics commands for the GPU to render this scene.
Once it's handed off to the GPU, it tells it to present the
rendered update to the display and this is a lot of stuff.
And again, it happens on every single frame.
Every 1/60th of a second, that's about 16 milliseconds.
Not a long time right?
So that's behind the scenes of animations.
Let's talk about what can go wrong in stage two.
So, like I said before we create,
commit and render an animation.
And when you create the animation,
you can often set a delay--
or, I'm sorry, a duration, and when you
create that duration, you're basically saying,
"I want the animation to start and
end within this amount of time."
So as you can see, the start of that duration
starts as soon as you create the animation.
Now if you spend too much time preparing and committing that
animation, you know, maybe you spent too much time drawing,
you spent too much time laying out subviews, creating
views, that can end up with a pretty serious delay.
So what can you do?
Well, the first thing that we use-- that we'd say
you should do is to draw less when you're preparing.
So only invalidate views that need to be
updated, only call setNeedsDisplay on visible views,
because if you call it on hidden views, they'll still get drawn,
and only implement drawRect when absolutely needed.
Even if you have an empty drawRect,
this still matters because the system has
to allocate a backing store for it.
Second, you should invalidate smaller
regions of large views.
If you have large views and let's say you
implement something like a painting program,
we have a great mechanism built into the system: you can
implement a smart drawRect and use setNeedsDisplayInRect
so that you can, for example like in the painting
program, only invalidate the regions around touches.
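As a rough illustration of why invalidating only the region
around a touch helps, here is a Python model (not UIKit code)
that computes such a rectangle and compares its area with the
whole view. The 320x480 canvas and the 10 point brush radius are
assumptions; in an app the rectangle would be handed to
setNeedsDisplayInRect.

```python
def dirty_rect(touch_x, touch_y, brush_radius, view_w, view_h):
    """Rectangle (x, y, w, h) around a touch, clipped to the view bounds."""
    left = max(touch_x - brush_radius, 0)
    top = max(touch_y - brush_radius, 0)
    right = min(touch_x + brush_radius, view_w)
    bottom = min(touch_y + brush_radius, view_h)
    return (left, top, max(right - left, 0), max(bottom - top, 0))


def area(rect):
    """Area of an (x, y, w, h) rectangle."""
    return rect[2] * rect[3]


view = (0, 0, 320, 480)                     # hypothetical full-screen canvas
rect = dirty_rect(100, 200, 10, 320, 480)   # hypothetical touch and brush size
print(rect, area(rect) / area(view))
```

In this example the redrawn area is a fraction of a percent of
the full view, which is the saving the talk is pointing at.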
And finally, if that isn't your kind of program,
you know, you should think about taking large views
and decomposing them into the parts that
stay the same and the parts that change.
So the other part of preparing an
animation is if you use image assets.
Your image assets are going to
get decompressed in stage two.
So we often see big delays because people are
using, you know, very, very large images or formats
that just aren't appropriate for the device.
So try to decompress and rescale big images sparingly.
This will, you know, keep your animations quite responsive.
And try to use the, you know, formats
that are optimized for iPhone.
Our iPhone-optimized PNGs, those that are
added to your Xcode projects, are great,
and JPEGs and TIFFs have different tradeoffs.
JPEGs are small on disk but, you know,
perhaps lose some quality, whereas TIFFs are very large
and so they take a little bit more
time to read off of storage.
Finally, if you create any custom CGImages, we highly
encourage you to use the UIGraphics convenience functions.
This will take care of all the nitty-gritty details.
And the problem-- the reason I'm mentioning is that if
you get those details a little bit wrong it's possible
that the system will have to copy those
images for you into the right format.
So try to avoid those.
And you can use the Color Copied Images debug
option in the Core Animation Instrument to see them.
So this is a big topic.
People want to know how to make their animation smoother.
So we're going to go through and talk about
exactly what happens behind the scenes.
We're going to talk about some specific examples of how
you can improve your animations and we're going to look
at a new feature in iPad and iOS 4 that I call dynamic
flattening that can help you create smoother animations.
So let's begin.
So, the render server tries to render your
animation at 60 frames per second.
This again is not very, very much
time to do any form of rendering.
So, fewer pixels to render, means smoother animations.
That means fewer input pixels.
So, if you have very, very large images for
a small view, that's not going to be great.
If you have a lot of blended views, you're
going to end up with a lot more output pixels.
And finally, fewer rendering passes
also means better animation.
So like I said in the beginning, we want to
get you guys to measure what's going on here.
So we're going to be using the Core Animation
Instrument and this is really simple to use.
I hope every one of you has tried using this.
You just plug in your device, launch Instruments,
and select your application and hit record
and it'll show you the number of frames
that were rendered in the last second.
It gives you on a second by second basis that count.
Now, when you're measuring, you want to make sure
that you measure the baseline, as in what it looks
like right now and any changes that you make.
So you make sure that you're actually making improvements.
So I said before that this is a count.
Now, one thing to keep in mind is our animation
from before was only about 300 milliseconds or so.
And when I measured this the first
time, I only saw that I got 18 frames
and I was really disappointed; I thought that was quite slow.
But it turns out that's actually the max that
we're hitting-- that we're shooting for, right?
So, 60 frames per second is our target rate and although
I only rendered 18 frames, that was the correct number.
So you have to do that division if you're going
to be measuring stuff at subsecond levels.
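The division mentioned here can be written out as a couple of
lines of Python. The function names are made up for this
sketch; the 60 frames per second target and the 300 millisecond
animation come from the talk.

```python
TARGET_FPS = 60


def expected_frames(duration_s, fps=TARGET_FPS):
    """Best-case number of frames the renderer can produce in duration_s."""
    return round(duration_s * fps)


def effective_fps(frames_rendered, duration_s):
    """Convert a raw frame count into frames per second."""
    return frames_rendered / duration_s


print(expected_frames(0.3))       # 18
print(effective_fps(18, 0.3))     # about 60
```

So 18 frames counted during a 0.3 second animation is already
the full target rate, not a slow result.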
But there's one thing that you can do, and it's a really
neat trick that I actually really like a lot.
You can lengthen the animation to a few
seconds for much, much better measurements.
And this helps reduce measurement jitter.
This allows you to get the timing just right.
And here we actually see that when I do
that, I was really happy and surprised
that I got 60 frames a second over
the course of that animation.
So, fewer pixels to render mean smoother animation.
Now I've mentioned that the render server actually
looks over the whole view hierarchy and tries to figure
out exactly which screen regions need to be rendered.
Now you can actually see what it figured out needed
to be rendered, using the Core Animation Instrument again.
And we're going to be using the
Flash Updated Regions checkbox.
Now, what this will do is it will
cause parts of your application
that are being updated by the renderer to flash yellow.
Let's take a look at what that looks like.
So, we're bringing in the card, it flashed
yellow, that's exactly what we want to see.
So that gives you kind of a baseline to figure out what
parts of the application I actually need to fiddle
with for this animation that I care about.
So, let's jump in to one specific way that you
can improve the smoothness of your animation.
We said this before.
You want to reduce the amount of view blending
in your animation and in your view hierarchy.
So again, let's take a look at how you can see that.
So in the Core Animation Instrument, we're going
to be checking Color Blended Layers.
And let's take a look at what that looks like.
So, we have the same view from before except
that now the opaque regions are shaded green
and the blended regions are shaded red.
And as you can see towards the top, the regions
that are even more deeply blended are a darker red.
So, why does this matter?
Well, the graphic system can perform only a certain number of
pixel operations per frame and still maintain a smooth frame rate.
And blending requires more operations per on-screen pixel.
So if you're just putting down an opaque view onto the screen,
you can just write each of those pixels out, right?
If you have to blend, the graphic system has to read the
value there and then write the ending pixel because it needs
to figure out what it's actually blending to, right?
So the second reason is, the graphic system
supports efficient hidden surface removal.
That is if something is fully occluded
then we can get rid of it.
And it doesn't even have to touch that surface.
Never even gets rendered.
So-- But it can only avoid views that
are completely occluded by opaque views.
So by making all the views that should be opaque
actually opaque, you can help out the graphic system
and it can render your animation very smoothly.
So, I mentioned that the graphic system can perform a
certain number of operations per frame and we're going
to take a look at what that means for
opaque views and for blended views.
So here on the left, we have that view of
my animation with color blended layers on
and you can see the opaque ones
are green and the blended ones red.
And on the right, we have a rectangle that represents the
approximate number of pixel operations per frame at 60 fps.
So let's see what happens when we bring over the
opaque pixel, the opaque views when they get drawn.
I'm sorry, when they get rendered.
So as you can see they're overlapping over on the
actual view and they're overlapping here as well.
Now why is that?
That's because our graphic system supports
something called deferred rendering.
Basically, what that allows us to do is figure out
what's overlapping what, so we don't need
to spend any more time drawing things-- I'm sorry, rendering
things that are just going to be overlapped anyway.
However, we can still have these blended views left.
So let's see what happens when we bring those over.
So as you can see, the pixel operations
that these blended views take up don't take advantage
of this optimization and so they use up
even more pixel operations per frame.
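The budget picture above can be approximated with a toy Python
cost model for a single screen pixel. The costs are assumptions
taken from the description in the talk (an opaque write is one
operation, a blend is a read plus a write), and the model is
only a sketch of the idea, not of how the GPU schedules work.

```python
def pixel_ops(layers_top_to_bottom):
    """Toy cost for ONE screen pixel covered by a stack of layers.

    Each layer is the string "opaque" or "blended", listed from the
    topmost layer down. Assumed costs: an opaque write is 1 operation,
    a blend is 2 (read the destination, then write the result).
    """
    ops = 0
    for kind in layers_top_to_bottom:
        if kind == "opaque":
            ops += 1
            break  # everything underneath is fully hidden, so it is skipped
        ops += 2   # blended layer: the layers underneath still show through
    return ops


print(pixel_ops(["opaque", "opaque", "opaque"]))    # 1
print(pixel_ops(["blended", "blended", "opaque"]))  # 5
```

Three stacked opaque views cost the same as one, while each
blended view on top adds to the bill for every pixel it covers.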
So what can you do about it?
Well, first it's important to understand how
views get marked as needing to be blended.
Contents determine the blending.
So there are three ways that happens.
First, views that you draw are opaque by default,
and you have to actually set that--
set the opaque flag to NO and implement drawRect.
Once you do that, it's blended.
Second, if you use image assets like PNGs,
they can often contain an alpha channel.
If you look at your image assets with
Preview and hit the Get Info button,
you can actually see whether or not your PNGs have alpha.
And if you didn't intend for that image asset to be
blended in your system, then you really should go
in to your Image Editor, resave
that, and get rid of that alpha.
The third way that views can become blended is by
creating opaque-- sorry, creating custom CGImages.
And if you use the UIGraphics convenience functions,
we make it really easy to have you pass yes
to opaque and then the image ends up being opaque.
So these are the ways that contents
become blended or opaque.
And the next question is, well, what
else can you do about it, right?
Once you've fixed all the accidental blending in your
views, let's say you still have a lot of blending.
Well, the next thing to remember is that it's the number
of pixels that are rendered and the number of pixels
that are blended that impact the performance of this.
And so, if you decompose a large blended
view into the parts that actually need
to be blended and the parts that are still opaque,
even though that ends up being more views,
that still ends up with better performance.
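A quick back-of-the-envelope version of that tradeoff, in
Python. The view sizes are hypothetical, and the per-pixel costs
are the same assumed ones as in the talk's description (one
write for opaque, a read plus a write for blended).

```python
OPAQUE_COST = 1   # assumed: one write per pixel
BLENDED_COST = 2  # assumed: one read plus one write per pixel


def cost(width, height, blended):
    """Assumed pixel operations needed to render one view."""
    return width * height * (BLENDED_COST if blended else OPAQUE_COST)


# Hypothetical 320x100 view where only a 320x10 strip really needs alpha.
single_view = cost(320, 100, blended=True)
decomposed = cost(320, 90, blended=False) + cost(320, 10, blended=True)
print(single_view, decomposed)
```

Under these assumptions the two-view version needs a little over
half the pixel operations of the single blended view.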
OK. So that's view blending.
Now we're going to move on to talking
about Offscreen rendering.
So what is Offscreen rendering?
Well, to achieve certain effects on our system,
the compositor or the render server needs
to use a temporary offscreen region
in order to achieve the final effect.
And one way of thinking about this is how a painter will
use their color palette and take two colors together
and mix them before painting on their final canvas.
So why is this slow?
Well, besides the fact that you are rendering more pixels,
you're rendering to this offscreen context and then taking
that and then rendering to the final display.
You're also switching between the
main and offscreen contexts,
and that will stall the graphics pipeline
and, you know, really hurt performance.
So let's take a look at how you can
detect this in your own animations.
So again, we're going to use the Core
Animation Instrument and we're going
to be checking the Color Offscreen-Rendered Yellow flag.
Now this shades yellow portions of your animation that had
to be rendered offscreen and then back to the main screen.
So this isn't quite as easy as blended views.
I can't just tell you to go find the blended views
that aren't supposed to be blended and get rid of them.
Avoiding this kind of offscreen rendering
requires some creative solutions.
So let's take a look at a couple of examples
and some workarounds that we've come up with for you.
So let's say I have an animation where I take an image with
a background color and I fade the opacity from solid to blank.
So it looks something like this,
where we have the image, we
set the background color, and we begin an
animation, set the alpha to 0 and it fades away.
Now to composite correctly, the image
needs to be composited over the color
at full opacity offscreen and then blended into the view.
So let's take a look at that.
So, you can see that the blue in the iTunes icon
and the greens in the face in the photos look right.
They're faded over black.
They're just dimmed a little bit.
Now, what do I mean by correctly? One naive way of trying to avoid
this kind of offscreen rendering is just to say what happens
if I just blend that orange color into the background
at the lower opacity and then blend in the image.
So you basically break this out into two separate layers.
Let's take a look at what that looks like.
So as you can see, the image doesn't look quite right.
You know, the-- the blue isn't quite right and, you
know, the face in the background gets this orange tint.
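The orange tint falls straight out of the blending arithmetic.
Here is a small Python sketch using the standard "source over
destination" formula on one pixel; the specific colors and the
0.5 opacity are made-up values for illustration.

```python
def over(src, dst, alpha):
    """Blend src over dst at the given opacity, per RGB channel."""
    return tuple(alpha * s + (1 - alpha) * d for s, d in zip(src, dst))


BLACK = (0.0, 0.0, 0.0)
ORANGE = (1.0, 0.5, 0.0)   # the view's background color (hypothetical)
BLUE = (0.0, 0.0, 1.0)     # an opaque pixel of the image (hypothetical)
alpha = 0.5

# Correct: flatten image over color at full opacity, then fade the result.
flattened = over(BLUE, ORANGE, 1.0)
correct = over(flattened, BLACK, alpha)

# Naive: fade the color and the image as two separate layers.
naive = over(BLUE, over(ORANGE, BLACK, alpha), alpha)

print(correct)  # (0.0, 0.0, 0.5)
print(naive)    # (0.25, 0.125, 0.5)
```

The naive result picks up red and green from the half-faded
orange underneath, which is the tint you see in the image.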
So let's take a look at some other
ways that you might approach this.
So one workaround is to composite the background
color and image in drawRect first just like that.
And then when you fade it out, so here we've actually
drawn it and then when it's actually faded out,
when it's being rendered, it doesn't have to
go offscreen and just fades out very nicely.
So this falls in the category of thinking about what
the graphic system needs to do to render that offscreen
and do it ahead of time using Core Graphics in drawRect.
So, one more workaround.
So if we're fading over a static
background like here, this is great.
We have a black background.
You know, nothing is changing behind it.
There are no patterns.
You can try fading in a view that contains
the background over the view instead.
So it might look something like this.
We just create a new view with the bounds of our view.
We set the background color to black
and then we fade in from 0 to 1 instead.
So let's see what that looks like.
It's great, it looks exactly the same.
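Why it looks the same can be checked with the blending formula.
This Python sketch compares, for one pixel, fading the content
out over black with fading a black cover in over the content.
The pixel color is hypothetical and the background is assumed to
be static black, as in the example.

```python
import math


def over(src, dst, alpha):
    """Blend src over dst at the given opacity, per RGB channel."""
    return tuple(alpha * s + (1 - alpha) * d for s, d in zip(src, dst))


BLACK = (0.0, 0.0, 0.0)
content = (0.2, 0.6, 1.0)   # hypothetical pixel of the view being faded


def matches(a):
    """True if both techniques give the same pixel when the view is at alpha a."""
    fade_out = over(content, BLACK, a)        # original: view fades out
    cover_in = over(BLACK, content, 1 - a)    # workaround: black cover fades in
    return all(math.isclose(x, y, abs_tol=1e-12) for x, y in zip(fade_out, cover_in))


print(all(matches(step / 10) for step in range(11)))
```

Both give alpha times the content color, so the cover only needs
a simple blend and no offscreen pass.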
So the lesson to this is if you see a situation where
you have offscreen rendering, there are often alternate ways
of getting the same effect visually
by using a different technique.
Let's take a look at another example.
So new on iPad and iOS 4, CALayers now support
this great property called cornerRadius.
And this allows you to get these nice
rounded corners on your views really easily;
you just pop that property on.
Now, animating a view with a rounded
corner mask, as you will find out,
requires that the renderer go
offscreen, render your image and then apply
that mask that contains the rounded corner.
Now, this actually applies to all
masking that's non-pixel aligned.
So if you have an arbitrary mask layer that you
assign to a CALayer, you'll see this as well.
Or if you're moving a view that has clipsToBounds
set and it's moving on non-pixel boundaries.
In any case, let's take a look
at the rounded corner example.
So, what can we do about it?
Well, again, we have two workarounds.
The first is to try to achieve the same
effect in Core Graphics ahead of time.
New on iPad and iOS 4 is this great UIBezierPath API, which
allows us to just draw a path with that rounded rect
with those great rounded corners, and we set that as the
clipping region for whatever we draw into our background.
So when we draw, the rounded corners come with it.
Now of course, this only works if the
corners don't have to clip anything
that actually goes outside of the rounded areas.
But in our original situation, it works great.
So the second workaround is to decompose
rounded corners into separate views.
So basically, all that means is that you're creating
four small views that are positioned around your view
that have a little black sliver drawn into it.
And here is some code for the top left corner.
That's pretty simple.
Again, this is one of those cases where
you can achieve the same visual effect just
by using a different technique and
you can avoid offscreen rendering.
OK, those are some ways that you
can avoid offscreen rendering.
Remember, fewer pixels to render means smoother animations.
And fewer rendering passes
also means smoother animations.
So let's talk about that new feature
I talked about, dynamic flattening.
Now this is new in iOS 4 and on the iPad.
And the reason we added this is that animating
changes to a complex view hierarchy can be choppy.
So here we have the little subhierarchy that we added to
our existing hierarchy in our app for the card animation.
Now why do I say this is slow?
Well, it renders the hierarchy on every single frame.
So as it's scaling in, it walks over that tree,
renders those subviews together and then scales it.
So this animation would be smoother
with a flattened hierarchy, right?
You do the work once.
You draw it in drawRect and then just have
it scale over time; sure, that's a lot faster.
But as I'm sure you guys know, it's kind of a pain to change
your whole view hierarchy, and then you lose the dynamism
of being able to move views around independently.
So now you can flatten without changing
your view hierarchy using shouldRasterize.
Let's see how you can use it.
So like I said it's a CALayer property.
Generally, the way you want to use it is you want to turn
it on before animations and turn it off after animations.
Let's take a look at some code.
So this is using the new animateWithDuration using blocks.
So I'm just going to quickly walk through this example.
So we created that scale before.
Before the animation starts, we set shouldRasterize.
And then during the animation, we just set the
transform to identity so that brings it all the way up
and then when the animation is done the
system will call my completion block here
and all that it does is it sets
that shouldRasterize back to NO.
So let's see what that looks like.
Let's see how it works.
So here we have the subview hierarchy again.
And we're going to hint the compositor that it should
render this view hierarchy offscreen and then cache it.
Now I just warned you a whole lot
about avoiding offscreen rendering.
But I want to assure you this time,
this is offscreen rendering for good.
And this is how it's going to work.
When we render for the first time, it's actually going to
render into this offscreen region and then it's cached.
We're actually going to keep this around for frame to frame.
And then on each step of the animation, it's
going to get rendered over just like that.
So this is actually really, really nice
but it can hurt more than it can help.
Don't turn it on everywhere because
there's a limited cache size.
And if you start setting lots of views with shouldRasterize,
you're going to overflow the cache and that ends
up in a really, really bad situation,
ends up being much worse than before
because essentially you're rendering every single view
that you set with shouldRasterize offscreen and then back
onto the screen, and we just talked about how doing
that in every frame can really, really
hurt your animation performance.
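To see why overflowing the cache is so much worse than just a
few extra misses, here is a toy Python model. The talk does not
describe the real eviction policy, so the least-recently-used
policy, the layer sizes, and the capacities below are all
assumptions made for illustration.

```python
from collections import OrderedDict


def offscreen_passes(layer_sizes, cache_capacity, frames):
    """Count offscreen renders with a least-recently-used cache.

    layer_sizes maps a layer name to its (hypothetical) cached size.
    Every frame touches every layer once, in the same order.
    """
    cache = OrderedDict()   # name -> size, oldest entry first
    used = 0
    passes = 0
    for _ in range(frames):
        for name, size in layer_sizes.items():
            if name in cache:
                cache.move_to_end(name)      # hit: reuse the cached bitmap
                continue
            passes += 1                      # miss: render offscreen again
            while cache and used + size > cache_capacity:
                _, evicted = cache.popitem(last=False)
                used -= evicted
            if size <= cache_capacity:
                cache[name] = size
                used += size
    return passes


layers = {"a": 40, "b": 40, "c": 40}
print(offscreen_passes(layers, cache_capacity=120, frames=60))  # 3
print(offscreen_passes(layers, cache_capacity=100, frames=60))  # 180
```

When everything fits, each layer is rendered offscreen once.
When it just barely does not fit, every layer misses on every
frame, which matches the "much worse than before" warning.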
And like any good cache, it throws away old results.
So if you change anything in your
view hierarchy during your animation,
the render server actually has to throw away
that cached copy and then render a brand new one
in order to actually show the proper results.
So make sure you don't change anything in your
view hierarchy while you have shouldRasterize on,
otherwise you end up rendering
offscreen without great performance.
So that's smooth animations.
Remember, rendering fewer pixels means smoother animation
and that applies to blending, that's fewer output pixels
that applies to rendering passes as well,
so reduce the amount of offscreen rendering.
So let's talk about scrolling.
Now, I talked a lot about animations first.
But a lot of you probably care a little bit
more about scrolling, so why did I do that?
Well, it turns out that each frame
of scrolling is a small animation.
When you flick that scroll view, every 1/60th
of a second it issues a brand new animation,
and that's calculating a new scroll
position; that's the implicit animation.
It's going to prepare and commit that animation.
So if you have a new cell coming on screen,
your layoutSubviews gets called, your
cellForRowAtIndexPath gets called.
And the compositor has to render a brand new frame.
So, the animation advice I gave earlier totally applies.
Prepare your cells very quickly and then render very quickly.
So here's another timeline diagram like I showed you before.
As you can see these animations are squished together really
tight because they have to happen within 16 milliseconds
in order to get that nice scrolling effect.
So first thing that happens is that we create an
animation implicitly by calculating a new scroll position.
We prepare and commit the animation.
So this is what happens when your cell gets laid
out and this is where all the drawing happens.
And finally the frame is rendered.
Now, when I say the frame is rendered I really do
mean the whole table view gets rendered, right?
Because every time you scroll, each of those
cells is moving in a different position.
So, prepare cells quickly.
Now, there are two major parts to preparation, right?
There's layout and there's drawing.
And on the layout side, that's when the table view gets
to tell you, "Well, you've adjusted the scroll position.
Now give me your new cell if there's
a new cell appearing on screen."
So you want to use dequeueReusableCellWithIdentifier.
Reuse those table cells.
It's not necessarily an advanced
tip but we do have to mention it.
You will save a ton of time creating objects
and backing stores for each of those cells.
And you know, be sure to use a unique
identifier for each kind of cell.
If you have lots of different kinds of cells in your table
view, don't spend the time to transform one to another.
Use a little bit extra memory and
give them different identifiers.
You can save time and you can get
those cells up really quickly.
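The saving from reuse is easy to see in a toy model. This
Python class is a made-up stand-in for the table view's reuse
queue, not the UIKit implementation; the "photo" identifier and
the eight visible rows are hypothetical.

```python
from collections import defaultdict


class ReusePool:
    """Toy stand-in for a table view's queue of reusable cells."""

    def __init__(self):
        self.free = defaultdict(list)   # identifier -> idle cells
        self.created = 0

    def dequeue(self, identifier):
        """Return an idle cell of this kind, or build one if none is idle."""
        if self.free[identifier]:
            return self.free[identifier].pop()
        self.created += 1
        return {"identifier": identifier, "number": self.created}

    def recycle(self, cell):
        """Called when a cell scrolls off screen."""
        self.free[cell["identifier"]].append(cell)


pool = ReusePool()
visible = [pool.dequeue("photo") for _ in range(8)]   # first screenful
for row in range(1000):                                # scroll 1000 rows
    pool.recycle(visible.pop(0))
    visible.append(pool.dequeue("photo"))
print(pool.created)  # 8
```

Scrolling through a thousand rows only ever builds the first
screenful of cells. Keeping a separate list per identifier is
what lets each kind of cell come back without being converted.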
So the second part of preparation is drawing
and we've told a lot of you in the past
to flatten a view hierarchy of your cells.
And that's actually a really good idea.
I like that.
Because what it does is it reduces the amount of time
that it takes to render those cells in the table view.
And so you end up with a nice scrolling effect, right?
It's very, very smooth.
However, I have seen some applications
where it scrolls nice and smoothly right
up until you get a new cell and then it jumps.
Now, what can happen here is too much cell
drawing, you can spend a lot of time drawing all
of these views into the same cell and that isn't great.
That's not a great experience.
Your table view isn't scrolling very smoothly.
So this is a nuanced point and of course, you're going to
have to measure and experiment to see what works for you.
But if you have that kind of scroll view in
your application where it scrolls very smoothly
and then it jumps when you get a new cell.
First measure it.
But if you find that you spend a lot of
time drawing cells I have a tip for you.
So elements that need to be rasterized
anyway, text labels, things with paths:
be sure to just flatten those together; that makes sense.
Just flatten all those labels together.
Those don't need to be composited by the render server.
Do it once in Core Graphics, you'll be happy.
But for elements that are just images, let's
say you would put them into an image view,
you might consider letting the rasterizer handle--
sorry, the renderer handle a few of those.
And basically, what you're doing is you're balancing the
time on the CPU spent when you're creating a new cell
and the amount of time spent on
the GPU on every scroll change.
So, like I said, there are two halves, you want
to prepare cells quickly and render quickly.
Now, all of the lessons that we talked
about from smooth animations apply here.
Fewer pixels to render means smoother scrolling as well.
So simplify the structure of your view hierarchies.
If you have any unnecessary or
invisible views, just get rid of them.
You want to reduce the amount of
view blending as much as possible.
So, use Color Blended Layers
to see what's going on in your table views,
and reduce any offscreen rendering that you might have.
That would actually be really, really bad in this situation.
And again, new to iPad and iOS 4, you can try to
use the dynamic flattening property-- the
shouldRasterize property, rather-- to flatten
your cell hierarchies if you haven't already.
And this might be a nice way of making scrolling
performance just a little bit better for the cost.
One caveat though.
Your cell animations will not look
great if you keep this on all the time.
So if you end up doing a rotation, you want to
turn this off just before the rotation starts.
And for any edit animations, you want to turn
this off before the edit animation starts.
So, we shipped a lot of devices
that run iPhone OS and iOS 4.
Last year we shipped the iPhone 3GS
and the iPod Touch-- new iPod Touch.
And these have twice the CPU power of the previous
generation, twice the RAM and the GPUs are way faster
but what that means is, there is a big
gap between what you're developing
on if you're using an iPhone 3GS and the iPhone 3G.
And we have millions of customers who
have iPhone 3Gs and iPod Touches.
And we want your apps to look great on them.
So if you can, keep around one of these devices and
make sure to test on the devices you intend to target.
iOS 4 runs on the 3G, the iPod Touch and iPhone 4.
And we want everything to look great across those.
Now, I want to talk about the iPad.
We shipped this a couple of months
ago and we think it's great.
It has an even faster CPU with the A4.
And even though it has 5 times as many pixels
and about the same graphics capability,
we doubled the bandwidth of the bus.
And so we think that this has great
graphics performance as well.
And finally, the brand new iPhone 4.
Again, it has the A4 chip so the CPU is a lot faster
because, hey, now you have 4 times as many pixels to draw.
And even though you have 4 times as many pixels, you're
really going to want to get your hands on one of these
to make sure that all your animations look smooth.
Because things are going to double and you
want to make sure that everything looks great.
So that's animation and scrolling.
Let's talk a little bit about keeping
your applications snappy and responsive.
So the key part of responsiveness is
simply, do not make your users wait.
We're going to talk about how you
can measure some of these things.
We're going to talk a little bit about launch, interaction
delays and just a couple of notes about CPU optimization.
So Time Profiler Instrument, this is new
in iPhone SDK 4, not Xcode 4, iPhone SDK 4.
And it's actually a wonderful statistical
sampling profiling tool.
And what that means is every millisecond you
can see what's happening in your program.
It can take a look at, you know, the stacks of exactly
where things are using the CPU and if things are blocking.
So, the way you should use this during your development
process is if you come up with a performance issue,
use this to measure what's happening during that scenario.
Measure first, and then as you drill down,
you can actually find some of the problems.
You'll actually see the different landmarks
of your code and be able to see, wow,
I didn't realize that that was going to take so long.
So by default, this shows time spent on the CPU.
If you want, you can use this great check box here.
This was just hitting the information
button there called "All Thread States."
It actually shows the time spent in blocking as well.
Now, for stuff that's running on the
main thread, that's hugely important.
You can use this tool to narrow down to exactly
what's happening in your program on the main thread.
And if you see things blocking, and that's
not kind of the normal, you know,
blocking on new events, then you
want to chase those down.
So once you found your problem, you want to measure
exactly how long that particular section of code is taking.
So you want to take a baseline.
Again, what it looks like right now.
And as you make changes to try to make that
faster, you want to actually see real changes.
We recommend just simply timing the start
and end using CFAbsoluteTimeGetCurrent.
Now for those of you that want to know, this
is wall clock time and it's user time, too.
So simply use it like this.
It's pretty easy to use.
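The slide code is not captured in this transcript. As a minimal sketch of the start-and-end timing pattern, the C below uses the standard C11 timespec_get as a portable stand-in for CFAbsoluteTimeGetCurrent (both report wall-clock time in seconds). The names now_seconds, time_section, and section_under_test are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Wall-clock time in seconds. Portable stand-in (an assumption of this
   sketch) for CFAbsoluteTimeGetCurrent(), which also returns wall-clock
   seconds as a double. */
double now_seconds(void) {
    struct timespec ts;
    int base = timespec_get(&ts, TIME_UTC);
    assert(base == TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical workload standing in for the code being investigated. */
void section_under_test(void) {
    volatile unsigned long sink = 0;
    for (unsigned long i = 0; i < 1000000UL; i++) {
        sink += i;
    }
}

/* Time a single run of `work` and return the elapsed seconds. */
double time_section(void (*work)(void)) {
    assert(work != NULL);
    double start = now_seconds();
    work();
    return now_seconds() - start;
}
```

Call time_section(section_under_test) once to record a baseline, make a change, and call it again to compare. On iOS the two now_seconds() calls would simply be calls to CFAbsoluteTimeGetCurrent().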
So I want to talk a little bit about launch.
If you were at the Performance Optimization on
iPhone talk, we talked a little bit about it there, too.
Now, I encourage everybody to measure, right.
But this is a little bit tricky to measure the total
number of time-- the total amount of time during launch.
You can start by measuring the amount
of time between the start of main
and the end of applicationDidFinishLaunching.
That gives you a good sense of what's happening.
But the other thing to do maybe is
to time launch using Time Profiler.
Now, you can't use this as an absolute measurement
because obviously there's sampling going on.
But it's actually a really, really useful tool for
relative measurement as you're making changes.
And it helps you figure out what your
application is actually doing at launch.
So what can you do?
Well, do only what's necessary on launch.
Can you defer the work that you
see that your application is doing?
Could you do it on demand?
We have a philosophy in-- our application
development of being lazy.
If you can be lazy and do it on demand, that's great,
because the user might not need network ever at all.
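As a rough illustration of doing work on demand rather than at launch, here is a minimal C sketch. NetworkStack, network_load, and network_get are hypothetical names; the expensive setup is simulated by a counter so the effect is easy to see.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subsystem that is expensive to set up. */
typedef struct {
    bool loaded;
    int  load_count;   /* how many times the expensive setup actually ran */
} NetworkStack;

NetworkStack g_network = { false, 0 };

/* Stand-in for the expensive setup work we want to keep out of launch. */
void network_load(NetworkStack *n) {
    assert(n != NULL);
    n->loaded = true;
    n->load_count++;
}

/* Accessor that performs the setup on first use only. If the user never
   needs the network, the setup cost is never paid. */
NetworkStack *network_get(void) {
    if (!g_network.loaded) {
        network_load(&g_network);
    }
    return &g_network;
}
```

Nothing runs at startup; the first call to network_get() pays the setup cost, and later calls return the already initialized object.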
The second point is, reduce the number of linked frameworks.
I know when I'm developing I sometimes try
out the brand new frameworks and add them
into my project to try out a brand new feature.
That's great, you should experiment.
But before you ship and build the final product,
you should make sure to remove those
frameworks from your Xcode project.
Because when you have those in there, the system will
actually try to load those at launch and you want to reduce
that as much as possible because that can
cause I/O and cause initializers to run.
You want to reduce that as much as possible.
And if you're using third-party libraries,
you want to look out for static initializers.
Now these are usually C++ methods
that are called to initialize a class
and you can detect these using these environment variables.
The other thing that you want to
look out for is weak exports.
These are somewhat rare but we've
actually seen them in the field.
And you can quickly check for these using otool like this.
This is how you set up environment variables to
run in Xcode if you haven't seen that before.
And we have some sample output of what gets printed with
the print statistics and print initializers options.
So interaction delays, the key thing
here, simply do not block the main thread.
And by that I mean not just blocking operations.
But if you have any operation that's taking
longer than about a few milliseconds or so,
you really want to spin it off into the background.
We have lots of scroll views on our system
and people love idly playing with them.
And if you block for a few frames,
it can be really disconcerting.
So, long-running tasks should be
spun off into the background.
And you should try to factor these
into executable units of work
so that you can really show progress
to the user if it's really long running.
Remember to make UI updates back onto the
main thread once you actually do this.
And in iOS 4 it's really easy to do
with NSOperationQueue and blocks.
So here we have some sample code.
And what this is doing is just
creating an image and we're going
to do some custom drawing to that on the background thread.
And when that's ready, we're going to post
it to the main thread with an image view.
So the first thing we do is we just
create an operation with this block.
And here, we're just creating image context with options.
By the way this is now thread safe in iOS 4.
You can totally use this on the background.
It's great.
And then grab the current-- grab
the image from that current context.
Now, so our drawing is all done here and maybe that's
like 100 milliseconds to 200 milliseconds or so.
Now we want to post that to the main thread.
Now of course, you can't really modify stuff on the
main thread from background threads because a lot
of the UIKit code isn't thread safe here.
So what can we do about that?
How do we get the image to the main thread?
Well with blocks, it's really easy.
All you do is create a new operation with this
block on the main queue, run on the main thread,
and we just create a new UIImageView and use that image.
Pretty simple, huh?
So the next tip I have about responsiveness:
always make URL requests asynchronous.
So I've seen this code, it's really easy to
use sendSynchronousRequest, boom, it's done.
Unfortunately, you don't know what the users are
going to have in terms of network connectivity, right.
They could be somewhere where the network is a little
bit flaky and they start to make the connection that kind
of goes through and, you know, it doesn't
take very long for people to get frustrated.
So it's a little bit more code but it's worth it.
Use connection with request, implement
the delegate of NSURLConnection and you--
this all happens in the background and then your main
thread is nice and free for users to interact with.
You'll get callbacks when you receive data,
when things fail, and when it's finished loading.
One more note, spikes in memory usage can cause delays.
And why is this?
Well it turns out that to accommodate
higher memory usage, code is evicted.
And what I mean by that is the code
that's on the system is usually at--
is usually filling up the rest of the memory that's free.
And if you spike memory usage like this, the system actually
has to kick out something in order to give you more memory.
And what it usually kicks out is code.
And so as you see, it will spike up there
with very little code left in the system.
And when you bring memory back down,
it doesn't just magically fill in.
It has to be read back from storage
to proceed, and that can take a long time.
This is probably one of the most
common and unexplained delays
that people will find whether-- when
their application is unresponsive.
You'll do some operations, you'll sample it you say,
"Well, I'm not spending a whole lot of CPU time here.
Where is the time going?"
Oftentimes, it's reading back the code
that your application needs to proceed,
and this is framework code, your code, system libraries.
So one final note about responsiveness, we have
some great tools on the system including Time Profiler
that allows you to actually find hot spots in your code.
And as you could see it will actually give you
statistics about each individual line of code
and even to the point of each individual instruction.
It's pretty handy.
So one tip we have about this besides, you know, of
course making sure that your algorithms are as optimized
as possible is to use a feature that we have built
into our CPUs and it's called "vector processing."
And what vector processing is, it's the way that we
can use the chips to process many elements at once.
So let's say about four elements
at a time in this situation.
So let's say we have some sample code here which is
pretty simple, we're just walking along this array
and we're summing up the total values into this float.
In iOS 4, we have a new framework called
"Accelerate" and this is great stuff.
This is all really, really highly optimized
code that you can just use out of the box
that will give us this vector processing
to do lots of operations at once.
So in this case, we're using the--
summing the vector elements.
And it's one simple line of code and
operates on this vector four at a time.
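The slide code is not in the transcript, so the following is a portable C sketch of the idea rather than the code shown on stage. sum_scalar is the one-element-at-a-time loop described above; sum_four_lanes keeps four independent partial sums to mirror what a four-wide vector unit does. Both function names are hypothetical. On iOS, the Accelerate routine for summing vector elements is, to the best of my knowledge, vDSP_sve, which would replace the whole loop with a single call.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar baseline: walk the array and add one element per iteration. */
float sum_scalar(const float *v, size_t n) {
    assert(v != NULL || n == 0);
    float total = 0.0f;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* Four independent partial sums, one per "lane", processed four elements
   at a time. This only imitates in plain C what vector hardware does in
   a single instruction. */
float sum_four_lanes(const float *v, size_t n) {
    assert(v != NULL || n == 0);
    float lane[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        lane[0] += v[i];
        lane[1] += v[i + 1];
        lane[2] += v[i + 2];
        lane[3] += v[i + 3];
    }
    float total = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    for (; i < n; i++) {   /* leftover elements when n is not a multiple of 4 */
        total += v[i];
    }
    return total;
}
```

Because floating-point addition is not associative, the two functions can differ in the last bits for arbitrary data; for small integer-valued inputs they agree exactly.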
So again, don't make users wait,
measure the problem situations,
look for situations where you can improve
the interaction time in your application.
So with that I'm going to hand things
off to my colleague Peter Handel
and he'll be talking about power and battery life.
[Applause] Thank you.
[ Applause ]
>> Peter Handel: Hi, everyone.
My name is Peter Handel.
I'm an iOS Power Engineer and I've been
doing that for almost four years now.
I'd like to share with you some tips and tricks
and how you can improve the battery life
of your application in three key areas:
when using the radio to send and receive data, when using
Core Location to figure out where your device is located,
and when using the CPU and GPU to get
your work done and draw on the screen.
First the network, transmitting data over 3G is
one of the most power intensive things you can do.
This is exacerbated by the fact
that 3G networks keep the 3G radios
in a high-power state for a few seconds
after data transmission.
Therefore, if you were to send and receive even
just a little bit of data, every few seconds,
you'd keep those power-hungry 3G radios
in a high-power state the entire time.
That's one of the quickest ways
I know how to drain your battery.
So how can we enjoy the high speed and wide availability
of 3G while still maintaining excellent battery life?
Here's a few tips.
First off, use the Activity Monitor tool
which is part of Instruments to figure
out how much networking your application is doing.
Next, coalesce your data into large chunks
rather than transmitting a thin stream of data.
If you notice that your application is transmitting a
thin stream of data, this may be because you're polling
across the network to check to see whether
an event has occurred on the server.
Try to avoid this at all costs.
Let me repeat that.
Try to avoid polling over the network at all costs.
We came across an application, a little
chat application which checked
with the server every few seconds
to see whether a new chat had come in.
And as you can imagine, it just chewed through the battery.
Instead, try to use the Apple Push
Notification service if you can.
Also, minimize the amount of data transmitted.
Use a compact data format or maybe even
compress your data before you transmit it.
And finally, be real careful when
you reuse legacy or third party code
because oftentimes this code will
assume that it's just on Ethernet.
So for the 3G radio chip, let that chip idle.
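To make the coalescing tip concrete, here is a minimal C sketch of a batching buffer. Batcher, batcher_add, batcher_flush, and BATCH_CAPACITY are hypothetical names, and the actual network send is replaced by a counter so the number of radio wakeups is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BATCH_CAPACITY 4096   /* illustrative size, not a recommendation */

typedef struct {
    unsigned char data[BATCH_CAPACITY];
    size_t used;
    int transmissions;   /* how many times the radio would be woken */
} Batcher;

/* Send everything buffered so far as one chunk. A real application would
   hand b->data to its networking code here; this sketch only counts. */
void batcher_flush(Batcher *b) {
    assert(b != NULL);
    if (b->used == 0) {
        return;
    }
    b->transmissions++;
    b->used = 0;
}

/* Queue a small message instead of sending it right away; transmit only
   when the buffer would overflow. */
void batcher_add(Batcher *b, const void *bytes, size_t len) {
    assert(b != NULL && bytes != NULL && len <= BATCH_CAPACITY);
    if (b->used + len > BATCH_CAPACITY) {
        batcher_flush(b);
    }
    memcpy(b->data + b->used, bytes, len);
    b->used += len;
}
```

Fifty 100-byte messages added this way, followed by one final batcher_flush, produce two transmissions instead of fifty.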
From a power perspective, Wi-Fi
uses roughly half the power of 3G.
Now this obviously depends on network
characteristics but it's kind of a rule of thumb.
Also, note that the Wi-Fi network
will allow the Wi-Fi radios
to enter a low power state immediately after transmission.
Because of these 2 things, your application may want to know
when it's on Wi-Fi versus when it's on the cell network.
To check this, use the kSCNetworkReachabilityFlagsIsWWAN flag.
Where does 2G fit into this mix?
Well, from a power perspective, it fits
in roughly halfway between Wi-Fi and 3G.
Also, like Wi-Fi, the 2G network will allow the
2G radio to enter the low power state immediately
after data transmission, and that's the radios.
Next, Core Location.
Judging by the number of apps in the App Store that use Core
Location, you guys love it and your customers love it too.
If you haven't used it, Core Location is an API that
with just a few lines of code which I have up here,
will allow your device to figure out where
it's located to varying degrees of accuracy.
However, be sure to only use the
least amount of accuracy you can
because the higher level of accuracy uses more power.
For example, if you have a coffee shop
finder application which can tell which--
and Core Location can tell you that you're here at the
Moscone Center, that's probably good enough to know
that there's a coffee shop right across the street.
So in this situation, you use the nearest--
I'm sorry, you use the 100 meters accuracy.
Next, the distanceFilter.
This dictates how often you receive
location changed updates.
Be sure to set it appropriately because the
default is to receive every single notification.
And as you can imagine, this would
lead to a lot of unnecessary events
and higher CPU usage and worse battery life.
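To illustrate what a distance filter does, here is a portable C sketch of the idea; it is not Core Location's implementation. It assumes positions are already expressed in meters on a flat plane, and the names Point, DistanceFilter, and filter_accept are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    double x_m;   /* position in meters (flat-plane assumption) */
    double y_m;
} Point;

typedef struct {
    double min_distance_m;   /* minimum movement before an update is delivered */
    Point  last;             /* last position that was delivered */
    bool   has_last;
    int    delivered;        /* number of updates passed to the app */
} DistanceFilter;

/* Return true (and count a delivery) only if `p` is at least
   min_distance_m away from the last delivered position. */
bool filter_accept(DistanceFilter *f, Point p) {
    assert(f != NULL && f->min_distance_m >= 0.0);
    if (f->has_last) {
        double dx = p.x_m - f->last.x_m;
        double dy = p.y_m - f->last.y_m;
        /* Compare squared distances to avoid a square root. */
        if (dx * dx + dy * dy < f->min_distance_m * f->min_distance_m) {
            return false;
        }
    }
    f->last = p;
    f->has_last = true;
    f->delivered++;
    return true;
}
```

With a 100 meter filter, a walk of 250 meters sampled every 10 meters delivers three updates (at 0, 100, and 200 meters) instead of twenty-six.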
Be sure to call stopUpdatingLocation as soon
as you reach your desired level of accuracy.
Also, note that Core Location will
manage the GPS power for you.
What this means is that for example in our coffee shop
finder application, if your user is looking at the map
and they decided to go in the preferences part of your
application, call stopUpdatingLocation immediately.
And then one or a few seconds later, they go back to
the map, go ahead and call startUpdatingLocation
and Core Location will pick up right where it left off.
So for the GPS chip, let that chip idle.
Note, the same is true for Core
Motion, which is a new iOS 4 API.
After you call the start update functions, be
sure to call the matching stop update functions.
Also, if your application goes in the background, be sure
to turn off the sensors when that happens if you like.
Note that in-- if your application would like to be
notified of a significant location change or if you want
to use region monitoring, instead of just
having Core Location running all the time,
use the new iOS 4 API which lets you do this.
And I have this-- that up here.
And that's Core Location. Finally, the CPU and GPU.
You might be wondering why we're talking about
performance and power in the same presentation.
Well, it turns out that if you optimize for performance,
you get better battery life thrown in for free.
This is because fast code uses less
CPU time which uses less power.
So for the CPU, let that chip idle.
So as you know, iOS 4 is an
event-based operating system.
Now as I mentioned earlier in the networking portion
of this talk, there are certain conditions when--
there are certain situations where you
might want to poll to check to see
when an event has occurred or something has changed.
Try to avoid this and instead subscribe to an event.
But we don't have events for everything so
in some situations you may have to poll.
Try to reduce the frequency with which you poll.
For example if you poll every 30th of a second, try dropping
that down to every tenth of a second or even every second
to see if there's any user-visible impact.
For example, we see a lot of sample code on
the Internet which recommends that you figure
out whether the device is being shaken by continuously
polling the accelerometer.
Don't do this.
Instead, use the Shake API to figure
out when your device is being shaken.
Next, be bursty.
Try to consolidate your CPU usage into short bursts.
This will allow the CPU to enter
that idle state I've been talking about.
Note that this may require you to restructure your
code or possibly you can use a different algorithm.
How do you know when your code is being nice and bursty?
Well, use the Time Profiler tool, which is part
of Instruments, to check your CPU activity level.
For example, we found that during audio playback,
we were able to get much better battery life by
decompressing large chunks of audio at once rather
than decompressing small pieces continually.
Next, procrastination, because
who doesn't like to procrastinate?
Because if you put it off long enough,
you just might not have to do it at all.
For example, we came across a game which
saved its state every few seconds.
You can imagine it's not really good for the CPU.
It's not very good for battery life either.
Instead, maybe you could save the state when
the user reaches a milestone or a checkpoint
or maybe even when the user quits the game; that's the time to save it.
When using the GPU, pick a fixed frame rate.
We recommend about 30 frames per second and enforce
this using the CADisplayLink rather than using NSTimer.
This will help you minimize the appearance of
dropped frames and also help you avoid the situation
where your app is continuously drawing as quick as possible.
On the other end of the spectrum, if a
frame has not changed, don't redraw it.
For example, if you have a chess application
and your user is looking at the pieces,
contemplating the next brilliant move, don't be updating
the screen at 30 frames a second if nothing has changed.
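One common way to avoid redrawing unchanged frames is a dirty flag. The C sketch below is an illustration under that assumption, not code from the talk; Scene, scene_mark_changed, and scene_tick are hypothetical names, and the drawing itself is replaced by a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool dirty;   /* true when something visible changed since the last draw */
    int  draws;   /* number of frames actually rendered */
} Scene;

/* Call whenever the model changes, for example when a chess piece moves. */
void scene_mark_changed(Scene *s) {
    assert(s != NULL);
    s->dirty = true;
}

/* Call once per frame tick (for example 30 times per second). Renders only
   when something changed; returns whether a draw happened. */
bool scene_tick(Scene *s) {
    assert(s != NULL);
    if (!s->dirty) {
        return false;   /* nothing changed: skip the redraw, let the GPU idle */
    }
    s->draws++;         /* stand-in for the real rendering work */
    s->dirty = false;
    return true;
}
```

Over one second of ticks at 30 frames per second, a scene that is drawn initially and changes once is rendered twice rather than thirty times.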
Finally, be sure to check out the Energy
Diagnostics Tool which is part of Instruments.
And I was in the session this morning, session
309 which delved into this extensively.
So be sure to check out the video once it's available.
So to summarize, for the radios we learned
that data transmission is very expensive.
So we coalesce and compress data as much as possible.
With Core Location, we use the least
amount of accuracy we can get away with
and we call stopUpdatingLocation as soon as we can.
And on the CPU and GPU, we optimize our
performance and get better battery life for free,
we're bursty, and we procrastinate as much as possible.
And on the GPU, we use a fixed frame rate, 30 frames
per second and we don't unnecessarily redraw the screen.
So to summarize, let those chips idle.
Thank you.
Dave.
[ Applause ]
>> David Chan: Thanks Peter.
So in summary, use your knowledge about the
system to come up with creative solutions.
Always measure the baseline and the changes you make to
make sure that the changes you're making are improvements.
About animations, fewer pixels to
render means smooth animations.
Make sure to prepare your cells and
render very quickly for smooth scrolling.
Don't block the main thread and let those chips idle.
Thank you for coming to our talk.
Here are some related sessions for more information.
Be sure to come tomorrow to part 2 of this talk.
We'll be covering memory, databases, how
to use the data APIs on our system and I/O.
And right after this, there's an Optimizing
Core Data Performance on iPhone OS session
that I highly recommend you go through if you use Core Data
or you're planning on using Core Data
in any of your applications.
Here are some other sessions that have already happened
that are great reference material for
some topics that we covered today.
Thank you very much for coming.