WWDC2010 Session 425

Transcript

>> John Harper: Welcome to Session 425,
Core Animation in Practice, Part 2.
And my name is John Harper.
I'm part of the Core Engineering team and mostly working
on Core Animation and everything relating to that.
And this talk is kind of a continuation of the first talk.
The idea is that the first talk was kind of setting
the scene, going over all the broad details.
And this one, we want to just dive into
certain areas, not taking a broad slice,
but just looking at things we thought you might find
interesting, to try and give you a better understanding
of what's happening and some new things you can be using.
So, things we're going to go over, we're basically
going to split the talk into three sections.
The first section will be kind of a mixed
grab bag of different APIs, some new,
some things we think you might find useful.
And then we're going to spend quite a bit of time on
performance relating to Core Animation and basically try
to give you an idea of how the GPU,
Core Animation, all these things fit together
and affect the performance you see in your applications.
Finally, we're just going to talk very quickly about
the new kind of High-DPI screen on the new iPhone
and what that means to your graphics
rendering from a Core Animation perspective.
So, let's get right into it.
So, the first API I want to talk about, the
first kind of section, I guess, is drop shadows.
And, obviously, shadows are a very important part of your
applications, in that they can often give a lot of depth
to your visual display, and they
can really make something stand out.
They just make things look a lot more natural, often.
So, in the past, we've had a full set of shadow APIs on the
Mac platform, and you can set things like the shadow radius,
and the shadow color, the path, all this kind of stuff, but
we've never actually supported them on the iPhone platforms,
because, really, the performance just wasn't there.
And so when we were doing the iPad, it really became
apparent that we really do need some kind of shadow support,
so we brought all those APIs back onto the
iPhone OS, but we added some new features,
just to make the performance a lot more acceptable on
these kind of lower end devices, compared to the Mac.
So, the new API we added is this thing called the
shadowPath, and so the idea here is that, typically,
if you say "Turn on shadows on your layer," what
that means is that we have to take the alpha channel
of the composited content, blur it to get
that nice kind of blown out shadow look,
and then apply it underneath the layer, and that
blurring step in particular is a lot of work for the GPU,
and doing it every frame really
doesn't work very well on certain GPUs.
So, the idea of a shadowPath is that this is a
way you can tell us where your layer is opaque.
And, obviously, once we know the opaque region, we
can use that to kind of cache the shadow as a bitmap.
So, we can render it, blur it, and just keep it around.
And then as long as you don't change the
path, the shadow will be there forever,
and we can reuse it from frame to frame very cheaply.
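A rough way to picture that caching, sketched in Python rather than with the real Core Animation API. Everything here is an assumption made for illustration: `box_blur` stands in for whatever blur the renderer actually uses, and `ShadowCache` is a hypothetical name, not a framework class. The point is only that the expensive blur runs once per path instead of once per frame.

```python
# Toy model only: none of these names are Core Animation API.

def box_blur(alpha, radius):
    """Blur a 2D grid of alpha values with a simple box filter."""
    h, w = len(alpha), len(alpha[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += alpha[yy][xx]
                        count += 1
            out[y][x] = total / count
    return out


class ShadowCache:
    """Keeps one blurred shadow per path and counts how often it blurs."""

    def __init__(self):
        self.cached = {}
        self.blur_count = 0

    def shadow_for(self, path_key, alpha, radius):
        key = (path_key, radius)
        if key not in self.cached:
            # Only reached the first time this path is seen.
            self.cached[key] = box_blur(alpha, radius)
            self.blur_count += 1
        return self.cached[key]


# A 4x4 opaque rectangle inside a 6x6 grid stands in for the shadow path.
mask = [[1.0 if 1 <= x <= 4 and 1 <= y <= 4 else 0.0 for x in range(6)]
        for y in range(6)]
cache = ShadowCache()
for frame in range(60):
    shadow = cache.shadow_for("rect-4x4", mask, radius=1)
print(cache.blur_count)  # 1: blurred once, reused for the other 59 frames
```

Without a known path, the equivalent of `box_blur` would have to run on every one of those 60 frames.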
So, I just want to show a very quick demo of this,
to give you the idea of why you should be using it.
So, we have this demo app, and you can
see that we have a number of layers,
just colored rectangles bouncing around the screen.
And so what I'm going to do, first of all,
is just turn on shadows in the old naive way,
where we basically just set the
shadowOpacity of the layer to be non-zero.
And when we do that, you can see that we have nice shadows,
but we have pretty horrible performance,
especially when I bring this up a bit.
It's really nasty.
So, very simple.
All we had to do here is set the
shadowPath to be a rectangle
in this case, because that's the shape of the layer.
And as soon as we turn that on, everything
becomes way faster and much smoother.
[applause] So, probably, that shows you
why you should be caring about this.
And now let's just go through that
example in a little more detail.
So, as I said, we have a number of layers.
Each of them was created in this way.
We created a layer.
We set its rectangle.
We set a background color with some random hue.
And then next, when we enabled shadows, we really
just set the shadowOpacity, we set the shadow radius,
and set the shadow offset to move the shadow vertically
under the layer a little bit to make it look more natural.
And then when we hit the final button to make
everything fast, all that was happening is this,
and so you can see what we're doing here is we're
using UIBezierPath to create a path
for the rectangle, which is the shape of the layer.
And then UIBezierPath has a nice feature where
you can ask it for a CGPath, and, obviously,
the CGPath will be the underlying representation
of that path, and then we can use that CGPath there
to set the shadowPath of the layer, because, obviously,
Core Animation is a lower level API than UIKit,
so we really only deal in Core Graphics objects.
And so, really, that's all you had to do.
There's just one extra line or two extra lines.
You just put them up like I did
here to get those faster shadows.
And, obviously, this isn't restricted
only to rectangular things.
Because it's a path, you can create
round rects, or complex shapes.
Really, anything you can imagine can be put in that
path and used to create the outline for your shadow.
The next topic, moving on from shadows, is shape layers.
And, typically, when you're using Core Animation
layers or UIViews, and you have some content that's not
just a slab of color, then what you're always using
is a bitmap, and you'll probably be drawing into the bitmap
or providing a CGImageRef, and that's fine,
in many cases, but there are problems with that,
if you're trying to, for example, scale the layer.
Then, obviously, since you have a bitmap, the resolution
of the bitmap is fixed, and when you scale it up or down,
you get blurriness, or aliasing, or whatever.
And, also, you can't really animate the contents of that
image, that bitmap, because, you know, again, it's fixed.
So, in certain cases, shape layers
can be a way to avoid those problems.
A shape layer is really just a layer which
draws a path with a fill or a stroke.
And so, because the drawing of the path is deferred
until the composite time, when we know the resolution
that the layer is being drawn at, then we really get a
nice scalable result, where the path will stay sharp,
no matter what scale factor you apply to it.
Similarly, we have some support for animating
paths and interpolating between two path states.
So, if you, for example, have to have a line, and you want
to have it a certain width, and then animate from point A
to point B, then this is a really easy way to
do that without using any images or any bitmaps.
So, we should talk a little bit about the performance
here, because it's not as obvious as just using an image.
Firstly because we're really just storing the contents
as a path, then it really does use very little memory.
I mean, if you have an image, the memory usage is fixed
in terms of width by height; whereas, if you have a path,
then you could have a million by million path
shape, but it could only have four line segments,
and there's probably only like 20 floats or something.
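To put rough numbers on that comparison, here is a small back-of-the-envelope calculation in Python. The byte sizes are assumptions (4 bytes per RGBA pixel, 32-bit floats for coordinates), and the function names are made up for this sketch.

```python
BYTES_PER_PIXEL = 4   # assumption: 8-bit RGBA
BYTES_PER_FLOAT = 4   # assumption: 32-bit floats for path coordinates


def bitmap_bytes(width, height):
    """Memory for a bitmap grows with its area."""
    return width * height * BYTES_PER_PIXEL


def path_bytes(point_count, floats_per_point=2):
    """Memory for a path grows with its point count, not its size on screen."""
    return point_count * floats_per_point * BYTES_PER_FLOAT


print(bitmap_bytes(1024, 768))  # 3145728 bytes, about 3 MB
print(path_bytes(10))           # 80 bytes for ten (x, y) points, i.e. 20 floats
```

Doubling the on-screen size of the shape quadruples the bitmap figure and leaves the path figure unchanged.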
On the other hand, because, like I said, we
defer the rendering of the path a lot longer,
so we can get these advantages, then we'll be rendering
probably every frame, and so that can take more CPU
if you're not careful, if you have a very complex path
or a thousand of these layers or something like that.
Finally, another plus, I guess, is that, with images, you may have
heard us talk in the past about how you must avoid blending
and all that kind of thing. The nice thing about
shapes is that since we know the shape ahead of time,
because we know what the path is, when we draw it to
the screen, we can ignore all those transparent areas,
so you really only pay the cost for the regions with color.
And so, again, if you have, for example, a
diagonal line, then if you had that in an image,
you would have to pay the cost of drawing
all those transparent pixels to the screen,
because the GPU doesn't know what's transparent
and what isn't; whereas, with the shape layer,
if you had that line stored as a path, we know exactly
where the color is, and we can just slam down those pixels
that are colored and ignore everything else.
And so, in summary, we really think
this can be really useful,
but you have to be judicious about
where you actually use it.
And, typically, you only need to really
use it when you want to take advantage
of the features of this scalable/animatable content.
And, typically, it's best in a few semi-large elements.
Again, if you have thousands of these
things, you may run into issues.
So, again, I want to run through a quick demo of this.
So, really, what we have here, again, some number of layers
created, and we can up the number to whatever we want.
And then each of these layers is a shape layer, and you
can see that it's animating between two arrow shapes.
And so this is trying to show the first
of those or the second of those points,
which is that shapes can be animated; whereas, you can see
that if you had an image, you really wouldn't have any way
to do this without redrawing those arrows every frame;
whereas, now we can just have two paths representing arrows,
and let Core Animation interpolate between them as it can.
And just to show you the second nice feature about
the shape layer, which is that it's scalable.
If I apply a scale transform to the common container
of this layer-- oops, trying to get this to work.
Here we go.
Hopefully, you can see that, even though we
scaled in, we didn't lose any resolution.
We're staying perfectly sharp on the arrow points, and
it's still animating at a fairly decent frame rate.
And, obviously, we're only paying the cost
to render the bits you can actually see,
so it really doesn't make a whole lot of difference
that we now have these things massively huge.
So, again, I want to just go through the code quickly.
So, first of all, we're going to
create the two paths for each layer.
In this case, we have some function which just creates an
arrow of random shape, using a number of random numbers.
Just for convenience, we're going to get the
bounding boxes of each path and union them together,
so we have one rectangle which represents
the overall bounding rect of those paths.
And then we're going to create a shape layer,
whose fill color is set to some random color,
whose position is somewhere on the screen, and then
whose bounds is the bounding rect that we computed.
And then to set up the actual path, you can see
we didn't bother setting the layer path property
at all, because we're going to animate that.
So, we create a basic animation, which gives us that
from-to behavior and targeting the path property,
and then we set the from and to values of the
animation to be these two paths we created.
And, obviously, animations often work on numbers, but, in
general, they can be anything that can be interpolated,
and we know that we can interpolate
between two CGPath objects.
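As an illustration of what interpolating between two paths can mean, here is a minimal Python sketch. It assumes both paths have the same number of points and blends them point by point with a straight linear mix; the talk does not say how Core Animation does this internally, so treat the approach and the names (`interpolate_path`, `arrow_a`, `arrow_b`) as hypothetical.

```python
def lerp(a, b, t):
    """Linear blend between a (t = 0) and b (t = 1)."""
    return a + (b - a) * t


def interpolate_path(path_a, path_b, t):
    """Blend two point lists of equal length, point by point."""
    if len(path_a) != len(path_b):
        raise ValueError("paths must have the same number of points")
    return [(lerp(ax, bx, t), lerp(ay, by, t))
            for (ax, ay), (bx, by) in zip(path_a, path_b)]


# Two made-up arrow outlines with matching point counts.
arrow_a = [(0, 2), (6, 2), (6, 0), (10, 3), (6, 6), (6, 4), (0, 4)]
arrow_b = [(0, 1), (4, 1), (4, -2), (12, 3), (4, 8), (4, 5), (0, 5)]

halfway = interpolate_path(arrow_a, arrow_b, 0.5)
print(halfway[3])  # (11.0, 3.0): the arrow tip, midway between the two shapes
```

An animation with from and to values set to the two paths would evaluate something like this once per frame, with `t` driven by the timing function.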
And so once we have that, we just set some
common timing properties, give it a duration,
give it some kind of easing, set it
to pulse back and forth forever.
And then finally we just add that to the layer.
We're not going to specify a key, because
we don't ever want to reference this again,
so we just let it sit on the layer forever.
And so you can see we did that 20 times,
or whatever, and got those pulsing arrows.
One of the things we found, when developing the iPad, in
particular, with its larger screen and more complex content,
is that when you have a lot of things animating
around the screen at once, that puts a lot of strain
on the compositing system, because if something is moving,
even if you're actually only moving one
of those layers, but it has like 100 sublayers, like
some kind of view hierarchy of table view or whatever,
then you don't just pay the cost for moving the one thing.
Obviously, you have to rerender all of those
things, and if it's a complex rendering tree,
then that can be an expensive task,
which can take enough time
that you drop below 60 frames per second,
from the nice smooth behavior we want.
So, we've added a new feature for iPhone OS 3.2,
for the iPad, and later releases, where you can now ask us
to basically take a subtree of a layer and all its
sublayers, and you can ask us to basically cache
that in a bitmap on the render tree side.
So, to do that, you really just
set this shouldRasterize property.
And what this is telling us is
that you're asking us to convert
that layer tree into an image every time we render it.
And so that's kind of a flattening act.
It's taking this tree and converting it to just one bitmap.
And then the beneficiality of that-- if that's a word--
is that we can then reuse that bitmap whenever we can.
So, we create this bitmap, and, ideally, the thing you asked
us to rasterize and cache won't be changing from frame
to frame to frame, so in that case, we rendered it a
previous time into a bitmap, and then we can just use
that bitmap again and again to
stamp it into subsequent frames.
Obviously, if we can do that, if we can get some
reuse, then we can avoid a lot of that extra rendering,
and we can get much better animation performance.
So, I wanted to show kind of a diagrammatic
example of what I'm talking about,
because it may not be immediately obvious.
So, if you think about a very simple
layer tree we have here,
we have a background color, a layer with a background color.
It has two sublayers, an image, and some text, and
then all of that is parented into another layer,
which is setting some kind of 50 percent scaling matrix.
So, if we don't have any of this caching stuff, and we add
that layer tree to our view, then what's going to happen
when it renders is that each of those three layers
that actually provide content are
going to render one after the other.
So, they render into the frame buffer one by one,
first the color, then the image, then the text.
Then you can see that-- well, you
can't see, but I'm going to tell you--
they rendered at 50 percent resolution
here, because there is the scaling.
So, they didn't render at 100 percent and scale.
They just rendered directly into the screen.
Now, if I set this middle layer to
shouldRasterize, then, implicitly,
what you're asking us to do is create a second
buffer here, so we now have the frame buffer,
and we have this caching buffer, which is going
to start to hold everything in that subtree.
So, now you can imagine what's going
to happen when I render this again.
Instead of rendering to the screen, we're going to
render those three things again into the caching buffer,
but this time, hopefully, you can see they're a lot larger,
because they're actually rendering at the native resolution
of the layer, instead of through that transform
matrix, because we're just caching that subtree.
And so, obviously, once we have
that cached, we then can use--
the rendering system will take that and just
copy it to the screen through the actual matrix,
and we end up the same place we were before.
Of course, the nice thing is that we've done
that once, but if we need to render that again,
then we don't have to go back to the layer tree.
We can just take the cache buffer and
just copy it straight to the frame buffer.
We don't have to render the layer tree.
That's, obviously, in this case, only three items, but if
there was 3,000, we could get a huge performance win there,
because we just skipped all of that work.
And, again, another example of why this helps you is,
imagine that you change the scaling matrix from 50 percent
to 25 percent, and probably we
would create an animation to animate
that scale change, and so now we have this thing cached.
Again, we can just go, and every frame of that animation can
be rendering out of the cache, and just take a single kind
of imaging operation to take the
cache version to the screen.
So, hopefully, you can see that this really
can make a big difference in certain cases.
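The diagram can be mimicked with a toy render loop in Python. This is a sketch under simplifying assumptions, not how the render server is implemented: `Layer`, `render`, and the `draws`/`copies` counters are invented for the example, and the cache is assumed never to be invalidated.

```python
class Layer:
    """Hypothetical stand-in for a layer; not the real CALayer."""

    def __init__(self, name, sublayers=(), should_rasterize=False):
        self.name = name
        self.sublayers = list(sublayers)
        self.should_rasterize = should_rasterize
        self.cache_valid = False


def draw_subtree(layer, stats):
    """Draw this layer's content, then its sublayers."""
    stats["draws"] += 1
    for sub in layer.sublayers:
        render(sub, stats)


def render(layer, stats):
    if layer.should_rasterize:
        if not layer.cache_valid:
            draw_subtree(layer, stats)   # fill the cache buffer once
            layer.cache_valid = True
        stats["copies"] += 1             # stamp the cached bitmap into the frame
    else:
        draw_subtree(layer, stats)       # draw straight into the frame


def render_frames(rasterize, frames=10):
    tree = Layer("background", [Layer("image"), Layer("text")],
                 should_rasterize=rasterize)
    stats = {"draws": 0, "copies": 0}
    for _ in range(frames):
        render(tree, stats)
    return stats


uncached = render_frames(rasterize=False)
cached = render_frames(rasterize=True)
print(uncached)  # {'draws': 30, 'copies': 0}
print(cached)    # {'draws': 3, 'copies': 10}
```

With three layers the saving is small; the same loop with thousands of sublayers is where the single copy per frame pays off.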
So, again, I'm going to show a demo of that now.
What we have here is yet another shape layer.
This one is much more complex.
So, this is actually from an SVG, and
it has about 300 path segments, I think.
So, you can see we can render one of these things
at approaching frame rate, but I'll add a few more,
and the performance is getting pretty shoddy.
Add a few more, and we're chugging along.
And, obviously, I can prove that these are
still shape layers, because when I zoom in,
you can see all the detail there, and there's no pixelation.
So, go back to the zoomed-out state, and what I'm going to
do is I'm just going to set that shouldRasterize property
on each of these shape layers, each of the
butterfly layers, and go-- oops, twice.
So, you can see that when I do that, it just--
the butterflies get cached in the bitmap
instead of rendering to the screen every time.
We get this nice, beautifully smooth animation.
Of course, the problem with this is that, although it's
nice and smooth, we've lost one of the good features
of the shape layer, which is, now, when we
zoom in, I don't know if you can see that,
but now everything is pixelated, because we asked it to cache.
And so now what we're doing is, instead of
rasterizing the shape, we're scaling the bitmap.
There are ways to work around this.
There's another property called rasterizationScale,
where you can ask for the cached version to be cached
at a certain scale factor, but for
now we're just going to ignore that.
So, I just want to talk a little bit more
about when you shouldn't use this.
So, it looks great, and it can
be really useful, in some cases,
but you have to be really careful with
this API we've been talking about.
Firstly, a lot of these devices don't have a
huge amount of memory, so, thusly, any caching,
any things you are caching are
taking memory from something else.
Bitmaps can be large, especially on larger screen devices.
Also, obviously, the caches are fixed size,
so, once you ask too many things to be cached,
then some of those won't fit, and
you won't get the benefits.
If you ask us to cache, but then are unable
to get the reuse, then that's actually worse
than if you hadn't asked us to cache at all.
And the reason for that is that the shouldRasterize
property is kind of an API contract.
You're asking us to always convert it to a bitmap, because
that has some side effects, like these pixelation effects.
And so we really need to always use the same
results, no matter what the other circumstances are.
So, for example, if you ask us to cache 1,000 layers, and
then there are only ten of them that can be used from frame
to frame, then the other 990 will be
rendered into a buffer and then rendered
to the screen every frame, and that can be pretty expensive.
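That 1,000-layer case can be turned into a quick cost estimate. The sketch below is in Python and the unit costs are pure assumptions (10 units to render a layer, 3 units to copy a cached bitmap to the screen); only the shape of the result matters, not the numbers.

```python
RENDER_COST = 10  # assumed cost to render one layer's content
COPY_COST = 3     # assumed cost to copy one cached bitmap to the screen


def frame_cost_uncached(layers):
    """Every layer is rendered straight to the screen."""
    return layers * RENDER_COST


def frame_cost_rasterized(layers, reused):
    """Reused layers cost one copy; the rest are rendered and then copied."""
    missed = layers - reused
    return missed * (RENDER_COST + COPY_COST) + reused * COPY_COST


print(frame_cost_uncached(1000))           # 10000
print(frame_cost_rasterized(1000, 10))     # 12900: worse than not caching
print(frame_cost_rasterized(1000, 1000))   # 3000: the win when reuse works
```

Whatever the real costs are, a layer that misses the cache pays for the render plus the extra copy, which is why poor reuse ends up slower than no caching.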
Also, as we saw, rasterization locks the scale down.
And one last point, which is a little
more esoteric, but people have run into it,
which is the rasterization or the
caching happens at a very precise point
in this kind of pipeline of rendering operations.
Really, what we're doing is we're taking the
layer, and all its sublayers, and its contents,
and copying that to an image, and then taking that
kind of thing and stamping it into its parent.
And the act of compositing it into
its parent is another step
in the rendering process, and that's where masking happens.
So, if you have a mask layer applied to a
layer, it's going to put a nice shape around it.
Then that is going to be working on the cache
version, and so the masking operation itself,
which is also fairly expensive, will not get
any benefit from the caching, at that point.
So, obviously, if you want to deal
with that, you can just turn on caching
on the sublayer, and, hopefully, that'll solve it.
Okay, so one new API for iPhone, iOS 4, I should
say, is something to do with keyframe animation.
So, let's talk about that.
So, as you may have seen in the previous Core
Animation talk, keyframe animations are another type
of animation object, which instead of just moving
between two points, they move a value between N points.
So, for example, in this case, we have four points,
and we're moving some point up, down, whatever.
But you can see here the lines are very straight.
There's no curves, so it might be okay
for what you want, but, typically,
you want a more natural kind of animation movement.
And you can do that with the previous set of APIs, but
it's fairly tricky in that you need to either create a set
of timing functions to apply to
each segment, or use a Bezier path.
And in both those cases, you have to be very careful
to preserve continuity through the transition points.
You have to make sure the tangents of
each side line up and all that stuff.
So, we've added a new feature, which is basically
what we call a new calculation mode for the animation,
and a calculation mode is just how we do the interpolation.
So, whereas, before we were using a linear
calculation mode, we've added a new one called Cubic.
And most of you probably know, a Cubic interpolation is not
just looking at two points to get the interpolated point.
It looks at the surrounding points, as well.
Because of that, it preserves the
continuity through the points.
So, when I set that, instead of having this
flat, angular curve, we get this fairly similar,
but now we have the transitions are a lot smoother.
And so this is actually using something called a Catmull-Rom
spline to fit those points, but there is a fair amount
of customizability here, and there's three other properties
on the animation called tension, continuity, and bias,
which just let you kind of yank the tangents a little
more, but without ever giving you the possibility
that you're going to lose that continuity,
at least unless you really want to.
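For readers who have not met a Catmull-Rom spline, here is the standard uniform form in Python. This is the textbook formula, offered as a sketch of the idea rather than as Core Animation's implementation; the tension, continuity, and bias adjustments are not modelled, and the keyframe values are made up.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom: the value between p1 and p2 for t in [0, 1].

    p0 and p3 are the neighbouring keyframes that shape the tangents.
    """
    t2, t3 = t * t, t * t * t
    return 0.5 * (2 * p1
                  + (-p0 + p2) * t
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t2
                  + (-p0 + 3 * p1 - 3 * p2 + p3) * t3)


def catmull_rom_point(a, b, c, d, t):
    """Apply the scalar formula to each coordinate of a point."""
    return tuple(catmull_rom(a[i], b[i], c[i], d[i], t) for i in range(len(a)))


# Four made-up keyframes that zigzag up and down.
a, b, c, d = (0, 0), (1, 2), (2, 0), (3, 2)

print(catmull_rom_point(a, b, c, d, 0.0))   # (1.0, 2.0): passes through b
print(catmull_rom_point(a, b, c, d, 1.0))   # (2.0, 0.0): passes through c
print(catmull_rom_point(a, b, c, d, 0.25))  # (1.25, 1.6875)
```

A straight line from b to c would give y = 1.5 at t = 0.25; the spline gives 1.6875, rounding off the peak at b instead of turning sharply.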
So, it's a very quick thing, and,
hopefully, it's very easy to use.
It should just-- if you need to use it,
hopefully, it'll make things a lot easier.
So, another animation topic, which I guess Michael
touched on earlier, but I wanted to talk about, too,
when you apply a rotation animation,
there are really two ways to do that.
You can either use the transform property-- and,
obviously, in that case, you're interpolating matrices,
which means that to represent angles, the angle is
modulo 2 pi, because that's just what matrices do.
Or you can use this other subproperty called
rotation.z and then interpolate that as a 1d value.
That avoids this kind of modulo
issue, because you can animate your angles, say,
from 0 to 720 degrees, but you have a whole other set of
issues to deal with, which are this Euler angle problem.
And so what you're really asking us to do here
is take the matrix, the transform property,
and extract the three Euler angles for
that matrix, and then interpolate those.
And the problem with that is that it works fairly
well, if you're only animating one of them.
But once you start touching multiple of these Euler
angles, like you want to do a y animation and a z,
then you get into a whole world of pain, I guess,
because these things really don't concatenate nicely.
You can get gimbal lock issues,
where they align to the same plane.
And it can be a nasty issue, so what we've added,
I guess a couple of releases ago, now,
is a new value function property, and the value
function is really just a way to apply a function
to the interpolant to get the value we set on the property.
And so, obviously, you know, the
interpolant is what we're interpolating,
and so if we think about that
for this rotation problem,
then what we're going to do is like
the Euler angle rotate animation.
We're going to interpolate a 1d
value between 0 and 2 pi or 0 and 360.
But we're going to set that to the transform property of
the layer, which, as you know, is a matrix, not a 1d value.
So, we have to apply this makeRotationMatrix
function to turn that 1d value into the matrix.
But, hopefully, you can see that by doing this we've
avoided all these problems with the previous two methods
of doing this, which are we can represent any angles.
We get complete control over how the
interpolation happens through those angles,
and we don't have to worry about any of those Euler angles.
So, if you have two animations, both setting the
transform property and with the additive mode set,
then they will animate correctly, and
you should get exactly what you wanted.
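The difference between interpolating matrices and interpolating an angle that is then turned into a matrix can be seen with a small Python sketch. It uses 2x2 rotation matrices and a plain element-wise blend as stand-ins; both are assumptions for illustration, and `make_rotation_z` is a hypothetical name, not the framework's function.

```python
import math


def make_rotation_z(angle):
    """2x2 rotation about the z axis (the value-function step)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]


def lerp_matrix(a, b, t):
    """Element-wise blend of two 2x2 matrices."""
    return [[a[i][j] + (b[i][j] - a[i][j]) * t for j in range(2)]
            for i in range(2)]


def matrices_close(a, b, eps=1e-9):
    return all(abs(a[i][j] - b[i][j]) < eps
               for i in range(2) for j in range(2))


start, end = 0.0, 2 * math.pi

# Approach 1: interpolate the matrices. A full turn starts and ends
# at the same matrix, so the halfway frame has not moved at all.
matrix_mid = lerp_matrix(make_rotation_z(start), make_rotation_z(end), 0.5)

# Approach 2: interpolate the 1d angle, then build the matrix from it.
angle_mid = start + (end - start) * 0.5
value_function_mid = make_rotation_z(angle_mid)

identity = [[1.0, 0.0], [0.0, 1.0]]
half_turn = [[-1.0, 0.0], [0.0, -1.0]]
print(matrices_close(matrix_mid, identity))           # True: no visible motion
print(matrices_close(value_function_mid, half_turn))  # True: halfway round
```

The second approach is the one the value function enables: the animation carries a plain number, and the matrix is only built at the end.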
So, again, just a really quick
example of what this looks like.
So, we create an animation for the transform.
We set the from and to values to be 0 and 2
pi, and then we just set the value function
to be an instance of the CAValueFunction class.
And right now there's no way to create your own
functions, but we give you a bunch of useful ones.
So, in this case, we want to have the function
which takes a single value and creates a matrix,
which is a rotation about the z axis,
which is normal to the rotation.
So, when we get that, that's going to do what we saw on the
previous slide, and then finally we'll just set the duration,
add the animation to the layer, and off we go.
So, one final animation point, which is, typically, you
often want to find out when the animations have completed.
You want to modify your layer tree at that
point, add new content, or remove them.
You want to chain animations, in some cases.
And then so previous to iOS 4 and Mac OS Snow Leopard, the
only way to do that was to create the animations explicitly,
and then set the delegate property, and that works fine,
but often explicitly creating animations is more work
than you'd have to do otherwise, because we
have all this implicit animation feature.
So, we now have this other way of getting completion
callbacks, which uses the Objective-C block syntax.
And so setting this transaction completion block property
will tell us, the runtime, that this is a block of code.
And any animations I create from this point, I want you
to remember the block of code they're associated with.
And then when all of those animations have completed, that's
when you fire off the block to run on the main thread,
and it gets to do whatever completion work it needs to do.
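The bookkeeping described here can be modelled in a few lines of Python. This is a toy with invented names (`Transaction`, `add_animation`, `commit`), not the CATransaction API; it only shows the rule that the completion code runs once every animation created inside the transaction has finished.

```python
class Transaction:
    """Toy model: runs `completion` after all of its animations finish."""

    def __init__(self, completion):
        self.completion = completion
        self.pending = 0
        self.committed = False

    def add_animation(self):
        """Register an animation; call the returned function when it ends."""
        self.pending += 1
        return self._animation_finished

    def commit(self):
        self.committed = True
        self._maybe_fire()   # fires immediately if no animations were added

    def _animation_finished(self):
        self.pending -= 1
        self._maybe_fire()

    def _maybe_fire(self):
        if self.committed and self.pending == 0 and self.completion:
            done, self.completion = self.completion, None  # fire only once
            done()


events = []
txn = Transaction(lambda: events.append("remove layer"))
fade_done = txn.add_animation()   # e.g. an opacity change
move_done = txn.add_animation()   # e.g. a position change
txn.commit()

fade_done()
events.append("fade finished")
move_done()
print(events)  # ['fade finished', 'remove layer']
```

The cleanup runs only after the second animation ends, even though the first one finished earlier.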
So, a lot less typing than creating explicit animation
subclasses, delegates, and all that kind of thing.
So, again, in this example, we created a block, and then we
just set these two properties, opacity and position, and,
obviously, what we're trying to do here
is we're ramping down the opacity to 0
and moving this layer somewhere far over to the right.
So, you can probably guess we're
trying to move this thing offscreen.
So, then, when the block runs that we set up
earlier, it's just going to remove the layer
and then, presumably, do some other cleanup work.
And that's a nice way to animate some types of things,
where you do need to do this work afterwards,
but implicit animations are totally
enough for what you need to do.
Okay. So, that's really all the API mixture.
So, just in summary, I think the most important
point to this section, if you're going to do shadows
on the embedded iPhone platform devices,
then you really must use the shadowPath.
Just not setting that is really
not acceptable for performance.
I would say putting 100 times that over 100
would never be good enough for what you want.
So, you really do need to set the shadowPath.
Secondly, CAShapeLayer, although not useful for everything,
in some cases, can really save you, because it
gets around all these limitations of bitmaps, and
the performance is really good enough to have a few
of these things running around, as we saw,
and getting you this nice rendering quality.
And then, finally, think about if you're
coding up some kind of animated UI,
and the performance really isn't good enough, then using the
shouldRasterize property to try and get some kind of caching
out of it is often a really good way to improve performance.
But I must stress, like I said, it's
really a last resort kind of feature
in that you don't want to do it unless you really have to.
Okay. So, the next thing I want
to talk about is performance.
Specifically, I want to build up a
picture of how to think about performance
of graphics rendering on the iPhone and the Mac.
I'll mostly focus on the iPhone, although all of
this stuff is really applicable to both platforms.
So, the first question you really
want to ask is we're going to build
up from the bottom, up from the hardware through to the API.
And so the question is what do GPUs do?
A GPU obviously being a graphics
card or a graphics processor.
And so you may have seen this kind of diagram before.
This is like one way we program the GPU, and
this is not really what I'm talking about.
We really don't care about this.
This is for OpenGL programmers.
And we really program at a much
different level than this.
We don't deal with lots of vertices.
So, we're really just thinking about triangles.
So, let's get rid of that, and let's think about GPUs.
In our eyes, the GPU is really just a
device to convert triangles to pixels.
Obviously, the pixels live in a frame
buffer, a piece of memory somewhere.
And so we have multiple types of triangles.
Firstly, we can have a triangle with a color.
You can see that.
We can have triangles with an image, and
we can have triangles that aren't opaque,
so they need to be composited with what's beneath them.
The interesting point there, from a performance standpoint,
at least, is that the first two were both opaque,
so they really don't care what's beneath them, and they
can just write that color directly into the frame buffer;
whereas, the third one, the non-opaque one really
needs to do some math to compute the final pixel.
So, it's going to look at what's underneath it, do some
kind of plus, multiply thing, and then write that back in.
So, already we can see that blended
triangles have more performance cost,
more GPU cycles required for them than opaque triangles.
So, that's very useful if you're
thinking about your UI, right?
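The "plus, multiply thing" is, in its most common form, the source-over blend. The Python sketch below assumes non-premultiplied colors with channels in the range 0 to 1; the function names are made up, and real GPUs do this in hardware per pixel.

```python
def write_opaque(dst, src):
    """Opaque case: the destination is never read, just overwritten."""
    return src


def blend_source_over(dst, src, alpha):
    """Translucent case: read the destination, then multiply and add."""
    return tuple(s * alpha + d * (1 - alpha) for s, d in zip(src, dst))


frame_pixel = (0.0, 0.0, 1.0)   # blue already in the frame buffer
layer_color = (1.0, 0.0, 0.0)   # red layer drawn on top

print(write_opaque(frame_pixel, layer_color))            # (1.0, 0.0, 0.0)
print(blend_source_over(frame_pixel, layer_color, 0.5))  # (0.5, 0.0, 0.5)
```

The blended path needs a read, two multiplies, and an add for every channel of every pixel it touches, which is the extra cost being described.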
And then, finally, one other thing to think about is
that we're not just talking about one memory buffer.
We can draw triangles into a piece of memory and then use
that as the source image to apply to another triangle.
And so you can see here, I took the content I previously
rendered and mapped it across some other set of vertices.
That's really all we can talk about for the GPU.
So, the question then becomes how
do we take your view hierarchy,
and how do we map that onto that set of primitives?
The answer is really very simple, which is we
just take your layers and map them into triangles.
So, this is an image I reused, so kind of
like triangles, but the idea is that each
of these rectangles is really just two triangles.
Obviously, you can split from one vertex,
split between opposite rectangle points
and get two triangles out of each of these rectangles.
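Splitting a rectangle along one diagonal can be written out in a few lines of Python. The vertex order and the choice of diagonal are arbitrary assumptions in this sketch; the area check just confirms the two triangles cover the rectangle exactly.

```python
def rect_to_triangles(x, y, width, height):
    """Split a rectangle into two triangles along one diagonal."""
    top_left = (x, y)
    top_right = (x + width, y)
    bottom_left = (x, y + height)
    bottom_right = (x + width, y + height)
    return [(top_left, top_right, bottom_left),
            (top_right, bottom_right, bottom_left)]


def triangle_area(a, b, c):
    """Half the absolute cross product of two edge vectors."""
    return abs((b[0] - a[0]) * (c[1] - a[1])
               - (c[0] - a[0]) * (b[1] - a[1])) / 2


triangles = rect_to_triangles(0, 0, 4, 3)
print(len(triangles))                              # 2
print(sum(triangle_area(*t) for t in triangles))   # 12.0, same as 4 * 3
```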
So, specifically, your layer has a background color.
Your view has a background color.
Then we would draw two color triangles in that color
into the layer of the thing I was drawing it to.
Similarly, if you have an image applied to the contents
of the layer, then we draw two triangles with an image, and,
obviously, you can see, depending on the opacity of those
contents, we may have to turn blending on one of them.
And then, again, more complex compositing effects will use
that other feature we talked about, which is render
something somewhere else, and then use that to
do some extra math, and copy that back to the screen.
And similarly, we can do caching.
We saw just before that when we cache something,
we render it into one buffer and then copy it back.
And also things like masking and filters, if you're on a Mac.
All these things have to do extra work.
They can't just render directly to the screen.
It turns out that's a big deal for the GPU,
because it interrupts its stream of work.
You can think of the GPU like an oil tanker.
It's moving along, and if you need to
stop it and point it somewhere else,
it's a big operation that takes a lot of time.
So, one more thing.
Obviously, we, at the Core Animation
level, we don't even bother sending content
to the screen that's covered by opaque regions, typically.
Again, if you're thinking about performance, you
need to know that, because you really just need
to look at the visible areas of your app.
For example, your application on the iPhone is sitting
on top of Springboard icons, probably, the home screen.
And if we didn't do this, then that would
contribute to the performance of your application.
But since we do, really, you can stop
thinking about what you have to care about,
once you get down to that first opaque layer.
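To make that rule of thumb concrete, here is a minimal Python sketch. It is a deliberate simplification: it only considers opaque layers that cover the whole screen, whereas the talk describes culling by opaque regions. The `Layer` type and the layer names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    name: str            # hypothetical label, for illustration only
    opaque: bool         # True if the layer has no translucent pixels
    covers_screen: bool  # True if the layer fills the whole screen


def layers_to_draw(back_to_front: List[Layer]) -> List[Layer]:
    """Return the layers that still matter once hidden content is skipped.

    Anything behind the frontmost opaque, full-screen layer is never
    visible, so it contributes nothing to the cost of the frame.
    """
    start = 0
    for index, layer in enumerate(back_to_front):
        if layer.opaque and layer.covers_screen:
            start = index
    return back_to_front[start:]


stack = [
    Layer("home screen wallpaper", opaque=True, covers_screen=True),
    Layer("home screen icons", opaque=False, covers_screen=False),
    Layer("app background", opaque=True, covers_screen=True),
    Layer("thumbnail", opaque=False, covers_screen=False),
]
visible = [layer.name for layer in layers_to_draw(stack)]
```

In this sketch only the app background and the thumbnail remain, which matches the point that the home screen underneath does not count against the app.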
Now we get to the interesting part, I guess, which
is, given that we know, roughly, what we're doing,
at least in very broad strokes,
then what are the costs involved here?
What are the expensive things we have to care about?
So, we can break this down into three points.
Basically, how many destination pixels are we going
to touch in the screen or the temporary frame buffer?
How many pixels do we have to read to generate that content?
So, obviously, we have triangles with images.
We need to read so many source pixels to generate
each destination pixel, so we read so many pixels.
And then, finally, how many times do we switch buffers?
And so we basically give these a name.
We have write bandwidth, read bandwidth, which,
obviously, measures the memory bandwidth.
We really just have to think of how many pixels, really.
And then the big one is how many
times do we have to switch buffers?
So, just a few quick examples of when you run into these.
Firstly, if you have too much non-opaque
content, then you probably will--
your application will be limited by the amount of writing
the GPU has to do, the number of destination pixels it has
to touch, because, obviously, translucent things have to be
drawn; whereas, the opaque things, they wouldn't have to be.
Secondly, if you have too many large images, then you're
probably going to be limited by the amount
of data the GPU is having to read every frame.
And then, again, if you have too many masking operations,
then the GPU will be switching between render targets,
and the performance will get lost that way.
Typically, what will happen is, at any one point, your
app will be bottlenecked behind one of these three points,
and so you'll do some work, fix that, and then
you'll find it's faster, but not quite fast enough,
and then you have to switch and
look at one of these other points.
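The three costs can be tallied with a small model. This is only a back-of-the-envelope sketch in Python: the `DrawOp` type, the buffer names, and the example numbers are all assumptions, and a real GPU has many more factors than these three counters.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DrawOp:
    target: str         # hypothetical buffer name: "screen" or "offscreen"
    dest_pixels: int    # destination pixels touched (write bandwidth)
    source_pixels: int  # source image pixels read (read bandwidth)


def frame_costs(ops: List[DrawOp]) -> Tuple[int, int, int]:
    """Return (pixels written, pixels read, buffer switches) for one frame."""
    written = sum(op.dest_pixels for op in ops)
    read = sum(op.source_pixels for op in ops)
    # A switch happens whenever consecutive operations use different targets.
    switches = sum(1 for prev, cur in zip(ops, ops[1:]) if prev.target != cur.target)
    return written, read, switches


ops = [
    DrawOp("screen", 320 * 480, 0),             # solid background color
    DrawOp("screen", 100 * 100, 1024 * 768),    # large image squeezed into a thumbnail
    DrawOp("offscreen", 200 * 200, 200 * 200),  # masked content rendered to the side
    DrawOp("screen", 200 * 200, 200 * 200),     # offscreen result copied back
]
written, read, switches = frame_costs(ops)
```

Here the oversized image dominates the read total and the masked content is responsible for both buffer switches, so the model points at the same culprits the talk names.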
So, at this point, I want to switch and try
and put some examples behind all this talk.
And so what I have here is a sample application.
This is available on the WWDC website.
Hopefully, you'll be able to find it.
It's called the Core Animation Image Browser.
So, I'm just going to run it once,
so you can see what it does.
I'm going to build it, compile it, and switch over to this, and
then it comes up, and we have this kind of image browser app.
And I can flick, and it scrolls slowly.
The structure of this is pretty simple.
We have a scroll view.
We have a view controller, and we have a subclass of the
scroll view, which kind of lays out these item layers,
and each one of these is a view, but
it has a custom layer backing it.
So, really, we're going to spend
a lot of time looking at how
that item layer is implemented and
what we can do to make it faster.
So, this is the app.
I'm just going to give you a very quick run through.
So, we have some code.
We have an app delegate, a view controller.
The view controller is really just taking
a bunch of URLs from the app bundle
and then passing them on to the scroll view.
The scroll view has a little bit of code to do this layout,
and so the layout method is really just, like I said,
creating an item view for every image
that was given, image URL that was given,
and then initializing the image
view, the item view, with the URL.
And then it's going to add that to itself as a subview.
There's some other stuff down here we'll talk about later.
So, the item view is really simple.
All that it does is it has an init method, but, more
importantly, it implements layerClass to direct UIKit
to use another class, our own layer
subclass, as the backing of this view.
We don't want to use CALayer.
We want to use our own one with all its custom code.
We return our class from this method,
and that's what happens.
And then when we initialize ourselves,
we just basically pass on the image URL
into the layer that was created for us by UIKit.
So, like I said, most of the code, in fact pretty
much all of it is in this image or item layer.
And you can see it has a bunch of methods.
And what do I want to talk about?
Right. So, I guess the final point here is the-- obviously,
I don't have time to write code here, so I kind of cheated
and added a header file with a bunch of
different options we can turn on or off.
So, the first thing I want to do is I want to recompile
with this, using this thread option, enable it, because,
as you saw, maybe this thing took
a really long time to start up,
and we don't want to be waiting
for every time we test something.
So, by setting this use image thread, all that's going
to do is we're going to create a background thread,
and we're going to arrange for our images to be loaded
on the background thread and then set into the layer
as they arrive, rather than just
doing it all at once ahead of time.
And that's really not a Core Animation performance
thing, but it makes this a lot more usable.
So, let's run it again.
Okay, so now you can see that the images are
loading, as we go, and that's a lot better.
But performance now is what we want to look
at, and performance here is really bad.
So, the first thing we want to change here is
we want to look at how the shadows are drawn.
You saw each of those items had a shadow, and I'm afraid
I did the thing I told you, you really shouldn't do,
which is I basically just set up the shadow properties
in my init method, passed the radius offset,
and then let it auto generate the shadows.
That actually works pretty well in
the Simulator, but in the Simulator,
we have a very fast CPU to do all that rendering for us.
So, I'm going to go back to my options, and I'm going
to say, okay, let's use the shadowPath this time.
And so, hopefully, when I rerun this, we'll see--
okay, so, see, we still have the same shadows, but,
well, it's a little faster, not massively so.
Okay, well, anyway, so we know we
still have work to do here, right?
So, hopefully, you can see it's better.
Now, the other thing I want to look at is the images.
This is going to be really hard to make out, but the
other bad thing we did is, when we loaded the images,
we just took the CGImage that UIKit
loaded for us, and we assigned it directly
to the layer's contents, because we can do that, and it works.
And so here, what we have is nine images onscreen.
These images are actually 1024 X 768, which is screen sized.
So, you can imagine, when I composite this, I'm actually
asking the GPU to read nine times the screen size
in image data, which is quite a lot of memory.
I guess this screen is roughly a megapixel.
So, that's nine million pixels.
A really easy way to fix that, which is--
well, for me, I'm just going to turn on this,
but [laughter] I'm going to tell
you what I actually did now.
[laughter] So, right.
So, instead of using the contents property-- you can
see before what I was doing is getting the image up here,
when it finished loading, and this is my didChangeValueForKey,
so when I set the image property, I want to push that to the layer,
so, in this case, before I was setting the contents
of the layer to be the image that had been loaded,
at this point, on a background thread.
Then calling setNeedsLayout, just so I can update
the bounds and the shadow shape, basically.
But, so, what I'm going to do is I'm not going
to set the layer contents to be the image,
because that's how we get this nasty behavior,
where we have all this image data being
downsampled on the fly every frame.
I'm going to call setNeedsDisplay, and at the same
time, I'm going to implement the drawInContext: method.
And so my drawInContext: is going to do a bit of work.
First of all, I'm going to take advantage of
the fact that now that I'm actually drawing,
I can get rid of the composited shadow entirely
by just asking Core Graphics to draw the shadow for me.
But, mainly, I'm going to fetch the image, and I'm going to
draw the image directly into the layer's backing store.
But, obviously, I'm going to draw it at
the size I want it, not the original size,
so firstly I'm going to have Core Graphics do
that down sampling, which is going to get a much,
much better result than having the GPU do it, because
this is kind of bread and butter for Core Graphics.
And, secondly, obviously, when we come to draw, we
have prescaled content, which is now the right size,
so really, at that point, instead of
compositing 9 times the screen size of image data,
we're going to have roughly screen size.
So I think I flipped that, so let's recompile again.
Okay. So, firstly, if you're looking at this on
the actual device, you'd see it looks better,
but immediately you see the performance
is way, way better now.
And just by having the right amount of image data for the
right size screen, we can give the GPU the amount of work
that it really likes to be doing, instead of way, way more.
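The arithmetic behind this part of the demo can be checked with a few lines of Python. The source image size and the count of nine come from the talk; the thumbnail size of 320 X 240 is an assumption chosen only so that nine thumbnails add up to roughly one screen. Note that the spoken "nine million pixels" is a round figure; nine 1024 X 768 images come to about seven million.

```python
SOURCE_W, SOURCE_H = 1024, 768  # full-size images, as stated in the talk
IMAGES_ON_SCREEN = 9            # as stated in the talk
THUMB_W, THUMB_H = 320, 240     # hypothetical on-screen thumbnail size

# Pixels the GPU reads per frame when full-size images are layer contents.
read_full_size = IMAGES_ON_SCREEN * SOURCE_W * SOURCE_H

# Pixels the GPU reads per frame once each image is predrawn at thumbnail size.
read_prescaled = IMAGES_ON_SCREEN * THUMB_W * THUMB_H

savings_factor = read_full_size / read_prescaled
```

Under these assumptions the prescaled version reads roughly a tenth of the image data every frame, which is consistent with the jump in scrolling performance shown in the demo.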
Okay. There's one more thing I think I should show you here.
I need to restart this.
So, I'm going to run-- I guess I should switch back.
I'm going to run this, this time, using the same version of
the application, but I'm going to run it using Instruments,
specifically, the Core Animation instrument.
So, this is messed up.
Maybe I have to kill this.
There we go.
Right, so, you can see we have-- okay,
I have to-- I didn't rehearse this.
So, anyway, what I wanted to turn on is this color
blended layer option, and I want to switch back.
You can see that-- you can see I can't scroll for one thing.
But, anyway, the point here is that
we're asking Core Animation to tell us
where are the opaque pixels, and where
are the non-opaque pixels?
So, we can see this example.
Obviously, the background is green, so that's good.
That's opaque.
But all these images are being
asked to composite every frame.
And if you look at them, they're opaque, right?
So, we really don't need to do that.
We can just have them mark themselves as
opaque and get rid of the Alpha channel.
In this case, they had a shadow,
as well, but we can cheat there.
We can just draw white into the background of the
layer, because we know the background is white.
So, I'm going to put that to the background.
And so we have this other option, which is going to set
the-- I think it sets the-- I can't remember what it does,
but it sets the opaque property of the layer.
And you can see where it's saying here,
if our layer is opaque, when we draw it,
we're just going to fill the background with white.
And so I run this again, and, hopefully,
this time, it's going to be a lot greener.
You probably won't see a difference in performance,
because we weren't really stressing that aspect of the GPU
in this app, but this would be useful in
your other cases-- except I didn't hit save.
Okay. Okay, here we go.
Okay, so now it's all green, and that basically
means that it's still scrolling really smoothly,
and probably if I put a few more images in
here, it'll get smoother than it was before.
And that's kind of what you want to look for, just
little tricks where you can minimize the amount
of compositing that's going on
and get that extra bit of performance.
So, one last thing I wanted to show here is-- let's turn
off that color thing, first of all, while I remember.
So, one final thing, which is a little similar,
but I wanted to show you another feature,
another way of doing a common feature, which is when you
have a scroller, and you want it to have a masked, feathered edge
on it, so I set this to 1, and then recompile.
Then we'll see the app has changed a little
bit, which is near the top and bottom
of the scroll layers, we have a feathered edge.
Right. So, you can see it fading, right?
I see this every now and then.
And so you can also see where we
lost a bunch of performance now.
It's back.
It's not as bad as it was, but it's still, obviously,
chunky-ish, and so we want it to be as fast as it was,
but let's, first of all, just look at what we did here.
So, really, I turned on this soft edge layer, and this
is going to be added as the mask of the scroll view,
because we want to use a masking operation to
kind of just gradually clip out the edges of the scroll.
And so what this does is, really, it's just
a layer, and it has a layoutSublayers method,
so that whenever its size changes,
it gets to reconfigure itself.
And in this case, we're basically going to
create two layers-- I'm sorry, three layers.
We're going to create two gradient layers, one
for each edge, so each is going to ramp from 0 to 1.
And then we're just going to create
a solid layer in the middle,
and it'll just read as a nice gradient,
which is going to ramp from 0 to 1.
Sorry, 1 to-- 0 to 1, to 1, to 0.
And you can see, when we set that as the mask
of the layer, we get the effect we wanted,
because our mask kind of dissolves the
content and then applies it to the background.
But as I was saying, the performance here wasn't
good enough, and that's because if I switch--
let's see, if I switch on the color offscreen option,
and then switch back to the app, you
can see the whole screen is yellow.
And what that means is that we're basically
taking an extra offscreen rendering pass,
which is that thing I showed you earlier, where
we draw a bunch of triangles into a buffer,
and then use that as the source
for another drawing operation.
And that's, you know, this whole kind of dependency chain
gets created, and it makes performance pretty nasty.
So, another thing to look at, is you
want to get rid of this kind of thing.
You want to basically eliminate all of this yellow.
And just like before, with the shadows, I
could draw the shadows on a white background.
Again, we know the scroller background here is white,
or at least it's static, and so I don't really need
to be doing masking here, even though you
may think of this as a masking operation.
I can turn this around and really just
composite a white gradient on the top and bottom
of the scroller, and I get the same effect, right?
So, I have another magic option,
which will-- oops-- do that for me.
So, I switch this to 2, then what I'm going
to do now is-- if I find the right view--
you can see I have this piece of code here,
which is setting up this soft edge layer.
So, if edges equals 1, I'm going to use the
mask, which is what we were doing previously.
But, in this case, I'm just going to add this
as a sublayer on top of the other thing,
and then I have to make sure now that the gradient is
inverted, because before we wanted to mask off the edges.
Now we want to cover them up, so we want the
opacity to be, basically, in the other places.
So, whereas before I had
the gradient going one way,
I'm just going to flip it over and drop the layer
in the middle, because we no longer need it.
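The reason the two approaches give the same picture can be shown with the standard "source over" blend. This is a sketch in Python on a single color channel, assuming a plain white background behind the scroller, as in the demo; the function names are made up for illustration.

```python
def masked_over_background(content: float, background: float, mask_alpha: float) -> float:
    """Content faded by a mask, then composited over the background."""
    return mask_alpha * content + (1.0 - mask_alpha) * background


def overlay_over_content(content: float, overlay: float, overlay_alpha: float) -> float:
    """A translucent overlay composited directly on top of the content."""
    return overlay_alpha * overlay + (1.0 - overlay_alpha) * content


WHITE = 1.0
content_value = 0.25  # some arbitrary gray from the scrolling content

results = []
for mask_alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    via_mask = masked_over_background(content_value, WHITE, mask_alpha)
    # The overlay uses the inverted ramp: opaque where the mask was clear.
    via_overlay = overlay_over_content(content_value, WHITE, 1.0 - mask_alpha)
    results.append((via_mask, via_overlay))
```

Both functions reduce to the same expression once the overlay alpha is one minus the mask alpha, so the white gradient on top produces the same pixels as the mask, without the offscreen pass. This only holds because the background is known and static.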
So, let's run that.
Yeah, I'm not going to make that mistake twice, maybe.
So, I still have the color of the offscreen option
enabled, but you can see it's no longer firing,
because we don't have any offscreen rendering.
We just have two white gradients, one on the top, one
on the bottom, to cover up the pixels we want to hide.
And, obviously, you can see the
performance is back where we want it to be.
And just to prove, to maybe make it a little more obvious,
I'm going to turn back on the color blended layers option,
and you can see exactly where these
gradients are now sitting.
And, obviously, they have to be blended,
because they have an opacity ramp in them.
Okay. [applause] Okay, so just
going to want to summarize this,
go over these three things, talk
about what we were just saying.
So, firstly, let's get rid of those blended layers.
You need to minimize the number of alpha-blended
pixels to minimize the amount of write bandwidth.
And there's two basic ways to-- first
of all, there's one way to see that,
which is you turn on this Color Blended Layers option.
That's in Instruments, and Instruments
only works for the devices.
And so if you're running on a Mac, or you're running on the
Simulator, you can't use Instruments to turn these options
on yet, so-- but what you can do
is you can set environment variables.
And so, if you're running on the Mac, you can just
set this environment variable in your Xcode project
and then run your application, and
you get exactly the same behavior.
Or if you're running, say, the iPhone Simulator, then
you can set this environment variable when you run the Simulator.
It's not particularly hard to do.
You just have to make sure to run the Simulator
from the command line before you start Xcode,
with whatever environment variables set that you want, and then when
your app comes up, it'll have all those things preset.
So to get rid of the alpha channels, you need to make
sure that any image refs, which have opaque data,
which include the ones we were looking at there, you
have to make sure they don't have an alpha channel,
because the alpha channel is the way we are told to look at blending.
We don't look at any properties of the layer.
We just look at, does the contents
of the layer have an alpha channel?
And so, if you're drawing into the layer,
obviously, you don't get to touch the CGImageRef,
because there probably isn't one, but you can set
the layer's opaque property, which is going to tell us
that when we create the bitmap here to draw
into, we don't create an alpha channel,
and so it's kind of the same thing,
two ways of doing the same thing.
And, finally, another point which is a little more
interesting is that, say you have an image, and it may have,
say, a translucent border, but an opaque center.
And what you'll probably do, to start with, is
have one image and just put it on the screen.
But, obviously, since it has that non-opaque edge, you
have to have the whole thing have an alpha channel.
And if the image is really large, compared to the border,
then that can be pretty expensive,
in terms of compositing costs.
So what you can do, in those cases, is you can
basically just cut up your artwork into multiple images.
You could do a strip at the top, a strip
at the bottom, a strip down each edge,
and then you could have the center bit be
an opaque image, like a JPEG or something.
And that will save you a lot of
performance, if your image is large.
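A quick calculation shows how much blending that split can save. This is a Python sketch with made-up dimensions: a 1000 X 700 image with a 10-pixel translucent border. The function names are hypothetical.

```python
def blended_pixels_single_image(width: int, height: int) -> int:
    """One image with an alpha channel: every pixel has to be blended."""
    return width * height


def blended_pixels_split(width: int, height: int, border: int) -> int:
    """Border strips keep their alpha channel; the center is drawn opaque."""
    center_w = max(0, width - 2 * border)
    center_h = max(0, height - 2 * border)
    return width * height - center_w * center_h


single = blended_pixels_single_image(1000, 700)
split = blended_pixels_split(1000, 700, border=10)
fraction_still_blended = split / single
```

With these numbers fewer than one pixel in twenty still needs blending after the split. The saving shrinks as the border grows relative to the image, which matches the talk's caveat that this pays off when the image is large.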
So, the read bandwidth is really, really
simple, which is just use images that,
as much as possible, match the screen resolution.
When I say images, I really mean bitmaps of any type.
So, layers that you draw into are exactly the same: when you draw
into them, they create a bitmap of the size of the layer,
so you want to make sure you're drawing
them at the right size to match the screen.
Yes, don't use megapixel images to create thumbnails,
because it doesn't look good, and it doesn't work well.
And so, again, there is an option in
Instruments for this, "Color Misaligned Images."
This one is a little tricky to get to understand
correctly, because we've changed it recently.
So, I'll try and explain what it does.
If you are on iOS 4, then this will draw two colors.
If you have an image which is just shifted a little bit,
maybe its edges aren't quite pixel aligned, it'll draw pink.
If you have an image which is scaled, which is really
what we're talking about here, then it'll draw yellow.
On previous versions, I think it would always
draw pink for both those cases.
So we really added that as one way to help you track down
low res content in a high DPI app, and things like that,
but you can use it to find any general scaling issues.
And then, again, rendering passes, this is really
often the most important thing to get right.
And, typically, unless you're doing
very small offscreen things,
you need to have only one rendering
pass per frame to get good performance.
And so often, you really need to trick your way into that.
You can't just set all this compositing
stuff up in the most obvious way.
You really have to think about what you're doing and what
you really need, and just try and drive the number of passes
down by turning on that thing--
that's not the bullet I was expecting.
So, complex compositing, things like masking, group opacity
in some cases, if you have that enabled,
and filters on the Mac, will all
require this offscreen rendering.
And then I was about to say, use the
Color Offscreen Instruments option
to basically show you wherever
you have this offscreen rendering.
Obviously, this option puts a yellow tint
over your layers that are drawn offscreen,
and so you can see it gets-- if you have multiple
offscreen passes, it'll draw yellow, on top of yellow,
on top of yellow, and so it gets darker and darker.
So, that gives you a nice way just
to gauge exactly how bad it is.
And then one final thing, the feature we talked about
earlier, caching the contents of layers in a bitmap,
that actually involves offscreen rendering itself.
If you get it working correctly-- by which I mean you
actually get some cache reuse from frame to frame,
because the contents of that cache subtree isn't
changing, there's not too much demand on the cache memory,
then that can really hide those extra
rendering passes, because you can push them
into that subtree that's been rendered once, and
then reuse just the image from frame to frame.
Yes, but there is the caveat always, which you
really need to make sure it's working; otherwise,
you could be making things worse for yourself.
Okay, so one more slide on performance.
So, to sum it all up, there's really a very
simple algorithm here to look for performance.
Obviously, this involves all those color whatever options.
But what you're really caring about is why
my frame rate isn't at 60 frames a second.
Get rid of extra rendering passes, get rid of really large
images, and just get rid of extra non-opaque content.
And you just have to keep cycling around and around and,
obviously, eliminating extra Core Graphics drawing, as well,
but at the end of the day, you just have to do the
hard work and get the performance where you want it.
Okay, so that's enough about performance.
The final section of the talk is going
to be just a little bit about high DPI.
And, obviously, we all saw the new iPhone
with the massively high DPI screen.
And so I don't know if you've been to any of the UIKit
lectures, talks about how that's going to be exposed
as programming API, but I'm not going to talk about that.
I just want to give you an idea of how you can
use this stuff on the Core Animation level.
So, what's really happening here is when we have the
high DPI phone is that you give us the layer tree,
or you give us a view tree, which
has a layer tree backing it.
And it gets composited to the screen, so what we have here--
this is a picture of a new iPhone, unfortunately,
but if we had an old iPhone, then what we're going to have
is we're going to have a screen sized layer that's going
to have a bitmap, which is 320 X 480 pixels large.
And then if we were going to display that application on a
high DPI device, then your layer tree is exactly the same.
But what happens is that when the UIWindow is created,
it's going to add a scaling transform
onto the root of your layer tree.
So, we now have this kind of 200 percent, 2X scaling
transform, which just blows everything up 2X.
And I don't know if you can see this.
Maybe you can, actually.
But when we take a bitmap, a 320 X 480 bitmap,
and blow it up twice, then we get pixelation.
And so, obviously, if you had that, you would get
very little, if any, benefit from the high DPI screen.
So, we added some features in Core
Animation to work around this.
Namely, we have a new property on the layer called
contentsScale, and what the contentsScale is,
it's basically a way of telling us either that-- well,
it tells us the scale factor of the contents of the layer,
and the contents is, obviously, the image.
So, in this case, we're drawing text, so when I set
the contentsScale to be 2, which is the relationship
between my layer geometry and the screen geometry, which
is scaled by 2X, so I'm going to say contentsScale 2,
and what that's going to do is it's
going to implicitly change the size
of that bitmap context from 320
X 480 to twice that, 640 X 960.
And then the nice thing, though, is that we'll just
hide this from you by just setting the matrices
on the Core Graphics context so that you still think
you're drawing into the same 320 X 480 buffer,
but Core Graphics will just take care of the extra work
that's required to get the high resolution content.
And so zooming in, this is just
to hammer this home one more time.
We magnify the old content, and then
we set the contentsScale properly.
The buffer changes size, and you get finer
grain, more pixels per inch or whatever.
And it's going to, obviously, look great, because that's
going to match the natural resolution of the screen.
You get the highest possible DPI.
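The effect of the scale factor on the backing store can be worked through numerically. This Python sketch only models the arithmetic described here; `backing_store_size` is a hypothetical helper, not a Core Animation call, and the real implementation may round differently.

```python
from typing import Tuple


def backing_store_size(bounds_w: float, bounds_h: float, contents_scale: float) -> Tuple[int, int]:
    """Bitmap size in pixels for a layer of the given size in layer units."""
    return int(bounds_w * contents_scale), int(bounds_h * contents_scale)


old_bitmap = backing_store_size(320, 480, 1.0)  # older screen: one pixel per unit
new_bitmap = backing_store_size(320, 480, 2.0)  # high DPI screen: two pixels per unit

# Doubling the scale in both directions quadruples the number of pixels.
pixel_ratio = (new_bitmap[0] * new_bitmap[1]) / (old_bitmap[0] * old_bitmap[1])
```

The layer geometry stays at 320 X 480 in both cases; only the bitmap behind it grows, which is why the drawing code does not need to change.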
So, one final point is that even though
UIKit, the way they've chosen to expose this
to apps is to preserve compatibility, to make
sure that your window is still 320 X 480,
in some cases, you may want to think about, for example,
if you have graphics content, and you really want to get
down to the native resolution of the display, you want to
have a 640 X 960 layer, just so you can position things
in inches, exactly correctly for some reason, then
there's no reason you can't just undo that matrix.
UIWindow has this 2X matrix, but anywhere in
your layer tree, you can apply an inverse to that,
which is a scale of one half, a 50 percent matrix, and that
will set things up correctly, and then your layer will be
in the native coordinates, again, in the
native scaling space, and it will match right.
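The way the two transforms cancel can be checked with plain multiplication. This is a Python sketch that treats each transform as a uniform scale factor; the names are hypothetical and it ignores translation and anchor points.

```python
def compose_scales(*factors: float) -> float:
    """Multiply uniform scale factors applied down a chain of layers."""
    total = 1.0
    for factor in factors:
        total *= factor
    return total


WINDOW_SCALE = 2.0                  # the 2X transform described for the window
INVERSE_SCALE = 1.0 / WINDOW_SCALE  # the 50 percent transform applied further down

effective_scale = compose_scales(WINDOW_SCALE, INVERSE_SCALE)

# A 640 X 960 layer under both transforms maps one unit to one device pixel.
layer_units = (640, 960)
layer_pixels = (layer_units[0] * effective_scale, layer_units[1] * effective_scale)
```

With an effective scale of 1, coordinates inside that layer line up with device pixels, which is the "native scaling space" the talk refers to.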
Okay, so I'm really almost running out of time here.
So, basically, you have a 2X scale factor.
The geometry is the same.
contentsScale should be used for content.
And like I said earlier, rasterizationScale
for rasterizing,
and you can undo the scale matrix when you need to.
Okay, so one more slide.
So, if you take anything out of this,
hopefully, it'll be maybe these three things.
One, whenever you need to, use shadowPath, or rather,
whenever you're using shadows, use shadowPath.
Two, whenever you don't have quite the right performance,
but it seems like this could help, use shouldRasterize.
And three, really think about what your layers are meaning
to the graphics card, and try to think of them in terms
of triangles and opacity, and what have you, and
just try to make some kind of mental calculations.
And then now we're really done with five seconds to spare.
[laughter] Okay, thank you very much.
[applause]