WWDC2010 Session 426

Transcript

>> David Hayward: Good afternoon, and
thank you for coming to today's discussion
on "Image Processing and Effects with Core Image".
It's a lot of fun stuff to talk about today.
I'll start off by giving you a brief introduction to Core
Image for those of you who may be new to the technology
and discuss quickly how you can add it to your application.
After that I'll pass the stage over to Daniel, who will
be talking about how to use Core Image most efficiently
in your application to get best performance
and then, finally, Alex will come up on stage
and talk about writing some filters including
one at the end with some really twisted math,
but don't worry it's not on the final exam.
So, first a quick introduction to Core Image.
So, Core Image is used throughout the Mac OS X
operating system for everything from fun special effects,
like our Dashboard ripple effect, to effects in
iChat and in Photo Booth and in screensavers,
and it's also used for serious
image processing in applications
like iPhoto and Aperture, and it can be used in your applications as well.
So, how does Core Image work?
Well, at its core it's built on what we call
filter kernels, which are small portions of code
written in an architecture-independent language.
That language is C-like with vector extensions, and
it's based on a subset of the OpenGL Shading Language.
Core Image can then execute those kernels on either the
CPU or the GPU, depending on how your application sees fit,
and Core Image is built on other key
architectures of Mac OS X, such as OpenGL
and OpenCL, in order to get its best performance.
So here's the basic concept of Core Image.
The idea is you have filters, which
perform per pixel operations on images.
In this very simple example here, we have an original
image and we're going to apply a Sepia Tone filter
to produce a new image that has the effect applied to it.
However, we can start adding additional
filters and create chains of filters.
For example, here we have the Sepia Tone filter and
then we've added to that a hue adjustment filter,
which gives us a little blue tone effect.
So, this allows, as you can see,
far more interesting effects.
One key feature of Core Image, however, is that it can
concatenate multiple filters and the idea behind this is
to reduce the need for intermediate buffers
wherever possible to get best performance.
And it's not just simple chains that are supported.
You can actually have complex graphs of filters
in order to achieve much more interesting effects
such as sharpening effects and gloom effect
like we kind of see here in this example.
So, in addition to the framework for executing filters,
Core Image includes a large set of built-in filters,
which allow you to get started with Core Image right away.
The built-in filters include several
filters for doing geometry adjustments,
such as affine transforms and Lanczos scale transforms.
We have distortion effects like this glass distortion;
a wide variety of blurs, Gaussian blurs, motion blurs;
we have sharpening filters; we have color
adjustments like this one, which is an invert filter;
we have color effects like this is a
Sepia Tone filter we mentioned earlier;
we also have a bunch of stylized fun filters
like this one, which is called Crystallize,
which turns an image into crystal-like cells; we also
have halftone effects and also tiling effects,
which will take either a rectangle or a triangle and create
an infinite image out of a portion of your input image;
we have generators, which are for generating starburst
effect and checkerboards, whatever you can imagine;
we have transition effects, which are useful for using Core
Image on video, and which allow you to segue from one
image to another; we have composite operations
like standard Porter-Duff composite operations;
and we also have a set of reduction operations such as
filters, which will take a large image and reduce it
down to the average color of an image or the
histogram of an image and this could be very useful
as foundation for other image processing algorithms.
So, as I alluded to earlier, Core
Image supports large rendering trees,
and one of our key features is that we will optimize
that tree for you to get the best performance.
The Core Image runtime has a just-in-time
optimizing compiler, and one of its key features is
that it defers its optimization until you
actually draw the image; this allows it
to evaluate only the portion of
your image that is needed to draw.
So, if you're zoomed in on a very large image, Core Image
will only apply your filter to the portion that is visible.
Similarly it also supports tiling of large images so if
you have a very large image, it can break it up into pieces
for you without you having to do
all the additional work of tiling.
Core Image also performs optimizations
that typical compilers don't.
For example, it will concatenate multiple color matrix
operations if they appear in series, and if you have a chain
that involves a premultiply and an unpremultiply
of alpha, it will optimize those away.
It will also reorder scale operations, so if you have
a complex filter that's being applied to a large image
which is then downsized for the screen, Core Image is
smart enough to move the downsample operation earlier
in the processing tree so that the
complex filter is evaluated on less data.
Another optimization is that it only does color management
when it needs to, which is typically on the input image
and then finally when rendering
to the display or to a file.
One thing to keep in mind is that these optimizations
are not just about improving performance.
By optimizing out sequential operations like the ones
I've outlined here, we also get better quality,
because there are fewer operations that
can introduce quantization artifacts.
So, that's a brief introduction
to the architecture of Core Image.
Let me talk for a few slides about how you
can add it very easily to your application.
So, first off, there are a few Cocoa-based Core Image
objects that you want to be familiar with.
First and foremost is the CIFilter object.
This is a mutable object that represents the effect
you want to apply; it has a set of input parameters,
which can be either numerical parameters or images,
and the result of a filter is an output image,
which you can then do further filtering on.
Another key object type is the CIImage object, which is an
immutable object that represents the recipe for an image.
This image can either represent a file just
read from disk, or it can represent the output
of a filter, as I mentioned earlier.
The other key object to keep in mind is the
CIContext object; this is the destination
to which Core Image will render your results, and a
CIContext can be based on either an OpenGL context
or a Core Graphics context, depending on
what you see fit best for your application.
So, how do you add Core Image to your application?
Well, you basically can do it in four easy steps.
First, we want to create a CIImage object.
In this brief example, we're going
to create a CIImage from a URL.
Second, we want to create a filter object.
We create a filter in Core Image by specifying its name,
in this example the CISepiaTone filter.
At this time we can also specify the
parameters for this filter: the input image
and the amount of the effect we want to apply.
Thirdly, we want to create a CIContext into which to draw.
Here we create a context based on a CGContext and lastly
step four we want to draw the filter into that context.
So, we ask for the output image of the filter and then we
draw that into the context and that's all there is to it.
Here are the same steps written in code-like format.
Here we have a slightly more complicated example, in
four lines of code: we're creating an image from a URL;
we're applying the Sepia Tone filter as we did earlier;
then on top of that we're applying a hue-adjustment filter,
which will rotate the hue of the
image by 1.57 radians.
Lastly, we get the output image of that
filter, and we draw it into our context.
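Those four lines might look roughly like this in code (url and cgContext stand in for whatever your application already has; this is a sketch of the steps described, not the slide verbatim):

```objc
// Sketch of the four steps; error handling omitted for brevity.
CIImage *image = [CIImage imageWithContentsOfURL:url];

CIFilter *sepia = [CIFilter filterWithName:@"CISepiaTone"];
[sepia setValue:image forKey:@"inputImage"];
[sepia setValue:[NSNumber numberWithFloat:1.0] forKey:@"inputIntensity"];

CIFilter *hue = [CIFilter filterWithName:@"CIHueAdjust"];
[hue setValue:[sepia valueForKey:@"outputImage"] forKey:@"inputImage"];
[hue setValue:[NSNumber numberWithFloat:1.57] forKey:@"inputAngle"]; // radians

CIContext *context = [CIContext contextWithCGContext:cgContext options:nil];
CIImage *result = [hue valueForKey:@"outputImage"];
[context drawImage:result atPoint:CGPointZero fromRect:[result extent]];
```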
Now, one of the great things about Core Image
is that in a very few number of lines of code,
you can leverage all of the key technologies that
are below Core Image such as OpenGL and OpenCL.
For example, when Core Image converts these four
operations into OpenGL work,
it generates all of the corresponding code in
OpenGL for you, and this includes tiling,
this includes converting the kernels
to shaders, and so forth.
Similarly, when Core Image leverages OpenCL to
do its work, it will convert that same set
of operations into a healthy amount of OpenCL work.
So, that's our brief introduction to Core Image and
how to add it to your application. To talk about how
to get the best performance out of Core Image,
I'm going to pass the stage over to Daniel.
[ Applause ]
>> Daniel Eggert: Thank you, David.
So David showed you how easy it is to use Core Image.
So, I want to show you a small demo that does exactly that.
It's a very simple demo that uses Core Image.
Then I want to take you through five topics
related to Core Image, show you how to do things
efficiently with Core Image and
a few things to be aware of, and then, finally,
take you through some debugging tips at the end.
So the demo is a very simple demo.
It opens an image, and then inside an
NSView subclass's draw method,
it applies one of the built-in Core Image
filters and draws the result to the screen.
So let's take a look at what that looks like.
So, this is the pointillizer demo app.
Let me drag an image onto the app; it simply
opens that image, and here's our custom NSView.
This filter pointillizes the image, and I can change the
radius, which is an input parameter of the filter,
and you can see the dots getting
larger, and I can drag the slider
and they get smaller, and that's all there is to that demo.
So this simple demo application is
available on the attendee website.
I suggest you download this app and check
it out to get started with Core Image.
Also, most of the next few things I'm going to talk
about are illustrated in this simple application.
So, the five things I want to talk about.
I'll start off with something that most of you
have probably heard about which is NSImage.
This is the Cocoa image object that
most of you know very well probably.
Something that people are not always aware of is that an
NSImage can be both a source and a destination and,
hence, is an inherently mutable pixel container.
The content inside an NSImage can
change depending on various situations.
Also, an NSImage can contain both bitmap
data and things like PDF data.
The other image type that is available to you
on the system is called
CGImageRef, which is the Quartz image object.
This in contrast to NSImage is an immutable pixel container.
It contains exactly one bitmap-based
image and this is the type you want to use
for best fidelity when you're using image processing.
This is the type you want to read your
data into and if you're saving onto disk,
you want to use the CGImageRef-based APIs.
Again, the sample code will take
you through some of these steps.
Have a look at that.
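As a sketch of that approach (assuming the ImageIO framework; url is a placeholder for a file URL your application already has), reading a file into an immutable CGImageRef and wrapping it in a CIImage might look like:

```objc
// Read a bitmap from disk as a CGImageRef via ImageIO,
// then wrap it in a CIImage for filtering. Error handling omitted.
CGImageSourceRef source = CGImageSourceCreateWithURL((CFURLRef)url, NULL);
CGImageRef cgImage = CGImageSourceCreateImageAtIndex(source, 0, NULL);
CIImage *ciImage = [CIImage imageWithCGImage:cgImage];
CGImageRelease(cgImage);
CFRelease(source);
```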
Next up, I want to talk a bit about the CPU and the GPU.
You've probably all heard that the GPU is the new kid
on the block and that you want
to use the GPU, but you need to be aware that both
the CPU and the GPU have unique benefits.
For example, the CPU is still what will give you the best
fidelity, whereas the GPU will give you the best performance.
So, it's really a tradeoff there.
Another more subtle detail is that
the CPU is more background friendly.
On the CPU, you have thread scheduling so
if you want something on the background,
you'll probably want to run it in the background on the CPU.
The GPU has the obvious advantage that it
offloads the CPU so if you have a lot of work
on the CPU you might want to use the GPU.
So, it really depends on your application.
You need to think about what is
the right thing to use for you.
Two examples: if you're applying an effect to an image
and want to save it to disk,
you're probably going for best fidelity,
and in that case you want to use the CPU.
If you're interactively updating the
display, you probably want to use the GPU.
So now that you know which one of the
two you want to use, how do you do it?
David showed you how to create a CIContext.
Well, when you create it, you can
create it with an options dictionary.
Inside that options dictionary, you set
kCIContextUseSoftwareRenderer to yes,
and that will make that context be a CPU context.
You pass that options dictionary
in when you create the context;
if you don't specify the option, that context will be a GPU context.
The other thing to note, as David already mentioned, is
that a CPU context will use OpenCL on Snow Leopard.
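In code, assuming an existing CGContextRef (cgContext), creating a CPU versus a GPU context might look like this:

```objc
// A CPU (software) context: pass the option at creation time.
NSDictionary *options =
    [NSDictionary dictionaryWithObject:[NSNumber numberWithBool:YES]
                                forKey:kCIContextUseSoftwareRenderer];
CIContext *cpuContext = [CIContext contextWithCGContext:cgContext
                                                options:options];

// Without the option, the context renders on the GPU.
CIContext *gpuContext = [CIContext contextWithCGContext:cgContext
                                                options:nil];
```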
The next thing is probably the most important thing
to take away from what I will say in this session.
It's about your CIContext.
The CIContext inside Core Image holds on
to a lot of state and a lot of caches.
That is not visible to you, but Core Image does it to
ensure that your application gets the best performance,
and you really need to keep your CIContext around
and reuse it; otherwise all
those caches will be thrown away every time.
So, as I've written here, reusing your CIContext
is usually the single change in your application
that leads to the largest performance win.
So check your application if you're using Core Image and
make sure you're doing this. Well, how would you do that?
If, like in the example app,
you're drawing inside an NSView's drawRect:,
you can simply use NSGraphicsContext
and get the CIContext from there,
and that will automatically do the right thing for you.
It will reuse the same CIContext.
If you are creating your own CIContext, you simply
retain it in the beginning, you reuse it, and then
in the end, before your application
quits, you release the CIContext.
So, remember this.
If you're not doing this, it might
give you a large performance win.
The next thing is color management, and the good
story here is that you get color management for free.
Core Image automatically does color management
for you by respecting the input image's color space
and the context's output color space.
The filters are applied in a linear working space.
So at the beginning Core Image converts from the
image's color space to the linear working space,
and on the output side Core Image converts from that
linear space into the context's output color space.
Sometimes people want to turn off color management.
You can do that.
Again, in the options dictionary, on the input side
you can set kCIImageColorSpace to null.
That will turn off color management on the
input side. Likewise, on the output side,
you set kCIContextOutputColorSpace to null,
and that will turn off color management on the output side.
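A sketch of both sides, assuming an existing cgImage and cgContext (note that the value for these keys is [NSNull null], not nil):

```objc
// Turning off color management on the input side...
CIImage *unmanaged =
    [CIImage imageWithCGImage:cgImage
                      options:[NSDictionary dictionaryWithObject:[NSNull null]
                                                          forKey:kCIImageColorSpace]];

// ...and on the output side.
NSDictionary *ctxOptions =
    [NSDictionary dictionaryWithObject:[NSNull null]
                                forKey:kCIContextOutputColorSpace];
CIContext *context = [CIContext contextWithCGContext:cgContext
                                             options:ctxOptions];
```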
When you're rendering to the display,
the display has a color profile,
and we color manage to match the display's color space.
This color space can change, and there
are two reasons why it could change.
One is if the user drags the window
of your application onto another display,
if the user has a multiple-display setup;
the other situation is more subtle.
If the user, while your application is running, goes
into System Preferences and changes the color profile,
you want your application to update the display
so that your window still looks correct.
You would do it like this.
You call setDisplaysWhenScreenProfileChanges:YES
on your window and then, inside your drawRect: method, again,
you use NSGraphicsContext and get your CIContext from there.
That will make sure that your window is redrawn
correctly when it moves from one display to another.
This does not, however, handle the case where the user
changes the display profile in System Preferences.
Go and check out the sample code.
It shows you exactly how to handle that situation.
Some of your applications might use off screen caches where
you've rendered something to that you are then reusing.
Those off-screen caches need to be
invalidated when the display profile changes.
Again, it's kind of the same scenario,
but slightly different.
You can get notified about the display profile
changing in two ways: in the window's delegate,
you can implement windowDidChangeScreenProfile:
and clear your cache in that method,
or there's a notification you can register for,
called NSWindowDidChangeScreenProfileNotification,
which you can use to clear your caches.
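Sketched in code, the two options might look like this (invalidateOffscreenCaches is a hypothetical method standing in for your own cache-clearing logic):

```objc
// Option 1: implement the delegate method on the window's delegate.
- (void)windowDidChangeScreenProfile:(NSNotification *)note
{
    [self invalidateOffscreenCaches]; // hypothetical cache-clearing method
    [[self window] setViewsNeedDisplay:YES];
}

// Option 2: register for the notification directly.
[window setDisplaysWhenScreenProfileChanges:YES];
[[NSNotificationCenter defaultCenter]
    addObserver:self
       selector:@selector(windowDidChangeScreenProfile:)
           name:NSWindowDidChangeScreenProfileNotification
         object:window];
```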
The fifth thing is threading and, again, the
good story here is that if you're using the CPU,
Core Image will automatically use
all available cores on the system.
You will get multi-threading for free
with Core Image rendering to the CPU.
There is another side of threading, obviously: if
you are using multiple threads in your application,
there are some things you need to be aware of
when calling into Core Image. CIContext objects
are not thread safe.
Everything else inside Core Image is, but if
you're using background threads for Core Image,
you need to create separate CIContext instances for
each thread that you're calling into Core Image from.
You can use locking, obviously, and just use one shared
context, but we recommend using separate CIContext instances,
as that is usually the fastest approach; which of
these two possibilities is right depends on exactly
how your application works.
The other thing is if you go down
the road of calling into Core Image
from a background thread, you need to set the stack size.
Core Image actively uses the stack.
We recommend setting the stack size to 64 megabytes.
If you are using Cocoa, this is how you set
the stack size on newly created threads.
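In Cocoa, that might look like the following sketch (renderInBackground: is a hypothetical worker method; the stack size must be set before the thread starts):

```objc
// Creating a background thread with a 64 MB stack for Core Image work.
NSThread *thread =
    [[NSThread alloc] initWithTarget:self
                            selector:@selector(renderInBackground:)
                              object:nil];
[thread setStackSize:64 * 1024 * 1024]; // set before -start; bytes
[thread start];
```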
That brings us to debugging.
Now that you are using Core Image, you want
to know what is going on inside Core Image,
and we have something that is called render tree printing.
It allows you to see what Core Image is up to:
we will print to the console every time
you tell Core Image to draw an image
into a context.
There's an environment variable called CI_PRINT_TREE.
When you set that to one, each of these draw calls
will cause the filter tree to be dumped to the console
and this gives you a peek into what Core Image is up to.
One thing to note though is as David already
mentioned Core Image does tiling whenever necessary.
If the output size of the image is very large, one
single draw operation might actually cause multiple draws
and in that case, you will see multiple outputs.
So, in Xcode, what you would do is find
your executable in your Xcode project: under Groups
and Files there's a section called Executables.
You find your application there, you Right-click
on it, select "Get Info", a window comes up,
you go into the "Arguments" tab,
and in there you hit the little plus
at the bottom, and you can add an environment variable.
Again, it's called CI_PRINT_TREE.
You can set that to one.
And there's a nice little checkbox on the
side, so you can just leave the variable in there
and toggle the checkbox when
you want to turn it on or off.
If it's turned on, it will impact
performance because you are logging a lot,
so this is an easy way to turn it on and off.
So, now that you've turned it on, what will
actually happen, and what will you see?
You will see something like this and you will see
many of these if you're using Core Image a lot,
and obviously we think you should be using Core Image a lot.
The thing to see here is lines like these;
each line that starts
with two asterisks is a render
operation, a draw operation.
Let's look closer at the first one of these.
It looks something like this;
this is the render operation,
and you need to read it from the bottom going up.
The first line we see, which is
the last line, is your input image.
In this case, it's a 292x438 image that is
being put into Core Image into the filter tree.
Going up, there's a kernel being applied to it,
an affine transform being applied to the result of that,
some more kernels being applied
to the result of the affine transform,
and finally this is being rendered to a context.
In this case, the context name is fe-context-cl-cpu,
which tells you this is a CPU context using OpenCL.
So let's briefly look at the second one
of the two that I had up on the display.
Again, we read from the bottom.
We start out with a FILL operation.
Here's our input image.
In this case, it's a 1200x900 image.
We go further up there's some kernels applied to it.
There's a source over, some more kernels
and finally we end up at the context.
In this case, the context name is different.
It's an fe-context-gl, which tells you in this
case the context is a GPU context using OpenGL.
So what can you use this for? You're seeing all
these things, so what's the method behind the madness?
Well, one easy debugging help that you can gain from this
is you can see how many times a thing is being rendered.
You can find your input images in this output.
One easy thing is just to look at the sizes of the images.
If you know that you're inputting an image
that's 1200x900, you can identify those images
and let's say you're taking an image, applying
some filters to it and saving it to disk,
you would only expect one render
call being done on that image.
If you're seeing 21 render calls,
you should probably go back
and look at your application logic and
see if everything looks sane there.
As I mentioned earlier, if you have large output
sizes, Core Image might actually call draw multiple times
because of tiling, but still, we have seen
shipping apps that had poor performance simply
because they were drawing too many times.
Not only when saving to disk but also updating your display.
This can give you some hints to what is actually happening.
Another thing that you can use this for is to see
are you using the CPU when you expect to use a CPU?
Are you using the GPU when you expect to use the GPU?
You can, again, identify your input images and then
see if the context name matches your expectations.
If you are rendering to the display for interactivity,
you probably want to use a GPU, does that really happen?
The CI_PRINT_TREE stuff is something, again,
you can try with the demo application
that is available on the attendee site.
You can turn it on there and see what
that small sample application does.
That is all I have for you and now I'd
like to ask Alex on stage who is going
to tell you something about writing your own filters.
[ Applause ]
>> Alexandre Naaman: Thank you, Daniel.
My name is Alexandre Naaman.
I'm a software engineer on Core Image, and
today I'm going to talk to you a little bit
about writing your own custom CIFilter.
So, so far today we've seen a little bit of the cast of
characters and how you go ahead and put those together
to create your own app in an efficient manner, and
what I'm going to show you today is, first off, two samples:
one very simple sample, which is going to be a desaturation
filter, and then a more complex one based on an idea
that M.C. Escher had in 1956 and
actually didn't complete, so we're going
to talk about those in a fair amount of detail.
So, first question you're going to ask yourself is,
why would I write a filter instead
of just using what's already there?
There are two main reasons why you would want to write your
own filters: first off, the filter doesn't exist
in the existing set that we ship in the OS, or you
can't create the effect that you're looking
for by daisy-chaining any number of
existing filters together.
How are we going to do this?
Well, there are two main ways.
First off, you can try doing it inside of Quartz
Composer, writing a kernel there and then porting it over
and writing some Objective-C code around your
kernel. For the purpose of our demo today,
we're going to do everything inside of Quartz Composer, and
once you've implemented your algorithm inside
of Quartz Composer, it's really easy to bring it inside an app.
So, our first sample is going to be relatively simple.
We're just going to desaturate an image and we're going
to have the slider that controls how desaturated it gets.
So, this is what the composition looks like
inside of Quartz Composer, and it's fairly simple.
We're going to take an input image and pass it
through the kernel that we're going to write,
and we're going to go through that step by step,
desaturating it by a certain amount until we end
up with a new image, and that is the entire composition.
The amount value comes from a slider
that goes from zero to one and
controls how desaturated the image gets.
So, let's take a look at what the kernel is
going to look like and how Core Image works.
So, here we're looking at a sub-part of the initial
input image, because Core Image works on tiles.
That's not important right now, but it will be in
a moment when we talk about some of the things you need
to keep in mind when you start writing your own filters.
So, first off, Core Image is going to ask us to render,
to provide a new color value at
every single pixel location, (x, y).
So, if we're asked to render a point here, the
first thing we're going to do is read its value.
We get the current coordinate,
determine the color of our input image
at that location, and
unpremultiply it, because colors that come
into the kernels have alpha premultiplied.
The next thing we're going to do is compute
a brightness for it, capital Y, and these weights,
the RGB coefficients, come from the sRGB color
profile; you can find them by looking
at that profile in the ColorSync Utility.
So, we compute a luminance, capital Y, and
we create a new color that we call the
desaturated color, and we assign
its red, green, and blue components to be equal
to Y while preserving the original alpha value.
So, now we have two colors, the unpremultiplied
original and the desaturated color,
and what we're going to return is some mixture of those
two. If this function, mix, looks familiar,
it's because the kernel language that you use to write
your own custom filters inside of CI looks a lot like GLSL.
It's a subset of that.
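As a sketch of the kernel Alex describes, embedded as a string the way CIKernel expects (the 0.2126/0.7152/0.0722 coefficients are the sRGB luminance weights mentioned above; the sample's verbatim source may differ):

```objc
// Desaturation kernel in the CI kernel language (a GLSL subset).
static NSString *const kDesaturateKernel = @""
    "kernel vec4 desaturate(sampler image, float amount)\n"
    "{\n"
    "    vec4 color = unpremultiply(sample(image, samplerCoord(image)));\n"
    "    float y = dot(color.rgb, vec3(0.2126, 0.7152, 0.0722));\n"
    "    vec4 desat = vec4(y, y, y, color.a);\n"
    "    return premultiply(mix(color, desat, amount));\n"
    "}";

// Compile the kernel source; kernelsWithString: returns an array.
CIKernel *kernel =
    [[CIKernel kernelsWithString:kDesaturateKernel] objectAtIndex:0];
```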
Now, we can vary the value of amount, and we'll have
a more or less desaturated image, and if we do that,
we can see how that affects our output. This is,
again, just looking at a subsection of the image,
and this is actually our kernel
running in real time inside of Keynote.
We took the kernel that we had inside of Quartz
Composer, dragged it in there, varied
the amount value, and we end up with this.
So, it's pretty simple to prototype
the effects you're trying to generate.
That being said this is a fairly simple sample, but there
are two other things you really need to keep in mind
when you start writing your own kernels and those are
the Domain of Definition, or DOD, and Region of Interest.
So, in the case of the sample that we were
just looking at, take our input image
and let's suppose that it was of size 1200x1000.
After we've applied the effect, the image
is going to be exactly the same size.
This is what we call the Domain of Definition,
which is to say: given your input image, once the
filter has run, what is the size of the output image?
In this case, it doesn't change, so we don't have
to tell Core Image to do anything special,
but you can imagine that if you did a zoom or a blur,
the image might get either smaller or larger, and you need
to tell Core Image what to do in that situation.
Now, as I mentioned earlier when we perform the
rendering, we actually provide you with tiles.
So, we do a tiled approach that is going to be
optimized for the device that you're targeting.
So, if we were trying to render this small section
here, what we need to know is: what is the data
that we need from our original input image?
This is what we call the Region of Interest, or
ROI, and in this case, we have a one-to-one mapping;
for every pixel in, we have one pixel out, and
we're not reading any additional data, so we don't have
to specify an ROI. But as soon as you write
anything a little bit more complicated,
you may have to take these things into account.
So, let's look at another sample that's
going to be slightly more tricky,
and this is just going to be involving transposing an image.
So, for every pixel, we're just going to
swap the x and y coordinates.
Very simple.
And the kernel looks like this: we just sample the
image, but instead of sampling the image at destCoord.xy,
we sample it at destCoord.yx.
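A minimal sketch of that transpose kernel, assuming the usual kernel-language helpers (destCoord, samplerTransform) rather than the sample's exact source:

```objc
// Transpose kernel in the CI kernel language, embedded as a string.
static NSString *const kTransposeKernel = @""
    "kernel vec4 transpose(sampler image)\n"
    "{\n"
    "    // Read from (y, x) instead of (x, y); samplerTransform maps\n"
    "    // working-space coordinates into the sampler's space.\n"
    "    return sample(image, samplerTransform(image, destCoord().yx));\n"
    "}";
```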
So, how does this affect DOD?
Well, if we start off with an image that's
600x400 and we don't tell Core Image otherwise,
it's going to assume that the image that we're trying to
render is of the exact same size, which is to say 600x400.
That's not what we want.
As you can see, I've colored here in blue the
section where Core Image doesn't know what to do.
It doesn't know where to grab the pixels
from so what we want is to tell Core Image
that the DOD for this image is actually 400x600.
So, we want to swap those values around.
The way we're going to do that is by getting the
extent of the input image and creating a new filter shape
that has the x and y of the origin swapped
and the width and height swapped. Then, when we call apply,
and this all happens in your filter's output image
method, we pass in one additional parameter,
kCIApplyOptionDefinition, and give it the
new shape that we just created. When we do this,
Core Image will know that the result we're trying
to render from this filter is going
to be of size 400x600 and has a new origin.
So, that takes care of the DOD.
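A sketch of what that output-image method might look like, assuming the filter keeps inputImage and transposeKernel as instance variables (names are placeholders):

```objc
// Swap the extent's origin and size so Core Image knows the
// result of this filter is the transposed rectangle.
- (CIImage *)outputImage
{
    CGRect r = [inputImage extent];
    CIFilterShape *dod =
        [CIFilterShape shapeWithRect:CGRectMake(r.origin.y, r.origin.x,
                                                r.size.height, r.size.width)];
    CISampler *src = [CISampler samplerWithImage:inputImage];
    return [self apply:transposeKernel
             arguments:[NSArray arrayWithObject:src]
               options:[NSDictionary dictionaryWithObject:dod
                                                   forKey:kCIApplyOptionDefinition]];
}
```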
How does this affect the ROI?
Well, this is the result that we're
looking for, and as I mentioned earlier,
when Core Image does its rendering it tiles it.
So, if we were, for example, to render a tile in the
upper corner of the image located at this location
and of that size, if we don't tell Core Image otherwise
the result we're going to get is going to look like this
and this is a really common mistake when
people starting writing their own filters even
if they've been writing them for years actually.
So, let's look at what
happens if you don't specify the ROI.
Let's look at our input image and look at
where that rectangle that we tried to render came from.
It's in the middle of nowhere.
There's no data for us to transpose here.
We're basically reading garbage, so we're going to
end up with something unknown, or something based
on our sampling mode, but not the results you want.
So, once again, what we need to do
is tell Core Image that the data
that we're looking for comes from a different place.
So, we're going to swap the x and y of the
origin and swap the size, and we do
that by implementing a regionOf method inside of our
filter and swapping those values, as I mentioned.
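Sketched in code, using the ROI callback shape that CIKernel's setROISelector: expects (transposeKernel is an assumed instance variable):

```objc
// Tell Core Image where the source data for a destination rect
// comes from: for a transpose, swap x/y and width/height.
- (CGRect)regionOf:(int)samplerIndex
          destRect:(CGRect)r
          userInfo:(id)info
{
    return CGRectMake(r.origin.y, r.origin.x, r.size.height, r.size.width);
}

// Registered once on the kernel, e.g. when the filter is set up:
[transposeKernel setROISelector:@selector(regionOf:destRect:userInfo:)];
```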
One thing to keep in mind when you write your own filters
is to always think about how they affect the ROI and DOD.
If you're sampling at locations that aren't equal to the
current destCoord, you probably need to provide both of those,
and if the rendering isn't correct, chances are
you need to tweak them a little bit more.
So, now that we're done with the simple example,
let's talk about a slightly more complicated example.
So, the code for this is available for download right
now at developer.apple.com/mac/library/samplecode/Droste,
and it's based on an idea that M.C. Escher had
in 1956, a lithograph that he tried to produce
and actually never finished,
but that can now be done in real time.
So, we're going to take an image
that looks like this and we're going
to create what he called a cyclical annular expansion.
So, basically a recursive image that spins around, and
it caused him to have some almighty headaches and me too.
[Laughter] So, let's look into this one a little bit more.
So, let's forget about the recursion
for a split second here.
Let's look at how we're going to
deform one level of this image,
and if we animate that that's going to look like this.
The basic idea here is that if we perform this repeatedly,
so we start off with our first level of deformation
and we just keep applying it over and over again to
our input image until the point where the data is too small
to make any visible effect on our final output
image, we're going to get our desired result.
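To make the "until the data is too small" stopping condition concrete, here is a hedged Python sketch. The function, its name, and the one-pixel threshold are assumptions for illustration, not part of the sample code.

```python
def levels_needed(image_width, inner_width, min_size=1.0):
    # Each level of the recursion shrinks the copy by the ratio of the
    # inner (cut-out) rectangle to the full image, so after n levels
    # the copy is image_width * ratio**n pixels wide. Stop once that
    # drops below min_size, where another level can no longer make a
    # visible difference in the output.
    ratio = inner_width / image_width
    n = 0
    size = image_width
    while size >= min_size:
        size *= ratio
        n += 1
    return n
```

For a 1024-pixel image with a 256-pixel cut-out, each level shrinks by 4x, so only a handful of levels are ever needed.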
Now, one thing you might have noticed here is that
our image has actually been sheared and shrunk.
So, in order to not end up with this kind of unfinished
painting, we're going to add in one more layer
on the background, and if we do that,
we'll end up with the desired result.
The grid that we're going to use is going to look a
little bit like this, and I'm going to talk about the math
and I promise I only have two slides
on math a little bit later.
[Laughter] So, let's talk about
how we're first going to do this.
We're going to use Source Over.
So, if we start with our input image
A and we apply the effect level zero,
we end up with image B, our first deformation.
We do this again, level one, so the
scaled-down version, we get a new image, C.
And if we do image C over image B, we start getting
what looks like the result that we're looking for.
I mean if we do this repeatedly N times, we're
going to eventually get our final output image.
So, here's the question that we have to figure out.
This is how Core Image works: it's going to ask you
for the color value at a given pixel, and you have to
figure out from your source image where that comes from.
So, let's pretend we were trying to render the
dot on the corner of that little table, A prime,
and we need to figure out where that
comes from in the original source image.
And you can see that the rectangle that we're going to be
cutting out for this image, where the recursion is going
to happen, is in yellow. And even though this
represents just the first level of deformation,
even after that very first level the image is going to get
scaled down a little bit, and there are going to be areas
that fall outside of the bounds of our original image,
pictured here in green, red, cyan, and blue,
and we're going to have to figure
out what to do with those as well.
So, these are my two math slides.
I'm going to go over them quickly here.
So, what we're going to do is going to
look a lot like a logarithmic spiral.
So here we have the equation for a logarithmic spiral,
r = a * e^(b * theta), and if you think back
to your math classes, a is going to control the number of
strands in the spiral and b controls the periodicity.
So let's take a look at how we're going to deform the image.
With the inner circle here corresponding to
the region that we're going to be cutting out
and the outer circle corresponding
to the bounds of the image.
So, we've got two measurements we've got r1 our
inner radius, and r2, the radius that we're trying
to get to once we perform the deformation.
Now, at theta equal to zero, e raised to the power
of zero is equal to one; therefore, a is equal to r1.
We need to figure out the values for a and b so we
can pass these into our kernel a little bit later.
Then, when theta is equal to 2 pi, so once we've done one
entire revolution, r2 is going to be equal to r1,
the value a that we just figured out, times e raised to
the power of b times 2 pi, and if we isolate the value
of b, we end up with b equal to log of r2
divided by r1, all of that divided by 2 pi.
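That two-slide derivation can be checked numerically with a short Python sketch. The function names are mine; the math is exactly the a = r1 and b = log(r2/r1) / (2 pi) result above.

```python
import math

def spiral_params(r1, r2):
    # From the two boundary conditions in the derivation:
    #   at theta = 0:     r = a * e^0          = a   =>  a = r1
    #   at theta = 2*pi:  r = r1 * e^(2*pi*b)  = r2  =>  b = log(r2/r1) / (2*pi)
    a = r1
    b = math.log(r2 / r1) / (2.0 * math.pi)
    return a, b

def spiral_radius(a, b, theta):
    # The logarithmic spiral itself: r = a * e^(b * theta).
    return a * math.exp(b * theta)
```

With r1 = 1 and r2 = 4, one full revolution (theta = 2 pi) takes the radius from 1 out to 4, as intended.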
So, this works, but it's not a conformal map,
which is to say it doesn't preserve angles.
So, if we had a synthetic image with
no people in it, it would look fine,
but if we had any other stuff it wouldn't look correct.
The whole image looks skewed.
So, although we've preserved angles going
radially out we haven't preserved angles
in between the points and this is not going to look correct.
So, we need to do a little bit more tweaking to
our kernel in order to get the desired result.
So let's look again at our image, and the first thing we're
going to do is convert it to a polar log coordinate system.
So, we've unwrapped it and then we're going to rotate it
and scale it and then finally replicate it along the X-axis
and when we do that, we will get a conformal map
and we'll get something that looks like this,
which is going to give us the desired result.
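The unwrap step can be sketched as a plain log-polar conversion in Python. This is a minimal illustration, not the shipped kernel; once the image is in these coordinates, the rotate, scale, and replicate steps become simple linear operations.

```python
import math

def to_log_polar(x, y):
    # "Unwrap" a point around the origin: the angle goes along one
    # axis, the log of the radius along the other.
    return (math.atan2(y, x), math.log(math.hypot(x, y)))

def from_log_polar(theta, log_r):
    # Wrap back up: exponentiate the radius and rotate by the angle.
    r = math.exp(log_r)
    return (r * math.cos(theta), r * math.sin(theta))
```

A round trip through both functions returns the original point, which is a handy sanity check when writing the kernel version.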
Those are all my math slides.
[Laughter]
So, let's look at the kernel.
Believe it or not this is the entire
kernel for performing a Droste Effect.
So, we're going to pass in a few parameters.
r.x is equal to a, or r1; r.y is equal to
log of r2 divided by r1, divided by 2 pi,
which is b in the equation you looked at earlier; and then a
scaling factor, which initially is going to be equal to 1.0
and then, for each subsequent operation, is going to be
equal to image width divided by inner rectangle width,
and then we're just going to keep taking that
to the second power or third power, et cetera,
to create each additional iteration.
So, our kernel is going to take a few parameters
as inputs: the first one being the input image;
the second one being the location of the center, so
the center of the rectangle that we chose to cut
out from our image; and the third one being our values a
and b, and then the scaling factor that we talked about.
So the first thing we're going to do
is we're going to move our coordinate
that we're currently trying to evaluate to the center.
So everything is relative to the center
of the rectangle that we tried to cut out.
And then we're going to figure out
the angle of that point and convert that point
into polar coordinates, so we take the squared
distance and raise it to the 0.5 power to get the radius.
So, now we've got our polar coordinates, and we're going to
perform the rotation and the scaling, and then we're going
to convert that point once again back into
Cartesian coordinates and perform the exponentiation,
which is going to give us the logarithmic spiral that
we were looking for. The last step we're going
to do is scale that point if necessary
and then move it back to the center
from where it came from, and that is the entire effect.
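Those kernel steps can be restated compactly with complex numbers, which also makes the angle-preserving (conformal) nature of the map easy to see. This is my own hedged paraphrase of the steps just described, not the shipped Droste kernel: it folds in the move-to-center and move-back steps, but omits the per-iteration scaling factor, and like the kernel's polar conversion it is undefined exactly at the center.

```python
import cmath
import math

def droste_map(point, center, r1, r2):
    # Paraphrase of the kernel's steps: shift the coordinate so it is
    # relative to the center, take the complex log (the polar
    # conversion), multiply by (1 + i*b) (the rotate-and-scale in
    # log-polar space), exp back (the exponentiation step), then shift
    # back to the original origin. b = log(r2/r1) / (2*pi), as derived
    # on the math slides.
    b = math.log(r2 / r1) / (2.0 * math.pi)
    z = complex(point[0] - center[0], point[1] - center[1])
    w = cmath.exp((1 + 1j * b) * cmath.log(z))
    return (w.real + center[0], w.imag + center[1])
```

Because the whole map is just z raised to the power (1 + i*b), points on the positive x-axis keep their radius, and radii elsewhere fall off along the logarithmic spiral.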
So this works, but as we saw earlier it
requires a lot of passes over the data
and that means a lot of intermediate buffers.
So, you can imagine if you had multiple levels of
recursion, this is going to end up being quite slow.
So the question we want to ask ourselves
is, can we do this in a single pass?
The key thing to note here is that when b in our equation
is equal to zero, we end up with r is equal to a.
That's it.
So this is going to give us a hall of mirrors effect.
So, if we were to try and cut out
this rectangle here in yellow,
what we really want to do is replicate what's not inside
the yellow inside the yellow and do that repeatedly,
which is where the scaling factor comes from that we
talked about earlier and that's going to look like this.
The question is, can we do this in a single pass?
And I have good news, we can.
So, let's take a look at this in a little bit more detail.
We're going to cut out this section in
yellow, and we want to replicate the rest of the image into it.
So, we're going to move it to the center once again and then
when we're asked to render a point, let's say on the corner
of the arm chair here, what we need to do
is figure out where does that point come
from within the outer section of the image?
Where is the valid data?
And so what we're going to do is we're going to draw a line
from the center out and we're going to scale this point
by that same scaling factor until we reach valid pixel
data and we'll look at the code for this in a second.
And the same way we saw that even on the first iteration
sometimes we'll be asked to render data that falls outside
of the image bounds so sometimes we're going to have
to take points that fall outside and bring them in.
So, we're going to divide the value
of the current coordinate
by the same scaling factor, and
again, we'll do that repeatedly.
In terms of testing for this, we have two points here,
which correspond to the rectangle that we've chosen,
and those are simply at inner rectangle width and
height divided by two, and at minus inner rectangle
width and height divided by two, and then we have two
additional points which correspond to the outer bounds
of our image, which are just at center.x, center.y and
minus center.x, minus center.y. So let's look at the code.
So, the code is mostly unchanged.
We're just going to add a few for loops here, and we can
unroll these because it's a fixed number of iterations.
So, we're just going to be passing in one
additional parameter, which is the dimensions of half
of the inner rectangle. Then the first for loop is going
to take an existing point and scale it continuously based
on whether or not it falls outside of the image
bounds; if it doesn't, it won't change it,
it's just going to divide by one, so it won't affect
the position. Then if, on the other hand, we had a point
that was inside the inner rectangle, we're going to
keep scaling it out for a certain number of iterations
until we get a point that is in the
valid area, and the rest is unchanged.
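Here is a hedged Python sketch of those two unrolled loops. The function name and the tuple conventions are mine; the real kernel does this with vector comparisons on the GPU, and it assumes (as noted later) that the inner rectangle and the image share the same aspect ratio, so one scale factor works for both axes.

```python
def rescale_into_annulus(x, y, inner_half, outer_half, iterations=4):
    # inner_half is the half-size of the inner cut-out rectangle;
    # outer_half is the half-size of the image (i.e. center.x, center.y).
    # Points are relative to the center of the cut-out rectangle.
    scale = outer_half[0] / inner_half[0]
    for _ in range(iterations):
        # Outside the image bounds: divide to bring the point inward.
        if abs(x) > outer_half[0] or abs(y) > outer_half[1]:
            x, y = x / scale, y / scale
    for _ in range(iterations):
        # Inside the inner cut-out rectangle: multiply to push it out
        # toward the ring of valid pixel data.
        if abs(x) < inner_half[0] and abs(y) < inner_half[1]:
            x, y = x * scale, y * scale
    return x, y
```

Points already in the valid ring pass through unchanged, which is why the unconditional unrolled loops are harmless on the GPU.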
So, we get hall of mirrors plus
the Droste Effect in a single pass.
Now, that doesn't deal with alpha blending,
and if we look at our picture of our iMac,
you can see that the sections colored
here in green look horrible.
So, the question is, can we do this in a single pass?
And you see when we applied the Droste
Effect, it looks kind of bad.
So, we can use the same trick that we used for scaling
points if we realize that alpha is not equal to one.
So, if we're asked to render a point where alpha
isn't equal to one, we're just going to move it in
and keep accumulating values until we have alpha equals one.
So in terms of code, it's going to look like this.
So we're just going to use Porter Duff alpha blending.
We're going to have another for loop.
Truth be told, we probably don't need four iterations,
because each new iteration involves a new sampling operation,
so that's on the expensive side, but that's the basic idea.
We're just going to compute a new coordinate, look at the
value for that and keep adding it to our current color
if necessary and then if we do
that, we'll get alpha blending.
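The accumulation loop can be sketched like this in Python, using premultiplied-alpha Porter-Duff source-over. This is a simplified illustration with names of my own: the real kernel computes a new sample coordinate on each iteration rather than taking a ready-made list of samples.

```python
def source_over(src, dst):
    # Porter-Duff "over" on premultiplied RGBA tuples:
    #   out = src + dst * (1 - src.alpha)
    k = 1.0 - src[3]
    return tuple(s + d * k for s, d in zip(src, dst))

def accumulate_until_opaque(samples):
    # Sketch of the kernel's extra for loop: keep compositing deeper
    # samples under the running color until alpha reaches one (or we
    # run out of samples).
    color = (0.0, 0.0, 0.0, 0.0)
    for sample in samples:
        if color[3] >= 1.0:
            break
        color = source_over(color, sample)
    return color
```

An opaque first sample short-circuits immediately, and a half-transparent sample over an opaque one blends the two, which is exactly the behavior the green regions needed.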
So we can do everything in a single pass.
So, let's talk a little bit about how
this filter is in terms of performance.
So, first off, I talked a little bit about ROI earlier, and
this is a perfect example of when it's really important
to specify an ROI, because in this case we can't determine
ahead of time, when we're trying to render a small subsection
of the image, how much of the original image we need.
So, this is not a tileable operation, which
means that you're going to be limited in terms
of the maximum output size for the image that
you can create to the limits of that device,
so the maximum texture size or the equivalent limits in OpenCL.
Also, had we gone down the multi-pass approach, we could
have chosen to simply take our original input image,
scale and rotate it, and do the Source Over;
that's another option.
Also, because we're not using the built-in affine
transform and rotate methods inside of Core Image,
we're going to end up with some aliasing artifacts,
so we might want to do some multi-sampling in order
to make our image look a little bit smoother.
And one thing that's really important to note,
given the recursive nature of this algorithm,
is that the very first thing you want to do is match the aspect
ratio of the image that you're applying this effect to,
so it's identical to the aspect ratio of
the rectangle that you're cutting out.
So, let's take a look at how this works.
So, we've implemented this inside of Quartz Composer
and first thing I'm going to do is click two points
to indicate what portion of the image I want to cut out
and then I'm going to enable the effect and we're done.
So, we can turn this on, and then we can change the
values of a and b to see how that affects our result,
and we can take a look at some of the
other graphs and how they get affected
by these things including our little spiral here, which
was kind of fun, and we can also do it on live video
and we can go really crazy [applause] and
you can see how I felt before I got on stage.
[Laughter] Okay, and this is available for you right now.
You can download it on the website
via the URL that I showed you earlier.
It's kind of spooky.
I know. Okay, if you're curious and want to learn a
little bit more about the math behind all of this stuff,
there are some people in the Netherlands who
figured out how this all works,
and without them this would never have been possible, and
they published a paper in the AMS, which you can get here
as well, and Jos Leys, whose name I probably butchered,
also has an interesting web page about the math behind all
of this. And one good reference for you
when you start writing your own kernels is the GLSL
quick reference, which is available for download
from the Khronos website, which I've listed here.
In addition, when you start writing your own kernels,
there are a few good books you might want to look at:
Digital Image Warping by George Wolberg, which has a
lot of information about image resampling; GPU Gems 3,
which has a good object tracking demo
written in Core Image that you can look at;
and then finally Digital Image Processing,
which is a good overview of image processing
in general and has a good chapter on color.
You can contact Allan Schaffer at aschaffer@apple.com
or go to our developer forums at devforums.apple.com.
And on that note I'd like to thank you all for coming,
and I hope you enjoy the rest of the conference.