WWDC2013 Session 509

Transcript

>> Thank you all for coming
to one of the last sessions
of this week.
I hope we've all had
a great week here.
Today, in our session,
we're going to be talking
about Core Image Effects and
Techniques on iOS and Mac OS.
So, Core Image.
In a nutshell, Core Image is a
foundational image processing
framework on both iOS and Mac
OS and it's used in a variety
of great applications from Photo
Booth to iPhoto on both desktop
and embedded products.
It's also used in the
new photos app for some
of their image effects,
the new filter effects
that are available in iOS 7.
And it's also in a wide variety
of very successful App
Store apps as well.
So, it's a great
foundational technology.
We spend a lot of time making sure to get the best performance out of it, and we want to tell you all about it today.
So, the key concepts we're going to be talking about today are how to get started using the Core Image API, how to leverage the built-in filters that we provide, how to provide input images into these filters, and how to render the output of those effects. And lastly, we have a great demo of how to bridge Core Image and OpenCL technologies together.
So, the key concepts
of Core Image.
It's actually a very
simple concept
and it's very simple to code.
The idea is you have
filters that allow you
to perform per pixel
operations on an image.
So, in a very simple example,
we have an input image
and original image,
pictures of boats,
and we want to apply a
Sepia Tone Filter to that
and the results after
that is a new image.
But we actually have lots of filters, and you can combine them into either chains or graphs. And by combining these multiple filters, you can create very complex effects.
In this slightly more complex example, we're taking an image, running it through Sepia Tone, then running it through a Hue Adjustment filter to make it into a blue-toned image, and then we're adding some contrast by using the Color Controls filter.
Now, while you can conceptually think of there being an intermediate image between every filter, internally, to improve performance, Core Image will concatenate these filters into a combined program in order to get the best possible performance, and this is achieved by eliminating intermediate buffers, which is a big benefit.
And then we also do
additional runtime optimizations
on the filter graph.
For example, both the hue adjustment in this example and the contrast are matrix operations, and if you have sequential matrix operations in your filter graph, then Core Image will combine those into a single matrix, which will actually further improve both performance and quality.
So, let me give you
a real quick example
of this working in action.
So, if I bring this up here, we have an application which we first showed a little bit last year at WWDC, and now we actually have a full-fledged version of the application. So, the idea here is you bring up the filters pop-up, and that allows you to add either input sources or filters to your rendering graph. In this case, I just want to bring in the video from the video feed, and hopefully you can see that OK.
Once we have that effect, we can then add on additional adjustments. For example, we can go and find another effect in here, and with Color Controls, we can increase saturation or contrast, and we can do these effects live.
We can also delete them.
If you want to do a
slightly more complex effect,
we can do a pattern here.
We can go to Dot Screen. And Dot Screen, hopefully you can see this, turns the video into a newsprint-type dot pattern, and we can adjust the size of the dots and the angle of the dot pattern.
Now, let's say this doesn't
quite suit our desires
right now.
This is a black and
white pattern.
We'd like to kind of
combine this halftone pattern
with the original
color of the image.
We can actually represent graphs in this list for you here. What we can do is add another instance of the input video, so now we've got two operations on this stack of filters, and then we can combine those with another combining filter. Yeah, here we go, [inaudible].
So, now, hopefully you can
see it on the projector
but we've got both the
halftone pattern and the color
from the original image
shining through, all right.
So, let me pop this off, delete.
That was the first demo of
the Funhouse Application.
Let me go back to my slides
and the great news
is the source code
for this app is now available.
So, this has been a much-requested feature. We showed this last year a little bit, and it's now in
[ Applause ]
really great shape for you guys to look at this application and see how we did all this fun stuff. So, once you've looked at the code, you can see very quickly that there are really three basic classes that you need to understand to use Core Image.
The first class is the CIFilter class, and this is a mutable object that represents an effect that you want to apply. A filter has image or numeric input parameters, and it also has an output image parameter as well. At the time you ask for the output image, it will return an object that represents the output based on the current state of the input parameters.
The second key object type that you need to understand is the CIImage object, and this is an immutable object that represents the recipe for an image. And there are basically two types of images: an image that comes directly from a file or an input of some sort, or a CIImage that comes from the output of a CIFilter.
The third key data type that you need to be aware of is the CIContext, and a CIContext is the object through which Core Image will render its results.
It can be based on either a CPU renderer or a GPU renderer, and it's really important to distinguish between those two; I'll talk about that a little bit later in the presentation.
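To make those three classes concrete, here's a minimal sketch in Objective-C; the file path and the intensity value are placeholders for illustration:

```objc
#import <CoreImage/CoreImage.h>

// Hypothetical input; any file URL will do.
CIImage *input = [CIImage imageWithContentsOfURL:
    [NSURL fileURLWithPath:@"/tmp/boats.jpg"]];

// CIFilter: a mutable object describing the effect and its parameters.
CIFilter *sepia = [CIFilter filterWithName:@"CISepiaTone"];
[sepia setValue:input forKey:kCIInputImageKey];
[sepia setValue:@0.8 forKey:kCIInputIntensityKey];

// CIImage: an immutable recipe; asking for outputImage does no rendering yet.
CIImage *result = sepia.outputImage;

// CIContext (here the iOS convenience constructor): the object that renders.
CIContext *context = [CIContext contextWithOptions:nil];
CGImageRef cgImage = [context createCGImage:result fromRect:result.extent];
// ...draw or save cgImage, then CGImageRelease(cgImage).
```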
So, as I mentioned in the intro, Core Image is available on both iOS and Mac OS, and for the most part it's very, very similar between the two platforms, but there are a few platform specifics that you might want to be aware of. First of all, in terms of built-in filters, on iOS Core Image has over a hundred built-in filters now, and we've added some more in iOS 7 as well.
And on Mac OS X, we have over 150 built-in filters, and we also have the ability for your application to provide its own filters.
The core API is very similar between the two platforms. The key classes I mentioned earlier, CIFilter, CIImage, and CIContext, are available on both, and they're largely identical APIs. On OS X, there are a few other additional classes, such as CIKernel and CIFilterShape, which are useful if you're creating your own custom filters.
On both platforms, we have
render-time optimizations
and while the optimizations
are slightly different due
to the differing natures of the
platforms, the idea is the same
and that Core Image
will take care
of doing the best render-time
optimizations that are possible
to render your requested graph.
There are a few similarities and differences regarding color management, which is also something to be aware of. On iOS, Core Image supports either sRGB content or a non-color-managed workflow, if you decide that's what's best for your application. On OS X, you can either have a non-color-managed workflow, or you can support any ICC-based color profile using the CGColorSpaceRef object.
In both cases, on both iOS and Mac OS, the internal working space that Core Image uses for its filters is unclamped linear, and this is useful to produce high-quality, predictable results across a variety of different color spaces.
Lastly, there're some
important differences in terms
of the rendering
architecture that's used.
On iOS, we have a
CPU rendering path
and we also have our GPU based
rendering path that's based
on OpenGL ES 2.0.
And on OS X, we also have a CPU and a GPU-based rendering path. Our CPU rendering path is built on top of OpenCL, using its CPU rendering. And also, new in Mavericks, Core Image will also use OpenCL on the GPU, and I'd like to give you a little demo of that today 'cause we got some great benefits out of that.
For a wide variety of operations in Core Image, we get very, very high performance due to the fact that we leverage the GPU. For example, we can be adjusting the slider in real time on this 3K image, or I think it's 3.5K by something image, and we're getting very, very fluid results on the slider, and that's because these are relatively simple operations.
One way we like to think about this, however, is how does this performance change as we start to do more complex operations, and how do we make sure that the interactive behavior of Core Image is as fluid as possible.
So, we've been spending a lot
of time on this in Mavericks
and we came up with
this demo application
to help demonstrate performance.
One thing that really makes it easier to see the performance, instead of trying to subjectively judge a slider, is this little test mode where it will do 50 renders in rapid succession, as quickly as possible. What it will do is take these filter operations that we've done, prepend to the beginning of that filter graph an exposure adjustment, and then adjust that exposure and render it 50 times with 50 different exposures. And that will force Core Image to have to render everything after it in the filter graph again.
So, if we go through here, it will do a quick sweep of the image, and you can see we're getting 0.83 seconds, and that's an interesting number; it turns out that's how long it takes to render 50 frames if you're limited by a 60-frames-per-second display time. So, that's good, that means we're hitting 60 frames per second, or maybe we're actually even faster but we're limited by the frame rate, right.
The question, however, is what starts to happen as we start to do more complex operations, obviously, if we start throwing in very complex operations like highlights and shadows adjustments and, more importantly, very large blurs. This blur is actually even more than the 50-pixel value that you're seeing in that slider; it's actually hundreds of pixels wide and hundreds of pixels tall, and that requires a lot of fetching from an image.
So, obviously in this case, when we do a sweep, we're getting not quite real-time performance. And while it's, you know, impressive, we could do better, and this is one of the reasons we spent a lot of time in Mavericks changing the internals of Core Image so that it would use OpenCL instead. And as you can see, as we turn on the OpenCL GPU path, we're now back down to 60 frames per second on this complex rendering operation.
So, we're really pleased
with these results.
The great thing also about this performance is that it particularly benefits operations that are complex, where we're doing large, complex render graphs. So that was again the demonstration of OpenCL on the GPU on OS X Mavericks.
So, as I've talked about today, we have a lot of built-in filters, and I'd like to give you a little bit more detail on those built-in filters and some we've added, and give you some more information on how to use filters in your application.
So, we have a ton
of useful filters
and it's probably
barely even readable
to see them all here so, I just
want to highlight some today.
So, first of all the filters
fall under different categories.
We have whole bunch of
filters for doing color effects
and color adjustments.
In my slides earlier, I called out three as an example: Color Controls, Hue Adjustment, and Sepia Tone. The other ones work similarly: they take an input image, have parameters, and produce an output image.
We've also added some new ones in both iOS 7 and Mavericks that we think will be useful for a variety of different uses. We have, for example, Color Polynomial and Color Cross Polynomial, which allow you to do polynomial operations that combine the red, green, and blue channels in interesting ways. You can actually do some really interesting color effects with these.
We also have a class
of filters which fall
into either geometry adjustments
or distortion effects.
And for example, one of these is a fun effect called Twirl Distortion, and we can actually demo that real quickly here. You can see this adjusting a twirl on an image, and this is actually running in the presentation right now using Core Image; it's a kind of recorded movie.
We also have several blur and sharpen effects, and I mention blur and sharpen because blurs in particular are one of the most foundational types of image processing you can perform. Gaussian blur, for example, is used as the basis of a whole variety of different effects, such as sharpening and edge detection and the like.
We've also added some new blur, or convolution, effects to iOS 7 and Mavericks, and we've picked some that would be particularly general so that they can be used in a variety of applications. It's very, very common to use either 3x3 or 5x5 convolutions, and we've implemented those and optimized the heck out of them so you'll get really good performance. We've also added a horizontal and a vertical convolution, which are useful if your convolution is a separable operation, and again, we've optimized the heck out of these.
We also have a class of filters called generators, and these are filters that don't take an input image but will produce an output image; these are things for effects like starbursts and random textures and checkerboard patterns. But we've added a new one in both iOS 7 and Mavericks called the QR code generator, and this is a filter that takes a string as an input parameter, and also a quality setting, and will produce as its output a barcode image. So that can be useful in a lot of interesting applications as well.
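As a rough sketch of how the new generator might be invoked, with the message text, correction level, and scale factor as illustrative values:

```objc
// Sketch: generating a QR code image with the new built-in generator.
CIFilter *qr = [CIFilter filterWithName:@"CIQRCodeGenerator"];
NSData *message = [@"https://developer.apple.com"
                   dataUsingEncoding:NSISOLatin1StringEncoding];
[qr setValue:message forKey:@"inputMessage"];
[qr setValue:@"M" forKey:@"inputCorrectionLevel"];   // L, M, Q, or H

// The output is a small CIImage; scale it up with a transform for display.
CIImage *code = [qr.outputImage imageByApplyingTransform:
                 CGAffineTransformMakeScale(8.0, 8.0)];
```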
We also have a class called the face detector, and this is not exactly a filter per se, but you can think of it as a filter in the sense that it takes input images and produces output data, and we've had this for a couple of releases now. The great thing is, starting in iOS 7 and Mavericks, we've made some enhancements to that. In the past, you could give it an image and it would return the bounding rect for each face, and it would also return the coordinates for the eyes and mouth.
But starting in Mavericks and iOS 7, there's a flag you can pass in that will also return information like whether a smile is present or whether the eyes are blinking, so that's another nice new enhancement.
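A sketch of asking the face detector for the new smile and eye-blink information; the `image` variable here is an assumed CIImage, and the option values are illustrative:

```objc
// Sketch: face detection with the new smile/eye-blink options.
CIDetector *detector = [CIDetector detectorOfType:CIDetectorTypeFace
                                          context:nil
                                          options:@{CIDetectorAccuracy: CIDetectorAccuracyHigh}];
NSArray *features = [detector featuresInImage:image
                                      options:@{CIDetectorSmile: @YES,
                                                CIDetectorEyeBlink: @YES}];
for (CIFaceFeature *face in features) {
    NSLog(@"face bounds: %.0f x %.0f",
          face.bounds.size.width, face.bounds.size.height);
    if (face.hasSmile) NSLog(@"smiling");
    if (face.leftEyeClosed || face.rightEyeClosed) NSLog(@"blinking");
}
```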
So that's a brief overview of our 100-plus filters. One question we're commonly asked is how we choose what filters we add, or can we add this filter, and I wanted to just talk a moment about our process on that. There are two key criteria we want to consider: one is that a filter must be broadly usable.
We want to make sure that we add filters, like convolutions, which are useful in a wide variety of usages, so that we can implement them in a robust way and have them be useful for a wide variety of client needs. And also, we want to make sure we choose the type of operations that can be well implemented and performant on our target platforms.
So, as I mentioned in
my brief introduction
at the very beginning
of the presentation,
you can chain together
multiple filters and I wanted
to give you an idea in code
of how easy this is to do.
You start out with an input image, you create a filter object by saying, "I'd like the filter with this name," and you specify a filter name like CISepiaTone; at the same time, you specify the parameters, such as the input image and the intensity amount. And once you have the filter, you can ask it for its output image. So that's basically one line of code that will apply a filter to an image.
If we want to apply a second filter, it's just the same idea, slightly different. What we're going to be doing here is picking a different filter; we'll pick hue adjustment in this case. And the key difference is that the input image, in this case, is the output image of the previous filter. So it's very, very simple: two lines of code and we've applied multiple filters to an image.
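In code, the two steps he describes look roughly like this; the input image variable and the parameter values are placeholders:

```objc
// Step 1: sepia tone, built and parameterized in one call.
CIImage *sepiaImage = [CIFilter filterWithName:@"CISepiaTone"
                                 keysAndValues:kCIInputImageKey, inputImage,
                                               kCIInputIntensityKey, @0.8, nil].outputImage;

// Step 2: hue adjustment, whose input is the previous filter's output.
CIImage *blueImage = [CIFilter filterWithName:@"CIHueAdjust"
                                keysAndValues:kCIInputImageKey, sepiaImage,
                                              kCIInputAngleKey, @1.57, nil].outputImage;
```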
The other thing that's important and great to keep in mind is that, at the time you're building up the render graph here, the filter graph, there's no actual work being performed. This is all very fast and can be done very quickly. The actual work of rendering an image is deferred until we actually get a request to render it, and at that time we can make our render-time optimizations to get the best possible performance on the destination context.
So another thing is that you can create your own custom filters, and you can do this on both iOS and OS X. We have over a hundred built-in filters, and on iOS 7, while you cannot create your own custom kernels, you can create your own custom filters by building up filters out of other built-in filters. And this is a very effective way to create new and interesting effects. And again, we've chosen some of the new filters we've added in iOS 7 to be particularly useful for this goal.
So how does this work? The idea is you create a CIFilter subclass, and in that filter, you wrap a set of other filters. So there are a few things that you need to do. One is you need to declare properties for your filter subclass that declare what its input parameters are; for example, you might have an input image or other numeric parameters. You want to override setDefaults so that the default values for your filter are set up appropriately if the calling code doesn't specify anything else.
And lastly and most importantly, you're going to override outputImage, and it's in this method that you will return your filter graph. Internally, Core Image actually uses this technique for some of its own built-in filters. As an example, there's a built-in filter called CIColorInvert, which inverts all the colors in an image. And if you think about it, really, that's just a special case of a color matrix operation. So, if you look at our source code for CIColorInvert, all it does in its outputImage method is create an instance of the CIColorMatrix filter and pass in the appropriate parameters for the red, green, and blue, and bias vectors.
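Here's roughly what that kind of subclass looks like: a sketch of a color-invert-style filter built on CIColorMatrix (the class name is made up for illustration):

```objc
// Sketch: a CIFilter subclass whose outputImage wraps a built-in filter.
@interface MyColorInvert : CIFilter
@property (retain, nonatomic) CIImage *inputImage;
@end

@implementation MyColorInvert
- (CIImage *)outputImage
{
    // R' = 1 - R, G' = 1 - G, B' = 1 - B; alpha left at its default.
    return [CIFilter filterWithName:@"CIColorMatrix"
                      keysAndValues:
            kCIInputImageKey,   self.inputImage,
            @"inputRVector",    [CIVector vectorWithX:-1 Y: 0 Z: 0 W:0],
            @"inputGVector",    [CIVector vectorWithX: 0 Y:-1 Z: 0 W:0],
            @"inputBVector",    [CIVector vectorWithX: 0 Y: 0 Z:-1 W:0],
            @"inputBiasVector", [CIVector vectorWithX: 1 Y: 1 Z: 1 W:0],
            nil].outputImage;
}
@end
```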
You can also use this kind of thing to build up other really interesting image effects. For example, let's say you wanted to do a Sobel edge detector in your application. Well, a Sobel edge detector is really just a special case of a 3x3 convolution. In fact, it's a very simple convolution: depending on whether you're doing a horizontal Sobel or a vertical Sobel, all it is is some pattern of ones and twos and zeros in your 3x3 convolution.
One thing to keep in mind, especially on iOS, is that we want to add a bias term to this convolution. The idea here is that we want to produce an output image that is grey where the image is flat, and black and white where there are edges. And that's particularly important because, on iOS, our intermediate buffers are 8-bit buffers for these types of operations, and they can only represent values between black and white. One thing to keep in mind, however, is that by adding a bias, you are actually producing an infinite image, because outside the image, where the image is flat and clear, you're going to have grey as the output of this Sobel detector.
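As a sketch, a horizontal Sobel pass built this way might look like the following; the weights are the standard Sobel kernel, the 0.5 bias centers flat areas at grey, and the input image variable is a placeholder:

```objc
// Sketch: horizontal Sobel as a 3x3 convolution with a 0.5 bias.
CGFloat w[9] = { -1, 0, 1,
                 -2, 0, 2,
                 -1, 0, 1 };
CIFilter *sobelX = [CIFilter filterWithName:@"CIConvolution3X3"];
[sobelX setValue:inputImage forKey:kCIInputImageKey];
[sobelX setValue:[CIVector vectorWithValues:w count:9] forKey:@"inputWeights"];
[sobelX setValue:@0.5 forKey:@"inputBias"];   // flat areas map to grey

// The bias makes the result infinite in extent; crop back if needed.
CIImage *edges = [sobelX.outputImage imageByCroppingToRect:inputImage.extent];
```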
So let me just give a little demo of that in action. In this particular demo, to make it a little bit more interesting to look at on stage, I'm recoloring the image so that the flat areas look black and the edges look colorful.
So again, I've got an input
video source and I can go
and add a filter to it and I'm
going to add a custom filter
that we've implemented
called Sobel Edge Detector.
So as we can see here,
hopefully that shows
up on the display, OK.
You can see my glasses and, as I tilt my head, you can see the stripes in my shirt.
And in this case, the
images are being recolored
so that the flat
areas look black
and the edges look colorful.
And one thing you'll see is that these circles above my head are actually colorful, and that's because the Sobel edge detector is working on each color plane separately.
And if there's a color fringe in the image, it will show up as a colorful edge in the Sobel edge detector. If we wanted to get rid of those colorful fringes, all we have to do is append another filter: we can add in Color Controls and then desaturate that. And now we've got [inaudible] a monochrome edge detector, all right.
All right, so back to slides.
And again, as I mentioned before, the source code for that filter is all available in the Core Image Fun House application. All right, so there's another great use for creating your own CIFilter subclasses, and that's if you want to use Core Image in combination with the new Sprite Kit API. It's a great new API, the Sprite Kit API, and one of the things it supports is the ability to associate a CIFilter with several objects in your Sprite Kit application.
For example, you can set a
filter on an effect node,
you can set a filter
on a texture,
or you can set a
filter on a transition.
And it's a great API, but one of the caveats is that you can only associate one filter. So if you actually want to have a more complicated render graph associated with either a transition or an object in your Sprite Kit world, then you can create a CIFilter subclass. What you need to do in that subclass is make sure that your filter has an input image parameter, and, if you're running a transition effect, you want to make sure it has an input time parameter. You can have other inputs, but you want to specify them at setup time, before you pass it to Sprite Kit.
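As a rough sketch of the shape such a subclass takes; the class name and the internal filter choice are made up for illustration, and the inputImage and inputTime properties are the ones the transcript says Sprite Kit drives:

```objc
#import <SpriteKit/SpriteKit.h>

// Sketch: a CIFilter subclass suitable for use with Sprite Kit.
// Sprite Kit sets inputImage for you (and inputTime for transitions);
// any other parameters should be configured before handing it over.
@interface FunhouseNodeFilter : CIFilter
@property (retain, nonatomic) CIImage  *inputImage;
@property (copy,   nonatomic) NSNumber *inputTime;
@end

@implementation FunhouseNodeFilter
- (CIImage *)outputImage
{
    // Illustrative graph: sepia whose strength follows the time parameter.
    CGFloat t = self.inputTime ? self.inputTime.doubleValue : 1.0;
    return [CIFilter filterWithName:@"CISepiaTone"
                      keysAndValues:kCIInputImageKey, self.inputImage,
                                    kCIInputIntensityKey, @(t), nil].outputImage;
}
@end

// Usage sketch: attach the multi-filter graph to an effect node.
SKEffectNode *node = [SKEffectNode node];
node.filter = [[FunhouseNodeFilter alloc] init];
node.shouldEnableEffects = YES;
```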
So let me go on to the next
section in my presentation today
to talk about input images.
So, the input to these filters is images, and we have a wide variety of different ways of getting images into your filters.
One of the most commonly
requested is to use images
from a file and that's very,
very easy to do, it's one line
of code, create a
CIImage from a URL.
Another common source is bringing data in from your photo library, and you can do that by just asking the ALAssetsLibrary class for a default representation. Then, once you have that, you can ask for the full-screen image, and that will return a CGImage, and once you have a CGImage, you can create a CIImage from that CGImage.
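Here's a sketch of both paths; imageURL and assetURL are placeholder URLs:

```objc
#import <AssetsLibrary/AssetsLibrary.h>

// From a file: one line.
CIImage *fromFile = [CIImage imageWithContentsOfURL:imageURL];

// From the photo library: ask the asset's default representation
// for its full-screen CGImage, then wrap it in a CIImage.
ALAssetsLibrary *library = [[ALAssetsLibrary alloc] init];
[library assetForURL:assetURL resultBlock:^(ALAsset *asset) {
    CGImageRef cgImage = [[asset defaultRepresentation] fullScreenImage];
    CIImage *fromLibrary = [CIImage imageWithCGImage:cgImage];
    // ...feed fromLibrary into your filter graph here.
} failureBlock:^(NSError *error) {
    NSLog(@"could not load asset: %@", error);
}];
```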
Another example is bringing in data from a live video stream, and this is the case that we use inside the Fun House application. In this case, you'll get a callback method to process a frame of video, and that will give you a sample buffer object. Once you have the sample buffer object, you can call CMSampleBufferGetImageBuffer, and that will return a CVImageBuffer object. And the CVImageBuffer object here is really just a CVPixelBuffer object, which you can use to create a CIImage from.
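That callback looks roughly like the following sketch:

```objc
#import <AVFoundation/AVFoundation.h>

// Sketch: the capture callback that turns each video frame into a CIImage.
- (void)captureOutput:(AVCaptureOutput *)captureOutput
  didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
         fromConnection:(AVCaptureConnection *)connection
{
    // The image buffer behind the sample buffer is a CVPixelBuffer.
    CVPixelBufferRef pixelBuffer =
        (CVPixelBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);

    // Wrap it without copying; this is the input to the filter chain.
    CIImage *frame = [CIImage imageWithCVPixelBuffer:pixelBuffer];

    // ...apply filters and render `frame` here.
}
```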
At the same time I'm talking
about creating CIImages,
we should also talk a little
bit about image metadata
which is a very important
thing about images these days.
You can ask an image for its properties, and that will return a dictionary of metadata associated with that image. It will return a dictionary containing the same key-value pairs that would be present if you called the API CGImageSourceCopyPropertiesAtIndex. It contains, for some images, hundreds of properties.
The one that I want to call out today is the orientation property, kCGImagePropertyOrientation. This is really important because, as we all know with our cameras today, the camera can be held in any orientation, and the image that's saved into the camera roll has metadata associated with it that says what orientation it was in. So, if you want to present that image to your user in the correct way, you need to read the orientation property and apply the appropriate transform to it.
The great thing is that metadata is all set up for you automatically if you use the image-with-URL or image-with-data APIs. If you're using other methods to instantiate an image, you can specify the metadata using the kCIImageProperties option.
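A short sketch of reading the orientation and passing metadata along; imageURL and cgImage are placeholders:

```objc
#import <ImageIO/ImageIO.h>

// Sketch: reading the orientation out of an image's metadata.
CIImage *image = [CIImage imageWithContentsOfURL:imageURL];
NSDictionary *metadata = [image properties];
NSNumber *orientation =
    metadata[(__bridge NSString *)kCGImagePropertyOrientation];
NSLog(@"EXIF orientation: %@", orientation);   // 1 through 8; 1 means upright

// When creating images another way, pass the metadata along yourself.
CIImage *other = [CIImage imageWithCGImage:cgImage
                                   options:@{kCIImageProperties: metadata}];
```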
Another thing we've added on both Mavericks and iOS 7 is much more robust support for YCbCr images. A CIImage can be based on bi-planar YCC 4:2:0 data, and this is a great way to get good performance out of video. On OS X, you want to use an IOSurface object to represent this data, and on iOS, you want to use a CVPixelBuffer. The great thing is Core Image takes care of all the hard work for you: it will take this bi-planar data and combine the full-resolution Y channel and the subsampled CbCr planes into a full image, and it will also apply the appropriate 3x4 color matrix to convert the YCC values into RGB values.
If you are curious about all the math involved in this, I highly recommend the book by Poynton, "Digital Video and HD: Algorithms and Interfaces," which goes over in great detail all the matrix math that you need to understand to correctly process YCC data.
The other thing you might want to keep in mind is, if you're working on 4:2:0 data, you might be working in a video-type workflow, and in that case, on Mac OS, you might want to tell Core Image to use the Rec. 709 linear working space rather than its default, which is generic RGB, and this can prevent some clipping errors due to the color matrix operations.
The third section I want to talk
about today is rendering
Core Image output.
If you have an image and
you've applied a filter,
there are several ways to render
the output using Core Image.
One of the most common
is rendering an image
to your photo library.
And again this is
very easy to do.
There's one thing you want to be aware of: when you're saving images to your photo library, you could quite easily be working on a very high resolution image, 5 megapixels for example, and resolutions of this size are actually bigger than the GPU limits that are supported on some of our devices. So, in order to render this image with Core Image, you want to tell Core Image to use a software renderer. This also has the advantage that if you're doing a bunch of exports in the background, you can do this while your app is suspended, which isn't the case if we try to use our GPU renderer.
And once you've created the CIContext, there's an assets library method which we can use to write the JPEG into the camera roll. The key API to call is for Core Image to create a CGImage from a CIImage.
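Sketched out, that export path looks like this; filteredImage and originalMetadata are placeholders for your filter output and the source image's metadata dictionary:

```objc
#import <AssetsLibrary/AssetsLibrary.h>

// Sketch: exporting a full-resolution result with the CPU renderer.
// The software renderer avoids GPU texture-size limits and keeps working
// if the export continues while the app is in the background.
CIContext *cpuContext = [CIContext contextWithOptions:
    @{kCIContextUseSoftwareRenderer: @YES}];
CGImageRef cgImage = [cpuContext createCGImage:filteredImage
                                      fromRect:filteredImage.extent];

ALAssetsLibrary *library = [[ALAssetsLibrary alloc] init];
[library writeImageToSavedPhotosAlbum:cgImage
                             metadata:originalMetadata
                      completionBlock:^(NSURL *assetURL, NSError *error) {
    CGImageRelease(cgImage);
}];
```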
Another common way of rendering an image is to render it into a UIImageView using a UIImage. The code for this is actually very, very simple: UIImage supports CIImage, so you can create a UIImage from the output of a filter, and then you can just tell an image view to use that UIImage. This is very, very easy to code, but it's actually not the best from a performance perspective, and let me talk a little bit about that.
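In code, the simple path is just a couple of lines; the filter and the image view are assumed to exist already:

```objc
// Sketch: the simple (but round-trip-heavy) UIImage path.
UIImage *uiImage = [UIImage imageWithCIImage:filter.outputImage];
imageView.image = uiImage;
```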
Internally, what UIImage is doing is asking Core Image to render it and turn it into a CGImage, but what happens here is that when Core Image is rendering, it will upload the image to the GPU, it will perform the filter effect that's desired, and because the image is being read back into a CGImage, it's being read back into CPU memory. Then, when it comes time to actually display it in the UIView, it goes back to being rendered on the GPU using Core Animation. And while this is effective and simple, it means that we're making several trips across the boundary between the CPU and the GPU, and that's not ideal, so we'd like to avoid that.
A much better approach is to take an image, upload it once to the GPU, and have Core Image do all the rendering directly to the display. And that's actually quite easy to do in your application. If you have a CAEAGLLayer, for example, at the time that you're instantiating your object, you want to create a CIContext at the same time: we create an EAGLContext of type OpenGL ES 2, and then we tell Core Image to create a context from that EAGLContext.
Then, when it comes time to update the display in your update-screen method, we're going to do a couple of things. We're going to ask our model object to create a CIImage to render. We're then going to set up the GL blend mode to be, let's say in this case, source-over. This is actually a subtle change between iOS 6 and iOS 7: on iOS 6, we would always blend with the source-over blend mode, but there are a lot of interesting cases where you might want to use a different blend mode, so now, if your app is linked on or after iOS 7, you have the ability to specify your own blend mode. Once we set up the blend mode, we tell Core Image to draw the image into the context which is based on the EAGLContext, and then lastly, to actually present the image to the user, we bind the render buffer and present that render buffer.
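A sketch of what that per-frame method might look like; the model call, destinationRect, and colorRenderbuffer properties are assumptions about the surrounding view class:

```objc
#import <OpenGLES/ES2/gl.h>

// Setup, done once elsewhere (for example in initWithFrame:):
//   self.eaglContext = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
//   self.ciContext   = [CIContext contextWithEAGLContext:self.eaglContext];

- (void)updateScreen
{
    CIImage *image = [self.model imageForDisplay];   // hypothetical model call

    // Linked on iOS 7 or later, Core Image honors the current blend state.
    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);     // source-over

    [self.ciContext drawImage:image
                       inRect:self.destinationRect
                     fromRect:image.extent];

    glBindRenderbuffer(GL_RENDERBUFFER, self.colorRenderbuffer);
    [self.eaglContext presentRenderbuffer:GL_RENDERBUFFER];
}
```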
The next thing I'd like to talk about is rendering to a CVPixelBufferRef, and this is another interesting thing that we talk about and show in the Fun House application, where you may want to be applying a filter to a video and saving that to disk. Now, to make the problem a little bit more interesting, while you're saving the video to disk, you may also want to present it to the user as a view so they can see what's being recorded. So this is actually an interesting example, and I want to talk a little bit about the bit of code we have here.
All right, so again all
this code is available
in the Core Image Fun House.
And what we have here is a scenario where we want to record video and also display it to the user at the same time.
When our view object gets instantiated, when the app launches, we're going to be creating an EAGLContext as I demonstrated in the slide, and at the same time we'll also create a CIContext with that EAGLContext. Later on, when it comes time to render, we have a callback method, capture output, and again we're going to be given a sample buffer here.
If we look down further
in this code,
we are doing some
basic rectangle math
to make sure we render
it in the correct place.
We're going to take the
source image that we get
from the sample buffer
and we're going
to apply our filters to it.
Now we have this output image
and we want to render that.
Now, there're two
scenarios here.
One is when the app is running
and it's just displaying
live preview.
In that case in this code here,
all we're doing is setting
up our blend mode and rendering
the filtered image directly
to the context of the display
with the appropriate rectangle.
In the case when we're recording, we want to do two things. First of all, we [inaudible] start up our video-writing object. And then we're going to ask Core Video for a pixel buffer to render into, out of its pool.
Then we will ask Core Image to render the filtered image into that buffer, and that will apply our filters to it. Now that we have that rendered buffer, we're going to do two things with it: one is we're going to draw that image to the display with the appropriate rectangle, and then we're also going to tell the writer object that we want to append this rendered buffer, with the appropriate timestamp, into the stream. And that's pretty much all there is to it; when you run the Fun House application, if you try it after the presentation, you can both record into your camera roll and preview at the same time.
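A sketch of the recording branch; the pixel buffer adaptor, destination rectangle, and frameTime are assumptions standing in for the app's writer setup:

```objc
#import <AVFoundation/AVFoundation.h>

// Sketch: render the filtered frame into a CVPixelBuffer from the
// writer's pool, show it on screen, and append it to the movie.
CVPixelBufferRef renderedBuffer = NULL;
CVPixelBufferPoolCreatePixelBuffer(kCFAllocatorDefault,
                                   self.pixelBufferAdaptor.pixelBufferPool,
                                   &renderedBuffer);

// Core Image applies the whole filter graph directly into the buffer.
[self.ciContext render:filteredImage toCVPixelBuffer:renderedBuffer];

// 1) Show the frame on screen...
[self.ciContext drawImage:filteredImage
                   inRect:self.destinationRect
                 fromRect:filteredImage.extent];

// 2) ...and append it to the movie with its timestamp.
[self.pixelBufferAdaptor appendPixelBuffer:renderedBuffer
                      withPresentationTime:frameTime];
CVPixelBufferRelease(renderedBuffer);
```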
So just a few last-minute tips for best performance. Keep in mind that Core Image and filter objects are autoreleased objects, so if you're not using ARC in your application, you'll want to use autorelease pools to reduce memory pressure. Also, it's a very good idea not to create a CIContext every time you render; it's much better to create it once and reuse it.
And also, be aware that both Core Animation and Core Image make extensive use of the GPU, so if you want your Core Animation to be smooth, you want to either stagger your Core Image operations or use a CPU CIContext, so that they don't put pressure on the GPU at the same time.
Another couple of tips and practices: be aware that the GPU context on iOS has limited dimensions, there's an inputMaximumImageSize and an outputMaximumImageSize, and there are APIs that you can call to query those. It's always a good idea from a performance perspective to use as small an image as possible.
The performance of Core Image is largely dictated by the complexity of your filter graph and by the number of pixels in your output image, so there are some great APIs that allow you to reduce your image size. One that we used in this example for Fun House is, when you ask for an image from your asset library, you can say, "I want a full-screen image rather than a full-size image," and that will return an appropriately sized image for your display.
So, the last section of our talk today is bridging Core Image and OpenCL. Early on in the presentation, I was talking about the great performance wins we're getting out of Core Image on Mavericks by using OpenCL. The great thing is we get improved performance due to advances in the OpenCL compiler, and also due to the fact that OpenCL has less state management. So, we got some great performance wins. And the other great thing is that there's nothing your application needs to change to get this; it happens automatically.
And there's actually some great technology under the hood. All of the built-in kernels, and your custom kernels that are written in CI's kernel language, are automatically converted into OpenCL code. So, there's really some great stuff that we have to make all this work behind the scenes. If we think about it a little bit, the Core Image kernel language has some really great advantages, though. With the CIKernel language, you can write a kernel once and it'll work across device classes and also across different image formats.
And it also automatically
supports great things
like tiling of large images and
concatenation of complex graphs.
However, there are some very interesting image-processing operations out there that cannot be well expressed in Core Image's kernel language, due to the nature of the algorithm. But some of those algorithms can be represented in OpenCL's language. And the question is, how can you bridge the best of both of these together? How can you bridge both Core Image and OpenCL together to get some really great image processing on Mavericks?
So, to talk about that, I'm going to bring up Alexander to talk about bridging Core Image and OpenCL; he has some great stuff to talk about.
>> Thank you, David.
So, my name is Alexander Neiman [phonetic], and today I'm going to talk to you a little bit about how we can use both Core Image and OpenCL together in order to create new and interesting effects
which we wouldn't be able to do using just Core Image on its own. So, we're going to start with an image that's pretty hazy that David took a little bit more than a year ago from an airship, and he said, "These pictures suck, how can we make them better? Can we get rid of the haze?" And for the sake of the demo, we did. So, if we look closely here, we're going to get a little animation of the desired result we're trying to get here: we're literally going to peel the haze off this image.
So, how are we going to do this? Well, the basic idea is that haze is accumulated as a function of distance: the further away you get, the more haze there is. But if we were to look at any part of this image, so if we zoom in on this little section here, there should be something that's blocking this image, which is to say that if we were to look at this area under the arches, there should be either an object that's black or a really dark shadow. But because of the atmospheric haze that's accumulated, it's no longer black.
It's colorful. And what we want to do is remove that color. So, the question is, how are we going to find out what's black and apply that to a greater area so that we can eventually get a dehazed image? So, if we were to look at a pixel somewhere in the middle of this little area, and then search a certain area in X and another area in Y, we get a search rectangle.
Now, if we look at the search rectangle, we can see that there is going to be a local minimum that we can use. And once we know what should have been black, we know how much haze has been applied, because we know that that should have been black originally, and that amount of haze is probably going to be uniform over the entire rectangle.
So, if we look at this visually, we're going to compute a morphological min operation in the X direction first, so we're going to search for the smallest value in the X direction, and then we're going to do the exact same thing in the Y direction. And then we get this kind of blocky pattern.
We're going to blur this
result to some degree,
and now we have a really
good representation
of how much haze there is,
and we can subtract this
from our original image using
a Difference Blend Mode.
And if we do that, we get a
beautifully dehazed image.
In terms of workflow, the way we're going to do this is we're going to start with our input image, and we're going to perform a morphological min operation in the X direction first, where we search for a minimum value and the values just kind of come together. The next thing we're going to do is perform a morphological min operation in the Y direction. And these are exaggerated a little bit, the search is a little bit larger than we would actually do in real life to get this effect, just so you can actually see something on these slides.
Once we've got the morphological min operation performed, we're then going to blur that result, and then we get a nice gradient which we can use to subtract from our original image, and we'll get our dehazed image.
So if we take our input image and our Gaussian-blurred image, and perform difference blending, we'll get our desired result. The generation of the input image and the Gaussian blur can be done using Core Image. But we want to use OpenCL to perform the morphological min in X and Y operations, because those are operations which would be traditionally difficult to do in Core Image's kernel language.
The problem is that Core Image
doesn't know about clMemObject
and OpenCL doesn't
know about CIImages.
So, how are we going to do this?
Well we're going
to use IOSurface.
And in OS X Mavericks,
we've done a lot of work
to improve how we use IOSurface
and make sure that we stay
on the GPU the whole time.
And so we're going to go
through all the steps today
to show how we can do this
and get maximum performance
and combine all these
APIs together.
So, let's take a look at our workflow to process this image. We're going to start by asking Core Image to down-sample the image, because in order to generate a gradient, which we're going to be using to perform the subtraction, we don't need to run at full resolution. The next thing we're going to do is ask Core Image to render into an IOSurface. Traditionally speaking, most of the time you render to a buffer, as in writing the file to disk or directly displaying on screen, but you can also render to an IOSurface, which, as I mentioned, is something we've really improved in Mavericks.
We're then going to use OpenCL
to compute the minimum
using some kernels
that we're going to
go over in detail.
Once we've got the output from
OpenCL, we're then going to take
that IOSurface that was tied to
the clMemObject, and we're going
to create a new CIImage.
We're going to blur that result,
perform a Difference Blending,
and then we just
render and we're done.
So, let's take a look
at all these steps
in little more detail.
So, first things first, we're going to import an image with a URL; we just call CIImage image-with-URL. We're then going to down-sample it so that we have fewer pixels to process; again, in order to compute this gradient, as I mentioned earlier, we don't need the full resolution. And then we're going to inset our rectangle a little bit, such that if we had generated an image that wasn't integral, we might end up with some pixels on the border of the image that have a little bit of alpha value. We don't want to make our kernel more complicated than it needs to be, so we're just going to get rid of one pixel on the edge, and then we're going to crop the image to get rid of that one pixel of border.
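Sketched in code, those steps might look like this; the URL and scale factor are placeholders:

```objc
// Sketch: load, down-sample, and trim one pixel off the border.
CIImage *original = [CIImage imageWithContentsOfURL:imageURL];

// Down-sample: a plain scale transform is enough for building the haze estimate.
CGFloat scale = 0.25;   // illustrative factor
CIImage *scaled = [original imageByApplyingTransform:
                      CGAffineTransformMakeScale(scale, scale)];

// Inset and crop by one pixel so partially covered edge pixels
// (with fractional alpha) never reach the OpenCL kernel.
CGRect cropRect = CGRectIntegral(CGRectInset(scaled.extent, 1.0, 1.0));
CIImage *inputForCL = [scaled imageByCroppingToRect:cropRect];
```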
Also, I should mention before I forget that the sample code for this will also be available for download on the Session Breakout page at some point later on today.
So, don't worry if you don't
follow everything here.
But let's get to the next
step of this process.
The first thing we're going to do is create an IOSurface. In order to do that, we're going to specify a bunch of properties, including the bytes per row, bytes per element, width, height, and the pixel OSType, the pixel format that we'll be using for our input surface.
We then create an IOSurface using IOSurfaceCreate. Once we've done that, we're going to want to create a CIContext, which again, as David mentioned earlier, we're going to want to hold on to, 'cause if we perform this effect multiple times, it's good to hold on to all the resources tied to that context. And we're going to make sure that we initialize our CIContext with the OpenGL context that we eventually plan on rendering with. Then we're going to actually ask Core Image to render our scaled-down image into the IOSurface, and we're going to make sure that our output color space is equal to the input color space of our original image.
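A sketch of those steps; cropRect and inputForCL come from the previous snippet, and cglContext, pixelFormat, and colorSpace are assumed to come from the app's existing GL setup:

```objc
#import <IOSurface/IOSurface.h>
#import <CoreVideo/CoreVideo.h>

// Sketch: a BGRA IOSurface big enough for the down-sampled image.
size_t width  = (size_t)CGRectGetWidth(cropRect);
size_t height = (size_t)CGRectGetHeight(cropRect);
NSDictionary *props = @{ (__bridge id)kIOSurfaceWidth:           @(width),
                         (__bridge id)kIOSurfaceHeight:          @(height),
                         (__bridge id)kIOSurfaceBytesPerElement: @4,
                         (__bridge id)kIOSurfacePixelFormat:     @(kCVPixelFormatType_32BGRA) };
IOSurfaceRef surface = IOSurfaceCreate((__bridge CFDictionaryRef)props);

// A CIContext sharing the GL context we'll eventually render with.
CIContext *ciContext = [CIContext contextWithCGLContext:cglContext
                                            pixelFormat:pixelFormat
                                             colorSpace:colorSpace
                                                options:nil];

// Render the scaled image into the surface, keeping the output
// color space equal to the source image's color space.
[ciContext render:inputForCL
      toIOSurface:surface
           bounds:cropRect
       colorSpace:colorSpace];
```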
So now, let's get into the nitty-gritty of what we'll be doing with OpenCL. First things first, we're going to want to create a CL context which is going to be able to share all the data from the IOSurfaces that we created with OpenGL. In order to do that, we're going to get the share group for the GL context that we plan on using, and we're going to create a context using that share group. Once we've done that, we're then going to use a function called clCreateImageFromIOSurface2DAPPLE, which allows us to take an IOSurface and create a clMemObject.
Now, this is going to correspond to our input image, which was the result of what we asked Core Image to render initially, the down-sampled image. But we also need an image to write out to, and our algorithm is going to be a two-pass, separable approach: we're going to perform our morphological min in the X direction and a morphological min in the Y direction. So, for the first pass, we're going to create an intermediate image, which is the output of the first kernel and which we'll then be using as the input for the second kernel. So, we've got our intermediate image here. And then we're going to need one more IOSurface-based clMemObject for our final output, which is what we're going to hand back to Core Image to do the final rendering.
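A sketch of the share-group setup and surface wrapping; this is illustrative only, and the exact signature of the Apple IOSurface extension should be checked against cl_ext.h:

```objc
#include <OpenCL/opencl.h>

// Sketch: an OpenCL context that shares data with our CGL context.
CGLShareGroupObj shareGroup = CGLGetShareGroup(cglContext);
cl_context_properties properties[] = {
    CL_CONTEXT_PROPERTY_USE_CGL_SHAREGROUP_APPLE,
    (cl_context_properties)shareGroup,
    0
};
cl_int err = 0;
cl_context clContext = clCreateContext(properties, 0, NULL, NULL, NULL, &err);

// Wrap the input IOSurface as a cl_mem image (Apple extension; treat
// this call and its argument order as an assumption to verify).
cl_image_format format = { CL_BGRA, CL_UNORM_INT8 };
cl_mem inputImage = clCreateImageFromIOSurface2DAPPLE(
    clContext, CL_MEM_READ_ONLY, &format, width, height, surface, &err);
// ...and likewise for the intermediate and final output surfaces.
```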
Let's take a look at what we want to do conceptually. We've got a zoomed-in area of an image, and we're going to take a look at how the search is going to happen, and then eventually we're going to go over each line of code that's involved in doing this. So, basically, we're looking for a minimum value, and we're going to initialize it to a very large value, which is to say (1,1,1,1), so all white and opaque. We're going to look for a minimum value, and we're basically just going to start searching from our left and go to our right, and we're going to keep updating the value of that minV until we get the lowest value.
And right now, we're looking at how this would work if we were operating in the X direction; eventually we would do the exact same thing in the Y direction. That's going to keep animating, and I'm going to start talking a little bit about the source code. So, that's the entire source code for the filter that we're going to want to create; it does the morphological min operation. So first things first, we're going to have three parameters to this function.
The first parameter is
going to be our input image,
second parameter is going
to be our output image,
and the third parameter is going
to tell us how far
we need to search.
We're then going to create a sampler, and we're going to use unnormalized coordinates and clamp-to-edge, because the way this algorithm is designed, we will eventually search outside of the bounds of the image, and we want to make sure that we don't read black, but that we read the value of the pixel on the edge of the image, such that we don't bleed in black and then get an incorrect result.
The next thing we're going to do is ask CL for the global ID in dimensions zero and one, and that's going to tell us, effectively, where we want to write our result out to, and we're also going to use this to determine where we should be reading from in our input image. So, the next thing we do is initialize our minimum value to opaque white, and then we perform our for loop, which searches, in this case, from left to right.
So, we're going to compute the new location where we're reading from, and we're going to do this span-times-two times, and we're going to offset the location by (0.5, 0.5) such that we're reading from the center of the pixel. We're also going to offset the X location by the value of i. And if we do that and then read from the image at that location, we'll get a new value, and we can compare that with our current minimum. We just keep updating that, and when we're done, we write that value out to our output image, and we're done.
And this is going to get run for every pixel, in a very similar fashion to what you would do if you were writing a CIKernel. And although this may look like a relatively naive approach, due to texture caching on GPUs this is about as optimal as it gets, so you can actually perform this at really high speed. We've tried a bunch of other approaches, and this ended up being as fast as it gets. And if you were going to do this, you would also create a kernel that was very similar to this for the Y direction, and all you need to do is change the read location: instead of incrementing i for X, you would increment i for Y, and the rest would remain the same.
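Reconstructed as a sketch from that walkthrough, the X-direction kernel looks roughly like this; it's written in OpenCL C, and the names (morphological_min_x, span) are taken from the description rather than the actual sample code:

```c
// Morphological min in X: for each pixel, take the smallest value
// within +/- span pixels horizontally.
kernel void morphological_min_x(read_only  image2d_t input,
                                write_only image2d_t output,
                                int span)
{
    // Unnormalized coordinates, clamped to the edge so we never read black.
    const sampler_t s = CLK_NORMALIZED_COORDS_FALSE |
                        CLK_ADDRESS_CLAMP_TO_EDGE |
                        CLK_FILTER_NEAREST;

    int2 gid = (int2)(get_global_id(0), get_global_id(1));

    float4 minV = (float4)(1.0f, 1.0f, 1.0f, 1.0f);   // start at opaque white
    for (int i = -span; i <= span; i++) {
        // Offset by 0.5 to sample pixel centers; step in X only.
        float2 coord = (float2)(gid.x + i + 0.5f, gid.y + 0.5f);
        minV = fmin(minV, read_imagef(input, s, coord));
    }
    write_imagef(output, gid, minV);
}
```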
So let's take a look at what we need to do to actually run the CIKernels, I'm sorry, CL kernels. First things first, we're going to give it a character string with the code that we looked at earlier. We're then going to create a CL program from that code, with the context that we created earlier. We're then going to build that program for some number of devices, and then we're going to create two kernels by looking into that program and asking it to look up the morphological min X and morphological min Y kernels. Once we have that, we're pretty much ready to go: all we need to do now is set up some parameters and ask OpenCL to run.
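Those steps correspond roughly to the following host-side sketch; the source string constant and kernel names mirror the kernel sketch above:

```objc
// Sketch: compile the kernel source and look up the two entry points.
const char *source = kMorphologicalMinSource;   // the string shown above
cl_int err = 0;

cl_program program = clCreateProgramWithSource(clContext, 1, &source, NULL, &err);
err = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);

cl_kernel minXKernel = clCreateKernel(program, "morphological_min_x", &err);
cl_kernel minYKernel = clCreateKernel(program, "morphological_min_y", &err);
```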
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
So, as we saw earlier, the CL kernel took three parameters. The first parameter is the input image. The second parameter is our output image; in this case, when we're doing the morphological min operation in the X direction, it's going to be the intermediate image. And our third parameter is going to be the value of how far we want to search in the X direction. Once we've done that, all we need to do is ask OpenCL to enqueue that kernel and run. And so here, we're going to say, run the min X kernel, and we're going to give it some workgroup sizes, and the math for figuring out the optimal workgroup size is in the source code that will be available later on today.
So once we've done our pass in the X direction, we're going to do the exact same thing, but instead of searching in X, we're going to search in Y. We just need to set our input image to be the intermediate image and the output image to be the final output image, and we need to get a new span for Y; we call clEnqueueNDRangeKernel once again with the min Y kernel, and we're done.
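A sketch of both passes, including the flush described next; the command queue and the cl_mem images are assumed to have been created earlier, the span value is illustrative, and passing NULL for the local work size (rather than the tuned sizes the sample computes) lets OpenCL pick one:

```objc
// Sketch: run the X pass, then the Y pass, then flush before handing
// the IOSurface-backed result back to Core Image.
cl_int span = 40;                         // illustrative search distance
size_t global[2] = { width, height };     // one work-item per pixel

clSetKernelArg(minXKernel, 0, sizeof(cl_mem), &inputImage);
clSetKernelArg(minXKernel, 1, sizeof(cl_mem), &intermediateImage);
clSetKernelArg(minXKernel, 2, sizeof(cl_int), &span);
clEnqueueNDRangeKernel(queue, minXKernel, 2, NULL, global, NULL, 0, NULL, NULL);

clSetKernelArg(minYKernel, 0, sizeof(cl_mem), &intermediateImage);
clSetKernelArg(minYKernel, 1, sizeof(cl_mem), &outputImage);
clSetKernelArg(minYKernel, 2, sizeof(cl_int), &span);
clEnqueueNDRangeKernel(queue, minYKernel, 2, NULL, global, NULL, 0, NULL, NULL);

// Make sure all CL work is submitted before Core Image reads the surface.
clFlush(queue);
```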
The last thing we need to do before we hand this off back to Core Image is call clFlush, and the reason we do this is because we want to make sure that all the work from OpenCL has been submitted to the GPU before any additional work gets submitted, such that when we start using the IOSurface inside of Core Image, the data is valid. So this is a really important step; otherwise, you're going to see image corruption. And that's all we need to do with OpenCL.
The next thing we're going to do, once we've got our output image from OpenCL that corresponds to an IOSurface, is create a new CIImage from that IOSurface, and we specify the color space, which is identical to the color space that we used at the very beginning when we asked Core Image to render the down-sampled image.
So, we're almost done. The next thing we're going to do is blur the image, and in order to blur the image, the first thing we're going to do is perform an affine clamp, which is going to basically give us a very similar effect to what we did when we asked for clamp-to-edge, 'cause we don't want to be reading in black pixels when we perform our blur. So, we're going to do an affine clamp, we're then going to call CIGaussianBlur, specify a radius and ask for the output image, and then we're going to crop that back to the original size of the scaled image. So now we have the blurred image that we were looking for, which we can then use for the final difference blending.
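Sketched out, wrapping the output surface and blurring it might look like this; outputSurface, colorSpace, and cropRect refer to the earlier snippets, and the blur radius is illustrative:

```objc
// Sketch: wrap OpenCL's output surface as a CIImage, then clamp, blur, crop.
CIImage *minImage = [CIImage imageWithIOSurface:outputSurface
                                        options:@{kCIImageColorSpace:
                                                  (__bridge id)colorSpace}];

// Clamp to edge (the default transform is identity) so the blur never pulls in black.
CIFilter *clamp = [CIFilter filterWithName:@"CIAffineClamp"];
[clamp setValue:minImage forKey:kCIInputImageKey];

CIFilter *blur = [CIFilter filterWithName:@"CIGaussianBlur"];
[blur setValue:clamp.outputImage forKey:kCIInputImageKey];
[blur setValue:@20.0 forKey:kCIInputRadiusKey];   // illustrative radius

// Crop back to the scaled image's size; the clamp made the image infinite.
CIImage *hazeEstimate = [blur.outputImage imageByCroppingToRect:cropRect];
```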
So, in order to do the difference blending, it's very simple. We just create a filter called CIDifferenceBlendMode. We set the input image to our original input image, which was scaled down in this case. We use the blurred image that we just created from the IOSurface as our background image, and in order to generate the final image, we just call value-for-key and ask for the output image. Once we've done that, the next thing we need to do is call CIContext draw-image with our final image at a certain location, and give it the bounds of what we would like to render, which in this case is equal to the final image's extent.
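The final blend and draw, as a sketch, reusing the variables from the earlier snippets:

```objc
// Sketch: subtract the haze estimate and draw the result.
CIFilter *difference = [CIFilter filterWithName:@"CIDifferenceBlendMode"];
[difference setValue:inputForCL   forKey:kCIInputImageKey];
[difference setValue:hazeEstimate forKey:kCIInputBackgroundImageKey];

CIImage *finalImage = [difference valueForKey:kCIOutputImageKey];

[ciContext drawImage:finalImage
              inRect:finalImage.extent
            fromRect:finalImage.extent];
```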
Now, this can get kind of complicated when you start generating a lot of effects, and in Xcode 5 on Mavericks, you can now hover over things in Xcode and get a description. In this case, you can see, as I'm hovering over an image, that the CIImage is based on an IOSurface and that it's got a certain size. But we've also added something now on OS X Mavericks such that, if you click on the Quick Look icon, you'll actually get a preview of what your CIImage looks like, and we're hoping to have this for iOS 7 as well in the near future.
So, this can really help when
you're debugging your apps.
So, let's take a look
at the effect once more
in action 'cause it was
a lot of blood and sweat.
So, it's worth one
more animation I think.
And in the meanwhile, we're going to talk about a few little caveats. One thing worth noting is that this algorithm performs really well at removing atmospheric haze, to the extent that, if you were actually to run this on an image that had a lot of sky and didn't have any dark data or anything black, no shadows, nothing, it would actually get rid of the sky. So the sky would look black, and that's not terribly interesting.
So, you don't necessarily want to use this one wholesale, but there is a fair amount of literature out there about how this is implemented, and ours is actually pretty quick. You can get really good frame rates, and we're quite pleased with the results.
The other thing is, because atmospheric haze basically accumulates exponentially as a function of distance, if you take the logarithm of those values, you get effectively what corresponds to a depth map. And once you have the depth map, you could do really interesting effects such as refocusing an image afterwards, which is the kind of thing where you can do fake tilt-shift effects, et cetera; we talked about how you could do that with Core Image at WWDC a few years ago.
So, some additional information: Allan Schaffer is our graphics and game technologies evangelist, and you can reach him at aschaffer@apple.com.
There's documentation
at developer.apple.com,
and then of course you can
always go to devforums.apple.com
to talk to other developers
and get in touch with us
if you have any questions.
Related sessions and labs: there are a few additional sessions which you may also want to go back and look at later if you're curious about the technologies that we talked a little bit about earlier today.
And on that note, I would like
to thank you all once again
for coming, and I hope you
enjoy the rest of WWDC.
Thank you.
[ Applause ]