WWDC2017 Session 508

Transcript

[ Applause ]
>> My name is Etienne, and I'm happy to be here today to show you how you can use depth to apply new kinds of effects to your images.
First, we're going to see what
depth is and what it looks like.
Then we're going to see how to
load it and read it from our
image files.
And then we're going to show you
several examples of effects you
can achieve with depth.
And we'll conclude with how to
save depth data.
All right.
So let's get started.
What is depth?
So, to answer that question,
let's start with how we capture
depth.
Depth can be captured only on
iPhone 7+ and only on iOS 11.
iPhone 7+ has a dual camera
system that can be set to
capture two images of the same
scene at the same time and at
the same focal length.
The difference between those two images is called disparity.
So disparity is a measure of the
parallax effect.
It measures how objects that are closer to the camera tend to shift more from one image to the other.
Once we know disparity, we can
compute depth with a simple
formula.
Depth is 1 over disparity.
So, in the remainder of this session we're going to talk about depth or disparity under the broad term of depth data. But remember, they're closely related: one is simply the inverse of the other.
For more information about how
we capture depth I would refer
you to the session on capturing
depth in iPhone photography that
took place yesterday.
All right.
So now that we know what depth and disparity are, let's take a look at what they look like in practice.
And for that, I'm going to call
my colleague Craig to show you
what it looks like.
Craig.
[ Applause ]
>> Thank you, Etienne.
What we're seeing here is an
image that was captured by the
iPhone 7+.
And here is its disparity map.
And as we've learned, disparity
refers to the distance between
two corresponding points that
were captured by the iPhone 7+'s
dual camera system.
Bright areas are closer to the
camera and correspond to higher
disparity values.
Whereas dark areas are farther away from the camera and correspond to low disparity values.
So, let's go back to the image
and look at the disparity map
again.
We can pinch in to zoom in on an
area.
But we have one more trick we
can do with this application.
If I drag my finger across, we
can view the data in 3-D.
[ Applause ]
We can zoom in and really get a
good look at the range of data
that's available to us.
We can rotate all the way around
and look at that again.
And even switch back to the
image data overlaid over top of
it.
So let's look at another image.
Here are some beautiful flowers.
When I zoom in and rotate
around, you see that we need to
fill in those polygons with some
data.
So we just take the image values
and stretch them along there.
This reflects the fact that the depth data is not a good representation for recreating a full 3-D scene.
But this view is still
interesting to look at.
Also, it's important to note
that the disparity map is a
lower resolution than the full image: roughly half a megapixel for the disparity map versus 12 megapixels for the image.
We built this application with SceneKit, so it was really easy to implement.
We took a mesh and then we
transformed the z positions of
the vertices so that the
brighter pixel values were
closer to the camera.
Also, we normalized and remapped
the data so that it made sense
when we viewed it in 3-D.
We'll look at one more image.
On this image, it's interesting
if we look at the disparity map.
And we zoom in and move around a
little bit, we see that we have
a few distinct planes to work
with here.
So, with that, we might get the idea that it would be good to quantize or threshold this depth data before we filter it, for a more dramatic effect.
So with that, I'd like to turn
it back over to Etienne.
[ Applause ]
>> Thank you, Craig.
All right.
So now that we've seen what
depth and disparity looks like,
what kind of effects can we
apply with that data?
So let's take a look.
Here's an example image and its
disparity map.
One effect we can apply is a
depth blur effect.
And this is the effect that you can achieve by capturing with the Camera app in portrait mode.
We can get a bit more creative.
And here's an example where we
apply a different effect to the
background and the foreground.
Here we desaturate the background while increasing the saturation of the foreground to make those flowers stand out.
We can go even further than
this.
And here we are actually dimming the pixels in the background proportionally to the depth.
And so these are just a couple of examples to give you a taste of what you can do with that data.
And we're going to show you how
to do this and more later in the
talk.
Now let's see who could use that depth. Well, of course, if you have an editing application, you can now use depth to apply new kinds of effects to your images. But if you have a camera application, you can also opt in to capture depth and apply the very first depth effect to the images, such as your own depth blur effect, for example.
If you are a sharing
application, you may also want
to take advantage of depth to
apply cool effects before
sharing images.
All right.
But before we can apply any
effects, let's take a look at
how to read depth data and load
it into memory.
So let's take a look.
Depth data is stored in image
files alongside image data in a
section called Auxiliary Data.
Beware that the various image types in the system, such as UIImage, CGImage, or CIImage, do not contain depth information.
You need to access the image
file in order to read the depth
data.
So let's see how to do that.
If you're using PhotoKit, there are a couple of ways you can access the image file. You may be using PHContentEditingInput, for instance.
Here's how you can request the content editing input for a particular PHAsset. And you can access the image file URL from the PHContentEditingInput that way.
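To make that concrete, here's a minimal sketch of the PhotoKit request, assuming `asset` is a PHAsset you already have:

```swift
import Photos

let options = PHContentEditingInputRequestOptions()
options.isNetworkAccessAllowed = true

asset.requestContentEditingInput(with: options) { input, _ in
    // fullSizeImageURL points at the image file on disk,
    // auxiliary depth data included.
    guard let url = input?.fullSizeImageURL else { return }
    // ... create a CGImageSource or CIImage from `url` here ...
}
```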
You may also use PHImageManager. You can ask the PHImageManager to request image data for a particular asset.
And that will give you back a
data object that contains the
file data.
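A sketch of that route as well (this is the iOS 11 era API; `asset` is again an existing PHAsset):

```swift
PHImageManager.default().requestImageData(for: asset, options: nil) { data, dataUTI, orientation, info in
    // `data` contains the entire image file, auxiliary depth data included.
    guard let fileData = data else { return }
}
```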
All right.
So now that we have access to a file, let's see if it contains depth data.
And so we're going to use
ImageIO for this.
We start from an image source
that we create from our image
file.
And then we copy the image
source properties.
This will give you back a
dictionary that looks like this
one.
You want to look for the kCGImagePropertyAuxiliaryData key in that dictionary. The presence of that key tells you that the image file you're working with contains auxiliary data.
You can look at the type of the
data.
And here you can see it's disparity. It could also be depth.
One thing to note here is that the dimensions of the depth data are smaller than the dimensions of the full-size image. This is an image captured by iPhone 7 Plus: the full-size image is 12 megapixels, and the depth data is just under a megapixel.
All right.
So now that we know that we have a file with depth data, let's see how we can read it into memory.
So, it goes like this.
We start with the auxiliary data from the file. And then we create an AVDepthData object, which is an in-memory representation of the depth data. From that object we can access a CVPixelBuffer that contains the depth data. The pixel buffer will be a single channel of data containing either depth or disparity, as 16-bit or 32-bit floating-point values.
All right.
So let's see how to do that in
code.
Again we start from an image source. Next we ask to copy the auxiliary data out of the image source. For that we request a particular auxiliary data type; here we are requesting disparity. And this gives us back a dictionary that contains the auxiliary data.
It can also return nil, which indicates that the image file does not contain auxiliary data of that particular type. So that's another way you can check whether a file contains depth data.
Next, we can create an AVDepthData object from the auxiliary data representation that we got from ImageIO. And that AVDepthData has a couple of properties that you can query.
For example, you can check its native data type. That's a pixel format you can inspect, and if it's not the one you want, you can convert it to a new pixel format. For example, here we ask for 16-bit float disparity, because maybe we're going to use the disparity map on the GPU. This returns a new AVDepthData object in the right format.
So once you have a depth data object that you're happy with, you can access a CVPixelBuffer using the depthDataMap property. Once you have the CVPixelBuffer you can use it directly, or you can work with it using Metal or Core Image.
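Here's a sketch of that whole read path, assuming `url` is the image file URL obtained from PhotoKit as shown earlier:

```swift
import AVFoundation
import ImageIO

func readDisparityData(from url: URL) -> AVDepthData? {
    guard let source = CGImageSourceCreateWithURL(url as CFURL, nil),
          let info = CGImageSourceCopyAuxiliaryDataInfoAtIndex(
              source, 0, kCGImageAuxiliaryDataTypeDisparity) as? [AnyHashable: Any],
          var depthData = try? AVDepthData(fromDictionaryRepresentation: info)
    else { return nil } // nil here means the file has no disparity data

    // Convert to 16-bit float disparity, e.g. for use on the GPU.
    if depthData.depthDataType != kCVPixelFormatType_DisparityFloat16 {
        depthData = depthData.converting(
            toDepthDataType: kCVPixelFormatType_DisparityFloat16)
    }
    return depthData
}

// The single-channel CVPixelBuffer itself:
// let map = readDisparityData(from: url)?.depthDataMap
```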
If you're working with Core Image, there's a convenient way you can load the depth data directly into a CIImage. Here's how to do it.
When you create a CIImage from the contents of a file, you can now specify a new option, such as kCIImageAuxiliaryDepth or kCIImageAuxiliaryDisparity, to tell Core Image to load the depth image instead of the regular image.
Once you have a depth image, you can always go back to the AVDepthData object by calling its depthData property. And keep in mind that you can always convert back and forth between disparity and depth using convenient CI filters such as CIDepthToDisparity and CIDisparityToDepth.
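A sketch of that Core Image route, using the iOS 11 era option keys:

```swift
import CoreImage

// Load the auxiliary disparity map directly as a CIImage.
let disparityImage = CIImage(contentsOf: url,
                             options: [kCIImageAuxiliaryDisparity: true])

// You can get back to the AVDepthData object if you need it...
let depthData = disparityImage?.depthData

// ...and convert between the two representations with the built-in filters.
let depthImage = disparityImage?.applyingFilter("CIDisparityToDepth")
```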
All right.
So now that we've read the depth data out of a file and into an image, we still need to take a couple more steps before we can start editing with it.
If you remember, the depth data
is lower resolution than the
image.
So, the very first thing that
you want to do is to scale it up
to the resolution of the image
that you're working with.
There's a couple ways to do
that.
So, let's take a look.
Here's our example image and its
disparity map.
So, if we scale up, let's say, this small, tiny portion there using nearest-neighbor sampling, you can see that it's very, very pixelated. So at the very least you would want to apply linear sampling to get a smoother result.
You can also use a new CI filter called CIBicubicScaleTransform to get an even smoother result.
However, beware that depth data is not color data. So instead of smoothing, maybe what you actually want is to preserve as much of the image's detail as possible, so that the depth data matches the image more closely.
And you can do this with a convenient CI filter called CIEdgePreserveUpsampleFilter. This filter will upsample the depth data while trying to preserve the edges from the color image.
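For example, a sketch like this, assuming `image` is the full-size CIImage used as the edge guide and `smallDisparity` is the low-resolution disparity map:

```swift
let upsampledDisparity = image.applyingFilter(
    "CIEdgePreserveUpsampleFilter",
    parameters: ["inputSmallImage": smallDisparity])
```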
All right.
Oh, also another thing to be careful of: for all of those resampling operations, we recommend that you use disparity rather than depth, because it will give you better results.
Okay. A couple of things that you may want to do as well. You may want to compute the minimum and maximum values of the depth data, because there are many cases where you need to know those values for the particular effect that you want to apply.
Also keep in mind that the depth
data is not normalized between 0
and 1.
For example, disparity values
can range from 0, which means
infinity, to greater than 1 for
objects that are closer than one
meter away.
Okay. Another thing you can do is to normalize the depth data. Once you know the min and max, you can normalize the depth or disparity to between 0 and 1. That's pretty convenient, first to visualize it, but also if you want to apply your depth effects consistently across different scenes.
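A sketch of that normalization with CIColorMatrix, assuming you've already computed `minDisparity` and `maxDisparity` for the single-channel map:

```swift
func normalized(_ disparity: CIImage,
                min minDisparity: CGFloat,
                max maxDisparity: CGFloat) -> CIImage {
    // Map [min, max] linearly to [0, 1] in the red channel.
    let slope = 1.0 / (maxDisparity - minDisparity)
    return disparity.applyingFilter("CIColorMatrix", parameters: [
        "inputRVector": CIVector(x: slope, y: 0, z: 0, w: 0),
        "inputBiasVector": CIVector(x: -minDisparity * slope, y: 0, z: 0, w: 0)
    ])
}
```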
All right.
So now that we've read our depth data and prepared it for editing, we are ready to start filtering with it. In this section we're going to show you several examples of depth effects you can apply.
We're going to start with simple
background effects that you can
achieve using built in Core
Image filters.
Then we're going to show you a custom depth effect that you can achieve using custom CI kernel code.
Then we're going to show you how
you can apply your own depth
blur effect using a brand new CI
Filter.
And finally, we're going to show
you how to create a brand new
3-D effect using depth.
So let's get started with the
first one.
And for that I'm going to call
my colleague Stephen on stage
for that demo.
Stephen.
[ Applause ]
>> Thank you, Etienne.
Good morning everybody.
My name is Stephen.
And now that Etienne has shown
you how to load and prepare your
depth data, it's my pleasure to
be here to show you a couple of
ways that you can use depth now
to achieve some new and
interesting effects on your
images.
So we're going to jump right in
with a demo.
Okay. What we're looking at here
is -- I'm in the Photos app.
And I'm going to enter Edit here
on this image.
So, this effect is implemented
here in a photo editing
extension.
And now that we've got the rough parts out of the way, we're looking at the original image here, no edits applied.
And I'm going to go ahead and
turn on the effect now.
What you see is that I've
applied a desaturating effect to the image, but only to the background region.
And I can pick a different
background effect to apply.
In this case I've picked just a
flat white image.
And maybe I'm not terribly
satisfied with where that
threshold is between background
and foreground, so I can pick a
new threshold by tapping.
And this is all based on the
depth data.
So let me bring it back here to
the front.
And you can clearly see there's
a pretty sharp boundary between
what's considered background and
foreground.
There's actually a narrow region
in between where we're doing a
little bit of blending between
the two.
And I have control over the size
of that blend region.
I can adjust that by pinching.
And there you can see it looks
like a pretty nice white fog
effect.
We've implemented this effect by
making use of a blend mask.
So let me show you what our
blend mask looks like here.
The black regions of the blend
mask correspond to the
background image.
Solid white is foreground.
And then everything in between
is where we blend between the
two.
So, this is what that blend mask
looks like as I pinch.
We're adjusting the size and
slope of that blend.
Okay, back to the original
image.
As many of you know, there are so many interesting built-in Core Image filters we could choose to apply.
I'm going to show you a couple
of others.
This one is a hexagonal pixelate
filter.
And a motion blur.
Let's say I'm happy with this.
And I'll save it back to my
photo library.
Okay, now let's talk about how
we did this.
As I mentioned, we accomplished
this effect by building a blend
mask.
And so I'll talk to you now
about how we build that blend
mask.
The basic idea is that we're
going to map our normalized
disparity values into values
between 0 and 1 for the mask.
And so we want some high region
of disparity values to map to 1
in our blend mask corresponding
to the foreground.
Some region of low disparity
values to map to 0.
And that will be the background
region.
And then all disparity values in
between will blend with the
linear ramp.
The first part of building this
blend mask is to pick the
threshold between background and
foreground.
So when the user taps on the
image what we do is to sample
the normalized disparity map at
that same location and set that
as our threshold between
background and foreground.
Now I'll show you the code for
what that looks like.
This is all accomplished using built-in Core Image filters, the first of which I'd like to show you here: the CIAreaMinMaxRed filter.
This filter, when you render it
into a single pixel will return
the minimum and maximum values
of the image within the region
that you specify.
Here we're passing in the small
rect that the user tapped on.
The other thing to note about this line is that before we apply the effect, we clamp the disparity image to ensure that if the user taps near the boundary of the image we won't sample any clear pixels outside the boundary.
On this line we're simply
allocating a 4-byte buffer large
enough to store a single pixel.
And we render into that pixel on
this line.
Note that we're passing in nil
as our color space.
And this tells Core Image that
we don't want it to do any color
management for us.
Finally, we read the maximum disparity value out of the pixel's green channel and remap it to the range of 0 to 1 by dividing by 255.
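Here's a sketch of that sampling step, assuming `normalizedDisparity` is the prepared disparity image, `tapRect` is a small rect around the tap point, and `context` is your CIContext:

```swift
let minMax = normalizedDisparity
    .clampedToExtent() // avoid sampling clear pixels near the image boundary
    .applyingFilter("CIAreaMinMaxRed",
                    parameters: [kCIInputExtentKey: CIVector(cgRect: tapRect)])

// Render into a single RGBA8 pixel, with no color management (colorSpace: nil).
var pixel = [UInt8](repeating: 0, count: 4)
context.render(minMax,
               toBitmap: &pixel,
               rowBytes: 4,
               bounds: CGRect(x: 0, y: 0, width: 1, height: 1),
               format: kCIFormatRGBA8,
               colorSpace: nil)

// The maximum disparity lands in the green channel.
let threshold = CGFloat(pixel[1]) / 255.0
```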
The other input the user has
control over is the size and
slope of the blend region.
So as the user is pinching on
the view, we're adjusting the
size and slope accordingly.
And this mapping is the result of applying a CIColorMatrix filter, which I'll show you in a second. But then we also apply a CIColorClamp to ensure that the values remain within the range of 0 to 1.
So here's the code. First we apply our CIColorMatrix filter. Its inputs are essentially the slope and the bias that were selected by the user by tapping and pinching. And then on a single line we apply the CIColorClamp filter.
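As a sketch, with `slope` and `bias` assumed to come from the user's pinch and tap:

```swift
let mask = normalizedDisparity
    .applyingFilter("CIColorMatrix", parameters: [
        "inputRVector": CIVector(x: slope, y: 0, z: 0, w: 0),
        "inputBiasVector": CIVector(x: bias, y: 0, z: 0, w: 0)
    ])
    .applyingFilter("CIColorClamp") // keep the mask values in [0, 1]
```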
Now that we've built the blend mask, the rest is straightforward.
What you see on the left is the
original image.
And on the right you see that
image with the background effect
applied.
When we apply the blend mask to
the original image, the
background region disappears.
And when we blend that together
with the background image, we
get our final effect.
One more slide of code.
Here's where we apply our
background filter.
This could be any filter you
choose.
There are many built into Core
Image.
You could write your own.
And then we apply the CIBlendWithMask filter, passing in both the background image and the mask.
And that's it.
That's how we accomplish this effect using a suite of built-in Core Image filters.
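Put together, the composite might look like this sketch, with CIPhotoEffectMono standing in for whichever background filter you choose:

```swift
let background = originalImage.applyingFilter("CIPhotoEffectMono")
let output = originalImage.applyingFilter("CIBlendWithMask", parameters: [
    kCIInputBackgroundImageKey: background,
    kCIInputMaskImageKey: mask
])
```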
And next I'd like to jump in and
show you another demo.
In the previous one, we used depth data indirectly, right? We used it to build a blend mask. For this one, we're going to use disparity in a more direct fashion.
And I'm going to pull up this
other editing extension here to
show it to you.
Okay. Here we are in photos
again.
Let's pick this next extension.
There we go.
Original image.
No edits applied just yet.
I have a slider at the bottom,
though.
And I'll start to move that now.
And you can see that the
background fades to black,
leaving us just this prominent
foreground figure.
That's a really nice effect,
wouldn't you say?
Let's save that back to the
library and show you how this
one's done.
What we're doing is we're
mapping our normalized disparity
values into a scale value, which
we then apply directly to our
pixels.
And for this particular effect we map our disparity values through an exponential function. When we start off, we raise our normalized disparity values to the power of 0, which maps all of our scale factors to 1, producing no effect on the output image.
When we raise the power to 1 as
we move the slider over, this is
effectively the same thing as
scaling our pixel intensities by
the inverse of the depth.
Because we're scaling by
disparity directly.
The effect becomes more
interesting as we start to raise
it to higher and higher powers.
As you can see, the shape of the
curve becomes such that there's
a sharper distinction between
background and foreground with
the background quickly going to
black.
So I'm going to show you the
code for this effect in just one
slide.
We've implemented this effect as a custom CIColorKernel. And there are a couple of notable advantages to using a custom CIColorKernel. One is performance: if you can express your effect as a custom CIColorKernel, Core Image can optimize it by concatenating your kernel into its render graph, skipping any potentially costly intermediate images along the way.
The other nice thing is that Core Image allows us to pass in multiple input images. Core Image will automatically sample those for us and pass the sampled values in as parameters to our kernel function, which you see here.
The first parameter is a sample
from the original image.
The second is a sample from the
normalized disparity map.
And then third is the power
selected by the user moving the
slider.
The first thing we do with the normalized disparity is raise it to a power, as I mentioned. That gives us our scale factor.
Then we take our scale factor
and apply it to the intensity of
the pixel while preserving the
original alpha value.
This last line is a line of
Swift code illustrating how we
can apply our custom kernel to
our original image once it's
been constructed from the source
code you see above.
We pass in our image extent as
well as a list of arguments.
Here these are the original
image, the normalized disparity
map, and the power selected by
the user.
Note that these arguments
correspond one to one with the
parameters defined in our kernel
signature.
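Here's a sketch of what that kernel and its application can look like; the kernel body and names here are illustrative rather than the exact slide code:

```swift
let kernelSource = """
    kernel vec4 darken(__sample image, __sample disparity, float power) {
        float scale = pow(disparity.r, power);  // scale factor from disparity
        return vec4(image.rgb * scale, image.a); // preserve the original alpha
    }
    """
let kernel = CIColorKernel(source: kernelSource)!

// Arguments correspond one to one with the kernel's parameters.
let output = kernel.apply(extent: image.extent,
                          arguments: [image, normalizedDisparity, power])
```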
Okay. So that's it.
I've just shown you how to use a custom CIColorKernel to produce this really nice background-darkening effect. And hopefully it gives you some ideas of other things you can do with custom CIColorKernels combined with depth to produce some nice effects.
So now I'm going to invite my
colleague Alex up onto the stage
to show you something brand new
in Core Image.
Alex.
[ Applause ]
>> Thank you, Stephen.
Good morning everyone.
My name is Alexandre Naaman, and I'm really excited to be here today to talk to you about a new Core Image filter. As you know, in iOS 10, using the iPhone 7 Plus, you could capture images with depth using the Camera app in portrait mode. Now, with iOS 11 and macOS High Sierra, we're enhancing those capabilities by letting you use those exact same algorithms via a new Core Image filter called CIDepthBlurEffect.
So, now let's try and switch
over to a demo and see how that
works in real life.
Yay! All right.
So here we have an asset, a
photo that was taken with depth.
And we're just viewing the image
without having applied any
filters to it.
If I tap on this once, we can
see what the disparity data
looks like.
I'm going to tap on it once more and we'll go back to the main image.
And if I tap once more, we're going to see what happens when we use those two images together with the new CIDepthBlurEffect filter to create a new rendered result.
We should see the background get
blurry.
Yay. So now, in addition to just
applying the filter as is, there
are many tunable parameters that
we can set.
And inside of this application
I've set things up such that it
will respond to a few gestures.
So, if I now, for example,
pinch, we can dynamically change
the aperture and get a new
simulated look.
And so we can simulate any lens
opening that we would like quite
simply.
Another gesture I've set up in
this application is such that
when we tap at a different
location it's going to change
the focus rectangle.
And so we can see right now the
aperture is quite wide open.
And only the lady in the front
is in focus.
But if I tap on the lady on the
left, all of a sudden now she's
in focus.
The background is a little less
blurry.
And the gentleman on the right
is still a little blurry.
And I can tap on him now and
change the focus rect once
again.
And now they're all three in
focus and the background still
remains blurry.
Now that we're done with our
demo, let's go and look at how
this happens --
[ Applause ]
-- in terms of code.
Okay. So as I was mentioning, at its base the CIDepthBlurEffect really just has two inputs: the input image and the input disparity image.
And with those two, Core Image
is going to extract a lot of
metadata for you and apply the
effect in order to render a new
image.
Internally, however, there are
many parameters that you can
set, as I was mentioning
earlier.
We already know that you can set
the input image and the input
disparity image.
And in the case of the
application that we were looking
at earlier, when we were
tapping, we were setting the
input focus rect.
And then as we were pinching, we
were setting the aperture.
So now that we have an idea of
how we want to do this from a
conceptual standpoint, let's
take a look at how this will be
done in terms of code.
And this is effectively my only
slide that has any code on it,
so you can see how simple it is
to use.
As we saw earlier, you can load a CIImage via URL quite simply. This gives us the main image. And then in order to get the disparity image, all you need to do is use that same URL and ask for the auxiliary disparity information via the options dictionary, as Etienne mentioned.
Once we have our two images, we
can create a filter.
And we do that by name: CIDepthBlurEffect.
And then we specify our two
images.
Once that's done, we can ask for the output image via .outputImage. And we have a new CIImage that we can render in any number of ways.
And it's important to remember that a CIImage is really just a recipe for how to render, so it's actually quite a lightweight object.
In the case of the application
that we were looking at earlier,
all we had to do in order to
render a new effect with a new
look was to change two values.
So in this case, we changed the input aperture. We do that by calling setValue(_:forKey:) on the filter, specifying a float value between 1 and 22 to create a new simulated aperture.
And we specified a new rectangle to focus on via the inputFocusRect key, which corresponds to a lower-left-based rectangle in normalized coordinates.
Once those two things are done,
we can ask that filter for a new
output image and then render as
we wish.
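Putting those pieces together, a sketch of the whole flow might look like this; the parameter key strings follow the descriptions given in this session:

```swift
let image = CIImage(contentsOf: url)!
let disparity = CIImage(contentsOf: url,
                        options: [kCIImageAuxiliaryDisparity: true])!

let filter = CIFilter(name: "CIDepthBlurEffect")!
filter.setValue(image, forKey: kCIInputImageKey)
filter.setValue(disparity, forKey: "inputDisparityImage")

// Optional tuning, as in the demo:
filter.setValue(11.0, forKey: "inputAperture") // simulated aperture, 1...22
filter.setValue(CIVector(x: 0.5, y: 0.3, z: 0.2, w: 0.2),
                forKey: "inputFocusRect")      // normalized, lower-left origin

let output = filter.outputImage
```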
Now as I was mentioning, Core
Image does a lot of things for
you automatically by examining
the metadata.
There are, however, a few things
you can do in order to further
enhance the render that we don't
do for you automatically.
And those relate to finding
facial landmarks.
And you can use the new Vision framework to do this.
So, via the Vision framework you can get the left eye positions, right eye positions, nose positions, and chin positions. And you can specify up to four faces to be used inside of the CIDepthBlurEffect.
In the case of this image, because there are three faces, we would actually specify six floating-point values in a CIVector and set that for each landmark that we found. The values come in pairs, so it would be xy, xy, xy.
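As a sketch for three faces, where the landmark key name and the coordinate values are assumed for illustration:

```swift
// Interleaved x/y pairs in one CIVector: x0, y0, x1, y1, x2, y2.
let leftEyePositions = CIVector(values: [0.20, 0.55,
                                         0.48, 0.60,
                                         0.76, 0.58], count: 6)
filter.setValue(leftEyePositions, forKey: "inputLeftEyePositions")
```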
The next thing I'd like to talk
to you a little bit about is
dealing with rendering outputs
of different sizes.
Although the inputs are quite
large, 12 megapixels, chances
are you won't often be rendering
the entire image.
And you may want to downsample the output. Your initial reflex may be to just downsample the result of the CIDepthBlurEffect. But that's actually not very efficient, because the CIDepthBlurEffect is quite computationally expensive.
It makes more sense, instead, to downsample the input. If you do this, we can take advantage of the fact that the input image is smaller and perform some optimizations.
In order to do this, however, you do have to set one more parameter, called inputScaleFactor. So in this case, if we wanted to downsample the image by 2, we would set the input scale factor to 0.5.
And doing so ensures that we
sample appropriately from the
image and also take into account
other effects such as the noise
in the image.
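Continuing the earlier sketch, downsampling by 2 might look like this:

```swift
let smallImage = image.applyingFilter("CILanczosScaleTransform",
                                      parameters: [kCIInputScaleKey: 0.5])
filter.setValue(smallImage, forKey: kCIInputImageKey)
filter.setValue(0.5, forKey: "inputScaleFactor") // tell the filter about the scale
```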
There are a few additional things that I'd like to mention about using the CIDepthBlurEffect which are important to keep in mind. First and foremost, when you create the CIContext where you'll be using these filters, you want to make sure that you're using half-float intermediates. You can do this by specifying the kCIContextWorkingFormat to be RGBAh. On macOS this is the default, but on iOS, by default, we use 8-bit.
And if you don't, you will see that the data gets clipped, because disparity data comes in extended range; without this setting, the result won't look very good. So it's really important to remember this when you use this filter.
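A minimal sketch of creating such a context:

```swift
// Half-float intermediates keep extended-range disparity values from clipping.
// (This is the default on macOS, but not on iOS.)
let context = CIContext(options: [kCIContextWorkingFormat: kCIFormatRGBAh])
```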
Also, as I mentioned earlier, the CIDepthBlurEffect will automatically set many properties on the filter for you.
In order to do so, it will examine the metadata from the main image as well as the data that exists inside of the auxiliary image.
And so it's important to try to
preserve that throughout your
pipeline.
Core Image will do its best in
order to ensure that this
happens.
But it's something you're going
to want to keep in mind as
you're using this filter.
And Etienne's going to talk to
you a little bit later about how
to ensure that you do this when
you save images.
All right.
Well, the last thing I'd like to talk to you about today has to do with some internals of the CIDepthBlurEffect. As has been mentioned many times already today, the main image and the disparity image are of very different resolutions.
And now internally, Core Image
is going to up sample the
disparity image up to a certain
point in order to achieve the
final result.
And this is an area where we
feel like if you have additional
processing time you could
perhaps do something a little
different.
Maybe some of the methods that
Etienne spoke of earlier.
And that concludes pretty much everything I wanted to tell you about using the CIDepthBlurEffect.
And I hope you all go and start
adding it to your apps.
And on that, I'm going to hand
it back over to Etienne.
Thank you very much.
[ Applause ]
>> Thank you, Alex.
All right.
So far, we've seen cool new effects that you can create using depth data. But the depth data was really used as a mask to apply different effects to different parts of the image.
And so now we want to show you
something different.
Something that actually uses
depth as a third dimension.
And this will give you a taste
of what kind of new creative
effects you could apply using
this data.
And to tell you all about it,
I'm going to invite Stephen back
on stage.
Stephen.
[ Applause ]
>> Thank you, Etienne.
It's good to be back with you.
What I'm going to show you now
is a true 3-D effect.
And this particular effect that
we're going to show you is
called dolly zoom.
Many of you are probably already
familiar with what dolly zoom
is, especially if you've ever
seen a scary movie.
But to get everybody up to speed
a little bit, I'm going to show
you a little animation of what's
going on with dolly zoom.
So what you're looking at here
is a scene consisting of three
spheres.
While the camera is dollying back and forth, the field of view is simultaneously constrained so that the gray sphere in the center, on the focal plane, remains at roughly the same size throughout the effect in the output image, which you see on the right.
Everything else in the scene
will move around due to
perspective effects.
So let's take a look.
Let's switch over here to the
device.
Perfect. All right.
Let me pull up the dolly zoom
editing extension.
And I'll draw your attention now
to the group of flowers in the
center of the image.
Those are on the focal plane.
So as I begin to move the
camera, there you see the dolly
zoom effect in its full glory.
When I pull the camera in this
direction in particular you can
really see the true 3-D nature
of this effect with the
foreground flowers really
popping out and the background
sort of fading or pulling away
from the camera.
You do also see a couple of
artifacts, of course.
One of which are the black
pixels that you see coming into
view around the background.
This is due to the fact that in
the camera's current
configuration, the virtual
camera, its field of view is
wider than the iPhone that
captured the image.
And so the virtual camera sees
more of the scene than the
iPhone did at the time of
capture.
So we're just filling those
pixels in with black.
Similarly, the stretching you
see in between the foreground
flowers and the green leaves
behind them is due to the fact
that this camera, the virtual
camera, has exposed some
portions of the scene that
weren't visible to the iPhone at
the time of capture.
One strategy you might take to
work around some of these issues
is to set a new focal plane.
So, now I've tapped on the
yellow flower in the foreground,
which is in the bottom right
corner of the image to set that
as the focal plane.
And as I move the camera now in
this direction you can see that
none of the black pixels are
coming into view.
Of course if I move the camera
again in this direction, they
show up again.
And really, that 3-D effect is
quite strong here.
Correspondingly, I can tap on a
background region of the image,
such as the trees you see in the
upper left corner.
And when I pull the camera now
in this direction, it really
produces a pleasing sort of
prominent effect on that central
group of flowers.
So let's take a look now at how
we implemented this.
Because of the true 3-D nature of this problem, we turned to Metal to produce the effect. We were able to get our system up and running quite quickly in Metal because of all the work that it does for us.
Basically all we had to do to
start off with was to construct
a triangular mesh that we mapped
onto the image, much like what
you saw in Craig's Depth
Explorer demo at the beginning
of the session.
And we center the image around the origin.
Metal also gives us the
opportunity to program a couple
of stages of its pipeline, one
of which is the vertex shader.
The vertex shader gives us the opportunity to process the geometry of the scene in some way.
And we can also program the
fragment shader, which gives us
the opportunity to produce a
color for each pixel in the
output.
We were able to reintegrate all
of this 3-D Metal rendering back
into our Core Image pipeline by
using a CIImageProcessorKernel.
So here's the code for the
vertex shader.
Again, the job of the vertex
shader is to process the
geometry.
And it does so one vertex at a
time.
So we get one vertex of the
original mesh as input.
And then we'll produce something
new on the output.
The first thing we do in the
vertex shader is sample the
depth at that vertex.
That's this line you see here.
We're storing it in a variable
called z, which will get used in
a couple of places in this
shader.
The first of which is this
magical line right here.
This line is the line that every
young engineer grows up dreaming
that they'll write some day.
Because this is where we do the
math.
There are three variables as
input to this equation.
One is the depth, which we just
sampled above.
And the others correspond to the
user inputs of the focal plane
and the camera's configuration.
This produces a scale factor,
which we can apply to our
vertices, which we do on this
line right here.
And we can apply a scale factor
to it because the vertices are
centered around the origin.
So this scale factor serves to
move vertices either radially
away from the center or toward
the center of the image.
And this is what produces the
illusion of three dimensions.
Once we have transformed our
vertex position, we output it
here in the new output vertex
while preserving the original
depth value, z.
And this is important because it
will get passed into the z
buffer machinery of Metal, which
will then just do the right
thing for us as pixels move
around in the output and start
to overlap each other.
Also, we output the texture
coordinate of the original
vertex.
This will get used by the
fragment shader, which I'll show
you now.
Remember, the fragment shader's
job is to produce a color pixel
output.
And since Metal interpolates all
of these texture coordinates for
us, all we have to do in our
fragment shader is to sample the
original image at the
interpolated texture coordinate.
And that's it.
That's really all the code you
need to see to implement the
dolly zoom effect.
And hopefully it's given you
some ideas of new directions you
can take this in to produce your
own brand new custom 3-D
effects.
We're really excited to see what
you come up with.
And now, I'll hand the stage
back over to Etienne to finish
up.
[ Applause ]
>> Thank you, Stephen.
All right.
So now that we've applied
various new effects to our
images, there's one more step
that we need to take.
And that's saving your depth data.
So, you should always preserve
the depth data.
All right?
That way your users will be able
to use other apps like yours to
apply their own depth effects on
top of yours.
Even if you don't use the depth
data, you should always preserve
it if it was present in your
original image.
This will really ensure the best
possible experience for your
users.
However, when you store the depth data, be sure to match the geometry of the image data. If you don't apply the geometry correctly, the depth data will no longer match the image data, and any further depth effect applied on top of it will no longer work properly.
So let's take a look at the kinds of geometry transforms you might apply to depth data.
A very common operation is
orientation.
Oftentimes you get to work with a portrait image that was actually shot in landscape and has an EXIF orientation. So the depth data may look like this, and you want to make sure to orient the depth as well. So make sure to apply orientation.
Another very common operation is
crop.
Right? And so again, make sure
you crop the depth data to
match.
Now, you may have a more advanced transform that you also apply to your image, such as an affine transform like this one. Or maybe you have a custom transform such as a perspective transform, or maybe even a 3-D transform like the one we saw in the dolly zoom demo.
In any case, you want to also
apply the same transform to the
depth data so that it matches
the image perfectly.
Okay. So the key thing to
remember here is to apply your
transform at the native
resolution of the depth data.
So you want to scale your transform parameters from the full-size image down to the size of the depth image. Otherwise your transform will be applied incorrectly, and the depth image will no longer match the output image.
Another thing to note is that
depth data is not color data.
So when you are rendering a new depth image, make sure you don't apply any kind of color matching to it.
All right.
So now that we've seen what kinds of transforms we may apply to depth data, we can render it into a new CVPixelBuffer. Once we have a new CVPixelBuffer, we can create a new AVDepthData object from it.
Here's how.
We start from our original depth data object, and then we call its replacingDepthDataMap(with:) method, passing in our newly rendered depth buffer. That returns a new AVDepthData object, which we can then save into our output image.
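A one-line sketch, assuming `renderedDisparity` is the CVPixelBuffer you just rendered (note that the method throws):

```swift
let newDepthData = try originalDepthData.replacingDepthDataMap(with: renderedDisparity)
```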
Let's take a look at how to write depth data using ImageIO. We start from an image destination that we create for our output file, and here we ask for JPEG format. Please note that not all image formats support depth, but JPEG does.
Next we add our output image to
the image destination.
Then we ask the AVDepthData object that we want to store in the image for a dictionary representation of the auxiliary data to store in the file. This returns the dictionary for the auxiliary data, as well as, by reference, the type of the auxiliary data to store.
Then we ask the CGImageDestination to add that auxiliary data, passing in the type and the dictionary.
And finally, all we have to do is call CGImageDestinationFinalize to write everything to disk.
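Here's a sketch of that ImageIO write path; `outputURL`, `outputCGImage`, and `newDepthData` are assumed from the previous steps:

```swift
import AVFoundation
import ImageIO
import MobileCoreServices

func write(image outputCGImage: CGImage,
           depth newDepthData: AVDepthData,
           to outputURL: URL) {
    guard let destination = CGImageDestinationCreateWithURL(
        outputURL as CFURL, kUTTypeJPEG, 1, nil) else { return }

    CGImageDestinationAddImage(destination, outputCGImage, nil)

    // Ask AVDepthData for its auxiliary-data dictionary and, by reference, its type.
    var auxType: NSString?
    if let auxInfo = newDepthData.dictionaryRepresentation(forAuxiliaryDataType: &auxType),
       let type = auxType {
        CGImageDestinationAddAuxiliaryDataInfo(destination,
                                               type as CFString,
                                               auxInfo as CFDictionary)
    }

    _ = CGImageDestinationFinalize(destination)
}
```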
If you're working with Core
Image, there's a new very
convenient way you can do this
as well.
So, if you are using CIContext's writeJPEGRepresentation method for a particular CIImage, in order to render and save directly to a JPEG file, you may now pass in, via an option key, a depth data object that you want to store as part of that file.
Even better, if you have a depth image -- let's say you have applied a transform to it -- you can also specify it as an option to that method, so that Core Image will render both the regular image and the depth image and save everything to the file in one call.
Very convenient.
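A sketch of that one-call version; the depth-data option key here (kCIImageRepresentationAVDepthData) is the iOS 11 era constant, and `outputImage`, `outputURL`, and `newDepthData` are assumed from the steps above:

```swift
try context.writeJPEGRepresentation(
    of: outputImage,
    to: outputURL,
    colorSpace: CGColorSpace(name: CGColorSpace.sRGB)!,
    options: [kCIImageRepresentationAVDepthData: newDepthData])
```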
And so that's it.
That concludes our session on
editing images with depth.
So, let's recap what we've
learned today.
We've learned what depth is and what depth and disparity look like.
We've learned how to read and
prepare depth data for editing.
And then we showed you several ways to apply new depth effects to your images.
The first one was background effects using built-in Core Image filters. Then we had a custom darkening effect using a custom Core Image kernel. Then we showed you how you can apply your own depth blur effect using a new CI filter. And then we saw how you can create a brand new 3-D effect using depth.
I hope that this session will inspire you to use depth data in your own applications.
And I can't wait to see what
cool effects you'll come up
with.
For more information, please go to the session page on developer.apple.com.
We have a couple related
sessions.
There's a session on Advances in
Core Image that's going to take
place later today.
So we strongly encourage you to
go there.
Check it out.
And there were a couple of sessions yesterday covering PhotoKit and also how to capture depth with iPhone.
And with that, I hope that you
have a good rest of the WWDC.
Thank you very much.