Transcript
[ Music ]
>> All right [applause].
Thank you [applause].
Good afternoon everyone and
thank you for coming to our
session today on Core Image.
My name is David Hayward, and
I'm really excited to be talking
about the great new performance
and prototyping features our
team has been adding to Core
Image over the last year.
We have a lot to talk about, so
let's get right into the agenda.
So, the first thing we're going to be talking about today is some great new APIs we've added
to Core Image to help improve
the performance of your
applications.
After that, we're going to segue into another topic, which is how you can use Core Image to help prototype new algorithms.
And lastly, we're going to be
talking about how you can use
Core Image with various machine
learning applications.
All right.
So, let's get into this and
start talking about performance
APIs.
There are two main areas where
we've worked on performance this
year.
First of all, we've added some
new controls for inserting
intermediate buffers -- we'll
talk about that in some detail.
And the second thing is we'll be
talking about some new CI kernel
language features that you can
take advantage of.
So, let's start by talking about
intermediate buffers.
As you are aware, if you've used
Core Image before, Core Image
allows you to easily chain
together sequences of filters.
Every filter in Core Image is
made up of one or more kernels.
And one of the great features
that Core Image uses to improve
performance is the ability to
concatenate kernels in order to
minimize the number of
intermediate buffers.
In many cases, to get the best
performance you want to have the
minimum number of buffers.
However, there are some
scenarios where you don't want
to concatenate as much as
possible.
For example, your application
might have an expensive filter
early on in the filter chain.
And the user of your application
at a given moment in time might
be adjusting a filter that
follows it in the graph.
And this is a classic situation
where it's a good idea to have
an intermediate buffer at a
location like this in between.
The idea is that by having an
intermediate buffer here, the
cost of the expensive filter
does not have to be paid for
again when you adjust a
secondary filter.
So, how do you do this in your
application?
We have a new API, very aptly named insertingIntermediate().
So, let's talk about how this affects our results. What we do is, instead of concatenating as much as possible, we will respect the location of the intermediate and concatenate as much as possible around it.
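To make that concrete, here's a minimal Swift sketch; the filter names, parameters, and the inputImage variable are illustrative assumptions, but insertingIntermediate() is the API being described.

```swift
import CoreImage

// A minimal sketch, assuming inputImage is a CIImage, noise reduction is the
// expensive filter, and the user is interactively adjusting saturation.
let expensive = inputImage.applyingFilter("CINoiseReduction")

// Pin an intermediate here: Core Image concatenates kernels on either side
// of this point, but not across it, so the expensive result can be reused.
let pinned = expensive.insertingIntermediate()

let output = pinned.applyingFilter("CIColorControls",
                                   parameters: [kCIInputSaturationKey: 1.2])
```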
Some notes on this.
One thing to keep in mind is that, by default, Core Image caches all intermediate buffers, on the assumption that a subsequent render can be made as fast as possible.
There are times, however, when you might want to turn off caching of intermediates.
So, for example, if your application is going to be doing a batch export of 100 images, there is little benefit in caching the first one, because you'll be rendering a completely different image afterwards.
So, you can do that today in your application by using the context option cacheIntermediates and setting that value to false.
However, if you are also using the new insertingIntermediate API that we just spoke about, you can still turn on caching for that one intermediate, even if this context option is turned off. So, this allows you to make sure that we cache exactly what you want and nothing else.
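Here's a hedged sketch of that combination; `expensive` is assumed from the earlier sketch.

```swift
import CoreImage

// Batch-export scenario: turn off caching for the whole context...
let context = CIContext(options: [.cacheIntermediates: false])

// ...but still cache this one intermediate; the per-image setting wins
// over the context option.
let pinned = expensive.insertingIntermediate(cache: true)
```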
The next subject I'd like to talk about is some new features we've added to the kernel language that let you write faster image processing kernels.
So, one thing to keep in mind is
that we have two different ways
of writing kernels in Core
Image.
The traditional way is to use
the CI kernel language.
And in this case, you have a string inside your source file -- either your Swift code or your Objective-C code -- and at run time you make a call to, say, kernel(withSource:).
And later on, when you create an
image based on that kernel, you
can then render that to any type
of Core Image context, whether
that context is backed by Metal
or OpenGL.
When it comes time to render,
however, that source needs to be
translated.
It needs to be translated either
to Metal or GLSL, and that step
has a cost.
Eventually then, that code is
compiled to the GPU instruction
set and then executed.
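As a rough sketch, that traditional path looks like this in Swift; the kernel itself is an illustrative color invert, not code from the session, and inputImage is an assumed CIImage.

```swift
import CoreImage

// CI Kernel Language source lives in a string and is translated and
// compiled at render time.
let invertKernel = CIColorKernel(source:
    "kernel vec4 invert(__sample s) { return vec4(1.0 - s.rgb, s.a); }"
)!

// The resulting image can render to any CIContext, Metal or OpenGL backed.
let output = invertKernel.apply(extent: inputImage.extent,
                                arguments: [inputImage])
```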
Starting last year in iOS 11, we
added a new way of writing CI
kernels, which has some
significant advantages.
And that's CI kernels based on
the Metal shading language.
In this case, you have your source in your project, and this source is compiled at build time rather than at runtime. As before, you instantiate a kernel based on this code, by passing the Metal function name and the compiled binary data.
The advantage here is that this
data can be applied without
paying the cost of an additional
compile.
The caveat, however, is that it only works on a Metal-backed CIContext.
But it gives a big performance
advantage.
So, starting in this release
we're going to be marking the CI
kernel language as deprecated,
because while we will continue
to support this language, we
feel that the new way of writing
Metal kernels offers a lot of
advantages to you, the
developer.
For one thing, you get the performance advantage I outlined earlier, but it also gives you the advantage of build-time syntax coloring on your code and great debugging tools when you're working with your Metal source.
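Here's a hedged sketch of the runtime side of that workflow; it assumes your CI kernel source was compiled at build time (with the -fcikernel compiler flag and the -cikernel linker flag) into the app's default.metallib, containing a function named myKernel.

```swift
import CoreImage

// Load the Metal library that was compiled at build time.
let url = Bundle.main.url(forResource: "default", withExtension: "metallib")!
let data = try Data(contentsOf: url)

// No runtime translation step: the kernel is created from binary data.
// "myKernel" is a hypothetical function name.
let kernel = try CIKernel(functionName: "myKernel", fromMetalLibraryData: data)
```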
So, great.
[ Applause ]
So, with that in mind I want to
talk about a few other things
that we've added to our kernel
language.
For one thing, we have added
half float support.
There are a lot of cases when
your CI kernel can be perfectly
happy with the precision that
half float gives you.
If you're working with RGB color
values, half float precision is
more than adequate.
The advantage of using half
floats in your kernel is it
allows operations to run faster,
especially on A11 devices like iPhone X.
Another advantage of using half floats in your kernels is that they use smaller registers, which increases GPU utilization, which also helps performance.
Another great feature we've added to the kernel language this year is support for group reads.
This gives your shader the
ability to do four
single-channel reads from an
input image with only one
instruction, so this really can
help.
And as a complement to that, we
also have the ability to write
groups of pixels.
This gives you the ability to
write four pixels of an image
with just one call inside your
shader.
So, all three of these features
can be used in your shaders to
give great performance
improvements.
So, let me talk a little bit
about an example of how that
works.
So, imagine today you have a
simple 3 by 3 convolution kernel
that is working only on one
channel of an image.
This is actually a fairly common
operation, for example, if you
want to sharpen the luminance of
an image.
So, in a kernel like this, typically, each time your kernel is invoked, it is responsible for producing one output pixel.
But, because this is a 3 by 3
convolution, your kernel needs
to read 9 pixels in order to
achieve that effect.
So, we have 9 pixels read for
every one pixel written.
However, we can improve this by
making use of the new group
write functionality.
With the new group write functionality, your kernel can write a 2 by 2 group of pixels in one invocation.
Now, of course this 2 by 2 group
is a little bit bigger, so
instead of a 3 by 3, we need to
have a 4 by 4 set of pixels read
in order to be able to write
those four pixels.
But, if you do the math, you'll
see that that means we have 16
pixels read for 4 pixels
written.
So, already we're seeing an
advantage here.
The other feature we have is the
ability to do gathers.
In this example, we're reading a 4 by 4 group, or 16 pixels. And with this feature, we can do those 16 pixel reads with just four instructions.
So again, if you look at the
math on this, this means we're
doing just 4 group reads for
every 4 pixels written.
And this can really help the
performance.
Let me walk you through the
process of this on actual kernel
code.
So, here's an example of a
simple convolution like the one
I described.
Here, what we're doing is making 9 samples from the input image, using only the red channel. Then, once we have those 9 values, we average them and write them out in the traditional way, by returning a single vec4 pixel value.
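As a rough reconstruction -- the kernel name and exact structure are assumptions based on the description -- that CI Kernel Language source might look like this, embedded in a Swift string:

```swift
import CoreImage

// Nine single-channel reads, averaged, written out by returning one vec4.
let source = """
    kernel vec4 convolve3x3(sampler s)
    {
        vec2 d = destCoord();
        float sum = 0.0;
        for (int j = -1; j <= 1; j++)
            for (int i = -1; i <= 1; i++)
                sum += sample(s, samplerTransform(s, d + vec2(float(i), float(j)))).r;
        float avg = sum / 9.0;  // average the 9 values
        return vec4(avg, avg, avg, 1.0);
    }
    """
let kernel = CIKernel(source: source)
```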
Now, the first step to make this faster is to convert it to Metal. This is actually quite simple.
So, we start with code that
looks like this, which is our
traditional CI kernel language.
And with, effectively, some search and replace in your code, you can update this to the new Metal-based CI kernel language.
There are a couple of things that are important to notice here.
We have added a destination
parameter to the kernel, and
this is important if you're
checking for the destination
coordinate inside your shader,
which a convolution-like kernel
like this does.
And then we're using the new, more modern syntax to sample from the input, by calling s.sample and s.transform.
And the last thing we've done when updating this code is to change the traditional vec4 and vec2 parameter types to float4 and float2.
But as you can see, the overall
architecture of the code, the
flow of the kernel is the same.
All right.
Step 2 is to use half floats. Again, this is an example where we can get away with the precision of half floats because we're just working with color values, so we're going to make some very simple changes to our code. Basically, in the places where we were using floating-point precision, we're going to use half-float precision instead.
This means the sampler parameter and the destination parameter get an _h suffix, and anywhere in the code where we used float4 now becomes half4. So again, this is very simple and easy to do.
Another thing to be aware of is that if you've got constants in your code, you want to make sure to add the h suffix on the end of them -- for example, dividing by 9.0h instead of 9.0. So again, this is another simple change.
The last thing we're going to do
to get the best performance out
of this example is to leverage
group reads and group writes.
So, let me walk you through the
code to do this.
So, again, we want to write a 2
by 2 group of pixels, and from
that we need to read from a 4 by
4 group of pixels.
So, the first thing we're going
to do is specify that we want a
group destination.
If you look at the function declaration, it now has a group destination_h datatype.
Then, we're going to get the
destination coordinate like we
had before, and that will point
to the center of a pixel.
However, that coordinate
actually represents the
coordinate of a group of 2 by 2
pixels.
The next thing we're going to do
in order to fill in this 2 by 2
group of pixels is do a bunch of
reads from the image.
So, the first gather read is going to read from a 2 by 2 group of pixels -- in this case, the lower left-hand corner of our 16 -- and it's going to return the values of the red channel in a half4 vector. The four values are stored in this order -- x, y, z, w -- going in a counter-clockwise direction.
This is the same direction that
is used in Metal, if you're
familiar with the gather
operations in Metal.
So, again in that one
instruction we've done four
reads and we're going to repeat
this process for the other
groups of four.
So, we're going to get group 2,
group 3, and group 4.
Now that we've done all 16
reads, we need to figure out
what values go in what
locations.
So, the first thing we're going to do is take the appropriate channels of this 3 by 3 subgroup and average them together, and then store those values into the result 1 variable. We repeat this process for the other result pixels that we want to write, giving us r1, r2, r3, and r4.
And the last thing we're going to do is call destination write to write the 4 pixels all in one operation. So note, this is a little different from a traditional CI kernel, where you would have returned a value from your kernel; here, you call destination write instead.
All right.
So, the great result of all this
is that with very little effort,
we can now get two times the
performance in this exact
shader.
This is a very simple shader.
You can actually get similar
results in many other types of
shaders, especially ones that
are doing convolutions.
So, this is a great way of
adding performance to your
kernels.
So, before I segue, I'd like to tell people to go to the great new documentation that we have for our kernel languages, both the traditional CI kernel language and the CI kernel language that's based on Metal. I highly encourage you to go and read it.
But now that we've talked about
improving the performance of how
your kernels can run, I'd like
to bring up Emmanuel on stage, who will talk to you about how you can make your development process for new algorithms even faster as well.
[ Applause ]
>> Thank you, David.
Good afternoon everyone.
It's great to be here.
My name is Emmanuel, and I'm an engineer on the Core Image team.
So, during the second half of this session, we'll shift our focus away from the Core Image engine and explore novel ways to prototype using Core Image. We'll also see how you can leverage Core Image in your machine learning applications.
So let's get started.
Since I want to talk about
prototyping, let's take a look
at the lifecycle of an image
processing filter.
So, let's say that we are trying
to come up with a foreground to
background segmentation.
And here, what this means
precisely is that we'd like to
get a mask which is 1.0 in the
foreground; 0.0 in the
background and has continuous
values in between.
The difficulty in implementing such a filter heavily depends on the nature of the data you have available.
So, for example, if you have an
additional depth buffer,
alongside your RGB image, things
can become easier.
And if you're interested in combining RGB images with depth information, I highly encourage you to look at the session on creating photo and video effects using depth.
Today, I don't want to focus on these other sources of information; I want to focus on prototyping in general. So, let's say that we have this filter well drafted, and we know the effect we're trying to come up with -- in this particular case, a foreground-to-background mask.
The natural next step is to try implementing it: you pick your favorite prototyping stack and you start hacking away, combining different filters together and chaining them in such a way that you achieve the effect that you're after.
So, let's say you did just that,
and here we have an example of
such a foreground to background
mask.
Now, if you're in an iOS or macOS environment, the very next natural step is to deploy that algorithm.
So, you have a variety of frameworks that you can use, such as Core Image, Metal with Metal Performance Shaders, as well as vImage if you want to stay on the CPU.
That initial port from prototype
to production can be quite time
consuming, and the very first
render might not exactly look
like what you're expecting.
And there is a great variety of
sources that can contribute to
these pixel differences, one of
them being simply the fact that
the way filters are implemented
across frameworks can be quite
different.
If you take the example here, on the left-hand side we have a [inaudible] blur that applies this nice feathering from foreground to background. And that's an example of a filter that can leverage a great variety of performance optimizations under the hood to make it much faster.
All these optimizations can
introduce numerical errors which
will propagate in your filter
stack, thereby potentially
creating dramatic changes in
your filter output.
Another problem that typically arises when you're porting your code is that in your prototyping environment, a lot of the memory management is taken care of for you, so you often don't run into issues of memory pressure and memory consumption until it's pretty late in the game.
Another topic, of course, that's
important to consider is
performance.
Oftentimes, prototypes are written using CPU code, and we often overestimate the amount of performance we can get from porting our CPU code to GPU code, thinking that everything is going to become real-time.
So, what if we could catch these concerns way, way earlier in our prototyping workflow?
Well, we believe we have a
solution for you.
And it's called PyCoreImage: Python bindings for Core Image. So, this combines the high-performance rendering of Core Image with the flexibility of the Python programming language.
And by using Core Image, you also inherit its support for both iOS and macOS, along with more than 200 built-in filters.
Let's take a look at what's
under the hood of PyCoreImage.
So, PyCoreImage is made of three main pieces. It uses Core Image for its rendering back end. It uses Python for the programming interface. And it also has a thin layer of NumPy code to allow interoperability with your existing code bases.
We believe PyCoreImage can help you reduce the friction between your prototyping and production-ready code. If you'd like to stay in a Swift-centric environment, a lot of this can be done as well using Swift playgrounds, and we encourage you to look at the session on creating your own Swift Playgrounds subscription.
All right.
Let's take a look at the main
components of PyCoreImage.
So, PyCoreImage leverages the Python bindings for Objective-C, PyObjC -- and interestingly, we've been shipping PyObjC since Mac OS X 10.5 Leopard. It was initially implemented as a bidirectional bridge between Python and Objective-C [inaudible] in the context of Cocoa app development. But since then it's been extended to support most Apple frameworks.
The calling syntax for PyObjC is very simple: you take your existing Objective-C code and you replace colons with underscores.
There are a few more intricacies, and I encourage you to look at the API if you'd like more information.
But let's take our CIVector
class as an example here.
So here we have some Objective-C code where we create an instance of a CIVector by calling CIVector vectorWithX:Y:Z:W:. Let's take a look at the PyObjC code. It's very similar. We import CIVector from the Quartz umbrella package, and we can call vectorWithX_Y_Z_W_ on the CIVector class directly.
One thing you may note here is
that the code does not exactly
look Python-like.
And so, we're going to address
that in just a few minutes.
Now, let's take a look at the diagram for PyCoreImage.
So, the rendering back end is done using Core Image, and Core Image is very close to the hardware, so it's able to redirect your filter calls to the most appropriate rendering back end to give you as much performance as possible.
PyObjC lives on top of Core Image, and it can communicate with it through the Python bindings in the Quartz umbrella package. Quartz is a package that also contains a variety of other image processing frameworks, such as Core Graphics, and all the classes used by Core Image, such as CIVector, CIImage, and CIContext.
PyCoreImage lives on top of PyObjC; it essentially leverages PyObjC to communicate with Core Image, and it makes a lot of simplifications under the hood for you, so that you don't have as much setup code when you're working with Core Image. And we'll take a look at this in just a moment.
A lot of this is done through the CIMG class, which you can also use to interoperate with NumPy via a render call. And you can also wrap your NumPy buffers by using the class constructor directly.
All right.
So, let's take an example of how you can apply a filter using PyCoreImage, and you'll see just how simple and powerful the framework is.
So, the very first thing you want to do is import the CIMG class from the PyCoreImage package, which we can then use to load an image from file. Note that at this point we don't have a pixel buffer. Core Image creates recipes for images, and in this case the recipe just gives an instruction to load the image from file.
You can create a more complicated graph by applying a filter -- by just calling the CI filter name on it and passing the input parameters, in this case, a radius. And we can see that we are assembling a more complicated graph. And if we zoom in on it, we can see that we have our blur processor right in the middle.
If you want to get the pixel buffer representation, what you can do is call render on your CIMG instance, and what you get out is a proper NumPy buffer.
So, to make that possible, we needed to make a few simplifications on how Core Image is called, and do a bit of the setup code for you.
So, for those of you who are
already familiar with Core
Image, this will not come as a
surprise, but for those of you
who are not familiar, please
stay with me until the end.
You'll see the few
simplifications we made, and
that should become very clear.
So, Core Image is a high-performance GPU image processing framework that supports both iOS and macOS, as well as a variety of rendering back ends.
Most file formats are supported -- that, of course, means bitmap data as well as RAW files from a large variety of vendors. Most pixel formats are supported, too. So, for example, you can load your image in unsigned 8-bit, do your computation in half float, and do your final render in full 32-bit float.
Core Image can extract image metadata for you -- for example, capture time and EXIF tags, as well as embedded metadata such as the portrait matte and portrait depth information.
Core Image handles color
management very well.
This is a difficult topic on its
own that a lot of frameworks
don't handle.
Core Image supports many boundary conditions and infinite images, and has more than 200 built-in filters that you can use, so you don't need to reinvent the wheel.
All right, so I don't think I need to convince you that that's a lot of information, and if you're trying to use Core Image in your prototyping workflow, the learning curve can be quite steep. So what we did is we kept the best of that list and made a few simplifications -- and remember, these simplifications can all be overridden at runtime. And since we'll be giving you all the code, you can actually hardcode these changes in if this were your prototyping stack.
The first thing to note is that we still have the high-performance features of Core Image. We still render to a Metal backend. Most file formats are still supported, in and out, and we can still extract capture metadata as well as portrait depth and matte information. Last but not least, you have more than 200 built-in filters that you can use.
The first change we made is that
by default, all your renders
will be done using full 32-bit
float.
Second change: everything will be done using sRGB color spaces. Third, all the boundary conditions will be handled with clamping and cropping. What that means is, if you're applying a convolution, for example, your image will be extended infinitely, the filter will be applied, and the resulting image will be cropped back to your input size. This is a setting that can also be overridden at runtime.
Finally, infinite images become finite so that we can get their pixel buffer representation. And that's really what PyCoreImage is under the hood.
So, before looking at a great
demo of all of this in practice,
I just want to go through
quickly a cheat sheet for
PyCoreImage.
So, let's have a look at the
API.
So, as you saw earlier, we import the CIMG class from the pycoreimage package, and we can use it to load images with fromfile. Here's the Swift equivalent, for those of you who are wondering: you can use CIImage(contentsOf:).
You can use fromfile to load
your portrait matte information
directly as well as your
portrait depth by just using the
optional arguments usedepth and
usematte.
You can interoperate with NumPy by wrapping your NumPy buffers in the CIMG constructor, or by calling render directly on your CIMG instances to go the other way around.
If you're in Swift, there's a bit more setup code to do: you need to first create a CIRenderDestination, make sure to allocate your buffer beforehand with the right buffer properties, then create an instance of the CIContext and start a render task. In PyCoreImage, all of that is handled for you under the hood.
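As a hedged sketch, that Swift setup might look like this; the dimensions, pixel format, and the image variable are assumptions.

```swift
import CoreImage

// Allocate the destination buffer up front.
let width = 640, height = 480, bytesPerRow = width * 4
var pixels = Data(count: bytesPerRow * height)
let context = CIContext()

try pixels.withUnsafeMutableBytes { (buffer: UnsafeMutableRawBufferPointer) in
    // Describe the buffer to Core Image...
    let destination = CIRenderDestination(bitmapData: buffer.baseAddress!,
                                          width: width,
                                          height: height,
                                          bytesPerRow: bytesPerRow,
                                          format: .RGBA8)
    // ...then kick off the render task, waiting so the pointer stays valid.
    let task = try context.startTask(toRender: image, to: destination)
    _ = try task.waitUntilCompleted()
}
```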
Core Image also supports procedural images, such as creating images from a color or creating an image from a generator.
Let's have a look at how to
apply filters now.
So, applying a filter has never been easier. You take a CIMG instance, call the filter name directly on it, and pass the list of input parameters. Every CIMG instance is augmented with more than 200 lambda expressions, which directly map to the Core Image filters.
If you're in Swift, this is the syntax you've seen before, I'm sure: applyingFilter, passing the filter name as well as the list of input arguments as a dictionary of key-value pairs.
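That Swift syntax, as a minimal sketch (the radius value is an arbitrary example, and image is an assumed CIImage):

```swift
import CoreImage

// Filter name plus a dictionary of key-value pairs.
let blurred = image.applyingFilter("CIGaussianBlur",
                                   parameters: [kCIInputRadiusKey: 10.0])
```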
To apply kernels, you can call applykernel on your CIMG instance, passing the source string containing your kernel code and the list of input parameters to that kernel -- and we'll have a look at this in just a second. Then you just specify the extent in which you're applying that kernel, as well as a region of interest, which defines where you're sampling from in the input buffer.
PyCoreImage provides a selection of useful APIs that you can use, such as composite operations -- here, a source-over -- as well as geometric operations such as translation, scaling, rotation, and cropping.
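For reference, here are hedged Swift counterparts of those convenience operations; all the values and image variables are illustrative assumptions.

```swift
import CoreImage

// Source-over composite.
let over = foreground.composited(over: background)

// Translation, scaling, and rotation via an affine transform.
let moved = image.transformed(by: CGAffineTransform(translationX: 100, y: 0)
                                    .scaledBy(x: 2, y: 2)
                                    .rotated(by: CGFloat.pi / 4))

// Cropping.
let cropped = image.cropped(to: CGRect(x: 0, y: 0, width: 256, height: 256))
```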
All right.
I just want to spend a bit more time on the GPU kernels, because that's an extremely powerful feature, especially for prototyping. So what we have here is a string containing the code of a GPU kernel, and what we have there is essentially a way for you to prototype in real time what that effect is.
This is an example of a five-tap Laplacian, and we're going to be using this for sharpening. So, we make five samples in the neighborhood of each pixel, combine them in a way that computes a local derivative -- which is going to be our detail -- and we add it back on top of the center pixel.
I don't want to focus too much on the filter itself, but rather on how to call it.
So, we call applykernel on our CIMG instance, pass the source code that's just sitting above as a triple-quoted Python string, specify the extent in which we're going to be applying the kernel, and define the region of interest, which is the region that we're going to be sampling from.
If you're not familiar with the concepts of domain of definition and regions of interest, I encourage you to look at the online documentation for Core Image as well as previous WWDC sessions.
But here, since this is a convolution kernel, we are reading one pixel away from the boundary, so we need to instruct Core Image that we're going to be doing so, so that it can handle boundary conditions properly.
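In Swift terms, supplying that region of interest might look like this hedged sketch; kernel and image are assumed from earlier. Because the convolution reads one pixel beyond each output pixel, the ROI callback grows every requested rect by one pixel in each direction.

```swift
import CoreImage

// The ROI callback tells Core Image how much of the source is needed
// to produce a given output rect.
let output = kernel.apply(extent: image.extent,
                          roiCallback: { _, rect in rect.insetBy(dx: -1, dy: -1) },
                          arguments: [image])
```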
All right.
So, that was a lot of information, and looking at APIs is always going to be a bit dry, so let's take a look at a demo and put all of this into practice.
[ Applause ]
All right.
So, during this demo I'll be using a Jupyter notebook, which is a browser-based, real-time Python interpreter.
So, all the results you're going
to be seeing are rendered in
real-time using Core Image in
the backend.
So, none of this has been
pre-computed; this is all done
live.
So, the very first thing I want to do here is import the utility classes we're going to be using, the most important one being the CIMG class from my PyCoreImage package.
Then, we just have a bit of
setup code so that we can
actually visualize images in
that notebook.
Let's get started.
First thing I want to show you
is how to load images in.
So, using fromfile here, we see that the type of my object is a PyCoreImage CIMG, and we can see that it's backed by an actual, proper Core Image object underneath.
We can do a render on our image and have a look at its actual pixel representation using Matplotlib here.
This is our input image, and now
I want to apply a filter on it,
so let's take a look at the 200
plus filters that are supported
in Core Image.
Let's say that I'm interested in
applying GaussianBlur here, and
I'd like to know which
parameters are supported by that
filter, so I'm going to call
inputs on my CIMG class, and I
see that it supports input image
-- I shouldn't be surprised --
as well as an input radius.
So, I'm going to do just that
here.
Take my input image.
Apply GaussianBlur filter on it
with a radius of 100 pixels, and
then show the two images
side-by-side.
So, pretty easy, right?
Okay.
Let's keep going.
Like I mentioned earlier, you
can generate procedural images
using Core Image.
So, we'll take a look at Core
Image generators.
And the first thing we do here is call fromgenerator and specify the name of our generator -- in this case, CIQRCodeGenerator -- and pass in the message that we're trying to encode.
Here it is in real time, so I
can do changes to that message
and see how that affects the QR
code that's being generated.
Core Image also has support for labeling your images, so you can use the CI text image generator to do that. So, here's an example: WWDC, using the SF font.
All right, let's keep going.
As I mentioned, we support interoperability to and from NumPy, so this is the first thing we're going to do here.
We're going to start with an image and apply some interesting and non-trivial effect to it -- in this case, a vortex distortion.
Next thing we'll do is render that buffer, getting a NumPy array out of it. You can see its type here, as well as its shape, its depth, and a few statistics on it: its minimum, median, and maximum values.
We can also go the other way
around and go from NumPy to Core
Image.
To do this, let's start with a NumPy array that's non-trivial -- in this case, a random buffer where 75% of the values have been set to black.
First thing I do here is wrap my NumPy array in the CIMG constructor, and we can see that we have, again, our CIMG class instance and the backing CIImage.
Now that I have my CIImage, I can apply a variety of filters on it. So, first I'll apply a disc blur; then a fun filter, a light tunnel; change the contrast in my image; then the exposure adjustment as well as the gamma value.
Let's take a look at these filters side-by-side: disc blur, light tunnel, exposure adjust, gamma adjust -- and here's our final effect.
Pretty fun and really easy to
work with.
So, let's put it all together. I'm going to start with a new image here, and what I'll be showing you in this demo is how we can do band processing. For those of you who are familiar with slicing images in Python, this is exactly what we're going to be doing.
We'll define bands or slices --
horizontal slices in our image,
and we'll only be applying
filters on these.
Let's take a look at the code first. This is our add band function here. And we can see that at the very bottom of it, we render our image into the composite, which is an actual NumPy buffer, while the right-hand side is a CIImage. By using slicing like this, we force Core Image to only do a render in that band, not the entire image, thereby being much more performant.
So, let's do this: create five different bands in our image and show the final composite.
Pretty amazing.
And we've got our labels on top as well, which correspond to the filters being applied.
It's really that simple to work
with PyCoreImage.
All right.
And I mentioned performance
earlier, so let's take a quick
look at this.
First thing I want to show you is that whenever you call render on your CIMG instances, the NumPy buffer is baked and cached under the hood.
For example, here we create an image that we scaled down as well as applied a GaussianBlur to, so the first call took 56 milliseconds; the second one, only 2 milliseconds.
And let's take a look at large
convolutions as well.
Core Image is extremely fast and
is able to handle large
convolutions as if it was
nothing.
Here we're using CIGaussianBlur with a sigma of 200, which is huge.
Just to give you a sense here, as I was showing you the image, I was actually executing the equivalent using scikit-image, and we had a 16-second running time.
And now the same thing using Core Image: 130 milliseconds. Yeah, it's that fast [applause] -- 200x, yeah.
Thank you.
All right, let's keep going.
So, one of the most powerful features of PyCoreImage is its ability to create custom GPU kernels inline, execute them on the fly, and modify them on the fly.
So, let's take a look at that.
All right.
So, the first thing I want to
show is how to use color
kernels.
So, color kernels are kernels
that only take a pixel in and
spit a pixel out and don't make
any other samples around that
pixel.
So, here's our input image and here's our kernel. What we actually get in is a color, and we return a color out.
So, let's take a look at this
effect here.
I'm going to be swapping my red and blue channels, and we'll be inverting them.
Not a terribly exciting effect, but what I want to show you is that I can do things like start typing away and say, maybe I want to scale my red channel by my blue channel, and I want to just play with the amount of scaling I'm applying here -- so we can go from 0.25 to pretty high values if we want to and generate interesting effects here.
It's extremely powerful, and this all happens in real time, so you can really fine-tune your filters this way and make sure you achieve the effect that you're looking for.
Let's take a look at a more complicated kernel here. So, we'll look at a general kernel, which is a bit like the Laplacian I showed you earlier -- a kernel that makes additional taps in the neighborhood of each pixel.
So we start with an image from file, which is the same image we saw earlier, and we have our kernel code here. Without going into the details, this is a bilateral filter, which is an edge-preserving blurring filter.
So let's just get the code in and use applykernel with some parameters that will allow us to get this very nice effect. And what we did here, essentially, is clip the non-redundant high frequencies in the image.
Let's take a look at this a bit more closely -- look at a crop here. We can see how the strong edges are still there, but the fine frequencies that are not redundant were washed away.
The bilateral filter can be used for many, many different purposes. In this particular case, we'll use it to do sharpening.
And to achieve sharpening with
this filter, we can simply take
the image on the left and
subtract the image on the right,
giving us a map of high
frequencies or details of the
image.
Let's do just that.
So, here what I'm doing is rendering my image to a NumPy buffer, rendering my bilateral-filtered image, and subtracting them using the operator overloading that's provided with NumPy.
Let's take a look at the detail layer. So, we have the detail on the left-hand side for the entire image, and a crop of the center of the image.
Now, what we can do with this is add it on top of the original image. We're going to be doing just that here -- we're going to be adding it twice. By doing this, we achieve a form of sharpening.
It's really that simple.
If I wanted, I could go back to
my filter kernel string and
start hacking away and making
changes there in real time.
The other thing I wanted to show you is how to load metadata from your images. So, here I have an image that has a portrait effects matte loaded in, as well as portrait depth data.
Here are the images
side-by-side.
The image on the left is the RGB
image.
In the center is the depth data. On the right-hand side is the high-quality portrait effects matte, which we introduced in another session today.
We can also look at the EXIF tags directly, by looking at the underlying CIImage of our CIMG instance and calling properties. Here, we get information pertaining to the actual capture itself.
Like I said, we introduced the portrait effects matte in another session, Session 503, so I highly encourage you to look at it. So, without going into the details here, I'm going to be showing this filter. If you are interested to know how we did this, I highly encourage you to take a look at that session.
Pretty fun stuff [applause].
Thank you.
[ Applause ]
All right.
Let's go back to the
presentation.
I want to switch gears a little bit here and talk about bringing Core Image and Core ML together.
If you would like to get more information about working with the portrait matte and portrait depth information, I encourage you to look at the session on creating photo and video effects using depth.
All right. Let's look at bringing Core Image and Core ML together. This year, we're really excited to announce that we're coming out with a new filter: CICoreMLModelFilter. It's an extremely simple, yet very powerful filter that takes two inputs. The first input is the filter's input image; the second is an input Core ML model. And you get an output image, which has been run through the underlying neural network. It's really that simple; extremely powerful.
Just to show you how simple the code is, let's take a look at the Swift. So, we have an input image on the left-hand side; all we need to do is call applyingFilter, pass the name of the new filter we're introducing this year, and pass in the model. It's really that simple.
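A hedged sketch of that Swift call; the model URL is a hypothetical compiled .mlmodelc in the app bundle, and the inputModel parameter name follows the filter's input naming convention.

```swift
import CoreImage
import CoreML

// Load a pre-trained, compiled Core ML model (hypothetical resource name).
let modelURL = Bundle.main.url(forResource: "MyModel", withExtension: "mlmodelc")!
let model = try MLModel(contentsOf: modelURL)

// Run the image through the model's neural network.
let output = inputImage.applyingFilter("CICoreMLModelFilter",
                                       parameters: ["inputModel": model])
```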
And if you'd like to look at other ways to leverage machine learning in your image processing applications, I encourage you to look at the other sessions: A Guide to Turi Create, as well as Vision with Core ML.
All right.
On a related topic, one of the common operations we carry out on our training datasets in machine learning is data augmentation. And data augmentation can dramatically increase the robustness of your neural networks.
In this particular case, let's say we're doing object classification, and we're trying to determine whether an image has a bridge or water in it.
So, augmentations on your original training dataset will increase the number of images you have in that dataset without needing to gather new images -- you essentially get them for free.
So, there are many operations you can carry out. One of them is just changing an image's appearance -- for example, the tint, the temperature, and the white point of your image. Others are changing the spectral properties of your image by adding noise, or changing the geometry of your image by applying transforms.
Well, it turns out all of these
are trivial to achieve using
Core Image.
Let's take a look at a few
filters and how you can use them
for your data augmentation
purposes.
So, we have our input image on
the left-hand side here.
And we can change the temperature and tint using CITemperatureAndTint. We can adjust the brightness, contrast, as well as saturation of your images using CIColorControls. We can change the frequency spectrum of your image using CIDither as well as CIGaussianBlur. And we can change the geometry of your image using affine transforms.
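As a hedged sketch, one augmentation pass built from those filters might look like this in Swift; in practice, each value would be sampled at random per training image, and input is an assumed CIImage.

```swift
import CoreImage

let augmented = input
    // Shift the white point (temperature 5500K, a touch of tint).
    .applyingFilter("CITemperatureAndTint",
                    parameters: ["inputNeutral": CIVector(x: 5500, y: 20)])
    // Adjust brightness, contrast, and saturation.
    .applyingFilter("CIColorControls",
                    parameters: [kCIInputBrightnessKey: 0.05,
                                 kCIInputContrastKey: 1.1,
                                 kCIInputSaturationKey: 0.9])
    // Change the frequency spectrum with a mild blur.
    .applyingFilter("CIGaussianBlur", parameters: [kCIInputRadiusKey: 1.5])
    // Change the geometry, then crop back to the original extent.
    .transformed(by: CGAffineTransform(rotationAngle: 0.1))
    .cropped(to: input.extent)
```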
Let's take a look at all of this
in practice.
All right.
So, we're back in my Jupyter notebook here. Same setup as before.
First thing I want to show you is how to do augmentations using Core Image.
So, we're loading an image in, and we're going to define our augmentation function here. What we'll be doing, essentially, is sampling from a random space for each of the filters I've defined here. So, we'll be applying GaussianBlur, scaling, rotation, a few adjustments -- exposure adjustment, vibrance -- as well as dithering for noise.
All right?
Let's enter that function and have a look at a few realizations of that augmentation. So, my slider here controls the seed that I'm using in the back end.
All right, pretty cool. Now, to give you a sense of how efficient this is, here I'm going to be processing 200 of these augmentations in real time, and we'll take a look at how they are actually being saved to disk in real time. So let's just do that, to give you a sense of how fast that is.
That's really powerful.
All right.
The next thing I want to show you is how to use Core ML with Core Image. The first thing to do is load your Core ML model in, which we did here. We have a glass model, which we're going to be using to generate an interesting effect.
So, let's start with a procedural image -- we've seen this one before. And then, to make it a bit more interesting, we'll add some texture to it: we'll be adding multi-band noise on it, as well as some feathering and some [inaudible].
All right, so this is the input image we're going to be feeding to our neural network, alongside the Core ML model that we have pre-trained. All right? So, let's run this. And --
There you go.
WWDC 2018, just for you.
All right.
On that note, I want to thank
you for coming to this session
today.
I hope you enjoyed this as much
as we enjoyed preparing these
slides for you.
I highly encourage you to come and talk to us tomorrow at the Core Image Technical Lab at 3:00 pm. And thank you very much.
[ Applause ]