WWDC2004 Session 201
Transcript
Kind: captions
Language: en
Hello, and welcome to Session 201 on Core Image. That's me, Ralph Brunner.
So here's the agenda of this session. First, I'm going to explain what the Core Image framework provides and what's there, then how to use the API. I will not go into too much detail there; there's a nice set of documentation on this. It's essentially about the spirit of the APIs and the key methods you have to know. Then Mark Zimmer will come up and explain how to write your own image processing kernel. After that, Frank Doepke will show you how to take that image processing kernel and create an image unit out of it, and by then you'll hopefully know what an image unit is. Then we'll bore you a little more at the end and give you some ideas of how this could be used in various applications. So if you're not interested in any of these things and would rather hear about databases, now would be a good time to leave.
Okay, so what's Core Image? Core Image is an image processing framework, new in Mac OS X Tiger, and the goal is really to put image processing onto the GPU. It has a full floating-point pipeline, so there is no clamping to the 0 to 1 range through the pipeline, and you get really nice deep pixels. There's a base set of about 60 filters, which contains various things, from fairly pedestrian stuff like contrast and brightness to more funky stuff like bump distortion and things like that. There is the concept of image units, which is essentially a plug-in architecture: you can make a filter, put it on the system, and have it show up in other applications that are hosts for image units. And while Core Image is really focused on working on the GPU, there is a CPU fallback available. So if your GPU is not adequately equipped with, you know, fragment programs and the nice things that we usually utilize, then it will run on the CPU. The CPU path is a fallback, but it works pretty well; it's actually very well optimized code.
So why the GPU? Well, there are essentially two reasons. One, if you can put something on the GPU, the CPU is free, and CPU cycles are kind of a scarce resource in whatever you are doing, so that helps. The second reason is that GPUs are actually outpacing CPUs in a number of interesting benchmarks, like memory bandwidth and floating-point processing power. The graph down here tries to illustrate that: we essentially took a bunch of image processing filters, about five, stacked them behind each other, and ran them on different hardware. The first four are different hardware configurations, from the iMac G4 to the dual 2.5 GHz G5; by the way, the numbers are in megapixels per second for that particular operation. The interesting thing here is that the dual G5 is actually about four times faster than the iMac, which makes sense: it's twice the gigahertz and twice the number of processors. So what about the graphics cards? If you take a low-end graphics card, the GeForce FX 5200, and compare it to the high-end graphics cards, you actually get a factor of almost nine, and that's kind of a challenge. A factor of four is something your software can handle: it means if you had twenty frames per second before, now you have five frames per second. It's not exactly good, but you can live with it. If you have a factor of ten, maybe you want to consider adding special code to your application so that if it detects that the gap is that big, it does something else. For example, if you have something like a Keynote-style presentation application and you can't do the transition smoothly, well, maybe just fall back to a dissolve or something like that.
Okay. Well, on Mac OS X the way to access the GPU is OpenGL, so let me tell you a bit more about that. There are essentially two different ways, as I see things, that you can use OpenGL. One is to use it as a low-level 3D renderer, and the characteristics there are: you have a high polygon count, a moderate amount of textures, you're doing a lot of work optimizing your scene and figuring out which pieces you need to draw, and the modelview matrix is set up as a camera looking into a 3D scene; pretty much all the 3D games I've ever played are in that category. The second category is using OpenGL more as a pixel calculation engine: in that case you usually have a lot of large textures, the geometry is pretty much negligible, and the modelview matrix is really set up to allow you to address individual pixels. So I call that Quartz versus Quake. Since Jaguar, the window server compositor has been layered on top of OpenGL; we call that Quartz Extreme, and it essentially, as Peter put it I think two years ago, removed the transparency tax: the cost of doing all that compositing is now on the GPU, and from the CPU's perspective it's free. Well, this year Quartz Extreme also encompasses the Quartz 2D drawing API, and Core Image uses OpenGL in that way as well. So why are we doing this? Well, for an application developer, using OpenGL efficiently is hard: there's a lot of stuff you need to learn about pbuffers and the right way to switch contexts so that everything is still streamed to the graphics card correctly. The goal of Core Image is essentially to free you, the developer, from that burden, so that you can concentrate on drawing a big quad with video on it, and not have to know about the two hundred lines of code underneath that manage all the buffers. So after talking that much about hardware, here are the actual hardware requirements.
The thing that Core Image needs is an ARB-fragment-program-capable graphics card, and it works pretty much on any fragment-programmable graphics card, but more memory tends to be better, especially with all these other system services going to the graphics card as well; more is better. It does work well on my laptop with the ATI card, which has 64 megabytes of memory. If the GPU isn't fragment programmable, then we have a fallback which uses the CPU, and here a G4 or a G5 is strongly recommended. It does work on a G3; however, because all the computation is done in floating point, having that vector unit, the Velocity Engine, is a great gain. It also supports multiple processors, so if you have one of these nice dual desktops, it will utilize your two processors pretty much optimally. So with that, I would like to go to demo machine 2 and give you a bit of an impression of what kind of filters are available. Let me switch to the demo. Okay, so here
I have a sepia tone filter, one of the more pedestrian ones. There is saturation (that looks ugly), brightness, and contrast. Oops. You can make a monochrome image. Yeah, this isn't all terribly exciting, but it turns out these are actually really useful in your workflow; they have to be there. So let me try to get some more interesting stuff going. Here is an edge detection filter on a different picture. We can do bump distortion, and if that's too scary, let's make it smaller. Okay, we also have a glass distortion filter, and I can move the glass around, or modulate how thick the glass actually is. And there's more funky stuff, like perspective transform (oops, that was a bit too much), or putting a spotlight on an image, things like that. So there are about 60 of those; some are fun, some are useful, some are both. Okay, we can also do effects on text, which turns out to be fairly interesting if you, for example, get your drawing out of Core Graphics and then put effects on top of it. So you can do things like that. Or my favorite, zoom blur. Actually, zoom blur works even better if I first switch to the edge detection filter, like this, and then put the zoom blur on top of it.
So I like that better. Anyway, let me try to do something a bit more useful. We had a bunch of artists come up with scenarios that I could demo, so I will just repeat one of them. Here you have a guitar, and the first thing I'm going to do is use the monochrome filter to produce more of a blue-like look. Then exposure, to make it a bit darker. I want to focus on the hand here, so I will add a spotlight. Yeah, that's good. Then let me add a layer on top of this, some blues album text line art, and at the very end crop the frame. And now I have a little album cover for my blues band. So the key thing here is that I stacked a bunch of filters on top of each other, and all of these filters are still live: at no point did we get the result written back into main memory. So I can still go and change the blue tint in the middle, or if I don't like the spotlight, I can move it around underneath, and so on.
Okay, so the next thing I would like to show is that we can actually handle fairly decent-sized images. What I showed before were images pretty much screen size or projector size. This one here is different: this is an 11-megapixel image, and it's 16 bits per component, so it's a 90-megabyte image in total. Well, it's a 90-megabyte image, but it goes to a 1-megapixel projector, so you can't really see that much of it (oops, my apologies for that). So I have this little lens here, so I can actually move around and show you the detail. See, there are actually little people down here taking pictures of each other, there's a statue of a bull, and so on. So let's do something with this image and go back to the glass distortion I used before. You can see all the detail in the glass distortion applied to this 90-megabyte image. Naturally, at 90 megabytes this takes a bit longer; it's not totally real-time, so if I drag this around, you see it's something like 3 to 4 frames per second, but it still works. And the key message here is that this is actually well beyond what the hardware can handle directly: the hardware has a texture size limitation, and this image is too big. So Core Image goes and cuts it into little tiles, sends them up to the card, does all the magic so that whatever filter is there has enough information to do its job, and then stitches everything together at the end so that you can enjoy it on your screen. The last thing I would like to show is that, if it's fast, you can also apply it to movies. Here I have my little bump distortion on this movie trailer; I could have hours of fun with this. Okay, let's do sepia tone, and even then you can go and stack filters, so let's do an edge detection filter on this movie. Okay, I think you get the idea. With that, let's go back to the slides. So now, how does this all work?
You essentially have to know three core classes in Core Image to use filters. The first one is CIImage, and to everybody's surprise, that is an image. An image is typically something you bring in from the outside world, like a JPEG file, or it's the output of a filter (more on that a bit later). Then there is the context, which is the other end of the workflow, the destination abstraction: you build a context, say, on top of a CG context or an OpenGL context, and the context is where you draw into. And the third piece you need is the CIFilter. The filter is what actually does the work: on one side the image comes in, the filter modifies it somehow, and then the output is an image. You can either take that image and pass it to another filter, or draw it into the context.
So, a bit more detail about images. An image can get created from, for example, a CGImageRef; it's also possible to pass an NSData with rowbytes and an encoding, or an OpenGL texture. This is the API call you use in the CGImageRef case, and they all look very similar. And typically the output of a filter is a CIImage; this is the API call you use to get that out. An important thing here is that a CIImage can have infinite extent: if you imagine an infinite plane with a checkerboard pattern on it, that's a valid CIImage. I hope you will never try to draw something with an infinite extent, so better to specify a sub-rectangle you're really interested in.
So, filters. Filters are created by name: in this case here, we create the sepia tone filter by its name, CISepiaTone. All the parameters that go into the filter are set using key-value coding, so in this case down here I create an NSNumber and set it for the key inputIntensity. And key-value coding is also how you get your output: you can ask the filter to give you the value for the key outputImage, and that's, well, your output image.
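In code, that pattern might look something like this (a minimal sketch; the filter name and keys are the ones just mentioned, the intensity value is made up):

    CIFilter *sepia = [CIFilter filterWithName:@"CISepiaTone"];
    [sepia setDefaults];                                    /* start from the default parameter values */
    [sepia setValue:image forKey:@"inputImage"];
    [sepia setValue:[NSNumber numberWithFloat:0.8f]         /* hypothetical intensity */
             forKey:@"inputIntensity"];
    CIImage *result = [sepia valueForKey:@"outputImage"];   /* key-value coding gets the output, too */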
The reason why it's done that way: well, the first reason is we're lazy; we have 60 filters and we don't want getters and setters for each one of them. And the second reason is that this allows you to introspect the capabilities of a filter, so you can build automatic UI, support plug-ins, and things like that.
So here's the part about introspection. If you want to know what filters are available, solution one: you go and read the documentation. Or solution two: you ask the CIFilter class to give you all filters in a specific category. You can ask for all filters, period, or all filters that are transitions and suitable for video, and then you get a reduced set. There's a bunch of categories, like color adjustment, geometric adjustment, things like that, and there are attributes like suitable for video, suitable for interlaced data, suitable for still images, and so on.
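As a sketch, those category queries might look like this:

    /* all filters, period */
    NSArray *all = [CIFilter filterNamesInCategories:nil];

    /* only filters that are transitions and suitable for video */
    NSArray *videoTransitions = [CIFilter filterNamesInCategories:
        [NSArray arrayWithObjects:kCICategoryTransition, kCICategoryVideo, nil]];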
Once you have a filter, you can also ask it: what are your parameters, what are the input keys, and what are the output keys? And there's an attributes dictionary you can get which contains all the metadata about an input. It has things like the type (this is a number, this is a vector, this is an image), and it has semantic information, so for a number you could, for example, have: this is an angle, this is a distance, and if you want to put up a slider, it goes from this value to this value. So that attributes dictionary is fairly rich and allows you to build automatic UI. In fact, for the demo I showed you, I didn't write any UI code specific to those filters; it just introspected the filter attributes and built the UI: okay, for each of these values add a slider, for each of these add a point that you can move around, and things like that.
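For example, a host application could read the slider range for one parameter roughly like this (the filter and key are just illustrations):

    CIFilter *bump = [CIFilter filterWithName:@"CIBumpDistortion"];
    NSDictionary *attrs  = [bump attributes];
    NSDictionary *radius = [attrs objectForKey:@"inputRadius"];

    NSNumber *sliderMin = [radius objectForKey:kCIAttributeSliderMin];
    NSNumber *sliderMax = [radius objectForKey:kCIAttributeSliderMax];
    NSString *type      = [radius objectForKey:kCIAttributeType];
    /* type would be something like kCIAttributeTypeDistance, which tells
       the host what kind of control to build for this parameter */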
So there is an interesting detail behind that API. It's fairly straightforward in the sense that it looks like immediate mode: you take an image, you pass it to the filter, you get a new image out. But what's really happening behind the scenes is that the image hasn't been rendered. The image you got back is kind of a recipe: it's an image that contains references to the original data, plus a program that says what to do if somebody really wants to draw this, and it also contains a snapshot of the parameters that were in the filter at the time you asked for the output image. The image is really only rendered when you go to the context and say: okay, draw this image. And that has a bunch of interesting implications, pretty much all of them related to performance. First of all, if you take an image from one filter and pass it to the next filter, the next filter just says: okay, I have this image here, I'll append the program that I need to execute, and it just passes the data further on. So if you never draw, it's really, really cheap.
But probably you want to draw at some point, and at the time you draw, there is a compiler concatenating these programs. It's actually like a just-in-time inliner that tries to produce, for the target you're drawing to, an optimal program that runs on those pixels; that's the concatenation of filter operations. And the second thing it allows: if you actually draw only a sub-rectangle out of your image, then at that point it knows which pieces it actually needs to process.
Let me go into that point a bit more. What this gives you is that you can have several components in your app, or in the operating system, that conspire on an image computation. As an example, down here I have an image that comes from the Image I/O subsystem; it does a color adjust and returns a new image, which gets passed on to a different subsystem, in this case a thumbnail renderer, which scales the image down and then draws it on, I don't know, a sheet with a bunch of thumbnails, I guess. Because all the operations are deferred, at the very end the context sees: well, I have to draw a small image. It propagates that all the way back and figures out: well, I could move that color adjust to the right level and process far fewer pixels than doing the color adjust on the original-size image. So this is a certain optimization technique, and it happens completely transparently to the user of the API.
So let me give you a really simple example of what code using Core Image actually looks like. In fact, it's so embarrassingly simple that you might ask why you would use Core Image at all for this, but I'm going to build on it, so stay with me. The first thing I'm doing is creating a CIImage; in this case I'm using a CGImageRef. The next step, I'm creating a context, and again, I like CG, so I'm using a CG context here. And then I go and draw the image, and I specify two things: one is the point where I'm going to draw it, and the other is the sub-rectangle out of the image I'm going to draw. Remember that there are infinite-sized images, so it's a really good idea to specify a sub-rectangle in this case. So let's add something interesting in between, because so far we truly just draw an image, which is hardly worth the time I should spend on it. Let's create a filter in the middle. I create a CIColorInvert filter, the simplest filter we have: it inverts the colors and has no parameters at all except the one image. And keysAndValues (this is a list of keys and values) is the convenience method to set all the keys and values instead of having to make an individual call for each one of them; so it sets the input image to the image we created. And the drawImage call, instead of taking the original image, just asks the filter for the output image. So this is our four-lines-of-code example of how to use a filter.
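The slide code isn't in the captions, but the four lines just described would look roughly like this (the CG context setup and the rectangle are illustrative):

    CIImage   *image   = [CIImage imageWithCGImage:cgImage];
    CIContext *context = [CIContext contextWithCGContext:cgContext options:nil];

    CIFilter *invert = [CIFilter filterWithName:@"CIColorInvert"
                                  keysAndValues:@"inputImage", image, nil];

    [context drawImage:[invert valueForKey:@"outputImage"]
               atPoint:CGPointMake(0.0f, 0.0f)
              fromRect:CGRectMake(0.0f, 0.0f, 640.0f, 480.0f)]; /* always draw a sub-rectangle */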
So, the bigger picture of things: everything that happens within the space of Core Image happens in a defined working space, and that has two aspects to it: there is a color working space and there's a working coordinate space. Imagine the diagram down here: you have a bunch of images coming in, the data flows through a graph of filters, and then it exits to the context. So let me talk about the coordinate space first. The coordinate space is essentially an infinite-sized plane with a defined origin where you can place your images, and there are a bunch of filters that do things like affine transforms, so you can move images around and scale them and things like that. That's pretty simple, and again, this is the reason why you have to specify a sub-rectangle at the very end, when you actually draw out of that working space.
The color space is a bit more interesting. Color matching happens on input and output: if an image comes in tagged with a color space, it gets converted into the working color space; all the processing is done in that working color space; and on output there is a conversion to the context's color space. If you, for example, created the context off a CG context, you don't have to do anything; it figures it out directly. The default working space is scRGB, which has a bunch of interesting properties for image processing. The first one is that it's light-linear. Imagine you have two flashlights and they point at the same spot on, I don't know, this stage: the amount of light that is reflected is proportional to the light from one flashlight plus the light from the other flashlight. Light-linear essentially means that the math inside your filter program has that same property, and we'll have an example of that a bit later. The second one is that it has infinite gamut: it doesn't clamp colors to the zero-to-one range. So if you, for example, have a Y'CbCr image coming in from video and you want to process it, converting it to RGB with a clamp to zero-to-one would lose some color; but because the working space has an infinite gamut, it doesn't. So the conversion from Y'CbCr to RGB and back is lossless.
There's a bit of secret sauce on the color space issue when it comes to CIImages: the color space of a CIImage can be nil. What that means is you're telling the system: I'll give you an image, but it actually doesn't contain any color information. For example, if you have elevation maps, normal vectors, or function tables you want to pass into your filter kernel, then clearly you're not going to want color matching on your sampled sine function, so nil is the color space to use in that case. Note that there is a subtle difference between nil, which means "these aren't colors, don't match them", and "this is already in device color space". If you want to send in an image that is already in the device color space, just tag it with the working space; then Core Image will see that the image space and working space are the same, so there's no work to be done there.
Let me give you an example of why this matters. I took a tripod, put my camera on it, and took a picture with bracketing on, so I have three different exposures of the same scene; it's two stops of exposure apart from left to right. This is physics: letting more light through the shutter by keeping it open longer. So what we're doing now is taking that leftmost, darkest image and trying to simulate the same thing inside Core Image. I take that image and multiply the numbers by two, which gives me the middle one; multiply the numbers by two again, and it gives me the last one. This is why that scRGB color space matters: it actually matches what physics does. For comparison, if you do the same thing in device color space, skipping all the color matching, the results are rather different. For your application that means, for example, if you're dealing with photographs, you can have an exposure-adjust UI, and it actually produces the same results that the user is used to from his camera.
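The captions don't say which filter does the multiply-by-two; one hedged way to reproduce the experiment is the exposure adjust filter, since in a light-linear working space each stop of EV doubles the pixel values:

    CIFilter *exposure = [CIFilter filterWithName:@"CIExposureAdjust"];
    [exposure setValue:darkestImage forKey:@"inputImage"];
    [exposure setValue:[NSNumber numberWithFloat:1.0f]   /* +1 stop: multiply by two */
                forKey:@"inputEV"];
    CIImage *middleExposure = [exposure valueForKey:@"outputImage"];
    /* an inputEV of 2.0 would multiply by four, matching the brightest shot */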
Okay, let me switch gears a little and talk about the overall model behind the architecture. It's based on the paper "A Model for Efficient and Flexible Image Computing" by Michael Shantzis, published at SIGGRAPH '94, which essentially explains how to build a pipeline of image processing operations and how to propagate through that pipeline all the information that is needed to do the operation at the end. Shake is actually based on the same model. I will use the terms from that paper, because why invent something new if it's already been done? There are two key concepts here. The first one is the domain of definition. The domain of definition is roughly the area of an image that is non-transparent, and therefore interesting, meaning that everything outside the domain of definition is transparent. As I mentioned before, the domain of definition could be infinite, but for pretty much every image you load from disk, it's probably not. The second term is the region of interest, or ROI: when you actually draw, at the very end of your filter chain, you specify the area that you want to pick out of the working space, and this is called the region of interest. So here's an example of how this works.
The leftmost image is the original image, and then we do the zoom blur filter on it, which you saw before. The domain of definition of the original image is about the extent of the image; the domain of definition of the filtered image is larger, because the filter bleeds out into neighboring pixels. So when I actually go (now I'm at the bottom right corner of this diagram) and draw a sub-rectangle, in this case the dog's eye, out of that scene, that region of interest is propagated back to the original image. The filter has to have all of the yellow area in the bottom left corner available to do its operation, and what it does is intersect the region of interest with the domain of definition of the source; that's the real data that needs to be fetched. Why is this important? If you write your own filters, you actually have to provide functions that do this mapping, but that will come up in the topic of how to write your own filters.
So there are actually two more classes that you need to know, on top of the three I mentioned before, to write your own filters. There's the CIKernel, which represents the per-pixel operation: it's essentially a little program that produces a pixel at the output, and it can have things in it like looking up pixels in the source image at various places, or just math, or scaling, things like that. And there is the CISampler, which is kind of the mediator between the kernel and the original image. A sampler is an accessor function: when you go and say "give me the pixel at coordinate (x, y)", the sampler will kick in and do some magic things for you.
The sampler has two key elements. It has an interpolation style, so you have to say whether you want linear interpolation or no interpolation, and it has wrap-mode attributes, which say what should happen if you ask outside the DoD.
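Setting those two sampler elements might look like this (a sketch using the sampler option keys from the headers):

    CISampler *s = [CISampler samplerWithImage:image keysAndValues:
        kCISamplerFilterMode, kCISamplerFilterNearest,  /* no interpolation */
        kCISamplerWrapMode,   kCISamplerWrapBlack,      /* outside the DoD: transparent black */
        nil];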
So the kernel (sorry about that), the kernel represents the per-pixel operation, and I have a little sample here. This is the magnifier code I showed in the previous demo, the little circle that showed you the Eiffel Tower, and that's pretty much all there is to it: there is a distance computation to make the little gradient on the side, and it picks out the right scaled pixels.
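The slide's kernel source isn't captured in these captions, but a magnifier kernel in that spirit might look like this (the math is illustrative, not the shipping code):

    kernel vec4 magnify(sampler src, vec2 center, float radius, float scale)
    {
        vec2  t = destCoord() - center;              /* offset from the lens center */
        float d = length(t) / radius;                /* normalized distance, for the rim gradient */
        vec2  p = center + t / scale;                /* inverse transform: where to read from */
        vec4  c = sample(src, samplerTransform(src, p));
        return c * clamp(2.0 - 2.0*d, 0.0, 1.0);     /* fade out the rim of the circle */
    }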
With that, let's go back to demo machine number two. Okay, so what I'm going to show you now is essentially more of the same, but now that you have the overall concepts, I can explain a bit more of what's going on.
The first thing I'm showing is a little test app that, by the way, we wrote to figure out how to tune the framework, and it has the nice aspect that it visualizes nicely what's going on. I'll take an image, a hibiscus here, then I pipe that through, let's say, the bump distortion filter you saw before, and then display it. So this is the image, and in the top left corner you see the little flow of how the data moves through. This is the bump distortion you saw before; I can move that around, and it has a bunch of parameters, like the radius and things like that. So the first thing I mentioned is that there is a CPU fallback. What you see here is the bump distortion running on the GPU, and it's probably really hard to read in the audience, but there is a little framerate indicator in the bottom left corner, and it says this runs at 93 frames per second. So let me go and switch this to use the CPU renderer instead. You see it's a bit more chunky, but the CPU renderer runs at about 20 to 25 frames per second for this operation, and it produces the same result, which is kind of surprising at times.
Okay, so let me try to build a more complicated example that shows the power behind the framework, especially where the just-in-time compiler kicks in and does some more or less smart things. Let me start with the edge detection filter you saw before; it looks like this. Oops, let's make these edges really visible. And I'm going to change the image as well; let's open an image, the Core Image line art here; I'm going to need that a bit later. Okay, so this is the edge detection filter on the Core Image line art; so far, not that interesting. I'm going to add a new filter for exposure adjustment, and then I'm adding the zoom blur I showed you before. This is how the zoom-blurred image looks; let me adjust that a bit, okay. So this is the zoom blur effect on line art, and at the very end I'm going to add a compositing mode, in this case addition, and add the original image back onto the zoom-blurred image. So it looks like this; oops, like that. With the mouse I'm just moving the origin of the zoom blur around, so you can easily imagine that if you want to do some visual effects on line art, building some interesting visuals this way is pretty easy. And that's where the compiler actually comes in: most of these operations are collapsed into a single pass. It's only the zoom blur, which inherently does multiple passes to create this effect, that still has multiple passes at the very end; everything else is collapsed into a single pass over the image.
Okay, with that, I'm going back to the slides, and I would like to ask Mark Zimmer up on the stage to explain to you how to actually write one of those things. Thanks.
Thanks, Ralph.
Okay, the fun part about this is writing your own image processing kernel. The thing about Core Image is that it does come with 60-plus filters, but if they're not really the filters that you want, or you can't put them together into a graph to do what you want (like you just saw), or the graph you did put together isn't running fast enough, then you may want to write your own kernel. In fact, that turns out to be one of the things in Tiger that really runs much faster than anything we've ever produced in the past. Think about it: you can basically make your Tiger roar by building your own filter. Creating your own plug-in filter in Core Image is something that you can do, and basically, when you do that, you end up creating a pixel shader to program a GPU. That sounds actually pretty complicated, but as it turns out, it's really very simple. The neat thing is that these filters, once you've created them, are first-class citizens: that means your filter can run as fast as our filters. You're basically going to concentrate on the kernel implementation; most of the stuff around it represents a small fraction of what you do, and a lot of your time will be spent working on the actual shader.
Filters are based on Objective-C at the top level; they have a few methods, which you'll see, and it's pretty simple to create one. A filter has an image processing kernel, which is the fun part, and you'll see what I'm talking about in a minute. All of the filters that you've seen so far have kernels and are built on top of the GL Shading Language; we have a special variant of that called the CI kernel language. The filter is embedded in the image unit bundle structure, which Frank Doepke, the next speaker, will talk about. The cool thing about this: I've been writing image processing effects for essentially 18 years, and I found that this was the easiest way to write them, and it produced the fastest results of any filter I've worked on. For instance, the zoom blur filter that I produced ended up being about a hundred times faster than the same filter in Photoshop, with that running on a dual 1.4 GHz G4.
Anyway, let's talk about the Objective-C portion. What you do is build a class definition subclassing CIFilter; in the class definition you define your input keys. The init method basically helps you locate your kernel, which is going to be in a text file in the bundle; it's actually pretty simple, although it's several steps. The customAttributes method helps you specify key defaults and ranges for your input keys; that's particularly useful if you load your filter into a program that automatically produces UI, such as Quartz Composer or any of the demo apps that we saw here. The outputImage method, finally, is something that helps you to... okay, let me take a step back. The kernel is executed once per pixel, so what you want to do is take everything inside of it that doesn't need to be there and hoist it into the Objective-C outputImage method; that's what I mean by performing pixel-loop-invariant calculations: anything you can hoist or constant-fold is pulled out. The outputImage method also organizes your input keys, as you'll see; it calls the kernel using an apply method, which is a filter method; and it also helps you specify the domain of definition of the operation, which is part of the apply, as you'll see.
Okay, so the funhouse filter, which is demonstrated up here in the corner, is what we're now showing a class definition for. You'll see inputImage, which is basically your one image that comes into it, and you'll see three parameters: the center, the width, and the amount; we're creating an interface for that. Notice the header file at the top: QuartzCore/CoreImage.h. I hope I got that right.
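Reconstructed from that description, the class definition would look roughly like this (instance-variable names assumed from the keys just mentioned):

    #import <QuartzCore/CoreImage.h>

    @interface FunHouseFilter : CIFilter
    {
        CIImage  *inputImage;    /* the one input image */
        CIVector *inputCenter;   /* the three parameters: center, width, amount */
        NSNumber *inputWidth;
        NSNumber *inputAmount;
    }
    @end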
Right. Okay, now the init method; this is where things get a little bit more complicated. What we're trying to do here is locate our kernel in the bundle and process it. The first thing you do is find your bundle: that's bundleForClass. The second thing you do is load the code, basically stringWithContentsOfFile: you've located your bundle, and you bring the kernel in, like opening a text file and putting it into a string. The third thing you do (the clicker's got to work; I probably have my hand in front of the thing, okay) is to use the CIKernel kernelsWithString method to extract all of the kernels from the contents of that file. There can be multiple kernels per file if you want, but usually a filter will only have one; the reason you might want multiples is so you can do multiple-pass operations internally to a filter, and that's useful for various things. If we wanted to build a blur, for instance, you really would have to use multiple passes. Okay, there we go.
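Put together, those three steps might look like this in code (the file name and the kernel instance variable are assumptions):

    - (id)init
    {
        if ((self = [super init]) != nil)
        {
            /* 1: find our bundle */
            NSBundle *bundle = [NSBundle bundleForClass:[self class]];

            /* 2: read the kernel source out of a text file in the bundle */
            NSString *code = [NSString stringWithContentsOfFile:
                [bundle pathForResource:@"funHouse" ofType:@"cikernel"]];

            /* 3: compile; one file can hold several kernels, we take the first */
            fhKernel = [[[CIKernel kernelsWithString:code] objectAtIndex:0] retain];
        }
        return self;
    }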
The customAttributes method: this is something I'll pretty much skim over; it's not massively important for this. You see inputWidth is defined with a range and a given default, and its type is defined as type distance; there's type scalar, there are angles, there are other possible types; check the header for the various types. You also have to set the ranges and defaults for inputAmount and inputCenter, the two other, non-image keys; they'd be provided here as well, I just didn't show them.
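A hedged sketch of one entry in that dictionary, using the attribute keys from the Core Image headers (the numbers are made up):

    - (NSDictionary *)customAttributes
    {
        return [NSDictionary dictionaryWithObjectsAndKeys:
            [NSDictionary dictionaryWithObjectsAndKeys:
                [NSNumber numberWithDouble:  0.0], kCIAttributeMin,
                [NSNumber numberWithDouble:  0.0], kCIAttributeSliderMin,
                [NSNumber numberWithDouble:500.0], kCIAttributeSliderMax,
                [NSNumber numberWithDouble:300.0], kCIAttributeDefault,
                kCIAttributeTypeDistance,          kCIAttributeType,
                nil],                              @"inputWidth",
            nil];   /* inputCenter and inputAmount would get entries like this too */
    }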
The outputImage method is probably the most important part of the Objective-C level, and you'll see the apply method being called there; it gets passed the kernel and also the other parameters of the filter. It's important to note that the parameters in the filter have to mirror exactly what's being passed into the kernel in the kernel program itself. So what we're doing here: we have one over the radius, 10 to the float value of inputAmount, et cetera; these are things that we sort of constant-folded and put in here, so we won't have to compute them in the kernel program. And as always, parameters are passed as objects, so we're using an NSNumber; if you were passing a coordinate, you'd pass a CIVector; if you're passing a color, you'd pass a CIColor; and of course the images are passed as samplers, so you'll see how we load the sampler and then pass it in there. That's really all you have to do inside of your filter, except now we're getting to the most interesting part.
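Assembled from those pieces, an outputImage method might read roughly like this (the hoisted expressions and variable names are illustrative; apply and kCIApplyOptionDefinition are the documented pieces):

    - (CIImage *)outputImage
    {
        float radius = [inputWidth floatValue] * 0.5f;
        CISampler *src = [CISampler samplerWithImage:inputImage]; /* images go in as samplers */

        return [self apply:fhKernel, src,
            [NSNumber numberWithFloat:[inputCenter X]],           /* hoisted, loop-invariant values */
            [NSNumber numberWithFloat:1.0f / radius],
            [NSNumber numberWithFloat:powf(10.0f, [inputAmount floatValue])],
            kCIApplyOptionDefinition, [src definition],           /* the domain of definition */
            nil];
    }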
You know, if filters were like a piece of candy, then I suppose the shader part, the part that processes the pixels, would be considered the soft chewy center, the best part. This is the part you want to spend your time thinking about. Okay, we're using a subset of the OpenGL Shading Language here. You can specify multiple kernels, and other functions can be included for modularity's sake: if you have something that's used multiple times and you want to extract it, you can basically treat it like you would normal C. The place where you're going to want to go to find out about the OpenGL Shading Language is opengl.org, under documentation/oglsl.html. Okay.
Each kernel procedure gets called once per pixel; I mentioned this before, and it becomes important when you decide how to build your shader. Another thing you should know when you are building your shader is that there is no knowledge passed or accumulated from pixel to pixel, so it's kind of like a ray tracer: you don't know anything other than what you're doing for this pixel; think of it that way. And like I said before, hoist as much invariant calculation as you can out of the kernel; anything you can pull out is going to save time, because the kernel gets executed once per pixel. And when you pass in colors, remember that the colors are premultiplied alpha, and in general so are the images.
Okay, so here are some effects you can look at. There's the original image; there's the funhouse effect, which provides sort of a distortion in X; the edges effect, which everybody seems to like to show (it got shown in the keynote); and the spotlight effect. These effects are all going to be shown here. So let's start with a little fun in the shader: here we've got the displacement effect. One thing you should know about displacement effects first: if you want to do a displacement effect, you want to operate using the inverse transform; in other words, for each destination point you compute what the source point is in the texture you're loading. Anybody who has done an image processing transform or a displacement transform will know this is how it works: it's kind of the opposite of "where does my pixel go". So in this case, we start by loading the destCoord; that's a built-in function in our kernel language that gives you the location, in working coordinates, of the current pixel. Then you want to apply the transform, so I'm subtracting the center x, multiplying by the radius, whatever, blah blah blah; basically, what it does is distort t1.x, which is the destination coordinate. Finally, we fetch the displaced sample, and what we're doing here is using the sample function, again with the sampler, and the coordinate being passed in is t1; but we need to run it through samplerTransform. That is in case the sampler is actually referenced under a transform, so it will take care of that for you. If you don't want the sampler to be referenced under a transform, then I guess you can just use samplerCoord instead of samplerTransform; I'll talk about that method at the end. (And that's just a battery-running-out sound going on, okay.)
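The actual slide code isn't in these captions; an inverse-transform displacement kernel in that spirit might look like this (the distortion math is invented for illustration):

    kernel vec4 funHouse(sampler src, float centerX, float inverseWidth, float amount)
    {
        vec2  t1 = destCoord();                       /* destination pixel, working coords */
        float dx = (t1.x - centerX) * inverseWidth;   /* normalized offset from the mirror axis */

        /* inverse transform: compute the SOURCE x for this DESTINATION x */
        t1.x = centerX + (dx + amount * dx * dx * dx) / inverseWidth;

        /* fetch the displaced sample, respecting any transform on the sampler */
        return sample(src, samplerTransform(src, t1));
    }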
All right, the edges effect, which has been shown quite a bit, is really quite simple: it's a cross-gradient function. One thing you want to remember when you do calculations in here, particularly on things like vec4s (which is what r1, r3, r2, and r0 are), is that you're doing all the operations component-wise, so you're actually using the power of the vector instructions to get it done several times faster, even inside the shader itself. Remember that the graphics cards are actually multi-pipelined, so when I say r1 minus r3 there, it actually does four subtractions simultaneously; but there are also multiple pipes, so it's times the number of pipes, which is the number of calculations happening at the same time. You can actually get, you know, 20 gigaflops on these cards quite easily. Okay, so the first thing we do is load a neighborhood of samples, just a square neighborhood, very simple. The second thing we do is compute the cross gradient, so we're subtracting like so, and then we do the least-squares computation and multiply by the scale, so we can scale the edges up or down. And then I just throw in an alpha of 1 at the end, just to keep it visible, since alpha doesn't otherwise have a reasonable value that you might want to put there. Okay.
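A cross-gradient edges kernel along those lines might look like this (the exact neighborhood and pairings on the slide may differ):

    kernel vec4 edges(sampler image, float scale)
    {
        vec2 d = destCoord();

        /* load a square neighborhood of samples */
        vec4 r0 = sample(image, samplerTransform(image, d + vec2(-1.0, -1.0)));
        vec4 r1 = sample(image, samplerTransform(image, d + vec2( 1.0, -1.0)));
        vec4 r2 = sample(image, samplerTransform(image, d + vec2(-1.0,  1.0)));
        vec4 r3 = sample(image, samplerTransform(image, d + vec2( 1.0,  1.0)));

        /* cross gradients; each subtraction is four component-wise ops at once */
        vec4 gx = r1 - r2;
        vec4 gy = r3 - r0;

        vec4 e = sqrt(gx*gx + gy*gy) * scale;   /* least-squares magnitude, scaled */
        e.a = 1.0;                              /* force a visible alpha */
        return e;
    }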
Okay, the final example is the spotlight. Often when you're doing 2D rendering, you'll want to use some kind of 3D effect on it to make it look cool; in this case we're doing a spotlight. The spotlight is done in the shader, and it's probably the most complicated shader. The first thing I do is get the pixel color; that's the color that we're shading. The second thing I do is calculate the vector to the light source and then normalize it. Because I'm assuming the picture is in the xy plane, the normal of the picture is (0, 0, 1); so n dot L, where n is (0, 0, 1), means that n dot L is going to be r0.z here. Then I calculate the light solid-angle cosine; this tells me how much light is being put into that spot, and it's really just a cosine (a dot product is for calculating the cosine). Then I raise it to a power to concentrate it, so you can have a concentrated beam, or a wide spread, or a thin spread. Finally, I calculate the pixel color by multiplying n dot L by the light color, by r1 (which is the color of the pixel), times the beam concentration; that gives me the final result for the spotlight.
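In kernel form, the steps just described might read like this (parameter names are invented; the r0.z shortcut is the one from the talk):

    kernel vec4 spotLight(sampler image, vec3 lightPos, vec3 lightColor, float concentration)
    {
        vec4 r1 = sample(image, samplerCoord(image));            /* the pixel we're shading */

        /* vector to the light; the image lies in the xy plane */
        vec3 r0 = normalize(lightPos - vec3(destCoord(), 0.0));

        /* n = (0,0,1), so n dot L is just r0.z */
        float nDotL = r0.z;

        /* concentrate the beam by raising the cosine to a power */
        float beam = pow(nDotL, concentration);

        vec3 lit = r1.rgb * lightColor * nDotL * beam;
        return vec4(lit, r1.a);
    }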
So there's not really a lot to it, really. Talking about the GL Shading Language, let me just go over it in general. The first thing: it uses standard C precedence, so it looks pretty much like C. The second thing is that there is no coercion in this language, which means if you want to add an int to a float, it's going to give you an error when it compiles, so constants should be specified as floating point, like 1.0 or 0.0. The third thing: you noticed in those three examples we have float variables, which are scalars, and the vector variables vec2, vec3, and vec4, which are vectors of floats; there are also boolean vectors and such things as that, but basically the vector variables are how you get a lot of the computation done. The built-in functions I should talk a little bit about. The first row of built-in functions, sin, cos, pow, et cetera: those are kind of the more expensive functions, the ones at the beginning being more expensive than the ones at the end; square root or inverse square root is actually pretty cheap, but if you have, say, sin calls in a fragment program, it may be a very slow fragment program. The second line are practically zero-cost functions; really, they cost one instruction: things like abs, sign, floor, et cetera are all available for you. And then there are things like distance, normalize, et cetera, which are extremely useful in doing radial or concentric kinds of calculations.
It also uses a swizzle syntax for load and masked store, so you can reference something like r0.r for the red component, or r0.rgb for the red, green, and blue components of it; in particular, when you're doing a store, it really just says which of the components you're storing into.
GL OpenGL static do things that we've
added to it particularly we've added the
kernel specifier we've seen in each
example the sampler type is used to
declare image parameters death chord is
added to give you this the working
coordinates location of the current
there's a sample function which allows
you to do a texture map lookup so it's
it's a very simple as you saw
in all the examples there's a sample ok
sampler chord is for giving you the
location on that sampler and that's
actually access texture unit if you're
familiar with OpenGL so each step or
make it its own texture unit the sampler
transform is one where a sampler may
have an affine transform applied to it
and that's how you can make sure that is
preserved also it literally generates in
fine it's actually two instructions at
the low level we use the underscore
underscore color type to define vector
that are color matched inside of your
inside of your shader and finally
premultiplied there if you're doing
something like a color transform
operation like well invert is a good
example color invert or any kind of
color control brighten up contrast the
first thing you want to do is unpromoted
I that color then do your light linear
print hack and make sure you preserve
alpha in your transform if that's what
you want to do so there are some things
like for instance if you just want to
scale you know do an opacity calculation
that you can do on the entire color
without we multiply or runt we multiply
you could multiply your vector for color
by the opacity fraction to get you know
a pre multiply color that was correct
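As a sketch, a color-control kernel following that recipe (unpremultiply and premultiply are the built-ins just described; the brighten operation itself is illustrative):

    kernel vec4 brighten(sampler src, float amount)
    {
        vec4 c = unpremultiply(sample(src, samplerCoord(src)));
        c.rgb = c.rgb * amount;      /* the light-linear color operation; alpha preserved */
        return premultiply(c);
    }

    /* the opacity shortcut works directly on the premultiplied color:
       return sample(src, samplerCoord(src)) * opacity;               */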
Okay, and basically that's the entire thing; that's how to write the internals of a filter. Now I'd like to introduce our packaging expert. Thank you very much.
Thank you. Good afternoon. As you can see, I'm the packaging expert, on how we now put the eye candy into the box. So, what are image units? Image units are our architecture for providing a plug-in that can be used in any application that hosts the Core Image architecture, and for that we chose the bundle as the delivery mechanism, which makes it actually very easy to write an image unit.
The key point here is that this is actually your business opportunity. We know that this is something we're only now introducing in Tiger, but applications will pick up this technology; we've already talked to our own application teams, and in the future it will be all over our stuff. So there's a good opportunity for you to write just a filter and have a good audience of people who want to use these filters. One concept that is interesting to know about here is that we have non-executable filters. That means the plug-in contains just a CIKernel; that's all you provide, and you don't have any CPU-executable code. That's important when you talk about security-sensitive applications, like some system servers, or if you have something like a screensaver you want to pass around: there you definitely don't want any kind of Trojan horses or viruses in it, since they would be executed without you even knowing it.
So now let me talk a little bit about where we actually store these image units. Location is the key point here, and there are two spots where you would normally install them: the graphics plug-in folders inside the system Library folder or the user Library folder. That's where the standard load API will look and find your filters. If your application has additional filters that you just want to keep in your own bundle, we have an API that will load those units one by one, so you have to call it yourself for each of them. And that brings me over a little bit to the structure of our image units.
You can see I put up a little screenshot of how it looks in Xcode. At the very top I have a little loader part; that's all my Objective-C code. Then I stole Mark's filter, the funhouse filter, which has some Objective-C stuff in it. And then, in the Resources part, that's where the real contents of the image unit are: we have the CIKernel files, which, as Mark explained, are really the core of the filter, and we have a description plist, and that is the part which tells us what is really inside this image unit and what you get out of it. That's especially important for the non-executable ones, because we have to communicate somehow which parameters can be passed in and out. And we also provide a way for you to put your localization into it, so you can provide your filter in multiple languages. Then let me go to the API.
As you can see, it's really extensive: we have three calls that are important for loading. The first one will just load all filters. The second one will only load the non-executable ones, so if you write an application where you want to make sure it cannot load any security-sensitive plug-ins, that would be the call to use. And the third one is the one you would use if you have your image units in your own application bundle and want to load those: you would just go through your folder, and for each of the bundles that you find there, you would load them with this API call.
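Those three calls live on the CIPlugIn class; a sketch (the plug-in URL is hypothetical):

    #import <QuartzCore/QuartzCore.h>

    /* load every image unit from the standard Library locations */
    [CIPlugIn loadAllPlugIns];

    /* or, in a security-sensitive host, only the non-executable ones */
    [CIPlugIn loadNonExecutablePlugIns];

    /* or load one bundle that ships inside your own application;
       the BOOL mirrors the executable / non-executable distinction above */
    [CIPlugIn loadPlugIn:myPlugInURL allowNonExecutable:NO];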
On the other hand, when we look at the plug-in side, we have a very simple call that we make on the principal class of your NSBundle, and that is the load call. You can see it actually returns a boolean, so this is a place where you could, for instance, do your registration, or check your hardware requirements: you could see, oh well, that's a serial number that ends in a three, I'm not running on that machine, so I can simply return false, and the filter will not be loaded.
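On the plug-in side, that load call comes from the CIPlugInRegistration protocol; a sketch of a principal class implementing it:

    #import <QuartzCore/QuartzCore.h>

    @interface MyPlugInLoader : NSObject <CIPlugInRegistration>
    @end

    @implementation MyPlugInLoader
    - (BOOL)load:(void *)host
    {
        /* do registration or hardware checks here;   */
        /* return NO and the filter will not load     */
        return YES;
    }
    @end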
So, from there, I would like to go to demo machine two. Okay. What we see here is again the Xcode project, and I just want to go quickly over the stuff; since we've seen how the funhouse filter works, I won't go too much into detail there, but I'll show you now an example of a non-executable filter. This is my filter function here; it's something like a pixelate filter, it just does a slight twist to it. And at the end I have the description plist. The only part that's kind of important to know about my little kernel is that I have two parameters that I need to pass in: it takes a sampler, which is my image, and a scale value.
So now I open the description plist in our plist editor. It might be a little bit hard for you to read, but I'll try to walk through it so that you understand what I'm doing here. In the description plist we see that I have two filters: in the first part the funhouse mirror filter, and then my simple kernel filter. The important parts are the attributes, and as Ralph mentioned in the beginning, we have different categories, so I can say: okay, this is a distortion effect (that's how it can be detected), and it's suitable for video and also for still images. This is my display name; pay attention that here it still says "my kernel". Further down, I'm actually using our localization technique, so that later on, in the UI, it will show up with a correct and a little bit nicer name. Then the important parts are actually the inputs, and I have two of them, as I already mentioned: one is an image, so I just give it a class and say, well, this is my input image; and the second is an NSNumber, which is the scale, a float, so I can just set some scale value up here. That's pretty much all I need. So what I do now: basically, I have to build this bundle once and put it in the right location, and then I can use it in my applications. For this I go into Quartz Composer. Right now what I've set up is simply an image that will be shown on the screen, so it looks like this: we have a surfer.
And now I can simply go in here and look into my distortion effects, and look there: it's my kernel filter, and you see it has a little bit nicer name. All I will do now is simply reroute the image through that filter, and when we see it again, it looks a little bit pixelated. I can do the same thing with the funhouse filter, which we also have in here; let me reroute the image through it, and as you can clearly see on the sides of the image, there's the distortion. So that shows how easy it is to create an image unit and how to use it as well. And as no pixels were hurt in this demonstration, I would like to pass back to Ralph to finish up our presentation. Thank you.
Well, Frank said that no pixels were hurt in this demo; that's not totally true. There were two, but they had it coming.
Okay. So what I'm going to do now is give you some ideas of what you could do with these things. I assume that if you're in the business of writing an application that deals with photographs or video, you should have a good idea by now of what to do with it, but there are also applications where the use of Core Image isn't totally obvious and could actually make a nice difference. The first thing I would like to say here, the key message, is that in this millennium, image processing is something that happens in the display pipeline. There is no separate render stage: Core Image does most of these things in real time today, and on next year's hardware it will most likely do very significant portions of whatever you would ever want to do on an image in real time.
So if you're building an application that has a bunch of settings, and then you press Apply and it gets rendered into a bitmap, that's probably the wrong UI to pursue. Instead, try to make the application respond in real time: you have a slider, and as the user moves the slider around, the image data is processed right away. This has a bunch of interesting implications. For example, take an undo buffer: in the last millennium, undo buffers on images were really quite a science; you had to keep megabytes of data around for each stage. In this case you no longer do that: you just have your filter and a handful of parameters, and the only thing that undo really affects is that handful of parameters. So things like indefinite undo on an image become almost trivial.
So here's a bunch of ideas of what to do with it: processing photographs and video, using it for transition effects, or working on creating a richer user interface. We're running kind of late on time here, so I'm going to switch to the demo right away; demo machine two, please. Okay, the first thing I would like to demo (oops, which key is it): well, you saw Dashboard in the keynote, and when I take something and drag it in here, you see this little ripple effect. That's Core Image at work. This is an example of how you can put Core Image somewhere in a little piece of your application and do something nice with it. Let's do it again, just because it was so much work to get the ripple working. Okay.
So I have two more examples. The first one is a little toy. When I'm really bored on airplanes, and that happens a lot, I take pictures out of the window, and unfortunately they look like this. The key problem is there's just so much air between you and your subject that it produces this haze effect, and everything looks completely washed out. So I tried to build a filter that corrects some of this. This slider here simulates the distance to the ground at the bottom of the image, so I can move that around; let's see, this is about the brown that you would expect from the ground. And this second slider is the distance at the top, so it assumes there is a slanted plane you're looking at, and you can adjust that like this. By the way, this is Crater Lake in Oregon, so it's just pretty to put up here.
So this is essentially the before and after shot. It's not exactly a problem that you encounter all the time; it's a very specific solution, and I would probably never have attempted it if it would have taken me half a day of work. But with Core Image it was literally 40 minutes of work, so I could experiment a bit and try to do this. Probably the white point isn't quite right, so that mountain needs some adjustment. By the way, the source code for that, as well as source code showing how to build an image unit, is available on the Apple website. Source code is also available for this example here, which is essentially doing nine different transitions all at the same time.
Why do I show that? This is an example of what new types of UI this could enable. Imagine an application like Keynote, which has a widget to select the transition from one slide to the next. Today, that widget is a pop-up menu with a little preview at the bottom, but with the power of Core Image in your hands, you could actually skip the pop-up and just show all the transitions at the same time; the user just clicks on one. This makes a more compelling UI, because it's clearly more discoverable what kind of functionality is available. And I have to admit I was cheating for this demo: this morning I found a bug in our GPU implementation for this particular case, so this is actually running on the software renderer.
Okay, with that, we go back to the slides. So where do we go from here? Well, tomorrow morning there's the graphics and media lineup; my colleagues and I will be there, and if you have questions, that's the right place to go. If you're interested in how to use Core Image together with video, then the New Directions for QuickTime Performance session is the one to go to; you will learn about Core Video and how to pipe video frames through Core Image. And on Friday there is the Discovering Quartz Composer session. You saw it in Frank's section: Quartz Composer is a really great tool to rapidly prototype, string a bunch of filters together, and figure out how things look; and if you build your own filters, it will load them, so that's the session to check out. For more information, there's documentation available on the Tiger DVD: there is the reference on Core Image, and on the Apple website, connect.apple.com, there is the actual architecture documentation. A fairly rich set of all the filters is in there, with before and after images that explain what they're doing, so it's a great way to start.