WWDC2003 Session 208
Transcript
Kind: captions
Language: en
Good morning, everyone. My name is Travis Brown, I'm the Graphics and Imaging Evangelist, and I want to welcome you to Session 208, Fragment Programming with OpenGL. This is our second session on programmability, in terms of our focus on exposing the capabilities of the GPU to do interesting things like vertex operations and, in this case, very advanced per-pixel operations at incredible speeds. A lot of you saw some demonstrations using fragment programs earlier this week in the Graphics and Imaging overview, and what we're going to do in this session is really drill down and focus on fragment programming. We're really looking forward to seeing what you, the developers, are going to be able to do with this incredible new technology. So it's my pleasure to invite James McCombe to the stage to take you through the presentation.
[Applause]
Morning, everyone. Thanks, Travis. So today I'm going to introduce you to this pretty new technology that we have. We have a really great implementation of it at Apple, combined with some, I think, really pretty cool tools that should make this easier to do than it would be on some of the other platforms. Before I start, I just want to point out, because a lot of people have noticed this: if you think my spelling is wrong, I don't think it is. It's just that I'm using the UK spelling for a lot of this stuff, and I was criticized a lot for this, so I'm going to warn you now.
Anyway, first thing: why would you want to do fragment programming? Well, it has so many applications, which is one of the exciting things about it. Not only can it be used for 3D lighting calculations and the like, but you can also use it for 2D effects; think of the Photoshop filters. You can implement even color-correction-style things using these programs, you can do 2D displacements, all sorts of interesting things, things that you maybe wouldn't even think of, such as using the GPU for general-purpose computation, thinking of it as a second processor in your computer.
The other great thing about it is that you can load this program onto your card, open up your CPU monitor, and, surprise surprise, you'll notice it's taking no CPU time. That's a pretty compelling reason to offload some of your intensive per-pixel calculations to the GPU. And then my third point: Apple, I think, definitely provides the best tools in the industry for building shaders. Last year I was here, showed you vertex programming, and showed you how to use Shader Builder to do that. Well, Shader Builder has had a lot of work done to it and now has support for new languages and a lot of new features, which I'll cover later in my presentation.
So, to run through the seven points here, what you're going to learn: where does fragment programming fit in the OpenGL pipeline? We'll also take a look at whether your current graphics hardware allows you to do this; if so, great, and if not, what do you need to buy next? The next thing we'll look at is ARB_fragment_program. This is the language the Architecture Review Board approved for fragment programming, a great language, and I'm going to talk more about it later. If you don't have the absolute cutting edge and you weren't willing to buy that, we'll look at what sort of consolation prizes we have for you. Then we'll run over Shader Builder; I'll open it up and show you a basic fragment program, and we'll work our way forward from that to show you just how the language works. Then I'm going to show you something kind of interesting with fragment programs, which is using the GPU for general-purpose computation. Then I'll move on and show you how to get multipass working in the most optimal way with our OpenGL implementation at Apple. Then we'll look at how optimization applies to fragment programming; as programmers, we optimize, or we ought to, often, and we'll see how that applies to this slightly different way of thinking about software. And finally I'll move on to something a little more math intensive and look at implementing per-pixel lighting in a fragment program; I'm going to pull that apart and hopefully remove a lot of the mystery that may surround it. So, starting off with my
first point: where does it fit in the OpenGL pipeline? Well, this has been covered pretty extensively in some of the prior sessions, so most of you, I think, will be familiar with the traditional OpenGL pipeline, and I'm going to try to move quickly on this slide. You have your vertex data and you have your pixel data. Vertex data can enter through immediate-mode calls or vertex arrays; pixel data usually comes in through the TexImage calls; either could also come from a display list that you prepared at program initialization. Pixel data goes through all the pixel-store stuff, gets unpacked, and makes its way toward the graphics card. Now, for vertex data, traditionally a very fixed function is applied to the vertices that you submit: they're simply transformed by the modelview and projection matrices inside OpenGL, then clipped into window coordinates, and then OpenGL makes use of the GL lights you've enabled and calculates the vertex colors, based on a fairly primitive lighting model, on a per-vertex basis. The colors are then interpolated across the polygon surfaces, which we all know look so good. After that is calculated, the rasterizer goes along and fills in the colors in the frame buffer. And then this last part, the blue dotted line that's just come up, indicates the process of feedback: taking your frame buffer and eventually bringing it back around so it can be fed into a texture unit again for multipass operation. So I've pulled out the fixed-function transform, clipping, and lighting, and I'm also going to pull out the fixed-function fragment operations. Vertex programs, which have been around for longer, replace the first part with a programmable language. Today I'm going to talk about replacing the second part with an ARB fragment program.
Supported hardware: this slide I'm going to amend slightly. At this point, the card for ARB fragment programs is the ATI Radeon 9700, but the good news now is that all of our new high-end systems, the new G5 boxes, all of them will support ARB fragment programs. So if you've been thinking about buying one of those, maybe this is another reason you might want to look into it. And then there's ATI_text_fragment_shader. This is a significantly less capable language, but something you might want to look at, and I'll give you a little bit of information on it later. It's supported on some of the lower-end cards; for instance, our current 15-inch PowerBooks, the latest ones, do have a graphics card that's capable of vertex programming and supports this ATI vendor-specific extension. So, what exactly is a fragment
program? Just out of curiosity, how many people here have actually been writing fragment programs in the last while? How many people have done it before? Okay, that's great. So what exactly is it? Basically, it's a little program that you upload to the graphics card, and it's run once for every color lookup during rasterization. Whenever the rasterizer is going across and filling in the pixels on the surface of a polygon, normally it goes through a fixed function which can, you know, get texels from the different texture units and blend them together. Well, this allows that process to be totally programmable. So yes, you could go and get the data from the texture units, or you could just return any color you want based on a mathematical function. Again, a fragment program has a very well-defined output: a single RGBA color, and optionally you can specify its position in the depth buffer, so it can be culled by later stages of the GL pipeline. As regards inputs to the fragment program: there are the texture coordinate channels, which you would normally set up in the vertex program. You'll note that when you write an output vertex, you can write into the texture coordinate channels, and the number of channels is equal to the number of texture units on your graphics card. The values you write in there in the vertex program are interpolated across the surface that the fragment program is running on, so you get that interpolation for free. So that's one of the inputs. You also get the interpolated vertex color and the position of that fragment in the window, in window coordinates. Also, with ARB fragment programs you can access all of the OpenGL state, which is certainly very convenient; that means you get access to the matrices, and also the program parameters, which I'll discuss. And then the other important input is that you can sample texels from any of the texture units, which is important.
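As a conceptual model (this is my own illustration in Python, not anything that runs on the card), you can think of a fragment program as a function the rasterizer calls once per pixel, with the per-vertex values already interpolated for you:

```python
def fragment(texcoord, color):
    # A trivial pass-through fragment program: output the interpolated color.
    return color

def rasterize_span(color_left, color_right, width):
    """Model of the rasterizer: interpolate the per-vertex input across a
    span of pixels and invoke the fragment function once per pixel."""
    pixels = []
    for x in range(width):
        t = x / (width - 1) if width > 1 else 0.0
        # Linear interpolation of the per-vertex input, done "for free".
        color = tuple(a + (b - a) * t for a, b in zip(color_left, color_right))
        pixels.append(fragment(x / width, color))
    return pixels

pixels = rasterize_span((0.0, 0.0, 0.0, 1.0), (1.0, 1.0, 1.0, 1.0), 5)
```

The fragment function never loops over the surface itself; the hardware runs it independently, and in parallel, for every pixel the rasterizer touches.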
Now let's take a look at ARB_fragment_program specifically. Great news here: if you've programmed ARB vertex programs, the language is totally parallel to that; it was intentionally designed that way, so if you know ARB_vertex_program, you know ARB_fragment_program as well. The only real difference is that instead of dealing with X, Y, Z, and W components to specify a vertex position in 3D space, you're now defining red, green, blue, and alpha components for an output color; the same thinking applies to all these instructions. There are a few exceptions, but ninety percent of these instructions are vector instructions, meaning if you do an add, it does the add for all four components. Some of these instructions, which I'll talk about later, are considered scalar instructions; for instance, the power instruction is a scalar instruction, meaning it only deals with one component. So if you want to raise all four components to a power, you need four instructions.
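The vector-versus-scalar distinction can be sketched in Python (a conceptual model only, with names of my own choosing): a vector instruction like ADD touches all four components at once, while a scalar one like POW must be issued once per component.

```python
def add(a, b):
    # Vector instruction: a single ADD operates on all four components.
    return tuple(x + y for x, y in zip(a, b))

def pow_scalar(x, e):
    # Scalar instruction: POW deals with one component at a time, so
    # raising R, G, and B each to a power costs three instructions.
    return x ** e

color = (0.5, 0.25, 1.0, 1.0)
brighter = add(color, (0.1, 0.1, 0.1, 0.0))                # one instruction
contrasted = tuple(pow_scalar(c, 2.0) for c in color[:3])  # three POWs
```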
So, to group these together: those are all your arithmetic instructions. You've also got some logic instructions, but unlike programming a general-purpose CPU, don't expect to be able to do bit shifts and things like that; that isn't going to happen, no shift-left or shift-right or any of that. You do have some basic logic to determine whether something is equal to zero or not, or greater or less than zero. The other thing, and these are the important ones that ARB_fragment_program adds, are the texture sampling instructions. These allow you to get things from the texture units, with TEX being the most common one. And then there are some miscellaneous instructions: like vertex programs, you have swizzling, so you can reorder the components in a vector, and you have this KIL instruction. I haven't actually used it yet in a fragment program, but what it's supposed to do is stop a fragment from continuing down the OpenGL pipeline; it literally just kills it off, so there might be some possibilities for you with that instruction. So, the inputs: fragment program
attributes. I've already covered this, a bit of duplication here, but it still needs to be on the slide: color, texture coordinates, and so forth. Then again there's the OpenGL state: matrices, lights, materials; these are all accessible in the fragment program. Then you've got these convenient little program parameters, which mean you can make a fragment program that is parametric. For instance, if your fragment program is implementing a lighting model, you want the position of the light in 3D space to be a parameter of the fragment program. You can define a variable in your fragment program, and then there is a new OpenGL entry point which you call (the ARB program extensions provide entry points such as glProgramLocalParameter4fARB for this), and you can hand in four floating-point values; those four values get uploaded to the card, and when you flush your next scene, they get applied. And then finally your outputs: the result color and depth. Advantages of ARB_fragment_program: portability. This is the standard now, so if you're programming in this, I think you'll find you're in good shape, because a lot of vendors are going to pick this up, and our high-end machines support it. I think it's a good bet to go with; the ARB approved it, which always helps. Also, the instruction set is very rich in comparison to some of the others I'll show you, which means you're better able to implement better models for lighting or whatever, and you don't need to generate so many frustrating look-up tables for instructions, because they're natively supported on the hardware. Also, flexible texture sampling: there are certainly hardware design challenges to certain aspects of this, which I'll discuss maybe a little later, but with ARB_fragment_program you can do many, many samples, and you can have them depend on each other. It's really cool.
The next thing I want to cover is sort of to support some legacy hardware: ATI_text_fragment_shader, a vendor-specific extension. I've certainly written plenty of fragment programs in it, and it's certainly quite usable, but it's not very pleasant to program in. It's float-based, basic fragment programming, and there's a fair amount of current hardware support. Shader Builder does fully support this language, so it's not like we aren't going to provide a developer tool for it; we do provide all the tools, we just discourage you from using it unless you absolutely must. So now I'm going to
introduce you to Shader Builder again. For those of you who saw it last year, this time it's quite a bit different, with support for new stuff. Here's a screenshot of it right here; this is the new layout of Shader Builder. The top left of the window looks pretty familiar to those of you who have seen it before, except you'll notice that now the rendering window, the OpenGL view, is actually detached in a separate floating window, which is convenient because the multi-head users out there will be able to drag it onto a separate display, and they'll be able to code on one screen and see their graphics full screen on the other. Another major thing Shader Builder adds is the ability to monitor the actual state of the graphics card in terms of how much of its resources you're using. You'll notice in the top-left-most window, just above the debugger buttons, these little gauges showing how much of the card's resources you're using; as you put more instructions in, that's going to keep growing until it hits the top, and if your fragment program stops working, well, at least now you have a fighting chance of determining why: you filled up your graphics card, and you go buy a new one. And then the other great thing here, unlike what was there before: on the right, just above the resources and performance box, you can see this list of identifiers. When you're writing your program and you declare an identifier, Shader Builder will actually take that identifier and put it in this list for you, meaning that when you're running the program, if one of those identifiers is a program parameter, you can pop open this great symbol editor, the bottom-right-most metallic window, click on it, and slide those values, and that will actually change the parameters on the card. You can have any number of those parameters, up to the hardware limit. That has some interesting side effects which I'll cover later, which are pretty exciting. So just to summarize: it adds support for ARB fragment programs, as I covered; there's more control of the texture units in this Shader Builder; again, program parameters can be changed on the fly; you've got the resource monitor; it's a better solution for the multi-head folks because the rendering window can be put on a separate head; the underlying code has had some rework for better code sharing with the OpenGL Profiler; and the documentation has improved, so there's full documentation for all of the languages, with a pretty nice UI for it. And it ships with Panther, so on your CD, you've got it, it's all on there, in addition to example code that's available on
the website. So let's switch over to the demo machine, and let me show you Shader Builder. Demo two, actually. Great. So what we have here is a lot like the screenshot I showed you: on the top right of the window we have the rendering view, which I can move around; at the moment it's just showing a simple quad, which is really what you want if you're just writing a fragment program. You want to see the pixels; you don't really care about the geometry so much for a lot of the things you'll be doing. On the bottom right, again, you've got the texture units inspector. If I click here, this shows me that in texture unit 0 I have this rock texture loaded, and in texture unit 1 I have the water texture, which is what's currently being shown. Now let's look at the code; we can see basically what's going on here. This is a very simple pass-through fragment program, doing nothing special. Remember, the program is running once... I beg your pardon. Oh, it looks okay here; it must just be... there. Okay, sorry about that.
So what is this doing? This is running once for every pixel, let's just think of it that way, and the TEX instruction basically acquires a color: it gets an RGBA color from a texture unit at a specified texture coordinate. You can see that I have a temporary variable declared, which I've just called t0, and I am sampling from texture unit 1, that's the second argument, into t0, using this fragment texcoord that you can see; that's the interpolated texture coordinate it's using to do the lookup. And then in the last instruction here, I'm simply moving t0 into result.color; result.color is a fixed thing in the language which defines the output color you're writing. Let's look at something I could do: for instance, this texture 1, that's the texture that I'm sampling from, so as I change it, you can see the output changing. That's very straightforward. Let's look at something a little more interesting. Say we wanted a fragment program that would allow a program parameter to adjust the brightness of the image, and do it all in hardware. How would you do that? Okay, let's create a new
line of code right here. So what we've done is declared a parameter p which is bound to a program parameter, meaning there is a GL entry point I can call with index zero, and I can pass in floating-point values which are sent to the card and used in this program. So let's implement a brightness control. Pretty straightforward: a simple multiply will do it. Remember that it's like assembly here; if you've written vertex programs, you'll be familiar with this: it's the destination first and then the arguments. Multiply takes a destination and two arguments, so we multiply t0 with the x component of p; that's the first one we'll change to make this work. Now, right now the brightness is at the bottom, so we're seeing black, but watch this: if we move over here, you'll notice p is in this identifier list. I'll select it and open up an inspector right here, the symbol editor, which allows me to change those values. Remember the brightness is stored in the x component, so I'm editing p, and you can see right here I can change the minimum and maximum values this slider will go to; I'll set the range to go from 0 through 2, just to keep reasonable values. And as I slide this, you can see it's sending that value to the graphics card, and on the graphics card there's a brightness control implemented in hardware. That's pretty straightforward stuff. Now let's say I want to
implement a contrast control as well. I'm going to do a pretty cheesy implementation of contrast here: I'm basically going to raise the color to a power, where that power is the contrast. And, as you may recall, I mentioned that the power instruction is a scalar instruction. Note that I only had one multiply instruction, yet it multiplied all three red, green, and blue components for me; well, the power instruction will not work like that, so I'm going to need three of them, one for each channel. So I'm dealing with the red channel here; red channel dealt with; green... And you notice that Shader Builder is keeping up; like before, as I type, it's all real time, it's updating on the fly, and it's showing me all the syntax information and highlighting the line an error is on. It's just great to work with. Okay, now we have a contrast control. If I now take the y component and set the range like I did the other one, 0 through 5 happens to work well here. It's at totally low contrast right now, hence white, but notice that as I increase the contrast here, it's becoming more and more contrasty, and again the brightness control works as before. So this is a simple example of writing a fragment program and of how you get program parameters into it.
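The math the two sliders drive can be sketched in Python (a model of the shader's per-pixel arithmetic, not code that runs on the card; the function name is mine): one multiply for brightness, then a per-channel power for the cheesy contrast.

```python
def adjust(rgb, brightness, contrast):
    """Per-pixel brightness/contrast as in the demo fragment program:
    a single vector multiply, then three scalar POWs (one per channel)."""
    r, g, b = (c * brightness for c in rgb)               # MUL t0, t0, p.x
    return (r ** contrast, g ** contrast, b ** contrast)  # three POW instructions
```

Note that the card runs this once per fragment; the CPU only ever sends the two slider values.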
So let's head to the next part of the presentation, back to the slides. I'm going to try to keep the rest of this presentation pretty example-centric and get you thinking. Let's look at a 2D image displacement. Think here of the Photoshop filters that allow you to do, for instance, a twirl effect; that's what we want to be thinking about. It's another example where, if you offload stuff to the GPU, performance is excellent; it could make interesting in-scene transitions in a game or something. There are certainly possibilities. Now, a little
background on how this is going to work: I'm basically going to show you how to do a twirl effect in a fragment program, in a single pass. The way it's going to work is, imagine a texture map where the colors of the pixels actually define movement vectors, displacement vectors. What I'm showing here on the left of the slide, pardon me, is a very low resolution two-by-two texture which would do a very coarse twirl effect. For the top-left-most texel, the one that looks kind of red, that's pointing to the right; it's encoding the X and Y in the red and green components, which you can see there, and you can see it showing as a twirl. In my fragment program I'll be able to sample those texels and interpret them as vectors, not colors. Think of texture units as just being look-up tables at this point; think beyond what you've used them for before. An important thing to note: colors in texture units have to be between 0 and 1, in positive space. I want these vectors to have full freedom of motion, so what I need to do is pack them into the texture in a slightly different way, whereby they can go from minus one through to one. A simple equation: I just divide by two and add 0.5, pushing them into positive space and keeping them between zero and one. In the fragment program I will do the reverse of that to unpack them.
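The packing just described can be written out in Python (an illustration of the arithmetic only; the function names are mine):

```python
def pack(v):
    # Map a displacement component from [-1, 1] into the [0, 1] range
    # a texture color channel can hold: divide by two, add a half.
    return v / 2.0 + 0.5

def unpack(c):
    # The reverse step the fragment program performs after sampling:
    # map the color channel back into [-1, 1].
    return (c - 0.5) * 2.0
```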
That describes what the displacement map is. So, back over to the demo machine, where I'm going to open up the other example I have laid out here.
Okay, let me just make it so you can see what's going on... okay, the same problem again, huh? Okay, there we go, thank you. So, switching to the fragment program: it's a pretty straightforward fragment program, and it does much like what I described, if you follow the comments. Let's take a look at what's in the texture units. In texture unit 0 we have this rock texture, which is the texture I'm going to displace, going to warp; and in texture unit 1, I wrote a little program that basically generated a spiral sort of effect, encoded those vectors as colors, and put them into positive space, which is what you're seeing in texture unit 1 right there. So my first instruction here is the TEX instruction, which samples the displacement map; I'll just highlight that right here. Then it does what I described, which is scaling it from color space into a vector which has negative and positive components. Then I have a program parameter, a bit like what I showed in my last example, which allows me to control the magnitude of the effect: by multiplying that vector by a normalized magnitude, I can control how much of a spiral effect is going on. I add it to the original texture coordinate, so I'm displacing the texture coordinates here; I'm actually changing the texture coordinate for each fragment, and that's how you do the warping effect. And then finally, when I have a texture coordinate, I sample from the rock texture map in texture unit 0. So let me
show you what it looks like. I select the magnitude parameter in the list, I open the symbol editor again, and as I slide this, you'll see there's actually a spiral effect going on. You know, it might not look as good as the Photoshop filter, but the reason for that is simply the quality of the displacement map that I put in; it was crudely generated by a simple program. You can imagine how this same code is totally generic: you could create a displacement map that was a sine wave filter, or any number of other displacements, on this same fragment program. All I'm doing is sending four floating-point values across the bus; this is all running on the graphics card. I think that's pretty exciting.
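Put together, the per-fragment steps just described (sample the displacement map, unpack, scale by the magnitude parameter, displace the coordinate, sample the source) look roughly like this in Python; the tiny grids and names here are mine, purely to model the arithmetic:

```python
def warp_lookup(image, dispmap, x, y, magnitude):
    """One fragment's worth of the twirl shader: sample the displacement
    map, unpack it from color space, scale by the magnitude parameter,
    and use the displaced coordinate to sample the source image."""
    dx_packed, dy_packed = dispmap[y][x]
    dx = (dx_packed - 0.5) * 2.0 * magnitude   # unpack, then scale
    dy = (dy_packed - 0.5) * 2.0 * magnitude
    # Displace the coordinate (an integer grid stands in for normalized
    # texcoords here), then sample the source texture, wrapping at edges.
    sx = int(x + dx) % len(image[0])
    sy = int(y + dy) % len(image)
    return image[sy][sx]

# A 2x2 source image, and a displacement map whose top-left texel encodes
# the packed vector (1.0, 0.5), i.e. "one texel to the right".
image = [[1, 2], [3, 4]]
dispmap = [[(1.0, 0.5), (0.5, 0.5)], [(0.5, 0.5), (0.5, 0.5)]]
```

With the magnitude parameter at zero the lookup is undisplaced; raising it bends the lookups along the encoded vectors, which is the whole warp.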
Okay, let's move back to the slides. So now
we're going to move on to general-purpose computation in a fragment program. Given a new piece of hardware, it seemed utterly correct to me to implement the Game of Life on it, so we can say thank you to John Conway for coming up with this interesting scheme, and it was quite a while ago, too. So yes, it can be done using a fragment program, and it's a natural fit for one, because it's a totally parallelizable problem. This is a great thing: if you have a mathematical problem which can be parallelized, fragment programs are a good place to do it, as long as you don't have too many instructions to fit on the hardware. It's also a great example of multipass, because when you calculate one frame of Life, you need to feed it back in and run it again, and you keep doing that for every time quantum, until either it stabilizes or it keeps going. And yes, it runs extremely fast; it's really cool, and I'm going to show you that very shortly. A refresher course on the Game of Life, for those who haven't looked at this in your daily job: the configuration that we see on the left of the display is a sort of famous configuration in Life called the R-pentomino. It's interesting because a lot of Life initial setups will die out and you'll just end up with a black screen; well, this one does not. It runs and looks interesting for, I think, something like 60 time quanta, and then eventually it stabilizes.
What Life is: imagine a grid, and each cell is either alive or dead; there are no gray zones. The way to think about it here is that we're running a calculation for every single cell on the grid: we want to figure out whether this current cell is going to be alive or dead in the time quantum we're trying to calculate. The way you calculate that is very straightforward. First of all, you sum up the total of the living neighbors in the surrounding eight cells; that's the basic equation on the left. When you have the total number of neighbors, you apply the very simple, consistent rule that I've outlined on the right: if your current cell is dead and you have exactly three living neighbors, you become alive, you turn yourself on; if you're currently alive and you have two or three neighbors, you stay alive; otherwise you're fighting for resources, in a sense, and you die. That's the basic principle. Run this for every pixel, do it every frame, and you'll get these interesting little colonies growing.
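The rule above is easy to check in plain Python; this is a CPU sketch of what each fragment computes in parallel on the card (the grid representation and names are mine):

```python
def life_step(grid):
    """One generation of Conway's Game of Life. Each cell does exactly
    what one fragment does on the GPU: sum the eight neighbors, then
    apply the birth/survival rule. Edges wrap around."""
    h, w = len(grid), len(grid[0])
    nxt = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            neighbors = sum(
                grid[(y + dy) % h][(x + dx) % w]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0)
            )
            # Birth on exactly 3 neighbors; survival on 2 or 3.
            nxt[y][x] = 1 if neighbors == 3 or (grid[y][x] and neighbors == 2) else 0
    return nxt
```

On the GPU the double loop disappears: every cell's rule runs simultaneously, one fragment per cell.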
Let's take a look at the fragment program that implemented this. I'm not going to show you how I did it in Shader Builder; I figured it would probably be more intuitive to show it on the slides here, building it up. First things first, declaring our variables. The important thing to note for Life is that we need to be able to address each of the pixels individually in the fragment program, and remember that in OpenGL, if you're using 2D textures, the texture coordinates are normalized: the left-most is 0 and the right-most is 1. If we have a 256-by-256 image and we want to address each of its texels, we need a scale factor to be able to address each of the pixels accurately, and that's very important in a calculation like this.
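Concretely, with normalized coordinates the distance between adjacent texels of an N-wide texture is 1/N, so that scale factor is exactly the step to a neighboring cell; a small sketch (the naming is mine, and texel centers at (i + 0.5)/N follow the usual OpenGL convention):

```python
def texel_center(i, n):
    # Normalized coordinate of texel i's center in an n-texel-wide texture.
    return (i + 0.5) / n

def neighbor_offset(n):
    # The scale factor: one texel's width in normalized coordinates.
    return 1.0 / n

n = 256
left_of_tenth = texel_center(10, n) - neighbor_offset(n)
```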
The next thing we do is find the coordinates of the surrounding eight cells. Remember, this fragment program is currently determining the destiny of one cell, whether it's going to be alive or dead, so it needs the coordinates of the surrounding cells, and those are the instructions that compute them; it's pretty straightforward what's going on there. Next, having got the coordinates of the surrounding cells, we need to sample them, so we're using the texture sample instructions here to get the colors of those surrounding pixels, the surrounding cells. When we have that, we add them together, very straightforward, and then we implement the Life logic. This is actually a little tough because of the limited set of logic instructions; there's nothing nice like an if, so I had to use things like greater-than-or-equal-to-zero and less-than-or-equal-to-zero, move values around, and build the logic out of that. It's not very clean, but we'll look at a better solution in a minute. And then finally, when we have the destiny of that cell, alive or dead, we write it to the output color. This runs in parallel for every pixel, automatically.
pixel on there automatically so ok we've
calculated what the life grid is going
to look like for the first time quantum
how do we get ready we need to feed that
back in for the second calculation so
how do we do that well opengl has
provided the api for this for a very
long time on our platform it's extremely
fast it's really cool because the data
does not have to travel across the bus
the frame buffer can be fed back in to a
texture unit without even leaving the
graphics card to the process of feedback
your CPU would you'll see nothing
happening and your bus traffic will be
zero because it's all on the graphics
card if you follow these simple
instructions the first thing you
need to do a GL read buffer which
defines where GL is going to read data
from and you would set that to be GL
front left which you could make it an
auxiliary back buffer but for simplicity
I just made it a single buffered
application so I just got it from the
front buffer and then when you have the
read buffer set you do a GL copy text
sub image to D which you basically get a
rectangle from that and puts it into a
texture target again it's covered pretty
pretty much what i've said there enough
a little code snippet it's only two
lines to do it the it's very fast now
Now, the moment you've all been waiting for: what does Life look like when it runs as a fragment program? Back to demo two here; let's say goodbye to Shader Builder for a moment and open this up. OK, watch, it's quick. Basically, this configuration started off as three of those little R-pentomino configurations I described. You can see it's entered a stable state right now; there's a little bit of movement, but it's totally stable. So that was Life running on the graphics card. It's just kind of interesting; we can come back to it at the end. Thanks. OK, now let's head back over to the slides and look at how optimization applies to this new environment. That fragment program I showed you was hideously unoptimized: eight texture sample instructions. Those are the very things that you really don't want to be doing in a fragment program, because yes, it's very quick, very very quick, but it could be quicker. Remember, I had a very small window and I was only texture mapping the face of one polygon. If I was in a complicated 3D game environment, which I'm sure some of you work on on a daily basis, performance really matters, and we're going to look at how you can make big improvements in your fragment program performance. Shader Builder is going to help you here a lot, I think.
Let's first look at some of the tricks that we can do. The first thing: why calculate something in the fragment program if it isn't absolutely necessary for it to be calculated on a per-pixel basis? Remember, you've still got your vertex program, and that's running for every vertex. Now, for that example there were only four vertices defining a quad, but many, many fragments. So why not calculate it on a per-vertex basis if it doesn't need to be as high resolution? You can then pick up that data in an interpolated form in the fragment program, as those values will be interpolated across the surface.
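That interpolation can be sketched on the CPU; a minimal Python sketch (the quad corner values and the function name are mine) of computing a value once at four vertices and letting interpolation supply it to every fragment:

```python
def bilerp(v00, v10, v01, v11, s, t):
    """Bilinearly interpolate four per-vertex values across a quad,
    the way a vertex program's outputs are interpolated before they
    reach the fragment program."""
    top = v00 + (v10 - v00) * s
    bottom = v01 + (v11 - v01) * s
    return top + (bottom - top) * t

# Four per-vertex values, computed once each; every fragment just
# reads the interpolated result instead of recomputing it.
corners = (0.0, 1.0, 0.0, 1.0)
center = bilerp(*corners, 0.5, 0.5)
```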
The other useful thing is using lookup tables; these can bring performance benefits, and I'm going to show you how to do that. Instead of doing a calculation in the fragment program, why not pre-generate, at your application launch time, a table of numbers stored into a texture map? You can then texture sample that in your fragment program, and there you have a lookup table, just like the ones you're familiar with.
The other thing is that the instruction set is very rich, so I would encourage you, and I know no one likes to read the manual, but it's a good idea in this case, to read through the instructions. Shader Builder has a nice instruction browser: go through them, learn your toolkit, learn the instructions you have. You could be implementing something in four or five instructions and then realize that some strange instruction you've never heard of actually does all of it in one instruction, which is going to run faster and make your code shorter too.
So let's look at optimizing Life specifically: how did I go about that? This part I'm a little concerned about because it's a little hard to explain, but I'm going to do my best. This is a little clipping out of the grid Life was running on, that R-pentomino thing I was talking about. The thing that added a lot of instructions to the program I showed you was the process of sampling the state of all the surrounding cells: remember, there were eight texture sample instructions. That's a lot of texture sample instructions; if we could do it in less, that would be great. And not only did we have to sample those textures eight times, we also needed to calculate the eight surrounding texture coordinates. Well, I'm going to show you the way I went about it, which actually allowed me to drastically cut down the instructions here.
If we look carefully at the Life configuration, we realize that it exists only in one color plane; it doesn't require all three color channels. Let's take advantage of that. We realize that in a fragment program, ninety percent of the instructions will, a bit like AltiVec, operate on all four components at once. So if we can pack more data into those channels, then with one instruction we can do the work of three or four. What I did was, outside of the fragment program, just before I did the feedback, I took the one-bit image, offset it one pixel to the left, and stamped it into the red channel; then I offset it one pixel to the right and stamped it into the blue channel; and the center went into the green channel. I've magnified it, but you can see what I've got here on the right-hand side, and that shows you how I packed the data. This is really good, because now if I do one texture sample in the fragment program, by accessing the red, green, and blue channels independently I have the state of the left, center, and right cells. That means that to get all the surrounding neighbors, all I need is three texture samples instead of all those other ones, and by using the red, green, and blue channels I've got all the cells. That's a useful thing.
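That packing trick can be sketched on the CPU; a minimal Python sketch, assuming the grid is a list of 0/1 rows with wraparound edges (the helper names are mine). Each packed texel holds its left neighbor in red, itself in green, and its right neighbor in blue, so three row samples cover all eight neighbors.

```python
def pack_row(row):
    """Pack each cell's left/centre/right cells into an (r, g, b)
    triple, mimicking the offset-and-stamp passes into the colour
    channels of the texture."""
    n = len(row)
    return [(row[(x - 1) % n], row[x], row[(x + 1) % n])
            for x in range(n)]

def neighbour_count(packed, x, y):
    """Three 'texture samples' (the rows above, at, and below the
    cell) now yield the full neighbour count, instead of eight
    separate samples."""
    h = len(packed)
    above = packed[(y - 1) % h][x]
    here = packed[y][x]
    below = packed[(y + 1) % h][x]
    return sum(above) + sum(below) + here[0] + here[2]  # exclude centre
```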
The other important thing: remember that nasty logic I implemented, which was to choose whether the cell would be on or off? That was six or seven instructions. Let's get rid of those with one instruction; let's create a lookup table. Here is a texture map; it's a two by eight texture map. Basically, on the texture coordinate's y component we put in a 1 or a 0, whether the current cell is on or off, and on the x component we have the number of neighbors that we calculated. By simply doing a texture sample with that lookup value, it's a lookup table, we get back on or off, and that's the color of the output cell.
That's another pretty useful thing to be able to do. After doing that, here was the benefit. Remember, Shader Builder had that nice display showing the number of instructions and the number of texture samples going to the graphics card; well, I created a chart to show you what I was able to get it down to, and it runs exactly the same. It looks no different, except it's faster if you measure it. What we see here is that the instructions went down from 36 to nine, and that's a pretty big deal. Texture samples, which are expensive to run on the graphics hardware, got down from nine to four. Temporary variables: not 13 of them anymore, got those down to five. And then this last one is interesting: it added a DTR. Let me expand that for you: it's a dependent texture read. What is a dependent texture read?
I've shown you the TEX instruction, which allows you to take a texture coordinate, do a lookup, and get a pixel back from a texture unit. Well, imagine that the coordinate I use to look that up was actually derived from a previous texture sample; you can imagine that you have dependencies between them. Every one of those dependencies is known as a dependent texture read. The graphics hardware has support for four of these dependent texture reads; I am now using one of them, and it's still a huge performance improvement. As for dependent texture reads, if you're going to use lookup tables in your program you're inevitably going to use these. You've got four of them, and I find that's more than enough for anything I'm going to implement. You can look into that afterwards.
I'm going to look out in this part have
had me most concerned because it seems
that a lot of a lot of what I look at
other demo programs of people who may be
implemented for pixel lighting models
using not necessarily ARB fragment
program but other extensions I find it
personally very hard to understand what
they're doing I've looked at their
example yeah they look cool but it's
like I'm not really sure what they're
doing so what I've done here is I just
threw a lot of way and start from
scratch implemented my interpretation of
a lighting model which I think it's
correct and it looks good I'm going to
just explain step by step I'm going to
break it apart for you sure it's how
it's done and try and dispel any of the
mystery there so this there's certainly
some 3d math involved here but it's sure
it should be quite understandable before
Before explaining it, let's look at what the contributing entities are. What are the inputs to this fragment program? The first one is the position of the light source: where the light source is relative to the fragment whose color we're currently calculating. We're going to call that LP. The other thing is that I support variable-brightness lights, so for the light source we have another input, just a scalar, which is how bright it is. The next important thing is: where is the fragment located in 3D space? That's important to know as well, because we're going to use it to calculate a vector from the light to the fragment. The other important thing is the normal of the fragment. This is the key part that makes per-pixel lighting models look so cool: unlike the traditional OpenGL lighting model, where normals only exist on a per-vertex basis and are interpolated across the surface of the polygon, here the normals are on a per-pixel basis, meaning we can create these amazingly detailed surfaces.
The way it's done: if you recall back to when I showed you my displacement map, where I had encoded a two-dimensional vector into a color, this is exactly the same thing, except we're encoding a three-dimensional vector into each color, not just an x and y but an x, y, and z. So we're able to encode a three-dimensional normal into a texture map, and by sampling that in the fragment program I can get the normal at that point, which is a large reason why this looks so good. And then finally, we're calculating the color of a fragment on the surface of a polygon, so we need to know the normal of that entire polygon, so that we can transform the fragment normal; I'll cover that in a minute. Another useful thing I did to make this program simpler was to calculate everything in model space. In your 3D game, you're probably going to want to define your light sources in world space or eye space, relative to your world. What I did was, in the vertex program, multiply that by the inverse modelview matrix to transform it into model space, so the fragment program had something easier to work with.
I've already actually covered what the normal map is, but this is what it looks like. That's the base texture map I'm going to use, and the one below is the normal map; these psychedelic colors are what define the bumpiness of the surface. Again, the same little trick applies: normals are normalized, so they range from -1 through 1, and I have to squash them and push them up into positive space using a super simple equation, then unpack them in the fragment program.
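That squash is the usual range remap; a minimal Python sketch (the slide doesn't show the equation itself, but n * 0.5 + 0.5 and its inverse are the standard form, so this is my assumption):

```python
def encode_normal(n):
    """Squash a normal's components from [-1, 1] into [0, 1] so the
    vector can be stored as an RGB colour in the normal map."""
    return tuple(c * 0.5 + 0.5 for c in n)

def decode_normal(rgb):
    """Unpack the stored colour back to a [-1, 1] normal inside the
    fragment program."""
    return tuple(c * 2.0 - 1.0 for c in rgb)
```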
So, a quick run-through before I demonstrate it: what's going on here? We calculate the vector from the light to the fragment: we have a light source, we have a fragment, and we calculate that vector, the incoming light vector, LVI. When we've done that, it's pretty trivial to calculate the distance, just a scalar, from the light to the fragment: if we know the vector, we can get its magnitude. We use that to attenuate the light later on. Then we acquire the fragment normal like I described: we sample the normal map and get the normal, FM. Then we transform that relative to the surface of the polygon, as I explained a moment ago. Then the good part: if we know the fragment normal and we know the incoming light vector, we want to calculate that vector after it's reflected, so it comes in and we get it coming out. We do that for every fragment; that's important. And when we have that, we do a dot product between that reflected ray and the direction of the light source. So in this lighting model you don't necessarily have to have the light pointing directly down on the surface; you could have it going along the surface, you can change it. I've actually made it fixed right now, but there's no reason I couldn't move it around; I just didn't want to spend too much time on this. The next thing is to do the attenuation. Remember LD, the distance we calculated from the light to the fragment? We're going to use that to attenuate the light intensity, which creates the fall-off, and that's important too. For instance, if I hold a little light here shining on this, it's not going to have very much effect lighting up the far wall; that's attenuation. And then we multiply that by the light brightness control.
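The whole run-through above can be sketched in a few lines of CPU code; a hedged Python sketch where the names LP, FP, FM, LVI, and LD follow the slides, but the zero clamp and the inverse-square attenuation are my assumptions:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(v, s):
    return tuple(x * s for x in v)

def light_fragment(LP, FP, FM, light_dir, brightness):
    """Per-fragment lighting: light position LP, fragment position FP,
    fragment normal FM (unit), fixed light direction, brightness scalar."""
    LVI = sub(FP, LP)               # incoming light vector
    LD = math.sqrt(dot(LVI, LVI))   # distance, kept for attenuation
    LVI = scale(LVI, 1.0 / LD)      # normalise
    # reflect the incoming vector about the fragment normal:
    R = sub(LVI, scale(FM, 2.0 * dot(LVI, FM)))
    # dot the reflected ray with the light direction, clamped at zero:
    intensity = max(0.0, dot(R, light_dir))
    # attenuate by distance and scale by brightness (assumed form):
    return intensity * brightness / (LD * LD)
```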
quickly little bit of animation here
this is the surface or a polygon FP
that's the current fragment we're
calculating here we have the FM the
fragment normal which I acquired from
the texture lookup encoded in the red
green blue channels
is our incoming light vector like so
then we reflect we can't get it after
reflection with the normal and then LD
that was the distance I was talking
about so I think it's probably very
clear in your head now what I'm going to
do and I'm going to move over to shader
builder and I to illustrate this in
action show you how really pretty cool
it looks close our previous example
OK, just let me get these windows into shape here; the resolution's a little smaller than what one's typically used to. Is that in range now, can you see it, is that good? OK. So if you can see this, here's the fragment program. I'm not going to go through it on a per-instruction basis; it simply would take too long. Again, this exact thing I'm editing right now on stage will be available as a demo afterwards, so you can go home, run it on that new G5 you buy, and you'll be able to see this and play around with it. What I've done here is comment the program in such a way that the exact order of the stages I described to you is the exact order of the instructions in this program, and I put in comments that match exactly. So when you look at it alongside my corresponding slide, you can walk your way through this program afterwards. So let's take a look.
Texture unit 1: there is our normal map, in the bottom right corner of the screen. In texture unit 0 I have the base texture, which is the gargoyle. In the rendering window right now, for every fragment you're seeing the incoming light vector, displayed for you decoded as a color. So if I go here and open up the symbol editor, this will allow me to move the light source around. Here I'm moving the light position around, and you can see the vectors updating. Let's pull the light back from the surface; you can see, when we're close to the surface there, when the light is right on the plane, you can see a singularity. Let's put it back. Now let's start uncommenting some code and watch this effect build up.
Let's calculate the vectors after they've been reflected off the surface; this depends on the fragment normal, which comes from the normal map. Let's uncomment this line: these are the light vectors after reflection. Again, move the light around and you can see them updating right there. Now let's do the dot product that I described with the light direction vector, which is constant right now; move it around again and you can see it changing. And then let's do the attenuation, and you'll see this makes a big difference; that looks better. Then the next stage: we multiply by the color of the light. At the moment the light is just white light, and we want that to be controllable, so we multiply by a colored light. And then finally we modulate that with the base texture map, and that's that.
Now, if we move the light source around, you can see how good this looks. I'm actually going to make this window bigger so you can see it more clearly. Let's start moving the light around; you can see it.
[Applause]
As we pull the light away from the surface, it's getting far away; the light is so far away now that barely any light is even showing up. Bring it in really, really close. So I think it looks pretty good. And then of course we can also change the color of the light; that's another program parameter, and by adjusting that, I can edit it as a color, and you can see it's changing the color as well. So multiple parameters are supported in Shader Builder too. It should be fun for you to look at. So, that's the end of my presentation. I hope you all found this enlightening, and I hope that you'll go home afterwards, run the new Shader Builder, and look at these examples, because I think there's a lot to learn here, a lot of scope for implementing these things. So thanks a lot, and I'm going to hand back to Travis here.
[Applause]
Thank you, James. If you have any questions about the information you've seen today, feel free to contact any one of the email addresses up there. I'd like to actually add mine to it: I'm travis at apple.com. Again, any questions about the graphics technologies or the various things you've seen relating to 2D and 3D graphics at WWDC, feel free to email me with your questions.