WWDC2013 Session 507

Transcript

>> [Applause] Welcome, hello.
So I'm Chris Niederauer.
I work on the GPU Software
Team and I'm here to talk today
about what's new
in OpenGL for OS X.
So during this talk I'm
going to go into some
of the feature support update.
What's new in OS X
Mavericks in particular.
And then after that I'm
going to go into a few
of the key features that we
think you're probably going
to want to be using
in your applications.
After that we're going to
talk about using Compute
with OpenGL using OpenCL.
And then finally towards the
end I'm going to do a quick run
through -- trying to
get your applications
into OpenGL Core Profile so
that you can take advantage
of all these new features.
So let's start with
the features.
In OpenGL as of Lion we've
had support for Core Profile
and this has a bunch of features
that are pretty much guaranteed
if you ask for [inaudible] Core
Profile; framebuffer objects,
vertex array objects, the usual.
New in Mavericks; we've got a
whole bunch of new extensions
like texture swizzle, sampler
objects, texture storage
and then additionally on
modern GPUs we have support
for tessellation shaders and all
the other OpenGL 4.1 features
plus a little bit more.
So that's an update
on where we are.
And so these are some of the
features I'm going to go into.
Explain how you can
get your applications
to take advantage
of these features.
So let's start with probably
the big new feature on --
in Mavericks; tessellation
shaders.
So what tessellation shaders do
is they allow you to use the GPU
to generate geometry
to a specific level of detail;
how coarse or how fine you want
your geometry to be; and you
generate it on the fly where
you need it to be, where you
want to spend your vertices.
Let's see.
It's defined using shaders,
and the benefits:
It allows you to dynamically
increase your polygon density.
So you're able to decrease
your vertex bandwidth
because you're only
uploading a coarse mesh.
And then you get to decide
how to refine that mesh.
And it's used often for
displacement mapping,
terrain rendering,
higher-order surfaces like NURBS.
And it's available
on the modern GPUs,
so all the hardware
we ship today.
So you'll
need to make sure
to check the extension
string using glGetStringi
for
GL_ARB_tessellation_shader.
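As a rough sketch of that check (the helper name here is my own, not from the session), something like this works in a Core Profile context:

    // Hedged sketch: walk the indexed extension list (Core Profile style)
    // looking for a given name. hasExtension is a hypothetical helper.
    #include <OpenGL/gl3.h>
    #include <string.h>

    static GLboolean hasExtension(const char *name)
    {
        GLint count = 0;
        glGetIntegerv(GL_NUM_EXTENSIONS, &count);
        for (GLint i = 0; i < count; i++) {
            const char *ext = (const char *)glGetStringi(GL_EXTENSIONS, i);
            if (ext && strcmp(ext, name) == 0)
                return GL_TRUE;
        }
        return GL_FALSE;
    }

    // Usage: if (hasExtension("GL_ARB_tessellation_shader")) { ... }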
So we have here an
application that's out
on the Mac, which is called
Unigine Heaven, and it's
actually been updated to take
advantage of tessellation
shaders.
So we have sort of a
before and after.
We have the stairs, which are
just a flat polygon there,
and then the dragon's neck;
there's little points where
there should be spikes
but there's no spikes there.
With tessellation shaders you
see how it generates geometry,
it uses displacement map, it
creates spikes on the dragon
and the stairs suddenly
become actually stair shaped.
So to get a little bit into --
I'm going to describe how
tessellation shaders work.
So I also want to give
this example screenshot
where you can see that the
application is choosing
to dynamically tessellate
this geometry based
on the distance to the camera.
So you're using the
vertex bandwidth only
where you need to, you know,
only where you think
this thing needs it.
So closer objects you'll
probably tessellate a lot more
than objects in the distance.
So we start with a new
type called a patch.
And we set up our patches --
we say how many vertices there
are going to be in each patch.
So in this case with a triangle
patch we have 3 vertices
and then when we draw
-- we call DrawArrays.
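A minimal sketch of that patch setup, assuming a program with tessellation stages is already bound:

    // Tell GL how many control points make up one patch,
    // then draw the patch data like any other vertex array.
    glPatchParameteri(GL_PATCH_VERTICES, 3);       // 3 control points per patch
    glBindVertexArray(myVAO);                      // myVAO is assumed to exist
    glDrawArrays(GL_PATCHES, 0, numControlPoints); // GL_PATCHES, not GL_TRIANGLES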
So in the shaders, there
are two parts to the shader.
The first part of the
shader we're going
to control how tessellated
we want it to be.
So first in our control
shader we're going
to be setting the outer levels
and we get to pick per side
of this triangle how
tessellated we want it to be.
And so we see in the --
on the left side of the triangle
we did a little bit more
tessellation there than we did
on the bottom and the right.
Additionally you get to control
the inner levels tessellation.
So we're adding some
geometry there.
And then once we have
this data output, it gets passed
to an evaluation shader,
and that shader gets the
control points that you
originally had, and we evaluate
where those positions should be.
So we have tessellation
coordinates, and in this case,
because we had a triangle, we
have three different TessCoords,
and you can see how they're
barycentric; each
of those is weighted towards
the original control points.
And so using this in our
evaluation shader, we can now
figure out where those
points should be and push them
out with a displacement map,
or do whatever you
want at that point.
So the OpenGL 4 Pipeline;
you're probably aware
of what it looks like before.
We have vertex shaders on
the left, fragment shaders
on the right and
then tessellation
and geometry shaders
are both optional.
So tessellation fits
right in the middle there
and it's actually made out
of two different shaders
that are -- that I will go into.
So as I was saying
there's a control shader
and an evaluation
shader so you get
to control how tessellated it is
and then you evaluate where each
of those vertices should be put.
So the control shader.
It takes as inputs the
control points from the patches
and then basically the
array of control points
and the original patches.
And the outputs set how
much we should tessellate
each of the edges, and then we
also have the inner tessellation
levels, as well.
And a tip for when
you do have patches
that are touching each other.
You want to make sure that
you have the same amount
of tessellation on
those touching edges.
So let's get onto looking
at an actual control shader
with this triangle example.
We have -- we set up the
layout with the vertices saying
that there's three
control points per patch,
and we're going to have
our input vertex position
from the original patch
points and we're going
to output control positions.
So first for every
input we're going
to copy the vertex position
to the control position.
And then additionally, once
per patch; this InvocationID
check is for only doing it once;
we're going to calculate what
those tessellation outer levels
should be.
What the TessCoords --
and it will generate
TessCoords from there.
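A hedged reconstruction of a control shader along those lines (the level values are placeholders, not the slide's code):

    // GLSL 4.1 tessellation control shader sketch (assumed names)
    #version 410
    layout(vertices = 3) out;               // 3 control points per patch

    in  vec4 vertPosition[];                // from the vertex shader
    out vec4 controlPosition[];             // to the evaluation shader

    void main()
    {
        // Pass each control point straight through.
        controlPosition[gl_InvocationID] = vertPosition[gl_InvocationID];

        // Set the levels only once per patch.
        if (gl_InvocationID == 0) {
            gl_TessLevelOuter[0] = 4.0;     // placeholder levels
            gl_TessLevelOuter[1] = 4.0;
            gl_TessLevelOuter[2] = 2.0;
            gl_TessLevelInner[0] = 3.0;
        }
    }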
And then the evaluation shader.
That takes the output
from the control shader
and evaluates where it should be,
to pass on to the
geometry or fragment stage.
So it takes the original patch
and the tessellation coordinates,
sort of weighting towards each
of those original
patch coordinates.
And then it outputs your
position, your TexCoordinates,
and any other attributes
that you may have.
So here is an example with a
triangle evaluation shader.
We're specifying;
oops, let's see.
We've got the controls as input
and we've also got a model
view projection matrix.
And basically we're
treating the TessCoords --
we have three TessCoord inputs
which are barycentric
weights towards each
of those original
control points.
And so we're multiplying --
doing a barycentric multiply
here to get the output
of where those should be.
So doing just this simple
math here gets us the evenly
distributed points as we
specified in our control shader.
And then after we've
done that, here we
have a model view
projection matrix multiplying
the position that
we calculated, passed
into a custom function which
is doing our displacement.
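And a matching evaluation shader sketch, with an assumed pass-through displace function standing in for the custom displacement:

    // GLSL 4.1 tessellation evaluation shader sketch (assumed names)
    #version 410
    layout(triangles, equal_spacing, ccw) in;

    in  vec4 controlPosition[];              // the 3 original control points
    uniform mat4 modelViewProjectionMatrix;

    vec4 displace(vec4 p) { return p; }      // stand-in for the displacement

    void main()
    {
        // gl_TessCoord holds the barycentric weights for this vertex.
        vec4 p = gl_TessCoord.x * controlPosition[0]
               + gl_TessCoord.y * controlPosition[1]
               + gl_TessCoord.z * controlPosition[2];
        gl_Position = modelViewProjectionMatrix * displace(p);
    }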
So we started with our
triangle patch just three
points originally.
We controlled how
tessellated it was,
and then we evaluated exactly
where those positions should be
to pass on to the
fragment shader.
The quad; just as a further
example, with the quad,
the control shader
is very similar
to the triangle's, except
this time we're
specifying the vertices as four.
We also set glPatchParameteri
for the number
of control points to be four.
And then similarly we're passing
through the original
vertex position
for the new control positions
and then calculating
our tessellation levels.
And in this case we have more
inner levels and outer levels
to calculate because we
have four sides to the quad
and additionally, for the inner
levels, we're controlling how
split it should be
horizontally and vertically.
Then in the evaluation shader
taking the control points
in again, there's four
of them this time,
and because we have a
quad we're actually able
to treat those barycentric
weights
as just UV coordinates
basically within our quad.
So we just -- we're just doing
a simple mix here to figure
out what those positions should
be for each of the points.
And then again we pass
that calculated position
into the custom displacement
map to get our position out.
So with our quad we started
with just the four points.
We tessellated it and
then evaluated where each
of those places should be.
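For reference, a hedged quad evaluation sketch; the control point ordering here is an assumption:

    // Quad evaluation sketch: with layout(quads) the tess coords act as UVs.
    #version 410
    layout(quads, equal_spacing, ccw) in;

    in  vec4 controlPosition[];              // the 4 original control points
    uniform mat4 modelViewProjectionMatrix;

    void main()
    {
        // Bilinear mix across the quad using gl_TessCoord.xy as UV.
        vec4 bottom = mix(controlPosition[0], controlPosition[1], gl_TessCoord.x);
        vec4 top    = mix(controlPosition[3], controlPosition[2], gl_TessCoord.x);
        gl_Position = modelViewProjectionMatrix * mix(bottom, top, gl_TessCoord.y);
    }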
So in summary, tessellation
shaders allow you to add detail
where you need it in your
scene, and you can start
with triangles, quads,
arbitrary geometry,
even isolines.
And it generates this
data on the GPU instead
of you having to submit it.
So you have a low
resolution model potentially
that you are able
to only tessellate
as the model gets
closer to the camera,
or just to simply add
displacement or extra geometry
to make a character
realistic in your scenes.
And so again it's available on
modern hardware so you'll need
to check for the existence
of tessellation shader
with glGetStringi, and be sure
to match your outer levels
on touching edges,
because otherwise
you may have cracks
in between your touching
patches.
So also another feature I
want to go over is instancing.
And instancing is allowing
you to draw many instances
of an object with only
a single draw call.
And it's a big performance boost
because each draw call does
take a little bit of overhead
in order to get that to the GPU.
So instead we're just
passing it all down at once
and allowing the GPU to do that
work as a single draw call.
And so each of these
instances can have their own
unique parameters.
So the offset of one
instance from another, colors,
skeletal attributes and
you define all these
in external buffers.
And this is actually
-- when you ask
for a Core Profile
you're guaranteed on OS X
that you have support
for this extension.
And there's two forms
of instancing.
There's instanced arrays,
using ARB_instanced_arrays,
where you get to have a divisor
that says how often your
attributes should be submitted;
per instance, that is,
rather than per vertex.
So if you wanted to
submit an attribute
for every one instance, you
would pass a vertex attrib
divisor of one.
If you wanted a
different attribute every two
instances, two, and so forth.
Also, ARB_draw_instanced provides
a shader variable, the instance ID.
So in your draw call, from
your vertex shader, you
can see which
instance ID you are in.
So you can do
an offset into a buffer,
for instance, based
on an instance ID.
And I'll go into a little bit --
some examples of doing
that in a short bit.
But I'm not actually going to go
into this too deeply right now
because we actually
announced support for both
of these features in iOS 7.
And my colleague Dan
gave a talk this morning,
Advances in OpenGL ES, where he
went into depth on instancing.
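Still, as a quick hedged sketch of the two forms (buffer names and the attribute index are mine):

    // Form 1: instanced arrays; advance attribute 2 once per instance.
    glBindBuffer(GL_ARRAY_BUFFER, perInstanceVBO);
    glVertexAttribPointer(2, 4, GL_FLOAT, GL_FALSE, 0, 0);
    glEnableVertexAttribArray(2);
    glVertexAttribDivisor(2, 1);           // 1 = new value every instance

    // Form 2: draw instanced; the shader reads gl_InstanceID instead.
    glDrawArraysInstanced(GL_TRIANGLES, 0, vertexCount, instanceCount);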
So that was instancing.
Let's go on to how
we pass up some data.
Uniform buffer objects
are one way
to get some data
up to the shaders.
And basically it's
a buffer object
to store your uniform data.
It's faster than
using glUniform.
You can share a single uniform
buffer object among different
GLSL shaders and you can
quickly switch between sets.
If you have some prebaked sets
you can just choose on the fly
which one you're using.
And additionally, because
it's a buffer object,
as with all buffer objects you
can generate your data on the GPU
and use the output from that
in your UBO, in the shader,
without having to
do a read back.
And it's used for skinning,
character animation, instancing;
pretty much whatever
you want to use it for.
So we have a shader
example here.
And what a UBO is, is it's
basically a C-struct-like,
layout-defined interface.
We have here the
layout std140 qualifier,
which defines how our vectors
and variables are packed.
And we've called
this uniform MyUBO.
And so this MyUBO, which we
named MyBlock underneath, we're
able to access from
our shader similar
to how a C struct would work.
Pretty straight forward.
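A sketch of a std140 block along those lines (the member list is illustrative, not the slide's):

    // GLSL sketch of a std140 uniform block.
    layout(std140) uniform MyUBO {
        mat4 modelViewProjectionMatrix;
        vec4 lightColor;
    } MyBlock;

    // ... later in the shader:
    // gl_Position = MyBlock.modelViewProjectionMatrix * position;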
And for setting that up
in the API we would have
just created a buffer object
of type uniform buffer,
set up the size of that
and then we have to
get the location.
Instead of the get uniform
location we're getting uniform
block index for this UBO
structure and of type MyUBO.
And then we just bind that
to one of our binding indexes
that we have for making -- for
knowing how to provide that data
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
to the proper UBO in the shader.
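A hedged sketch of that API-side setup, assuming the MyUBO block above and a linked program:

    // Create and fill the uniform buffer object.
    GLuint ubo;
    glGenBuffers(1, &ubo);
    glBindBuffer(GL_UNIFORM_BUFFER, ubo);
    glBufferData(GL_UNIFORM_BUFFER, sizeof(myUBOData), &myUBOData, GL_DYNAMIC_DRAW);

    // Look up the block index and tie it to binding point 0.
    GLuint blockIndex = glGetUniformBlockIndex(program, "MyUBO");
    glUniformBlockBinding(program, blockIndex, 0);
    glBindBufferBase(GL_UNIFORM_BUFFER, 0, ubo);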
So in summary you can upload all
of your uniform values at once.
However, you want to make
sure that you're not modifying
your UBOs every single time.
So if you have for
instance some variables
that you're updating
a lot more often
than other variables you should
split those into two UBOs.
So one of your UBOs
may be static,
and another may be
updated once a frame.
And then additionally
if you are trying
to modify the data
even more often
than that, you could
orphan your buffer objects
by calling BufferData
with a null pointer,
or you could just double buffer
or triple buffer, etcetera, your
UBOs to ensure that you're able
to pass data to the GPU
and still modify something
on the CPU to pass in
for the next call
that the GPU will execute.
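A quick sketch of the orphaning idea (size and data names assumed):

    // Orphaning: hand GL a fresh allocation so the old data can still be
    // consumed by in-flight GPU work while we write the new data.
    glBindBuffer(GL_UNIFORM_BUFFER, ubo);
    glBufferData(GL_UNIFORM_BUFFER, size, NULL, GL_DYNAMIC_DRAW); // orphan
    glBufferSubData(GL_UNIFORM_BUFFER, 0, size, newData);         // refill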
And one key point I want to
make note of is that each
of the UBOs is limited
to a size of about 64KB.
So as an alternative to
UBOs there's also texture
buffer objects.
And a texture buffer object is
a buffer object that allows you
to store a 1D array
of data as texels.
And again, like all buffer
objects, it gives you access
to GPU-generated data, and beyond
UBOs it also gives you access
to a large amount of
data within a shader.
So it's like having a
very large UBO.
And additionally it
takes advantage of
the GPU's texture cache.
So just like TBOs -- just
like UBOs, TBOs are also used
for skinning, character
animation, instancing,
whatever you want to be
using it with really.
And here's an example
shader that's using TBO.
So we basically have
a new sampler type.
Here we have samplerBuffer.
There's also isamplerBuffer
and usamplerBuffer.
And we're naming our texture
buffer object reference myTBO.
And it's as simple as just
doing a texelFetch from it.
Passing our offset into there.
And since this really is just
raw data, I modified the shader
now to read back four
values with texelFetch,
and as a result, with
a single texture object,
I'm able to read
back the equivalent
of what I would have had to do
with four vertex
attributes before.
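A hedged vertex shader sketch of that pattern, assuming four vec4s of data packed per instance:

    // GLSL sketch: fetch raw per-instance texels from a buffer texture.
    #version 150
    uniform samplerBuffer myTBO;   // raw data exposed as texels
    in vec4 position;

    void main()
    {
        int base = gl_InstanceID * 4;  // 4 vec4s (a matrix) per instance
        mat4 instanceMatrix = mat4(texelFetch(myTBO, base + 0),
                                   texelFetch(myTBO, base + 1),
                                   texelFetch(myTBO, base + 2),
                                   texelFetch(myTBO, base + 3));
        gl_Position = instanceMatrix * position;
    }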
So in the API to set this
up we're just making again
a texture buffer object,
a buffer object as
you normally would.
And then setting it up similar
to a texture, where we're going
to attach that buffer object
to the texture object using
TexBuffer, and we're specifying
here that it's of type RGBA32F.
But you can use whatever format
you want that's supported
by textures.
And then finally we do get
the uniform location for it
and set it just like a
texture sampler would be set up.
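And a sketch of that setup (names are mine):

    // Back a texture with a buffer object, then bind it like any sampler.
    GLuint tbo, tex;
    glGenBuffers(1, &tbo);
    glBindBuffer(GL_TEXTURE_BUFFER, tbo);
    glBufferData(GL_TEXTURE_BUFFER, dataSize, data, GL_STATIC_DRAW);

    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_BUFFER, tex);
    glTexBuffer(GL_TEXTURE_BUFFER, GL_RGBA32F, tbo); // any texture format works

    // Texture unit 0 feeds the myTBO sampler uniform.
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_BUFFER, tex);
    glUniform1i(glGetUniformLocation(program, "myTBO"), 0);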
So it gives you access
to a lot more data
than you would have gotten
with a UBO; 64MB or more.
And it's very useful
for instancing
where you do have a lot of
data that you need to pass
down that you may
not have been able
to have enough vertex
attributes for
or any complicated
things like that.
And again just like
UBOs and buffer objects
in general, try not
to modify a TBO
while it's being used
to draw on the GPU.
So again, double buffering or
orphaning those TBOs will help
ensure you're not stalling your CPU
waiting for the GPU
to complete its work.
So another feature that's new
in Mavericks is draw indirect.
And draw indirect allows
you to specify the arguments
to your draw calls
from a buffer object.
So for instance, normally
DrawArrays takes a count,
and then there's
DrawArraysInstanced
where you have an instanceCount.
And you can also specify
first and baseVertex.
I'll show you an
example in a second.
So when you've generated
data for instance
with OpenCL no longer
do you need
to know how many vertices
you may have generated there.
So in those cases you'll be
able to bind the buffer object
as a GL_DRAW_INDIRECT_BUFFER
and avoid that round trip
of having to otherwise read back
what those variables should have
been that you were going to
pass into your draw call.
And this is similar to
tessellation shaders.
It's available on all
modern hardware, so
check for the
extension string
with glGetStringi using
GL_ARB_draw_indirect.
So here's an example of
using DrawArrays Indirect.
First I've got a comment
at the top showing the general
structure of what I need
to be outputting into this
buffer object in order to ensure
that the GPU is able to know
what those arguments are.
So we're going to match this.
It's got a count, an
instanceCount and a first,
because we're going to be
calling DrawArraysInstanced,
and then finally a
reservedMustBeZero variable
after that.
So we would -- so first we
generate our data with OpenCL
and write into our
indirect buffer object.
And then in OpenGL
we're going to bind
to that indirect buffer object
and then we're also going
to set up all our vertex
attribute pointers
with our BindVertexArray call.
And then finally just
call DrawArraysIndirect.
And it still takes a mode,
so you're still saying
GL_TRIANGLES, GL_POINTS,
whatever you're using.
And then for the indirect
offset, you pass
in an offset into
your indirect buffer object.
So where you put
that data, the count,
the instanceCount,
and the first.
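Put together, a hedged sketch (indirectBO and vao are assumed to exist):

    // The command layout from the comment, then the indirect draw.
    typedef struct {
        GLuint count;
        GLuint instanceCount;
        GLuint first;
        GLuint reservedMustBeZero;
    } DrawArraysIndirectCommand;

    // An OpenCL kernel has already written one such struct into indirectBO.
    glBindBuffer(GL_DRAW_INDIRECT_BUFFER, indirectBO);
    glBindVertexArray(vao);
    glDrawArraysIndirect(GL_TRIANGLES, (const void *)0); // offset into buffer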
DrawElements, very similar.
I've highlighted
what's different here.
We've got a firstIndex
instead of a first
and then additionally we have
a baseVertex offset for each
of the elements in
your element array.
And then, so we've created our
data using OpenCL and OpenGL.
We then bind
the location of that
indirect buffer,
the buffer we were
writing into.
Again we bind the
vertex array objects
so we have all the
vertex attributes setup
and then we are additionally
going to have
to set our buffer binding
for where the elements
are going to be read from.
So we're also doing
that binding.
And then finally we call
DrawElementsIndirect, and this
time, instead of passing down
the count,
the instanceCount,
the baseVertex
and the firstIndex, that's
all being read
out of your indirect
buffer object.
Instead we only have
to pass down the mode,
like GL_TRIANGLES again,
and the element type,
saying that our elements
in the element array buffer
are type unsigned int,
unsigned short, whatever you
may want to be using there.
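A matching sketch for the elements path:

    // The elements variant reads these five fields instead.
    typedef struct {
        GLuint count;
        GLuint instanceCount;
        GLuint firstIndex;
        GLint  baseVertex;
        GLuint reservedMustBeZero;
    } DrawElementsIndirectCommand;

    glBindBuffer(GL_DRAW_INDIRECT_BUFFER, indirectBO);
    glBindVertexArray(vao);                           // vertex attributes
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, elementBO); // index data
    glDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_INT, (const void *)0);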
So let's go over just
a few more extensions.
We've got some new
extensions here.
Separate shader objects
available
on all hardware on Mavericks.
It enables you to mix and
match your GLSL shaders.
So if you have one vertex
shader that you're using
with five fragment shaders, no
longer do you have to recompile
that vertex shader five
times to be used with them;
you can link really quickly.
You don't have to redo that work
for that vertex shader five
times in order to use it
with five fragment shaders.
Instead you just create that
program once and then link it
with all five of those
shaders pretty easily.
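A hedged sketch of that flow with the GL 4.1 entry points:

    // One vertex program mixed with fragment programs via a pipeline object.
    GLuint vertProg = glCreateShaderProgramv(GL_VERTEX_SHADER, 1, &vsSource);
    GLuint fragProg = glCreateShaderProgramv(GL_FRAGMENT_SHADER, 1, &fsSource);

    GLuint pipeline;
    glGenProgramPipelines(1, &pipeline);
    glUseProgramStages(pipeline, GL_VERTEX_SHADER_BIT, vertProg);
    glUseProgramStages(pipeline, GL_FRAGMENT_SHADER_BIT, fragProg);
    glBindProgramPipeline(pipeline);
    // Swapping fragment shaders is just another glUseProgramStages call.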
Additionally we've
got ES2 compatibility
and this is probably interesting
if you have an iOS application
that you're trying
to port to OS X.
It allows you to use version
100 of GLSL on the desktop.
So on OS X.
However, you are limited
to the functionality
that GLSL 100 specifies.
So you aren't able for
instance to take advantage
of tessellation shaders
if you'll be using
this version of GLSL.
Another nifty new extension in
Mavericks is NV Texture Barrier
and it allows you to
bind the same texture as
both the render target
and the texture source.
It's similar to Apple
Shader Framebuffer Fetch on iOS,
where you can do
programmable blending.
However, this is limited to
cases where there is no overlap
within a single draw call,
where the depth complexity
of your scene is one.
So basically it saves
you a little bit of VRAM
by not having to create a copy
of your texture if you're trying
to ping-pong back and forth
between two buffer objects.
You might be able to
update your application
to just render right
back into itself
and use itself as
a texture source.
And then additionally
we actually added
in 10.8.3 texture swizzle,
and this is a handy extension
for supporting older
applications
which might have
been using a format
like GL_LUMINANCE.
And so instead of having to
modify all of your shaders
to be able to take in both RGBA
data and LUMINANCE data, you
can specify up front
that LUMINANCE should
be interpreted
as red, red, red, one.
Or LUMINANCE alpha would
be red, red, red, green;
a red-green texture passing
itself off as LUMINANCE alpha.
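As a sketch, that swizzle setup is just a texture parameter:

    // Read a single-channel (red) texture as if it were GL_LUMINANCE,
    // without touching any shaders.
    GLint swizzle[4] = { GL_RED, GL_RED, GL_RED, GL_ONE };
    glTexParameteriv(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_RGBA, swizzle);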
So those are the
features that --
some of the features in
Mavericks that I wanted to go
over, and now I want
to go into using OpenCL
for Compute with OpenGL.
So OpenGL and OpenCL on our
platform were created together
and use the same infrastructure
in order to talk to the GPUs.
And as a result of that
cooperation you're able
to share things
like buffer objects
and textures between
OpenGL and OpenCL.
There's no need to read
data back and re-upload it
just to switch between APIs.
It's a very simple integration
into your render loop
and I'm going to go
into that right now.
So some of the use
cases for Compute.
We may use OpenCL here,
for instance, to generate
or modify geometry data.
So I'm going to go into
generating a teapot in OpenCL
and drawing that in OpenGL.
After that I'm also going to
go into post-processing an image
that you may have generated with
OpenGL and then displaying that.
So first starting out
with filling up the VBO
with vertex data using OpenCL,
and then rendering
that with OpenGL.
We're going to have our one time
setup of setting up our OpenGL
and OpenCL context to share.
And then we create that
vertex buffer object in OpenGL
and we'll specify that we can
share that object in OpenCL
in order to fill it in there.
And then every frame we enqueue
our CL commands in order to
create the data that
goes into that VBO.
Then after that we'll flush the
CL enqueued commands in order
to ensure that when we
use that data with OpenGL
that it's been pushed
to the GPU to be filled
in, so you're not using
data that
has not yet been specified.
So after that's been done
you just draw it with OpenGL.
So let's start out looking
a little bit closer at that.
So first we're going to
create our OpenGL context
and we're setting up the
pixel format for NSOpenGL
and here we're adding a new
PFA, pixel format attribute
of
NSOpenGLPFAAcceleratedCompute.
And that specifies that I
want to have an OpenGL context
that is capable of accessing the
GPUs that have OpenCL support.
Then after I've created
my context, I'm going to get the
share group from that context,
and with that share group I
can use the CL APIs in order
to get the matching CL device IDs
that will match up with my
OpenGL virtual screen list,
and so I'll be able to share
between these two contexts.
And then I'll create my
context from that CL device list
and then back in OpenGL
we're going to be creating
a vertex buffer
object here to be able to fill
in with the teapot.
Specify the size, we're going
to flush that to the GPU,
and then we just use the API
clCreateFromGLBuffer,
and now we have a CL mem object
that points exactly to where
that VBO in OpenGL is.
So that's a one-time setup.
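Pieced together, a hedged sketch of that one-time setup, using the C-level CGL calls underneath the NSOpenGL code on the slides (vbo is assumed to be created already):

    #include <OpenGL/OpenGL.h>
    #include <OpenCL/opencl.h>

    // Get the share group from the current GL context.
    CGLContextObj    cglCtx     = CGLGetCurrentContext();
    CGLShareGroupObj shareGroup = CGLGetShareGroup(cglCtx);

    // Create a CL context that shares with that GL share group.
    cl_context_properties props[] = {
        CL_CONTEXT_PROPERTY_USE_CGL_SHAREGROUP_APPLE,
        (cl_context_properties)shareGroup,
        0
    };
    cl_context clCtx = clCreateContext(props, 0, NULL, NULL, NULL, NULL);

    // Wrap the GL VBO so CL kernels can write straight into it.
    cl_int err;
    cl_mem vboMem = clCreateFromGLBuffer(clCtx, CL_MEM_WRITE_ONLY, vbo, &err);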
And then every time we're going
to be drawing you can check for
CL_CGL_DEVICE_FOR_CURRENT_VIRTUAL_SCREEN_APPLE;
it didn't quite fit there.
So by using -- by looking
that up you can look
at what virtual screen
OpenGL is currently using
in order to do its rendering.
So if you do have say
multiple GPUs in your system
in this case I'm using this
query in order to check
which GPU OpenGL is on and
I'm having OpenCL follow it
so that it can do its
computations and not have
to copy back to the
other GPU any of the data
that I'm going to
be creating here.
So now I've picked the CL device
ID that I want to submit my data
to, and I enqueue an OpenCL
kernel, and this OpenCL kernel
will be generating our vertex data
that we're going to then
consume with OpenGL.
So after that's been done
we flush that to the GPU.
The GPU starts creating
this data for us,
and I have a barrier
here just to say
that if you are doing some
more complicated things,
if you're doing some threading
and doing things in OpenGL
and OpenCL that may
not be interacting
with each other you can continue
doing some of that work in the
in between time but
you need to make sure
that this flush here
is done before we try
to use that data in OpenGL.
So we're using here the new
function glDrawElementsIndirect,
and that allowed
us to draw this teapot
without even knowing how
many vertices were in our model.
And so that's how you fill
a vertex buffer object
from OpenCL.
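A hedged per-frame sketch of that flow; queueForDevice is a hypothetical helper, not an API:

    // Follow OpenGL's current virtual screen, enqueue the kernel,
    // flush to the GPU, then draw indirect in GL.
    cl_device_id device;
    clGetGLContextInfoAPPLE(clCtx, cglCtx,
                            CL_CGL_DEVICE_FOR_CURRENT_VIRTUAL_SCREEN_APPLE,
                            sizeof(device), &device, NULL);

    cl_command_queue q = queueForDevice(device); // hypothetical helper
    size_t globalSize = vertexCount;
    clSetKernelArg(kernel, 0, sizeof(vboMem), &vboMem);
    clEnqueueNDRangeKernel(q, kernel, 1, NULL, &globalSize, NULL, 0, NULL, NULL);
    clFlush(q);                                  // push the work to the GPU

    glDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_INT, (const void *)0);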
So on the other side
of the pipeline, you can do
image processing as well
with OpenCL.
So for this case we have
similar onetime setup.
We're going to setup
OpenGL and OpenCL
to share just like
we did before.
However this time, we're going
to create a texture object
and share that between
OpenGL and OpenCL.
And so every frame that we want
to do this post-processing, we'll
draw to that texture using OpenGL,
flush OpenGL's command buffer
to the GPU by calling
glFlushRenderAPPLE, and then
in OpenCL we can
enqueue the commands
to process that texture.
And then finally, if you
want to display that back
on the screen, you're going
to flush the OpenCL commands
and then you can blit or
swap back to the screen.
So here we're again, picking
a pixel format that has access
to the GPUs which have Compute
capabilities with OpenCL.
We get the share group,
from which we're going
to get CL device IDs that match
up with the OpenGL context
that we've created, and
then we'll create a context
from that.
And this time we have a texture
that we're going to be sharing
with OpenCL, so we'll
bind to that texture,
set up how big it is
using TexImage2D,
and then FlushRenderAPPLE,
after which you can
create a CL mem object
from that GL texture.
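A sketch of that texture-sharing setup (the CL 1.1-era entry point):

    // Allocate texture storage, flush, then wrap it as a CL image.
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, NULL);  // storage only, no data
    glFlushRenderAPPLE();

    cl_int err;
    cl_mem texMem = clCreateFromGLTexture2D(clCtx, CL_MEM_READ_WRITE,
                                            GL_TEXTURE_2D, 0, tex, &err);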
So now every frame
we're going to be doing
our normal drawing in OpenGL.
We'll render to the texture
with, say, DrawElements
and other OpenGL draw calls.
We'll then flush that data to
the GPU to start processing it.
We happen to be drawing
the teapot
that we've already
done with OpenCL here.
Again the barrier.
So after that FlushRenderAPPLE,
and only
after that FlushRenderAPPLE,
should you start doing your work
in OpenCL that depends on
the results of the draw calls
that were modifying that
texture that we're sharing.
So here again we're
going to check
which virtual screen
OpenGL is on,
and match the CL
device ID there.
So OpenCL is processing
the data on the same GPU
that OpenGL created
that data on.
And then we're going to enqueue
our post-process kernel there
and do our calculations such as
like edge detection or blurring.
And then finally, if we
want to pass that data back
to GL, we're going to flush
those results using clFlush,
and in OpenGL we'll then be able
to just bind to that texture
and blit it to the
screen if we want to.
So that's passing data back and
forth between OpenGL and OpenCL.
You get the best of both worlds.
You get a graphics API with
full Compute capabilities,
and sharing in between
with very little cost.
And additionally, I
wanted to reiterate
that if you are creating
vertex geometry data
but don't know how big it is,
then it's great for using
with ARB_draw_indirect.
And after this talk
we're going to be talking
about OpenCL actually
and how to use
that a little bit more
explicitly and even going
into some OpenGL/OpenCL
sharing there, as well.
So if you're not familiar with
OpenCL yet and you are hoping
to do Compute in your projects
I recommend you stay right
after for that talk.
So we've gone over Compute
so let's now talk about how
to get your context into
Core Profile in order
to take advantage of
all these new features
that we're supporting
in Mavericks.
So first, to start out, OpenGL
Core Profile is a profile
that gives you access to
the latest GPU features.
The API; I don't
know if you're familiar
with OpenGL ES2 versus
OpenGL ES1; it's very akin
to that transition, where
the API was trimmed down to be
streamlined and high-performance,
and it's more in tune with
how the GPU actually works.
So there's no matrix
math or things
that the GPU wouldn't
have actually been doing
in the first place.
And it gives you
the full control
over the rendering pipeline
so we're specifying
everything as shaders.
And as I was saying, it's
similar to OpenGL ES2; actually
so much so that it's pretty
easy to port back and forth
between OpenGL ES2 and
OpenGL Core Profile.
So let's go over conceptual
overview of what you're going
to need to do in your
applications in order
to get them working
with Core Profile.
And many of these are actually
things that you want to do
to your application even
if you're not trying
to get it in Core Profile.
Even if you're not doing it
for the reason of going
into Core Profile, it's
at least an enhancement
to your application
to switch over to these new
ways of doing things.
So for instance, immediate mode
vertex drawing should be
replaced with using
vertex array objects
and vertex buffer objects.
And there's an obvious
increase in performance
by just switching
your application
to use vertex buffer
objects instead of having
to provide the data each
frame, every single time
that you're drawing with it.
Fixed function state gets
replaced with GLSL shaders.
So you're going to have
to specify vertex shader
and a fragment shader,
and then optionally
as well the tessellation
shaders and geometry shaders.
And the matrix math is now
no longer part of OpenGL.
So you do have to provide
your own custom matrix math.
And then older shaders, which
are in, say, version 110 or
120, need to be updated
to version 150 or above.
However, on our platform
in Mountain Lion we also
introduced support for GLKit,
and GLKit actually
solves a couple
of these transition steps.
You're still going to have
to update your application
to take advantage of
vertex array objects
and vertex buffer objects.
But for your fixed-function
state, for an application
that you made but are
just trying to get to work
in Core Profile, you can
use GLKBaseEffect in order
to achieve the same effects
that you would have been getting
with your lighting and so forth
in your Legacy OpenGL
application.
And then additionally the
matrix math that I said was gone
in OpenGL is now fully
replaced by math libraries
that GLKMath provides.
So let's talk about creating
that Core Profile Context.
So all we do is, in the
pixel format attribute list
that we're going to use to
create our context from, we pass
in NSOpenGLPFAOpenGLProfile.
And for that attribute
we're passing down
NSOpenGLProfileVersion3_2Core.
And this gets you access to all
the new features of OpenGL 4.1.
And OpenGL 4.1 is fully
backwards compatible
with OpenGL 3.2.
So that's how -- why
picking this enables you
to take advantage of
all those new features.
And so now that we've picked
that pixel format we're going
to create our context from that.
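A minimal sketch of that pixel format code (Objective-C, with a couple of assumed extra attributes):

    NSOpenGLPixelFormatAttribute attrs[] = {
        NSOpenGLPFAOpenGLProfile, NSOpenGLProfileVersion3_2Core,
        NSOpenGLPFADoubleBuffer,   // assumed; add what your app needs
        NSOpenGLPFAColorSize, 24,
        0
    };
    NSOpenGLPixelFormat *pf  = [[NSOpenGLPixelFormat alloc] initWithAttributes:attrs];
    NSOpenGLContext     *ctx = [[NSOpenGLContext alloc] initWithFormat:pf
                                                          shareContext:nil];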
So to go over briefly how to get
your application switched over,
what you're looking for and
what you're replacing it
with [laughter].
Basically, you need
to cache all your vertex data
in vertex buffer objects.
And additionally
you're going to need
to encapsulate those objects
into vertex array objects
that point where all the
attributes are coming
from for your shaders.
So we have code on the left
which is what would have
been in your application.
So glBegin, GL_TRIANGLES,
glEnd, or glCallList
with display lists;
in Core Profile
on the right, we can now
change all those calls
to just two calls: we
call glBindVertexArray,
and then glDrawArrays
or glDrawElements if you
so choose to use elements.
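A hedged sketch of that replacement, with assumed names; myVerts matches the attribute naming discussed next:

    // Upload once, then draw with two calls per frame.
    GLuint vbo, vao;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, sizeof(verts), verts, GL_STATIC_DRAW);

    glGenVertexArrays(1, &vao);
    glBindVertexArray(vao);
    glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 0, 0);
    glEnableVertexAttribArray(0);
    glBindAttribLocation(program, 0, "myVerts"); // before linking the program

    // Per frame:
    glBindVertexArray(vao);
    glDrawArrays(GL_TRIANGLES, 0, vertCount);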
glBitmap, glDrawPixels are
subsumed by uploading a texture.
So glTexSubImage in this case,
and then we can just draw
with that, or call BlitFramebuffer
to do something similar
to what it would have been
doing anyway for you
in the Legacy Profile.
And additionally, the
pointers that used to exist
for VertexPointer,
TexCoordPointer,
ColorPointer; those
are all subsumed
by the generic VertexAttribPointer.
So instead we're going to
call VertexAttribPointer,
and then we bind each of our
attributes by name.
So we have myVerts in this
example in my GLSL shader;
I would have had an input
myVerts, and I can bind
that attribute using
BindAttribLocation.
And then finally,
glEnableClientState is gone,
because we're no longer
dealing with those color arrays,
normal arrays and so forth.
Instead we're just enabling the
VertexAttribArray by the index
that it's being passed up
to the GLSL shader with.
For your math portions, for
matrix math just use GLKMath.
It's got the ability to replace
all the built in transformations
that OpenGL would
have provided for you.
And of course you can use
your own matrix math library
if you already have it
but if you don't this is
not a bad place to start.
So we've got, for instance,
our translate, rotate,
and scale functions;
they've been replaced
by GLKMatrix4MakeTranslation,
MakeRotation, and MakeScale.
And the first function there,
GLKMatrix4MakeTranslation; the
Make there means it's equivalent
to calling glLoadIdentity
followed by glTranslate.
Additionally perspective we can
call GLKMatrix4MakePerspective.
Similarly that's a
glLoadIdentity followed
by what gluPerspective
would have done.
And the GLKMath also
provides you with MatrixStacks
so you can push and
pop your stacks.
But when you've actually
finally gotten your value
out of your matrix, no longer
are you going to use LoadMatrix
to pass your data up
to your GLSL shaders.
But you're going to upload
those as generic uniforms.
So you use glUniformMatrix4fv
in that case.
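A hedged sketch of that flow (locations and values assumed):

    // Build the MVP on the CPU with GLKMath and upload it as a generic
    // uniform (GLKMatrix4 stores its 16 floats in the .m member).
    #include <GLKit/GLKMath.h>

    GLKMatrix4 projection = GLKMatrix4MakePerspective(
        GLKMathDegreesToRadians(60.0f), aspect, 0.1f, 100.0f);
    GLKMatrix4 modelView  = GLKMatrix4MakeTranslation(0.0f, 0.0f, -5.0f);
    GLKMatrix4 mvp        = GLKMatrix4Multiply(projection, modelView);

    glUniformMatrix4fv(mvpLocation, 1, GL_FALSE, mvp.m);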
So here is a list of some
of the functions in GLKMath.
I hope you brought
your binoculars.
But basically, I want
to say it provides everything
that you need
for all your OpenGL
Core Profile needs.
It even has support
for quaternions.
So now on to the next
part of your application
that you need to update.
The fixed-function state that
you may have in your app.
Let's talk about using
GLKBaseEffect and GLKit
in order to update
your application
to work with Core Profile.
So you would have had
fixed-function lighting,
fixed-function materials,
fixed-function texturing.
That's no longer available.
Instead we're passing
everything up as shaders
and then GLKEffect provides
you with base shaders
that you can treat similarly
to how your code
may have interacted
with your Legacy OpenGL context.
So here we have our lights that
we're passing up for instance.
We can pass up a
position, diffuse,
specular values for our lights.
Instead with the GLKBaseEffect
we're just setting the
light0.position, diffuse color
and specular color
as you would expect.
Pretty straight forward.
Additionally you can go
enable it with an enabled bit
and then we even have materials.
So it's a very close correlation
to how you would have been using
fixed-function state before.
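A short Objective-C sketch of that correlation (the values are placeholders):

    // GLKBaseEffect standing in for fixed-function lighting and materials.
    GLKBaseEffect *effect = [[GLKBaseEffect alloc] init];
    effect.light0.enabled       = GL_TRUE;
    effect.light0.position      = GLKVector4Make(0.0f, 10.0f, 10.0f, 1.0f);
    effect.light0.diffuseColor  = GLKVector4Make(1.0f, 1.0f, 1.0f, 1.0f);
    effect.light0.specularColor = GLKVector4Make(1.0f, 1.0f, 1.0f, 1.0f);
    effect.material.shininess   = 32.0f;
    [effect prepareToDraw];   // binds the generated shaders before drawing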
So afterwards; some of you
may already have some shaders
written using GLSL 110
or 120, and there's some slight
differences in getting them
to work with 140,
150, 330 and 410.
So again, as I was already
saying, our client
state enables are now switched
to generic attributes.
So we're just going to be
enabling our index here
of our attributes
that we're passing in.
Matrices; we're no longer
loading those matrices
as a built-in matrix
like gl_ModelViewProjectionMatrix
and so forth.
They're instead going
to be generic uniforms
that we pass into our shaders.
And so we're going
to be uploading those
with glUniformMatrix4fv.
Additionally some of the current
state that you would have set
like glColor4fv you can set
those either using similarly
glVertexAttrib4fv for constant
values or glUniform4fv as well
for values that are not
changing very often.
And then additionally all the
pointer calls get replaced
with a generic
VertexAttribPointer call.
So looking at the actual GLSL
language itself, there's some
slight differences
here, where the ins
and outs are now very
explicit in GLSL 150.
And then additionally,
similarly to how the
built-ins are removed
for the fixed-function
state and so forth,
the frag data
output is replaced with an
out that will be where
we're going to be writing
to, which tells us which
one of our draw buffers
to provide a result to.
So we have up here
the attributes
that would have been
attributes in GLSL 110.
In 150 those become in because
the attributes are going
into your vertex shader
so that we're just passing
in vec4 data here.
And then our varyings
that we would have produced
with our vertex shader
and then consumed
with our fragment shader are
no longer called varyings.
They're more explicitly named;
out from the vertex shader
and in in the fragment shader.
So our texCoordinates for
instance we're outputting
that from the vertex shader here
and then inputting
it to the fragment.
And then finally, as I was
mentioning just a moment
ago, gl_FragColor is replaced
with a binding of your choice.
So we've made an out vec4
here called myColor, and prior
to linking our GLSL
program I made sure
to BindFragDataLocation
for myColor so that I'm
specifying it to be writing
to color number
zero with myColor.
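Putting those changes together, a hedged sketch of a matching 110-to-150 shader pair (names assumed):

    // Vertex shader:
    #version 150
    in  vec4 myVerts;
    in  vec2 inTexCoord;
    out vec2 texCoordVarying;      // was "varying" in 110
    uniform mat4 modelViewProjectionMatrix;
    void main() {
        texCoordVarying = inTexCoord;
        gl_Position = modelViewProjectionMatrix * myVerts;
    }

    // Fragment shader:
    #version 150
    in  vec2 texCoordVarying;
    out vec4 myColor;              // was gl_FragColor; bound with
                                   // glBindFragDataLocation(prog, 0, "myColor")
    uniform sampler2D tex;
    void main() {
        myColor = texture(tex, texCoordVarying);
    }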
Some additional changes we've
got in GLSL version 150 over 110.
Now the GLSL version
is also required.
In 110 it would have been
implicit which version you
were using.
However, in Core Profile you're
required to say
#version 150, 330 or 410
at the top of your shaders.
And then some examples
here of the built-ins
that have been removed;
gl_Vertex, gl_Normal, and
gl_MultiTexCoord are replaced
with your own generic uniforms
or vertex attributes here:
vertPos, inNormal, texCoord
that we've named ourselves
and upload as vertex attributes.
And then additionally, some
of the uniform variables,
like the
gl_ModelViewProjectionMatrix
and the
gl_NormalMatrix,
we upload those as
glUniforms here.
And then finally small
change; texture2D,
texture3D are replaced by
just a simple texture call.
And the sampler type
overload determines how
that texture call
should be sampled.
So now that we've got our GLSL
shaders pretty much working
with Core Profile let's
go over a little bit more
of the other API
differences here.
We've got of course different
headers in OpenGL 3, and so
if you can modify your
code to only include gl3.h
and gl3ext.h, you can ensure
that your code builds cleanly
and, as a result, that
you're not making any calls
that may have been
removed from Core Profile.
And if you were, for instance,
to call glCallList in Core
Profile, that would throw an
invalid operation error.
And so instead of having
to figure out at runtime
where you may have errors
just getting rid of the gl.h
and glext.h in your
file can allow you
to know at compile time
which functionality
needs to be replaced.
Additionally getting extension
strings is slightly different.
Instead of getting one huge
string like you would have
in Legacy Profile, it's now
split up into indexed strings,
where you have to get the number
of extensions that are available
and you go through that
loop, and can get each
of the extensions one by one.
And then finally, APPLE_fence
is replaced by Sync objects.
So glSetFenceAPPLE
becomes glFenceSync,
glTestFenceAPPLE gets replaced
with glWaitSync, and then some
of the functions,
like vertex array
objects, are replaced
by the Core equivalents.
So you'll call glGenVertexArrays
instead
of glGenVertexArraysAPPLE
and so forth.
So of course a lot of you
may have somewhat larger
applications where you can't
just go and switch immediately
from Legacy to Core
Profile context.
For you guys I'm suggesting
a more piecemeal approach.
So really you can do any of
these operations by themselves
and not affect the
rest of your code.
So while still staying on
Legacy Profile we're going
to switch our application
first to wherever we have some
of the older draw calls.
We can replace them with drawing
with vertex buffer objects
and vertex array objects.
And this can be done
to multiple pieces
of code at your own timing.
And you don't need to
switch to Core Profile
to start using vertex arrays
and vertex buffer objects.
Secondly replacing the math.
You can actually use GLKMath
with Legacy OpenGL context.
And that's because it
doesn't really have a profile
dependency; it's profile agnostic.
It just gives you back raw data.
So with that raw data instead
of calling gluPerspective
for instance with the projection
matrix, instead I would have my
projection matrix
as just a variable
that I've calculated
using GLKMath.
And then for instance on
the CPU I may then multiply
that by the ModelViewMatrix
as well
to get my
ModelViewProjectionMatrix,
and then once I have
that result;
because I don't need to do
that multiply every single
time in the vertex shader;
I would pass that result in
just using LoadMatrix.
And so using LoadMatrix
you can pass in those
original matrices.
Then for updating your existing
shaders; if you have 110
or 120 shaders,
while staying in 110 you
can take your attributes
and replace them with generic
attributes, use uniforms for any
of the built-ins, and
by doing so
you're getting rid of your
dependency on the built-ins
that were very specific.
Because you can use the
generic attributes with Legacy
and Core Profile alike.
And then additionally,
EXT_gpu_shader4 enables you
to have your out color specified
using BindFragDataLocation just
like in Core Profile.
So in doing this we can
create our shaders in a way
that uses, say,
#define to map, in the
vertex shader, attribute
to in and varyings to out.
And do those #defines such that
you could easily switch your
shaders from 110 to 150
when you do make the switch
to Core Profile.
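A sketch of that prelude for a vertex shader (for fragment shaders, varying would map to in instead):

    // Prepend this when compiling a 110-era vertex shader
    // under a Core Profile context.
    #version 150
    #define attribute in
    #define varying out
    // ...original 110 vertex shader body follows unchanged...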
And then finally, for places
where you may have
fixed-function use right now,
you can make those into
GLSL 110 shaders
and do similar things
to what you were doing
with your existing
shaders just before.
And GLKBaseEffect unfortunately
depends on Core Profile.
So to do this
piecemeal you will have
to be replacing your
fixed-function with shaders.
And so you can do any of these
steps above, but one at a time
and check that, when
you touch this one file
and switch it to vertex buffer
objects and vertex array objects,
you're getting the expected
result just like you used to get.
And so we're able to debug our
application on a more piece
by piece basis, and not do
one big switch all at once.
So after you've made all
these changes you switch
to Core Profile by
specifying Core Profile
in your pixel format attribute
and then update your shader
versions, ideally
by replacing those #defines
with in, out and so forth.
And just a tip for large code
bases where you may have a bunch
of code that's using
Legacy OpenGL context calls.
You do a grep for some of the
strings and tokens that you have
that reference things like
glBegin, glEnd, glLight,
and using that grep and just
doing a line count you can track
how many lines of code you have
left to switch to Core Profile
and track over time
that adoption.
So to summarize what we went
over today we went over a bunch
of new features in Mavericks and
just how to get access to those
by using the Core Profile.
I also wanted to throw
in a little mention
about OpenGL Profiler here.
It allows you to break on
OpenGL errors for instance
which is very useful for
when doing a transition
to Core Profile.
And so you no longer have to
put glGetError in your code
between every single
call to find out
where that error's coming from.
And you shouldn't
have glGetError
in shipping code
anyway in release mode.
So instead you can just use
OpenGL Profiler and break
on GL error as the
screenshot there shows.
And so finally, we did go
over how to use OpenCL
and OpenGL together in
order to do Compute
and solve your compute needs.
So if you have any questions,
contact Allan Schaffer, our Graphics
and Games Technology Evangelist.
We've got some great
documentation
at developer.apple.com/opengl
and then
of course you can
interact with each other
at devforums.apple.com.
And so, the related sessions to
this: again this morning,
Dan went over OpenGL ES and
some of the advances there,
such as instancing, which is
now available with iOS 7.
And then Working with OpenCL
as I mentioned earlier is right
after this talk so stay
put if you're interested
in learning about using OpenCL.
So thank you very much.
[Applause]