WWDC2004 Session 207
Transcript
Kind: captions
Language: en
I'm Travis Browne, the graphics and imaging evangelist, and I helped to put together the graphics and media track, along with help, obviously, from the engineers who've created the great technology and content. I'm generally your host for a lot of things going on with graphics and media here at WWDC. What I wanted to do is take a little bit of time to talk about some big shifts that are happening in graphics, particularly in the Tiger timeframe. You've noticed there have been several mentions of the fact that we're now leveraging floating-point pipelines, for example in the GPU; we also have floating-point pixel support in Quartz 2D, and new technologies like Core Image are also based on floating-point pixel operations. This is an important shift in graphics, because it used to be that eight bits per component was plenty; it was enough, it was ideal, 16.8 million colors encapsulated it all. But as the imaging market has matured, it's become obvious that we need more bits per pixel, and that we need them encoded differently. This raises new challenges that we need to solve inside the OS, not only to be able to pump those pixels through our graphics subsystem, but additionally to be able to do things like read and write them to disk. So in this session we're going to be talking about two specific topics. The first is high dynamic range, which will cover several techniques for how deep pixel data is encoded and put on disk, the available file formats, and also, once you read that deep pixel data back and make it viewable, what you need to do to it to actually create an image you can display on your monitor. Secondly, we'll be talking about the reciprocal changes in the operating system: technologies such as Image I/O, an imaging library that deals with these new data formats, and also changes in areas such as ColorSync and color management. All together these changes complete the picture; they're going to enable you to step beyond eight bits per component and start leveraging the fantastic new capabilities inside the GPUs and our imaging stack to really do new and interesting things with high resolution, high fidelity data. On that note, I'd like to invite David Hayward to the stage to take you through the session.
Thank you. Thank you, Travis, for the introduction, and thank you all for coming to today's session on high dynamic range imaging with Image I/O. What I want to talk about today, and what you'll learn, is the new, exciting, emerging field of high dynamic range imaging, and how you can take advantage of it today in Tiger using a new facet of Quartz called Image I/O. But before I talk about those two fields, and before the people who'll be coming up talk about them in more detail, I want to give a brief update on what's new in ColorSync for Tiger, because ColorSync is one of the key pieces of technology that allows for the proper rendering of both standard and high dynamic range images. So let me give an update on ColorSync for Tiger. We'll be talking briefly about adding floating-point support in ColorSync, the use of Core Foundation types, some API changes we'll be making, some notes for developers of custom CMMs, and some changes to the ColorSync Utility user interface. First and foremost is
floating-point support. As Travis mentioned earlier, one of the things we're trying to do for Tiger is provide a new high fidelity, cinematic graphics environment, and in order to achieve that we need full floating-point support throughout the entire system; one key piece of that is ColorSync. So in order to achieve this, the first thing we needed to do was add a new bitmap structure in ColorSync for supporting arbitrary bitmaps of floating-point data that your application can pass to us. We wanted to make the structure as flexible as possible, so that you wouldn't have to repack the data before you send it to us. This new structure supports both chunky and planar arrangements of data, and also allows the channels to be in any arbitrary order. The way we achieve this in the structure, and this is a little different from other bitmap structures you may have seen, is that instead of having a single base address for all the pixel data, we actually have a different base address for each channel. This allows the channels to be in any order. We also allow both row bytes and column bytes to be specified, which allows your data to be scanned in reverse order if needed, or, if you've got unusual packing between channels, to skip over it. So it's a fairly basic structure, but in most cases people will be passing in a buffer of chunky, or interleaved, data, so we provided a simple utility function called CMFloatBitmapMakeChunky: you supply a single base address and it fills in the structure appropriately for you. In either case, whether you fill in the structure by hand or call this helper API, once you have a source and destination float bitmap you can call ColorSync to match data from one space to another. We have three functions to do this. The first is CMConvertXYZFloatBitmap, which allows you to convert between all the CIE-related color spaces: XYZ, Yxy, Lab, and Luv. There's another function, CMConvertRGBFloatBitmap, which allows you to convert between the RGB-derived spaces: RGB, HSV, and HLS. Both of these functions are based on textbook formulas, so there's no need to pass in a profile or color world to do the transform; they just do the math for you with floating-point precision. The last, and probably the most interesting, is the new API CMMatchFloatBitmap, which allows you to pass in a color world reference to perform the actual transformation. You can create the color world by concatenating one or more profiles, and at that point the data will be sent through the CMM, which, if it supports floating-point data, will do the match in full floating-point precision.
One of the other changes we've made to ColorSync is to integrate more closely with the Core Foundation types. The key way we've done this is that the two common ColorSync opaque data types, CMProfileRef and CMWorldRef, are now CF types. This is quite convenient, because it means you can now call the CF base functions on them, such as CFRetain and CFRelease. It also means you can add profiles and color worlds to dictionaries or arrays, which is handy if you're passing profiles in dictionaries around to other parts of your code. The other way we've supported Core Foundation types is in getting the data out of a profile. One of the questions I often hear from new users of ColorSync is, "I've got this profile reference; how do I get the data out of it?" In the past that was done by calling either CMCopyProfile or CMFlattenProfile. Now it's much easier: you can just call CMProfileCopyICCData, and it will return all the data within the profile as one giant CFData. OK, the next thing I want to
talk about is some API changes we're making for Tiger. Way back, several years ago, one of the features we added to ColorSync, at both the API and the user interface level, was a set of preferences, so that applications could have one place to go for specifying default profiles based on usage or color space. At the time we hoped this would be a way of simplifying the user interface across a wide variety of applications, and so we presented both API and user interface to help with this. In practice, however, it turned out that very few applications used this API, and what we were left with was the user interface in ColorSync Utility, where people say, "I can't figure out what this does, because nothing I change here seems to make a difference." So we're listening to the usage, and we're beginning the process of deprecating the API and also the user interface. We still want any applications that were using this API to function correctly, so what we're doing is changing the behavior of CMGetDefaultProfileBySpace, CMGetDefaultProfileByUse, and CMGetPreferredCMM: instead of storing their preferences as a setting that's global across the whole machine, the settings will now be stored in the current application, current host, current user domain. So the APIs will still function, but we're deprecating them, and this has ramifications in the UI too, which I'll talk about later. I also want to take this
time to talk a little bit about custom CMMs. One of the things we've been doing over the last few years is building even tighter and more powerful integration between the graphics system as a whole, notably Quartz and printing, and color management. In order to achieve this with high performance and high reliability, we have made it so that Quartz and printing will only use the Apple CMM. That said, we have a long tradition of allowing applications and other developers to develop their own CMMs, and of letting applications call those as they wish. It's still possible for an application to have a custom CMM and to explicitly create a color world using that CMM; the recommended API for this now is NCWConcatColorWorld, which has an easy, convenient way for you to specify which CMM to use. The other thing to mention for CMM developers is that there's a new entry point for CMMs, CMMMatchFloatBitmap. If your CMM supports this, then you can have full floating-point support throughout the rest of the Quartz system; if you don't support it, the data will be truncated to 16-bit integers, and everything will still work, just with less precision. Lastly, I want to mention
some changes we're making to ColorSync Utility. As I mentioned earlier, we're deprecating the preferences APIs for default profiles, and one visible manifestation of this is that we're removing that user interface from ColorSync Utility. However, we're adding something in its place: a new pane in ColorSync Utility that we call the Calculator. Let me give a brief demonstration of this over on demo machine two. As you can see, in ColorSync Utility everything looks similar, except there's no longer a Preferences pane as the first item; instead we have a new item, Calculator, which provides a very simple way to convert colors between all the various color spaces using floating-point precision. It's a convenience that also provides a good way to demonstrate our floating-point data path. We can specify our source color space and our destination color space. If we're just converting RGB to HSV, we can see the slider values; we can update the sliders on the left, and they update on the right. One thing you'll notice is that because RGB and HSV are related color spaces, they're basic formulas of each other, so the color on the left will be the same as the color on the right. If we switch to CMYK, you'll see something slightly different: now it's going through a profile, and if I go to a saturated color, you'll notice that the color on the right is desaturated. One of the other things we added is the ability for it to be fully symmetrical, so now, instead of just updating on the left, I can also update on the right, and it'll show the values in the other direction. This is also an interesting way to test out a CMYK profile: we can specify that we want to input Lab values and output CMYK, and as we scroll through all the possible Lab values, we can see what the resulting CMYK values will be. So that's the brief demo of the color calculator; we hope it's a useful function. So, back
to the slides. The next thing I want to talk about is something that's all new for Tiger: the new facet of Quartz called Image I/O. As Travis alluded to earlier, we wanted to provide a new API for image reading and writing across a variety of formats, and this is Image I/O. We'll talk today about its features, its goals, what formats it supports, the clients of this API, some of the core concepts you need to understand to use it, and some advanced techniques as well. So what are the features of Image I/O? First, we want to be able to read from and write to a wide variety of file formats. We also want to support reading and writing metadata, and incremental loading for clients such as web browsers that get data in an incremental fashion over a slow data connection. We want to provide floating-point support, because that's one of the key initiatives for graphics in Tiger. We also want broad color space support, and something called cacheable decompression. Let me expand on that a little. Typically, different APIs for reading and writing image file formats have one of two behaviors in terms of decompression. In the case of the existing Core Graphics APIs, every time you draw the image, it's fully decompressed each time. This obviously has the advantage of very little memory overhead, but it's a performance hit if you draw the image more than once. Other APIs have the behavior that the first time you draw the image, it is fully decompressed and kept; that obviously requires more memory, but has the advantage that subsequent draws perform quickly. There are merits to both approaches, so one of the approaches we've taken with Image I/O is to try to allow for both. Not all file formats support both approaches, but wherever possible we support
both philosophies. Here are some of the overarching goals for Image I/O. First and foremost: reduced code duplication. It turns out there were an embarrassing number of different variants of JPEG readers and writers, and TIFF readers and writers, within our system, and they all had different strengths and weaknesses; if you were actually trying to write an application that read and wrote images, you had to make a choice between which strengths and weaknesses you wanted. We wanted to have a single reference implementation within the system, use it in as many places as possible, and thereby have a single place to make changes in the future. Another goal is to leverage open source, so that the behavior of our API is consistent with other implementations. Improved performance is another key goal: we've been spending a lot of time with the vectorization team at Apple to make sure that our key file formats decompress at optimum speed. Another feature is lazy decompression, in the sense that if all you need is the height and width or metadata of an image, you shouldn't have to fully decompress the data; we want to support that as well.
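The decompression behaviors described above, decode on every draw versus decode once and keep the result, can be combined in one object: decode lazily on first use, then cache the decoded pixels for subsequent draws. Here is a toy sketch of that caching policy; the types and the trivial "decoder" are hypothetical, not Image I/O's implementation:

```c
#include <stdlib.h>

/* Toy model of cacheable decompression: the expensive decode runs at
   most once, on first use, and later draws reuse the cached pixels.
   Hypothetical types; not Image I/O's actual implementation. */
typedef struct {
    const unsigned char *compressed; /* file bytes, always resident       */
    unsigned char *decoded;          /* NULL until the first draw         */
    int decodeCount;                 /* how many times we really decoded  */
} CachedImage;

static unsigned char *Decode(CachedImage *img)
{
    /* Stand-in for a real JPEG/TIFF decoder. */
    unsigned char *pixels = malloc(16);
    for (int i = 0; i < 16; i++)
        pixels[i] = img->compressed[0];
    img->decodeCount++;
    return pixels;
}

static const unsigned char *DrawablePixels(CachedImage *img)
{
    if (img->decoded == NULL)        /* lazy: decode on first request */
        img->decoded = Decode(img);
    return img->decoded;             /* cached: reused on later draws */
}
```

Drawing three times triggers only one decode, which is the memory-for-speed tradeoff; the decode-every-draw behavior is the same code with the cache field removed.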
And lastly, we wanted to make sure we had a very modern, Core Graphics friendly, easy-to-use API, so that you can all easily adopt it in your applications. One of the first questions I always get when I'm talking about Image I/O is, "Well, what formats do you support?" We support all the standards for the internet: TIFF, JPEG, PNG, GIF, and JPEG 2000; these are already supported on the developer CD that you got this week. We're also supporting some exciting new formats, such as the high dynamic range formats OpenEXR and Radiance, some important variants on TIFF such as LogLuv, and some Pixar variants. There are also countless other formats we're going to be supporting: BMP, PSD, QTIF, SGI, and ICNS files,
and we're considering more, both for Tiger and beyond. As for the clients of Image I/O: obviously we hope that anyone who wishes to use this API is free to use it in their application, but there are also lots of places within the system that are going to be calling Image I/O, so you may get the benefits of Image I/O without having to change your code at all. Probably the first and most important client of Image I/O is the Preview application; it's been a great example of the power of the new Image I/O and some of the advantages you get from making strong use of this new API. AppKit will also be switching over; it's not yet switched over in the current developer release, but it will be using the new Image I/O API as well. WebKit and its clients, such as Safari, Mail, and any of your applications that use WebKit, will be using Image I/O. Core Image is using Image I/O to load data in floating-point format. Spotlight is using it for generating thumbnails and getting metadata, and some of our scripting technologies, such as sips and Image Events, are also using Image I/O. So we're trying to use this everywhere in the system. Now I want to give an outline of the API in Image I/O, but before I do that I want to talk a little bit about how images are organized, so you can get an understanding of why we designed the API the way we did. In previous systems,
the standard way of representing an image in Core Graphics was with a CGImageRef, and this is a great basic format for representing images. It allows you to specify three things: the geometry of the image, such as its height, width, row bytes, and pixel size; the color space of the image, which can be a profile or other equivalent description of the color space; and the actual pixel data. This is the minimum information you need to describe an image. However, it turns out that there are a lot of file formats out there, and in many cases they are actually quite elaborate, so one of the things we wanted to support in Image I/O was a richer model for images. For one thing, we want to be able to support thumbnails and metadata for images. Also, a lot of file formats, such as TIFF, support multiple images within the same file, so we want to make sure we support that as well. And there's also a set of attributes that apply to the image file as a whole, rather than to the individual images contained within it: the file format of the image, such as whether it's TIFF or JPEG, and also some properties that apply to the file as a whole; for example, TIFF files can be big-endian. Here's an example of how this works in practice, using a TIFF file. The file type is public.tiff, which is a uniform type identifier that describes this image as being of the type TIFF. We have some properties that apply to the file as a whole, for example the file size in bytes and the endianness of the TIFF. And then we have the standard information for each image, such as its height and width, its color space, its pixel data, its thumbnail if present, and its metadata, such as copyright and artist information, you name it. So here's how
this model is reflected in our API through data types. We use the existing CGImageRef to represent the geometry, color space, and pixel data; the thumbnail is also represented by a CGImageRef; and the metadata and the file properties are represented as key-value pairs in a CFDictionary. So it's all very simple. Now I can talk a little bit about the API. We've added a new data type called CGImageSource, and this is the opaque type used for reading images from either memory or disk. You can create a CGImageSource from a CFURLRef, CFData, or a CGDataProvider. Once you have a CGImageSource, you can query the image source for several attributes: you can ask for the properties of the file as a whole using CGImageSourceGetProperties, you can ask for its file type by calling CGImageSourceGetType, and you can get the count of images using CGImageSourceGetCount. Once you know the count of images, you can then, for each image, ask for its image, its thumbnail, and its metadata. So it's
pretty simple. Just to show you how this works, here's a little code sample that shows how, given a URL, to get the first image out of the file. It also returns some simple metadata, in this case just the DPI of the image in the horizontal and vertical directions. The first thing this code does is call CGImageSourceCreateWithURL, which creates our data type for subsequent access to the file. Then we want to get the set of properties for the first image, so we call CGImageSourceGetPropertiesAtIndex, which returns a dictionary. We can then query that dictionary to see if it has the DPI height and width properties, and return those to the client. Lastly, we need to actually return the image, so we call CGImageSourceCreateImageAtIndex, and that will return the image to the caller. Here's another example, for getting a thumbnail out of an image. Image I/O is very flexible for creating thumbnails: as it turns out, some file formats support thumbnails and some don't, and with some file formats thumbnails can be quite large, so your application may need control over how thumbnails are returned. We've provided that in the Image I/O API via an options dictionary. In this case, we're again creating a CGImageSource by specifying a URL, and then we create an options dictionary with two key-value pairs in it. The first key is kCGImageSourceCreateThumbnailFromImageIfAbsent; this tells Image I/O that even if the image doesn't contain a thumbnail, it should return the actual image instead, so we'll always get an image for the thumbnail. The second key-value pair is kCGImageSourceThumbnailMaxPixelSize, and this allows us to make sure that thumbnails are a reasonable size, which is especially important if you've specified the previous option. So in this case we're saying that we always want an image to be returned, and we want it to be no bigger than 160 pixels. Once we've created that dictionary, all we do is call CGImageSourceCreateThumbnailAtIndex, specifying the image source, index 0, and the options dictionary, and the thumbnail is returned. This is, for example, the way that the Spotlight technology creates thumbnails for images in the search results field. So that's the
basics of reading with Image I/O. Here's what we do for writing. We have another data type, CGImageDestination, which can be created with a CFURL, CFMutableData, or a CGDataConsumer. At the time of creation you also specify the type of file, whether it's JPEG or TIFF for example, and the capacity, that is, the number of images the destination will hold. Once you have a CGImageDestination, you can specify the properties for the file as a whole using CGImageDestinationSetProperties, and then you can repeatedly add each image, with various options and metadata at the same time, using CGImageDestinationAddImage. Lastly, you flush the file out to either the URL or the data by calling CGImageDestinationFinalize, which returns true if the image was successfully flushed. Again, let me give a short example just to show how easy this is to add to your application. We have a function called WriteJPEGData, which takes a URL, an image to write, and a DPI to specify in the metadata. The first thing we do is create an image destination with the URL, specifying that it's going to be of type JPEG and that it will hold one image. The next thing we do is specify a dictionary with three keys and values for options and metadata. One option we're specifying is the quality of the JPEG, specified with the key kCGImagePropertyQuality; in this example we're specifying a quality of 0.8, or eighty percent. The other two key-value pairs are for metadata: kCGImagePropertyDPIWidth and kCGImagePropertyDPIHeight. In this case we're just creating CFNumbers based on the value that was passed in. Once we have this dictionary, we call CGImageDestinationAddImage to add the image and its options and metadata to the CGImageDestination, and lastly we call CGImageDestinationFinalize to write the file to disk. So it's pretty easy. Those are the basics of Image I/O; I hope I've given the impression that this is a very simple and easy API to add to your application, and again, some of these benefits you'll be getting for free if you're using AppKit and other technologies. Let me talk for a minute about some of the more advanced techniques that come up when we talk about image reading and writing, such as extracting RGB data, requesting the depth of an image, and loading an image incrementally.
So, one of the common questions we get is: "I have an image that's been returned from Image I/O, but I don't know what color space it is, I don't know what depth it is, I don't know what pixel format it is, and I have an application that only works in RGB." That's a common scenario, and there's an interesting piece of code that makes it very easy to convert the data, no matter what format it came in, into RGB. Basically, the technique is to use a CGBitmapContext to render the original image into an off-screen buffer. One advantage of this is that it takes care of all the color management correctly: if the image happened to be a Lab or CMYK image and had a profile, it'll be correctly color-matched to the RGB color space you're working in. Another interesting question is the depth of an image. Some formats only support one pixel depth; for example, JPEGs are always eight bits per sample. Other formats can support arbitrary pixel depths; for example, TIFFs can be 1, 2, 4, 8, or 16 bits per sample. As a rule, the image returned by Image I/O will be the same depth as that indicated by the file, so if you open a 16-bit TIFF file, you'll get a 16-bit CGImageRef. However, in the case of high dynamic range file formats, it gets a little more complicated. The data in these file formats is typically stored in special encodings, which can then be decoded in a variety of ways: it can be unpacked to floating-point values, in either 32- or 16-bit formats, or to integers with 16- or 8-bit precision. Also, in the decoding process, the values can either be left as extended-range values, or they can be compressed to the logical 0-to-1 clipped range. Both of these are reasonable types of values to be returned, and your application may want one versus the other. By default, Image I/O will return an image ref that's compressed to 16-bit integers; this gives the best results with reasonable memory for the typical application. However, by request, an application can specify that it wants the floating-point, unprocessed data returned. Here's a brief example showing how to do this. This is a code snippet that, given a URL, will request that the data be returned in floats; if the data is actually returned as floats, a Boolean will be returned to indicate that. As you've seen in the previous examples, we create an image source, and we specify an options dictionary which has, as one of its key-value pairs, kCGImageSourceMaximumDepth with a value of 32. At this point we can ask Image I/O for the properties of the first image given those options, and this will return a dictionary. We can then query that dictionary to see if it has floating-point data or not. Then, lastly, we can get the image and return it to the client.
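To make the two decode choices concrete, here is an illustrative sketch of what "compressing to the logical 0-to-1 clipped range and quantizing to 16-bit integers" means for an extended-range HDR sample, versus leaving it as an unclipped float. This is only the underlying arithmetic, not Image I/O's decoder:

```c
#include <stdint.h>

/* Default-style decode: compress an extended-range HDR sample into the
   logical 0..1 range, then quantize to a 16-bit integer. */
static uint16_t DecodeClipped16(float v)
{
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;                 /* clip highlights above 1.0 */
    return (uint16_t)(v * 65535.0f + 0.5f); /* round to a 16-bit code    */
}

/* Float decode: keep the value as-is, preserving highlight detail. */
static float DecodeFloat(float v)
{
    return v;   /* extended-range values like 4.7 survive untouched */
}
```

Two highlight samples of 2.0 and 8.0 both collapse to the same 16-bit code in the clipped decode, but stay distinguishable in the float decode, which is why an application doing HDR processing asks for the float path.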
Another advanced technique I wanted to make sure people knew we supported is incremental loading of images. I won't go into too much detail on this, but the basic idea is that you create an image source in an incremental fashion using CGImageSourceCreateIncremental, and then you repeatedly add updated data to the image source. Each time you add data, you can request a new image, and it will give you a partial image, or a complete one if the image is fully loaded. Once you're done with that image, you release it, and then, once you've added more data, you can get a new, updated image. It's important that you release the old image before you ask for a new one. So let me give a
brief demonstration of Image I/O in action. One of the things I want to show first is the new Preview, and I've got a bunch of images here open. One nice thing in Preview is that you can open all the images just by selecting a folder, and I've got a variety of images in here. One of them is a Lab image, and we can verify that it is a Lab image by going to Tools > Get Info; this shows the metadata that's been obtained using Image I/O, and we can tell from the metadata that the color model is Lab. We have a variety of other images; we can zoom in and zoom out. The thumbnails over here were obtained using Image I/O as well. We have high dynamic range images here; you can zoom in and zoom out on those, and Luke will show later how we can manipulate these images in real time. Here's another interesting example which I like to show people; this is one of the things we use for testing. Oftentimes people want to know, "How do I know if the profile is being used?" What I have here is a black-and-white CMYK document that has a profile on it that makes gray values disappear. If this image were rendered and the profile were ignored, what you'd see is the text "test profile is not used". You can't see that here, because the profile is being used, but there's actually the word "not" in gray right here. So it provides an interesting test: you can tell whether your profile is being respected or not. In this gray version you can kind of see a little hint of what was once there, the word "not". This is a great way of testing images; we really should distribute these at some point. As one other example of using Image I/O, I have a test application which shows some of the options. I'm going to open one of the images we just saw among the desktop images. With this image open, we can see some information: the height and width, and how long it took to draw. One thing we can do is specify that we'd like to see what this would look like if it were progressively loaded. Now let me open another image, the high dynamic range image; this is a big image, unfortunately, so it takes a couple of seconds to open. If we bring up the metadata on this, by going to Window > Metadata, we can see that it has height and width, and a depth of 16; this is because by default we return 16-bit integers. However, if we want, we can return it as 32-bit floats; again, it'll take a second or so, as this code still needs to be AltiVec-optimized someday soon. If we bring up the metadata now, we can see there's a new property in here saying that the data is returned as floats. So that's the introduction to Image I/O. I'm going to pass the microphone, the demonstration, and all the new stuff over to Luke Wallace, who will be talking about high dynamic range imaging. Thank you.
thank you David so today I will be
talking about Mac os10 support for high
dynamic range imaging which is a new and
exciting feature that we are adding into
the tiger release as many of you know
high dynamic range imaging is generating
a lot of interest and is still a subject
of very active research so we could talk
about high dynamic range imaging from
many different points of view but what I
would like to do today is concentrate on
answering very simple three questions
what is it why use it and how to process
it before we try to answer this question
let's take a quick look at the current
status quo in digital image processing
We can conclude that, in the majority, digital image processing is dominated by what is called the output-referred approach. What it means is that the requirements of image reproduction impose certain requirements on the way we acquire and create images, and because most of the devices we are dealing with, like displays and printers, can only handle 8-bit data per color channel, we impose the same requirement on digital cameras, which in fact could produce much more data, about an order of magnitude more, if they were not restricted to that requirement. Obviously there are some advantages; this is not done for no reason. The main one is that there is very minimal image manipulation required before displaying or printing such an image. But obviously there is a disadvantage: we are losing a lot of color and image information that could be used in further image processing and that could result in much higher quality display or print reproduction. Another requirement, which is sort of hidden in the output-referred approach, is that the data is exchanged in one predefined color space, and in the most typical case this is
sRGB. So when you look at this slide, you see I drew the shape of the typical exchange color space; let it be sRGB. That color space covers only a part of the visual gamut, so everything is fine as long as the camera is acquiring color data within that triangle, but if we are outside, then we are out of luck. We have to do something with this color, and typically we have to push it into the color space. This can be done through different methods, but because cameras are not very sophisticated in terms of processing power, we very often use clipping, and as we know from practice, gamut clipping can produce really bad results, like for example hue shifts. Here is one of the maybe a little bit stronger and exaggerated examples of what could happen, but this is real clipping, in which the white color, because of clipping, became a mixture of completely unrelated colors.
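The hue shift from naive clipping is easy to reproduce numerically. The sketch below is an illustration of the effect only, not any camera's actual pipeline: it takes an out-of-range orange and compares per-channel clipping against uniform scaling. Scaling preserves the hue; clipping shifts it.

```python
import colorsys

def clip(rgb):
    # Per-channel clip into the 0..1 range
    return tuple(min(1.0, max(0.0, c)) for c in rgb)

def scale(rgb):
    # Uniform scale by the largest channel, which preserves hue
    m = max(rgb)
    return tuple(c / m for c in rgb)

def hue(rgb):
    return colorsys.rgb_to_hsv(*rgb)[0]

scene = (1.8, 1.0, 0.2)   # an out-of-range orange
print(hue(scene), hue(scale(scene)), hue(clip(scene)))
```

Here clipping flattens the red and green channels together, moving the color from orange toward yellow, while the scaled version keeps the original hue exactly.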
So that is what we can conclude when we look at image processing from the point of view of device capabilities to reproduce the image. What I would like to do now is look at image processing from a little bit different perspective, the perspective of human vision. As we know from the very rich research in this area, color and visual acuity are the most important characteristics of the scene, and not only this, these two depend on luminance and the observer's visual adaptation. We know that we can measure the world of luminance, and it covers the range of values from 10 to the power of minus 6 all the way to 10 to the power of 8 when measured in candelas per square meter. But what is important for us is that different ranges of luminance create different illumination conditions, and that illumination can stretch all the way from very dark starlight environments to well beyond sunlight. Now, I could spend a lot of time talking about the psychophysical and physiological mechanisms controlling our vision, but what I would like to do, without going into those details, is say that humans have three types of vision, which depend on the level of luminance. We have scotopic vision, which works when we are in a dark environment; we have mesopic vision, which works in a dimly lit environment; and finally, when we are in a highly illuminated environment, we switch to photopic vision. Why this division is important: because our quality of vision is related to the type of vision. As we know, if we look at something in a very dark environment, we have no color vision and very poor acuity; everything in the darkness seems to be just a shade of gray. On the other hand, our best vision is in the photopic range, where we can see many colors and have good color and visual acuity. This is not everything. What is very important is that humans have a limited simultaneous range, which also depends on the type of illumination, and here I'm showing the widest simultaneous range, which again exists in photopic vision and can cover a range on the order of magnitude of three to four. But if we try to estimate this simultaneous range in poorer illumination, the values can drop by two orders of magnitude.
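All of these ranges are compared on logarithmic scales, so a small helper makes the numbers concrete. This is my own generic illustration, not anything from the session: dynamic range expressed both in orders of magnitude (base 10) and in photographic stops (base 2).

```python
import math

def orders_of_magnitude(l_max: float, l_min: float) -> float:
    # Base-10 span: how many powers of ten between darkest and brightest
    return math.log10(l_max / l_min)

def stops(l_max: float, l_min: float) -> float:
    # Base-2 span: each stop doubles the luminance
    return math.log2(l_max / l_min)

# Measurable luminance in the world, in cd/m^2:
print(orders_of_magnitude(1e8, 1e-6))   # about 14 orders of magnitude

# A typical 8-bit display with a contrast ratio around 100:1:
print(orders_of_magnitude(100.0, 1.0), stops(100.0, 1.0))
```

The same helpers cover the stops metric used later in the tone mapping demo, since one stop is one doubling of the range of values.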
We may ask ourselves why this is all important. Well, I think there is an answer: because if we want to faithfully represent the scene that we want to process through image processing, we should have a mechanism to encode the data the same way as, or at least as close as possible to, the fidelity of human vision. So now let's take a look at where in this picture we can fit the typical 8-bit display. As we know, the typical 8-bit display can cover a range of luminance on the order of magnitude of two. That is a big discrepancy between the human simultaneous range and the dynamic range of a display, so this is the biggest challenge that we are facing: we have to map the relatively wide human simultaneous range into the low dynamic range of our display device. There is one solution which we already know about; this is output-referred digital photography. We are imposing low resolution and a small color space, and the only thing we can do is choose between different options. This is a simplistic view, in which we may say, well, if I want to expose the details in the highlights I can use the short exposure, but if I want to see the details in the shadows I can sacrifice the details in the highlights and use the long exposure to capture what I want. The most important point is that this applied exposure is permanent.
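A toy model of an output-referred capture shows why the choice is irreversible. In this sketch (an illustration only, with made-up radiance values and a deliberately crude 8-bit pipeline), a short exposure separates two bright highlights but crushes the shadows to zero, while a long exposure separates the shadows but clips both highlights to the same 255.

```python
def capture_8bit(radiance: float, exposure: float) -> int:
    # Output-referred pipeline: scale by exposure, quantize to 8 bits,
    # clip. Whatever is lost here cannot be recovered from the file.
    return max(0, min(255, round(radiance * exposure * 255)))

highlights = (4.0, 8.0)      # two bright scene values
shadows = (0.004, 0.008)     # two dark scene values
short, long_ = 0.1, 1.0      # two candidate exposure times

print([capture_8bit(r, short) for r in highlights])   # distinguishable
print([capture_8bit(r, long_) for r in highlights])   # both clip to 255
print([capture_8bit(r, short) for r in shadows])      # both quantize to 0
print([capture_8bit(r, long_) for r in shadows])      # distinguishable
```

Once the values collapse to 255 or 0 in the file, no amount of later processing can tell the two scene values apart.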
Once we burn this into the image, there is no way back. So I think that at this moment I will try to answer the question: what is high dynamic range? I think that we can define high dynamic range as a special encoding of the image data which allows us to preserve the full fidelity of human vision. From the implementation point of view, high dynamic range imaging is based on color values that, first of all, extend over at least four orders of magnitude, that can encompass the entire visible color gamut, and that allow color values outside of the typical 0-to-1 range. In summary, what it means is that in high dynamic range imaging we are no longer limited to a specific color space; we are trying to encompass, as I said, all visible colors. But on the other hand, we need to remember that we no longer have a convenient, ready-to-display or ready-to-print image: high dynamic range data requires some kind of manipulation before it can be displayed. The big advantage is that we can make this decision, with our preferences, at the moment when we need to reproduce the image, instead of burning it into the image. This is a kind of simple explanation of how we can do that: we can go back and select the short exposure or the long exposure, but most importantly we can implement something which was not needed before, which is tone rendering. That will allow us to achieve completely different results; for example, here I can try to combine in one image the details from the highlights with the details from the shadows. So now I'll try to answer the
question: why use high dynamic range images? The most important reason is to preserve the scene-referred information that can be useful in further image processing. This way we want to avoid intermediate encodings with a restrictive color gamut, which was happening in the previous approach, called output-referred, and we can also avoid the irreversible modifications that happen during image acquisition. How to process high dynamic range images? The simplest answer is that we should not add any rounding or clipping errors, and for that we want to render and capture the data in floating point. We want to store the entire image and, as needed, process the color data in an extended color space, which again will not impose any clipping. At the end we want to apply the tone mapping for a specific image reproduction. For example, that specific reproduction could be the example I just showed you, where I want to see all the details in the image, from the highlights to the shadows. Now let's take a look at the file formats that we are supporting in Tiger. I think that the most important citizen here is OpenEXR, which comes from ILM. First of all, it has the smallest quantization error, and most importantly, as we will see later, it comes with a recommended way of tone rendering, which solves a lot of problems in terms of presenting the image content. The other formats basically just define a way of encoding and decoding the data while preserving the image fidelity. So now I
would like to show you, if I can get my demo 2 machine, my little application in which I can open high dynamic range images. What I would like to show you is that we have to do something with those values, which are so large, much bigger than what we can represent with the typical range of 0 to 1. One would think that a very simple approach would be to simply map the brightest point in the image to the brightest point of the display, but if I do that with my little demo application, you see that we don't see much in this image. There is way too much information beyond one, and scaling didn't produce any visible image. Another very simple approach could be: okay, let's say I would like to see whatever you have in this image; clip the values to the typical 0-to-1 range and show me that. Well, as you see, the image quality somehow improved, but it's still very poor. And now, if I use OpenEXR with their default zero exposure value, I'm getting a reasonable result and I can see many more details. Not only this, I can do what I was talking about: I can impose my preference at the moment of reproducing the image. For example, someone may like this kind of image, and someone else may still want to focus on this beautiful stained glass. I want to show you a couple of classic examples, like for example the same Memorial Church picture, which comes from the Debevec website, and the same thing happens here: if we just scale the image, the image is basically unreadable; clipping will show something, but the quality is really poor; OpenEXR is doing a very good job here.
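The three display mappings in this demo are easy to sketch. The code below is my own simplified stand-in, not the session's application, and the last function only assumes an OpenEXR-style viewing transform of an exposure scale followed by a display gamma (it is not the exact exrdisplay math): scale-to-maximum leaves most pixels nearly black, clipping discards everything above 1.0, and exposure-plus-gamma brings mid-range detail into view.

```python
def scale_to_max(pixel: float, image_max: float) -> float:
    # Map the brightest point in the image to display white
    return pixel / image_max

def clip01(pixel: float) -> float:
    # Discard everything outside the 0..1 range
    return min(1.0, max(0.0, pixel))

def exposure_display(pixel: float, stops: float = 0.0, gamma: float = 2.2) -> float:
    # Assumed OpenEXR-style viewing: exposure scale, then display gamma
    scaled = max(0.0, pixel) * (2.0 ** stops)
    return min(1.0, scaled ** (1.0 / gamma))

# A mid-tone pixel of 0.5 in an image whose peak highlight is 400:
print(scale_to_max(0.5, 400.0))     # nearly black
print(clip01(0.5), clip01(3.0))     # mid-tone kept, highlight flattened
print(exposure_display(0.5))        # comfortably visible
```

Raising the stops parameter plays the role of the exposure slider in the demo: the same file can be re-rendered brighter or darker at view time, with nothing burned in.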
Another example is the picture I was using in the previous slides, of our garage at Apple. This is how it looks when scaled; this is how it looks when clipped, once again with the typical hue shift when clipping the data; and OpenEXR produces quite a reasonable result. This leads us to the conclusion that tone rendering is a very important issue when processing high dynamic range images, and there may be many different methods of doing that. I think this gives me a very good segue to introduce Gabriel Marcoux, who will be talking about high dynamic range tone mapping developed at Apple. Thank you
very much for the introduction. As you have seen, the main problem of rendering high dynamic range images is how to map, how to reduce, the high dynamic range into the low dynamic range of a device, and this problem is not trivial. So we have looked into what is available in the published literature, and here I put a list with a few methods that I selected from what is available. Aside from OpenEXR, you can see histogram adjustment, proposed by Ward Larson; this is a class of methods, and a good review of these methods was published at the Electronic Imaging conference in 2002. Another interesting approach is based on color appearance models; using one of these models, iCAM, Fairchild and Johnson try to reduce the high dynamic range image to a displayable one. The last three in this list were proposed at SIGGRAPH 2002, and they are fast bilateral filtering, the dodge-and-burn method, and gradient compression. You can see in this list that you can group the methods into two classes: one class is general algorithms that are applied to 8-bit images to get more information from them and reduce the global contrast to something that is visible on the display, and the second class is algorithms that are specifically designed to handle high dynamic range images. While doing this research into the available methods, we came up with our own algorithm, the Apple method, which I will demonstrate in a minute. One of the
things that you face when you try to design, or even evaluate, a method is: what is the intent of that method, aside from just compressing the high dynamic range into the dynamic range of the display? Here is a flavor of two kinds of rendering intents: one that retains the look and feel of high dynamic range, which you can see in the image on the left, and another that shows as much content from the image as possible. Comparing these two images around the highlight region, in the image that retains the look and feel of a high dynamic range image you see the glow around the window, while in the other you see more detail, more color, of the stained glass of that ceiling window. You can see the same thing happening in the window on the wall. It depends on the subjective preference of the user which of these settings they will prefer to use; the good thing is that with the Apple method we are allowing both intents to be shown. So I will do a demo of this and I will show you how this algorithm is working. So we open the
viewer, and with this we can choose between different images. Let's start with the Memorial image, which is, let's say, a classical one, picked as an example by a lot of algorithms; this is a good thing because you can compare algorithms on the same image. This is the rendering whose intent is showing the glow around the windows and around the highlights in the image. I also show here the high dynamic range size; a good metric for that is the logarithm of the maximum over the minimum value in the file, and this gives us thirteen stops. You can see at the bottom the actual range of floating-point values in the file. I said that we can support both kinds of intents, and right now I switch to the intent where you can see the color of the stained glass in the window. So we can open different kinds of files, not only from one source, and you can evaluate the power of this method. This is an image from the OpenEXR test images, and it's quite interesting: it has 18 stops, so it's quite a large high dynamic range, and you can see details in the shadows, you can see details in the highlights; you can also look at the text that is on the book and see the details under the table. As you have noted, I haven't changed any of the default parameters of this method, so our intent was to actually design a method where the user will have as little intervention as possible and will be allowed to take the default parameters and get decent results with
these settings. So I chose also another image from another source, the Nave image, which is the image with the highest dynamic range that I was able to find publicly available, and you can see that this image has 22 stops, one stop meaning a doubling of the range of values in the file. The difficulty in rendering this image is around this window, where you are required to show the colors of the window and also not to show visible artifacts, like ringing or darkening of the surrounding area. The last image that I will show is an image that we created ourselves, and you can see details from outside and inside, details in the shadows and in the highlights; with the high resolution image you are even able to see the fluorescent tubes at the top. So this is the demonstration of the
HDR viewer. This brings me to the next topic that I would like to touch on, which is the creation of high dynamic range images. In summary, about the viewer: you can see that with default parameters we are able to open images from different sources, and this method is quite robust, in the sense of showing very good results for different high dynamic range images. How do we create high dynamic range images? That is the next topic, and it is quite interesting. We can start with the file format of these images: we have to encode in RGB floats the radiance of the
scene. So how do we capture this? We turned to a method that was published by Debevec and Malik, "Recovering High Dynamic Range Radiance Maps from Photographs". Essentially this method requires taking multiple shots of the same scene at different exposures and then combining these exposures into a high dynamic range file. We start with the block diagram of the digital camera. If you look closely, you can see that the scene radiance is transformed to the digital output in the digital file, which may be JPEG or another format, by a set of transformations: first the image passes through the lenses, then the shutter; then the image is captured by the CCD and converted by an A/D converter; then some mapping happens in the camera, for example gamma correction or raw-to-JPEG transformation; and finally you get the digital values in the file. Because the scene radiance is in direct correspondence with the sensor irradiance, the first block can be skipped, and we can group the last four blocks into a single entity which can be described by a transfer function. So we get the digital output in the file as a function of the sensor irradiance E and the exposure time Δt. With a little math on this, applying the inverse transformation and a log transformation, we can recover the sensor irradiance from the digital output and the exposure time. This is easily possible; the only thing that we need is the middle term in there, which is marked in blue and which is referred to as the camera exposure function. Once we are able to derive this function, we use this equation to immediately find the scene radiance. So we will concentrate now on deriving this function, which we will call g in the next slides. This function describes the exposure's dependency on the output gray level. The idea is to take multiple exposures and then pick a gray level in one of the exposures, which is
level in one of the exposure which is
that white ring in there in the first
image and then look at how the gray
level is changing from one image to
another so we keep the same position
same x and y in the image and we look
and plot the variation of the gray level
from one image to another this will give
us a curve that is showing up like in
black in the diagram we can do this for
other grade levels in the image and we
can recover several curves and finally
we end up with a set of curves that
described this variation of the gray
level and what we need in the end is to
put these curves together and derive a
single curve that is the exposure
function of the camera so now once we
know this exposure function of the
camera we know the output gray level in
the digital file and the exposure time
we are able to recover the scene
radiance of the image that we capture
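The recovery step can be sketched in a few lines. Everything below is a simplified illustration of the Debevec-Malik relation ln E = g(Z) − ln Δt, with one big assumption baked in: instead of solving the least-squares system for the response curve as the paper (and the application shown here) does, I simply assume the camera applies a known power-law response, so g is available in closed form.

```python
import math

GAMMA = 2.2  # assumed camera response: z = (E * dt) ** (1 / GAMMA)

def g(z: float) -> float:
    # g(z) = ln f^-1(z); for the assumed power-law response,
    # f^-1(z) = z ** GAMMA, so g(z) = GAMMA * ln(z)
    return GAMMA * math.log(z)

def weight(z: float) -> float:
    # Hat weighting: trust mid-range gray levels, distrust the
    # nearly clipped and nearly black ones
    return min(z, 1.0 - z)

def merge(samples):
    # samples: (gray level z in 0..1, exposure time dt) per shot.
    # Each shot gives an estimate ln E = g(z) - ln(dt); combine the
    # estimates with a weighted average, then exponentiate.
    num = den = 0.0
    for z, dt in samples:
        w = weight(z)
        if w <= 0.0:
            continue  # skip fully clipped or empty pixels
        num += w * (g(z) - math.log(dt))
        den += w
    return math.exp(num / den)

# Simulate one pixel of true radiance 2.0 shot at three exposures:
E_true = 2.0
times = [0.05, 0.1, 0.2]
shots = [(min(1.0, (E_true * dt) ** (1 / GAMMA)), dt) for dt in times]
print(merge(shots))   # recovers roughly 2.0
```

In the real method g itself is recovered from the image set, which is exactly what the Creator application automates.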
And with this I would like to demo the application that is able to create these high dynamic range images. So let's close this and open the Creator. The Creator is an application that first allows us to pick several images, and I choose a number of already exposed shots; we get a thumbnail view on the left side, and in here we can select any of these images and see their content. You notice immediately that no matter how we take the images, you can see either the details in the shadows or the details in the highlights, while the high dynamic range file will be able to capture all the information of the radiance of the scene and will encapsulate it in a high dynamic range format. The first thing is to calculate the transfer function of the camera, and we do this in a single step: you have seen several curves put together, and you recover the transfer function of the camera. Then the next
step is to use this transfer function and compute the high dynamic range image. Now, even the paper says that you need to do this processing of the transfer function of the camera over many images, until you get an average behavior of the camera, and then use that transfer function to create a high dynamic range image. We actually added more robustness to the algorithm that computes the transfer function of the camera, such that we are able to recover the transfer function from only the same set of images that we use to create the high dynamic range image. So we recover the function from this set of images, we apply this function to these images, and we create the high dynamic range. This brings us to an algorithm that is able to do this kind of thing from just the specified set of images: we choose a set of images, then the algorithm computes the transfer function of the camera and computes the high dynamic range in a single shot. This gives this application independence from any camera settings or any setup that you may have been required to do for the method as published in the literature. So let's say you are switching to another set of images, for example from a different camera: you don't have to specify the camera, and you immediately get the high dynamic range file directly from specifying only the set of images. The interesting thing about this is that we want the user, the Apple user, to have as little intervention as possible in how this algorithm works, and finally end up with an
application that will be able to provide high dynamic range images directly, without you taking care of anything; this is an advantage for the user. Finally, I would like to mention that, as you have seen here, we select the images from a folder, but we have worked with Image Capture, so I invite you on Friday afternoon at 5 p.m. to see an integration of this algorithm with the Image Capture modules; you will see an interactive demonstration of how these images are captured live with the camera and then a high dynamic range file is created. And with this, I thank you very much, and I will turn it back to Luke.
[Applause]
Thank you, Gabriel. So at the end of the
presentation on high dynamic range images, I would like to touch on a subject which is very close to our hearts as ColorSync engineers. We are really interested in color managing high dynamic range images, and you must know that this is an area which is under very intensive investigation, both in academia and in industry. At Apple, we are also developing our own method of color managing high dynamic range images, and we are trying to take a new approach, which is based on human adaptation to the image viewing environment. We think that the image contains enough white point and adapting luminance information to be used by a color appearance model to predict human perception of color in a different viewing environment, which basically means we can color manage our high dynamic range images. Just to clarify
dynamic range images just to clarify
what kind of color appearance model
modeling we are dealing with let me say
that we are looking at this kind of
modeling which consists of two major is
based on two major concepts of on
chromatic adaptation which allows us to
predict the influence of adopted white
point on the color perception and on the
second concept is the degree of
adaptation which allows us to predict
simultaneous color contrast related to
the luminance of adopted white point so
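As a rough illustration of those two concepts (this is not Apple's method, which is not disclosed in this session), here is a von Kries-style chromatic adaptation sketch: cone responses are rescaled by the ratio of destination to source white point, blended by a degree-of-adaptation factor D. A full appearance model works in a sharpened cone space and derives D from the adapting luminance; here the LMS values and D are simply given.

```python
def adapt(lms, src_white, dst_white, degree=1.0):
    # Von Kries-style adaptation: scale each cone channel by the
    # destination/source white ratio, blended by the degree of
    # adaptation D (D=1: complete adaptation, D=0: none).
    return tuple(
        c * (degree * dw / sw + (1.0 - degree))
        for c, sw, dw in zip(lms, src_white, dst_white)
    )

# A color seen under a warm source white, re-rendered for a neutral white:
src_w = (1.1, 1.0, 0.8)
dst_w = (1.0, 1.0, 1.0)
color = (0.55, 0.50, 0.40)

print(adapt(color, src_w, dst_w, degree=1.0))   # fully adapted
print(adapt(color, src_w, dst_w, degree=0.0))   # unchanged
```

With full adaptation the source white itself maps exactly onto the destination white, which is the defining property of a chromatic adaptation transform.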
In summary, what we are trying to do is transform the chromaticity of the source, using the high dynamic range data, to our destination, and then, after color managing and bringing it into the new environment, apply tone mapping, which will compress the colors into the range of the destination device. And this concludes our talk on high dynamic range images; I'll turn the microphone back to David. Thank you.
Thank you, Luke and Gabriel, for the discussion. I just wanted to bring it back to do a quick summary slide, and then we'll have a few minutes of Q&A. I just wanted to summarize once again what's new in Tiger; we've got a lot of great stuff here. First of all, in ColorSync we have floating-point support. In Image I/O we have a brand new, modern API for reading and writing, with optimized performance and support for metadata. And then we're doing a lot with high dynamic range: supporting OpenEXR file formats, access to compressed or unprocessed data, and this is an area of all sorts of ongoing and future research, so we'll have more to show you. So again, we have a few other places you might want to go: there's a graphics and media lab session on Thursday where you can talk to us if you have more questions that we can't get to today, and there's also going to be a great demonstration at the last session on Friday, talking about Image Capture and high dynamic range.