---
title: WWDC2004 Session 207
framework: wwdc
role: article
path: wwdc/wwdc2004-207
---

# WWDC2004 Session 207

## Transcript

I'm Travis Browne, the graphics and imaging evangelist, and I helped to put together the graphics and media track, along with help from, obviously, the engineers who've created the great technology and content. I'm sort of generally your host for a lot of things going on with graphics and media here at WWDC.

What I wanted to take a little bit of time to do is talk about some big shifts that are happening in graphics, particularly in the Tiger timeframe. You've noticed there have been several mentions of the fact that we're now leveraging floating-point pipelines, for example in the GPU. We also have floating-point pixel support in Quartz 2D, and new technologies like Core Image are also based on floating-point pixel operations. This is an important shift in graphics, because it used to be that eight bits per component was plenty. It was enough, it was the ideal: 16.8 million colors encapsulated it all. But as the imaging market has matured, it's really become obvious that we need more bits per pixel. We also need them encoded differently, and this raises new challenges that we need to solve inside the OS, not only to be able to pump those through our graphics subsystem but additionally to be able to do things like read and write them to disk.

So in this session we're going to be talking about two topics in specific. One is high dynamic range, which is going to cover several techniques in terms of how deep pixel data is encoded and put on disk, the available file formats, plus, once you read that deep pixel data back and want to make it viewable, what you need to do to it to create an image you can display on your monitor. And then, second, we'll be talking about the reciprocal changes in the operating system: technologies such as Image I/O, which is an imaging library that is going to deal with these new data formats, and also changes in areas such as ColorSync and color
management, because these changes all together sort of complete the picture. They're going to enable you to step beyond eight bits per component and start leveraging the fantastic new capabilities inside the GPUs and our imaging stack to really do new and interesting things with high-resolution, high-fidelity data. On that note, I'd like to invite David Hayward to the stage to take you through the session. Thank you.

Thank you, Travis, for the introduction, and thank you all for coming to today's session on high dynamic range imaging with Image I/O. What I want to talk about today, and what you'll learn about, is the exciting emerging field of high dynamic range imaging and how you can take advantage of it today in Tiger using a new facet of Quartz called Image I/O. But before I talk about those two fields, and before the people who'll be coming up talk about them in more detail, I want to give a brief update on what's new in ColorSync for Tiger, because ColorSync is one of the key pieces of technology that allows for the proper rendering of both standard and high dynamic range images.

So let me give an update on ColorSync for Tiger. We'll be talking briefly about adding floating-point support in ColorSync, use of Core Foundation types, some API changes we'll be making, some notes for developers of custom CMMs, and some changes to ColorSync Utility's user interface.

First and foremost is floating-point support. As Travis mentioned earlier, one of the things we're trying to do for Tiger is provide a new high-fidelity, cinematic graphics environment, and in order to achieve that we need full floating-point support throughout the entire system. One key piece of that is ColorSync. In order to achieve this, the first thing we needed to do was to have a new bitmap structure in ColorSync for supporting arbitrary bitmaps of floating-point data that your application can pass to us. We wanted to make the structure as flexible as possible so that you wouldn't have to repack the
data before you send it to us. So this new structure supports both chunky and planar arrangements of data and also allows for the channels to be in any arbitrary order. The way we achieve this is a little different from other bitmap structures you may have seen: instead of having a single base address for all the pixel data, we actually have a different base address for each channel. This allows for the channels to be in any order. We also allow for both row bytes and column bytes to be specified, which allows you to have your data scanned in reverse order if needed, or to skip over unusual packing between channels.

So it's a fairly basic structure, but in most cases people will be passing in a buffer of chunky, or interleaved, data, so we provided a simple utility function called CMFloatBitmapMakeChunky, which you supply with a single base address, and it'll fill in the structure appropriately for you. In either case, whether you fill in the structure by hand or call this helper API, once you have a source and a destination float bitmap you can then call ColorSync to match data from one space to another.

We have three functions to do this. The first one is CMConvertXYZFloatBitmap, which allows you to convert between all the CIE-related color spaces: XYZ, Yxy, Lab, and Luv. There is another function, CMConvertRGBFloatBitmap, which allows you to convert between the RGB-derived spaces: RGB, HSV, and HLS. Both of these functions are based on textbook formulas, so there's no need to pass in a profile or color world to do the transform; it just does the math for you with floating-point precision. The last, and probably the most interesting, is
the new API, CMMatchFloatBitmap, which allows you to pass in a color world reference to perform the actual transformation. You can create the color world by concatenating one or more profiles, and at that point the data will be sent through the CMM, which, if it supports floating-point data, will do the match in full floating-point precision.

One of the other changes we've made to ColorSync is to integrate it more closely with the Core Foundation types. The key way we've done this is that the two common ColorSync opaque data types, the CMProfileRef and the CMWorldRef, are now CF types. This is quite convenient because it means that you can now call the CF base functions such as CFRetain and CFRelease. It also means you can add profiles and color worlds to dictionaries or arrays, which is handy if you're passing profiles and dictionaries around to other parts of your code.

The other way that we've supported Core Foundation types is in actually getting the data out of a profile. One of the questions I often hear from new users of ColorSync is, "I've got this profile reference; how do I get the data out of it?" In the past that was done by calling either CMCopyProfile or CMFlattenProfile. Now it's much easier: you can just call CMProfileCopyICCData, and it will return all the data within the profile as one giant CFData.

OK, the next thing I want to talk about is some API changes that we're making for Tiger. Several years ago, one of the features we added to ColorSync, at both the API and the user interface level, was a set of preferences so that applications could have one place to go for specifying default profiles based on usage or color space. At the time we hoped that this would be a way of simplifying the user interface across a wide variety of applications, and so we presented both API and user interface to help with this. In practice, however, it turned out that very few applications have used this API, and what we're
left with is the user interface in ColorSync Utility, where people say, "I can't figure out what this does, because nothing I change here seems to make a difference." So we're listening to that, and we're beginning the process of deprecating the API and also the user interface. We still want any applications that were using this API to function correctly, so what we're doing is changing the behavior of CMGetDefaultProfileBySpace, CMGetDefaultProfileByUse, and CMGetPreferredCMM. Instead of storing their preferences as a setting that's global across the whole machine, they will now be stored in the current application, current host, current user domain. So the APIs will still function, but we're deprecating them. This has some ramifications in the UI as well, which I'll talk about later.

I also want to take this time to talk a little bit about custom CMMs. One of the things that we've been doing over the last few years is making even tighter and more powerful integration between the graphics system as a whole, notably Quartz and printing, and color management. In order to achieve this with high performance and high reliability, we have made it so that Quartz and printing will only use the Apple CMM. That said, we have a long tradition of allowing other developers to develop their own CMMs and of allowing applications to call those as they wish. It's still possible for applications to have a custom CMM and to explicitly create a color world using that CMM. The recommended API for this now is NCWConcatColorWorld, which has an easy, convenient way for you to specify which CMM to use.

The other thing to mention for CMM developers is that there's a new entry point for CMMs, which is CMMMatchFloatBitmap. If your CMM supports this, then you can have full floating-point support throughout the rest of the Quartz system, and if you don't support it, then the data will be
truncated to 16-bit integers and everything will still work sufficiently.

Lastly, I want to mention some changes we're making to ColorSync Utility. As I mentioned earlier, we're deprecating the preferences APIs for default profiles, and one visual manifestation of this is that we're removing that user interface from ColorSync Utility. However, we're adding something in its place: a new utility within ColorSync Utility that we call a calculator. So let me give a brief demonstration of this on Demo 2 here.

As we see in ColorSync Utility, everything looks similar except that there's no longer the Preferences pane as the first item. Instead we have a new item, Calculator, which provides a very simple way to convert between all the various color spaces using floating-point precision. This is a convenience that also provides a good way to demonstrate our floating-point data path. So obviously we can specify our source color space and our destination color space. If we're just converting RGB to HSV, we can see the slider values: we can update the sliders on the left and they update on the right. One thing you'll notice is that because RGB and HSV are related color spaces, there are basic formulas from one to the other, and as a result the color on the left will be the same as the color on the right. If we switch to CMYK, you'll see something slightly different, which is that now it's going through a profile, and if I go to a saturated color you'll notice that the color on the right is desaturated.

One of the other things we added is the ability for it to be fully symmetrical, so now instead of just updating on the left I can also update on the right, and it'll show you the values in that direction. This is also an interesting way to test out a CMYK profile: we can specify that we want to input Lab values and output to CMYK, and as we scroll through all the possible Lab values we can see what the resulting CMYK values will be. So that's the brief demo of
the color calculator; we hope that's a useful function. So, back to slides.

The next thing I want to talk about is something that's all new for Tiger, which is this new facet of Quartz called Image I/O. As Travis alluded to earlier, we wanted to provide a new API for reading and writing images in a variety of formats, and this is Image I/O. We'll be talking today about its features, its goals, what formats it supports, the clients of this API, some of the core concepts you need to understand for using it, and some advanced techniques as well.

So what are the features of Image I/O? Well, first, we want to be able to read a wide variety of file formats and write to a wide variety of file formats. We also want to support reading and writing metadata, and we want to support incremental loading for clients such as web browsers that get data in an incremental fashion over a slow data connection. We also want floating-point support, because that's one of the key initiatives for graphics in Tiger. We also want to have broad color space support and something called cacheable decompression.

Let me say a little bit about this last one. Typically, different APIs for reading and writing image file formats have one of two behaviors in terms of decompression. In the case of the existing Core Graphics APIs, every time you draw the image it's fully decompressed. This obviously has the advantage that you have very little memory overhead, but it's a performance hit if you draw the image more than once. Other APIs have the behavior that the first time you draw the image it's fully decompressed and the result is kept, which obviously requires more memory but has the advantage that subsequent draws will perform quickly. There are merits to both approaches, so what we've done with Image I/O is to try to allow for both. Not all file formats support both approaches, but wherever possible we support both philosophies.

Here are some of the
overarching goals for Image I/O. First and foremost was to reduce code duplication. It turns out there was an embarrassing number of different variants of JPEG readers and writers and TIFF readers and writers within our system, and they all had different strengths and weaknesses. If you were trying to write an application that read and wrote images, you had to make a choice about which strengths and weaknesses you wanted. We wanted to have a single reference implementation within the system and use that in as many places as possible, so that we have a single place to make changes in the future.

One of the other goals is to leverage open source, so that the behavior of our API is consistent with other implementations. Another is improved performance. This is one of the other key things: we've been spending a lot of time with the vectorization team at Apple to make sure that our key file formats decompress with optimum speed. Another feature was lazy decompression, in the sense that if all you need to do is get the height and width or the metadata out of an image, you shouldn't have to fully decompress the data, so we want to support that as well. And lastly, we wanted to make sure we had a very modern, Core Graphics-friendly, and easy-to-use API so that you could all easily adopt it in your applications.

One of the first questions I always get when I'm talking about Image I/O is, "Well, what formats do you support?" We support all the standards for the internet: TIFF, JPEG, PNG, GIF, and JPEG 2000. These are already supported on the developer CD that you got this week. We're also supporting some exciting new formats, including high dynamic range formats such as OpenEXR and Radiance, and some important variants of TIFF such as LogLuv and some Pixar variants. There are also countless other formats we're going to be supporting: BMP, PSD, QTIF, SGI, and ICNS files. And we're considering more, both for Tiger and beyond.

So, the clients for Image I/O. Obviously we hope that anyone who wishes to
use this API is free to use it in their application, but there are also lots of places within the system that are going to be calling Image I/O, so you may get the benefits of Image I/O without having to change your code at all. Probably the first and most important client of Image I/O is the Preview application. It has been a great example of the power of the new Image I/O and some of the advantages you get from making strong use of this new API. AppKit has not yet switched over in the current developer release, but it will be switching over to the new Image I/O API as well. WebKit and its clients, such as Safari, Mail, and any of your applications that use WebKit, will be using Image I/O. Core Image is using Image I/O to load data in floating-point formats. Spotlight is using it for generating thumbnails and getting metadata, and some of our scripting technologies, such as sips and Image Events, are also using Image I/O. So we're trying to use this everywhere in the system.

I want to give an outline of the API in Image I/O, but before I do that I want to talk a little bit about how images are organized, so you can get an understanding of why we designed the API the way we did. In previous systems, the standard way of representing an image in Core Graphics was with a CGImageRef. This is a great basic format for representing images. It allows you to specify three things: the geometry of the image, such as its height, width, row bytes, and pixel size; the color space of the image, which can be a profile or another equivalent description of the color space; and the actual pixel data. This is the minimum information you need to describe an image.

However, it turns out that there are a lot of file formats out there, and they are quite elaborate in many cases, so one of the things we wanted to support in Image I/O was a richer model for images. For one thing, we want to be able to support thumbnails and
metadata for images. Also, a lot of file formats, such as TIFF, support multiple images within the same file, so we want to make sure we support that as well. And there's a set of attributes that apply to the image file as a whole rather than to the individual images contained within it: the file format of the image, such as whether it's TIFF or JPEG, and some properties that apply to the file as a whole. For example, TIFF files can be big-endian.

Here's an example of how this works in practice, using a TIFF file. The file type is public.tiff, which is a uniform type identifier that describes this image as being of the type TIFF. We have some properties that apply to the file as a whole: the file size in bytes, for example, and the endianness of the TIFF. And then we have the standard information for each image, such as its height, its width, its color space, its pixel data, its thumbnail if possible, and its metadata, such as copyright and artist information, you name it.

So here's how this model is reflected in our API through data types. We use the existing CGImageRef to represent the geometry, color space, and pixel data. The thumbnail is also represented by a CGImageRef. The metadata and the file properties are represented as key-value pairs in a CFDictionaryRef. So it's all very simple.

So now I can talk a little bit about the API. What we've added is a new data type called CGImageSource, and this is the opaque type used for reading images from either memory or disk. You can create a CGImageSource from a CFURLRef, a CFData, or a CGDataProvider. Once you have a CGImageSource, you can query it for several attributes. You can ask for the properties of the file as a whole using CGImageSourceGetProperties, you can ask for its file type by calling CGImageSourceGetType, and you can get the count of images using CGImageSourceGetCount. Once you know the count of images you can
then, for each image, ask for its image, its thumbnail, and its metadata. So it's pretty simple.

Just to show you how this works, here's a little code sample that shows how, given a URL, to get the first image out of the file. It also returns some simple metadata; in this case it's just returning the DPI of the image in the horizontal and vertical directions. The first thing this code does is call CGImageSourceCreateWithURL, which creates our data type for subsequent access to the file. Then we want to get the set of properties for the first image, so we call CGImageSourceGetPropertiesAtIndex, and that returns a dictionary. We can then query that dictionary to see if it has the DPI height and width properties and return those to the client. Lastly, we need to actually return the image, so we call CGImageSourceCreateImageAtIndex, and that will return the image to the caller.

Here's another example, for getting a thumbnail out of an image. Image I/O is very flexible for creating thumbnails. As it turns out, some file formats support thumbnails and some don't. Also, with some file formats thumbnails can be quite large. Your application may need to have control over how thumbnails are returned, and we've provided that in the Image I/O API via an options dictionary. In this case we're again creating a CGImageSource by specifying a URL, and then we're creating an options dictionary with two key-value pairs in it. The first key is kCGImageSourceCreateThumbnailFromImageIfAbsent. What this does is tell Image I/O that even if the image doesn't contain a thumbnail, it should return the actual image instead, so we'll always get an image for the thumbnail. The second key-value pair we specify is kCGImageSourceThumbnailMaxPixelSize, which allows us to make sure that thumbnails are a reasonable size. That is especially important if you've specified the previous option. So in this case we're saying that we always want
an image to be returned, and we want it to be no bigger than 160 by 160 pixels. Once we've created that dictionary, all we do is call CGImageSourceCreateThumbnailAtIndex, specifying the image source, the zeroth index, and the options dictionary, and the thumbnail is returned. This is, for example, the way that the Spotlight technology creates thumbnails for images in the search results field.

So that's the basics of reading with Image I/O. Here's what we do for writing. We have another data type, CGImageDestination, which can be created with a CFURL, a CFMutableData, or a CGDataConsumer. At the time of creation you also specify the type of file, whether it's a JPEG or TIFF for example, and the capacity, or the number of images that the file will hold. Once you have a CGImageDestination, you can specify the properties for the file as a whole using CGImageDestinationSetProperties, and then you can repeatedly add each image, with various options and metadata at the same time, using CGImageDestinationAddImage. Lastly, you flush the file out to either the URL or the data by calling CGImageDestinationFinalize, which returns true if the image was successfully flushed.

Again, let me give a short example just to show how easy this is to add to your application. We have a function called writeJPEGData which takes a URL, an image to write, and a DPI to specify in the metadata. The first thing we do is create an image destination with a URL, specifying that it's going to be of type JPEG and that it's got one image in it. The next thing we do is specify a dictionary with three keys and values for options and metadata. One option that we're specifying is the quality of the JPEG, which is specified with the key kCGImagePropertyQuality. In this example we're specifying a quality of 0.8, or eighty percent. The other two key-value pairs are for metadata, and they are kCGImagePropertyDPIWidth and kCGImagePropertyDPIHeight. In this case we're just creating CFNumbers based on
the value that was passed in. Once we have this dictionary, we call CGImageDestinationAddImage to add the image and its options and metadata to the CGImageDestination, and lastly we call CGImageDestinationFinalize to write the file to disk. So it's pretty easy.

Those are the basics of Image I/O. I hope I've given the impression that this is a very simple and easy API to add to your application, and again, some of these benefits you'll be getting for free if you're using AppKit and other technologies.

Let me talk for a minute about some of the more advanced techniques that come up when we talk about image reading and writing, such as extracting RGB data, requesting the depth of an image, and loading an image incrementally.

One of the common questions we get is, "Well, I have an image that's been returned from Image I/O, but I don't know what color space it is, I don't know what depth it is, I don't know what pixel format it is, and I have an application that only works in RGB." That's a common scenario, and there is an interesting piece of code that makes it very easy to convert the data, no matter what format it came in, into RGB. Basically the technique is to use a CGBitmapContext to render the original image into an offscreen buffer. One advantage of this is that it takes care of all the color management correctly: if the image happened to be a Lab or CMYK image and had a profile, then it'll be correctly color managed to the RGB color space that you're working in.

Another interesting question is the depth of an image. Some formats only support one pixel depth; for example, JPEGs are always eight bits per sample. Other formats can support arbitrary pixel depths; for example, TIFFs can be 1, 2, 4, 8, or 16 bits per sample. As a rule, the image returned by Image I/O will be the same depth as that indicated by the file, so if you open a 16-bit TIFF file you'll get a 16-bit CGImageRef. However, in the case of high dynamic range file formats it gets a little bit more complicated. The
data in these file formats is typically encoded in special encoding formats, which can then be decoded in a variety of ways. The values can be unpacked either to floating-point values, in 32- or 16-bit formats, or to integers with 16- or 8-bit precision. Also, in the decoding process the values can either be left as extended-range values or be compressed to the logical 0-to-1 clipped range. Both of these are reasonable types of values to be returned, and your application may want one versus the other. By default, Image I/O will return an image ref that's compressed to 16-bit integers. This will give the best results with reasonable memory for the typical application. However, by request, an application can specify that it wants the unprocessed floating-point data returned.

Here's a brief example of how to do this. This is a code snippet that, given a URL, will request that the data be returned in floats, and it returns a Boolean to specify whether the data actually came back as floats. The way we've done this, as you've seen in the previous examples, is to create an image source and specify an options dictionary which has, as one of its key-value pairs, kCGImageSourceMaximumDepth with the value of 32. At this point we can ask Image I/O to get the properties of the first image given those options, and this will return a dictionary. We can then query that dictionary to see if it has floating-point data or not. Lastly, we can get the image and return that to the client.

Another advanced technique I wanted to make sure people knew that we supported is incremental loading of images. I won't go into too much detail on this, but the basic idea is that you create an image source in an incremental fashion using CGImageSourceCreateIncremental, and then you repeatedly add updated data to the image source. Each time you add data you can request a new image, and it will give you a partial image, or a complete one if the image
is fully loaded. Once you're done with the image you can release it, and then once you've added more data you can get a new, updated image. It's important that you release it before you ask for a new image.

So let me give a brief demonstration of Image I/O in action. One of the things I want to show first is the new Preview. I've got a bunch of images open here, and one nice thing in Preview is that you can open all the images just by selecting a folder. I've got a variety of images in here. One of them is a Lab image, and we can verify that it is a Lab image by going to Tools, Get Info. This shows the metadata that's been obtained using Image I/O, and we can tell from the metadata that's returned that the color model is Lab. We have a variety of other images. We can zoom in and zoom out. The thumbnails over here were obtained using Image I/O as well. We have high dynamic range images here; you can zoom in and zoom out on those. Luke will show later how we can manipulate these images in real time.

Here's another interesting example which I like to show people. This is one of the things that we use for testing. Oftentimes people want to know, "Well, how do I know if the profile is being used?" What I have here is a black-and-white CMYK document that has a profile on it that makes values that are gray disappear. If this image were rendered and the profile were ignored, what you'd see is text saying that the test profile is not used. You can't see it here, because the profile is being used, but there's actually a gray word "not" right here. So it provides an interesting test that tells you whether your profile is being respected or not. Here in this gray version you can kind of see a little bit of a hint of what was once there, the word "not." This is a great way of testing images; we really should distribute these at some point.

One other example of using Image I/O: I have a test
application which shows some of the options. I'm going to open one of the images we just saw among the desktop images. Here we can see some information: the height and width and how long it took to draw. One thing we can do is specify that we'd like to see what this would look like if it were progressively loaded. If I open another image, the high dynamic range image, this is a big image, unfortunately, so it takes a couple of seconds to open. If we bring up the metadata on this, by going to Window, Metadata, we can see that it has a height and width and a depth of 16. This is because by default we return 16-bit integers. However, if we ask for it to be returned as 32-bit, and again that'll take a second or so (this code still needs to be AltiVec-optimized someday soon), and bring up the metadata, now we can see that there's a new property in here which says that the data is returned as floats.

So that's the introduction to Image I/O. I'm going to pass the microphone, the demonstration, and all the new stuff over to Luke Wallace, who will be talking about high dynamic range imaging.

Thank you, David. Today I will be talking about Mac OS X support for high dynamic range imaging, which is a new and exciting feature that we are adding in the Tiger release. As many of you know, high dynamic range imaging is generating a lot of interest and is still a subject of very active research, so we could talk about it from many different points of view. What I would like to do today is concentrate on answering three very simple questions: what is it, why use it, and how to process it.

Before we try to answer these questions, let's take a quick look at the current status quo in digital image processing. We can conclude that the majority of digital image processing is dominated by what is called the output-referred approach. What this means is that the requirements of image reproduction are imposing certain
requirements on the way we acquire and create images. Because most of the devices we are dealing with, like displays and printers, can only handle 8-bit data per color channel, we impose the same requirement on digital cameras, which in fact could produce much more, about an order of magnitude more data, if they were not restricted to that requirement. Obviously there are some advantages; this is not done for no reason. The main one is that very minimal image manipulation is required before displaying or printing such an image. But obviously there is a disadvantage: we are losing a lot of color and image information that could be used in further image processing and could result in much higher quality of display or print. Oops, sorry, wrong direction.

Another requirement, which is sort of hidden in the output-referred approach, is that the data is exchanged in one predefined color space, and in the most typical case this is sRGB. When you look at this slide, you see I drew the shape of a typical exchange color space, maybe sRGB. That color space covers only a part of the visual gamut, so everything is fine as long as the camera is acquiring color data within that triangle. But if we are outside, then we are out of luck: we have to do something with this color, and typically we have to push it into the color space. That can be done through different methods, but because cameras are not very sophisticated in terms of processing power, we very often use clipping, and as we know from practice, gamut clipping can produce really bad results, like, for example, hue shifts. Here is one, maybe a little bit strong and exaggerated, example of what could happen, but this is real clipping, in which the white color became, because of clipping, a mixture of completely unrelated colors.

So that is what we can conclude when we look at image processing from the point of view of device capabilities to reproduce the image. What I would like to do now is to
look at image processing from a little bit different perspective, from the perspective of human vision. As we know from the very rich research in this area, color and visual acuity are the two most important characteristics of the scene, and not only this: these two depend on luminance and on the observer's visual adaptation. We know that we can measure the world of luminance, and it will cover the range of values from 10 to the power of minus 6 all the way to 10 to the power of 8 when measured in candela per square meter. But what is important for us is that different ranges of luminance create different illumination, and that illumination also can stretch all the way from very dark environments, to starlight, all the way beyond sunlight. Now, I could spend a lot of time talking about the psychological and physiological mechanisms controlling our vision, but what I would like to do, without going into those details, is to say that humans have three types of vision, which depend on the level of luminance. We have scotopic vision, which works when we are in a dark environment; we have mesopic vision, which works in a light-dark environment; and finally, when we are in a high-illumination environment, we are switching to photopic vision. Why is this distinction important? Because our quality of vision is related to this type of vision. As we know, if we look at something in a very dark environment, we have no color vision and very poor acuity; everything in the darkness seems to be just a shade of gray. On the other hand, our best vision is in the photopic range, where we can see many colors and have good color and visual acuity. This is not everything. What is very important is that a human has a limited simultaneous range, which also depends on the type of illumination. Here I'm showing the widest simultaneous range, which again exists in photopic vision, and that can cover a range of three to four orders of magnitude. But if we try to
estimate this simultaneous range in poorer vision, the values can drop by the order of two. So we may ask ourselves why this is all important. Well, I think there is an answer: if we want to faithfully represent the scene that we want to process through image processing, we should have a mechanism to encode the data the same way as, or at least as close as possible to, the fidelity of human vision. So now let's take a look at where in this picture we can fit the typical 8-bit display. As we know, the typical 8-bit display can cover a range of luminance of about two orders of magnitude. That is a big discrepancy between the human simultaneous range and the dynamic range of a display. So this is the biggest challenge that we are facing: we have to map the relatively wide human simultaneous range into the low dynamic range of our display device. There is one solution which we already know about; this is output-referred digital photography. We are imposing low resolution and a small color space, and the only thing we can do is to choose between different options. This is a simplistic view, in which we may say: well, if I want to expose the details in the highlights, I can use the short exposure, but if I want to see the details in the shadows, I can sacrifice the details in the highlights and use the long exposure to capture what I wanted. The most important point is that this applied exposure is permanent: once we burn this into the image, there is no way back. So I think that at this moment I will try to answer the question, what is high dynamic range? I think that we can define high dynamic range as a special encoding of the image data which allows us to preserve the full fidelity of human vision. From the implementation point of view, high dynamic range imaging is based on color values that, first of all, extend over at least four orders of magnitude, that can encompass the entire visible color gamut, and that allow values outside
of the typical 0-to-1 range of color values. In summary, what it means is that in high dynamic range imaging we are no longer limited to a specific color space; we are trying to encompass, as I said, all visible colors. But on the other hand, we need to remember that we no longer have a convenient ready-to-display or ready-to-print image. High dynamic range data requires some kind of manipulation before it can be displayed. But the big advantage is that we can make this decision at the moment when we need to reproduce the image, with our preference, instead of burning that into the image. This is a kind of simple explanation of how we can do that: we can go back and select the short exposure or the long exposure, but most importantly we can implement something which was not needed before, tone rendering, which will allow us to achieve completely different results. For example, here I can try to combine in one image the details from the highlights with the details from the shadows. So now I'll try to answer the question, why use high dynamic range images? The most important reason is to preserve the scene-referred information that can be useful in further image processing. This way we want to avoid intermediate encoding with a restrictive color gamut, which was happening in the previous approach, called output-referred, and also we can avoid irreversible modifications that happen during image acquisition. How to process high dynamic range images? The simplest answer is that we should not add any rounding or clipping errors, and for that we want to render and capture the data in floating point. We want to store the entire image, and we need to process the color data in an extended color space, which again will not impose any clipping. And at the end we want to apply tone mapping for a specific image reproduction. For example, that specific reproduction could be the example I just showed you, where I want to see all the details in the image, from the highlights
and shadows. Now let's take a look at the file formats that we are supporting in Tiger. I think that the most important citizen here is OpenEXR, which comes from ILM. First of all, it has the smallest quantization error, and most importantly, as we will see later, it comes with a recommended way of tone rendering, which solves a lot of problems in terms of presenting the image content. The other formats basically just define the way of encoding and decoding data while preserving the image fidelity. So now I would like to show you, if I can get my, this is the demo 2 machine, my little application in which I can open high dynamic range images. What I would like to show you is that we have to do something with those values, which are so large, much bigger than what we can represent by the typical range of 0 to 1. One would think that a very simple approach would be to simply map the brightest point in the image to the brightest point of the display. But if I do that with my little demo application, you see that we don't see too much in this image. There is way too much information beyond one, and scaling didn't produce any visible image. Another very simple approach could be: OK, let's say I would like to see whatever you have in this image, clip the values to the typical 0-to-1 range and show me that. Well, as you see, the image quality somewhat improved, but it's still very poor. And now if I use OpenEXR, and this is their default zero exposure value, I'm getting some reasonable result and I can see many more details. And not only this: I can do what I was talking about, I can impose my preference at the moment of reproducing the image. For example, someone may like this kind of image, and someone else may still want to focus on this beautiful stained glass. I want to show you a couple of classic examples, like for example the Memorial Church picture, which comes from the Debevec website. And the same thing happens here: if we just scale the image, the image is basically
unreadable. Clipping will show something, but the quality is really poor. OpenEXR is doing a very good job here. Another example is the picture I was using in the previous slides, of our garage at Apple. This is how it looks when scaled; this is how it looks when clipped, once again the typical hue shift when clipping the data; and OpenEXR is producing quite a reasonable result. This leads us to the conclusion that tone rendering is a very important issue when processing high dynamic range images, and there may be many different methods of doing that. I think this gives me a very good segue to introduce Gabriel Marcu, who will be talking about high dynamic range tone mapping developed at Apple.

Thank you very much for the introduction. As you have seen, the main problem of rendering high dynamic range images is how to map the high dynamic range, how to reduce the high dynamic range into the low dynamic range of a device, and this problem is not trivial. So we have looked into what is available in the published literature, and here I put a list with a few methods that I selected from what is available. Aside from OpenEXR, you can see the histogram adjustment proposed by Ward. Retinex algorithms: this is a class of methods, and a good review of these methods was published by [unclear] at the Electronic Imaging conference in 2002. Another interesting approach is based on color appearance models, and using one of these models, which is iCAM, Fairchild and Johnson try to reduce the high dynamic range image to a displayable one. The last three in this list were proposed at SIGGRAPH 2002, and they are fast bilateral filtering, the dodging-and-burning method, and gradient compression. You can see in this list that you can group the methods into two classes. One class is general algorithms that are applied to 8-bit images to get more information from them and reduce the global contrast to something that is visible on the display, and the second
class of algorithms is those that are specifically designed to handle high dynamic range images. While doing this research on the available methods, we came up with our own algorithm, let's call it the Apple method, which I will demonstrate in a minute. One of the things that you face when you try to design or even evaluate a method is: what is the intent of that method, aside from just compressing the high dynamic range into the dynamic range of the display? Here is a flavor of two kinds of rendering intents: one that is retaining the look and feel of a high dynamic range scene, and you can see this in the image on the left, and another one that is showing as much content from the image as possible. You can see that by comparing these two images around the highlight region. In the image that retains the look and feel of a high dynamic range image, you see the glowing around the window, while in the other you see more detail, more color of the stained glass of that ceiling window. You can see the same thing happening with the window on the wall. It depends on the subjective preference of the user which of these settings they will prefer to use. The good thing is that with the Apple method we are allowing both intents to be shown. So I will do a demo of this and show you how this algorithm is working. We open the viewer, and with this we can choose between different images. Let's start with the Memorial image, which is, let's say, a classical one, which is picked as an example by a lot of algorithms. It is a good thing, because you can compare algorithms on the same image. This is the glowing rendering; the intent of that is showing the glowing around the windows and around the highlights in the image. I also show here the high dynamic range size; a good metric for that is the logarithm of the maximum over the minimum values in the file, and this gives us 13 stops. You can see at the bottom the actual range of floating-point values in the file. I said that
we can support both kinds of intents, and right now I switch to the intent where you can see the color of the glass in the stained glass of the window. We can open different kinds of files, not only from one source, and you can evaluate the power of this method. This is an image that is from OpenEXR, used by OpenEXR as a test image, and it's quite interesting: it has 18 stops, so it's quite a large dynamic range. You can see details in the shadows, you can see details in the highlights, and you can also look at the text that is on the book and see the details under the table. As you have noted, I haven't changed any of the default parameters of this method. Our intent was to actually design a method where the user will have as little intervention as possible and will be able to take the default parameters and get decent results with these settings. So I chose another image from another source, the nave one, which is the image with the highest dynamic range that I was able to find publicly available, and you can see that this image has 22 stops, one stop meaning a doubling of the range of the values in the file. The difficulty in rendering this image is around this window, where you are required to show the colors of the window and also not to show visible artifacts around it, like ringing or darkening of the surrounding area. The last image that I will show is an image that we created ourselves with a different algorithm, and you can see details from outside and inside, and details in the shadows and in the highlights. Eventually, with a high-resolution image, you are able to see even the fluorescent tubes at the top. So this is a demonstration of the HDR viewer. This brings me to the next topic that I would like to touch on, which is the creation of high dynamic range images. In summary, about the viewer, you can see that with default parameters we are able to open images from different sources, and this method
is quite robust, in the sense of showing very good results from different high dynamic range images. How do we create high dynamic range images is the next topic, and this is quite interesting. We can start with the file format of these images: we have to encode in RGB floats the radiance of the scene. So how do we capture this? We turn to a method that was published by Debevec and Malik, "Recovering High Dynamic Range Radiance Maps from Photographs". Essentially this method requires taking multiple shots of the same scene at different exposures and then combining these exposures into a high dynamic range file. We start with the block diagram of the digital camera. If you look closely, you can see that the scene radiance is transformed to the digital output in the digital file, which may be JPEG or another format, by a set of transformations. First the image passes through the lenses, then the shutter, then the image is captured by the CCD and converted by an A/D converter, and then some mapping happens in the camera, for example gamma correction or a raw-image-to-JPEG transformation, and you finally get the digital values in the file. Because the scene radiance is in direct correspondence with the sensor irradiance, the first block can be skipped, and we can group the last four blocks into a single entity, which can be described by a transfer function. So we get the digital output in the file as a function of the sensor irradiance E and the exposure time Δt. With a little math on this, applying the inverse transformation and the log transformation, we can recover the sensor irradiance from the digital output and the exposure time. This is easily possible; the only thing that we need is the middle term in there, which is marked in blue and which is referred to as the camera exposure function. Once we are able to derive this function, we use this equation to immediately find the scene radiance. So we will concentrate now on deriving this function, which we denote here in the
next slides with G. This function describes the dependency of the exposure on the output gray level. The idea is to take multiple exposures and then to pick a gray level in one of the exposures, which is that white ring there in the first image, and then look at how the gray level changes from one image to another. So we keep the same position, the same x and y in the image, and we look at and plot the variation of the gray level from one image to another. This gives us a curve that shows up in black in the diagram. We can do this for other gray levels in the image and recover several curves, and finally we end up with a set of curves that describe this variation of the gray level. What we need in the end is to put these curves together and derive a single curve that is the exposure function of the camera. So now, once we know this exposure function of the camera, and we know the output gray level in the digital file and the exposure time, we are able to recover the scene radiance of the image that we captured. With this I would like to demo the application that is able to create this high dynamic range image. So let's close this and open the Creator. The Creator is an application that first allows us to pick several images. I chose a number of already exposed shots, and we get a thumbnail view on the left side. Here we can select any of these images and see what their content is. You notice immediately that no matter how we take the images, you can see either details in the shadows or details in the highlights, and the high dynamic range file will be able to capture all the information about the radiance of the scene and will encapsulate it in a high dynamic range format. The first thing is to calculate the transfer function of the camera, and we did this in a single step: you have seen several curves put together, and you recover the transfer function of the camera.
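
The recovery step described above, going from gray levels and exposure times back to irradiance once the exposure function G is known, can be sketched in a few lines. This is only an illustration of the published Debevec and Malik relation ln E = G(Z) - ln Δt, averaged over exposures with a weight that favors mid-range gray levels; it is not the Apple algorithm shown in the demo. The pure-gamma response `g_gamma`, the hat-shaped `weight`, and the synthetic `simulate_camera` are all assumptions made up for this example.

```python
import math

def g_gamma(z, gamma=2.2):
    """Hypothetical exposure function G(Z): log of the exposure that yields gray level Z.
    Assumes a pure gamma-encoding camera; a real G is estimated from the photos."""
    return gamma * math.log(z / 255.0)

def weight(z):
    """Hat weighting: trust mid-range gray levels, ignore values at 0 and 255."""
    return min(z, 255 - z)

def recover_log_irradiance(pixel_values, exposure_times, g=g_gamma):
    """Weighted average of G(Z_j) - ln(dt_j) over all exposures of one pixel."""
    num = 0.0
    den = 0.0
    for z, dt in zip(pixel_values, exposure_times):
        w = weight(z)
        if w > 0:  # skip fully under- or over-exposed samples
            num += w * (g(z) - math.log(dt))
            den += w
    if den == 0:
        raise ValueError("pixel is under- or over-exposed in every shot")
    return num / den

def simulate_camera(irradiance, dt, gamma=2.2):
    """Synthetic 8-bit camera, used only to exercise the sketch."""
    exposure = min(irradiance * dt, 1.0)  # the sensor saturates at 1.0
    return round(255 * exposure ** (1 / gamma))

# One pixel photographed at five shutter speeds; the two longest exposures saturate.
times = [1 / 1000, 1 / 250, 1 / 60, 1 / 15, 1 / 4]
true_irradiance = 20.0
shots = [simulate_camera(true_irradiance, dt) for dt in times]
estimate = math.exp(recover_log_irradiance(shots, times))
print(shots, round(estimate, 2))
```

The saturated shots get zero weight, so the estimate comes only from the exposures where this pixel still carries information, which is why a bracketed set of shots can cover more range than any single one.
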
Then the next step is to use this transfer function and to compute the high dynamic range image. Now, you can see that even in the paper it is said that you need to do this processing of the transfer function of the camera over many images, until you get an average behavior of the camera, and use that transfer function to create a high dynamic range image. We actually added more robustness to the algorithm that computes the transfer function of the camera, such that we are able to recover the transfer function from only the same set of images that we use to create the high dynamic range image. So we recover the function from this set of images, we apply this function to these images, and we create a high dynamic range image. This brings us to an algorithm that is able to do this kind of thing by just specifying the set of images. So we choose a set of images, then the algorithm computes the transfer function of the camera and computes the high dynamic range image in a single shot. This makes the application independent of any camera settings or any setup that you may have been required to do for the method that is published in the literature. So let's say you are switching to another set of images, for example from a different camera. You don't have to specify the camera, and you immediately get the high dynamic range file directly from specifying only the set of images. The interesting thing about this is that we want the user, the Apple user, to have less intervention in how this algorithm works, and finally end up with an application that will be able to provide high dynamic range images directly, without taking care of anything. This is an advantage for the user. Finally, I would like to mention that, as you have seen here, we selected the images from a folder, but we have worked with Image Capture, so I invite you on Friday afternoon from 5 p.m. to see an integration of this algorithm into Image Capture modules, and you will see an
interactive demonstration of how these images are captured live with the camera and then a high dynamic range file is created. With this I thank you very much, and I will turn back to Luke. [Applause]

Thank you, Gabriel. So at the end of the presentation on high dynamic range images, I'd like to touch on a subject which is very close to the hearts of us ColorSync engineers. We are really interested in color managing high dynamic range images, and you must know that this is an area which is under very intensive investigation both in academia and in industry. At Apple we are also developing our own method of color managing high dynamic range images, and we are trying to take a new approach, which is based on human adaptation to the image viewing environment. We think that the image contains enough white point and adapting luminance information that could be used by a color appearance model to predict human perception of color in different viewing environments, which basically means we can color manage our high dynamic range images. Just to clarify what kind of color appearance modeling we are dealing with, let me say that we are looking at the kind of modeling which is based on two major concepts. The first is chromatic adaptation, which allows us to predict the influence of the adopted white point on color perception, and the second concept is the degree of adaptation, which allows us to predict simultaneous color contrast related to the luminance of the adopted white point. So in summary, what we are trying to do is to transform the colorimetry of the source, using high dynamic range data, to our destination, and then, after color managing and bringing it into the new environment, applying tone mapping, which will compress color to the range of the destination device. This is what concludes our talk on high dynamic range images, and I'll turn the microphone back to David. Thank you.

So thank you, Luke and Gabriel, for the discussion. I just wanted to bring
it back just to do a quick summary slide, and then we'll have a few minutes of Q&A at least. I just wanted to summarize once again what's new in Tiger. We've got a lot of great stuff here. First of all, in ColorSync we have floating-point support. In Image I/O we have a brand-new modern API for reading and writing that has optimized performance and support for metadata. And then we're doing a lot with high dynamic range: supporting the OpenEXR file format, access to compressed or unprocessed data, and also this is an area of all sorts of ongoing and future research, so we'll have much to show you. So again, we have a few other places you might want to go. There's a graphics and media lab session on Thursday, where you can talk to us if you have more questions than we can get to today, and then there's also going to be a great demonstration in the last session on Friday, talking about Image Capture and high dynamic range.
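
For readers who want to experiment with the ideas from the demos in this session, the following is a minimal sketch of the dynamic range metric in stops (the base-2 logarithm of maximum over minimum) and of the three naive display strategies that were compared: scaling the brightest value to display white, clipping to the 0-to-1 range, and applying an exposure adjustment before display. The exposure function here is a plain gain followed by a display gamma, chosen as an assumption for illustration; it is a stand-in and not OpenEXR's recommended rendering or the Apple method. All function names and the sample pixel values are hypothetical.

```python
import math

def dynamic_range_stops(values):
    """Stops = log2(max / min) over the positive values; one stop is a doubling."""
    positive = [v for v in values if v > 0]
    return math.log2(max(positive) / min(positive))

def scale_to_display(values):
    """Map the brightest value to display white; everything else shrinks with it."""
    peak = max(values)
    return [v / peak for v in values]

def clip_to_display(values):
    """Keep values as they are and discard whatever falls outside 0 to 1."""
    return [min(max(v, 0.0), 1.0) for v in values]

def expose_to_display(values, exposure_stops=0.0, gamma=2.2):
    """Apply a gain of 2**exposure_stops, clip, then encode with a display gamma."""
    gain = 2.0 ** exposure_stops
    return [min(max(v * gain, 0.0), 1.0) ** (1.0 / gamma) for v in values]

# Hypothetical scene-referred luminances spanning exactly 13 stops.
pixels = [0.01, 0.2, 0.9, 4.0, 81.92]
stops = dynamic_range_stops(pixels)

scaled = scale_to_display(pixels)         # mid-tones collapse toward black
clipped = clip_to_display(pixels)         # both highlights become identical white
darker = expose_to_display(pixels, -3.0)  # lowering exposure recovers the 4.0 highlight

print(round(stops, 2), scaled, clipped, darker)
```

Scaling leaves the mid-tone at 0.2 below one percent of display white, and clipping makes the two brightest values indistinguishable, which matches what the demo showed. Changing `exposure_stops` at display time is the kind of decision that output-referred images bake in permanently.
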
