---
title: WWDC2004 Session 217
framework: wwdc
role: article
path: wwdc/wwdc2004-217
---

# WWDC2004 Session 217

## Transcript

Good afternoon. Welcome to the penultimate session for WWDC and the ultimate QuickTime session. Today we're going to be talking about next-generation video formats in QuickTime. My name is Tim Cherna. I'm the manager of the QuickTime video foundation team, and the QuickTime team is going to be talking to you about H.264/AVC, the new video codec that we're shipping in the next version of QuickTime and in Tiger, the technologies that are required in QuickTime to support H.264, and the changes that were required to support IPB video frame coding. We're going to talk a lot about what that actually is, so you're going to see a bunch of abbreviations in my slides, and it'll be a little bit strange.

Let me talk about the QuickTime video technologies before Tiger. Basically, we have this software stack, and at the top of the software stack is the Movie Toolbox. The Movie Toolbox is used for creating, editing, and navigating movies. It's also used for playing back movies and stepping through movies, so it's typically what the highest-level applications use. The Movie Toolbox uses the video media handler to sequence video frames from files or from a network device to the Image Compression Manager. The Image Compression Manager is the service within QuickTime that deals with compression and decompression, and so we've created these compressor components and decompressor components underneath the ICM, which are created by Apple and third parties, and that serves as the codec model. There's a base codec, the Base Decompressor, which helps to implement decoders, and we recommend you actually use it. You'll see today that it's actually essential for the new compressed video format.

Before Tiger, there were some limitations in the Movie Toolbox, the video media handler, and the ICM. This wasn't really a big issue, because most of the codecs were either I-frame or keyframe-only codecs such as DV or Motion JPEG, or a
difference-frame or IP-coded format such as MPEG-4 Simple Profile, Cinepak, the Sorenson codecs, etc. What we didn't support is the more complex frame ordering, IPB frame ordering, that's used within H.264, MPEG-2, and MPEG-4. Now, we do support MPEG-1 and MPEG-2, but that's via the MPEG media handler, not the video media handler. It's a different code path, and we're not changing that code path in Tiger.

So today we're going to cover some fundamentals about the new H.264 video codec; details about what IPB is and how it differs from I and IP (by the end you'll really know that); the user-level impacts of the changes that we're making in QuickTime 6.6; changes to the Movie Toolbox for navigation and editing; and of course the changes to the ICM to support these new kinds of video codecs. With that, I'd like to bring up Thomas Pun to talk about H.264.

Hi, I'm Thomas, and I work in the video codec team in QuickTime. Oh, actually, back to the slides. So today I'm going to briefly talk about H.264. I'm sure you've all heard about it the whole week, so I'm just going to recap. So, H.264: what is it? It's a joint effort from the two biggest organizations regarding standards. One is the ISO and the other one is the ITU. They brought us video standards such as MPEG-1 and MPEG-2, and also H.261 and H.263. Now, because of the joint effort, at different stages they seemed to give this codec a different name, so you may also have heard MPEG-4 Part 10, JVT, H.264, and also AVC. It was standardized last year, so it's a very recent addition to the video standards. It has all the new technologies, and it works very well at various bit rates, all the way down to 3G and all the way up to HD. Because of that, it has recently been chosen as one of the video codecs for HD DVD and also for the 3GPP standards. And because QuickTime always stands behind standards, it will for sure become a new video codec in QuickTime. With that, I'm just going to show some demos. Can we go to the demo machine, please? Now, as I said, it has been recently chosen
for HD DVD, so let's see how it looks at HD resolution. I have a clip here, Alexander. It's encoded at HD resolution, 1280 by about 550, and this is actually only at six megabits. We couldn't really do that with MPEG-2 at this kind of quality before. So let's see. I'm going to play the whole clip.

[Music] By the age of 25, he had conquered the known world and changed the course of mankind forever. [Music] [inaudible] [Music]

So that's how it looks at six megabits. If you actually play with encoding video, you'll notice that the beginning, the sandstorm, is actually really hard to code, and the codec does it really well. The other demo I'm going to show is to give you an idea of how it compares with an existing standard. The one that I chose is MPEG-4. So I'm going to open these two files first. Can you put it up, please? OK. Now, when we try to compare with a different standard, we usually use three guidelines. One is quality: how does the quality look? One is bit rate: one could be a lot bigger than the other. And the other one is how much information you're actually packing into the stream, so that goes with frame rate and with resolution or frame size. So here I have two clips, both of them encoded at a megabit. The H.264 one is about four times as big as the MPEG-4 one that we shipped earlier. So I'm going to play this. I'm just going to play a short part, about 30 seconds, of the movie. The quality is about the same, except [inaudible].

[Music] Detective. Wow, richest man in the world. Can I offer you a coffee? Sure, why not. I don't think anyone saw us coming. So, whatever I can do to help. Sugar? I'm sorry? For your coffee. Sugar? Oh, you thought I was calling you "sugar". You're not that rich.

OK, I'm going to stop it now. You can play the whole clip later if you want. Let's just continue. Can we get back to the slides? OK, so what makes this codec a lot better than, say, MPEG-4 and MPEG-2 that we already have? So here's a
big table that I'm sure some of you may have seen in another session. What I really want to point out here is three things. The first one is that there's really not one single technology that gives you all the gains. It's a combination of advances in different technologies. Most of the technologies that we use in H.264 are based on technologies that we already had 10 years ago, but we have a lot more improvements, and we know how to use the technologies a lot better. The second thing I want to bring up is, as we mentioned earlier, that with MPEG-2, for example, it took a couple of years to get the best out of MPEG-2, and H.264 is a very new standard. It was just standardized last year. So you should expect the quality of H.264 streams to get better and better once we know how to use the tools even more efficiently. And the last thing: there are a lot of technologies here. I'm not going to go over all of them, that's boring, but for most of the technologies, what it really means is that they're self-contained within the codec, so your codec is expected to implement, for example, a different transform or a different way of packing the bits in the stream. But there's one particular technology, which is the IPB frames, the third column down, where with H.264 you have a lot more options. It's very flexible; you can do almost anything you want. In its simplest form it's about the same as the previous MPEG IBP, which we're going to explain more, but if you really want to take advantage of it, then the higher layers will require some changes, and thus QuickTime will have to make some structural changes as well. And that's going to bring up Anne, who's going to talk about what changes in QuickTime were needed to support H.264.

All right, so how are we going to deliver H.264 in QuickTime? We've added four new H.264-specific components: a compressor, a decompressor, a packetizer, and a reassembler. And we've made a whole lot of changes inside the infrastructure to support these
components. With Tiger, applications will be able to play back H.264 content. They can also play back H.264 streams, and if they use the high-level Movie Toolbox APIs, they'll be able to do this without any changes in their apps. In addition to QuickTime movie files, we'll be able to store these H.264 streams in MP4 files and 3GP files. And the streaming realm will fully support H.264, in that you can play back H.264 streams; you can take H.264 content, hint it, put it on the QuickTime Streaming Server, and stream it to clients; and you can broadcast using QuickTime Broadcaster. These H.264 streams are in the standard format as defined by the IETF. On the authoring front, applications will be able to edit H.264 movies, and if they call the high-level Movie Toolbox APIs, they'll be able to cut, copy, and paste with no changes. They'll be able to produce H.264 content and store it in QuickTime files, and also MP4 files and 3GPP files.

All right, so if you want to compress H.264 content, you can do it using the movie exporter components. If you call these components, we've modified them so that they can generate H.264 B-frame content. Now, if you call Standard Compression and the ICM APIs yourself instead of the movie exporter APIs, then H.264 will show up as a new item in the codec list. However, B-frames won't be enabled by default, and in order to get B-frame content you'll have to opt in to B-frames by calling new APIs that we'll describe in this session. If you call the sequence grabber APIs and you want H.264 B-frame content, you'll have to call Standard Compression and compress the frames yourself.

So what's in the seed? With the seed, you'll be able to play back H.264 streams and edit them. We've been working really hard on the H.264 exporter, but we haven't fully integrated it into all our exporters yet. But we really wanted to get something in your hands, so we've included a preview H.264 exporter. It appears as a new menu item in the exporter list, and it does support multipass encode, and
one thing to note is that it produces an interim format right now, and the format is guaranteed to change before GM, so don't produce any content that you want to stick around for a long time and be able to play back with the seed. Anyway, the APIs we talk about today are in the seed, so please try them out. And a couple of things: with H.264 there's a lot going on, so it requires a G4 or G5, and also the seed doesn't contain a compressor, packetizer, or reassembler yet.

So say you want to take advantage of H.264 in your application. What do you have to do, and what do you have to change? Well, what you have to change depends on what level of APIs you're calling. If you're calling the high-level QuickTime APIs, then chances are you don't have to do anything in order to gain support for H.264 in your app. However, if you call some of the lower-level APIs, such as the media level, you might or might not have to change your application, depending on what specific APIs you're calling. And if you access the samples yourself at the sample level, you'll have to change your app.

All right, so as I said, if you call the high-level APIs, you'll be able to gain access to H.264 with no changes to your app. Some examples of the high-level APIs are the various views that QuickTime provides, such as the new QTMovieView, part of QTKit; the new HIMovieView; and the older Carbon movie control. If you use those views, you're all set. You don't have to do anything in order to use H.264. If you call the movie- and track-level APIs, you'll still be able to play back, step through, edit, and navigate through the movies without any changes in your app. So let's have a look at that.

OK, here I have an H.264 movie that I've compressed, and if I drop it on the currently shipping version of Adobe GoLive, it opens as expected and it plays the video. [inaudible] OK, and just for fun, I can bring up this timeline editor in GoLive, and if I click around in there I
can click around in the movie and step around in the movie, and it works. I can also take this movie into QuickTime Player, select a portion of the movie, go over here, and copy that small portion of the movie. And this is the currently shipping version of Word: I can create a new document, and if I go and paste, then that small section of the movie is pasted into the document, and Word will play it back. [inaudible] So use the high-level APIs if at all possible, because when you do that, with each new release of QuickTime you'll gain a lot of new functionality, and usually you won't have to make any changes to your app, and it'll just magically work. Can we go back to slides?

OK, so if you can't just call the high-level APIs and you have to call some of the lower-level APIs, well, in order to use this B-frame content you might have to change your application. If you're calling the media-level APIs and you call APIs that don't reference time, durations, or sample flags, then those APIs haven't changed. You don't have to do anything. However, if you do make calls at that level that reference time, durations, or sample flags, they won't work with the new B-frame content, and you'll have to change your application. You obviously can still use those APIs; they will still work with content that doesn't contain B-frames. But once you start trying to use them with B-frames, those APIs will return errors, and we've added some new errors so you know that that's the cause of the problem. Instead of using those older APIs, we've added some new APIs for you to use. If you use sample references, then we've added a whole new set of QTSampleTable APIs, which I'll describe later, and for everything else we've added similar-looking APIs, which I'll also describe later.

OK, and one last thing before Sam comes up: I want to stress that these new APIs work for content that contains B-frames, but they also work for all the other content too, so please switch to them whenever
you can. And here's Sam to talk about B-frames.

Thank you. Hi, I'm Sam. Let's talk for a moment about video compression technology. Lossy video codecs provide you with a trade-off between quality and bit rate. If you want more quality, you need to use more bits; if you can't use so many bits, you might have to accept a lower quality. We're constantly trying to improve this quality curve and move it towards a higher quality at a lower bit rate, and we do this by adding more tricks. As Thomas said, many of these tricks are self-contained within the codecs, but some of them require awareness outside the codec, in other parts of the system, in other modules, and that's what we're going to talk about.

So suppose you had some video that you wanted to compress. Here's a clip of some guy parking his car. It's prosaic, but this is educational. We could encode each of these frames independently. If we did this, it's called spatial compression, because we're only compressing in the spatial domain. If every frame is self-contained, we call them key frames, we call them sync samples, we call them I-frames; I stands for intra. Random access is fast, which is good, but the data rate isn't so good, because if we're compressing everything independently, we're not taking advantage of the similarities between frames. I've got frames four and five of the previous six on the screen here, and you can see that the tree and the building are practically the same, and the car has moved a little, but it's mostly the same. So we can improve compression performance substantially by using one frame as the basis for describing another frame. The jargon for this, the correct terminology, is temporal prediction. The way it works is you start off by saying: these are the areas of the new frame that are similar to areas of the old frame. In the example that I've got here, we're describing frame five in terms of frame four. So first, in the yellow parts of the screen, we're saying these pixels are more or less the
same as the pixels in the same location in frame four. And then in the green part, we're saying these pixels are like the frame if you just move it over so many pixels to the right. But these are only first approximations. There's still a fix-up that has to be added, because the wheel is turning, and the reflection doesn't move with the car; it sort of tends to stay in place. So you can see that there's an additional image that must be added as well. The first part is called motion compensation, and the fix-up is called the residue. You'll notice that there's a strip of the car that is in frame five that wasn't there in frame four. This part might need to be encoded from scratch.

So this is what we get if we encode the last five frames out of those six as a motion compensation piece and then a residue. We call these difference frames or P-frames; P stands for predicted. We get better compression, because motion compensation can be described extremely compactly relative to describing something from scratch, and as a result the bit rate that we get is a whole lot better. There's something else that's worth paying attention to here, which is that each encoded frame can only be interpreted with reference to the previous one, which means each frame, in a way, depends on the previous one. If you want to decode and display the last frame in this sequence and you haven't decoded the previous frames, well, you'd better go and do that right away. So random access into a sequence like this could be somewhat expensive.

So when we have I-frames and P-frames, or key frames and difference frames, this is what it's like. We call it IP, for I-frames and P-frames. It gives you much better compression than I-frames only, but random access can be somewhat slower. For example, if the key frame rate is 20 frames, you might have to decode 20 frames before you can display the one that you want to see. Another thing to
pay attention to is that gradually appearing images are constructed incrementally. Like the car in this clip: the image of the car that you see in frame six was constructed out of strips in five different frames. This might not be the most efficient way of doing things, so let's introduce an alternative. What if we encode the first frame in that sequence as an I-frame, self-contained, and then go all the way to the end and encode frame six as a P-frame based on that I-frame? Well, if we've done that first, then we can encode all the frames in between using motion compensation, part from the previous frame, that's the yellow piece, and part from the later frame, which is the blue piece. And you can see that these frames are almost entirely motion compensation, with very little residue to encode.

Here's what it looks like if we encode our six frames with all of the four frames in the middle encoded as B-frames, which stands for bidirectional prediction, based on the frames at the ends. Again, these four frames in the middle are almost entirely motion compensation, and another thing to notice about them is that random access can be a bit faster. For any of the frames in the middle, starting from scratch, if you needed to display one, you'd only need to decode three frames: the one at the beginning, the one at the end, and the one in the middle. So these are B-frames. They refer to information in a future frame as well as information from a previous frame. The good news about B-frames is that they let us enhance the compression, improve the quality, and lower the bit rate even further. But there's more. There are two benefits. You get better compression, especially when objects appear gradually, for the reason that we've described. And random access is faster: as I illustrated, accessing any of those frames, the worst case for random access is having to decode three frames. Another example to think about is if you're playing in fast forward: you could skip the frames you didn't need to display. If they were B-frames,
well, you wouldn't have to decode them at all. The jargon for this is temporal scalability.

But there's something tricky about B-frames. The decoder that's displaying these can only use motion compensation from frames it's already decoded. If one of those frames is going to be displayed later, then that means the order in which frames are decoded and the order in which frames are displayed are different. So the frames have to be reordered somewhere, and this reordering is why your application might need to understand B-frames.

Some of you have been working with IPB codecs for some time, and this is no news to you, but I want to speak to you guys for a moment, because there's an important point that I want to drive home. With some other IPB codecs, you can implement playback using a small finite state machine, which is driven with different transitions for I-frames, P-frames, and B-frames. This works for MPEG-2 because only one frame can be held at a time; there's only one future frame that would ever need to have been decoded but not displayed. This is not true for H.264. The standard for H.264 allows up to 16 future frames to be held. In fact, H.264 allows the encoder an enormous new amount of flexibility in how it chooses to find material for motion compensation. P- and B-frames can depend on up to 16 frames. Not all I-frames reset the decoder completely; we have a new tag for those that do. The name in H.264 is IDR frames, which stands for instantaneous decoder reset, if you care. Some B-frames can be used to provide material for motion compensation, so not all B-frames can be skipped, and some I-frames and P-frames can be skipped because they don't count for motion compensation. So on the left, you can see that the pattern for MPEG-2 is fairly regular, and in fact you can entirely derive the dependency graph of the frames just knowing the frame letters. That's how the finite state machine works: everything can be worked out from the frame letters. But with H.264 the
encoder's free to do things in a much wilder way, and just knowing those frame letters doesn't let you derive the graph. In fact, as you can see, I don't know that you'd really want to try and store that graph unless you were the decoder itself.

So, the new rules. If you want to work with H.264, it's no longer sufficient to use the frame type letters to derive frame dependency information and the dependency graph. Instead, you should pay attention to four things. First: is a frame a synchronization sample? Not all I-frames are sync samples, and this is because decoding an I-frame may not prime the decoder with all of the motion compensation material that it'll need in the P- and B-frames that follow it. So instead, you want to pay attention to whether a frame is a sync sample, which is equivalent in the new world to an IDR frame. Number two: is a frame droppable? Some B-frames are not droppable, and some I- and P-frames are, and that's the information that, if you're outside of the codec, you really want to know. You want to know whether you need to decode that frame in order to get random access. Number three: what order should the frames be decoded in? And sometimes it's also sensible to include information about what time the frame should be decoded at. And four: what time should each frame be displayed at? This is how we know how the frames are reordered.

So to summarize: one, dependencies between frames are getting weirder, but it's all in the cause of improving the quality-versus-bit-rate trade-off. Number two, IPB means someone needs to know about frame reordering, and if you work with the compressed media, it could be you. And three, some of the convenient rules, things like the one-frame delay and the ability to build this little finite state machine, although they're OK for MPEG-2, they don't hold for H.264. Back to Anne. [Applause]

So what changes did we have to make in the Movie Toolbox in order to
support B-frames? Well, first we had to change the file format. For those of you who care and parse the files yourself, we've added four new tables in QuickTime files when there are B-frames. One other thing to note is that samples are stored in the files in decode order. Now, they've actually always been stored in the files in decode order, but decode order and display order were always the same before, so you couldn't tell the difference.

We've added a bunch of new APIs to distinguish between decode time and display time, because, as Sam explained, with B-frame content they're not necessarily the same anymore. Some of those APIs take something called a display offset, and a display offset is simply the difference between the decode time and the display time. Just note that sometimes the display offset is a negative number. So, for example, where you used to call SampleNumToMediaTime, if you're processing B-frame content that call is going to return an error. Instead, you should call either SampleNumToMediaDisplayTime or SampleNumToMediaDecodeTime, and which one you call depends on which time it is that you want.

We've added a whole bunch of new sample flags and increased the size from 16 bits to 32 bits. Most of them are optional, but the main one that you need to know about is mediaSampleDroppable, which usually, but not always, indicates a B-frame. If you need to know whether a movie or track contains B-frame content, don't hard-code in, you know, "if the track is encoded with H.264", because not all H.264 movies necessarily contain B-frames, and we might add new codecs in the future that use B-frames. Instead, call MediaContainsDisplayOffsets.

OK, so if you're using sample references, we've added a whole new set of APIs, the QTSampleTable APIs, for you to work with. These QT sample tables represent media sample references in a movie. They're reference counted, so use retain and release, similar to other Apple APIs, and you can use
these APIs for all media types, audio, video, text, etc., not just video and B-frames. In order to get sample references out of a movie, call CopyMediaMutableSampleTable to get the sample table. From that you can get the number of samples in there, and then you can index through the samples and get information about each sample, such as data offset, sample flags, a whole bunch of things. These samples are in decode order, same as stored in the files. In order to add sample references to a movie, call QTSampleTableCreateMutable to create an empty sample table, and then add your sample references to the sample table, similar to the old AddMediaSampleReference call. When you're done, call AddSampleTableToMedia to actually add the sample references to the movie. We've also included a whole bunch of more advanced sample table APIs, so if these aren't quite what you need, you could have a look in the headers and the documentation to see if what we've provided helps. And here's Sam to talk about changes in the ICM.

I'm still Sam. So Anne explained that if you call high-level APIs, you might not need to worry about B-frames, because they might not make a difference for you. But if you write the kind of application that deals with compressed frame data yourself, or if you write a codec, then there's no hiding this information from you, and you wouldn't even want us to anyway. So let's talk about how the APIs at this layer have been changed to support B-frame codecs. There are three things that are missing from the current APIs in order to support B-frames: frame reordering, new frame information like the droppable flag, and multiple buffers in flight at once.

The Image Compression Manager provides APIs both for compression and decompression. It provides high-level client APIs and also defines the interface to decompressor and compressor components underneath. Let's go through each of these in turn. In Tiger, we're introducing a
new multi-buffer API for compression. We're also extending the existing GWorld-based decompression API to support B-frames, and we're introducing a new multi-buffer decompression session API. These new multi-buffer APIs are based on Core Video pixel buffers. Underneath, we're introducing a new multi-buffer API for compressor components, and we have extended the decompressor component API to support B-frames. So if you write code that works at any of these levels, then you'll want to look at this stuff.

Let's start here with the new compression API. The existing GWorld-based API is one frame in, one frame out. What this means is that the compressor has to give you back the compressed data for frame one before it'll get the image data for frame two. This makes it really difficult to reorder frames. Also, the current compression API is almost completely unaware of time. The new compression session API is based on Core Video pixel buffers instead of GWorlds. If you're using a new-style compressor component, a B-frame-aware compressor, then multiple buffers may be in flight at once. This allows the compressor to reorder the frames and also encode B-frames. It also allows the compressor to have a look-ahead window for better rate control. Time stamps can flow all the way through the compression chain, and the new API supports multipass encoding.

So whereas in the previous API, the GWorld-based compression API, you'd draw each frame into the same buffer and then each time you'd pass that buffer off to the ICM, with the new API you take a fresh Core Video pixel buffer each time, put your source frame in it, and then pass that over to the compression session. It will retain those until it's done with them, and then it will release them, so you can release the buffer as soon as you've passed it to the compression session. This uses standard retain and release semantics. You could just allocate these each time if you wanted, but mapping and unmapping these
large pieces of virtual memory that you use for pixel buffers can involve some memory management overhead, and that can be somewhat expensive. So we have a pixel buffer pool as part of Core Video that does efficient recycling.

So this is how reordering happens. You push source Core Video pixel buffers in display order, and the session will call a callback function that you provide with the frames that have been encoded, in decode order. The session will also call you when it's releasing those pixel buffers, so you can perform your own frame buffer recycling if you want. Now, in some cases you might not want the compressor to hang on to too many frames at a time. Perhaps yours is a network application, like a video conferencing application, and there's a maximum latency before you have to send those frames out over the network. In those cases, you can set a maximum number of frames that the compressor is allowed to hang on to at once, and you can also make an explicit request that forces the compressor to finish encoding the frames that it's currently hanging on to.

The new compression session API has a bunch of other features that make it a big jump forward. It's easier than before to add encoded frames to a movie. You can use a fixed or flexible GOP pattern, if you know what a GOP pattern is; it's not politics, don't worry. You can set a CPU time budget. You can set data rate limits. And as I said before, it supports multipass encoding; in fact, the movie exporter that's in the Tiger seed supports multipass encoding as well. In the final version of Tiger, the compression session API will be compatible with existing compressors, but no B-frames will be generated. However, in the Tiger seed that you've got, it's not yet compatible with existing compressors, and also we don't have an H.264 compressor, so it would be a good idea to try and get on our seed program if you want to try and exercise this API.

So what's next? Let's talk about what's underneath the compression session API,
which is a new compressor component interface. New-style compressor components still use the four-character code 'imco', but they support three new component calls for B-frames, and if you want to opt in for multipass support there are three more APIs to implement as well.

Let's also talk for a moment about decompression. There are two flavors of decompression API that we have in the GWorld-based mode. We have synchronous APIs, and these are all one frame in, one frame out. We also have a second mode for decompression, which is called scheduled decompression. In scheduled decompression you can queue multiple frames, each of which has a frame time, and when that time arrives the frame is triggered and we decode it and display it.

With B-frames, as we've gone through, the decode order and the display order can be different. In fact, you may need to decode several frames before you come to the first frame to display, so immediate one-frame-in, one-frame-out APIs aren't a good match. Let's look at that example of the little clip of the car parking. In the decode case, the first frame happens to be the first frame both to display and to decode, so we decode it and then we display it. But the second frame in decode order doesn't need to be displayed until time 60, yet it does need to be decoded before the next frame in decode order, which is the frame at time 20, and before the frames after that, at time 30, then time 40, and then time 50. After that, it's okay for that frame to be displayed at time 60.

So the new model for doing decompression is that you always queue frames in decode order, and then you provide the display timestamps so that we know how the frames should be reordered. As before, frames can be scheduled against a time base, in which case the frames will automatically be output according to that time base when the trigger time happens. But we have a new mode called non-scheduled display times, in which case there's no time base and you have
to make an explicit call to the ICM to say, "I would like this frame back." You can also optionally supply decode timestamps, which are a hint saying this is when it would be a good time to decode that frame.

Many of you will have loops in your code where you decode some frames by calling the ICM. Generally the pattern has been: you go through all the frames in order (there was only one order before), you read the frame into a buffer, then you call the ICM to decode the frame and decompress it immediately, and then you use that output frame somehow. Well, as I've been saying, immediate mode, one frame in and one frame out, is very awkward for B-frames, at least if you want to get the frames out in display order, which is the order that makes sense to the user. So we need to enhance this a bit.

Here we have an outer loop and an inner loop. The outer loop queues frames in decode order and the inner loop retrieves frames in display order. So frames go in in decode order and you pull them out in display order, and there may not be a one-to-one correspondence; that's why we have the outer loop and the inner loop. The inner loop isn't going to be run many times; I'll show you in a second. One other thing: because you're queuing multiple frames, you need to load them into multiple buffers. These aren't Core Video pixel buffers, these are just data buffers, and the ICM will call you back to say when it's time to release those because the codec no longer needs them.

You can do this with the existing GWorld API, and we've also introduced this new multi-buffer, Core Video pixel buffer based API called decompression sessions. Now, the decompression session API does not support any drawing operations. It doesn't do clipping, it doesn't do matrix transformations, it doesn't do transfer modes or any of that other stuff. Instead it just gives you the buffers in the format you want. There's a flavor of this API that supports sending buffers directly to a visual context, and there's one that just
gives you the buffers back.

So let's do a demo. Wake up. I am right-handed. Okay, so I have a little clip here. I like the Harry Potter trailer, and I cut out a little bit of it in the middle. It's not very long, but it plays in display order. See, everything's moving forward; the bus is moving on down the road. QuickTime Player has not been revised to understand B-frames, so, as we said, if you're using the high-level APIs you don't need to change. What's more, I'm going to step through this by pressing the arrow key, and you notice that the frame is moving forward. What's happening here is we are telling the movie to move to new movie times and rendering the frame for each of those times. If you do that in your application, if you step through with something like SetMovieTime and MoviesTask appropriately, then you don't need to worry. You'll get the frames out in display order and everyone will be happy. Great job.

But this movie does have B-frames; the frames are being reordered. I have a copy of Dumpster here. It's great when we get to demo Dumpster. Many of you will know what Dumpster is; some of you won't. Dumpster is a tool that shows you the internal structure of movie files, or actually of the movie header: not the place where the compressed media data is, but the movie header itself. We've modified this version of Dumpster to show you the new B-frame tables that Anne was mentioning. You probably can't see the detail, so I'll just pop them open so you can take my word that they're there.

There is a piece of information here that tells you the size of individual frames; these are numbers varying between 40 and 80 kilobytes. It also stores the timing information. Each of these frames has a duration of 1,000, and this is from film, so the speed is more or less 24 frames per second, and so the time scale here is something around 24,000, and each frame has the same duration, 1,000. This is a new table for Tiger: you can also store
the display offsets. You probably can't see, but the first one's zero, the next one's a thousand, then minus a thousand, and a thousand, and minus a thousand. What does this mean? Well, think about the durations, which are now interpreted as decode durations. The frames are at decode times 0, 1000, 2000, 3000, and we add the display offsets to those to find the display times, and we'll see 0, 2000, 1000. The second two frames are reordered; they're exchanged in pairs, and it's the same for the next ones. So here's the difference between the decode order and the display order: the decode order is one, two, three, four, five, and so forth; the display order is 1, 3, 2, 5, 4, 7, 6, and so on.

Okay, how about key frames? There's also information here that shows you the sync samples, or key frames. There's one of them; it's the first frame. So this is a new version of Dumpster that shows you the new tables. This version of Dumpster, I believe, is in the disk image for this session if you want to use it with movies that you've encoded yourself.

So let's see some code. This is a little command-line tool which we've included in the disk image for this session. It steps through a movie and does that new kind of loop that I was describing. It calls the new decompression session API to decode each frame into a Core Video pixel buffer, and it takes command-line arguments. I've made it scale the frame down; it's so big there's not much space for the debugger. It also takes a command-line path to the movie file. So let's try it out. I'm pausing in between the frames so that you can see them, so they don't race past.

Wasn't there something odd there? Did you see that? I'll play it again. It's Harry Potter, but where did he come from? He wasn't there when I played this movie in QuickTime Player. What's going on? Well, this little tool is going down to the media layer of the movie and it's accessing the sample table directly, but when you do that you're circumventing the
the higher layers, and at the track layer there's a thing called the edit list. When I cropped that little piece out, and copied and pasted it out of the longer movie, I had edited out the frame of Harry Potter, but it's still inside that sequence of frames at the media layer. So we're going around the edit list. If we wanted to step through the movie in the same way that it appears to someone who's using it, we'd have a bit more code that would have to walk through the edit list, or an easy way would be to do what I did with the arrow keys, stepping through the movie.

So briefly, let's have a look at this in the debugger. We're doing some things that aren't very special about B-frame movies, and I'll skip over those. We are opening the movie, we are getting the video track, we're getting the image description, and then we're making a window that's the scaled size of that image description. There's a window in the background there, nothing big. And then we're making a decompression session. I want to show you this. Yep, local variables, love them.

So we're creating a pixel buffer attributes dictionary. This is how we're describing the kinds of pixel buffers we'd like to get back from the decompression session. We give it the width and height that we'd like it to give us buffers in, because we would like them to be a specific width and a specific height, and we ask for one specific pixel format. We could also say, "Here's a list of pixel formats; pick whichever is best out of these." We also pass a callback when we're creating the decompression session. This callback is called with the decoded pixel buffers. There's no GWorld here; the callback is called with a fresh buffer each time, and when you release them they can go back into the pool that the ICM is using to recycle pixel buffers. That tracking callback is also called when the data buffers can be
recycled.

Okay, so now we've got a decompression session, but nothing's been decoded into it. Now we've got to have a look at the media and pull out those frames. We get the number of samples in the media and we allocate some storage for each of them. Now here's a first interesting question: what's the first frame that we want to display? Well, we're starting at time 0 in display time, but that might not be the same as the first frame, so we need to go and do a mapping, and this is calling the new API in Tiger. Previously we'd call MediaTimeToSampleNum to get this information, but now you need to specify which kind of time you want to talk about, so it's MediaDisplayTimeToSampleNum. And I think I have debugger expressions. Aha, my little expressions. Some of these variables are uninitialized, and you can't see this, but I'm telling you that number in red is one. So that's the next sample number we want to display. That's also the first sample that we're going to decode, just as we saw in Dumpster.

Okay, so we're approaching the outer loop here. The outer loop queues frames in decode order. That's just ascending sample numbers, so we use ++ to get to the next one. We translate that sample number to a decode time so that we can get the size of that sample, we allocate some storage, and we load it into that storage. Along the way we have found out what the decode time and display offset are; we add those to find out the display time for that frame. Now we have enough information to queue that frame with the decompression session, and we hand it to the ICM, and nothing happens yet. Nothing happens because we're using this new mode where you pass display times and they're non-scheduled. There's no time base; they're not going to come out until we ask for them back.

So the next question is: well, have we queued the frame that we need to get out? We just compare these two numbers. We have queued sample one and we want to get back sample one, so yes,
let's go into the inner loop, where we retrieve samples in display order. We do that by saying, "Here's the non-scheduled display time I would like to get the frame back for," and there's a frame. All right, next we want to know what's the next time that we want to display, so we call GetMediaNextInterestingDisplayTime. You might be familiar with GetMediaNextInterestingTime; now we have to say which kind of time we want. Then we translate that time to a sample number. We've gone from time 0 to time 1,000, and that's frame 3.

So let's go around this loop again. I've got a breakpoint here. The next thing that we queue for decode is frame 2, and now we've queued frame 2, but we want frame 3, so we have to go around again. We'll skip over the inner loop and come back to the outer loop, we'll enqueue frame 3, and then we'll ask for frame 3 back, which is in the queue now. And here we go: the bus will move a bit down the road. The next time is 2000, and that's frame 2. So remember this order: we're going through the frames in display order, thanks to GetMediaNextInterestingDisplayTime, so we're going through these frames in the order 1, 3, 2, 5, 4, 7, 6. As we go through, we're going to get back frame 2 now. So okay, big deal. We carry on in this pattern. Because the frames are sort of exchanged in pairs, we'll queue two frames and then retrieve two frames, and then we'll queue two frames and retrieve two frames, and it's going to carry on like this. I'll clear that breakpoint and just continue, and in fact here's us going through the rest of the frames, and there's Harry Potter. Great.

Okay, so one more thing. It's not like a Steve "one more thing," I'm sorry. We've enhanced the decompressor component interface to support B-frames. This is all based on the base decompressor. We introduced the base decompressor six years ago in QuickTime 3, and it's been helping you ever since to write video codecs. The base codec helps by
implementing the queue that holds those scheduled frames, and now it'll also help you with the frame reordering for B-frames.

These are the new rules for B-frame-aware decompressors, if you want to write one yourself. First, you opt in in your Initialize function by setting a flag. Second, you classify frames in your BeginBand function, which means that you say whether the frame is a key frame, a difference frame, or a droppable difference frame. It's always been a good idea to do this, but for B-frames it's mandatory. Third, we've split the work in the DrawBand function. This used to be where both decode and display happened, and now we've separated those: decode happens in the DecodeBand function and display happens in the DrawBand function, and only one of those will get called if we just need to decode the frame in order to prime it inside your decoder. A final note, and this is not just for B-frame codecs but in fact for any codec that wants to work in the player now: don't cache the pixmap base address in BeginBand. It might change between BeginBand and DrawBand, and then you'd draw in the wrong place.

So these are the new APIs we've provided in the Image Compression Manager for multi-buffer support and for B-frame support. There's a video track movie exporter in the seed; I encourage you to try it. I encourage you to try out the new APIs and exercise them, because the new APIs will work with the B-frame stuff and you'll see how that works. Also exercise your application and examine whether your application needs revising in order to cope with B-frame content.

And let me give you one small warning about one bug, because it's likely to bite you and I'm a kind guy. The bug is this: the video track movie exporter only creates video tracks. It doesn't copy the soundtrack, so it's likely that you're going to want to go and extract the soundtrack and paste it into the new movie with the Add command so they're next to each other. If you do that, save the movie, but don't save it self-contained, and
save as self-contained is now the default in the new QuickTime Player, so carefully switch it back to save as reference, or what used to be called save normally. The reason for this is that when you save a movie self-contained, we do a thing called interleaved flattening: we interleave half-second chunks of audio and video, and the code that does this has got a bug. It doesn't do the right thing for H.264, and you'll get movies that won't play properly. They're not going to cause any harm, they just won't play right. So avoid saving those movies self-contained.

So, more information, both on the DVD and on the net at connect.apple.com. You can download a bunch of documentation for this. There's a What's New in QuickTime 6.6 document; it's inside a big 60 megabyte Tiger documentation download. Download it here while Apple's paying for the bandwidth. Also download the disk image for this session: you can go to connect.apple.com, log in, click Download Software, and see what's new, and it'll be there under the developers conference stuff. The API reference won't be updated until Tiger's final.

And one more thing; it's another dull one more thing. Just around the back here in the hands-on labs, in the graphics, media, and QuickTime lab, we have a special extended hands-on lab time so that you can talk to me and other folks about IPB and about visual contexts. That's starting more or less right after this session and will go on till six-thirty. They'll be tearing down everything else, but they're going to let us stay in the room so that we can help you, so come along.

Also, seeding: you've got the Tiger seed. It's possible that we'll do further QuickTime seeds. If you would like to be involved in such seeds, please send an email with your name, company, product, and technology interest to quicktime seed at apple.com. I have some reminder cards that I can give you if you come and see me after this, or you can contact your friendly evangelist.
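
The display-offset arithmetic shown in Dumpster and the queue-in-decode-order, retrieve-in-display-order loop from the demo can be modeled in a few lines. What follows is a minimal Python sketch under stated assumptions, not QuickTime code: `display_times` and `decode_loop` are hypothetical names, a plain set stands in for the decompression session's queue, and the durations and offsets are the ones read out during the demo, with the last two offsets assumed to continue the same pattern.

```python
from itertools import accumulate


def display_times(decode_durations, display_offsets):
    """Return (decode_time, display_time) for each sample, in decode order."""
    # Decode times are the running sum of the decode durations, starting at 0.
    decode_times = [0] + list(accumulate(decode_durations))[:-1]
    # Display time = decode time + display offset, as described for the new table.
    return [(dt, dt + off) for dt, off in zip(decode_times, display_offsets)]


def decode_loop(decode_durations, display_offsets):
    """Queue samples in decode order and retrieve them in display order."""
    times = display_times(decode_durations, display_offsets)
    count = len(times)
    # 1-based sample numbers sorted by display time: the order the viewer sees.
    by_display = sorted(range(1, count + 1), key=lambda n: times[n - 1][1])

    queued = set()   # stand-in for frames handed to the session (assumption)
    retrieved = []   # sample numbers in the order they come back out
    want = 0         # index into by_display of the next sample to display

    for sample in range(1, count + 1):        # outer loop: decode order
        queued.add(sample)
        # Inner loop: runs only while the next sample to display is queued.
        while want < count and by_display[want] in queued:
            retrieved.append(by_display[want])
            want += 1
    return retrieved


if __name__ == "__main__":
    durations = [1000] * 7
    offsets = [0, 1000, -1000, 1000, -1000, 1000, -1000]  # last two assumed
    print(display_times(durations, offsets)[:3])
    print(decode_loop(durations, offsets))
```

With these inputs the inner loop is skipped after queuing samples 2, 4, and 6 and runs twice after samples 3, 5, and 7, which is the "queue two frames, then retrieve two frames" rhythm described in the walkthrough.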
