WWDC2003 Session 702

Transcript

Well, hello, welcome, and thanks for coming. This is session 702, MPEG-4 Demystified, part of the QuickTime track that we're very happy to have this year at WWDC. I'm Amy Nugent. I work on the QuickTime product marketing team, and I'll be your host for this session. You probably just came from the QuickTime State of the Union presentation, and there you learned that standards play a very important part in the strategy for Apple as well as for QuickTime. It's my honor to introduce to you Rob Koenen, who briefly talked in the State of the Union. He is president of the MPEG-4 Industry Forum, as well as holding many other jobs, and he will go through the MPEG-4 specification. It's a very vast and deep specification that is capable of many things, and I will leave you in the very capable hands of Rob. Have a good session.

Thank you very much, Amy. So I think we've got to do one of two things first: either all of you flip open your iBooks, or we put on some light in the room, because it's really very dark and I can't see anyone. There are not enough iBooks here, so can we have some light in the room? Is that possible? Yes. I think you can still see the screen, right? And the screen is much more important than I am. So allow me to take off this. Good morning, everybody. It is my pleasure and honor to be able to talk to you today and to explain the high-level concepts of MPEG-4. I think I will be able to demystify some of it, maybe not all of it. Notably, one thing I have to exclude from this is licensing. Some of you have heard about licensing; you should come back on Thursday morning, when there's someone here to explain just that. I will say a few words, though. And if during this talk you have any questions, I don't mind being interrupted; just raise your hand. It would be good if you use a mic for a question, because this is being translated simultaneously into Japanese and the translators can only hear you if you use the mic. If that's difficult,
just shout out your question and I'll repeat it for the translators. They gave me this little device, and I've been told not to press this button, because then everything will go wrong. Why do they give you little devices with buttons that you're not supposed to press?

Thank you for coming. This is what I would like to address today: what is MPEG-4, how does it work, what are the recent and interesting developments, and why should you use MPEG-4? That last one is just a bunch of business talk, so we'll go over it quickly, because you're all developers. I'll tell you a little bit about the deployments of MPEG-4, and then say just a few words about M4IF, the MPEG-4 Industry Forum, which is an advocacy group for MPEG-4.

So let's start with the basics: what is MPEG-4? (What's happening back here? I got the message.) I'm not going to give you the gory details of the video codec or the audio codec or any of the systems codecs, but a high-level functional overview of what MPEG-4 is and what it does. First, it's what we like to think of as the media standard. We call it that because it's a standard that works across all devices, all networks, all carriers, everything basically, and that's why we also call it the interoperable cross-platform ecosystem. While it is interoperable, it's also competitive, and I will tell you why it's competitive: just because you have a standard doesn't take away all the competition. On the contrary, it actually creates a lot of opportunities for competition. Most people know MPEG-4 as a video codec, and Apple likes to talk about MPEG-4 video and AAC. AAC is actually also MPEG-4, or MPEG-2, Advanced Audio Coding. And it's also a systems layer, a part of which is the file format, which was based on QuickTime, as Frank Casanova explained this morning. It goes way beyond audio and video. Audio and video are the first elements that will see deployment, but it supports stuff that's way beyond that, and I will tell you a bit about
that. And it's designed for all multimedia platforms, digital ones.

So where does it come from? Most of you will know. Who knows MPEG-3 here? Anyone know MPEG-3? Some people still do. MPEG-3 doesn't exist, because what you know as MPEG-3, or MP3, is actually MPEG-1 audio. MPEG-1 had three layers of audio. Layer 1 was really simple. Layer 2 is what's used in most of the digital broadcasting systems today, in Europe at least; in America it's Dolby. And Layer 3 is the most complicated of the MPEG-1 audio layers. It was way beyond what could be implemented when it was designed, but it's now just the norm. Then you may know MPEG-2, which is the standard for digital television and for DVD: video, and audio also in Europe; again, in America it's more Dolby audio. Then there are MPEG-7 and MPEG-21, which are not successors to MPEG-4 or MPEG-2. MPEG-7 is sort of a metadata standard that allows you to describe content, and MPEG-21 is (and this is a very fuzzy phrase) the framework for interoperable use and exchange of digital media. It does all sorts of stuff that has to do with where you find content, what is a unique identification for content, and where you find the rights to content, and it attempts to standardize elements of digital rights management, which is more of a challenge, I can tell you, than standardizing just a video codec, which is hard enough as it is.

So let's talk a little bit about the MPEG-4 vision. This is a vision that's been with us for a long time, actually since the middle of the 1990s, and it's coming close. I've been working on MPEG-4 for almost 10 years; that will be next year. The vision in MPEG-4: if you remember back in the early 90s, or halfway through the 90s, there was all this talk about convergence. Everything was going to be the same, and we were all going to have glass fiber into our living rooms, and the big discussion was: okay, are we going to consume content on the PC or are we going to consume the content on
the television? And back then we said that's all a lot of bull; that's not going to be. (I'm not sure that can be translated, by the way.) It wasn't quite right. Rather than convergence, we saw proliferation of multimedia. Rather than fewer networks, we got more. We got all these sorts of different mobile networks, we got the analog television network, we got the digital telephone network, ISDN, which has had quite a bit of adoption in Japan and Europe, not as much here. We got DSL, we got cable, we got stuff coming to us through satellites, and it's going to be chaos. And rather than just this convergent terminal, where it's going to be either the PC or the television (another load of bull), we're going to have a lot of different terminals: handheld devices, phones, more PCs, different PCs, set-top boxes, and they're all going to want to do digital multimedia.

Back then, the way to do standardization was: new network, new standard, new stack, communication protocols, codecs, everything anew, especially in the communications world. And we said that doesn't make any sense. We have to have one layer of content representation that works across all of these different applications, is agnostic to the network and to the terminal, and supports all these different types of what you could call paradigms for using content: broadcast, communication, retrieval, and retrieval can be online or in packaged media like DVDs. So basically a single technology for all these devices. Of course that doesn't mean that on your high-definition television you're using the same bit rate, but you're using the same technology and the same systems layer. By the way, you can use the same MP4 files, and they can point to different media files with different encoded bit rates. And like Frank showed this morning, you could easily take one file and transcode it to another that does work on a mobile device, enabling what I like to think of as this write
once, play everywhere paradigm, where you can use your content across your devices: on your PCs, on your CE devices, and on your phones. You can even take it with you, and on the other hand you can shoot your films while you're on the road and just upload them to your PC, and they just play in QuickTime.

So that's where we see the applications of MPEG-4 today, and I already talked a little bit about this in the State of the Union. We see it in mobile devices. We see it in broadcast, not as much yet; this will explode, excuse me, when we get the new advanced video codec, about which I will say a few words a bit further down in this talk. We do see streaming services. Interestingly, we are getting a little bit of interactivity in these. MPEG-4 allows for great interactivity, which I will definitely talk about, and the BBC is doing a trial now with that sort of stuff. And there is packaged media, which is also waiting for the new codec at this moment.

So let's now go to the heart of this presentation, which is: how does it work? MPEG-4 is an object-based multimedia content representation standard. Some people that know a little bit about image coding, or have heard about it before, will know that MPEG-4 supports arbitrary-shape objects in video, so that you can do segmentation and stuff. That's true, but you don't need to do it. An object might just as well be a rectangular frame of video, and the audio, and then maybe the text that scrolls across the video is another object, which is already a huge difference from MPEG-2, where everything is just pixels. It's got a revolutionary systems layer. It's got state-of-the-art codecs, which are responsibly upgraded, which means no new codec every half year. That is something that you can perfectly well do in the internet world, but the CE world doesn't work that way; people don't want to buy a new DVD player every half year. And it's got something called profiles and levels to restrict complexity and guarantee interoperability. Those are what we call interoperability
points: profiles and levels. MPEG defines a whole bunch of those, and you could actually say there are too many of them. It doesn't really matter, because industry consortia such as the Internet Streaming Media Alliance, which is a consortium that Apple co-founded, pick their profiles and levels. They say: okay, we use this profile for video, this profile for audio, and we're going to use this file format (well, there's only one of them, so that's simple), and that's how we are going to do interoperable streaming media, all of us: Philips, Sun, Cisco, Apple, you name it.

So let's take a look at this picture for a second. You can't read the text, and that's okay, because you don't need to. What you see here is all the different content types that MPEG-4 supports. There's audio, there's video, there's graphics, there's even 3D graphics, there's textures and animation, and there's something we call BIFS; I'll tell you what that is. Then this basically represents a multiplex, which as you see is an MP4 file, the basic container that can carry everything. Then you can distribute these containers and these streams using whatever you would like to use, because MP4 is basically agnostic to all these things. So there's broadcast, there's broadband delivery, satellite, wireless, there's phone lines, whatever. And then you can put it on a number of different devices. I talked about the devices before, so let's skip that bit.

What's now interesting is to look at what MPEG-4 does. Actually, I'm going to go back to the screen. In MPEG-2 you would do authoring: you would take all of these objects and do your authoring, and Apple has a couple of great products for doing authoring. But then you're going to do encoding, and what you do with encoding is basically say: okay, now I'm going to convert all these things into pixels, one plane of pixels. Everything is collapsed into a single plane. That rectangular
frame of pixels gets encoded. I explain everything using video concepts; in audio I could do something similar, but it's just a little bit easier for me doing it in visual concepts. So you take all the objects, you collapse them into a single plane of pixels, you encode this using MPEG-2, and then you just display it here. There's nothing you can do anymore.

Now with MPEG-4 you can, if you wish (you don't have to), keep all of these objects separate. You can have multiple video objects, or you could have one. You could have a graphic that's encoded independently, and have your streaming text. You could have your voice and your music encoded separately. You can keep it separate in what are called elementary streams, you send these to the decoder, and then you do the composition here. So instead of doing composition before encoding, here, we are now doing composition after decoding of the objects, which is here. If there is one major paradigm shift in MPEG-4, that's it.

Now, in order to be able to do this, you need some sort of a language that tells you: okay, this is where the objects go on the screen, this is when they appear. That's what we call BIFS, the Binary Format for Scenes. It's an efficient binary language that allows you to describe where the objects are, where they go, and when they appear. Now if you have this BIFS language, you can not just describe the scene statically, you can also start describing the scene dynamically. You can attach behavior to the objects and say: okay, this logo is spinning, it's changing its color, it's moving from the top left of the screen to the bottom right of the screen. If I were to do this in MPEG-2 or any traditional codec, I would have to encode all these pixels again and again and again until the logo is here, which is quite a waste of bits, while in MPEG-4 I'll just give one command saying: okay, move the logo from there to there and take a
second to do it. And that's it: a very small binary command to the decoder, and the decoder takes care of everything. Now, this applies to visual objects, and it applies to audio objects as well. You can describe 3D audio scenes in this and have sources move around in a scene if you wish. That's quite a bit more advanced, but that's the basic concept of MPEG-4.

So let's look at this in a typical MPEG-4 scene, one that is fully free of any copyright so I won't get in any trouble, which means it's a bit dull; I made it myself. It's an aquarium with some seaweed. There is an arbitrary-shape video object, and I've been using this for a while: she's four now, and this was when she was one day old. There are some bubbles from the fish, and there's another type of fish, which is a special sort of fish that I'll explain a little bit. All these are different objects. So this is an arbitrary-shape video object, or natural video object. These are graphic things: the fish, and then there's the bubbles. There's the background. It has music, and there may be a voiceover. And then there's this, which looks like a wireframe and actually is a wireframe with a picture projected onto it. The neat thing about this is that if you move the vertices in this wireframe, you can make the fish swim. In real life you wouldn't see all these wires; these would be hidden, but that's just to show you how it works. These are a couple of the objects that MPEG-4 supports.

Now, this is what the scene tree looks like. All these objects are represented by branches in this tree, and they have sub-objects. (Let me try to go back. Yeah, that works.) So all of these objects can have audio and video associated with them. Some of them are static graphics, some of them are streams, some of them are audio, some are video, and this is actually literally what's represented in the decoder. And now you can go in with your BIFS language and just do stuff with the branches. You can take a
branch out, and an object disappears. You can change the place of a whole branch. You can change the color of an object, just by issuing these little BIFS commands.

So, to recap: we have an audiovisual scene with objects. It could be a very complicated scene, or it could be a very simple scene with one audio object and one video object that just provides interoperable streaming, which is quite a feat in itself. These objects can be of different natures. They can be natural, which is to say recorded with a camera or microphone, or they can be synthetic, which is to say generated with a computer program. There is a compositor, which is this new element, that puts the objects in the scene. And then there is an efficient, real-time, binary scene description language, which is called BIFS. To say a couple of words more about this: it inherits a lot from VRML, the Virtual Reality Modeling Language, but as you may know, that one was neither real-time nor binary, and therefore not very efficient for stuff like streaming over the internet or to mobile phones. It was perfectly okay for doing computer stuff.

The coding scheme for all these different types of objects is optimal for the object type. So you don't try to encode speech with a music encoder, which is not really optimal. You don't have to encode a graphic with a video encoder, which is optimized for moving video rather than still graphics. You can use the optimized coding scheme for each of these objects. And this is completely independent of bit rate. I still say this because most people think MPEG-4 is just about low bit rates. It's also about low bit rates. Way back when, in 1993, MPEG-4 started as a low-bit-rate project, but that got changed really quickly, in 1994. Some people still think it's about low bit rates. In MPEG-4 there's a Studio Profile that goes up to over a gigabit per second in video coding.

So let's look at the different objects that are supported in MPEG-4. The
ones you know are video and audio, and these are the most widely deployed: video coding and MPEG-4 Advanced Audio Coding. In addition to video coding, on the visual side we have animated faces and bodies. There are some companies that have products out there for animated faces, and I think the BBC has been looking at doing this, because they have a legal requirement to provide talking heads for people that can't hear; deaf people are supposed to be able to read lips, and you could do this with animated faces. There are two-dimensional and three-dimensional animated meshes. These are little wireframes: you can project either still or even moving video onto these wireframes, and then you can deform the wireframes and get really intricate effects. There's text, both streaming text and still text, and graphics. JPEG is also supported as a part of the MPEG-4 framework, for still graphics.

On the audio side we have generic audio, from mono to 5.1 channels, and by combining different audio objects you can actually go up almost indefinitely; you don't need to stop at 5.1. There are specialized speech codecs. There are synthetic sounds, and this is very advanced: Structured Audio is basically a language to program a synthesizer, first to describe instruments and second to play the instruments, so there's a score language. There's text-to-speech, which is merely an interface with which you can mark up text that can then be regenerated as speech. And then there's something called environmental spatialization, which is making stuff sound like it's in a specific place; you can describe that place.

So let's look at the parts and how this all fits together. First there's the visual coding, and then there's the audio coding, and this is just decoding. I'll say a few words about this a bit further down in my talk, but it's important: MPEG-4 only standardizes decoding. It doesn't standardize encoding, and that's why
there's so much competition between providers. It's the same with MPEG-2, and as you will see a bit further down in my talk, this provides for a lot of improvement in the quality of these codecs. This is also why you have to be very cautious with statements from proprietary vendors about the quality of MPEG-4. "The" quality of MPEG-4 doesn't exist, basically, but you can get the best quality with MPEG-4, and there are fair comparisons to be made. I'll say a few more words about that a little later.

Then there's a systems layer in MPEG-4, which does stuff before decoding, in terms of demultiplexing and buffering, and after decoding, in terms of presentation, which is this composition of the objects. The systems part used to contain the file format, which is the MP4 file format, which is extremely close to the 3GPP file format that you saw Frank talk about this morning. The only difference is basically that there is a little flag that says this is a 3GPP file, which means: okay, I now have AMR voice coding support, which is not something that is native to MPEG-4. For the rest it's just the same stuff.

And then there's something called DMIF, which isn't always used. You don't have to use it, but it would provide you with an abstract interface to the transport. DMIF has a little bit of a grandiose name (Delivery Multimedia Integration Framework is what it stands for), but it's actually quite a compact part of the standard. If you use DMIF, you can write your application once, to an abstract transport layer. Then you only need to write separate interfaces to a disk, or to a network, or even to a broadcast, and your application is otherwise fully unaware of what it's talking to. And then there's the transport layer, which in principle is not in the standard.

This is how content flows through it: it comes in through a transport, goes through DMIF if it's present, systems takes care of the demultiplexing of all the different objects, it's decoded, and then the decoded objects are
composited onto the screen or into the sound space. Composition of audio could very well be: okay, I turn up the volume of the background a little bit and I turn down the volume of the foreground speaker a little bit, or I choose the Japanese speaker rather than the English speaker. These are all possibilities when using MPEG-4 composition.

There are two sort of orthogonal parts. Conformance contains a lot of bitstreams: if you have a decoder, you can use the conformance part to see if your decoder is conformant. That gives you some level of indication of interoperability. In the MPEG-4 Industry Forum we do much more interoperability work, with exchange of bitstreams. And then there's the reference software, which is actually free of copyright if you use it for building a compliant implementation. Even though in principle transport is not in the standard, there is something called MPEG-4 on IP, which is a specification of basically how to use IETF protocols and how to do the mappings. And more recently, what's called Advanced Video Coding was added to the MPEG-4 standard; I'll say a bit more about that in a bit as well. You will see that the part numbers don't quite add up; there's more stuff that I don't think is important to talk about right now.

So let's take a look at some of the recent developments. Hey, this slide was supposed to have been hidden. I want to first say a little bit more about the objects; I'll keep this a little bit brief. We have video, which basically goes from five kilobits per second to over a gigabit per second. So if you take one set of zeros out it's megabits, and if you take another set of zeros out here it's gigabits per second. Sony actually has cameras that support this stuff; the Studio Profile, it's called. There can be multiple rectangular or arbitrary-shape objects in the scene. Scalability is supported, including fine-grained scalability, which has some support but not a lot yet. It means that if I have my full bitstream, I can drop layers of the full bit
stream and you can still sensibly decode the picture or the audio, in this case the video. Sprites: you can use sprites for backgrounds. You send them once, and then you can warp the background to make the scene change, but you don't need to keep sending them as moving video. And then we have some types of computer-generated visual information: synchronized graphics and animated text, face and body animation (I talked about this), and the meshes with still or moving texture.

Now for audio, and there's a lot of stuff here. I should say some of this will be used and some of this will likely not be used, and that's quite okay, because we have these profiles and people will pick what they need. Again, with audio we can have a number of objects in the scene, so that you can make your audio composition. I think the most important codec in MPEG-4 is MPEG-4 Advanced Audio Coding, which is very much like MPEG-2 Advanced Audio Coding but has a couple of new things. There's another audio codec for really low bit rates, though AAC is getting really low as well these days. Then there's one for extremely low bit rates; it's called HILN. And then there's a voice codec, actually two of them: one, again, for extremely low bit rates, and one for normal bit rates. At 24 kilobits per second you have basically transparent voice quality; you can't distinguish it from real voice.

In audio you have, again, scalability. It's interesting: you can even build an AAC layer on the CELP layer if you wish. You use the CELP layer as sort of a prediction, and then you can put an AAC layer on that. If you, for instance, do radio, the basic quality goes in CELP, because it's mostly speech, and if you want to have really good quality, you do it in MPEG-4 AAC. Something like that is actually done in Digital Radio Mondiale, or DRM, which is a digital broadcasting standard. The
conditions were such that you have to be able to receive it in very poor reception conditions, and then you have to get a really good quality signal if you receive a good signal, and that uses this type of scalability.

Synthetic audio objects: I talked a little bit about this before. We have this orchestra language and score language. With this language you describe the orchestra, and with this one you describe the music itself. At really just one or two kilobits per second you can do really great music. There's a company that's been working on this for a long time, and they were going to build a QuickTime plug-in; I hope they'll come out with it soon, as it was promised for this summer. MIDI is supported, and a couple of types of synthesis, and then there's this text-to-speech interface, which you can use together with face and body animation.

Those are some of the more esoteric object types. The support today in industry is for AAC and for just normal rectangular video coding. There are some companies that are trying to do more interactive stuff with MPEG-4, but they start with the systems layer. They have graphics, they have arbitrary-shape stuff, there are semi-transparent graphics, and they notably use the binary scene description. As I explained, it's inherited from VRML, but it's much more efficient and it has added real-time. It basically married that with the MPEG-2 background from the broadcast world, right? Those people know about synchronizing audio and video, about synchronizing different objects, and about buffer models and stuff, and the scene description marries these concepts from VRML and from MPEG-2: great broadcast, great synchronization. That's what allows the interaction. It works in 2D and in 3D, and there are a couple of three-dimensional players out there already. And it allows you to do dynamic scene updates: you can add objects to the scene on the fly, you can delete them, and you can change them
all on the fly, by using this scene description language. To provide an interface with the SMIL world, and to make it easier to author, MPEG later added what's called the Extensible MPEG-4 Textual format, or XMT, which is basically a textual format for BIFS. There are actually two versions of it: one of them is very close to BIFS and one of them is more generic. I'll spare you those details, but the important part is that there is a SMIL harmonization to the extent possible, because there is a lot of SMIL content out there.

What's very important in MPEG-4 systems is that you do get predictable behavior of audio and video, which hasn't always been the case with all the web and internet technologies, and that you get predictable buffer management. So you know that if I send content, it will play on the player, because the player knows what to expect. It won't get into trouble with buffer overflows. It's predictable and it's all standardized. There's some more stuff with SMIL integration here in the timing, with which you can basically do a looser timing of your objects.

What's important here is that while MPEG-4 doesn't standardize digital rights management, it has interfaces to proprietary systems. Digital rights management is not going to go away. I think it actually could provide some useful features for end users, even though it's been portrayed as something that is hostile to end users; that's wrong. But in order for an ecosystem to support serious content being deployed in it, something needs to be done about this rights management. I think Apple took a great approach with what's being done right now in iTunes. It's very user-friendly, and basically that's what DRM needs to be: you don't see it if you make normal use of your content. And that's what we're getting to see these days. There is a standard interface in MPEG-4, and there's MPEG-21, which will bring more
interoperability in DRM, which means it's no longer, as it is today, the monopoly of one big company, basically. And the file format, I already said it a couple of times, is based on QuickTime, just like the 3GPP file format, which is very close to MP4.

Quickly wrapping this up: there's MPEG-J, or Java, which you can use for really complicated content rendering and for having programmed content, basically, but also as standard APIs to find out what you're talking to, what the terminal resources are, and stuff. And there's some advanced audio rendering, where you can basically shape the sound without changing the source of the sound. You can describe the environment in which it should be played. Basically you could say: okay, this is in a closet, or this is in a giant hall, or this is a football field. I can describe this with the audio rendering stuff.

I realize I'm giving you a lot of details. We're going to try and talk about applications soon, but I have to make the case for the profiles first. If you have this huge toolbox of all this stuff, then in theory you would have interoperability, but in practice there's not going to be a whole lot of interoperability, because everybody's going to be implementing different things, right? Which is why MPEG defines profiles. These profiles are the conformance points, as we call them, which is: okay, this is where you can test the interoperability. A profile basically defines a tool set: I use these tools to encode my video, I use these tools to encode my audio. And then the level within the tool set limits the complexity: stuff like bits per second for video, the screen size for video, or the sample rate for audio. These things are in the levels. If you take a look at ISMA, the Internet Streaming Media Alliance, they have said: okay, we're going to use what's called Advanced Simple Profile for the video, we're going to use low-complexity MPEG-4 AAC for the audio, and use
the MP4 file format, and that's what we're all going to do. So now, within ISMA, I have an interoperable stack. They add some transport to that, which is something MPEG-4 doesn't define, and then you have the interoperability. It's interesting to see that while MPEG-4 has many profiles, and I would say too many (and I'm partly responsible for them, as chair of the MPEG requirements group), industry is converging on just a few, which means there is this interoperability. And the ones they are choosing are hierarchical. There's the Simple Profile and the Advanced Simple Profile, which are mainly used in video. Simple is what is now in QuickTime; Advanced Simple is what is in some of the more advanced MPEG-4 players, encoders, and decoders. But they're compatible, in the sense that if you have Simple Profile content, it will play in an Advanced Simple player. So that's good, and that's why you can see that people are exchanging content. DivX, for instance, is an implementation of Advanced Simple.

We have a couple of profile dimensions. I think it gets too technical to go into real detail, but all the elements in MPEG-4 have been profiled; that's basically the point of this message. Actually, I don't think there are handouts at this conference, are there? So we'll make this available on the M4IF website. Can we do this? I think so, yeah. So we will make this presentation available on the M4IF website, and you can download it later if you wish.

Let's look at recent developments. This is a very interesting development. MPEG-4 as we know it today was standardized in 1998-1999, and some stuff was added after that. Very recently a new codec was added to MPEG-4. It's called Advanced Video Coding. While MPEG-4, as it was until this was added, was very attractive to mobile and internet, where there was no MPEG-2 yet, it wasn't attractive enough yet to the
broadcast industry, because in order to replace the MPEG-2 infrastructure, or to add something to the MPEG-2 infrastructure, you really need good advances in coding efficiency. And while MPEG-4 Advanced Simple Profile provides this, it wasn't enough for these major investments in the broadcast industry. It was enough for the Internet and for mobile and stuff. But this new codec, which is called Advanced Video Coding, originally comes from the ITU world, and they've been working on this for a long time; maybe the first I knew of the project was 10 years ago, basically. And it's the same codec standardized in ITU and in ISO. Actually, basically, there's two groups in the world that work on video coding standardization: there's the Video Coding Experts Group in the ITU, the International Telecommunication Union, and there's ISO MPEG. They came together, formed the Joint Video Team, the JVT, and standardized this new codec, which beats everything out there. So forget about what you hear from Microsoft; this is better, and this has been confirmed by independent parties like LSI Logic, who I have great respect for. And interestingly, and I will say more about that, improvements will continue because of the fierce competition in this market. There's a really fierce competition, and we've only standardized the decoder, so people will come up with amazing encoders, basically. And this will give you about broadcast quality, MPEG-2 video quality, at about 700 kilobits to one megabit per second. Now that's significant, because that starts to get in the range where you can do streaming over a broadband network, or a very good DSL connection, or a good cable modem. It starts to get there. It's also good enough for people to think: OK, I'm investing in a new generation of set-top boxes now, maybe I should take a look at this new codec. It's also good enough for this to be implemented in mobile devices
at some point in time. They already support the basic MPEG-4, as I could call it now, and they'll also start supporting this stuff. And it's amazing that, next to what Apple is doing with the conferencing stuff, using MPEG-4 for conferencing, there's a lot of people lining up to do this, to use Advanced Video Coding, or H.264, for conferencing. There's a lot of industries waiting to start using this codec, and I'm sure, even though Apple never discloses its product plans, not even to me, I'm sure that they're working on this codec and they'll have it ready pretty soon. Then there's Advanced Audio Coding and High-Efficiency Advanced Audio Coding. Now with High Efficiency, it's a neat little trick where you split the spectrum in half and then you predict the upper half of the spectrum from the lower half of the spectrum. And if you do CD quality, or near CD quality, or really good audio, just basically Internet quality, this gives you a lot, really a lot, of bandwidth savings: like CD quality, or about CD quality, at 48 kilobits per second, and high quality, just general Internet quality, at 32 kilobits per second. The trick doesn't work for transparent quality. So if you want to have really transparent quality, which is something that iTunes is trying to achieve, then you would still use normal AAC; then you don't get anything from this prediction trick. It is really neat, and it's being used in XM Radio and in Digital Radio Mondiale, DRM, for their broadcasts, because it works so well. And now AAC, including High-Efficiency AAC, have been tested as the best codecs by the European Broadcasting Union, over all the proprietary codecs, and that's what's being shown here. And this should actually say AAC, Advanced Audio Coding; this is the original. And then you see High-Efficiency AAC, which was tested in one specific implementation called aacPlus, whoops, and then you see MP3Pro, which actually is MP3 with the same trick applied
to it, the same prediction trick. And you see Windows Media here and Real here. And Windows Media 9, I have had explained to me by audio coding experts, doesn't really differ a lot from Windows Media 8. So this was a test at 48 kilobits per second, and, whoops, why does it do this if I don't want it? Well, 48 kilobits per second, done by the EBU, which is really independent, and this was a really professional test: double-blind, which means people don't know what they're listening to. Basically, because audio testing is a lot like voodoo, if the experimenter knows what's being tested, he can make you believe anything, and if you know what you're listening to, you can also be made to believe everything or anything. But if it's double-blind, and neither the experimenter nor the listener knows what's going on, then you get really valid results. That's what happened in this test. A couple of other developments: some work going on on truly 3D video coding, that's very advanced; some work going on on truly lossless audio coding, that's also for the high end; and there's some work going on on a very interesting animation framework, which actually takes a step back and says, OK, let's do this right, this animation, let's create, excuse me, an integrated framework for animation of all sorts of graphics content. It's not for video content or for just natural audio, but for computer-generated content, and we'll see where that goes. So why should you use MPEG-4, apart from the technical details? This is going into the business stuff; I'll quickly go over this. I think if you're a developer it may interest you just a little bit less, but let's just take a look at standards and why they make sense. They fuel a lot of innovation. And actually, for this slide I have to acknowledge Tim Schaaff, who originally made this slide, for giving me this slide. Standards fuel innovation. GSM is a great example: the European, or actually by now worldwide, standard for mobile telephony. And 802.11, also
known under different names, right, at Apple, but you can connect here to your wireless network; it just works great. They have a really long life, standards. Look at the TV standards, like PAL in Europe and NTSC in the US, or MP3, which is actually over 10 years old now, but it's still a premium feature: if you buy a car stereo, MP3 comes at a price. It's being built into car stereos and stuff and digital devices today, and it will not go away. No matter how great the successors are, this will keep being supported. That means that as a consumer you don't have to throw away formats every other year or every year; you just keep your stuff and it keeps working. VHS has had a long life; the CD has been with us for over 20 years. Standards create huge markets: the CD, the DVD, and MPEG-2, which are really multi-tens or hundreds of billions of dollars markets. And they provide an interoperable ecosystem of tools and content, where you can just use stuff from different providers and plug them together and it works. And these different providers can work independent of one master, so to speak, not dependent on anyone. You're not locked into any single vendor, and the vendor may be competing with you, by the way, if it moves into different spaces. And the pricing is controlled by the market and not by a single vendor, again, and if you don't like your equipment from one vendor, you can go to the other one. So if you use MPEG-4, a couple of benefits for you: you can author your content once and then use it on many different platforms and players. Encode once; you may have to encode at different bit rates, but like was shown this morning, this can be made really easy. Your users can pick their favorite stuff; they don't have to stick to one player. Content providers, on the other hand, can pick their favorite stuff. Everybody can just provide tools in their own niche. There's a lot of different niches, and it isn't like
one-size-fits-all, and competition drives the quality up. If you look at MPEG-4, it is both a revolution and an evolution. It's a revolution in what I explained about the design, how it works, and how it can expand to synthetic content. It's an evolution in the sense that it doesn't define new transport protocols and stuff; you can just use it on whatever is there already in place. And specifically, MPEG-4 as it is today, and MPEG-4 with AVC, Advanced Video Coding, as it's coming now, can all be used in an MPEG-2 environment, which is a very big plus for broadcasters. Again, they don't have to replace all their MPEG-2 broadcasting stuff; they just need to plug in a new codec, which is difficult enough, but if the gains are good enough, then the economics are sound. So it saves you money and it makes you money, I believe: by making more efficient use of bandwidth, because it's efficient; by being able to repurpose existing content, making it interactive or deploying it on a mobile network. No need to duplicate work if you go to different networks. You can integrate it into existing MPEG-2 environments, and you can use it on IP networks just as easily. And it makes you money because you can use your content in new networks, in new ways, and can add new dimensions to content. And there's little risk, because it's a standard that's widely supported. Proprietary technology, on the other hand, does lock you into third-party business and pricing models, and makes you dependent on their road maps and their plans and the way they choose to evolve their business, and it can get you into channel conflicts. So this is just one of the forecasts. This is for chips: standalone MPEG-4 chips and cores embedded in processors. They think it will explode, and it's already happening. I tend to agree, and there are many similar forecasts. And one interesting trend is: OK, for the coming few years, competition with Windows Media; after that, the standard will win, because the benefits are just
so obvious that the market will choose the standards. And there's such a lot of people already making MPEG-4 stuff, it's amazing. This is an important point, and I want to dwell on this for a while: because MPEG-4 only standardizes the decoder, there's a lot of room for innovation. And I've seen very bad comparisons, notably by Microsoft, that put QuickTime here and then their latest codec on the other hand, and then they compare the quality without saying that they're only using QuickTime's Simple Profile for MPEG-4 and that they did the encoding themselves. I mean, there's such a lot of tricks you can pull if you do quality comparisons. But if I look at MPEG-2, and this is really the proof of the pudding, MPEG-2 bit rates have reduced by over fifty percent over the lifetime of the standard, and this is an underestimation, and I'll show you the graphs. And this was after the standard was frozen, and without needing to replace the decoders. A great new encoder comes out, it just gets plugged into the broadcasting system; you don't need to replace the set-top boxes, it just works. People come up with great new tools for encoding DVDs; DVD players don't need to be replaced. The decoder is the same, the encoder gets better. And that's what's happening with MPEG-4 in the market today, and that's what will happen with Advanced Video Coding; it's already happening today. And AVC, Advanced Video Coding, will beat all the proprietary codecs that are already out there, including Windows Media 9. And if I look at, that's interesting, you should disregard the numbers here because they are wrong. What's actually right, I thought we had fixed this stuff. Someone's phone is ringing. This should be six megabits per second when it started in 1994, 1995; today you can deliver the same quality in two megabits per second. So it's not, like suggested, one megabit; it's two megabits per second. But still, from six megabits per second to two megabits per second
without changing the decoders, that's quite impressive, and that's from Harmonic. And if I take a look at what Tandberg says, Tandberg TV is a competitor of Harmonic, they basically tell you the same story, but then the graph should start at eight megabits per second. And now, whoops, this should have read eight, and here again it should have read two megabits per second. Something went wrong in the conversion from PowerPoint to Keynote, but the picture is clear: from six megabits, or from eight megabits per second, to two megabits per second today is a huge improvement, because there is competition. So this is open standards, it's interoperability, but there's a lot of room for competition. So briefly, let's look at the deployments of MPEG-4. We see PC media player support, and a recent survey on the M4IF's technotes mailing list turned up like some 20 different players. Some of them are for facial animation and for 3D content; most of them do basic streaming. QuickTime 6 is there, of course. Real has a standard plug-in, which means if you hit MP4 content with a Real player, it goes back to the Real server and downloads the plug-in if it's not already there, and it can decode the content; that's done by Envivio. There are several plug-ins for Windows Media. There's DivX, which is an MPEG-4 compliant implementation, which has millions of downloads weekly, just like QuickTime. And then MPEG-4, of course, is widely supported in third-generation and 2.5G mobile phone networks. Like Roberto Castaignos said this morning, it really becomes the case that, in spite of all these different mobile networks, which are not really interoperable, you can take content from one of them in Japan and move it to Europe and the content will play. So you can't take your phone to Japan, but you can send your content using the phones and it will play. MPEG-4 is used for video, AAC is the optional sound codec in addition to the mandatory speech codec, and the file format, like we said,
3GPP's 3GP, is very close to MP4; it's just this top-level atom and the AMR codec and the text that are used. QuickTime 6.3, recently released, of course supports 3GPP. Then the Internet Streaming Media Alliance, I said a couple of words about that already, made a specification for interoperable MPEG-4 across the Internet. What's maybe more hidden in the background is that MPEG-4 is becoming the de facto standard for security and surveillance. There's a lot of surveillance cameras with hard disk recorders and stuff that just use MPEG-4, almost silently, because I don't get to hear a lot about them. And interestingly, you see a lot of home media centers that do MPEG-4, and people use DivX to rip the content, and then they put it on a DVD and they put it in the DVD player; the DVD player understands MPEG-4. And these are just a couple of recent announcements, and I won't mention them all, but there's chips, there's video cameras, there's solid-state video cameras. These are cool; it's just this size, basically, and you can record on an SD card or a Memory Stick or something. You can record half an hour of video and audio that's watchable on a TV. I won't say it's like DVD quality, but it's perfectly watchable. Just another device, this size. There's portable stuff, and it's coming more, like these video jukeboxes that use MPEG-4, and there's of course mobile phones. They don't just decode, but some of them also stream it, so it's not just recording; some of them can even play it out while it's being recorded. It's pretty cool. So lastly, I want to say a couple of words about the MPEG-4 Industry Forum and, in that context, even though we're not responsible for it, about licensing of MPEG-4. Who's heard about licensing here, by the way? Yes, right. OK, so come back here Thursday morning. I won't be here, unfortunately, because we have our annual M4IF meeting, but someone will be
here to explain. So let me say a couple of words. We're three years old now; about a hundred members we have, worldwide and across industries, very much according to MPEG-4's vision; a nonprofit organization. We have these and many other members. Apple is of course there, and you see a lot of major companies, but there are also smaller ones. They come from IT, they come from the consumer electronics industry, they come from the mobile operators; they come from all across the globe and all across the industry, and they all believe in this single standard that works across everything. Our goal is to get MPEG-4 adopted, and we have done a couple of things that are important. We've discussed licensing a lot; again, I will say we're not responsible for licensing, I'll clarify that in a later slide. We've done a lot of interoperability work, with a program with over 30 companies exchanging bitstreams between their products. We will have a logo program pretty soon, and if you type in MPEG-4 in Google, you come to the M4IF website and you get a host of information. And this is the membership, if you're interested: three thousand dollars for full membership and three hundred dollars for not-for-profits. I don't want to dwell on that too much. This is important, though, and this is the last thing I'm trying to tell you, and then we'll have questions: the licensing. A lot has been said about licensing; a lot has been true and a lot has been false, by the way. But the responsibilities are as follows. MPEG standardizes; so MPEG, the Moving Picture Experts Group, makes the standards, and by ISO rules they can't really deal with licensing. Although with the new codec there's been a lot of effort to get a royalty-free baseline codec. It is the simplest incarnation, it's a profile, and there's been a lot of effort to try to keep this royalty-free for licensing. Then there's the MPEG-4 Industry Forum, which has done a lot of work to get licensing off the ground, but doesn't see anything of the proceeds, doesn't
require anything of its members with respect to licensing. It's just literally a catalyst, if you know how a catalyst works: before and after the chemical reaction, the catalyst isn't changed. Well, if the reaction goes really bad, the catalyst may go away; that may still happen. But then there's the licensors, the people that actually have patents, who sit together in some room and decide and sell licenses. And what M4IF says is that the licenses need to be competitive; it should be possible to build competitive products given the licensing. So that's what we're working on right now, still working on right now, and actually working really hard right now to get this right for AVC, because we know some things need to be improved there. Monday morning Larry Horn, I think, of MPEG LA will be here to answer your questions about licensing. He's going to tell you what's called the truth about MPEG-4 licensing. My personal opinion is that it's great for devices, it's great for phones; it doesn't work yet for content providers. That's what we're working on right now, and there's a lot riding on this, I can tell you. [unclear] Amy? Rob pointed out there is the session Thursday morning with Larry [unclear], which is probably of particular interest. [unclear] Is it MPEG LA's Larry Horn that's coming? OK. I wish I could be here; as I said, we just have the M4IF annual meeting, I need to elect a new board and all that sort of stuff. That is interesting too, but I wish I could just be here. [unclear] some questions [unclear] a great session. Questions? You