WWDC2003 Session 200

Transcript

Thank you very much. It's my pleasure to welcome you to Session 200, which is a graphics and imaging overview. What we typically do at the overview session for the graphics track at WWDC is follow a formula, and that formula is essentially to tell you about the technology announcements for the upcoming version of Mac OS X and then point you to the various sessions that are going to talk about those particular technologies. We're actually going to change things up a little bit this year: we're going to recast the session as a graphics and imaging direction session, because a very important thing is happening at Apple. Our rate of innovation in the graphics technologies that we build into Mac OS X is incredible. We have fantastic technologies such as Quartz Extreme, a compositing windowing system that gives us the ability to create dramatic visual effects such as Exposé and also the fast user switching that everyone seemed to like yesterday, the cube that rotated around as we switched between user accounts. Additionally, we have incredible 2D performance, as shown by the PDF bake-off that happened at yesterday's keynote, and we continue to work with these technologies and continue to innovate. In my role as an evangelist here at Apple, I work with many of you to help you understand our graphics technology portfolio and leverage it in your products. What's come to our attention is the fact that we're really pushing the envelope in graphics, and with your release cycles being anywhere from 12 to 24 months, it's often difficult for you to see where the trend is, to see which technologies you need to invest in and build your products on to make sure you make maximal use of the platform and create really compelling applications for
our mutual users. So what we're going to do in this session is focus on two main themes that run through our technology innovation in graphics at Apple: PDF, and the ascendancy of the GPU as a way to do a lot of even 2D graphics work. So what I'd like to do is invite Peter Graffagnino, the director of graphics and imaging software, to the stage, and he's going to take you through the session. Thank you.

Thanks, Travis. Great, thank you very much, Travis. Good to be here at WWDC, good to see all you guys here. I think we've got a great week for you this week at WWDC, and today in particular we're going to take you through the overview of the graphics and imaging sessions, which are always fun sessions at WWDC. The first thing I'm going to do, for those of you who are new to Mac OS X or new to WWDC, is a brief overview of the graphics architecture of Mac OS X. Then I'm going to go into, as Travis said, two central underlying themes that we see in the industry and that we're taking advantage of in Mac OS X, and that I think it's really important for you guys to understand. The first one is about PDF, which we call Mac OS X's digital paper. The next is the GPU computing revolution, which I think is a really significant development. And finally we'll do the tour of technologies and show you what's new in Panther.

So first of all, the block diagram, a review for all of you. We've got our core OS, Darwin, at the bottom, based on FreeBSD and Mach; the graphics layers on top of that; then the frameworks, Cocoa and Carbon; and on top of that the user interface, Aqua. In the graphics area we've got all of our technologies: Quartz 2D for 2D graphics, OpenGL for 3D graphics, QuickTime for multimedia and video, and our compositing windowing system, called the Quartz Compositor. QuickTime you heard about if you were in the session before this one, which was a general overview of QuickTime directions. They've combined the QuickTime Live conference
with WWDC this year, so there's a lot of great content on QuickTime, but I'm not really going to talk about it right here.

Quartz 2D: I'll give you a brief overview of what's in Quartz 2D. Basically it's our 2D imaging model, based on the industry-standard PostScript and PDF model. This imaging model's been around for a number of years, almost 20 years now, and it's really a credit to the original designers that it's able to describe basically every page that's ever been printed. In fact Adobe's celebrating the 20th anniversary of PostScript, and they published a nice book that is an interesting read if you're into the history of this stuff. But anyway, our Quartz 2D imaging model is based on the industry-standard PostScript model, and it's really just a lightweight C library that implements the PostScript imaging model primitives. So there's no Display PDF, as some people have asked about, or Display PostScript or anything like that; it's really just a lightweight C library. We can read and write PDF with that library, so you can think of PDF as the metafile format for the graphics library. In fact, we knew we wanted PDF to be the metafile format, so we worked backward from that to figure out what a good API would be. We've got really fast anti-aliasing. As you see in the Aqua user interface, the quality of the presentation of the line art and graphics and icons is really important, so we spent a lot of time making things look good and run really fast. It has destination alpha, which allows us to record coverage information as we draw; this allows us to do composited icons and sprites that can carry along their anti-aliasing information as they animate or move around. One of the great things about PostScript is that it really revolutionized digital typography: it invented, you know, outline fonts, device-independent fonts. Over the years there's of course been Type 1, the original Adobe digital font format; there's been TrueType, that Apple invented; and
OpenType and others have come along. Apple has a great technology called Apple Type Services that Quartz 2D leverages to bring in all the typefaces and just handle them seamlessly for you, including a full Type 1 scaler. ColorSync is also built into Quartz 2D. It's our implementation of the ICC standard for color-managed workflow, and it allows us to manage color end to end in the Quartz 2D library. Obviously one of the big things about a 2D library is how it relates to printing, the whole WYSIWYG notion of being able to have the same imaging model for screen and printing, and so Quartz and PDF play a very central role in the printing architecture as well. The spool file we create when you draw into your printing context is just a PDF file, and that PDF file is then rasterized for inkjet printers using the same high-quality Quartz rasterization, or converted to PostScript for a PostScript printer. Again, we can use the end-to-end color management we have in the system to manage color through the whole process. That's the infrastructure we build all the graphics conversion around as part of printing. Just as important is the spooling and sequencing architecture and the networking architecture around printing, and for that we're using an open-source technology called CUPS, which stands for Common UNIX Printing System and is an implementation of an open standard called IPP. If you're familiar with the LPR/LPD suite that comes along with UNIX, this is kind of an upgrade to that. There are still command-line utilities that do the same lp and lpr functions, but CUPS provides a much more modern infrastructure for that. So that's Quartz 2D, to give you a brief framework in which to think about our 2D graphics.

For 3D graphics, it's OpenGL. Again, an industry-standard technology; it's been around for maybe 15 years, starting as the original IRIS Graphics Library on the Silicon Graphics machines, and Apple's
implementation is really a state-of-the-art implementation. We took our job very seriously in terms of building an OS infrastructure around driving graphics cards and virtualizing resources like video memory usage, and we have a lot of data flow optimizations to make sure things like multimedia and video can flow through the system very quickly. The other advantage we have is a unified driver model. Unlike on the PC, where you may get a monolithic OpenGL stack from a vendor, we have a lot of the stack under Apple's control, so we can expose extensions like rectangle texture or vertex program in a standard way across all of the vendors' hardware, which makes it a lot easier for you as a programmer. We also focus a lot on stability and robustness because, as you know, we use a lot of OpenGL throughout the operating system. So it's really not just for games, although it's a great game technology and we still focus a lot on games; we're really driving it throughout other places in the OS as well.

So, some of the innovative uses of OpenGL in the operating system. We have of course Quartz Extreme, which we talked about last year, which puts the entire compositing of the windowing system on top of OpenGL. There's also a bunch of innovative uses that are starting to emerge in consumer applications. For example, Keynote: the slide shows you're seeing here are all in OpenGL, and when Keynote goes full screen it's using OpenGL to render all the slides and the transitions. The conferencing screen view in iChat is actually done with OpenGL, so all the picture-in-picture you see, the animation and blending of the two scenes, the near and far scene, are all done with OpenGL. So we're really pushing it as a technology to use in innovative ways in consumer applications, and hopefully, based on some of the stuff you'll learn about this week at WWDC, you will want to use GL in your apps in new, creative ways as well. So that's
OpenGL. The last review bullet here is about the Quartz Compositor. The Quartz Compositor is our implementation of the windowing system; basically it composites all the layers together on the screen. It's based again on some pretty tried-and-true techniques in the industry: a seminal 1984 SIGGRAPH paper about the compositing algebra by Porter and Duff. They basically introduced the notion of the alpha channel, RGBA compositing, and a whole algebra for doing that. In those days they had one program that could render fractals and one program that could, say, render spheres, and you didn't want to run them both every frame if one was static, so you needed a way to combine the results after the rendering with high-quality anti-aliasing. That's where the compositing algebra came into play, and it's been used pretty much ever since as a real fundamental primitive of graphics. What we're doing is just using that in real time on the display to composite the output of all the applications to make up the graphical user interface.

So here's the block diagram of Quartz Extreme, which we announced last year, which is the implementation of the Quartz Compositor on top of OpenGL. The application can be drawing in whatever graphics library it wants, whether it's Quartz 2D, OpenGL, or QuickTime, and basically the compositor will blend those together using GL into the frame buffer. Now, Panther is kind of our third generation of this desktop compositing engine. In the first release of Mac OS X it was an all-software solution, with some hardware acceleration for things like moving opaque windows. In Jaguar we put everything on top of GL, and in Panther we've been refining it further to make it suitable for things like the Exposé feature that you saw. It's really the natural evolution of windowing systems. I really think users just expect windows to composite, and I think everyone's gonna be doing this
eventually. And Apple OpenGL: since we've been running the desktop compositor on top of GL ever since Jaguar, I think it's obvious that Apple GL is robust enough for 24-by-7 operation. The other thing that's key to understand about the Quartz Extreme implementation is that once it gets the textured polygons it needs to draw to make up the GUI, there are really no special OpenGL calls it makes. All of the acceleration in terms of minimal CPU copies, getting data to the frame buffer, the blending modes, and all those things is accessible to OpenGL developers. We did a lot of tuning on the data paths to make sure we did minimal copies, because when you're throwing around megabytes per window you really can't copy any data. But all of those advances are usable by you, and in fact the Keynote guys take advantage of a lot of those, because they have pretty big textures when they're compositing and doing transitions as well. The big thing that we did new with Quartz Extreme this year, one of the main things, was Exposé. I'm not going to give you a demo because you saw it in the keynote, but basically it's the idea of animating all your windows so you can see everything, and having a nice ability to pick up an icon off the desktop and, without having to drop it, release the windows and drag it into something. So that's basically the Quartz Compositor.

So there you have all the graphics technologies in Mac OS X; it's a brief review. Let me go now into the first of the extended themes that I'm going to talk about today, and that's PDF, what I call Mac OS X's digital paper. This is not really a general PDF-in-the-industry discussion, but how we view PDF as an OS implementer and provider, what we're using it for in the OS, and the kind of opportunities we're trying to make available for you guys as developers. So first off, this digital paper notion. The best way I can always think
of to describe to someone what PDF is is just to say digital paper, because it really is a digital representation of a printed page. In other words, it's got pagination built in, and it can represent the output from any application, so it's application independent. It can present output to any number of devices and handle any color spaces, so it's device independent. And it's extremely high fidelity: the imaging model behind PDF is the same as the imaging model behind PostScript, and as I said before, it's been able to describe basically every page printed in the last 20 years or so, so that's pretty high fidelity. And there are 500 million viewers distributed, so it's obviously universal: you can send someone a PDF file and be pretty well guaranteed that no matter what platform they're on, they'll be able to look at it. The way I think about it is that PDF is really a universal view-level abstraction in the MVC paradigm. When I put these slides together I didn't know MVC was going to be such a theme at WWDC this year, but I have a brief review of MVC from a graphics guy's standpoint.

MVC is a standard way object-oriented programmers have thought about factoring code into a model, a view, and a controller. In my example, the model is the application data structures: the variables, the algorithms, and the data behind the model of the program. The view is a visual representation, so it could be a pie chart or a bar graph, whatever you want; the idea is that you could have multiple views of the same model data. And the controller might be a user interface area where you let the user type into a field to change the data, to manipulate the model. So the value of MVC is that you've factored your code so that a model can have multiple views. If you have a pie chart or a bar graph, they keep working if you change your implementation of how to recalculate the
spreadsheet. Models could also have multiple controllers, so you could have a simple user interface, an advanced user interface, or even a scripting interface, and have your model code be the same. From a graphics standpoint, the interesting thing is that any model, whether it's a database, an audio file, or whatever, can project itself into a 2D representation to show to the user; otherwise the user wouldn't be able to run the app, because they couldn't see the data. And so views of vastly different models can share a common visual language. This is how graphical user interfaces work when you think about it: you basically have similar-looking windows presenting vastly different models, depending upon which application is being run. It's the same thing with documents. In a compound document you may have part of the document coming from a spreadsheet, part coming out of a database, and part from a text flow, and because you can project all those things down to a view-level format, which is, say, a PDF file, you can have them all share the common visual language even though the actual data backing them is pretty different.

So let's move on here. I might have to use the old-fashioned way here. OK. Document-based applications are really MVC in their nature. You can think of the model as the application file, the data structures in the application; in this case, Excel, the Excel file is really what you can think of as the model. The view would be the document window that comes up, where you get the row-and-column representation of the model. And the controller would be the user interface of the application, all the menus and panels that you interact with. I'll try this one more time. There we go. So the interesting thing to realize from a graphics standpoint is that when you create a PDF from your spreadsheet, it's no longer a spreadsheet. You've projected that model into a view, and you really just have a
representation of a spreadsheet. I think it's really important to keep the difference between models and views in mind, and to look at how PDF serves in that light. So, model versus view: imagine I'm interacting with someone, and I have a choice of whether to send them the Excel file or send them the PDF that represents the data in the Excel file. Really, both are valid choices. On the left-hand side, with the model, I get a very high-fidelity model representation; I can exchange data, and they can modify the model if they want, or change the formulas. But it requires that they have that application, and also any fonts or plug-ins that I may have used in creating that model. On the other hand, on the view exchange side, if I send them a PDF, I can guarantee that they're gonna be able to see it, because there are 500 million viewers out there. They're gonna see the right fonts, they're gonna see everything. They're not gonna be able to interact with the data or change the basic calculations that go on, and in fact in many cases you don't want that, but they are going to get a high-fidelity view representation.

One of the reasons why I think this is important to lay out is that if you look at the typical File menu of an application, we sometimes get asked why Save As PDF isn't just in there, and I think the reason has to do with this model-versus-view idea. Normally in the File menu, when you're talking about Open, Close, Save, or even Export, you're talking about exporting or saving the actual model data of the application, which could be anything. But when you talk about saving as PDF, what you're really talking about is essentially going through the print process, the mapping of the model data to a paginated representation, but just not sending it to any device, just keeping that around as a digital file. So saving as PDF I really think of conceptually as just printing to digital paper, and that's really the notion, I think, that carries us
the farthest in terms of how to think about this. What really hasn't been said before, but I think is a really important point, is that Mac OS X is the first commercial OS to have a system-wide standard for digital paper that's universal, and PDF truly is universal in terms of being able to view it.

PDF is not a great model-level construct. There are some cases where it's possible to encode flowable text in PDF, and this can be interesting in closed-loop situations, but the thing to remember here is that you're starting to impose model-level constructs on what is essentially a view-level idea. If you think about text flow models, they're actually quite complicated. A precise specification would include your styling model; your obstacle avoidance model; how your containers connect and flow; a pagination model, whether you allow widows and orphans and those kinds of things; a justification model; and a hyphenation model, which is then going to require a whole language dictionary so you know how to break up words. So a complete specification of an actual text flow is fairly complicated, and I think kind of beyond the scope of PDF, and it's certainly not a universally agreed-to notion of how to flow text. The main reason I think that's true is that it sacrifices the universality. The amazing thing about PDF is that everyone agrees it's a universal abstraction for marks on a page. I don't think everyone necessarily agrees that there is one universal abstraction for flowable text documents or spreadsheets, or even that there should be one. So our basic idea is that models are precisely where applications innovate and differentiate, and we want apps to innovate and differentiate but just always be able to print down to a PDF, because then we can take that through the PDF workflow. So again, our strategy is basically to use PDF as digital paper: use it as a
final-form format for paginated documents and also vector artwork; use other formats, such as XML, to encode model data (that's our advice, and it's what we do with Keynote and other applications); build a really rich framework for processing this digital paper in the operating system; and encourage applications to differentiate themselves by developing innovative models. So long live MVC separation. That's my little talk on MVC.

Let's see how, once we get all the applications projecting their models into PDF, that really benefits the graphic arts workflow. Here I've drawn a sample graphic arts workflow. You can think of it as starting with application-dependent files, removing the application dependence to get to some kind of digital marks-on-a-page representation, and then finally going device dependent, either rasterizing to a set of TIFFs or CMYK separations. Basically you go from application dependence to application independence, and from device independence to device dependence. One of the traditional ways this is done, and this is really the key that made PDF possible, is that all applications, whether it's Word, Illustrator, Quark, whatever, can make PostScript. What Adobe realized is that if they can all make PostScript, we can rebind the PostScript language to this format called PDF that's easier to read and to have a reader application for. So this is a workflow that was popular when PDF initially came out: basically everything projected into PostScript, then you'd run a Distiller process and create your PDF, and you could maybe tune it for the web and downsample the images, or tune it for print optimization or prepress.

Now, that's great, but I think there's really a better way to do things. I talked about creating a purposed PDF for the web or for print, but isn't PDF itself device independent? Suppose I create my web-optimized PDF and later I realize I'd rather have it print optimized. Well,
then do I keep that PostScript file around so that I can go back to it and repurpose it? Or do I keep the application file around, in which case I need the application? So do we really want PDF, sorry, PS, to be the application-independent digital master? I think it's pretty clear the answer is no. If you think about taking this picture and just replacing the hub with PDF, that's really the architecture that we've been going for in Mac OS X. The nice thing about putting PDF in the middle there is that it's really a better-suited digital master format than PostScript. Again, it's viewable on all these copies of Acrobat out there, and it really can losslessly encode the application's intent, since it's exactly the same imaging model as PostScript. That digital master PDF that sits in the middle doesn't have to pass judgment on what the application's trying to draw. If the application's trying to draw high-resolution images, or weird color spaces, or TrueType fonts or Type 2 or OpenType fonts, I don't really have to care, because I just want to record what the application's telling me to draw, and then later on I can purpose it out if I need to. And the nice thing about having PDF as the center of the hub, just recording what the application drew, is that not only can I make a late-binding decision about whether I want to go web or print, but I can also develop all these device-independent PDF processing tools. Instead of this opaque PostScript file in the middle, for which I need a language interpreter and can't even tell you how many pages are in it until I execute the document, I now have a much better device-independent representation there, and I can write little tools that do cover pages or imposition or whatever on top of it. So that's really why I think it's key to have PDF in the center of that.

PDF processing on Mac OS X is a pretty popular thing. There are over 50 little PDF applications, some not so
little, on Mac OS X, some from big names, some from small names, and Panther is gonna bring new opportunities. This is one of the thrusts with Panther on the 2D side: really getting to leverage the possibilities with PDF. We've added a bunch of new PDF workflow tools in Panther addressing PostScript file handling and user-level scripting; we have something called Quartz PDF filters; printing to a PDF workflow; and PDF introspection APIs. Let me go through each of those briefly.

PostScript-to-PDF conversion basically allows us to deal with legacy PostScript files, not necessarily putting PostScript in the middle but providing a graphically lossless transformation from PostScript to PDF. It works with EPS files as well. It's based on a real PostScript interpreter. It's not a replacement for Distiller, and it doesn't do a lot of the finishing options you have in Distiller, but it allows us to graphically transform PostScript to PDF. The other nice thing, on the printing side, is that it allows us to accept PostScript jobs to any printer from, say, a Windows client or a Mac OS 9 client. So again, back to the picture: now PostScript can feed that hub, so we can take PostScript files, convert them to PDF, and then have them in that hub, subject to all the transformations that all the great PDF tools you guys are going to write can do.

Another aspect of the PostScript legacy that was lost along the way was the ability to have some user-level programmability. PostScript was actually a programming language, and some people used it to dynamically generate graphics, so the PostScript program would calculate the picture rather than just being the output from a driver. PDF can't really do this. But the reverse Polish notation of PostScript and its abilities as a language also made this a bear, and it always turned out to be more difficult than you'd like it to be. Nonetheless, some people found this incredibly useful, the ability to
have simple scripting bound with the 2D graphics. And although I never used PostScript as a pickup line, I found this on the web, which was pretty amazing. So what we've done is take the Quartz 2D API and a real scripting language, Python, and create Python bindings for the Quartz 2D API. It's basically the C language API entry points that you can see in the Quartz header files, just bound into a Python interpreter with a module, and this allows simple PDF processing from scripts. We've added some convenience functions for dealing with QuickTime, for getting images in and out; for dealing with Cocoa, in terms of drawing HTML, RTF, and Unicode; and also for dealing with our PostScript-to-PDF converter, so you can read PostScript and EPS files. I think this is going to allow a new generation of script writers to write simple tools to manipulate PDF, and it's really handy for small one-off processing scripts. One of the reasons we did this ourselves is that we had a bunch of places in the printing path where we just needed to make a simple change to the document, and rather than write code it's just easier to do it this way. So you can imagine some examples: rasterizing an EPS file to a bitmap and exporting it via QuickTime; doing some advanced imposition algorithms for booklet printing or something like that; just concatenating two PDF files together, which is pretty few lines of code. Adding a cover page or watermarks to a PDF is pretty easy. Dynamically generating graphics on a server is something I think is going to be very interesting, in terms of being able to have all this power in a server-based application. Unlike PostScript, you can't really override the marking operators, you can't redefine show and showpage, but for now we consider that to be a feature, because it's a little bit unmaintainable. As an example, I actually went off and looked at the old PostScript Blue Book. I don't know if
any of you guys remember that book, but there's a simple example in there called wedge, and it draws this little starburst here. That's the PostScript code on the right, and the Python code. The Python code's a little bit longer, but probably more readable, certainly these days.

The next thing I'm going to mention briefly is Quartz PDF filters. Quartz PDF filters allow us to do some transformations on PDF, mostly dealing with color space transformations. You create these recipes in the ColorSync Utility; in the ColorSync session they're going to demo this and talk about it. It can be used for color conversion effects, and we also use it in the printing path: when we're going to a black-and-white printer over a slow link, we want to get rid of all the color and bring it down to black and white before we send it over the wire. It does have some image resampling and compression options built in as well.

Print to PDF workflow: this is something we announced back in one of the Jaguar updates. It allows you to extend the Print panel, again leveraging print as the hub where PDF workflow takes off. Basically that Save As PDF button in the Print panel can grow to any number of options that the user wants, in terms of applications that can open and deal with PDF. For example, you could send the PDF off to Illustrator and work on it, or send it to Mail to bring it up in a compose window, or encrypt it, or whatever. So this is a great place to hook in if you're writing a little PDF processing tool, and some of you guys have already done this; it's a great way to leverage into the system.

And lastly, our PDF introspection APIs, which are new APIs in Panther that allow you to have complete access to the PDF document structure as a tree of objects. It basically models the dictionaries, streams, strings, and arrays that are in the PDF file itself. It doesn't go and model the internals of the graphics streams, but it is useful for
extracting things like links, annotations, metadata, and stuff like that. So enough talking for now. Let's bring Ralph up to the stage and give you a demo of some of the stuff we have as far as PDF processing in Preview in Panther.

Hello. So what I'm going to show you first is Preview in Panther, which makes use of these new PDF introspection APIs that Peter was mentioning. I'm opening a PDF document here; it's one of my favorite documents, and it has table of contents information embedded in the file. In Panther, if that's the case, then the drawer to the right pops open and shows you the table of contents. You can navigate through chapters and see, for example, all the instructions that the Velocity Engine has, and click on one of them. Well, there's vec_floor. Similarly, we also added search for PDF, so I can actually look for the string "not equal", and it shows me all the pages where that expression occurs. Again, I can click on that page and jump to the corresponding page.

One thing that Peter was mentioning is the PostScript-to-PDF conversion, and what I'm going to do is essentially just double-click a PostScript document from the Finder, and the PostScript-to-PDF conversion kicks in and it will open in Preview briefly. Yes. So what you see here is a paper I got from the internet. It's in PostScript, and it just got converted. Text is there, line art is there, all the mathematical formulas are there, so pretty much what you would expect. And because it's now PDF, all those tools we have to work with PDFs also work on the converted PostScript file. So, for example, I can go and search for the word "image". Let me zoom in here, and it's actually highlighted for you. One of the coolest things, I think, is that you can even copy and paste out of the document. I select a paragraph here, copy, and I get a text representation of the part I chose. OK.

The next thing I would like to show you is Quartz scripting. What I have here is a Python script that, well, let's just
go through it it opens a PDF file an existing PDF file and it creates a new PDF file and then it enumerates all the pages in the PDF file gets the size of the particular page creates a new page in the output file and just draws the content of the original page into the output file so so far we didn't really do anything exciting it's pretty much a copy operation and then we add some custom drawing in this case we add red text to the margin of the page once we're done with it we tell the system to open that file with the default PDF viewer of the system so what I'm going to do now is well let me tell you first if you put that script into the PDF Services folder then it will appear in the print panel as Peter was showing so if I go to the print panel where is it here and it takes a second so I'm gonna print this script and I'm gonna print it through the script so it's kind of a naturist thing to do so let's see and there it is so I have the printout of the little script and it added that confidential mark on the side oops so just to make that point perfectly clear this works with everything so I can take my PostScript file that I just had before print it through the script and get a confidential mark on any page well as the paper was published it's probably not confidential but okay that's it for the PDF demo great thanks Ralph so for all those of you who have your theses locked away in .ps files that you can't read anymore you can search them now I know a few people like that okay so Quartz and PDF summary PDF really provides Mac OS X with the universal representation of digital paper and we plan to leverage that a lot more in the operating system and hopefully build lots of opportunities for you guys as well to process PDF and our strategy is really to continue to build on it as a high level of abstraction a final form presentation metaphor for PDF and I hope we've convinced you that in Panther we're adding a lot of really significant tools to the PDF
toolset and as my last comment I'd say if you're still you know thinking PixMaps and GWorlds come join the Quartz 2D party go to the sessions learn about it it's pretty fun stuff so that's it for the PDF and Quartz 2D section I'm going to change gears a little bit now and talk about kind of another development in the industry and this one's you know much much bigger than Apple it's kind of going on right now and I'm going to call that if my slides work here the GPU computing revolution there's really something going on now in terms of the ability of graphics processors to compute graphics at much higher rates than even CPUs can I might have to walk over here so first let's look at some numbers so when we did Quartz Extreme obviously we knew that GPUs could composite faster than CPUs and if you look at a typical high end GPU today you're getting eight results eight pixels per clock at about 400 to 500 megahertz and twenty and for some of the new ones even 30 gigabytes per second of memory bandwidth even a really nice CPU like the new 970s is one pixel per clock at two gigahertz with 6.4 gigabytes per second of memory bandwidth and so by utilizing the GPU not only is it faster but you can free up the CPU to do lots of other stuff in your applications as well and what I'm trying to do in this talk now is kind of convince you that this trend is sustainable and that if you have calculations in the area of graphics that you can move to the GPU with some of the programmable options you really should be looking at that so here's a little historical data of GPUs versus CPUs in terms of performance the CPU number is 160 percent a year that's just the standard Moore's law doubling every 18 months on a yearly basis that works out to 160 percent that's the orange curve the GPU curve is what historical data for GPUs has been over the last few years about two hundred twenty percent per year in terms of
computational bandwidth increase so as you can see if your software is riding the red curve and your competitor's software is riding the orange curve you've got a pretty good ride ahead of you and in fact over time that gap is going to get bigger and bigger and the reason this is possible is the capability growth in just the silicon technology is increasing at an even higher rate if you look at how many switches you can throw per second on a chip some numbers indicate that that's even growing at 270 percent per year in terms of transistors per unit area and clock speeds there's a lot of headroom here but what do GPU designers know that CPU designers don't and the answer really is nothing a CPU is a general-purpose computer and a GPU is really not a general-purpose processor the computation that goes on on a GPU is what some call embarrassingly parallel which I think is a pretty funny term and really entertainment applications like games have provided a sustainable economic model where people could invest in building this VLSI to do massively parallel computations for graphics and I think that's going to sustain for a while and one of the ways to look at this is what I call computational intensity which is how much math do you do per gate on your chip and one way to approach this is to look at GPU design versus CPU design if you go to a GPU designer and you say okay your next generation chip I'm going to give you double the number of transistors he's probably gonna come back with a design that goes twice as fast in fact if he doesn't he's probably not gonna be working for you for very long a CPU designer you give him double the gate budget I think it's pretty hard for him to come up with double the performance and usually it's much less and the reason for that is a general purpose CPU since it's a decision-making engine has to provide a lot of non computational logic instruction and data caches you know some chips are up
to you know thirty to forty or fifty percent cache branch prediction logic very complicated instruction reordering data hazard avoidance in the load and store units a lot of logic has to go into keeping the math units fed and the difference is with GPUs they operate on a different computing model that's called stream processing and that's where you have data records flowing through a small compute engine coming in getting operated on by a small kernel of code and being spit out as data records on to the next element in the chain so you can imagine the GPU as thousands of these little stream processors and each one gets its input from the previous guy does its calculation and passes its output on to the next so you have very high-speed dedicated producer-consumer links with no data hazards or anything between these pieces of the chip you can get huge wide bandwidth and the stream processor stays busy all the time if his job is to add two pixels he can just sit there and do it clock after clock after clock and CPU guys only wish they had that luxury to know that there was always work to do a lot of CPU logic is just finding work to do most of the stream execution units within a GPU are hardwired to the particular graphics problem but there is some programmability in some key places that's emerging and that's what I think is really the important trend for you guys to understand let's look at the graphics pipeline a little bit the traditional graphics pipeline starts with primitive submission on the top where you basically convert primitives into a vertex stream a vertex can be thought of as a bundle of position vector normal vector color vector there's actually up to 16 arbitrary 4-vectors that you associate with each vertex then the vertices are processed so the spatial vectors are transformed by the appropriate matrix lighting might be calculated texture coordinates are transformed basically performing operations on each of these bundles of data
with each vertex then you go through what's called triangle rasterization where the vertices are put into a triangle and basically interpolated through the interior of the triangle we take the attribute bundle at each vertex and calculate what's called a fragment which is basically the intermediate values sort of the weighted average of how far they are from each vertex then the fragments are processed so we apply texture mapping we calculate what the final color and Z for that particular pixel is going to be and then the last step is fragment rendering where we do the Z check against the frame buffer we do alpha blending compositing fog those kinds of things and so that's the whole graphics pipeline and this is sort of the way it's been for a while and what's happening is the GPU vendors are opening up these two parts of the graphics processor as being programmable so you can think of that little stream execution unit kind of getting plugged into the vertex processing and fragment processing areas so when you hear those terms that's really what's going on you're having a small little dataflow engine that can do a small calculation per vertex or per fragment when I talk about fragments I also sometimes talk about pixels so I think sometimes it's easier for people who are new to 3D to think about pixels and that's okay with me so I'm gonna talk about pixel programming so pixel programming uses a data model similar to AltiVec the programmable units operate on 128-bit 4-vectors of floats obviously your texture lookups may come from 8-bit data but it's all expanded for you into floating point and you basically get to write a small program that's executed per pixel to calculate the result of the output pixel so you don't get control over the blending and the z-test and all that sort of stuff but you can get total control over the source color of the pixel you get access to the iterated values so
those vertex attributes at the corners of the triangle you get access to what that value is for the particular pixel that you're drawing you can look in some global data you can do memory reads off of textures you can't look at the destination pixel you can't know what you're gonna draw to you just leave your output in a special result register and there's a lot of powerful ALU instructions like power or reciprocal square root cross-product all these kind of things and a bunch of swizzle instructions for changing order on the data units in the vectors so this is pseudocode this is not in any particular language just to try to communicate the simplicity of the job of a pixel program or a fragment program basically they all have the same function signature calculate pixel returns a result gets the iterated values and it can do whatever computation it wants it has access to some global constants as read-only and a global set of textures that it can go look up values in and that's really all it does now I'm gonna explain to you why I think that once you constrain the problem like that you can make it go real fast and why GPU designers kind of have this advantage with the parallelism so this isn't how any particular chip works but just how if you constrain the calculation like that you might be able to design hardware to make it go fast so first let's consider just a four instruction long fragment program and I have one vector unit right now so I'm going to just do sequential processing through it I'm gonna send my fragment values into the vector unit and then basically clock the instructions through one after the next and after four clocks have gone by I get my result because I have a four instruction long program so I'm basically doing one operation per clock I'm able to keep the processor busy every cycle and I get one result every four clocks so no big news there that's just sequential processing now there's a lot of data
level parallelism in this problem pixel calculations because of the way fragment programs are constructed are independent so the result pixel at a particular location can't depend on the result of a pixel at another location the pixel calculations can basically execute in parallel so I call this parallelism in space because it's kind of in the plane of the triangle what I can do here is just replicate the fragment execution unit in width and just do the same instruction on multiple vectors at one time so here's how I might have that laid out now I have eight vector units I can feed in simultaneously eight fragment values into the pipeline and then basically sequence through my instructions one at a time and at the end get my eight results out so that gives us eight vector operations per clock we're able to keep all eight of those busy every clock we get eight results every four clocks because it's four instructions long so an average throughput of two pixels per clock but wait there's more instruction level parallelism is also there because again we're constrained we can't really write to any memory the machine state's only going to differ between one instruction and the next by the register that was written to in the previous instruction so you can imagine compilation techniques or even maybe some hardware register renaming techniques coming into play to just build a little pipeline out of a sequence of instructions so let's look at how that might work now we're going to line up the vector units in time and basically I'm going to teach each vector unit about an instruction so vector unit one gets instruction one vector unit two gets instruction two and so on and now I'm going to feed my fragment value in the top operate on it with instruction one in the first vector unit move it on to vector unit two meanwhile I feed the next fragment value into vector unit one and kind of keep the pipeline full
and once the pipeline is full I'm really just processing again keeping all the units busy at the same time and so now I'm fully utilizing all my four units four operations per clock I'm getting one result every clock once the pipeline is full so obviously the next thing no surprise is to put both of these things together and basically exploit the time and the space dimension at the same time so I just you know get a bigger chip you know drag out more vector units and make sort of a 32 element fabric where I've got them four by eight just for illustration purposes and now basically I can teach each row about a single instruction feed the fragments in fill up the pipeline and then basically get eight results per clock while doing 32 operations per clock so I think you can see that as time goes on and maybe today that number is 32 tomorrow that number's 64 I mean you're not gonna run out of gas in the space dimension until you begin to approach the average size of a triangle and for image processing things where you're rendering big triangles there's lots of parallel fragments and in time you know this is only a four instruction program you can obviously imagine much longer programs so there's a lot of headroom just in terms of parallelism so basically my argument is that this will sustain and I think the chip capabilities certainly haven't peaked I think the parallel computing possibilities haven't even come close to peaking yet we've got sort of eight by four as I drew here and I think there's still lots of headroom for that and I think the entertainment industry's going strong and has not peaked and as operating systems and applications get into the game I think we're just gonna add fuel to this fire and really have an interesting world where people are doing massively parallel computations on the GPU to free up their CPU to do other stuff as well so you might ask yourself that sounds
great how do I do it and the answer is you use OpenGL in Panther we have standard cross vendor programming languages at both the vertex level and the fragment level those begin with the ARB prefix which is the Architecture Review Board for OpenGL so you don't have to learn vendor specific extensions there are higher-level languages being worked on and those will come out as well but assembly level language will always be available and I think these architectures are so new you know we found like you get rid of one temporary and the thing runs twice as fast so people are probably gonna be tweaking assembly on things for a while programs are relatively small so maybe that's not a big deal but the higher-level languages are coming as well so without further ado let's bring Ralph back up and show you some fragment programs we've got a Radeon 9700 plugged into this machine over here which is running the ARB fragment program go ahead Ralph ok so the first thing I'm going to show you is the OpenGL Shader Builder application that is in Panther and well you see it here on the left side you have your little fragment program and on the right side you have a very complex OpenGL scene which consists of a single rectangle so that rectangle has a texture on it and there is actually a fragment program running right now that does that texturing so if you look on the left side what this fragment program does is it goes to the texture in texture 0 uses the current texture coordinate to look up the color at that point and then copies that color to the result so you get a textured quad well this isn't terribly exciting because non-programmable hardware does exactly that for you so there's no point to actually write this program but we can go and modify it a bit so for example instead of copying the color back to the destination pixel unmodified I can say well only copy the green and the alpha channel and well the red and blue channels are lost so you see the result here
it is this rather unhealthy looking cat okay so let's add an instruction in the middle here after we did the texture lookup we take the red component of the color and square it and then copy that and the green and blue to produce some kind of a sepia-tone effect but it doesn't look right so let's tweak the exponent a bit let's say we take that one to the fifth yeah
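
The three-instruction fragment program from the Shader Builder demo can be approximated on the CPU as a rough illustration. This is a minimal Python sketch, not the ARB fragment program shown on stage: the names `sample_nearest`, `fragment_program` and `draw_quad` are hypothetical, the texture is assumed to be a list of rows of (r, g, b, a) floats in the range 0 to 1, and nearest-neighbour lookup stands in for whatever filtering the hardware applies.

```python
def sample_nearest(texture, u, v):
    """Nearest-neighbour texture lookup (an assumption; real hardware may filter)."""
    height = len(texture)
    width = len(texture[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texture[y][x]


def fragment_program(texcoord, textures, exponent=2.0):
    """Run once per pixel: look up the texel, raise red to a power, pass the rest through."""
    r, g, b, a = sample_nearest(textures[0], texcoord[0], texcoord[1])
    # The returned tuple plays the role of the special result register.
    return (r ** exponent, g, b, a)


def draw_quad(width, height, textures, exponent=2.0):
    """Stand-in for drawing one textured rectangle: every pixel runs the same program."""
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            # Iterated texture coordinate at the pixel centre.
            texcoord = ((x + 0.5) / width, (y + 0.5) / height)
            row.append(fragment_program(texcoord, textures, exponent))
        image.append(row)
    return image
```

Each call to `fragment_program` depends only on its own inputs and read-only textures, which is the independence property that lets a GPU run many of them at once. Changing `exponent` from 2 to 5 mirrors the tweak made in the demo.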

don't feel like this until you have to

look you're going for now what you noticed is whenever I type in the background the program is compiled and run and shown to you right away so you get immediate feedback on what your program does which is very nice to experiment and to get into things okay that's it for Shader Builder now that I've shown you what you can do with three instructions let me show you what you can do if you put a bit more effort into things okay so by the way this picture has been called demo monkey and so have I so the first thing I'm going to show you is a motion blur effect implemented as a fragment program the interface is essentially I click somewhere and drag the mouse in some direction and I get a motion blur in that direction so the first thing to notice is it's a pretty smooth frame rate and really the GL commands that are going on here are there are four vertices in the four corners of that big rectangle and then it says draw so how you run your fragment program is you draw a rectangle essentially and then for every pixel the fragment program gets executed to do that effect so another effect we tried is an axial blur like this you can set the focus point to wherever you want it to be like this so because what the CPU does here is really negligible I mean it just says you know here are four vertices go and from then on it just waits until the result is done so this is a very very nice effect that is fairly expensive to compute but it has pretty much zero CPU cost let me show you a different one a glass distortion effect so you can make the bumps bigger and smaller and actually move the glass around like this okay oops the last effect I would like to show is like an emboss effect so what this does is it takes the picture we had before and interprets the brightness values as hills and valleys in you know like some kind of relief thing and then puts a spotlight on it so you can actually go and drag the spotlight around make the beam wider narrower
things like that and again you have to do that for a while to realize this doesn't use any CPU at all it's really just you know the only thing the CPU does here is update the sliders okay I think that's it for the fragment program thanks Ralph so in summary for this section of the GPU talk I hope I've convinced you that GPUs for certain classes of data parallel workloads and algorithms really have an advantage over CPUs and I think that this advantage is going to be sustainable for at least the foreseeable future and you can get access to this power via OpenGL and my advice to you is learn how to do this stuff before your competitors do because some pretty cool things are going to be happening so the last section I'm going to talk about today is another effort we've been working on which is Quartz 2D on OpenGL Quartz 2D on OpenGL basically accelerates Quartz 2D by turning it into GL calls and it's really the logical next step after Quartz Extreme it ties together the two kind of key thrusts we talked about today of programming the graphics processor and PDF 2D and in Panther we're going to have an initial implementation of this that really focuses mostly on GL developers who want to get high-quality text and line art into their applications which has always been a really difficult thing to do with GL so the way that you do this is you take your GL context and you pass it to a function called CGGLContextCreate you give the size of the CG context you want and the color space you want to render into and you just make CG calls on that CG context it's high-quality 2D rendering it's virtually identical to software quality we do use the alpha blending in the hardware so it's not pixel exact you know the values aren't going to be exactly the same but it basically looks indistinguishable from the software and it's anywhere from three to ten times faster than Quartz 2D software rendering the way it gets to be on
the order of ten times faster is when we can actually cache things in the graphics unit so we will cache text fonts glyphs as textures and if you hold onto your CGImageRefs or CGPatternRefs and reuse them the implementation will cache those in video memory as well so you can draw very quickly with that well the reason we're kind of calling it an initial implementation right now is because there are certain Quartz 2D operations that are not yet supported by this context for one thing we can't do the high quality LCD text that we have in the system without relying on fragment programming and since fragment programming is kind of at the high end right now and it's coming down through the system we can't really kind of turn on Quartz 2D acceleration everywhere right now the other thing is the PDF 1.4 blend modes which are also going to require fragment programming also you have to be using the Core Graphics API only or the Quartz 2D API only you can't turn around and draw something with QuickDraw you can't ask for LockPortBits and you can't use high-level frameworks on this context so you pretty much have to be going right at it with the Quartz 2D API we do require Quartz Extreme capable hardware so we have the non power of two texturing which is important for drawing images and things like that and generally as this path becomes available I think applications are going to have to revisit some of their assumptions about the cheapness of accessing the drawing buffer the window buffer so it's another thing to be aware of in your usage model if you want to use this kind of acceleration so the basic way I tell this story is that having a wide pipe is really great but it also increases the cost of reading back pixels and turning everything around so if you imagine you're on a little stream towards the frame buffer and you drop a few pixels in and you realize you want them back yeah maybe you can just reach down
the stream and grab them but if you drop them over the top of Niagara Falls yeah we got to stop the falls let all the water fall down climb down there and get the pixels bring them back up you're not gonna be running any faster than software in fact in some cases slower so to use Quartz 2D on OpenGL you have to you know buy into the whole asynchronous model and the pixels you know they'll show up when they show up I never want to look back at pixels that I've drawn except very rarely otherwise it's just not going to be any faster for you so I'll invite Ralph up one last time for a demo of Quartz 2D on OpenGL okay the first application I'm going to show you is iChat as we've seen in the keynote iChat now has this video conferencing feature the way that videoconferencing view is implemented is actually fairly interesting it is an OpenGL scene so the video is put on a texture and this is then displayed oops I think I'll be your camera my camera man that's a good point okay so video is put on a texture and then GL composites that texture and the reason why that is done is if you have a two-way conference going you have this little picture in picture which has a drop shadow and sometimes there is content being shown as translucent alert messages lying on top of live video and stuff so GL is really good at these kind of things and it also does the conversion from YUV to RGB so that takes all that load off the CPU so the CPU is free to do the actual video encoding and the networking stuff that is necessary so this is a slightly modified version of iChat to make this demo what I'm doing here is I'm drawing text on top of it and this is text drawn into an OpenGL scene and it's not text on a texture it is essentially CGContextShowTextAtPoint and then point it at that texture so you just draw into it and you get you know all the font management the kerning and all these kind of things that are usually very hard to get in OpenGL so a second example I
have here I have a PDF document this one here has a little frame and a bit of line art in the corner by the way the way I made this PDF document I wrote a little Quartz scripting script a Python script that takes the places to line out and draws Noble so I'm going to take that and drag it into my view so we put a little oscillator on it to change size but the point here is again this is not drawn into a texture every single frame we reinterpret that PDF and draw it on top so the scaling you get here is not scaling of a bitmap it's vector scaling okay so let me show you something about the performance of these things what I'm running here is a PDF document it's the AppKit reference manual which is a thousand three hundred pages long and I'm just trying to flip through the pages as fast as I can and this is the standard Quartz software renderer that you get in Panther and you see that little frame rate meter in the top right corner yeah we're getting you know it's kind of low fifty to sixty frames per second so that's actually when you think about it pretty impressive it's you know sixty pages per second that definitely beats the LaserWriter okay but now I will replace the content view here with an OpenGL view and then point the same PDF file at the Quartz OpenGL renderer and it gets a bit better so we are around 160 to 180 pages per second now and literally the piece of code that needed to be done there is something like three lines of setup so the rest the actual rendering code is exactly the same so why is it fast well it's what Peter said about that asynchronous model so drawing a PDF involves well you have to do parsing and you have to do decompression of the stream and then you draw so what happens in this case once you did the decompression of a graphics primitive you just submit it to the GPU and while the GPU is doing the graphics you're ready to decompress the next part and do the additional parsing so you have now the CPU and
the GPU in parallel very nicely so that's by the way the little CPU meter you see there this is a two CPU machine so it runs at 50 percent so one CPU is fully busy well you might wonder now well what if I take all that parsing and decompression stuff out of the loop because when my application draws it doesn't do that it just calls the CG API and then I will say well I'm glad you asked because that's what we did we essentially took all the graphics primitives out of that PDF file and wrote a little big list and then just call the CG APIs to do the same drawing so it's technically no longer the drawing of a PDF it's drawing of the PDF content and when I do that things start to look like this see that the CPU is still busy actually the CPU is still completely saturated with sending these commands over to the GPU so I would consider that a unique opportunity but we'll see ok that's it for the yeah that's pretty amazing I think Ralph was telling the story about when we were bringing this demo together and they had to go back what twice to make the frame counter go higher yeah 100 that should be enough no 200 that should be enough 400 okay so anyway so the recap of kind of the talk today is basically that there's really a new era of innovation in platform graphics that's happening in the industry and I think Mac OS X is kind of leading the charge here and it's really kind of a state of the art visual computing platform for all your applications so in your apps please leverage all the great infrastructure we're building in Quartz 2D and OpenGL and I think unlike sort of in the old days there was a period where people had to work around operating system infrastructure to try to do what they wanted to because of limitations in QuickDraw or GDI or whatever platform graphics was there well I think we're kind of coming upon a new era where the platform graphics is getting good enough that you build on top of it and really can go
places you don't have to go back and reinvent all of your blit loops again let us do that work and you guys add great value on top of that and with all these new things going on in the industry particularly with OpenGL and fragment programming it's really fun to go just learn a few new tricks and crack open the GL book or something like that and teach yourself some new techniques because there's a lot of new stuff happening so for the last few minutes of the talk what I'm going to do is briefly whiz you through some of the pointers to other sessions that you want to check out if hopefully we've piqued your interest during the talk today I'll start with Quartz 2D some of the new things in Panther for Quartz 2D are PDF 1.4 support the PDF introspection API we talked about the Quartz scripting with the Python bindings CMYK rendering contexts so now we can drive raster printers in their native color space numerous performance optimizations and of course as you saw Quartz 2D on OpenGL there's a Quartz 2D in depth session on Thursday and then an intro to Quartz Services which talks about some of the display management infrastructure on Friday for Panther obviously we're shipping 1.0 of our X11 implementation as we talked about yesterday this will be merged with XFree86 4.3 it's on your seed we have double clickable X11 applications now and full-screen mode operation and a lot of bug fixes so check this out on the seed if you're an X developer it's all on the disc you have also new for Panther in printing we've got PostScript support we're really excited about that the ability to basically have any Mac OS X attached printer be seen on the network as a PostScript printer user interface improvements we've got an improved version of the old desktop printers which is coming out in Panther we have job submission APIs if you know how to calculate your own PDF file or your own PostScript file you can just hand that to the spool system directly and not
have to watch that page-counting dialog go by. We're merging with CUPS 1.1.19, the latest version of CUPS, and we're also going to be shipping the Gimp-Print drivers this time around, which are great for legacy printers. Gimp-Print supports a whole bunch of printers, and the drivers are really first-class citizens with their user interface and integration. The printing session is on Thursday, so you'll want to check that out.

For ColorSync, some of the new things: if you went to the QuickTime talk, you saw Tim talk a little bit about ColorSync and its relation to QuickTime. Graphics importers apply profiles by default. Cameras are going to embed profiles by default, with a standard profile. CUPS and Gimp-Print drivers can be integrated with ColorSync, embed profiles, and have matching occur. We've got the Quartz PDF filters that I talked about; you'll want to go see those. And sips, which is a command-line image processing tool meant to interface with AppleScript, so AppleScripters can do basic image operations like crop and rotate without having to launch Preview or whatever. New APIs for abstract profile generation: ColorSync has the facility to do actual manipulations in abstract color spaces, like the sepia tone in Lab, and there are going to be new APIs to express those kinds of color transformations in Panther. And also a new display calibrator that's tuned for LCDs. So go hear about that on Wednesday.

For Image Capture, a bunch of new stuff. There are new ways to integrate your applications with Image Capture. There are automatic tasks. There are Image Capture services: the Cocoa and Carbon Services menu can now be integrated with Image Capture. There's also a common UI layer, so if you want to do scanning from within an application, you can call on that. There's also network support for sharing and monitoring Image Capture devices over the web, so that's kind of cool as well. On Wednesday,
session 204 is Image Capture.

And OpenGL (excuse me, I need my water). OpenGL, new for Panther: there's fragment program and pixel buffer support, and lots of optimizations around CopyTexImage and CopyTexSubImage. We have recoverable GPU support from drivers that support it: the ability to reset the GPU on the fly without bringing the machine down, which is pretty handy if you're debugging drivers. OpenGL Shader Builder and OpenGL Profiler: these are two really great tools we have for OpenGL developers, and I think you'll want to go to the sessions and see those, because they are really making great strides in Panther. Again, the ability to use Quartz 2D in an OpenGL context, and just lots of bug fixes and optimization techniques you can hear about.

So this is a bunch of sessions that we have for OpenGL. I'll point out in particular the last one, session 212 on Friday. We're actually going to have some of the demo engineers from ATI come and show you how they built their demo engine and run through a couple of techniques they used in their demos. Last year we had Nvidia do this; it was a great session, and it's going to be great again this year.

Another special session later on today: we thought it would be really fun to have the Keynote engineers come and talk about Keynote as an application, and how they really leverage the platform services, in terms of Quartz 2D, OpenGL, QuickTime, and Cocoa, to build what I think is really a great application that swims downstream with all the technology and builds on top of what we're doing in the operating system. So that'll be a really interesting session to attend as well.

And last but not least, our feedback forum, which is at five o'clock, the last thing in the conference, in the North Beach conference room. So come there and tell us what we're doing right, what we're doing wrong, and what you'd like to see us do in the future. We always enjoy talking to you guys and getting your feedback.