WWDC2004 Session 437

Transcript

And please welcome frameworks engineer Douglas Davidson. [Applause]

Good afternoon, everyone. Thank you all for coming. My name is Douglas Davidson, and I'm here to talk to you about the Cocoa text system. Now, the text system in some sense is at the heart of Cocoa. Almost everything in the AppKit makes use of text in one way or another, and generally uses the text system to do so. The Cocoa text system is very powerful and very flexible, it has lots of features, and there are quite a few new features for Tiger. So what I'd like to do today is first of all go over briefly what we have that's new for Tiger, and then I'm going to dive into some of the inner workings of the text system, so that by the end of the talk I can really give you a detailed description of some of the most significant new features and how they work.

So what do we have that's new for Tiger? Those of you who were here last year may remember I talked about, among other things, our support for Word documents, and that was pretty well received. But some people would ask me why we didn't really handle tables in Word documents, and I had to tell them the Cocoa text system doesn't support tables. So when I got my task list for Tiger, you can guess what the first thing on it was: support tables in the Cocoa text system. So I'm glad to say that for Tiger the Cocoa text system will support tables. Now, we thought about this a bit, though, and we realized that while tables are important for RTF and Word documents and so forth, what they're really critical for is HTML. So for Tiger we've also put a lot of effort into our HTML support, using the table support. And in addition, entirely new for Tiger, the Cocoa text system will now export HTML.

So let me talk about this. Now, I've been speaking of this as text table support for short, because everyone understands that, but what we really have here is something that's a lot more general. What we have is a flexible, extensible mechanism for sizing and positioning text
blocks of all sorts, the first application of which is to the representation of tables from RTF or Word documents or HTML, or HTML and CSS. Now, when I get to the end of the talk I'm going to be describing how this works in great detail. I just want to say that this is a very general mechanism, text blocks, and our intent is to use it for all sorts of complex layout we might encounter.

For HTML import: you may recall that in Panther we had two different kinds of HTML import. There is an older style that has some support for sophisticated layout but that is rather out of date and limited in the kinds of HTML that it can parse. And then there's a second kind, new for Panther, that used WebKit for parsing and so could handle arbitrary HTML, but was rather limited in its layout support. So for Tiger all that is gone. We have a completely new HTML import mechanism. It uses WebKit for parsing, so it can handle any HTML you like, and it uses the new text block and text table mechanism for complex layout support.

As to our HTML export: whenever you talk about generating HTML, the question always comes up, well, what kind of HTML is it that you're going to generate? So when we were planning this feature we took a little informal survey of the groups at Apple we thought might want to use it, and the results were unanimous: every group wanted a different kind of HTML. So we decided we just have to satisfy them all, and what we have is a mechanism that gives you control over this. It allows you to specify exactly which HTML tags should be generated. This gives you a lot of control. You can generate HTML, or you can generate XHTML, strict or transitional. Of course it's automatically going to be valid and well-formed. You can use CSS in a variety of ways, or you can use no CSS. You can generate a complete HTML document or just a fragment, and there are all sorts of other options. Take a look at the release notes; I think you're going to like it.

Now, we're only about halfway through the Tiger development cycle, so these things are not finished by any means, but
if we'll go over to the demo machine, I'll give you a brief look at some of the things that are working.

So I have a little demo application here. Let me just type into it. [Applause] Maybe make this bold, and let me go and save it. I have added several options here for saving this. I could save it as rich text or HTML or Word format, and we've also added support for saving as Word's XML format, WordML. I'll save it as HTML. Let me choose to save it as HTML 4.01 Transitional. For CSS, I'm going to choose to use CSS inline. As to the encoding, well, let me leave it as UTF-8, because, you know, that's the right thing to do. And let me give it a name and save it, and let's see what kind of thing we generated. Open it to show the source, and we have an HTML document.

Okay, let me try something else. I have a little Excel spreadsheet here. Select it, copy, paste it in here, and I get a real table. Let me just select that and bump up the size a little so you can see it better, and make it bold. Okay, well, that's working pretty well so far. Let me try something a little harder. So in Safari I've brought up a possibly recognizable website. Let's just take a look at the source for that. This thing usually has about 50 tables nested four or five deep. So let me take this and save it. I'll save it using WebKit's new web archive format, which of course we also support, and then let me go back and see if I can open that up. And if everything works properly we will get Slashdot. Just select it all, maybe make it bold, and then save it back out. [inaudible] And, well, here it is. Let me see if I can open it back up in Safari, and there it is. I can see a few glitches that we'll have to work on, but it's recognizably the same website, only all bold. But let me emphasize that this is an import and export mechanism. The HTML that we're producing is rather different from the HTML that went in. You'll notice for one thing it's valid, well-formed, and nicely formatted, which the original [Applause]

Okay, so while I'm here
there's one more thing I wanted to show you. Let me just close this out and open up this encyclopedia article here. Maybe I want to find all of the tigers in it. So I could just find the tigers, but now we have the possibility to select all of them as a multiple discontiguous selection. Maybe make it a little bigger, make it bold, italic, underlined, and so forth. Okay, let's go back to the slides and I can talk a little about that.

Multiple selections: how does this work? Well, there are a variety of methods in NSTextView that act upon or take the selected range. So now for Tiger all those will have a counterpart that works on selected ranges, which is going to be an array of range values. The precise details differ; see the release notes for exactly what they take. What about compatibility? Well, any method that isn't multiple-selection savvy will still continue to work just fine on the selected range, which will just be the first selected subrange if there is a multiple selection. And this turns out to be a pretty good default that works just fine in most cases. In fact, in the seed that you have there are still some methods in the AppKit that haven't been updated, and they continue to work just fine operating on the first selected subrange.

So what else is new for Tiger? Well, now that we're reading and writing so many different formats for import and export, we've stopped adding a new method for each format. Instead we've provided a set of generic methods that take the document type as a parameter, an option or attribute. And where before we had just constant strings for the option and attribute keys, now we actually have real named constants for them. We have some new document attributes for metadata, particularly for use with Spotlight, and some new document attributes specifically for HTML export, including those that provide the control that I mentioned over the kind of HTML that is being generated, the tags and the encoding. And finally we have a new document
option for HTML import, the text size multiplier. This corresponds to the little buttons in Safari that make the text size larger or smaller.

What else do we have? Text lists. I'm going to be talking more about text lists later on. Let me say that our intent is that the text list feature should be used to enable NSTextView to automatically generate the markers for text lists, that is, the numbers or the bullets. That is not implemented in the seed that you have. What we have currently is that we are using this for designating lists in HTML import and HTML export.

We have some additions to NSParagraphStyle. First of all, this is where the text blocks, tables, and text lists are attached to the text, as I'll be talking about later on. We also have the hyphenation factor. Previously this was settable only globally on the layout manager; now we actually allow you to set it individually on an individual paragraph. We have another factor now, the tightening factor for truncation. If you request truncation with ellipses, we will first try to tighten up the text before truncating it with ellipses, and this controls how much we will try to do that. And also, just for support of HTML, we have a header level on paragraphs. It allows us to distinguish the text that is an HTML header from ordinary paragraph text.

String drawing: we have some new methods for string drawing. They are much like the old ones, but they have additional options. One common request is to be able to do string drawing with the origin at the baseline, and that is now the default with these new methods. They also have the option of measuring using either the typographic bounds, as usual, or the actual image bounds of the glyphs.

Previously the primary method for specifying a font has been by name, by the PostScript name. For Tiger we are promoting NSFontDescriptor as the standard means for specifying a font, and it has a lot more capabilities. You can specify by name, by size, by traits, by the stylistic type of the font. It
can even group together multiple fonts to serve as a cascade, if you want. In the font talk earlier this week there was a suitable discussion of NSFontDescriptor.

There have been some changes to NSTypesetter. We're now deprecating NSSimpleHorizontalTypesetter; it will still be there for backward compatibility. We have taken a lot of the functionality from NSATSTypesetter that wasn't really ATS-specific and moved it up into NSTypesetter. This makes it easy to create a whole new custom typesetting engine and hook it up into the text system. And in addition we've added one new method, and it's here for the typesetter when it wishes to obtain the rectangle for a line. I won't read out the method name. We're not trying here to compete for the longest method name; I think NSBitmapImageRep has that all sewn up. But actually all these parameters are really necessary and useful, as we will see when we come to the table layout.

And finally, some more miscellaneous additions. We have a couple of new responder methods that are implemented in NSTextView: one to insert a line break, that is, a line break as opposed to a paragraph break, and one to insert a container break, usually a page break. We have a new delegate method: NSTextView allows the delegate to control any changes to the typing attributes. This is in addition to our previous notification. NSLayoutManager now has a getter and setter for its glyph generator. There are a number of new convenience methods for manipulating the base writing direction, for right-to-left and left-to-right text, on attributed strings and text views. And finally we have a number of new panels that are not yet implemented in your seed but are planned, for inspecting and modifying links, lists, and tables, and there are methods for bringing them forward.

So we've seen what's new. Now I want to move into the next part of the talk. My aim is that by the end of this talk I can discuss in detail exactly how the new block and table mechanism works. But in order to
get there, first I'm going to review a bit about how the layout process works in the text system, and I'm going to discuss some of the existing classes that are most relevant and analogous to the new mechanism. Fortunately, these are classes that I haven't really discussed in detail in previous years: NSTextContainer and NSTextAttachment. I'll also review the new NSTextList class, and then I'll get to the text blocks and tables.

So to review, let's talk about the major players in the text system. At the model level, the main class is NSTextStorage, which stores the text of the document. It's a direct subclass of NSMutableAttributedString, so it holds the characters and their attributes. Then we have the classes that typically serve as values for attributes: NSFont for fonts, NSParagraphStyle for paragraph-level attributes, NSTextAttachment for attached files. And there's NSTextContainer, which models the geometry of the region within which text is to be laid out, typically a page. At the controller level, the central class of the text system is NSLayoutManager, which manages the whole process, and it calls on a couple of other classes, NSGlyphGenerator and NSTypesetter, to do some work for it. And at the view level there's the visible face of the text system, NSTextView.

Now, the job of the text system really is to go from the characters and their attributes in the text storage to the bits that you see on the screen, and to make this happen there are four basic processes that occur. First of all, attribute fixing. The text storage does this, and it makes sure that the attributes are consistent, that is, that the fonts can actually represent the characters to which they are attached and that the paragraph-level attributes actually apply to whole paragraphs. Then comes glyph generation. This is controlled by the layout manager; the layout manager calls on the glyph generator to do this. This is where the characters
and their fonts are converted into a sequence of glyphs. Then comes layout, again controlled by the layout manager, which calls upon the typesetter to do it. The typesetter takes these glyphs and assembles them into lines positioned on the page. And then comes display, again done by the layout manager. It takes these glyphs and their positions and it sends them on down to Quartz to be rendered on the screen. We have a little diagram of how this occurs. On the one side we have the glyphs that are generated by the glyph generator, and they are assembled into lines by the typesetter within the geometry that's determined by the text container. They're then displayed in the text view, which sits in the window.

Okay, I want to focus particularly on layout, because that's the most important process in the new text block and table mechanism. The way that this occurs in detail is that the layout manager decides that a particular range of glyphs needs to be laid out, and it calls on the typesetter to lay them out. The typesetter then contacts the text container. Remember, the text container models the geometry of the region within which the text is being laid out, so the typesetter calls on the text container to ask it about that geometry, to determine the rectangles for the lines of text on the page. Then the typesetter takes its glyphs and it fills up these lines with the glyphs. It figures out exactly where each glyph should go in each line. And then the typesetter calls back to the layout manager to tell the layout manager what it has done: where each line fits on the page, which glyphs go in it, and exactly where they go on each line. And the layout manager stores this information, because it will need it later on for display, interaction, and all other purposes.

So let me talk a little bit more about NSTextContainer. As I say, it models the geometry of the region within which text is to be laid out, typically a page, although it could be other things. The stock NSTextContainer
object that you will get from the AppKit represents just a simple rectangle, but you can create a custom NSTextContainer subclass that can represent essentially an arbitrary region. It will still have a rectangle that is its bounding rectangle, but within that it gets to control where the text will be laid out. And the way that works is that, as I said, the typesetter, when laying out, calls on the text container and proposes a rectangle for a line, and the text container gets to modify that and return whatever rectangle for the line it likes. Here's a little diagram of that, showing a possible custom text container with some of the rectangles for the lines that it might have returned to the typesetter.

Now, that's a conceptual overview. Let's go back over to the demo machine and see if I can show it to you in action. Let me bring this up and start it and run it, and make this a little bigger. Now, this is just some text, but I have set a custom text container here on this text view, and this text container has a parameter. As I modify the parameter, the region within which the text is laid out will change, so it goes one way or the other. Let me just select it all so you can see what the actual line rects look like as I modify that.

Let's take a look at the code. There's really only one main method that has to be implemented here. This class has the parameters, but they just control what happens in this method. The typesetter is going to propose a line fragment rectangle, and then we are going to get to modify it. In this case what we do is trim it based on this sinusoidal curve that is the shape of this text container. So we figure out where we are on our sine wave, and in one case we're going to trim it on the left side and in the other case we just trim it on the right side, and then we just return our modified rectangle. That's all there is to it. There's one more little detail: we do have to implement this method to notify
the layout manager that we are not just a simple rectangular text container, so that it can turn off certain optimizations. That is all that it takes, really, to implement a custom text container. Let's go back to the slides, and let me talk about another class.

Text attachments: in TextEdit you may have dragged an image or some other file into the text, and it gets attached to the text at a specific location, and it's represented by an image or some icon that's drawn inline in the text. The way that this appears in the text storage is as a special character, the attachment character, with a special attribute, the attachment attribute, and the value of the attribute is an instance of the class NSTextAttachment. NSTextAttachment does two things. First of all, it represents the contents of the attached file as an NSFileWrapper. And second, it has to provide some sort of visual representation to be drawn inline in the text, and to do that drawing it uses a cell, an NSTextAttachmentCell specifically. Now, by default, if you're dragging something into TextEdit, the text attachment will automatically generate an appropriate cell and image or other representation depending on the contents of the file. But as developers you don't have to let it do that. You can assign your text attachment a custom text attachment cell. It can be one of the standard class, with perhaps a custom image attached to it; if you were at the tips and tricks talk the other day, I gave an example of that. Or you can give it an instance of a custom subclass of NSTextAttachmentCell. With a custom subclass of NSTextAttachmentCell you can do essentially arbitrary drawing inline in the text.

How does this work? Well, during glyph generation, the glyph generator will come across the attachment and not try to generate a glyph, because there's no glyph to represent it. It just puts in a null glyph as a placeholder for it and goes on. Then at layout time the typesetter will come across it,
notice that there is an attachment there, and what it needs to know about it is what space that attachment will take up in the text, so that it can position it in the line. So it actually calls the text attachment cell, passing it as arguments the position in the text, the line fragment, and so forth where it is being laid out, and the text attachment cell gets to decide how big it should be there and return that to the typesetter. The typesetter then reserves that space for it during layout. Then when it comes along to display time, and as I said the layout manager handles the display, the layout manager notices that it's not an ordinary glyph, it's a text attachment, and it calls on the text attachment cell to do the drawing in the space that was reserved for it. So a custom text attachment cell can do essentially arbitrary drawing in an arbitrary space inline in the text. But there's also one more thing: NSTextView will notice where there is a text attachment, and it will give the attachment cell the opportunity to handle mouse clicks in that region. So you can even do some interaction with a custom text attachment cell.

So let's go back to the demo machine. I want to give a tiny little demo. This is the same application, but if I click on this box I will put in some custom text attachments. In this example it's just a horizontal line. I'll put a few of them in here. They're really very simple. The only thing they do is that they change their size a little depending on the size of the line in which they're being laid out. So let me take a look at the code for that. It's pretty short. As I said, the methods that you have to implement are, first of all, the one called when the typesetter asks a text attachment cell how big it is. Here we're returning a rectangle whose size and position are just determined by the size of the line fragment in which we're being laid out, just a third of it. And then when it comes
time to draw, we get a chance to do our own drawing, called upon by the layout manager, and here I'm just doing something very simple: setting the color to black and filling the rect we specified before. And that's all it takes to have a custom text attachment cell do your own drawing. So let's go back to the slides.

Now I want to talk about a new class, NSTextList. The basic principle of our text list support is that all of the text of the list, including the markers, the bullets or numbering, will appear in the text as usual. The NSTextList object itself will appear as an attribute on the text, and it will do primarily two things. First of all, it will specify which portions of the text lie within which lists, and it will also determine what the formatting of the markers will be. As I said, our intent is that this should be used by NSTextView for automatic generation of markers, be they bullets or numbers or what have you. That is not yet implemented in your seed. Currently we use this for specifying lists for HTML import and HTML export.

Now, how does this actually appear in the text? The NSTextList object is a paragraph-level attribute, so it appears as part of an NSParagraphStyle. But lists can be nested, so a given region of text may not just be in one list; it may be in multiple lists. So NSParagraphStyle doesn't have just one NSTextList; it has an array of NSTextLists, listed in order from outermost to innermost. That's how we specify which portion of the text lies in which lists. And we have a couple of convenience methods now on NSAttributedString for determining the entire range of a particular list or the location in a list of a particular item. And the NSTextList object itself has some methods for specifying the format of its markers. I'm not going to describe that now; you can look at the release notes. It's fairly standard. We support a number of basic list formatting options, with possibly more to
come in the future. Now, here I have a little diagram to show how this works. Here we have two nested lists, an outer list and an inner list. Some of the text is only in the outer list; in that case the text lists array just has the outer list in it. And some of the text is in both lists, the outer and the inner, and for those the text lists array would have the outer list first, then the inner list.

All right, so now we have all the ingredients to understand how our text block and text table support works. I've said that NSTextContainer represents the geometry within which the text is laid out, and it's quite general and flexible in doing that, but it's essentially static. It doesn't depend on the text that's being laid out in it. The point of the new mechanism and the new class, NSTextBlock, is to allow the attributes on the text to affect how it is laid out. So NSTextBlock is going to be attached to the text as an attribute, much like NSTextList is, and it's going to participate in the layout process somewhat as NSTextAttachment does. The text in the text block will all appear in the text as normal. It will all be laid out and displayed perfectly normally, with one little change, and that is that the text block is allowed to affect the line rectangles within which the text is being laid out.

So the way that text blocks appear in the text is very similar to the way in which text lists appear. Blocks, again, are going to be paragraph-level attributes, so they'll appear on the paragraph style, and they too can be nested, so a given section of text may be in more than one block. And so NSParagraphStyle, in addition to the text list array, also has an array of text blocks, in order from outermost to innermost. And again we have a couple of convenience methods on NSAttributedString for determining the entire range of a particular block or a table. I'll get to tables in a minute.

So how is this treated when it comes to layout? During the layout
process we are going to define two rectangles for a given text block. The first one is the layout rectangle, that is, the rectangle within which the text of the block is actually going to be laid out. And then there is a bounds rectangle, which is a rectangle around the text of the block that also includes space for margins, borders, padding, any additional region around the text that is not available for the layout of other text.

The layout rect for a block has to be determined just as layout for the block is starting, immediately before the block is laid out. The way this happens is that the typesetter notices it is starting to lay out a block of text, and it calls on the NSTextBlock and asks it what its layout rect should be, and passes in the location at which it is starting to lay out text and a containing rectangle, which for the outermost block would be the bounding rect of the text container, but for inner blocks would be the rect of the immediately enclosing block. Then the bounds rect has to be determined at the end of layout for the block, as soon as the block has finished being laid out, because the bounds rect generally is affected by the length of the text in the block. And again, the typesetter notices that it has finished laying out a block, calls on the block, and asks it for its bounds rect, and the NSTextBlock gets to determine that rect and pass it back to the typesetter. The typesetter then, as usual, calls the layout manager and tells the layout manager what it has found out. It stores these two rects for the block in the layout manager, and the layout manager keeps that information and maintains it.

Here's a little diagram. So when this block is being laid out, it's laid out in this layout rectangle, which is typically a long rectangle, since we're allowing the block to be an arbitrary height. Then after the block has been laid out, the bounds rectangle is wrapped around the text, with any extra space needed for margins,
border, padding, whatever the block wants to put there.

Now then, when it comes time for display, again the layout manager manages the display. The layout manager actually has two methods for display, one to draw the background and one to draw the glyphs. So when its method is called for drawing the background, there's one addition, and that is that it notices that there are text blocks to be drawn, and it calls upon those text blocks in order from outermost to innermost and asks them to do any drawing they need to do. In that case they can draw backgrounds, they can draw borders, any sort of decoration that they need. And then the usual glyph background is drawn on top of that. And then when the layout manager's method for drawing glyphs is called, the glyphs are simply drawn on top of all this background and show up in the right places.

Now, all that is the general text block mechanism. There is also a specific version for text tables. We have a subclass of NSTextBlock called NSTextTableBlock, and a text table block represents a block that appears as a single cell within a table. The distinguishing feature of table layout is that the different cells in the table have to coordinate their layout with each other. So there is another object, NSTextTable, that represents the table as a whole, and each table block, each cell, has a reference to the table as a whole. And when it comes time for layout or for display, as I said, the layout manager calls upon the block, in this case the NSTextTableBlock, and NSTextTableBlock, as the subclass, does something special: it passes those requests on to its NSTextTable, which coordinates them with all the other cells of the table and returns the appropriate results or does the appropriate display.

Now, all this is the general mechanism. The standard text block, text table, and text table block classes in the kit have a number of parameters that determine how they actually size and position themselves, and these are things like margin, border,
and padding widths on each side, content width, and so forth. They can be specified either as an absolute point value or as a percentage of the enclosing block. Text blocks will draw a background color if specified. They can also specify border colors, a separate one for each side if you like. You can look at the details; they're all in the release notes. There should be nothing terribly surprising. But what I want to emphasize here is that these do not limit you. The mechanism is very general. If you have a custom subclass of NSTextBlock, you can do any sort of sizing and positioning and decoration that you like.

So let's go back to the demo machine and I'll give a little demonstration of this. Now, I said that the standard text block class in the kit will do a background color. It doesn't, at least not yet, support for example background images. But there's nothing to prevent you from writing your own text block class that will do that, and I've written a tiny little sample here that does a background image, and in addition it does some custom sizing: it sizes the block to the size of the image. Let me turn that on, and notice that here is the text up here in the block, and I can select it, make it bigger, make it bigger still if we like, and so forth.

Let me take a look at the code for that. The methods that we need to implement for this are, first of all, the one for the layout rect: when we start the layout of the block, the typesetter will call upon this custom subclass and ask it for the layout rect. Here all that I'm doing is determining the layout rect, determining its width based upon the size of the image, and to be nice I've made it centered horizontally in the containing rect. So there's a little bit of calculation here, but it's pretty straightforward. Then when we're called upon to generate the bounds rect, all I'm doing here is trying to make the bounds rect equal to the size of the image. Of course, it might be that our container is
not big enough for the image, or it might be that it is bigger and we need to center it, so there's a little bit of calculation here to determine the bounds rect in those cases, and we return that. And then when it comes time to draw, all that we're going to do is just draw this background image. But again, the rect might be smaller than we actually wanted or it might be bigger than we actually wanted, so we're going to adjust for that, trim or center one way or the other. So there's a little more calculation here, but again it's quite straightforward, and then we just draw the image in the rect. And that is what is necessary to make a custom subclass of NSTextBlock.

Now let's go back to the slides and finish up. That's what I have to present to you today. If you want to see more, the first place to go is the release notes, which have detailed descriptions of all this. For the pre-Tiger features there is extensive documentation; actually, some of the nice diagrams I showed you here were taken directly from that documentation. And the samples that I showed you here should be available for download on the ADC website. The contact person is Matthew Formica, [inaudible] at apple.com.
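
The sizing arithmetic described for the custom text block demo (a layout rect as wide as the image and centered horizontally in the containing rect, then a bounds rect sized to the image) can be sketched roughly as follows. This is a minimal, language-neutral sketch in Python, not AppKit API and not the session's sample code: `Rect`, `centered_span`, `layout_rect`, and `bounds_rect` are hypothetical names, and letting the bounds rect grow when the text is taller than the image is an assumption the talk does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical stand-in for a rectangle: origin plus size."""
    x: float
    y: float
    width: float
    height: float


def centered_span(container_x, container_width, wanted_width):
    """Clamp a wanted width to the container and center it horizontally."""
    width = min(wanted_width, container_width)
    x = container_x + (container_width - width) / 2
    return x, width


def layout_rect(start_y, containing, image_width):
    """Rect offered for the block's text, decided just before layout starts.

    It is as wide as the image (or the container, if narrower), centered,
    and runs from the starting position to the bottom of the containing
    rect, since the block's final height is not known yet.
    """
    x, width = centered_span(containing.x, containing.width, image_width)
    remaining_height = containing.y + containing.height - start_y
    return Rect(x, start_y, width, remaining_height)


def bounds_rect(layout, containing, image_width, image_height, text_height):
    """Rect wrapped around the block, decided once its text is laid out.

    Assumption: the block is at least as tall as the image, and grows if
    the laid-out text turned out taller than the image.
    """
    x, width = centered_span(containing.x, containing.width, image_width)
    height = max(image_height, text_height)
    return Rect(x, layout.y, width, height)
```

The two functions mirror the two moments the talk describes: the layout rect is requested before any text in the block is placed, so it can only fix the horizontal extent, while the bounds rect is requested afterward and can take the resulting text height into account.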