WWDC2004 Session 437

Transcript

Kind: captions
Language: en
and please welcome
frameworks engineer Douglas Davidson
[Applause]
good afternoon everyone thank you all
for coming my name is Douglas Davidson
I'm here to talk to you about the Cocoa
text system now the text system in some
sense is at the heart of Cocoa as almost
everything in the AppKit makes use of
text in one way or another and generally
uses the text system to do so the Cocoa
text system is very powerful and very
flexible it has lots of features and there
are quite a few new features for Tiger
so what I'd like to do today is first of
all go over briefly what we have that's
new for Tiger and then I'm going to dive
into some of the inner workings of the
text system so that by the end of the
talk I can really give you a detailed
description of some of the most
significant new features how they work
so what do we have that's new for Tiger
those of you who were here last year may
remember I talked about among other
things our support for Word documents
and that was pretty well received but
you know some people would ask me why we
didn't really handle tables in Word
documents and I had to tell them the
Cocoa text system doesn't support tables
so when I got my task list for Tiger
you can guess what the first thing on it
was support tables in the Cocoa text
system so I'm glad to say that for Tiger
the Cocoa text system will support tables
now we thought about this a bit though
and we realized that while tables are
important for RTF and Word documents and
so forth what they're really critical for
is HTML so for Tiger we've also put a
lot of effort into our HTML support
using the table support and in addition
entirely new for Tiger the Cocoa text
system will now export HTML so let me
talk about this now I've been speaking
of this as text table support for
short because everyone understands that
but what we really have here is
something that's a lot more general
what we have is a flexible extensible
mechanism for sizing and positioning
text blocks of all sorts the first
application of which is to the
representation of tables from RTF or
Word documents or HTML or HTML and CSS
now when I get to the end of the talk
I'm going to be describing how this
works in great detail I just want to say
that it is a very general mechanism text
blocks and our intent is to use it for
all sorts of complex layout we might
encounter for HTML import you may recall
that in Panther we have two different
kinds of HTML import there is an older
style that has some support for
sophisticated layout but that is rather
out of date limited in the kinds of HTML
that it can parse and then there's a second
kind new for Panther that uses WebKit
for parsing and so could handle arbitrary
HTML but was rather limited in its layout
support so for Tiger all that is gone we
have a completely new HTML import
mechanism it uses WebKit for parsing
so it can handle any HTML you're likely
to find and it uses the new text block
text table mechanism for complex layout
support now to our HTML export whenever you
talk about generating HTML the question
always comes up well what kind of HTML
is that you're going to generate so when
we were planning this feature we took
a little informal survey of the groups at
Apple we thought might want to use it
and the results were unanimous every
group wanted a different kind of HTML so
we decided we'd just have to satisfy them all
and what we have is a mechanism that
gives you control over this allows you
to specify exactly which HTML tags
should be generated this gives you a lot
of control you can generate HTML you can
generate XHTML strict or transitional of
course it's automatically going to be
valid and well-formed you can use
CSS in a variety of ways or you can use
no CSS you can generate a complete
HTML document or just a fragment and all
sorts of other options take a look at the
release notes I think you're going to
like it now we're only about halfway
through the tiger development cycle so
these things are not finished by any
means but if we'll go over to the demo
machine I'll give you a brief look at
some of the things that are working so I
have a little demo application here so
let me just type into it
[Applause]
maybe make this bold and let me go and
save it I have added
several options here for saving this I
could save it as rich text or HTML or Word
format we've also added support for
saving as Word's XML format WordML when
I save it as HTML let me choose to save
it as HTML 4.01 transitional for CSS I'm
going to choose to use CSS inline for
the encoding well let me leave it as
UTF-8 because you know that's the right
thing to do and let me give it a name
and save it and let's see what kind
of thing we generated open it to show
the source and we have an HTML document
okay try something else let me bring up
I have a little Excel spreadsheet here
select it copy paste it in here I get a
real table let me just select that and pop
up the size a little so you can see it
better maybe make it bold
okay well that's working pretty well so
far let me try something a little harder
so in Safari I brought up a possibly
recognizable website let's just take a
look at the source for that this thing
usually has about 50 tables nested four
or five deep you know so let me take
this and save it I'll save it using
WebKit's new web archive format which of
course we now support and then let me go
back and see if I can open that up and
if everything works properly we will get
Slashdot just select it all maybe make
it bold and then save it back out yeah
don't provide it and well here it is let
me see if I can open it back up in
Safari and there it is I can see a few
glitches that we'll have to work on but
recognizably the same website only all
bold but let me emphasize that this is
an import and export mechanism the HTML
that we're producing is rather different
from the HTML that went in you'll notice
for one thing it's valid well-formed and
nicely formatted which the original
[Applause]
okay so while I'm here there's one more
thing I wanted to show you let me just
close this out open up this encyclopedia
article here maybe I want to find all of
the tigers in it so I could just find
the Tigers but now we have the
possibility to select all of them as a
multiple discontinuous selection maybe
make it a little bigger make it bold
italic underline and so forth okay let's go
back to the slides and I can talk a
little about that multiple selections
how does this work well there are a
variety of methods in NSTextView
that act upon or take
the selected range so now for Tiger all
those will have a counterpart that works
on selected ranges which is going to be
an array of range values for the precise
details see the release notes for
exactly what they take what about
compatibility well any method that
isn't multiple selection savvy will
still continue to work just fine on the
selected range which will just be the
first selected subrange if
there is a multiple selection and this
turns out to be a pretty good default
it works just fine in most cases in fact
in the seed that you have there are still
some methods in the AppKit that haven't
been updated
and continue to work just fine operating
on the first selected subrange so what
else is new for Tiger well
now that we're reading and writing so
many different formats for import and
export we've stopped adding a new method
for each format instead we provided a
set of generic methods that take the
document type as a parameter as an option
or attribute and where before we had
just constant strings for the option and
attribute keys now we actually have real
named constants for them we have some
new document attributes for metadata
particularly for use with Spotlight and
some new document attributes
specifically for HTML export including
those that provide the control that I
mentioned over the kind of HTML that is being
generated the tags and the encoding and
finally we have a new document option
for HTML import the text size multiplier
this corresponds to the little
buttons in Safari that make the
text size larger or smaller what else do
we have text lists I'm going to be
talking more about text lists later on
let me say that our intent is that the
text list feature should be used to
enable NSTextView to automatically
generate the markers for text lists that
is the numbers or the bullets that is
not implemented in the seed that you have
what we have currently is that we are
using this for designating lists in HTML
import and HTML export we have some
additions to NSParagraphStyle first of
all this is where the text blocks tables
and text lists are attached to the text
as I'll be talking about later on we
also have the hyphenation factor
previously this was settable only
globally for a layout manager now we
actually allow you to set it
individually on an individual paragraph
we have another factor now the
tightening factor for truncation if you
request truncation
with ellipses we will first try to
tighten up the text before truncating it
with ellipses and this controls how much
we will try to do that and also just
for support of HTML we have a header
level on paragraphs it allows us to
distinguish the text that is an HTML
header from ordinary paragraph text
string drawing we have some new methods
for string drawing they are much like
the old ones but they have additional
options one common request is to be
able to do string drawing with the
origin at the baseline and that is now
the default with these new methods they
also have the option of measuring using
either the typographic bounds as
usual or the actual image bounds of the
glyphs previously the primary method for
specifying a font has been by name by
the PostScript name for Tiger we are
promoting NSFontDescriptor as the
standard means for specifying a font and
it has a lot more capabilities you can
specify by name by size by traits by the
stylistic type of the font it can even
group together multiple fonts to serve as a
cascade if you went to the font talk
earlier this week there was a suitable
discussion of NSFontDescriptor
there have been some changes to
NSTypesetter we're deprecating now
NSSimpleHorizontalTypesetter although it
will still be there for backward
compatibility we have taken a lot of the
functionality from NSATSTypesetter
that wasn't really ATS specific and
moved it up into NSTypesetter this
makes it easy to create a whole new
custom typesetting engine and hook it
up into the text system and in addition
we've added one new method in here for
the typesetter when it wishes to obtain
the rectangle for a line
I won't read out the method name we're
not trying here to compete for the
longest method name I think that
NSBitmapImageRep has that all sewn up but
actually all these parameters are really
necessary and useful as we will see when
we come to the table layout and
finally some more miscellaneous
additions we have a couple of new
responder methods that are implemented
in NSTextView one to insert a line
break that is a line break as opposed to
a paragraph break and one to insert a
container break usually a page break we
have a new delegate method on NSTextView that
allows the delegate to control any
changes to the typing attributes this is
in addition to our previous notification
NSLayoutManager now has a getter
and setter for its glyph generator there
are a number of new convenience methods
for manipulating the base writing
direction for right-to-left and left-to-right
text on attributed strings and text
views and finally we have a number of new
panels that are not yet implemented in
your seed but are planned for inspecting and
modifying links lists and tables in
text and there are methods for bringing
them forward so we've seen what's new
now I want to move into the main part of the
talk my aim is that by the end of the talk I can
discuss in detail exactly how the new
block and table mechanism works but in
order to get there first I'm going to
review a bit about how the
layout process works in the text system
and I'm going to discuss some of the
existing classes that are most relevant
and analogous to the new mechanism
fortunately these are classes that I
haven't really discussed in detail in
previous years NSTextContainer
and NSTextAttachment and I'll review
also the new NSTextList class
and then I'll get to the text blocks and
tables so to review then let's talk about
the major players in the text system
at the model level the main class is
NSTextStorage which stores the text of
the document it's a direct subclass of
NSMutableAttributedString so it
holds the characters and their
attributes then we have the classes that
typically serve as values for attributes
NSFont for fonts NSParagraphStyle
for paragraph level attributes
NSTextAttachment for attached files and
there's NSTextContainer which
models the geometry of the region within
which text is to be laid out typically a
page at the controller level the central
class of the text system is NSLayoutManager
that manages the whole process
and it calls on a couple of other
classes NSGlyphGenerator and
NSTypesetter to do some work for it
and at the view level there's the visible
face of the text system NSTextView
now the job of the text system really is
to go from the characters and their
attributes in the text storage to the
bits that you see on the screen and to
make this happen there are four
basic processes that occur first of all
attribute fixing the text
storage does this and makes sure that the
attributes are consistent that is that
the fonts can actually represent the
characters to which they are attached
and the paragraph level
attributes actually apply to whole
paragraphs then comes glyph generation
this is controlled by the layout manager
the layout manager calls on the glyph
generator to do this this is where the
characters and their fonts are converted
into a sequence of glyphs then comes
layout again controlled by the layout manager
which calls upon the typesetter to do
it the typesetter takes these glyphs and
assembles them into lines positioned on the
page and then comes display again
done by the layout manager it takes
these glyphs and their positions and it
sends them on down to Quartz to be
rendered on the screen we have a little
diagram of how this occurs on the one side
we have
the glyphs that are generated by the
glyph generator and they are
assembled into lines by the typesetter
within the geometry that's determined by
the text container they're then displayed
in the text view which sits in the
window ok I want to focus particularly
on layout because that's the most
important process in the new text block
table mechanism and the way that this
occurs in detail is that the layout
manager decides that a particular range
of glyphs needs to be laid out and it
calls on the typesetter and asks it to
lay them out the typesetter then
contacts the text container remember the
text container models the geometry of
the region within which the text is
being laid out so the typesetter
calls on the text container to ask it
about that geometry to determine the
rectangles for the lines of text on the
page then the typesetter takes its
glyphs and it fills up these lines with
the glyphs it figures out exactly where
each glyph should go in each line and
then the typesetter calls back to
the layout manager to tell the
layout manager what it has done where
each line fits on the page which glyphs
go in it and exactly where they go on
each line and the layout manager stores
this information because it will need it
later on for display interaction and all
other purposes so let me talk a little
bit more about NSTextContainer I
said it models the geometry of the region
within which text is to be laid out
typically a page although it could be
other things the stock NSTextContainer
object that you will get from
the AppKit represents just a simple
rectangle but you can create a custom
NSTextContainer subclass that can
represent essentially an arbitrary
region it will still have a rectangle
that is its bounding rectangle but within
that it gets to control where the text
will be laid out and the way that works
is that
as I said the typesetter when laying
out calls on the text
container and
it proposes a rectangle for a
line and the text container gets to
modify that and return whatever rectangle
for the line it likes here's a little
diagram of that showing a possible
custom text container with some of the
rectangles for the lines that it might
have returned to the typesetter
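
The propose-and-return exchange just described can be modeled in a few lines. This is a minimal sketch in Python rather than Objective-C, and it is not the AppKit API: `Rect`, `RectContainer`, `line_rect` and `lay_out_lines` are hypothetical names, and the assumption is a plain rectangular container that accepts any proposal that fits inside its bounds.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float


class RectContainer:
    """Hypothetical stand-in for a plain rectangular text container."""

    def __init__(self, width: float, height: float) -> None:
        self.bounds = Rect(0.0, 0.0, width, height)

    def line_rect(self, proposed: Rect) -> Optional[Rect]:
        # A simple rectangle returns the proposal unchanged while it still
        # fits, and None once the proposal runs past the bottom edge.
        if proposed.y + proposed.height > self.bounds.height:
            return None
        return proposed


def lay_out_lines(container: RectContainer, line_height: float) -> List[Rect]:
    """Typesetter side: propose full-width rects top to bottom and keep
    whatever rect the container hands back."""
    lines: List[Rect] = []
    y = 0.0
    while True:
        proposed = Rect(0.0, y, container.bounds.width, line_height)
        actual = container.line_rect(proposed)
        if actual is None:
            break
        lines.append(actual)
        y += line_height
    return lines
```

A custom container would differ only in `line_rect`, which is free to shrink or shift the proposed rectangle before returning it.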
now that's a conceptual overview let's
go back over to the demo machine and see
if I can show it to you in action so let me
bring up this and start it and run it
make this a little bigger now this is
just some text but I have set a custom
text container here on this text view
and this text container has a parameter
and as I modify the parameter the
region within which the text is laid out
will change so it can go one way or the
other let me just select it all so you can
see what the actual line rects look like
as I modify that let's take a look at
the code so
now there's really only one main method
that has to be implemented here this
class has the parameters but they just
control what happens in this method
the typesetter is going to
propose a line fragment rectangle and
then we are going to get to modify it
and in this case what we do is we trim
it based on this sinusoidal curve that
is the shape of this text container so
we figure out where we are on our sine
wave and in one case we're going to trim
it on the left side and the other case
we just trim it on the right side and
then we just return our modified
rectangle and that's all there is to it
there's one more little detail we do
have to implement this method to notify
the layout manager that we are not just
a simple rectangular text container so
that it can turn off certain
optimizations and that is all that it takes
really to implement a custom text
container let's go back to the slide
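
The trimming step in that demo can be sketched as follows. This is a Python model under stated assumptions, not the demo's actual source: the function name `sine_trimmed_rect` and the `amplitude` parameter are hypothetical, and one full sine period is assumed to span the container's height.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float


def sine_trimmed_rect(proposed: Rect, container_height: float,
                      amplitude: float) -> Rect:
    """Trim a proposed line rect along a sine curve (hypothetical model).

    The vertical middle of the line picks a point on the sine wave. A
    positive value trims the left side, a negative value trims the right.
    """
    mid_y = proposed.y + proposed.height / 2.0
    shift = amplitude * math.sin(2.0 * math.pi * mid_y / container_height)
    width = max(0.0, proposed.width - abs(shift))
    if shift >= 0.0:
        # Trim on the left: move the origin right and shorten the line.
        return Rect(proposed.x + shift, proposed.y, width, proposed.height)
    # Trim on the right: keep the origin and only shorten the line.
    return Rect(proposed.x, proposed.y, width, proposed.height)
```

Changing `amplitude` plays the role of the parameter adjusted in the demo: the larger it is, the further the text region swings from side to side.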
and let me talk about another class for
text attachments in TextEdit you may
have dragged in an image or some other
file into the text and it gets attached
to the text at a specific location and
it's represented by an image or some
icon that's drawn in line in the text
and the way that this appears
in the text storage is as a special
character the attachment character with
a special attribute the attachment
attribute and the value of the attribute
is an instance of the class
NSTextAttachment NSTextAttachment does two
things first of all it represents the
contents of the attached file as an
NSFileWrapper and second it has to
provide some sort of visual
representation to be drawn in line in
the text and to do that drawing it uses
a cell an NSTextAttachmentCell
to be specific now by default if
you're dragging something into TextEdit
the text attachment will automatically
generate an appropriate cell with an image
or other representation depending on the
contents of the file but as developers
you don't have to let it do that you can
assign your text attachment a custom
text attachment cell either one of the
standard class with perhaps a custom
image attached to it if you were at the
tips and tricks talk the other day I
gave an example of that or you can give
it an instance of a custom subclass of
NSTextAttachmentCell and with a custom subclass
of NSTextAttachmentCell you can do
essentially arbitrary drawing in line in
the text how does this work well during
glyph generation the glyph generator
will notice it'll come across the
attachment and not try to generate a
glyph because there's no glyph to represent
it it just puts in a null glyph as a
placeholder for it in the glyph stream then
at layout time the typesetter will come
across it notice that there is
an attachment there and what it needs to
know about it is what space that
attachment will take up in the text so
that it can position it in the line so
it actually calls to the text attachment
cell passing it as arguments the
position in the text and the line fragment and
so forth where it is being laid out and
the text attachment cell gets to decide
how big it should be there and return
that to the typesetter and the typesetter
then reserves that space for it
during layout then when it comes along
to display time the layout manager as I
said the layout manager handles the
display the layout manager notices that
it's not an ordinary glyph it's a text
attachment and it calls on the text
attachment cell to do the drawing in
the space that was reserved for it so
that a custom text attachment cell can
do essentially arbitrary drawing in an
arbitrary space in line in the text but
there's also one more thing in addition
the text view will notice where there is
a text attachment and it will give the
attachment cell the opportunity to
handle mouse clicks in that region so
you can even do some interaction with a
custom text attachment so if we go back
to the demo machine I want to give a
tiny little demo so this is the same
application but if I click on this box I
will put in some custom text attachments
and these are just in this example
it's just a horizontal line I put a few
of them in here they're really very
simple the only thing they do is that
they change their size a little
depending on the size of the line
within which they're being laid out so
let me take a look at the code for that
that's pretty short as I said the
methods that you have to implement are
first of all when the typesetter
asks the text attachment cell how big it
is here we're returning a rectangle
whose size
and position are just determined by the
size of the line fragment in which we're
being laid out just a third of it and
then when it comes time to draw we get a
chance to do our own drawing called upon
by the layout manager and here I'm just
doing something very simple setting
black and filling the rect we specified
before and that's all it takes to have a
custom text attachment cell to do your
own drawing so let's go back to the
slide
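
The sizing half of that attachment cell comes down to one small calculation. The following is a Python sketch of it under stated assumptions, not the demo's Objective-C source: `rule_cell_frame` and `thickness` are hypothetical names, and the frame is assumed to be expressed relative to the position the typesetter passes in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float


def rule_cell_frame(line_fragment: Rect, thickness: float = 1.0) -> Rect:
    """Frame for a horizontal-rule attachment (hypothetical model).

    The rule is one third as wide as the line fragment it is laid out in,
    so it changes size whenever the line changes size.
    """
    return Rect(0.0, 0.0, line_fragment.width / 3.0, thickness)
```

At display time the cell would simply fill this frame with black, as described in the demo.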
and now I want to talk about a new class
NSTextList the basic principle of our
text list support is that all of the
text of the list including the marker
the bullet or numbering will appear in
the text as usual the NSTextList
object itself will appear as an
attribute on the text and it will do
primarily two things first of all it
will specify which portions of the text
lie within which list or
which lists and it will also determine
what the formatting of the markers will
be and as I said our intent is that this
should be used by NSTextView for
automatic generation of markers be they
bullets or numbers or what have you that
is not yet implemented in your seed
currently we use this for specifying
lists for HTML import and HTML export now
how does this actually appear in the
text the NSTextList
object is a paragraph
level attribute so it appears as part of
an NSParagraphStyle but these can be
nested so a given region of text may not
just be in one list it may be in
multiple lists so NSParagraphStyle
doesn't have just one NSTextList it
has an array of NSTextLists listed in
order from outermost to innermost
that's how we specify which portion of
the text lies in which lists and we have
a couple of convenience methods now on
NSAttributedString for determining the
entire range of a particular list or the
location in a list of a particular item
and the NSTextList object
itself has some methods for specifying
the format of its markers I'm not going
to describe that now you can look at the
release notes it's
fairly standard we support a number of
basic list formatting options with
possibly more to come in the future now
here I have a little diagram showing how
this works here we have two nested lists
an outer list and an inner list some of the
text is only in the outer list in that
case the text lists array just has the
outer list in it and some of the text is
in both lists the outer and the inner
and for those the text lists array would
have the outer list first then the inner
list all right so now we have all the
ingredients to understand how our text
block and text table support works I've
said that NSTextContainer
represents the geometry within which the
text is laid out and it's quite general and
flexible in doing that but it's
essentially static it doesn't depend on
the text that's being laid out in it the
point of the new mechanism and the new
class NSTextBlock is to allow
the attributes on the text to
affect how it is laid out so an NSTextBlock
is going to be attached to
the text much as an NSTextList is
as an attribute and it's going to
participate in the layout process
somewhat as an NSTextAttachment does
the content the text in the text block
will all appear in the text as normal it
will all be laid out perfectly normally
laid out and displayed with one little
change and that is that the text block
is allowed to affect the line rectangles
within which the text is being laid out
so the way that text blocks appear
in the text is very similar to the way in
which text lists appear because blocks again
are going to be paragraph level
attributes so they'll appear on the
paragraph
style and they too can be nested so a
given section of text may be in more
than one block and so a paragraph
style in addition to the text lists array
also has an array of text blocks again in
order from outermost to innermost and
again we have a couple of convenience
methods on NSAttributedString for
determining the entire range of a
particular block or a table I'll get to
tables in a minute so how does this
get treated when it comes to layout
during the layout process we are
going to define two rectangles for a
given text block the first one is the
layout rectangle that is the rectangle
within which the text of the block is
actually going to be laid out and then
there is a bounds rectangle which is a
rectangle around the text of the block
that also includes space for margin
borders padding any additional region
around the text that is not available
for the layout of other text the layout
rect for a block has to be determined
just as layout for the block is starting
immediately before the block is laid out
and the way this happens is that the
typesetter notices it's starting to lay
out a block of text and it calls on the
NSTextBlock and asks it what its
layout rect should be and passes in the
location at which it's starting to lay out
text and a containing rectangle which
for the outermost block would be the
bounding rect of the text container but
for inner blocks would be the rect of
the immediately enclosing block then the
bounds rect has to be determined at the
end of layout for the block immediately
as soon as the block has finished
being laid out because the bounds
rect generally is affected by the
length of the text in the block and
again the typesetter notices that
it's finished laying out a block and
calls on the block and
asks it for its bounds rect and the
NSTextBlock gets to determine that
rect and pass it back to the
typesetter the typesetter then as
usual calls out to the layout manager
and tells the layout manager what it has
found out it stores these two rects for
the block in the layout manager and the layout
manager keeps that information and
maintains it here's a little diagram so
when this block is being laid out it's
laid out in this layout rectangle which
is typically a long rectangle since
we're allowing the block to be an
arbitrary height then after the block
has been laid out then the bounds
rectangle is wrapped around the text
with any extra space needed for
margins border padding whatever the
block wants to put there now then when
it comes time for display again the
layout manager manages the display the
layout manager actually has two methods
for display one to draw background and
one to draw the glyphs so when its
method is called for drawing background
there's one addition that is that it
notices that there are text blocks to be
drawn and it calls upon those text blocks
in order from outermost to innermost
and asks them to do any drawing they
need to do and in that case they can
draw backgrounds they can draw
borders any sort of decoration that
they need and then the usual glyph
background is drawn on top of that and
then when the layout manager's method for
drawing glyphs is called the glyphs are
simply drawn on top of all this
background and show up in the right
places now all that is a general text
block mechanism there is also a specific
version for text tables and we have a
subclass of NSTextBlock called
NSTextTableBlock and an NSTextTableBlock
represents a block that appears
as a single cell within a table the
distinguishing feature of table layout is
that the different cells in the table
have to coordinate their layout with
each other so there is another object
NSTextTable that represents the
table as a whole and each table block
each cell has a reference to the table
as a whole and when it comes time for
layout or for display as I said the
layout manager calls upon the block in
this case the NSTextTableBlock and
NSTextTableBlock as the subclass does
something special it passes those
requests on to its NSTextTable
which coordinates them with all the
other cells of the table and returns the
appropriate results or does the
appropriate display now all this is the
general mechanism the standard text
block and text table and text
table block classes in the kit have a
number of parameters that determine how
they actually do size and position
themselves and these are things like
margin border and padding widths on each
side content width and so forth they can
be specified either as an absolute point
value or as a percentage of the
enclosing block and text blocks will
draw a background color if specified
they can also specify border colors if
you like a separate one for each side you
can look at the details they're all
in the release notes it should be not
terribly surprising but what I want to
emphasize here is that these do not
limit you the mechanism is very general
if you have a custom subclass of
NSTextBlock you can do any sort of sizing
and positioning and decoration that you
like
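
The relationship between the two rectangles and the margin, border and padding parameters can be sketched as a small calculation. This is a Python model under stated assumptions, not the AppKit implementation: `Dimension`, `total_inset`, `layout_rect` and `bounds_rect` are hypothetical names, the same widths are assumed on all four sides, and percentages are assumed to resolve against the width of the containing rectangle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float


@dataclass(frozen=True)
class Dimension:
    """A width given as absolute points or as a percentage (hypothetical)."""
    value: float
    is_percentage: bool = False

    def resolve(self, enclosing_width: float) -> float:
        if self.is_percentage:
            return enclosing_width * self.value / 100.0
        return self.value


def total_inset(containing: Rect, margin: Dimension, border: Dimension,
                padding: Dimension) -> float:
    # Space between the edge of the bounds rect and the text, per side.
    return sum(d.resolve(containing.width) for d in (margin, border, padding))


def layout_rect(containing: Rect, start_y: float, inset: float) -> Rect:
    """Computed before the block is laid out: inset from the containing
    rect and open-ended downward, since the text length is not yet known."""
    top = start_y + inset
    bottom = containing.y + containing.height
    return Rect(containing.x + inset, top,
                max(0.0, containing.width - 2.0 * inset),
                max(0.0, bottom - top))


def bounds_rect(layout: Rect, used_height: float, inset: float) -> Rect:
    """Computed after the block is laid out: wraps the height the text
    actually used plus the margin, border and padding space."""
    return Rect(layout.x - inset, layout.y - inset,
                layout.width + 2.0 * inset, used_height + 2.0 * inset)
```

For example, a 5 percent margin in a 400 point wide container resolves to 20 points, so with a 1 point border and 4 points of padding the text is inset 25 points on each side.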
so let's go back to the demo machine
I'll give a little demonstration of this
now I said that the
standard text block class in the kit
will do a background color it doesn't at
least not yet support for example
background images but there's nothing to
prevent you from writing your own text
block class that will do that and I've
written a tiny little sample here that
does a background image and in addition
it does some custom sizing it sizes the
block to the size of the image let me
turn that on and notice that here is the
text up here in the block and I can
select it make it bigger make it bigger
if we like and so forth and
let me take a look at the code for that
so the methods that we need to implement
for this are first of all when we start
the layout of the block we need to get
the layout rect and the typesetter
will call upon this custom subclass and
ask it for the layout rect and here all
that I'm doing is determining the layout
rect determining its width
based upon the size of the image and to
be nice I've made it centered
horizontally in the containing rect so a
little bit of calculation here but it's
pretty straightforward then when we're
called upon to generate the bounds rect
all I'm doing here is
trying to make the bounds rect equal to
the size of the image of
course it might be that our container is
not big enough for the image or it might
be it is bigger and we need to center it
so there's a little bit of calculation
here to determine the bounds rect in
those cases and we return that and then
when it comes time to draw all that
we're going to do is just draw this
background image but again the rect
might be smaller than we actually wanted
or it might be bigger than we actually
wanted so we're going to adjust for that
trim or center one way or the other and
so there's a little more
calculation here but again quite
straightforward and then we just draw
the image in the rect and that is what
is necessary to make a custom
subclass of NSTextBlock now
let's go back to the slides
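
The sizing calculations in that image-backed block might look roughly like this. It is a simplified Python sketch, not the sample's Objective-C code: `image_layout_rect` and `image_bounds_rect` are hypothetical names, and the assumptions are that the block is clamped to the container when the image is too wide and that the bounds grow to at least the image height.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float


def image_layout_rect(containing: Rect, start_y: float,
                      image_width: float) -> Rect:
    """Layout rect as wide as the image, centered horizontally.

    If the container is narrower than the image, the width is clamped to
    the container and the rect starts at its left edge.
    """
    width = min(image_width, containing.width)
    x = containing.x + (containing.width - width) / 2.0
    height = max(0.0, containing.y + containing.height - start_y)
    return Rect(x, start_y, width, height)


def image_bounds_rect(layout: Rect, used_height: float,
                      image_height: float) -> Rect:
    """Bounds rect after layout: at least as tall as the image, taller if
    the text needed more room than the image provides."""
    return Rect(layout.x, layout.y, layout.width,
                max(used_height, image_height))
```

Drawing would then place the image inside the bounds rect, trimming or centering it when the two sizes differ.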
and finish up that's what I have to
present to you today if you want to see
more the first place to go is the
release notes which has detailed
descriptions of all this for the pre
Tiger features there is extensive
documentation and actually some of
the nice diagrams I showed you here were
taken directly from that documentation
and the samples that I showed you here
should be available for download on the
ADC website the contact person is
Matthew Formica and fork is phonica at
apple com