WWDC2004 Session 314
Transcript
Kind: captions
Language: en
Good afternoon, and welcome to the five o'clock session here on Friday afternoon — the last session of WWDC. My name is Matthew Formica. I am, as many of you I'm sure already know, the Cocoa and development tools evangelist here at Apple Computer, and this is the one session of the week where I actually get to present some technical content, as opposed to just hosting Q&A. So with that said, let's get started.
One of the things we've noticed in Developer Relations, as we work with all of you on various applications, is that the low levels of the operating system have a big impact on application performance. In fact, a lot of the time, how an application launches and performs has a lot to do with the dynamic linker, dyld, and with how the application is linked together statically as well. These pieces of the operating system are often less well understood than some of the other API sets in the OS. On top of this, Apple is making significant technology advances in these areas for Tiger, so we thought this session would be a good opportunity — almost a trail guide — to help you understand some of the things we're doing in this space.

What we're going to talk about today: prebinding, which up until recently has been the way to get applications to launch quickly on Mac OS X; some of the changes we made in 10.3.4 and later; what we're planning for dyld in Tiger; how to use dead-code stripping; and then we'll dive down into the compiler. A little bit of a recipe, cookbook-type approach here today — we're going to cover a lot of ground.

We're going to talk first about how prebinding works. This is to show you where we've come from; it will also hopefully give you a little insight into how an application launches, and it still has some applicability today, especially if you are still trying to maintain compatibility with older OS versions. So, what is prebinding?
Prebinding makes Mach-O applications launch faster. (Mach-O is the native binary format for Mac OS X.) What prebinding does is rebase your binary to a virtual memory address ahead of launch time: your application needs to be somewhere in the 4-gigabyte address space, and prebinding says where it goes. Prebinding can take a lot of work to get right.

We start with the Mach-O binary format. In addition to all of the application data and other symbolic information that goes in there, a Mach-O binary can contain links off to the other libraries and frameworks it links against, as well as symbolic information about the addresses where the symbols it needs should be located. When an application is loaded into a fresh address space, dyld checks the binary to see where it should be loaded — and by default, applications get loaded at address 0. When dyld loads a library or other framework that the application is linked against, if you don't prebind, the library will also have an address of 0, and so the library and the application will collide. dyld then has to go through the work of sliding the library to an available slot. This takes time — it has the potential to be slow, especially if you have a lot of libraries or frameworks. Finally, the application is ready to actually start running your code.
So to set up prebinding, there are a couple of things we've traditionally done. Xcode by default passes the -prebind flag down to the linker, which turns on prebinding when you build. This is actually why, when you build an application and link it against a framework you've built, if you don't set things up properly, the prebinding will break: you'll get conflicting addresses, and you'll see messages saying the framework or library overlaps the text segment of the application. That's because you're trying to load them both at address 0. To remove that conflict, you want to set the -seg1addr flag for the library or framework, to actually specify where in the 4-gigabyte address space the library or framework should be loaded. You're binding it to that address, so that at launch time, that's where it will get loaded.
There are a couple of things I commonly use when I'm helping developers debug prebinding. vmmap is a command-line tool that will basically give you a complete dump of the address space: you can see which system libraries and frameworks, as well as your own application and its libraries and frameworks, are loaded, and what addresses they occupy in the address space. If you're launching from the command line, you can use DYLD_PREBIND_DEBUG, an environment variable that will have the system print out some information indicating whether what you're launching is prebound.
This sounds complicated — it is. It takes a lot of fiddling to get right. The benefit is that your application launches a lot faster. For a prebound binary, the application is once again loaded at address zero, but when the library is loaded in, it's already been set up by you to have a different address that doesn't collide, so dyld can just load it up without having to slide anything. All the addresses are already correct, so the application just runs, and this can be a lot faster.

The advantages of prebinding: it works on 10.1 and up, so if you need your application to launch quickly on 10.2, you still want to pay attention to prebinding; and when it works, it works well — it gives you a large win in application launch time. The problem, as I've discovered helping many of you, is that it's hard to get right. It's very easy to pick an address that you think works for your frameworks or libraries but doesn't actually work — there's still some overlap. Or perhaps you pick an address that works today, but then your application grows in size as you add code, and you bump into an overlap down the road. It's hard to keep everything in sync manually, and it's also very fragile: as we change things in the system, perhaps we end up taking an address range you were using before, and then your application can no longer be prebound. So we thought hard about this, and we came up with a new approach — we've got a better approach now — and to talk about what we're doing now, I'd like to invite up Jeff Klassen.
Thanks, Matt. One thing I wanted to add before I get into what we did in 10.3.4: Matt mentioned the fragility of prebinding, and part of what makes it fragile is what Matt talked about — things growing and shrinking, and your base addresses and your preferred addresses getting out of sync. That's not even all in your control. For example, Apple could ship a software update that grows one of the system frameworks so that it collides with yours, and then you would have to generate new preferred addresses. That is actually a source of a lot of the fragility, and a lot of the pain, for very complex third-party applications that provide a lot of frameworks that get linked in.

So let's do a little more history. We had this problem with launch flow, and we came up with a solution: it was prebinding. Very early on, when we decided prebinding was something we needed to do, we realized that the time Matt described — sliding, relocating, and dynamically loading the libraries — is in fact a bottleneck of app launch, and prebinding was the shortcut we used to try to precompute some of the work the dynamic linker does when you load things into memory, as Matt showed in the animation. So, here are some numbers,
actually, just to illustrate what prebinding does for you on 10.2, and on 10.3.3 and earlier. This is a very large C++ application with a large number of private frameworks. If you don't prebind that app, it actually takes 48 seconds to launch — that's a long time. This happens to be a cross-platform application, and on Linux the same application launches in about 15 seconds, and these are comparable machines (the Mac, obviously, is a G4). But if you take that 48 seconds and actually prebind the application, it drops down to 12 seconds, and that's actually an acceptable launch time for an application as complex as this one. That's the time to splash screen, so there's a lot of work going on — this may be time spent doing work in the dynamic loader, or time the application spends initializing itself. But to go from 48 seconds to 12 seconds is something that affects end users, and it's something a developer would be encouraged to put work into.
So what did we do? Matt described some of this already. If you don't prebind on older versions of Mac OS X, you most likely have slow launch times, and the more complicated your application is — the more dependent frameworks — the slower that launch time gets. That also, coincidentally, makes it harder for you to keep all those addresses straight, avoid overlaps, and avoid having the prebinding be determined invalid and have to be redone dynamically at launch time.

There are a couple of other things about prebinding that a few of you may care about. Because the system dynamically modifies the application executable, you may have seen this daemon called fix_prebinding that gets started up now and then: if the dynamic linker detects that your prebinding is out of date, it kicks off this daemon to try to fix up the prebinding, so that the next time you launch that application, it will launch faster. But what that means is you can't sign your code, because the file is actually being modified by the system outside your control — and a lot of you, I think, care about that.
The other thing that's really interesting about this: I'm sure you've all seen, during installation — at the end of installation — that barber-pole progress bar that says "Optimizing System." It takes a long time. How many of you have gotten bitten by minutes of that? OK. What's going on there is that the installer kicks off a process that fixes up the system prebinding. If you install a new framework, it has to go to all the dependent executables of that framework and re-prebind them with the new addresses that are in the framework you installed.

Here are some examples — these are actually kind of extreme. Keynote is a fairly complex application with a lot of dependent frameworks that installs lots of files. It actually takes 50 seconds to install on this machine, and after that it takes 90 seconds to re-prebind everything. That's not too bad. But something like a security update, which contains maybe just the System framework, takes five seconds for the installer to actually write the files onto the disk — and then you're sitting for more than five minutes while the system fixes everything up, because everything is dependent on that System framework. So we'd actually like to try to reduce that.
So, we basically solved launch-time problems by creating install-time problems for the user. That may not be that bad — you know, you install once — but it's still a painful experience if you're on an older machine, like an iMac or a PowerBook with a slow disk drive, and again, it's complex. So what are we going to do — or rather, what have we done? What I'm going to do is talk about some of the improvements we made in 10.3.4, and then later we'll talk about even more improvements we're making in Tiger.

For 10.3.4, what we did is actually take a step back and try to understand the real problem, not the symptoms of the problem. The symptom is slow launch time; the real problem is: find the hot spots in our dynamic linker and optimize them. An interesting point: we're not the only operating system that does prebinding. It goes by various names, but Linux has a prebinding-like thing, and so do Windows and Solaris — they all have something very similar to prebinding. What's interesting about those other operating systems is that they don't see the order-of-magnitude improvement in launch times that I showed for that large application earlier; they only get about ten to twenty percent improvement when you prebind. And it's no easier to prebind on those other operating systems, so because it only buys ten to twenty percent performance improvement, most developers choose not to put that extra amount of work in.
So, again, here's what we did. dyld was originally designed before large C++ applications were the norm, and C++ introduces a number of complexities into the object file: there's a greater number of symbols that are exported, and templates have interesting properties that the linker needs to fix up at runtime. What has happened over the years, since we designed dyld and then did prebinding, is that the number of symbols that have to be fixed up at launch time has grown significantly. So it was time to tune our dynamic linker for large C++ programs. We took advantage of some tools: we used Shark. It's a great tool — I don't know how many of you saw the talk on Shark right before this one, at three-thirty, but it's a great tool and I encourage you to use it as much as you can to find your hot spots. We used Shark, found the hot spots, and optimized.
Here are some results. This is the same application — I believe it's different hardware, which is why the numbers are different. What this chart shows is the time we're spending in the system, in dyld, split out from the time the application itself is doing work. Back on 10.3.3, for an unprebound application, the launch time was 80 seconds, and of those 80 seconds, 72 were dyld fixing up the relocations — basically rebinding and sliding. If you prebound that application, it dropped to 18 seconds, and only 10 of those seconds were dyld actually loading all the libraries. On 10.3.4, with the new dyld that we shipped, we spend only two seconds for a non-prebound application.

Now, if you look at the size of that green bar versus the total time — that's where Shark comes in. Since we've sped up our part of your applications' launch so much, you really need to dig down and try to improve the other, now eighty percent, in this example; we went from taking ninety percent of the time to not a very large percentage. So it's time to dig out Shark. Basically, the take-home message here is: if you only care about deploying your application on 10.3.4 and later, you don't need to prebind at all, because you get very little benefit. The really important thing is that fully dynamic launches are now faster than they used to be prebound. Your non-prebound case may still be ten or twenty percent slower — you still get some performance benefit from prebinding — but we are faster now than we used to be with a fully prebound application. With that, I'd like to bring up Robert Nielsen to talk about the dyld that's in Tiger.
Thank you, Jeff. So — you're going to give me water here? In anticipation of dry mouth. I'll get this started. I don't know how many of you have had a chance to install Tiger; with all the PowerBooks around, I see a lot of people installing things. We have a brand-new dyld in Tiger. This is kind of a big undertaking, in that, probably next to the kernel, dyld is one of the most fundamental parts of the system. So, as you would expect, we were very cautious with this; but given the history of everything we've been through, and what you've seen, we felt it was a really important step to take.

There we go. So what do you get with the new Tiger dyld? Well, it's a new implementation. I don't think I need to tell anybody in the audience that when you have solved a problem, and it's taken you 15 years to get to where you are, being able to re-implement it from scratch, you always get it better the second time — faster. We were actually able to address many algorithmic issues, and also assumptions made 15 years ago when the old dyld was started that are no longer true. It's more standards-compliant — we'll drill down on this a little later. And it's instrumented. I want this to be a message you take home: if you miss everything else in this talk, take this home. We instrumented the new dyld
along many axes, so that you can actually measure how long dyld is taking and what it's doing: how much of its time is rebasing, how much is rebinding, how much is initialization. We've instrumented it so you can actually see all the various libraries that are loaded and what causes them to be loaded. You can now actually look at what part of the launch process is dyld's fault and what part is things that you can actually go address. This is really important: the old dyld never had a mechanism to point these things out, so as a result, whenever things were slow, it was always dyld's fault. And we are consistent between prebound and non-prebound. In the old dyld, prebinding came later in the process, so there were times when applications behaved differently — slightly different code paths when things were prebound versus not prebound. That can be problematic, as you can imagine.
OK — new implementation, better basis for innovation. What we did was take the old dyld, its APIs, and all of the semantics that were expected of it, and re-implement that. We ended up with a much smaller, much faster implementation, but a compatible environment. What's really key about that is it's now a new code base — smaller, cleaner — and we will now be able to actually begin innovating. One of the things we had to do: as Jeff mentioned in his talk, there's been a lot of language progress since dyld was started 15 years ago, and many of those changes have to do with C++. Again, Jeff mentioned templates — you have template coalescing, etc.; static initializers (this is an important one; we'll drill down on this a little later); exceptions and exception coalescing; and so on. So we are improved in many regards, but really we're now just at the basis for where we will begin to innovate.

The code paths for dylibs and bundles (a.k.a. plug-ins) are unified — they are in fact the same path in almost all cases.
We've actually struggled with this, because people would sometimes make things bundles or dylibs based on what their performance requirements were, and now you can actually choose the right tool for the job. The only difference, by the way, for those of you who haven't gone down this path, between dylibs and bundles is that bundles do not have external symbols to which you can link. The idea there is that if you don't have a link-time dependency on something, you can unload a bundle; currently we don't have support for unloading dylibs. But now you can choose the right tool for the job.

More compliant. I tried to stay away from referring to standards, but we end up using the word here anyway, because there are industry standards, there are de facto standards, and there are the C and C++ language standards. The former dyld did what's called lazy initialization, in pursuit of speed — again, this was along the lines of what we were trying to do to make launch time faster.
So I'm going to back up a little bit and talk about what's called lazy binding. We have a concept in dyld called lazy pointers. Lazy pointers are pointers that you actually call through — they're function calls; they're not accesses to data, and they're not the address of a function. These are called lazy pointers, and what we do is resolve them lazily. If you have, say, 10,000 references to external symbols in your application, we don't resolve all of those at launch time. We actually wait until you call through the first time: you pay once to have that symbol resolved, and then every place in your code that calls through for that same symbol gets the resolved address. The idea was: hey, this is a good idea — lazy binding saves launch time — so why don't we take that idea and apply it to initialization: lazy initialization.
The theory was that we would detect when you're calling a function in a module or a library, and say: oh, hey, we need to actually go run the initializer now, before we resolve your pointer. That was the theory. What we discovered was that it wasn't standards-compliant — and there are two standards I'm talking about here. There is the de facto standard in C: how all the other environments do it, how GCC does it on other platforms, how Microsoft does it, how CodeWarrior does it. There is an industry standard for how this is done, and that is: all initializers get run, all the time. And we didn't do that. We were trying to be clever, trying not to run initializers when in fact we should have. And it turns out it wasn't enough of a speed win: people have learned over the years not to do really heavyweight things in their initialization routines, and in fact, the logic in dyld for deciding when you can and cannot run initializers turns out to cost almost as much as the speed win was worth.

So the new dyld is fully compliant with both the industry (i.e., de facto) standard and the C++ standard. If you take a look at the history of the language specifications, the C standard says very little about linking, and nothing about linking of shared libraries and dynamic libraries. C++ didn't follow that restraint: it got quite involved in talking about how symbols should be linked, and made reference to static libraries as well as to dynamic and shared libraries. What the standard says is: you will call all initializers, and you will call all finalizers. Now, it doesn't say when. You could theoretically, just before you exit, call all of your initializers and then call all your finalizers, and be standards-compliant. But you figure, if you have to do that, you might as well do the initializers up front and the finalizers at the end, and be standards-compliant — that's what we should do. So the new dyld is compliant both with the sort of de facto standard that rules the C world and with the C++ standard, and this will improve performance in some places.
One of the big wins is that it won't require you to use the bind-at-launch mechanism, which I'm afraid many of you had to use to get all your C++ initializers to run. In the old dyld, you'd set the bind-at-launch flag, and the poor thing would have to go resolve all symbols so that it would actually get to the initializers. That behavior is no longer required — that's a huge win for people at launch time.

There are also some speed improvements when linking with single-module. In the old dyld days, there was a difference between multi-module and single-module — there's some history I won't go into — but one of the benefits was that you could have initializers in different modules, so it would initialize just one module, one part of a library, and try to do that lazily. Since we now actually call all initializers, per the standards, the notion of having multiple modules isn't an added advantage you need anymore. In fact, if you go with single-module, in many cases you'll actually see an improvement in performance, because the dynamic linker doesn't have to deal with as much. So single-module is faster than multi-module, and if you're using Xcode, it does the right thing.
Here we go — this is one of the big ones: instrumentation. dyld is often the least understood tool in our tool chain. I've been thinking about this, and it really feels like, to the developer, it's the absolute end of the chain, and to the user, it's the beginning of the chain. It's actually neither fish nor fowl — it's that boundary condition in the middle — and developers, in large part, don't really realize that this is in fact part of their environment; it's just doing the very last bit of work. So, sitting on that boundary, it doesn't get a lot of attention, and quite frankly, the former dyld didn't have the instrumentation and didn't give you any feedback about what it was doing. To address this, we've instrumented dyld — and we're just getting started. What I really want you to take away is: get back to us and say, "Hey, this is great, but wouldn't it be better if you did this and this?" We're just getting started here.
But here are some environment variables that we have. For instance, DYLD_IGNORE_PREBINDING allows you to take an app that's fully prebound and actually run it unprebound and prebound, side by side. You set the environment variable — the right-hand side is either "all" or "app" or "split"; I forget the exact syntax, please look at the man page — but in other words, you can see what happens if your app is not prebound but everything else is, or if nothing's prebound. You can do side-by-side comparisons to see if prebinding is actually worth it. I bet you'll find out it's not.

DYLD_PRINT_APIS will actually print, every time you call through a published dyld API, the API and the arguments that you've passed in.

DYLD_PRINT_INITIALIZERS — this is a big one. You actually want to see this, so you can make sure not only that your initializers are being called, and being called only once, but that they're being called in the right order. Initialization order could be a talk in and of itself, but DYLD_PRINT_INITIALIZERS will help you sort out some of these hairy problems. Initializers are called pre-main. Many people, when they get inside gdb, say "break at main" — but how do you break on anything before main? These initializers happen pre-main, so it's often hard to debug these things.

DYLD_PRINT_LIBRARIES shows you, as things are coming in, what's being loaded and which libraries cause which libraries to be loaded. So if you're trying to figure out under what circumstances a library gets loaded, or if you're trying to determine whether you're actually loading the debug or profile version of a library, this is a way to do it.

DYLD_PRINT_STATISTICS — this is one of our favorites, and probably the first one we added. It actually does time measurements of how much time you spent rebasing and rebinding, how much time you spent in initialization, and there's a sum total that says how much time was spent in dyld. What we have found on big applications — applications that might take four seconds to launch — is that only about 400 milliseconds is actually spent in dyld. So the problem is no longer dyld.

Please run `man dyld` to get this information, and please give us feedback on this. One of the goals of the instrumentation is that we actually want to unify the output grammar to be very consistent, so those of you who are script writers — and I'm one of them — can write a pretty simple script to parse the output of these things and do what you will with them.
So let's talk a little bit about initialization and finalization. You're going to hear us bounce back and forth between initialization/finalization and construction/destruction. They have different meanings, but in this context they have very similar meanings, so I'll see if I can iron that out for you.

The history of this starts with the approaches in C. Initially — and many of you who've done any Unix programming at all have had to deal with this — in the early days there were magic functions called `_init` and `_fini`, named in the clever way C programmers have of keeping everything to the fewest possible characters: "init" for initialization and "fini" for finalization. You were allowed basically one of these per executable. You could play some tricks with the linker and have one per library, but quite frankly it was very difficult. It had to be an external symbol, so you were able to do this, but if you didn't do it exactly right, you sometimes found that you'd turned off initialization for libc — and that isn't good; you really need libc to get initialized.

Then, in a huge leap of technological advancement, somebody said: hey, let's put it on the command line, and called it `-init`. Then you could pass in the name of the function. Woo-hoo — big improvement. (By the way, you actually always want to do this kind of thing in code; you don't ever want to do it on the command line.) Apple, along the way, added its own extension, a pragma. This may be the first time you've seen it — please forget it right after you walk out of the room. It's an extension we added — better, again, to put these things in code — and it's all deprecated. There is a mechanism that GCC has, and we'll go into that. Don't use these old mechanisms anymore; there are lots of problems with them. If you have these in your binaries, we are binary compatible and your code will continue to work, but as we move forward, we want you to go ahead and move to the recommended practices.
So, in GCC there is an extension that has been around for a long time. Remember I mentioned initialization versus finalization, construction versus destruction: these are actually called `__attribute__((constructor))` and `__attribute__((destructor))`. It probably would have made more sense to call them "attribute initializer" and "attribute finalizer," but quite frankly, what's going on here is that these sit on top of the same primitives the C++ compiler uses to do construction and destruction. Since it's ultimately sitting on top of the same primitives, the GCC folks, when they added this extension to the language, said: let's call them constructor and destructor.

So, a lot of advantages to this. A couple of things to note: I would recommend that you make these static functions — there's no requirement, as in the previous mechanisms, for making them extern, so do make them static. They must be void/void, meaning they must return nothing and take nothing. You are allowed to have multiple constructors and multiple destructors per file — that's also called a translation unit, for those of you who read specs — and also across modules and across libraries, so you can have many of these. It is a recommendation that you have one of these per translation unit and call your other functions from there. The standard does say you can have multiples, and that they will be called in file order. In the case of C, that's a pretty straightforward thing; however, when you get to C++ and start looking at what it means in C++ to do these kinds of things, things can get very confusing very quickly.
in C++ there was no need to add a language extension, because C++ already has a mechanism for static initialization. If at file scope you say int i = foo(); that is in fact static initialization; it will happen at that point in the file as the translation unit is being handled, so it will happen early. Also, you can create objects at file scope, or again at namespace scope, which is a modified file scope, and their constructors and destructors will in fact be called, and they can be tricky.
Getting all of that right with C++, and trying to depend on the order within a file, is beyond the scope of this talk, so please pick up Scott Meyers' book Effective C++, item 47; the fact that this has its own item number should tell you something. So, in summary, what are the take-home points?
Tiger's dyld is faster; it's so fast that prebinding is not required, and it's only going to get better. Conforming C++ static initialization: this was one of the driving forces behind us going and revisiting dyld and actually doing a new one from scratch. And it's instrumented, to aid the developer's understanding of dyld and help you figure out what it's doing and what you can do to help make your launch times faster. Thank you very much.

Gee, thanks, Robert. So I want to shift gears a little bit, and what I want to talk about now is: on Monday, Ted Goldstein in his keynote introduced the feature that we are now providing, dead code stripping support, in our tool chain, and that's both in Xcode 1.5 and in Xcode 2.0. We didn't explain much about what it is and how to use it, and that's what I'm here for. So, what is dead code stripping?
Dead code stripping is really something that the static linker does. It does an analysis of your entire program, determines which symbols, code segments, and data items are not referenced at all, and it just removes them from the final linked image, and that saves space. Certain classes of applications have more dead code than others. On the Macintosh platform, the Macintosh tools historically, from other vendors like Metrowerks, have always had dead code stripping, and the style of programming has been: let's put, you know, a hundred utility functions in one C file, and if I don't use them, they'll get thrown away. The tools that we've been shipping up to now come from a UNIX background, where the UNIX style of programming is: let's just put everything in lots of .o files, and then you've got static archives, and then shared libraries appeared. So dead code stripping hasn't been as important in the past for UNIX-type applications.
Other things that are interesting when it comes to dead code stripping are C++ template instantiations, where the compiler can actually sometimes generate code that is never referenced, but that references things that are not defined in your program, and that actually causes a link error if you don't strip that code out; so we do that now. So how do you do it? There are two command-line options for the static linker now: -dead_strip, which enables the feature overall, and the other one, which is actually kind of important, -no_dead_strip_inits_and_terms: don't strip your initializers and finalizers. What's interesting is that initializers and finalizers are almost never statically referenced in your application, and if you don't pass this, the linker will say: hey, they're not referenced, throw them away. So please, if you have initializers, whether they are static or not, whether they are exported or not, do use this flag, or they will be thrown away when you turn on dead code stripping. In Xcode you don't actually need to remember these flags; I'll go into how to do this in Xcode in a second. But dead code stripping doesn't come for free. One limitation with what we have today is that you actually need to rebuild your program and use -gfull if you want debug information. -gused is an optimization that the compiler in Xcode uses by default to try to minimize the amount of debug output; however, if you don't have full debug symbols for the static linker to deal with when it tries to strip, you may end up with debug symbols in your final image that have no code backing them, or vice versa: you can actually end up with code that doesn't have debug symbols. So -gfull is important; -g is not good enough, because -g defaults to -gused. And again, I'll show you how to do this in Xcode, so you don't have to type stuff on the command line or type custom options. It's important to
remember that any symbol that's exported from a dynamic library is considered used by default. This is a good thing: if you have a global symbol and it's in the library, you expect someone to be able to link to it, therefore it is considered used. For symbols that aren't referenced statically, you can actually use another GCC attribute, called __attribute__((used)), that tells the compiler: this is used, please don't throw it away. So for example, if you're trying to use either the dlcompat or dyld APIs to reference a symbol only dynamically, you need to flag your source code to tell the compiler: yes, I want this symbol in my final image.
One other thing you can do: there are a couple of linker flags that let you deal with symbols in a bulk manner. There's a feature in, I think, the Microsoft compiler, called __declspec (how many of you are familiar with that?), for specifying whether or not to import or export a symbol. We don't quite have that yet, but you can actually specify a file where you either say: here's my library, and here's the list of the only symbols that I want exported; you pass that with -exported_symbols_list, and then anything else not in that file will not be exported as a public symbol. Or you can go the other way, with -unexported_symbols_list, and say: export everything except these things, because they're really secret and private. So you have a choice of how to control that. You do need to use the new compiler: there's a new version of GCC 3.3 that comes in Xcode 1.5 that's been updated to tag these object files so they can be stripped by the linker. If you don't recompile your project with the new compiler and the linker gets some old object files, the linker needs to be conservative and treat those old object files as a single block, basically the old UNIX semantics. So if you have 100 functions in an old object file that has not been recompiled, and you use one of them, all hundred of those will end up in your final image. If you do recompile, the ones that aren't used will be thrown away. So it's really,
really easy to do this in Xcode. You may have seen this already in the preview, if you've installed Xcode 1.5 or Tiger: there are a couple of checkboxes in the target inspector, and they correspond directly to the linker options. There's help underneath that you can read to make sure you're doing the right thing, but check one or both of them, depending on what you want. And for debug (this is also in the project inspector), you need to make sure you set -gfull as the level of debug symbols; you can actually search at the bottom for "gfull" and it will do the right thing, because we search the help text also. So that's all I have to say about that for now, and now I want to bring up Geoff Keating to talk about wchar_t, which is kind of a different topic.
Okay. So, moving away from linkers and loaders and such: in the Panther timeframe we introduced wide character support to GCC and to all of the system libraries. Wide characters are a standard part of ISO C and ISO C++; they were introduced last century, and in the Panther timeframe we finally got around to implementing them completely, in both the compiler and all of the system libraries. So
those of you who are familiar with it will probably know all this already, but basically: instead of using the character type char, you use wchar_t, and for strings and character constants you just place an L in front of them to indicate that you want a wide string or a wide character constant. Unlike regular strings and regular character constants, you're no longer restricted to just the standard C character set, basically the low half of ASCII; you can actually use characters from other languages. Japanese, all the accented characters from all the European languages, Chinese: they all work in strings. You can use all of the standard C and C++ library functionality; you can print them out using printf (here's an example of a single wide character that we're printing out using printf). There's an additional kind of stream, called a wide stream, in both C and in C++, that works on wide characters just as regular streams work on regular characters, so you can use all of the C++ functionality to print out wide strings, to print out wide characters, and so on.
The key feature of wide characters, which makes them different from what you could sort of do before (which is just put UTF-8 into a regular string), is that in a wide string each character is just one unit. So here, for example, in the last bullet item we have a string that contains exactly two characters; you can index that string, and the first item in the string is a complete character by itself, not the first byte of some longer sequence.
I should point out that while we have the standard C and C++ functionality, this doesn't include anything that, for instance, draws 20 lines of wide characters to a screen, formats them nicely, puts in line breaks, justifies them on both sides, or remembers which direction to write them in. We don't have that; for that you want to use the Carbon and Cocoa functionality, and in particular you should consider using CFStrings. CFStrings don't work like this: they have an encoding that's specialized for each individual string, for each individual language, and so on, and as a result of that, CFStrings will often be more efficient than using wide strings, so long as you don't need to do heavy-duty text processing or any kind of language-based processing of those strings. So, like
most unix-like systems, on Darwin we chose to make wide characters four bytes; they contain a UCS-4 code point from Unicode. This lets Darwin support all of the characters in Unicode while still maintaining the property that every character is one item in the string. Some other operating systems decided to use two bytes for wide characters, because we'd never have more than 65,000 characters in the world; it turned out that wasn't such a great choice. With the new extended CJK characters we actually need more than 65,000 characters, so we need the full four bytes.
So, that was wide characters. I mentioned that you could sort of do this with regular strings, with strings made out of char. The key thing to understand is that the interpretation of char varies at runtime, depending on what locale you use; there's a whole functionality involving LC_CTYPE and setlocale, described in the C and C++ standards. As a result of this, if you try to use anything outside that basic ASCII character set in a char string, it may work, but you need to be very careful about testing. It may work on your system, but when someone from a different country tries to run your software, they may discover that what you thought were perfectly fine Japanese characters turn into strange accented characters that make no sense. The GCC 3.3 compiler, as
shipped in Panther, sort of expects input to be in UTF-8, so you should just go into Xcode and pick the appropriate drop-down menu item that says my source files are in UTF-8, because that's really what the compiler is expecting, and that's what it'll try to convert to on output. We hope to improve on this in GCC 3.5, but it's not quite there yet. If you have
been using systems with a 2-byte wchar_t and you want to come to GCC and Xcode, you might consider using UniChar as a substitute for the actual type wchar_t. To read these from and to disk, you might want to use the iconv library (say man 3 iconv), which contains functionality for reading the UCS-2 form that a 2-byte wchar_t really means; it'll do the full decoding of that form, and it also knows how to decode virtually every other character set that you might ever want, so you should look at that. Or, if you've decided to use CFStrings instead, the right way to do that is CFStringCreateFromExternalRepresentation, which has basically all the same functionality: it lets you say, I have this sequence of bytes and it's in UCS-2, please turn it into a CFString, and it'll just do it.
Availability: I said we got it into Panther, and the library support is available in Panther and later; that means it's not available in Jaguar or earlier. A consequence of this is that, first of all, if you want to use it in your programs, you probably really want to be targeting just Panther or later. And even if you don't care about wide characters or other languages, you should still know that the C++ standard library requires this support. So if you take a C++ program, you build it on Panther, it uses the C++ standard library, and you then go try to run it on Jaguar, and you've managed to invoke some of the parts of the C++ library that expect this support, it won't work. So what you should do is: if you wish to build an application for Jaguar or earlier, and you wish to do it on a current operating system, Panther or later, you should use the SDK functionality. This only applies to C++ and Objective-C++; it doesn't apply to C. Okay, so I should now hand back over to Matthew Formica.

Okay, thanks, Geoff. So what you've
seen today is a bunch of different things that we've been working on in the low levels of the system, in the dynamic and static linkers and in the compiler. Launch time improvements are here today, and they're better than prebinding, and they're getting even better in Tiger. dyld is all new in Tiger, and it adds a whole new level of standards conformance that I'm sure will help many of you with your applications on Mac OS X. You want to make sure you know your tool set: if you have bumped into the concept of dead code stripping before now, we now have it in Xcode; you'll get different mileage on different applications, depending on whether your app actually has code that can be stripped. And then, if you are moving an application to Mac OS X, and it's been relying on wchar_t on another platform or another compiler, you will want to consider the alternatives on Mac OS X, including UniChar and CFString. There's a variety of documentation we have available; our tools documentation includes information on dead code stripping and the support included in that. I am the tools contact in Developer Relations; you can feel free to drop me an email on tools issues that you have, or you can send feedback to xcode-feedback@group.apple.com.