WWDC2004 Session 107
Transcript
Kind: captions
Language: en
thank you very much and now if you would
please welcome to the stage Dave
Zarzycki good afternoon today I'm going
to talk to you about porting UNIX
applications to Mac OS X one might say
this is about porting learning about the
unique differences our platform provides
if you're coming from maybe a Linux or
Solaris box or your favorite UNIX
platform of the day now I assume you
actually since you're coming from
another platform have experience with
unix so when i use certain terminology
it won't be too foreign to you so what
are we going to cover we're going to
cover the history of UNIX a little bit
just to show where Apple comes from
we're going to talk about bundles a
unique Apple invention for packaging
stuff and packaging itself for software
distribution we're also going to talk
about the interesting daemons on our
platform that you may notice when doing
a ps or wonder about when you're just
poking around we're going to talk
about standards conformance since that's
a very important thing to so many of you
since you are using standard APIs most
of the time we're going to be talking
about the linker the runtime and other
what we call frameworks it's the whole
stack of things you link against file
system portability the file systems that
we have on our platform vary widely and
they support certain options and others
don't support certain options we're also
going to talk about authorization and
authentication obviously in a multi-user
environment you need to be able to allow
certain users to do certain things and
not allow certain users to do certain
other things finally there's the
development environment we provide Xcode
and lastly we're going to talk about
Mach topics we're not going to cover
we're not going to talk about driver
writing or I/O Kit we're not going
to talk about the GUI in any way nor
multimedia printing font handling
basically anything you can see we're not
going to talk about it it's just about
the
bottom layers of the system now Apple's
Mac OS X we've got a long legacy of
different OSes in the platform we have
some of our roots from BSD the Berkeley
Software Distribution of UNIX from way
back in the day we have Mach out of
Carnegie Mellon which was the
microkernel work and made large
advances in virtual memory and inter
process communication finally NeXT
brought these two together and tried to
make a product out of it called
NeXTSTEP and they succeeded fairly well in
accomplishing the goals they set out to
do so much so that Apple bought it and
produced Mac OS X what you're using
today and Mac OS X brings in its own
unique flavor of things to add to the
mix with our Mac OS 9 legacy with our
Mac apps and our legendary ease of use
now let's start getting into some of the
details that make us unique first and
foremost we have a notion called bundles
you're going to see them everywhere on
the system a bundle is just a user
visible atomic blob when you see an
application you can click it you can drag
it but behind the scenes it's
actually a folder the folder contains or
directory whatever you want to call it
it contains executables resources
pictures sounds you name it anything
needed to support the application in
that bundle that one directory now in
the UNIX world you might stuff some of
these things in /usr/share you might
stick some of the libraries in /usr/lib
they're all scattered throughout the
system we like bundles because it brings
them all together in one directory that
users can move around and add to their
system and delete at their discretion
finally we have two variants of the
bundles given our history we have the
traditional bundle which looks like
let's say an application bundle it'd be
the name of your application
Foo.app/Foo that's how we'd identify it as
a bundle now in the Mac OS X time
frame we invented the notion of a
CFBundle which is the name of your
bundle then /Contents/
and then Resources and
other blobs in there now again I
want to drill down that we're talking
about the basic part of the
system we're talking about that layer
right in there that you see around
libSystem and the top part of the
kernel meaning system calls things like
libc and some of the other various
libraries like libm and whatnot you're
already familiar with now the things
that make us unique on our platform is
not only do we have libraries we have
these notions of frameworks frameworks
are really just libraries but they live
in a bundle and they live in a different
location than /usr/lib we'll
talk more later about where they live on
the system now if you want to get third
party software there are various options
available to you we have the Fink
project I don't know if you've heard of
it but if you're porting let's say an
existing open source project let's say
you grab screen and you're trying to run
it on Mac OS X well someone may have already
done that work for you if you check with
Fink or DarwinPorts you'll probably
find that the project you're trying to
port is already ported to the
platform saving you lots of work also as
far as packaging is concerned our own
product we have Mac OS X our client
distribution and we have Mac OS X
Server for our server customers now the
server product is mostly value-add based
around our file server technology and
additional components to make the server
experience more pleasurable based not only
on our GUI control applications
but things like additions to the web
server environment like blogging
software or whatnot finally we also have
Darwin which is Apple's proof that we
are committed to open source by
releasing the foundation of the
operating system in a fully bootable
fashion that you can run on your x86
box or your Apple box should you desire
now you're coming from a different UNIX
you do ps and you see some interesting
daemons floating around I want to talk
about at least a few of them so you can
understand some of them that we have on
our platform and why we have them and
why they're important let's start with
notifyd notifyd is a way that daemons
can send messages to each other and not
necessarily agree on the technology used
to send the message for example
our networking team has put
together a daemon that actually controls
how interfaces come up or get set
up well some daemons want to find out
when the networking changes but they
really don't need to know the finer
details of how or why so what happens is
we send a Mach message over to notifyd
saying the network's changed and notifyd
then sends let's say a signal over to one
daemon or it might send a message over a
file descriptor to another a lot of the
daemons already on the system actually
use the signal method to get restarted
when network change events occur they
request that a HUP signal gets sent to
them and ta-da anytime the network
changes they get HUP'd now that network
change that I was talking about the guy
who sends a notification that's configd
configd is responsible for maintaining
a lot of state on the system for
configuring and controlling various
pieces of hardware but mostly centered
around networking it's where our DHCP
client lives it's where some of our
other networking policy decision-making
happens it's very critical to the system
and I would recommend that if you
want to learn more about it you look
at our SystemConfiguration framework
which has the external APIs for
talking to configd also if you look you
might see something called
mDNSResponder it's also known as Rendezvous
this is what implements it some things
talk to it to either browse
the network or to send out
advertisements and that's its sole
function in life finally lookupd
lookupd is equivalent to nscd if you're
coming from a Linux or Solaris box it's
our name service caching daemon it
caches DNS lookups it caches user
name lookups group lookups anything
you can pretty much think of
being looked up it caches them it also
implements getaddrinfo for
example and does the advanced queries
for that now let's talk about standards
standards our policy you know is that we
like standards we want to comply with them
but we're not making a serious effort to
go out and test against all the standards
but if you find a bug
where we do not conform to a standard
please please let us know we will try
our best to fix it because we do believe
that standards are best for all involved
us and you as a developer now for
example we do know about some bugs we
haven't gotten around to fixing and one
that might affect you when you're
porting an application is that we don't set
errno in the math library it's unfortunate but
true and it's something you'll have to
work around now let's talk about some
other interesting details we have we
have a different linker than you might
find on a Solaris or Linux box we have
Mach-O as our file format and dyld
is our dynamic linker a Linux box might
use ELF as a file format and ld.so as
its dynamic linker what does this mean
well some behavioral differences for
example versioning on our platform
what might be considered the major
number in the Linux Solaris world is part of the
path on our system if you do an otool
-L on an executable you'll see
the full path to a library on the system
if you want to do a binary incompatible
change change the name this is a big
difference because that means that if
you redirect through a symlink you get a
performance hit unlike the Linux world
which has an ld cache which caches
symlink translations also we have
bundles versus libraries in the Linux
and solaris world there is no difference
between a plug-in and a library it's
just whether you link to it at link time
or you load it at runtime it can be the
same file on our platform that's not
true you have to specifically compile a
plug-in as a plug-in
to load it in that manner now what does
that mean you need to use the -bundle
flag also if you want to make
your life easier as a developer of a
plug-in or a bundle depending on what you
want to call it you can use the
-bundle_loader flag what that does is
specify the actual executable which is
meant to load your plug-in so that way
you can resolve symbols and make sure
that your plug-in is fully resolved at
link time rather than just letting
it slide that some symbols are undefined and
hoping that it works out at runtime
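As a rough illustration of the plug-in flags just described, here is a minimal sketch of a plug-in source file in C. The file names (plugin.c, host, example.bundle) and the two exported functions are hypothetical; only the -bundle and -bundle_loader flags come from the talk.

```c
/*
 * plugin.c: a hypothetical plug-in with two exported entry points.
 *
 * Build sketch on Mac OS X (file names are assumptions):
 *
 *   cc -bundle -o example.bundle plugin.c
 *
 * If the plug-in also calls functions defined in the program that loads
 * it, naming that program lets the static linker resolve them up front:
 *
 *   cc -bundle -bundle_loader ./host -o example.bundle plugin.c
 */
#include <stdio.h>

static int initialized = 0;

/* Name the host can show to the user. */
const char *plugin_name(void)
{
    return "example";
}

/* Called once by the host after loading; returns 0 on success. */
int plugin_init(void)
{
    if (initialized) {
        return -1;              /* refuse double initialization */
    }
    initialized = 1;
    fprintf(stderr, "%s: initialized\n", plugin_name());
    return 0;
}
```

With -bundle_loader, a symbol the plug-in expects from the host but that the host does not export should show up as a link error instead of a surprise at load time.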
also a unique difference we have is the
two-level namespace we invented this
technology to help us deal with binary
compatibility going forward what it
means is that a library let's call it
library foo might use malloc from our
libSystem now let's say you include a
malloc in your product in your
application well only your code is
actually going to end up using it
because there's a direct reference
from that library to the malloc
symbol in libSystem it isn't a global namespace
if you need the semantics that
if you define a malloc it will be
overridden everywhere in your address
space you can say
-force_flat_namespace and this will
tell the dynamic linker to collapse
things down and make sure that symbols
are the same everywhere also a very
interesting thing about our platform is
that we're far more dynamic when it
comes to dynamic symbol resolution at
runtime what this means let's describe a
classic bug is that sendmail for
example had a signal handler the signal
handler wasn't resolved fully in the
sense that functions it called hadn't
been resolved yet now sendmail's running
along calls a function which triggers
the dynamic linker to start resolving it
and then the signal fires well we're
already in the dynamic linker trying to
resolve a symbol and what happens is now
the signal handler is running and
it reaches a symbol that needs to
be resolved and now we've got two
instances where we're trying to be in
the dynamic linker
at the same time on the same thread and
you deadlock this is a neat difference
that we're far more dynamic but it can cause
problems when you're porting code what I
recommend is you just do
-bind_at_load and
that'll tell the dynamic linker to make
sure that all your symbols are resolved
before you hit main now when it comes to
forward and backwards compatibility on
our platform we have something that will
make your life easier we have something
called Mach-O weak symbols what it
means is that you can say if symbol call
symbol so there might be a function you
want to use but it's not available on a
previous release that's okay just test
for it and if it's there you can use it
and if it's not you can't it's a very
powerful tool for dealing with forwards
and backwards compatibility finally on
our platform static linking of standard
libraries is impossible we don't ship
them sorry it's just something that we
find is best for our needs in supporting
you and yeah it's best for all involved
we believe now I'm not done talking about
the linker yet there's some other
interesting behaviors that are worth
talking about symbols in an object file
must be overridden together in the elf
linux solaris world you can for example
override malloc and not override free now
that's okay if you never call free but
if you actually do call free you're
going to get a different implementation
their backing stores might not even be
the same one can imagine that horrible
problems will start to ensue when you do
that well on our platform the way we
solve that is we put malloc and free and
realloc and the other associated
functions all in the same .c file
which gets compiled down to the same .o
file obviously and the rule that our
dynamic linker enforces is that if you
override one symbol in a .o file you
can't use any of the others there you
have to override them too now that might
not affect you you might not be
overriding symbols all that much but
it can potentially affect you if you put
unrelated pieces of code in
the same .o file it's something to watch out
for also on our platform common symbols
are problematic I just recommend that
you do -fno-common and try to make
sure that you're just not using common
symbols they're just a bad idea in
general anyway a couple more things to
talk about bundle unloading is currently
unimplemented sorry you know you
can't unload a plug-in basically it'll
still reside in your address space
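Going back to the weak-symbol test described a little earlier ("if symbol call symbol"), a minimal sketch in C might look like the following. The function names are hypothetical, and the GCC-style weak attribute is an assumption about the toolchain; on Mac OS X the declaration would normally come from the library's own header, and the linker may need to find the symbol in some library or be told to allow it undefined.

```c
#include <stddef.h>

/*
 * Hypothetical function that exists only in newer releases of some
 * library. The weak attribute lets the reference stay unresolved:
 * when nothing provides the symbol, its address reads as NULL.
 */
extern int fancy_new_call(int value) __attribute__((weak));

/* Fallback for releases that lack fancy_new_call. */
static int plain_old_call(int value)
{
    return value + 1;
}

/* "if symbol, call symbol": use the newer function only when present. */
int call_best_available(int value)
{
    if (fancy_new_call != NULL) {
        return fancy_new_call(value);
    }
    return plain_old_call(value);
}
```

The same binary can then run on a release that has the function and on one that does not, which is the forwards and backwards compatibility point made above.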
another interesting issue to be aware of
is that if you're using C++ in a bundle
static initializers are not being
called at the moment again it's an issue
to worry about when porting code now if
we can move outside of an actual address
space and just talk about general
runtime issues we store configuration
and resource files not necessarily in
the same place on Mac OS X we store
them in some different locations we have
Open Directory which is our great
multiplexing demultiplexing engine for
getting preferences from different
places and we'd recommend that you use
those APIs if you want to be a good
citizen on our platform to find out
about your various configuration
parameters or general system
configuration parameters also instead of
having a lot of dot files in your home
directory using different parsers and
just altogether different code we have
code for you that can make your life
easier it's called the CF or NS
preferences APIs and they put them in
~/Library/Preferences and
it'll save you a lot of work of dealing
with how to save preferences and restore
them on our platform now when it comes
to system startup we have some
differences compared to a Linux or
FreeBSD or Solaris system on those kinds
of platforms everything essentially
calls out from /etc/rc and you'll just
have shell scripts calling shell scripts
calling shell scripts and what will
happen is that when it comes to what
order to run things in the UNIX world is
pretty static still all they do really
to accomplish proper initialization is
just make sure the shell startup scripts
are named
in serial order so you might have 1foo
2bar 3baz yeah that's you
know not very dynamic and we've
done some things to improve that we
created startup items again they're
bundles we've talked about these before
they're stored in /System/Library/StartupItems
items each one of those bundles contains
a little blob of data specifying
dependencies this means we can
dynamically start up the system and boot
things when they're ready in fact we
even boot some startup items up in
parallel since they don't have
dependencies on each other now the boot
up is changing unfortunately the session
that talked about that was yesterday
that was session 106 it's being replaced by
something called launchd one can think of
it as inetd on steroids it's going to
be able to support any daemon on the
system and make sure that it gets
launched on demand when any
configuration detail of it changes some
more run time issues in fact I just
talked about prebinding with one of the
Xcode people right before this session
so a lot of this slide doesn't really
matter anymore but for the
purpose of this session I'll still talk
about prebinding it was a method we
have to increase the runtime performance
of your application what we did is
since we know of all the libraries on
the system we can precalculate where
they will load in your address space
once we do that we know where all the
symbols are and we can then record in
your executable the linking information
let's say again you use malloc well we know
that's going to be at you know nine
kabillion something and we can record
that in your executable that way when we
load we check that the state of the
world hasn't changed and if it hasn't we
just let your application go because
we've already prelinked you
unfortunately what this means is we
modify executables and libraries all
the time whenever a library changes we go
modify everything and re-prebind
everything that's a problem for some
people it means that backup and
security tools get false positives for
changes things like Tripwire that do
intrusion detection don't work because
again there are false positives for
changes how's this all changing well
we've written a much much much faster
dynamic linker in Tiger so much so that
we believe that you as developers no
longer need to worry about prebinding
I'm hoping that will make a lot of you
happy now this is more of an issue if
you're thinking of a 10.0 or 10.1 box
but our /bin/sh changed it used to be
zsh zsh was almost POSIX
compliant but not quite we switched to bash
we think bash at the time happened to be
faster too which was a nice perk but
that's not true anymore but bash is
POSIX compliant and we like that but if
you were using some zsh-isms you might
need to accommodate the change if you're
obviously writing a portable
shell script hopefully none of this
should matter to you now frameworks
frameworks is something I touched on
earlier it's a bundle based alternative
to the UNIX hierarchical libraries and
resources again it just contains the
library and headers but it could
contain other shared resources too let's
say your library is a GUI library and
it has pictures and sounds they can be in
your framework now when it comes to
actual compilation though there is a
difference you don't say -lfoo to
link against the framework you say
-framework foo that tells the compiler to
look in a different location for it on
the topic of file system portability
POSIX file systems are typically case
sensitive and they support sparse files
although that's not necessarily
universal our native
default file system for
Mac OS X is HFS+ it's case insensitive
case preserving that means that you can
have big Foo or little foo but you
can't have both at the same time we also
support resource forks and the API for
accessing them used to be
out of the way and UNIX applications
wouldn't know about it which created
issues for backup tools or anybody
copying files using UNIX tools this has
changed in the Tiger time frame
because resource forks are now going to be
exposed as extended attributes and any UNIX
tool that is aware of extended
attributes can deal with resource forks
correctly but it is something to be
aware of considering that extended
attributes are now coming into vogue in
the entire industry finally hard links
are not supported by all file systems
it's something that your application
might need or use but you need to be
aware that while supported by
HFS+ some of the other file systems we
have on the system don't necessarily
support them like the WebDAV filesystem
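One way a ported application might cope with a file system that refuses hard links is to fall back to a plain copy. The sketch below is only an illustration under that assumption: the function names are hypothetical, and which errno values a given file system reports for an unsupported link is not something the talk specifies.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Copy src to dst byte for byte; returns 0 on success, -1 on failure. */
static int copy_file(const char *src, const char *dst)
{
    FILE *in = fopen(src, "rb");
    if (in == NULL) {
        return -1;
    }
    FILE *out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    char buf[4096];
    size_t n;
    int status = 0;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            status = -1;
            break;
        }
    }
    if (ferror(in)) {
        status = -1;
    }
    fclose(in);
    if (fclose(out) != 0) {
        status = -1;
    }
    return status;
}

/*
 * Returns 0 if dst is now a hard link to src, 1 if the link was
 * refused and dst is a copy instead, -1 on failure.
 */
int link_or_copy(const char *src, const char *dst)
{
    if (link(src, dst) == 0) {
        return 0;
    }
    if (errno == EEXIST || errno == ENOENT) {
        return -1;              /* not a file system limitation */
    }
    return copy_file(src, dst) == 0 ? 1 : -1;
}
```

A copy does not share later changes with the original the way a hard link does, so this only fits code that uses links as a cheap way to duplicate a file.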
now something that is again unique to
our platform when it comes to the
filesystem hierarchy is the way we do
scoping we have four primary scopes on
the system we have the system scope
essentially software shipped by us
that you shouldn't ever need to touch or
manage we have the network scope it's
where a network administrator might put
anything they want to show up on all
machines we have the local scope
anything on the local machine and
finally the user scope something in the
user's home directory how does this
affect anything well remember how we
talked about frameworks they can live in
all four of these scopes you have
/System/Library/Frameworks we have
/Library/Frameworks which is the local case we
have /Network/Library/Frameworks for the
network case and you can as far as I
know put frameworks even in the user's
home directory should you desire again
in ~/Library/Frameworks when
you're developing let's say a framework
or really any application that wants to
put files on our file system we would
recommend that you try and conform to
our file system hierarchy for where to
place files on the topic of
authentication and authorization we have
the security framework it's our
preferred API for doing things it has a
lot of support for advanced technologies
like smart cards
and other interesting authentication
mechanisms like fingerprint readers you
name it they're trying to support a lot
of the advanced things that many
organizations want it's also a
capability and rights based system I
won't go into the details of that but it
is important to keep in mind if you need
to do authentication and authorization
and finally while we do have the
Security framework we do have PAM
available if you need a
compatibility solution either you have a
PAM plug-in or you're using the PAM APIs we make
sure that the authentication and
authorization can route correctly if you
need any kind of PAM compatibility our
development environment is a lot of
familiar things and some different stuff
at the same time it's mostly a GNU
toolchain it's GCC and gdb and make you
know the things you're familiar with but
we do have some differences that are
good to be aware of for example we
have the C preprocessor we've modified
it to support precompiled headers you
might see these in /usr/include when you
maybe say an ls *.p you'll see some
precompiled headers there it helps us
when compiling large applications and
including lots and lots of headers so we
don't have to redo all the C
preprocessor passes unfortunately our C
preprocessor doesn't support all of the
new extensions to the ANSI C preprocessor
which some of you might have consciously
or subconsciously started using how
do you work around this well it's
-no-cpp-precomp that'll get
you the plain GNU preprocessor back now Xcode
obviously there's been many many
sessions about it this week we highly
recommend you use it in the case of
debugging it has a very powerful visual
debugger so we would highly recommend
you use it and you don't necessarily need
to go to a lot of effort to port your
application's build system into Xcode
Xcode supports legacy targets you
can say look there's a makefile
over there just go build against it and
you can keep your legacy build system
which is probably desirable to you if
you want to maintain portability but
you can take full advantage of
everything that Xcode has to offer with
code searching and debugging and lots of
yummy things like that finally some of
you have some very very very big apps
out there so much so that you run into a
nice interesting architectural anomaly
on our platform it just
requires some different code generation
if you need more than 16 megabytes of
text you need to use -mlongcall
call on the compilation line to make
sure that your code gets generated
correctly otherwise it will fail to link
again it's just something to be aware of
if you have a large executable if we
talk about APIs now again we're not
completely like other platforms we have
some things you need to be aware of if
you're using poll it happens to be
emulated on our platform and we highly
recommend you use kqueue instead if you
can't though at least it's still there
but at the moment it's emulated
via select so if you're familiar with
those two api's you might be able to
imagine the scaling problems we have
with that current emulation technique
but if you don't need a large number of
file descriptors it's okay to use our
poll emulation at the moment also dlopen
for loading plugins again it's emulated
how does this affect you well again as
we talked about unloading plugins is not
supported and if you're taking advantage
of the RTLD_NEXT functionality of
dlopen we don't support that at the
moment that's something to be aware of
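For the poll case mentioned a moment ago, code that only watches a handful of descriptors can stay portable. The following is a minimal sketch in C with hypothetical helper names; it uses only the standard poll call, not kqueue, and is meant for the small-descriptor-count situation the talk says is fine.

```c
#include <poll.h>
#include <unistd.h>

/* Returns 1 if fd is readable within timeout_ms, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0) {
        return -1;
    }
    if (ready == 0) {
        return 0;
    }
    return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : -1;
}

/* Small self-check: a pipe becomes readable only after a write. */
int pipe_demo(void)
{
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }
    int before = wait_readable(fds[0], 0);
    int wrote = (write(fds[1], "x", 1) == 1);
    int after = wait_readable(fds[0], 1000);
    close(fds[0]);
    close(fds[1]);
    return (before == 0 && wrote && after == 1) ? 0 : -1;
}
```

Keeping the wait behind one small function like this also leaves a single place to swap in kqueue later if the descriptor count grows.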
other issues we have that you should be
aware of is that pthreads is partially
implemented it gets better and better
with each release but it you need to be
careful when you look for functionality
on our platform the biggest thing to be
aware of with pthreads is we don't
necessarily support inter-process
sharing of locks for example something
to be aware of also POSIX message queues
are missing if you really really need
that kind of functionality our current
story is that we recommend you use
maybe Mach ports but
we hope that you can find a
different API also when it comes to
System V shared memory we support it but
it's very weak at the moment if you
look in our boot up in /etc/rc
you'll see that we statically set the
variables for that and once those
System V shared memory variables are
set they can't change for the life of the
boot so if you need those changed you
need to be aware of that finally well
not finally uh while we do have OpenSSL
on the platform we would recommend you
use CDSA OpenSSL has some
architectural limitations when it deals
with for example smart cards OpenSSL
will hand the key around directly
whereas CDSA supports having the key
be represented as a handle which could
be connected up to a smart card
somewhere else where you don't actually
even have physical access to the key a
unique-ish language on our platform a
language opportunity is Objective-C some
of our frameworks are implemented in it
and you might need to use it I
actually recommend it it's kind of fun
to use it looks like Smalltalk and
it'll probably take you less than a
day to learn so don't be scared of
it and you might enjoy it finally
libtool versus GNU libtool it's
unfortunate but these things happen we
have a name conflict if you run libtool
on our platform you'll get a tool that
we wrote many many many years ago at NeXT
called libtool for dealing with library
generation it has nothing to do with
GNU libtool they're not even really
functionally overlapping but it's just a
conflict of names and that means on our
platform if you need to
use GNU libtool you call it glibtool
now the last thing we're going to talk
about today is Mach our story is
we don't want you using Mach if
you don't need to Mach has APIs for
example for allocating memory in your
address space vm_allocate vm_deallocate
we'd rather you not use those
APIs we have perfectly fine POSIX APIs
for doing that it's mmap and
munmap for example but sometimes you do need to
use it if you need advanced control of your
process priority that's currently the
best mechanism for doing that also if
your process ends up using libraries
that use Mach you need to be aware of
the bootstrap namespaces it's a Mach-ism
I really don't want to go into it at
the moment what it means is that
processes in your
login session have a different context
Mach-wise than let's say a system daemon
so if you need to talk to another daemon
this might cause problems for you
finally when it comes to traditional
UNIX APIs that are non-standardized
ptrace is only partially implemented on our
platform we only implement it enough to
do an attach after that gdb on our
platform for example uses the Mach APIs
for getting and setting and controlling
the process that it's inspecting I doubt
very many of you are using ptrace so
that probably isn't something you need
to worry about now who do you want to
talk to after the session's over or if
you need anything after you leave
WWDC well you want to contact Jason Yeo
he's our track manager and he hopefully
should be able to direct your inquiries
to the right people around Apple if you
have any questions also we do have some
documentation offline if you need it
there's our Porting UNIX/Linux
Applications to Mac OS X
this tells you where it is we also have
the Porting Drivers to Mac OS X if
you're coming from a different platform
and again this is the locations for
those pieces of data