WWDC2004 Session 107

Transcript

Thank you very much. And now, if you would, please welcome to the stage Dave Zarzycki.

Good afternoon. Today I'm going to talk to you about porting UNIX applications for Mac OS X. One might say this is about learning the unique differences our platform provides if you're coming from, maybe, a Linux or Solaris box, or your favorite UNIX platform of the day. Now, I assume that, since you're coming from another platform, you have experience with UNIX, so when I use certain terminology it won't be too foreign to you.

So what are we going to cover? We're going to cover the history of UNIX a little bit, just to show where Apple comes from. We're going to talk about bundles, a unique Apple invention for packaging stuff, and packaging itself for software distribution. We're also going to talk about the interesting daemons on our platform that you may notice when doing a ps, or wonder about when you're just poking around. We're going to talk about standards conformance, since that's a very important thing to so many of you, since you are using standard APIs most of the time. We're going to be talking about the linker, the runtime, and what we call frameworks; it's the whole stack of things you link against. File system portability: the file systems that we have on our platform vary widely, and some support certain options and others don't. We're also going to talk about authorization and authentication. Obviously, in a multi-user environment you need to be able to allow certain users to do certain things and not allow certain users to do certain other things. Then there's the development environment we provide, Xcode, and lastly we're going to talk about Mach.

Topics we're not going to cover: we're not going to talk about driver writing or I/O Kit. We're not going to talk about the GUI in any way, nor multimedia, printing, or font handling. Basically, anything you can see, we're not going to talk about. It's just
about the bottom layers of the system.

Now, Apple's Mac OS X. We've got a long legacy of different OSes in the platform. We have some of our roots in BSD, the Berkeley Software Distribution of UNIX, from way back in the day. We have Mach, out of Carnegie Mellon, which was the microkernel work and made large advances in virtual memory and inter-process communication. Then NeXT brought these two together and tried to make a product out of it called NeXTSTEP, and they succeeded fairly well in accomplishing the goals they set out to do, so much so that Apple bought it and produced Mac OS X, what you're using today. And Mac OS X brings its own unique flavor of things to add to the mix, with our Mac OS 9 legacy, with our Mac apps, and our legendary ease of use.

Now let's start getting into some of the details that make us unique. First and foremost, we have a notion called bundles. You're going to see them everywhere on the system. A bundle is just a user-visible atomic blob. When you see an application, you can click it, you can drag it, but behind the scenes it's actually a folder. The folder, or directory, whatever you want to call it, contains executables, resources, pictures, sounds, you name it: anything needed to support the application, in that one bundle, that one directory. Now, in the UNIX world you might stuff some of these things in /usr/share, you might stick some of the libraries in /usr/lib; they're all scattered throughout the system. We like bundles because they bring it all together in one directory that users can move around, add to their system, and delete at their discretion.

Finally, we have two variants of bundles, given our history. We have the traditional bundle, which looks like, let's say for an application bundle, the name of your application: Foo.app/Foo. That's how we identified it as a bundle. Now, in the Mac OS X time frame we invented the notion of a CFBundle, where it's the name of your bundle, then /Contents/MacOS/, and then Resources and other
blobs in there.

Now again, I want to drill down: we're talking about the basic part of the system. We're talking about that layer right in there that you see around libSystem and the top part of the kernel, meaning system calls, things like libc, and some of the other various libraries, like libm and whatnot, that you're already familiar with.

Now, one of the things that makes us unique on our platform is that not only do we have libraries, we have this notion of frameworks. Frameworks are really just libraries, but they live in a bundle, and they live in a different location than /usr/lib. We'll talk more later about where they live on the system.

Now, if you want to get third-party software, there are various options available to you. We have the Fink project. I don't know if you've heard of it, but if you're porting an existing open source project, let's say you grab screen and you're trying to run it on X, well, someone may have already done that work for you. If you check with Fink or DarwinPorts, you'll probably find that the project you're trying to port is already ported to the platform, saving you lots of work.

Also, as far as packaging is concerned, for our own product, we have Mac OS X, our client distribution, and we have Mac OS X Server for our server customers. Now, the server product is mostly value-add based around our file server technology, plus additional components to make the server experience more pleasurable: not only our GUI applications for controlling it, but things like additions to the web server environment, like blogging software and whatnot. Finally, we also have Darwin, which is Apple's proof that we are committed to open source, by releasing the foundation of the operating system in fully bootable fashion that you can run on your Intel box or your Apple box, should you desire.

Now, you're coming from a different UNIX, you do a ps, and you see some interesting daemons floating around. I want to talk about at least a few of them so you can understand
some of them that we have on our platform, why we have them, and why they're important.

Let's start with notifyd. notifyd is a way that daemons can send messages to each other without necessarily agreeing on the technology used to send the message. For example, our networking team has put together a daemon that actually controls how interfaces come up and are managed and set up. Well, some daemons want to find out when the networking changes, but they really don't need to know the finer details of how or why. So what happens is we send a Mach message over to notifyd saying the network's changed, and notifyd then sends, let's say, a signal over to one daemon, or it might send a message over a file descriptor to another. A lot of the daemons already on the system actually use the signal method to get restarted when network change events occur. They request that a HUP signal gets sent to them and, ta-da, anytime the network changes they get HUPed.

Now, that network change I was talking about: the guy who sends the notification, that's configd. configd is responsible for maintaining a lot of state on the system, for configuring and controlling various pieces of hardware, but mostly centered around networking. It's where our DHCP client lives. It's where some of our other networking policy decision-making happens. It's very critical to the system, and I would recommend that if you want to learn more about it, you look at our SystemConfiguration framework, which is the external API for talking to configd.

Also, if you look, you might see something called mDNSResponder. It's also known as Rendezvous; this is what implements it. Some things talk to it either to browse the network or to send out advertisements, and that's its sole function in life.

Finally, lookupd. lookupd is equivalent to nscd, if you're coming from a Linux or Solaris box. It's our name service caching daemon. It caches DNS lookups, it caches user name lookups, group lookups, anything
you can pretty much think of being looked up, it caches. It also implements getaddrinfo, for example, and does the advanced queries for that.

Now let's talk about standards. Our policy is that we like standards. We want to comply with them, but we're not making a serious effort to go out and test against all the standards. But, but, but, but: if you find a bug where we do not conform to a standard, please, please let us know. We will try our best to fix it, because we do believe that standards are best for all involved, us and you as a developer. Now, for example, we do know about some bugs we haven't gotten around to fixing, and one that might affect you when you're porting an application is that we don't set errno in the math library. It's unfortunate but true, and it's something you'll have to work around.

Now let's talk about some other interesting details. We have a different linker than you might find on a Solaris or Linux box. We have Mach-O as our file format, and dyld is our dynamic linker. A Linux box might use ELF as its file format and ld.so as its dynamic linker. What does this mean? Well, some behavioral differences. For example, versioning: on our platform, what might be considered the major number in the Linux or Solaris world is the path on our system. If you do an otool -L against an executable, you'll see the full path to each library on the system. If you want to make a binary-incompatible change, change the name. This is a big difference, because it means that if you redirect through a symlink you're going to get a performance hit, unlike the Linux world, which has an ld cache that caches symlink translations.

Also, we have bundles versus libraries. In the Linux and Solaris world there is no difference between a plug-in and a library; it's just whether you link to it at link time or you load it at runtime. It can be the same file. On our platform that's not true: you have to specifically compile a plug-in as a plug-in to load it in that manner. Now, what does that mean? You need
to use the -bundle flag. Also, if you want to make your life easier as the developer of a plug-in, or a bundle, depending on what you want to call it, you can use the -bundle_loader flag. What that does is specify the actual executable that is meant to load your plug-in, so that you can resolve symbols and make sure your plug-in is fully resolved at link time, rather than just letting it slide that some symbols are undefined and hoping it works out at runtime.

Another unique difference we have is the two-level namespace. We invented this technology to help us deal with binary compatibility going forward. What it means is that a library, let's call it library Foo, might use malloc from our libSystem. Now let's say you include a malloc in your product, in your application. Well, only your code is actually going to end up using it, because there's a direct reference from that library to the malloc symbol. So it isn't a global namespace. If you need the semantics where, if you define a malloc, it is overridden everywhere in your address space, you can say -force_flat_namespace, and this will tell the dynamic linker to collapse things down and make sure that symbols are the same everywhere.

Also, a very interesting thing about our platform is that we're far more dynamic when it comes to symbol resolution at runtime. What does this mean? Let's describe a classic bug. sendmail, for example, had a signal handler. The signal handler wasn't fully resolved, in the sense that the functions it called hadn't been resolved yet. Now sendmail's running along and calls a function, which triggers the dynamic linker to start resolving it, and then the signal fires. Well, we're already in the dynamic linker trying to resolve a symbol, and now the signal handler is running, and it reaches a symbol that needs to be resolved, and now we've got two instances where we're trying to be in the dynamic linker at the same time on the same thread,
and you deadlock. This is a neat difference, that we're far more dynamic, but it can cause problems when porting code. What I recommend is that you just use -bind_at_load, and that'll tell the dynamic linker to make sure all your symbols are resolved before you hit main.

Now, when it comes to forward and backward compatibility on our platform, we have something that will make your life easier. We have something called Mach-O weak symbols. What it means is that you can say: if symbol, call symbol. So there might be a function you want to use, but it's not available on a previous release. That's okay: just test for it, and if it's there you can use it, and if it's not, you can't. It's a very powerful tool for dealing with forward and backward compatibility.

Finally, on our platform static linking of the standard libraries is impossible. We don't ship them. Sorry, it's just something that we find is best for our needs in supporting you, and it's best for all involved, we believe.

Now, we're not done talking about the linker yet. There are some other interesting behaviors worth talking about. Symbols in an object file must be overridden together. In the ELF, Linux, Solaris world you can, for example, override malloc and not override free. Now, that's okay if you never call free, but if you actually do call free, you're going to get a different implementation. Their backing stores might not even be the same. One can imagine the horrible problems that will ensue when you do that. Well, on our platform, the way we solve that is we put malloc and free and realloc and the other associated functions all in the same .c file, which obviously gets compiled down to the same .o file, and the rule our dynamic linker enforces is that if you override one symbol in a .o file, you can't use any of the others there; you have to override them too. Now, that might not affect you; you might not be overriding symbols all that much. But it can potentially affect you if you put unrelated
pieces of code in the same .o file. It's something to watch out for.

Also, on our platform common symbols are problematic. I just recommend that you use -fno-common and try to make sure you're just not using common symbols. They're a bad idea in general anyway.

A couple more things to talk about. Bundle unloading is currently unimplemented. Sorry. You can't unload a plug-in, basically; it'll still reside in your address space. Another interesting issue to be aware of is that if you're using C++ in a bundle, static initializers are not being called at the moment. Again, it's an issue to watch for when porting code.

Now, if we can move outside of an actual address space and just talk about general runtime issues: we don't necessarily store configuration and resource files in the same place on Mac OS X; we store them in some different locations. We have Open Directory, which is our great multiplexing and demultiplexing engine for getting preferences from different places, and we'd recommend that you use those APIs, if you want to be a good citizen on our platform, to find out about your various configuration parameters or general system configuration parameters. Also, instead of having a lot of dot files in your home directory, each using a different parser and altogether different code, we have code for you that can make your life easier. It's called the CF or NS preferences APIs, and they put the files in ~/Library/Preferences. It'll save you a lot of the work of dealing with how to save preferences and restore them on our platform.

Now, when it comes to system startup, we have some differences compared to a Linux or FreeBSD or Solaris system. On those kinds of platforms everything essentially calls out from /etc/rc, and you just have shell scripts calling shell scripts calling shell scripts. And when it comes to what order to run things in, the UNIX world is pretty static still. All they really do to accomplish proper initialization is make
sure the shell startup scripts are named in serial order, so you might have 1foo, 2bar, 3baz. That's, you know, not very dynamic, and we've done some things to improve on it. We created startup items. Again, they're bundles; we've talked about these before. They're stored in /System/Library/StartupItems. Each one of those bundles contains a little blob of data specifying dependencies. This means we can dynamically start up the system and boot things when they're ready. In fact, we even boot some startup items in parallel, since they don't have dependencies on each other.

Now, boot-up is changing. Unfortunately, the session that talked about that was yesterday, session 106. It's being replaced by something called launchd. One can think of it as inetd on steroids. It's going to be able to support any daemon on the system and make sure that it gets launched on demand when any configuration detail of it changes.

Some more runtime issues. In fact, I just talked about prebinding with one of the Xcode people right before this session, so a lot of this slide doesn't really matter anymore, but for the purpose of this session I'll still talk about prebinding. It was a method we had to increase the runtime performance of your application. Since we know all the libraries on the system, we can precalculate where they will load in your address space. Once we do that, we know where all the symbols are, and we can then record the linking information in your executable. Let's say, again, you use malloc. Well, we know that's going to be at, you know, nine kabillion something, and we can record that in your executable. That way, when we load, we check that the state of the world hasn't changed, and if it hasn't, we just let your application go, because we've already pre-linked everything. Unfortunately, what this means is we modify executables and libraries all the time. Whenever a library changes, we go modify everything and re-prebind everything. That's a problem for
some people. It means that backup and security tools get false positives for changes. Things like Tripwire, which do intrusion detection, don't work, because again there are false positives for changes. How is this all changing? Well, we've written a much, much, much faster dynamic linker in Tiger, so much so that we believe you as developers no longer need to worry about prebinding. I'm hoping that will make a lot of you happy.

Now, this is more of an issue if you're thinking of a 10.0 or 10.1 box, but our /bin/sh changed. It used to be zsh. zsh was almost POSIX compliant, but not quite. We switched to bash. bash at the time also happened to be faster, which was a nice perk, though that's not true anymore. But bash is POSIX compliant, and we like that. If you were using some zsh-isms, you might need to accommodate the change. If you're writing a portable shell script, hopefully none of this should matter to you.

Now, frameworks. Frameworks are something I touched on earlier. A framework is a bundle-based alternative to the UNIX hierarchy of libraries and resources. Again, it contains the library and headers, but it can also contain other shared resources. Let's say your library is a GUI library and it has pictures and sounds; they can be in your framework. Now, when it comes to actual compilation there is a difference. You don't say -lfoo to link against the framework; you say -framework Foo. That tells the compiler to look in a different location for it.

On the topic of file system portability: POSIX file systems are typically case sensitive, and they support sparse files, although that's not necessarily universal. Our native default file system for Mac OS X is HFS+. It's case insensitive but case preserving. That means you can have big Foo or little foo, but you can't have both at the same time. We also support resource forks, and the API for changing them used to be out of the way, and UNIX applications wouldn't know about it, which
created issues for backup tools or anybody copying files using UNIX tools. This has changed in the Tiger time frame, because resource forks are now going to be exposed as extended attributes, and any UNIX tool that is aware of extended attributes can deal with resource forks correctly. It is something to be aware of, considering that extended attributes are now coming into vogue across the entire industry. Finally, hard links are not supported by all file systems. It's something your application might need or use, but you need to be aware that, while they're supported by HFS+, some of the other file systems we have on the system, like the WebDAV file system, don't necessarily support them.

Now, something that is again unique to our platform when it comes to the file system hierarchy is the way we do scoping. We have four primary scopes on the system. We have the system scope: essentially software shipped by us that you shouldn't ever need to touch or manage. We have the network scope: it's where a network administrator might put anything they want to show up on all machines. We have the local scope: anything on the local machine. And finally the user scope: something in the user's home directory. How does this affect anything? Well, remember how we talked about frameworks? They can live in all four of these scopes. You have /System/Library/Frameworks; we have /Library/Frameworks, which is the local case; we have /Network/Library/Frameworks for the network case; and, as far as I know, you can put frameworks even in the user's home directory should you desire, in ~/Library/Frameworks. When you're developing, let's say, a framework, or really any application that wants to put files on our file system, we recommend that you try to conform to our file system hierarchy for where to place files.

On the topic of authentication and authorization, we have the Security framework. It's our preferred API for doing things. It has a lot of support for advanced technologies like smart cards and other
interesting authentication mechanisms like fingerprint readers, you name it. They're trying to support a lot of the advanced things that many organizations want. It's also a capability- and rights-based system. I won't go into the details of that, but it is important to keep in mind if you need to do authentication and authorization. And finally, while we do have the Security framework, we also have PAM available if you need a compatibility solution. Whether you have a PAM plug-in or are using the PAM APIs, we make sure that authentication and authorization route correctly if you need any kind of PAM compatibility.

Our development environment is a lot of familiar things and some different stuff at the same time. It's mostly a GNU toolchain: it's GCC and gdb and make, you know, the things you're familiar with. But we do have some differences that are good to be aware of. For example, the C preprocessor: we've modified it to support precompiled headers. You might see these in /usr/include; when you do, say, an ls *.p, you'll see some precompiled headers there. It helps us when compiling large applications that include lots and lots of headers, so we don't have to redo all the C preprocessor passes. Unfortunately, our C preprocessor doesn't support all of the new extensions to the GNU C preprocessor, which some of you might have consciously or subconsciously started using. How do you work around this? Well, it's -no-cpp-precomp. That'll get you your old GNU behavior back.

Now, Xcode. Obviously there have been many, many sessions about it this week. We highly recommend you use it. In the case of debugging, it has a very powerful visual debugger, so we would highly recommend you use it. And you don't necessarily need to go to a lot of effort to port your application's build system into Xcode. Xcode supports legacy targets. You can just say, look, there's a makefile over there, just go build against it, and you can keep your legacy build system, which is
probably desirable to you if you want to maintain portability, but you can still take full advantage of everything Xcode has to offer, with code searching and debugging and lots of yummy things like that.

Finally, some of you have some very, very, very big apps out there, so much so that you run into an interesting architectural anomaly on our platform. It's not... it just requires some different code generation. If you need more than 16 megabytes of text, you need to use -mlongcall on the compilation line to make sure your code gets generated correctly; otherwise it will fail to link. Again, it's just something to be aware of if you have a large executable.

If we talk about APIs now: again, we're not completely like other platforms, and there are some things you need to be aware of. If you're using poll, it happens to be emulated on our platform, and we highly recommend you use kqueue instead. If you can't, though, at least it's still there. At the moment it's emulated via select, so if you're familiar with those two APIs, you might be able to imagine the scaling problems we have with the current emulation technique. But if you don't need a large number of file descriptors, it's okay to use our poll emulation at the moment.

Also, dlopen for loading plug-ins: again, it's emulated. How does this affect you? Well, as we talked about, unloading plug-ins is not supported, and if you're taking advantage of the RTLD_NEXT functionality of dlopen, we don't support that at the moment. That's something to be aware of.

Another issue you should be aware of is that pthreads is partially implemented. It gets better and better with each release, but you need to be careful when you look for functionality on our platform. The biggest thing to be aware of with pthreads is that we don't necessarily support inter-process sharing of locks, for example. Also, POSIX message queues are missing. If you really, really need that kind of functionality, our current story
is that we recommend you maybe use Mach ports, but we hope you can find a different API.

Also, when it comes to System V shared memory, we support it, but it's very weak at the moment. If you look in our boot-up script, in /etc/rc, you'll see that we statically set the variables for that, and once those System V shared memory variables are set, they can't change for the life of the boot. So if you need those changed, you need to be aware of that.

Finally, well, not finally: while we do have OpenSSL on the platform, we would recommend you use CDSA. OpenSSL has some architectural limitations when it deals with, for example, smart cards. OpenSSL will hand the key around directly, whereas CDSA supports having the key represented as a handle, which could be connected up to a smart card somewhere else, where you don't actually even have physical access to the key.

A unique-ish language on our platform, a language opportunity, is Objective-C. Some of our frameworks are implemented in it, and you might need to use it. I actually recommend it; it's kind of fun to use. It looks like Smalltalk, and it'll probably take you less than a day to learn, so don't be scared of it, and you might enjoy it.

Finally, libtool versus GNU libtool. It's unfortunate, but these things happen: we have a name conflict. If you run libtool on our platform, you'll get a tool that we wrote many, many, many years ago at NeXT, called libtool, for dealing with library generation. It has nothing to do with GNU libtool. They're not even really functionally overlapping; it's just a conflict of names. That means on our platform, if you need to use GNU libtool, you call it glibtool.

Now, the last thing we're going to talk about today is Mach. Our story is that we don't want you using Mach if you don't need to. Mach has APIs, for example, for allocating memory in your address space, vm_allocate and vm_deallocate. We'd rather you not use those APIs; we have
perfectly fine POSIX APIs for doing that, mmap for example. But sometimes you do need to use it. If you need advanced control of your process priority, that's currently the best mechanism for doing that. Also, if your process ends up using libraries that use Mach, you need to be aware of the bootstrap namespaces. It's a Mach-ism. I really don't want to go into it at the moment, but what it means is that processes in your login session have a different context, Mach-wise, than, let's say, a system daemon. So if you need to talk to another daemon, this might cause problems for you.

Finally, when it comes to traditional UNIX APIs that are non-standardized, ptrace is only partially implemented on our platform. We only implement enough of it to do an attach. After that, gdb on our platform, for example, uses the Mach APIs for getting and setting state and controlling the process it's introspecting. I doubt very many of you are using ptrace, so that probably isn't something you need to worry about.

Now, who do you want to talk to after the session's over, after you leave WWDC? Well, you want to contact Jason Yow. He's our track manager, and he should hopefully be able to direct your inquiries to the right people around Apple if you have any questions. Also, we do have some documentation online if you need it. There's our Porting UNIX/Linux Applications to Mac OS X document, and this tells you where it is. We also have Porting Drivers to Mac OS X, if you're coming from a different platform, and again, these are the locations for those documents.
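
As an illustration of the file system portability point made earlier in the session (the default file system is case insensitive but case preserving, so Foo and foo cannot coexist), here is a minimal Python sketch of two checks a porter might run. The function names find_case_collisions and is_case_sensitive are hypothetical and not from the session, and str.lower() is only an approximation of the file system's real case-folding rules.

```python
import os
import tempfile
from collections import defaultdict


def find_case_collisions(names):
    """Group file names that differ only by letter case.

    Assumption: str.lower() is a rough stand-in for the file system's own
    case-folding rules, which may differ for some characters.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    # Only groups with more than one distinct spelling are collisions.
    return [sorted(set(group)) for group in groups.values() if len(set(group)) > 1]


def is_case_sensitive(directory):
    """Probe whether `directory` sits on a case-sensitive file system."""
    # The prefix contains upper-case letters, so the lower-cased name differs.
    with tempfile.NamedTemporaryFile(prefix="CaseProbe", dir=directory) as probe:
        head, tail = os.path.split(probe.name)
        # On a case-insensitive volume the lower-cased path finds the same file.
        return not os.path.exists(os.path.join(head, tail.lower()))


if __name__ == "__main__":
    print(find_case_collisions(["Makefile", "makefile", "README", "main.c"]))
    print(is_case_sensitive(tempfile.gettempdir()))
```

Running the collision check over a source tree's file list before unpacking it on a case-insensitive volume would flag pairs such as Makefile and makefile that would otherwise overwrite each other.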
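
The session also notes that hard links are not supported by every file system. One way a portable tool might cope, sketched in Python under stated assumptions, is to try the link and fall back to a copy. The helper name link_or_copy is hypothetical, and the set of errno values treated as "links unavailable" is an assumption, since the exact error varies by platform and file system.

```python
import errno
import os
import shutil
import tempfile

# errno values that commonly mean "this file system cannot hard-link here".
# Assumption: the exact code varies by platform and file system.
_NO_HARD_LINK = {
    getattr(errno, name)
    for name in ("EPERM", "ENOTSUP", "EOPNOTSUPP", "EXDEV")
    if hasattr(errno, name)
}


def link_or_copy(src, dst):
    """Hard-link src to dst, copying instead if links are unavailable."""
    try:
        os.link(src, dst)
        return "linked"
    except OSError as exc:
        if exc.errno not in _NO_HARD_LINK:
            raise  # e.g. dst already exists: not a portability problem
        shutil.copy2(src, dst)
        return "copied"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "original.txt")
        dst = os.path.join(workdir, "alias.txt")
        with open(src, "w") as handle:
            handle.write("hello\n")
        print(link_or_copy(src, dst))
```

Re-raising unrelated errors keeps genuine failures, such as an existing destination, visible instead of silently turning them into copies.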