---
title: WWDC2004 Session 100
framework: wwdc
role: article
path: wwdc/wwdc2004-100
---

# WWDC2004 Session 100

## Transcript

Good morning. It's a pleasure to start the OS Foundation track this year with a talk entitled "Beyond Syslog": the new log subsystem. We have a couple of interesting things planned for logging. Although they may look rather small, they're things that we'd like to be able to leverage in the future. Unfortunately, I have a confession to make: if you wanted to go ahead and use the technology I'm presenting today to make a new Dashboard widget to win the competition, the APIs that we have to show you today aren't actually on the CD, but they will be coming sometime before Tiger actually ships.

My name is Nikolai Cracovia and, as was just introduced, I'm in the BSD Technology Group. First I'd like to start with a background on syslog: what it is, what it does, what it does well, and what it does not so well. Then I'll also do a little bit on the motivations for why we're planning a new logging infrastructure, again from some of syslog's deficiencies. There's a little bit on system administration beyond just logging: plans that Apple has for the future for different things that UNIX administrators have been asking for. By the end of the talk you'll have a little familiarity with the API that we're planning, how to write log messages, how to search through the log messages, and best practices for when and what to log.

So what is syslog? Well, in a world without syslog you just open up a file and start printing records to it. What syslog provides you is a system-based facility for doing all of your logging through a central point. This allows a lot of configurability and keeps messages from being scattered all around your system; you don't have each application writing individual log files to various separate directories. By having the centralized facility we can then gain other functionality in addition to printf, things like being able to tag priority
messages, so a message like "my printer is on fire" is very different from "I just opened up a TCP connection," and being able to differentiate between these is very useful. You can also do simple log filtering based on these priorities. You can specify separate log files that get the messages: system.log in /var/log gets the majority of the data, but you can also do things like send authorization information to a file with slightly more secure permissions, or have mail messages or FTP connections logged to their own separate files. In extreme cases you can also write things directly to /dev/console. Again, "my printer is on fire" might be a very important message, so an administrator can configure it to go directly to the console as opposed to just a file, so users are more likely to see it immediately.

Syslog also provides a basic capability for remote logging. Unfortunately it was designed back in the heyday before denial-of-service attacks and our fancy encryption, so it's not necessarily the most secure, and it's not enabled by default, but we're looking into possibly [unclear].

So there are a number of things that syslog got right: a centralized logging facility, a centralized mechanism for dealing with log messages; being able to specify log levels for messages, everything from debug through emergency. It provides a lightweight API, so it's a very low barrier to entry for adoption; you don't want logging to be a difficult thing for your users to have to do. It's also very configurable, being able to filter on priority and level, and it has the ability to log to remote machines.

But there are a few things that we'd like to address in syslog. It has no internationalization support: you can only log messages with a subset of ASCII. This isn't very good for the number of languages that are very difficult to represent in just ASCII. The messages are also a little
too simple, providing only the timestamp and priority and then a message. There are a lot more things about a log message that you could insert that might be useful for a person who wants to look at those log messages. Well, you can add them in the log message string itself, but that exhibits another problem with syslog: it's an unstructured format, so it's difficult to parse and it's also difficult to search. You just have to use things like grep and fancy sorts of expressions to get what you want.

So our goals include adding a structured log format to make searching much faster and easier and, in particular from a programmatic perspective, to be able to attach rich data to these log messages as a separate field, so you can log things that won't necessarily appear through the log interface. Also a lightweight API: we don't want to make it too complicated for you to actually log your messages. And of course backwards compatibility: if your program is written today to use syslog, it should continue to work into the future.

Another thing is integrated log file maintenance. Currently log files are rotated by some external application, usually through a cron-like mechanism, but cron really doesn't know a whole lot about log messages, so it'd be really nice to be able to move some of that functionality inward. And then extensible input and output facilities: being able to control whether or not you get log messages from different available resources, or being able to specify the outputs. I'll give some examples of that later.

For log file maintenance, I think it would be really nice to have configurable log file expiration, so a system administrator can say, "Keep all messages higher than warning; those are going to be messages that I want to archive well into the future, but I might only want to keep the last 200 info messages," and those can be just as a rotating
sort of buffer. You might want to say, "Don't keep debug messages from an app that continually spews out into the syslog," because they're just cluttering up your logs and you don't find that information valuable. We're also thinking about things like not spinning up the disk just to write these log messages, so you can save battery life. We'd also like to add filtering on compound expressions, so your rules can have complicated conditions: cron messages that have a log level greater than notice, which you might want to do something special with, or all of my log messages from yesterday that were of a warning level.

Back to that input-output thing I mentioned. ASL, the next-generation syslog, will have a syslogd module, so it can handle input from all of the standard interfaces that syslog expects it from, like /dev/log and such, and also from UDP, if you wish to enable that because you're dependent on that functionality. For writing, it will have full syslog.conf support, so you can have messages still continue to go to their old locations, although that's not the preferred interface for some things, like searching. The local store, as you heard yesterday, is via SQLite in Tiger, and it'll give you the facility to do a little bit richer searching on your log messages, and also to go through a programmatic interface, so your individual app can display a short history of its debug output without having to send the user to a separate location to find that data.

We have a couple of ideas for possible future extensions. I mentioned that remote logging is not secure, so maybe we come up with a way of remedying that sort of problem. Being able to do logging from SNMP information, which we would integrate into the logging infrastructure. Maybe you want to advertise some of your syslog information on your blog: you can do
things like live RSS. Once the information is in a separate, modular sort of interface you can do all sorts of things. Maybe the machine in the bedroom can message the machine in the den: "My disk is full, I need help."

The individual log records themselves have a couple of components that are automatically filled in for you. We're going to try to capture the user information both as the username string and as a UID, as separate record fields, because those might not necessarily match all the time. Similarly group information: you might want to query across a group to see which applications are doing logging. A timestamp for, of course, when the log message was received. The machine name, mostly for future extensibility, to be able to support remote logging. And then the application name itself, which is derived simply from argv[0], so if you wish to override that you can change it.

Then, of course, there are things that you have to specifically set through the API. The log level itself, which determines a little bit how it gets filtered. The message itself. And we're also providing an ancillary data field, so you can go ahead and attach a string that would not necessarily show up in the central log message but is searchable as a separate field.

When is it a good idea to log information? Well, really, all the time: any time you do anything useful as a system service, like if you start listening on a socket or start advertising a service, if you disable the service, parsing configuration files, creating new files. One of the reasons this is useful is that if things start to break down, you can provide error messages saying, "This file was corrupted and I can't deal with it." Some people have had applications just go ahead and do a couple of bounces in the Dock and not do anything, and that's really not a very good interface. To be able to provide some sort of mechanism for the user to figure out what exactly went
wrong, and maybe possibly correct it, would be very useful. So often it's for diagnostic information for various things: not just hard application errors, but cases where the system is in a consistent state and the application can deal with a particular problem, but it may not be ideal, and log messages can give the user information to tune their system. Again, it provides a simple audit trail to figure out what the application is actually doing. One of the nice things about Mac OS X is that you get to hide a lot of that complexity from the lower layers behind a nice user interface, but it's also nice to keep track of that information, so if a person does need to use it, it is available.

For ASL we're planning on a log level policy very similar to what syslog currently uses. It's extensible, and the names for these are going to be arbitrary strings that you can use to set these log levels, but for right now, as we migrate from syslog, we want to be able to provide direct syslog compatibility.

At the top level there is an emergency log level, typically not for most application developers. It's more for things like the kernel to say, "This system has completely stopped; I can't do anything more." It's a wonder this log message actually gets written at all. It's to the point where you're archiving the state of the world for future generations to come back and determine what happened.

Alerts are things that the user can handle, and should handle immediately: things like disk full, or maybe my network disconnected and I'm absolutely dependent on that network. That's not quite the same as, say, a hard device error, like "I had a problem reading from this hard drive." Again, these are some of the kernel-level facilities, and things like that are
higher level than some applications. Application things start right around error, where it's a hard, fatal error: you were expecting something, that's not true anymore, and you can't do anything but exit. This will at least provide a mechanism for you to give the user information about why you have to quit.

Warnings, again, are non-fatal errors: things that the system or your application may not necessarily have a problem dealing with, but that shouldn't be that way, and that you should let the user know about, like a configuration file that is consistently missing and has to be regenerated.

Notice is there for things that aren't errors but that you may still want to inform the user about, like "I'm starting a particular service," "I've turned SSH on," or "I'm allowing remote printing."

Info allows you to log informational messages, such as "this user logged in from a remote connection" or "there's mail waiting for you," or any number of different things that could be useful for the user to know about.

And then debug provides, again, a lower level, so you can do things like lightweight tracing. Syslog isn't a profiling utility, so it shouldn't be used as such, but it can give the user a clear set of steps showing how the program got to where it did, and then you as the application developer can do things like log analysis to figure out what went wrong.

There's a command-line interface. Currently it's the same as what we have for syslog compatibility: the command called logger. Logger by itself with a string will log just that string at some default priority level. You can also do things slightly more sophisticated, like specifying the actual app name that gets logged and the priority, from the command line. So this would log "starting backup" at log level notice with the application name backup.sh.

We're also planning a query interface on the command line. It'll start
to obsolete using grep, and it'll provide a little bit more functionality. Currently we haven't decided if this command is going to live within logger or be a separate utility, but the interface will be something like this: you could specify logger -q for a query, and then you can do globs on a particular field. By default it would glob the message itself, so this would get both "starting backup" and "started testing." Or you could ask for things like "give me all log messages from backup.sh with a priority level of notice."

The API, again much like syslog, is rather straightforward. All the headers that you need are contained within asl.h, and there are very few basic data structures that you need. One of them is the opaque struct ASL ref. It's returned by ASL open; if it's null you have an error, and you really can't log the fact that you have an error. ASL log then is called with the ref, a tag that specifies the actual priority, and a printf-style format string that specifies the log message itself. And it's always nice to close your references if you don't need them anymore.

I also mentioned that you can specify ancillary data. Using the ASL set command you can specify a key and its data: you can add, say, an additional arbitrary string, like a hostname, to the log message. Then as you call ASL log, that extra information is going to be logged with the next log message, and then that field is cleared, so that if I called ASL log again you would not have the hostname set.

How do we search? In addition to the ASL ref we also need another opaque struct called ASL response. Opening the log reference is the same, and then you call ASL set with a particular key that you wish to search for and what that key should match. So setting level debug for the ASL key level will return all messages that match the level debug
throughout your log archive. Then you call ASL search on that reference, and that returns an ASL response. With an ASL response you can iterate by calling ASL response next, and to get the individual fields from the log message you call ASL response get on that response. In this case you'll be retrieving the sender, the PID, and the message, and then printing them in a format similar to an old-style syslog message. Then ASL response free cleans up the references, along with ASL close.

So, in summary: we have a consolidated programming interface for accessing log information, something that really hasn't quite been done before. A unified log information store, again being able to query a single SQLite database as the back end, but also being able to specify other locations that the messages could go. A modular interface for receiving and storing messages. That API isn't planned for Tiger, but we're looking at opening things up in the future so you can write your own ASL modules to deal with these messages, both input and output. And it's just in general a good debugging tool for users to find out what went wrong with their system and whether they can do anything about it.

At this time I'd like to bring up Jordan Hubbard for beyond syslog: things in system administration that go beyond the scope of just looking at logs.

Thanks, [unclear]. OK, so my portion of the talk might be titled "Beyond Beyond Syslog," because we realized that in this session we're going to have a lot of system administrators in the audience, and we figured this would be a good opportunity to talk about some stuff that we may not actually be able to have in Tiger but that are all works in progress. We're certainly going to try to get some or all of them done for Tiger, but no promises. This is the direction we're going in and the types of problems we're trying to solve.

So one thing we've noticed that's
interesting about Apple's evolving user base, if I can put it that way, is that we're getting a lot more UNIX people, and they tend to be system administrators. They tend to deploy in IT environments; they're deploying Xserves, which is obviously a new product for Apple and somewhat outside the boundaries of the traditional desktop market. So they're bringing to us a lot of interesting new problems, shall I say, that I'll cover here.

I'll say the number one thing they ask for is better package management. Unfortunately, package management is actually hard. It's not a very deep problem space, but it's a very wide one: dealing with dependency tracking, dealing with undo, dealing with arbitrary scripts that might run, such as post-installation scripts, where you have some really strange piece of software that has to go off and perturb the system, maybe add users, perhaps configure something in the I/O Registry. You never really know what an arbitrary package is going to try to do, so getting it to do that in a way which is flexible and gives you some prayer of security is not an easy problem to solve.

These are some of the issues that we have with today's package management, by which I mean the .pkg and .mpkg files that the Installer eats. The number one gripe: I can't uninstall what I've installed, or I can't back out. If I've installed a software update which for some unforeseen reason doesn't leave my system in the best condition afterwards, I'd like to be able to go back in time.

There's no dependency tracking across packages, so if you have several packages sitting in a directory together and one depends on the other, there's really no notion of that at all. The only thing that Apple provides today is something called the mpkg, which is basically just an agglomeration of a package and all of its dependencies, and if you have two mpkgs which contain the same subcomponents, they'll both stupidly
install the same subcomponents over and over again, plus of course you have the overhead of downloading the same bits several times.

We have /Library/Receipts, and the BOM, or bill of materials, files are pretty much the only installed-software database that you get. Obviously this is insufficient: it doesn't let you do queries, and it doesn't let you find out at a glance what's installed on your system, when it was installed, and other useful things. There's no file or package conflict checking, so if two packages claim the same file, that's just fine, which of course it's not. And upgrade support is really pretty weak with the current system: it basically just checks that, I think, you're monotonically increasing in version numbers and then splats the new files into place. There is no command-line interface that I really know of for installing packages.

So the future is essentially to look at all of these problems, which we obviously have been doing, and come up with something that doesn't become the spiritual embodiment of second-system syndrome. That is basically the problem that most people have when they go out to write a next-generation package system: they want to solve all the problems that they've seen today, and they end up noodling around for about three or four years and producing either something that's too arcane to ever use or something that never sees the light of day, which is more often the case. We have a couple of efforts that we started and then stopped again because we saw the danger signs of that occurring and put the brakes on.

What we're basically committed to doing at this point is doing it in the open: developing a new package management system with the BSD copyright, so it's usable by any of the open source operating systems out there, and doing it as a cooperative effort with open source developers. Because we realized that if we try to do this internally, it will turn into a horror: we will stick Prolog in there,
we will use XML in all kinds of evil and twisted ways, it will do dynamic XSLT transforms on the packing list. We have lots of technology, and our tendency is to want to use all of it, and that's not necessarily a good thing in this space. So we're going to do this with outside developers as well, so we can maybe keep this a little bit sane.

Some other issues. There's a lot of third-party software out there. If you look at, for example, the FreeBSD ports collection, I believe they're up to 11,000 ports. That's a lot of software, and Mac users are really no different in having the same desires: they would like to be able to install arbitrary bits of third-party software on their system, hopefully in a reasonably safe way, so that it doesn't go and clobber system components or attempt to do so. So probably the second biggest request we get is, "When are you going to come up with something like the ports collection?"

As I said, there are lots and lots of these things, and basically right now what's going on is that people are just tossing build recipes back and forth furtively in the dark and saying, "This is how I ported this particular thing," or they create blogs and trade information back and forth. But supporting these things yourself is not really a great thing to do.

Obviously, as I said, with FreeBSD ports there are existing attempts to solve this problem. There's NetBSD as well: among the current offerings on Mac OS X is NetBSD's pkgsrc, which is derived from the FreeBSD ports collection. It actually does work on Mac OS X today. I can't say that I've used it, so I don't know how well it works, but they do claim Mac OS X as one of their supported platforms. There's something called DarwinPorts, which I and a couple of other people started about two years ago. This was our attempt to solve this problem using Tcl and what we thought was hopefully a higher-level framework than what FreeBSD ports provides, which
is basically makefiles. But it's not ideal. We learned a lot of really useful lessons with DarwinPorts, but we see that we still have a way to go with it. And of course there's the ubiquitous Fink, which I'm sure a number of you either use or have used. They certainly have the leadership role in terms of providing the widest collection of pre-ported software, and they're very responsive in updating their ports and keeping them actually working. That's an important point: it doesn't matter how elegant your wrapping system is for automating the building of software if you don't have a fairly vibrant community of people who are actively maintaining it.

This is one area where you look at, for example, the Debian folks, and they have infrastructure coming out of various orifices. It's really impressive, because they've really solved a lot of the fringe aspects of the problem space. By fringe I don't mean in terms of importance, just in terms of what most programmers consider interesting: web pages which document what software is out there, what version it's currently at, who maintains it, the list of bugs filed against it, which architectures the packages are available for. So they're really a good model of a vibrant organization around porting software.

As for the future, we're not sure what the future holds here. We'll continue to back DarwinPorts as a pretty good solution, but we're actually looking at cooperating with some of the Linux distributions, which have also started to run into the same problem. Some of them are saying SRPMs are not really the right way to build software: it was a good solution for its time, but we want something similar to the FreeBSD ports collection, in the sense that we want a massive tree of build recipes. It's important to note that end users may consume packages, but those packages have to come from somewhere, and so creating the infrastructure and environment for
building these packages and maintaining that is at least as important as coming up with a good package management system. So we're looking at some interesting ways of doing maybe an end-to-end solution, where you start with an XML recipe which describes how to build the software; then, as it goes along the food chain and gets built into some sort of destination root and is finally packaged up, that same XML file simply has more properties added to it which describe the package metadata, and it's bundled in as part of the package itself, so you never actually lose the original build recipe.

Some other issues. System configuration is still difficult. It involves lots and lots of different configuration files scattered around the system. You're never even sure at any given time which daemons are configured or will launch in response to some stimulus. And then of course once you know which daemons are out there, you have to learn their configuration formats, and it seems like every single daemon in the system has invented its own. Apache's got httpd.conf, Samba has smb.conf; you name a daemon, I'll name you a unique configuration format which applies to that daemon and that daemon only.

Different services also have different semantics. In some cases you can edit the configuration file and then you need to go tell the daemon that its configuration file changed by sending it a special signal. In other cases there's a management command, like apachectl, which you need to run to restart it in some sort of civilized fashion. In most cases there's nothing: you go and find the daemon in question, you shoot it down, you restart it, and you hope it restarts and didn't leave state files lying around which annoy it the second time it comes up. Basically a reboot is the only assured method for restarting it properly, which is, by the way, one of the reasons why some of the software updates reboot you. It isn't just because the kernel has changed; it's
because it's altered some service, and the only way to get the new service running is to reboot your system. That blows.

There's also no notion of a configuration parameter space that daemons can share or rendezvous in. In some cases a number of daemons would like to share a common knob that controls some global resource, and they should all respect it; or a global knob that says, "No daemons should run at all right now, please, I'm doing a backup." Again, there's no notion of that. The closest thing we have to a shared configuration space is the sysctl table in the kernel, and obviously that's pretty low level and only really allows you to configure kernel knobs. It doesn't allow you to configure things up in user space, nor should it. And the whole remote management issue is basically nonexistent, other than SSHing in and doing it through the command-line interface.

So we've been looking at this problem very hard, and one notion that we came up with, which we've been prototyping and playing with, is the notion of a configuration API. That's good for a lot of different reasons, and the chief reason I really like it is that it finally gives you a path to go down if you want to create a new service and say, "I don't want to invent my own configuration file format; there are quite enough, thank you." There's nothing in libc or any of the standard UNIX APIs that allows you to do that. OK, if you use Core Foundation you can use CFPreferences, which is a good start for this; it definitely eliminated a lot of dot files in your home directory. But if you're not using CF, if you're a UNIX daemon, that doesn't really do you any good unless you want to drag all of CF in with your set of libraries. So we're looking at a fairly low-level service which lets you set and get properties, query, and walk the tree: something similar to an SNMP MIB, but a little bit with
more structured data. So you have a configuration space, you have APIs for getting and setting the stuff, and you have a persistent data store. An SQLite database is probably our first choice right now, so you have tables and rows, and you can also bring up the SQLite tool to browse it if you want to do it that way. But we're not married to that right now; it's just one potential data store.

The most important aspect of this is that it's application agnostic. You can write a Cocoa browser which lets you browse these things and tweak stuff. You can have services just programmatically setting and getting values. You'll have a command-line tool with which any daemon that conforms to this new API will have its knobs registered, so you don't have to think about which specific command goes with a specific daemon. And obviously if you also want to have a web front end to it for remote management or whatever, you could do that, or you could write a proxy daemon which authenticates a remote client, maybe a Cocoa application again, and lets you do this remotely.

Some of the issues that are going to make this a difficult problem: we have a number of existing configuration data stores. There's the SCDynamicStore stuff that configd uses; there's, surprisingly, NetInfo, which still stores certain system parameters. So we need to merge those, and you need backwards compatibility. One of the cardinal sins in the UNIX market is to come out with a bold new mechanism that completely changes all the rules and say, "But it's really better, you really want it because it's better," and they say, "I don't care. I can write Apache configuration files in my sleep and I want to keep doing that, thank you very much," or, "I share a set of common configuration files across my Linux machines, my Mac OS X boxes, my FreeBSD boxes, and I want that scheme to continue to work." So we are looking at the notion of configuration file
parsing plugins that map from one space to the other, which is not easy to do because there's an imperfect match for a lot of these configuration files. And you of course have the issue of what happens when you update one and not the other. Do you try to back-propagate the change if you make it to the centralized data store but don't make it to the flat ASCII file that it came from? Or do you just do it one way, essentially as a migration tool? So there are a lot of interesting trade-offs there, and we're looking at that very closely.

You also need to come up with a fairly high-level API for this, so that you can do something similar to plists, where you have dictionaries and arrays and arbitrarily complex data types, so that people don't just use it as a binary blob data store and still have to pick the data apart themselves. But obviously you don't know the full range of configuration data that people are going to want to store, so you need to pick a very flexible format. The plist format actually isn't too bad a one, because it is quite flexible that way. And of course you need things like transactions and rollback. If I make a whole bunch of configuration changes and then it blows the system up, or somebody did at three o'clock in the morning, because you have five or six administrators all sharing the same duties, I'd like to be able to come in at 6 a.m. and say, "Burt messed up the system" (you probably wouldn't use that word, but anyway), and you want to be able to roll it back. So that's important, or even to see the transaction history to see whose fingerprints are all over this mess.

Surprisingly enough, we've been trying to kill NetInfo for several years, and it looks like NetInfo might actually be a good solution for this. So maybe rather than killing it utterly we simply reinvent it; we do a kind of lateral transition. Now let me be really clear about this: we've been trying to kill NetInfo
because it does have some fragility issues when you use it as your central user, group, and hosts data store, and more importantly it doesn't interoperate with others. It's not LDAP, it's not Active Directory. So the whole notion of Open Directory is to take those problems away from NetInfo and do them in a more layered and abstract fashion, and it does that quite well. But if you took NetInfo and said, "Well, this is just a configuration data store now, and it's no longer part of the critical path for absolutely every login or user ID lookup," it's actually easy to make it robust for that particular application. It also has the notion of a distributed database, so you can have NetInfo parent domains and parents for those, so there's a hierarchical network namespace. That turns out to be really useful for storing configuration data, because you end up having system administration domains (the art department, the sales department, the engineering department), and they may have common knobs that are department-wide, and then you have knobs that are specific to the machine. So we're looking at potentially bending NetInfo to our will, giving it a sexy new French name, and turning it into something else.

Some other design goals, and this doesn't just apply to the problem space I mentioned but overall: anything doable via the UI should be doable on the command line. That's one of the number one bricks to the head we've taken over the last year or so. Now don't clap yet, because I didn't promise this; I said this is a design goal, and it's something we are getting to. Every time I go out to Hollywood, or to somebody who uses a lot of machines, this is one of the number one things they hit me with. So we do understand this is a really important goal. Some of the stuff that Apple has done is very UI-centric, like Software Update, and it took a little while to make it dual mode. So in some cases you have
to go back and do the retroactive design, but going forward we're really clear on the fact that you need to design that in as a goal from the beginning.

Another complaint we get is: "The UI did something and I have no idea what. I clicked on something and some file was modified or some data store had an edit made to it. Can you tell me what that was? I might also want to be able to reproduce that action on the command line elsewhere, where I don't have that UI." So the UI can be a great way of showing you how to do something, if it will show you how it did it. That's another important design goal for everything that we're doing in the future.

Documentation for command-line tools turns out to be important, so we're actually now doing this as part of our builds. We have a verification phase which goes and says, "Hey, Mr. Maintainer, you installed something in /usr/bin and it does not have a man page," and it slaps you upside the head repeatedly until you do something about it.

Something that we're also trying to be really cognizant of is that the road is littered with the decaying skeletons of those UNIX folks who have gone before us and tried to make this work. It is a hard problem space to find the right balance for. If you go too hog wild in front-ending everything, the admins scream that you're dumbing things down, or that you didn't give them a button for exactly what they wanted to do, how dare you, et cetera. And if you make it too simplistic they won't use it; they say, "What's the point? I can do this on the command line; my fingers are hard-wired for that command. What good is your UI tool?" If you look at a lot of things that people have done in the industry, you'll see things on both sides of the number line. One of the reasons I wanted to cover these points here is that these are all works in progress; these are all things that
are either under investigation or are being prototyped, and this is an area where administrator feedback, from people who are actually at the rock face doing this stuff day in and day out, can really, really help us. Because again, we're engineers, so we'll go out and build an elephant where a mouse is called for, or vice versa, if you just leave us alone with the specs and no adult supervision. But if we have real-world scenarios and definite, strong input (it must do this, it must do that, it must not do this other thing), that is incredibly important to us. So I encourage you to send me email, contact us through channels, and file enhancement reports. You don't just have to file bug reports: when you go to Bug Reporter, you can also say, "I'd like this enhanced; I'd like a new feature, please." So I encourage you to do that.
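
Editorial sketch: the configuration service described in this session (set and get properties, walk the tree, structured values rather than binary blobs, transactions with rollback, and a history of who changed what) can be illustrated with a small example. This is not the API Apple was prototyping, which the session does not show. The class name `ConfigStore`, the dotted key names, the JSON encoding of values, and the `history` table are all assumptions made for illustration, and SQLite is used only because the speaker names it as one candidate data store.

```python
import json
import sqlite3
from contextlib import contextmanager


class ConfigStore:
    """Hypothetical property store; every name here is illustrative."""

    def __init__(self, path=":memory:"):
        # isolation_level=None puts sqlite3 in autocommit mode, so the
        # BEGIN / COMMIT / ROLLBACK statements below are fully explicit.
        self.db = sqlite3.connect(path, isolation_level=None)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS props (key TEXT PRIMARY KEY, value TEXT)")
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS history "
            "(id INTEGER PRIMARY KEY, who TEXT, key TEXT, old TEXT, new TEXT)")

    def get(self, key, default=None):
        row = self.db.execute(
            "SELECT value FROM props WHERE key = ?", (key,)).fetchone()
        return json.loads(row[0]) if row else default

    def set(self, key, value, who="unknown"):
        # Values are stored as JSON so dictionaries and arrays survive
        # a round trip instead of becoming opaque blobs.
        old = self.db.execute(
            "SELECT value FROM props WHERE key = ?", (key,)).fetchone()
        new = json.dumps(value)
        self.db.execute(
            "INSERT OR REPLACE INTO props (key, value) VALUES (?, ?)", (key, new))
        self.db.execute(
            "INSERT INTO history (who, key, old, new) VALUES (?, ?, ?, ?)",
            (who, key, old[0] if old else None, new))

    def walk(self, prefix=""):
        # "Walk the tree": list every property under a dotted prefix.
        rows = self.db.execute("SELECT key, value FROM props ORDER BY key")
        return [(k, json.loads(v)) for k, v in rows if k.startswith(prefix)]

    def history(self, key):
        # Committed changes only; rolled-back changes leave no trace.
        rows = self.db.execute(
            "SELECT who, old, new FROM history WHERE key = ? ORDER BY id", (key,))
        return [(who, json.loads(o) if o is not None else None, json.loads(n))
                for who, o, n in rows]

    @contextmanager
    def transaction(self):
        self.db.execute("BEGIN")
        try:
            yield self
        except Exception:
            self.db.execute("ROLLBACK")
            raise
        else:
            self.db.execute("COMMIT")


store = ConfigStore()
store.set("global.daemons_enabled", True, who="alice")
store.set("backupd.schedule", {"hour": 3, "days": ["mon", "thu"]}, who="alice")

try:
    with store.transaction():
        store.set("backupd.schedule", {"hour": 99}, who="burt")
        store.set("global.daemons_enabled", False, who="burt")
        raise ValueError("hour out of range")  # simulated failure
except ValueError:
    pass  # both of burt's changes were rolled back together
```

Because values are JSON-encoded, a daemon can read back a dictionary or array directly. A failed batch of changes leaves the store as it was, which is the rollback behavior the speaker asks for, and the `history` table is one simple way to answer the "whose fingerprints are on this" question for committed changes.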
