WWDC2004 Session 438

Transcript

I'm Dominic Giampaolo, a member of the Spotlight team, and this talk is Working with Spotlight. First I'd like to go over what we're going to cover today. The agenda is: why Spotlight and how it works, what it is that we were trying to solve and what we tried to accomplish; then integrating your app with Spotlight, what that means and what the different parts of it are; searching with Spotlight; and, of course, working with metadata. The main focus of the talk is really going to be what it means to integrate your app with Spotlight.

First off, what's the problem? With any engineering task, before you start it, you of course want to put it in a box and define it: what is it that we're trying to solve here? It's hard to find things on your computer. I think we've all run into this. So why is it hard? There are too many files. If you're anything like me, you've accumulated a few files over the years, a couple tens of thousands. Then there are digital cameras; that's another couple thousand, ten thousand files. And all those MP3 files, that's another bunch of files. Any movies that you've created or downloaded; it starts to accumulate.

If you've bothered to put things into some kind of organization, that's really nice. I'm a bit of a librarian in a sense, so I have a nice broad hierarchy, but everything's fixed into a single location, and that's not always what you want. You may have files that fit into multiple categories. I just got back from a trip and I took lots of pictures of flowers and other pictures of mountains, but in different countries. If I want to say, show me all the flower pictures we took in France, well, that's one thing. But if I want to say, just show me all the flower pictures from the trip, including France and Italy and wherever else, I can't do it. There's no easy way to organize things like that.

Next, there's also a lot of rich information about files, and we're just not using it. So even though
MP3 files are tagged with a lot of rich metadata, email has quite a bit of metadata, and JPEG images of course carry the EXIF information from the camera, which is quite a bit, you may want to say, show me everything with an f-stop of less than 3.0, or everything shot close up or at long distance, but there's no way to easily search for that. So there's just no easy way to find this information.

What's the Spotlight solution? Obviously we want to make it fast and easy to find files, as opposed, of course, to making it confusing and difficult, which, I suppose, if that was the goal, we're already done. So how do we want to do this? By using metadata to enable richer searches. Metadata is information about the data or the contents of a file. We want to use that to enable users to search in a more natural way. This allows you to organize files in multiple ways, like the example I was giving before: mountains in France, or French pictures, or pictures I took in France. These are different axes that you can flip around. We'd also like to allow for additional metadata, things that maybe weren't originally envisioned as being associated with the file but that you need, such as workflow state.

The last point is a very key one: we don't want to require apps to change. It would be great if we could wave our hands, ta-da, new world, everyone's rewritten all their apps, we all work with metadata and we're all very happy. But the world doesn't work that way. You have lots of code, and it's difficult to change, so the less we ask you to do, the better.

Now I'm going to go through a quick demo of Spotlight to cover a couple of things that I'd like to highlight for you. First, of course, we'll start with the Finder. Some of this we saw in the keynote, but there are a couple of subtleties that I wanted to point out. For the first query I type, of course, HTML. We find 779 items out of whatever is on this disk, and that's pretty fast. If I was
to type something like JPEG, JPG, like this, we find 1,183, which is a little bit bigger. A few people missed this in the keynote, or at least so I heard, but there are smart folders. So if I type JPEG down here, now I have a Smart Folder of JPEG images. I have another one of HTML as well. So you can save your searches and come back to them, and of course they're re-executed at that time.

Another thing: if I was to type something like Frederic, okay, nothing matches. So I come over here, just position these windows appropriately, and I go and create myself a new folder. But I'm not going to type Frederic normally; I'm going to do it the French way, because there are so many wonderful French people at Apple. So we type it with a few extra accents there, and when I hit return, notice that it showed up automatically in this query over here. So even though I typed the e's without the accent, we're doing case and diacritic insensitivity on the matches. And you saw that it was live as well: it showed up when I created it, and if I was to drag that to the trash, it disappears from the query.

Next I'm going to show you a little application, well, not a little but a very nice application that we wrote internally, called Bull Search, to demonstrate another feature that we have called grouping. If I was to search for JPG again, we find a bunch of items here, and I'm going to organize that. There's an option for grouping, so I choose to organize these by title. Some of these things were actually QuickTime movies that were compressed with JPEG compression. So now you see there's this set of virtual folders that were automatically created based on the title. There were five different versions of the Dungeons and Dragons trailer, and they get grouped together because they all have the same title. So it's sort of synthetic virtual folders that get created on the fly, based on the set of attributes that you're grouping by, and this is a very powerful way to kind
of build virtual hierarchies on the fly. So Finding Nemo, again, there's only three versions of that film; these are different resolutions or bit rates for the web and so on. This is kind of a nice way to organize things.

Now the key part of this, let me just quit that, is that I'm going to run Microsoft Word. I'll fire that up here for a second, and I'm going to bring up another Finder window. I'm going to type the word outrageous. Nothing matches the word outrageous. Now, in Word: an outrageous document, baby. So I've just created this, and I save it, and I'm going to call it whatever, because if it had the word outrageous in the title that would be too easy. And so of course it automatically matched on the content and showed up.

Now the key thing to observe: we didn't change Word, right? We don't have access to Word, so we didn't actually do anything there. Find by content, that's pretty straightforward. Another thing, though, that's a little bit more subtle: if you pull up the property sheet, and this was alluded to in Bertrand's demo, you can see that there's some metadata here that was automatically filled in for me, both in the title and the author field. So I just click OK there. If I come back over here and type something else that matches nothing, you can see there's nothing in the query. Then I type my last name, and it matches the document, because we extracted that metadata. Again, this is the role of importers, which is the key thing that we're going to talk about here in a second; this is how you get your app integrated into Spotlight. So without any changes to Microsoft Word whatsoever, with the simple addition of this importer, we've managed to get it integrated very seamlessly, so you can search for things by their author and so on and so forth.

Okay, so let's quit out of here and clean this up. I won't bother with that. Going back to the slides, if we can, for a second. Why do you care? Spotlight enriches the user experience, plain and
simple. It makes your documents easier to find. When you're integrated properly, by the presence of an importer, users can find your documents based on things that they remember about them, not just the title. That can take many different forms, some of which we're going to go over.

It doesn't require any code changes in your app. There are things that you can do to take additional advantage of Spotlight if you want, but without doing anything at all to the app, like we did with Microsoft Word, you can get integrated into the Spotlight system. Users can find their documents more easily; it's like an additional feature for almost no work whatsoever.

And it's another way to share data with applications, so that applications don't necessarily have to know everybody else's file format. The metadata that's important can be published by the importer, and then it's more easily accessible to other applications. They don't have to go through and parse your file format; there's a uniform way to access it.

Now we're going to talk a little bit about the Spotlight architecture, so you can understand how it's put together and where you fit into the equation. Spotlight is a system for storing, retrieving, and querying information about files. It's composed of a server which runs in the background, daemons that help the server, and of course importers. I should not forget to mention the client API, which is part of Core Services. The importers are the connection from the rest of the world to the system that stores the metadata.

What does it look like? Let's take the tour. Over here on the left side you have an application which goes and writes a file. When that file is written, the system notices this, and an importer is run to extract metadata from that file. The importer is connected up to the Spotlight server, which stores the metadata into the system store. On the right hand
side of the picture you have, well, a Finder icon, which could be any application; it issues queries, receives results, and can display those. Not a lot of apps have a need for that, but for those that do, that's the final piece of it.

There are three main concepts in the Spotlight system. Of course you have importers, mentioned here using the actual code terminology, MDImporter, which is how you extract metadata from the file and publish it to the system. That's pretty straightforward, and we're going to write one later on in the talk, in a minute here. You have an MDQuery, which is a way to write an expression about the attributes of the files that you want to find, and to retrieve them. And then the leaf items are MDItems, which represent files. Items are made up of attributes, and I use the words metadata and attributes sort of interchangeably. An attribute has a name, a type, and a value, and represents some information about the file.

Ways to integrate with Spotlight: you can write an importer if you have a custom file format. If you work with standard file formats such as JPEG or AIFF or MP3, you don't have to do anything. We're going to cover the basic data types that Apple supports natively, so there's no work to be done if you work with a standard file format, with some caveats, in the sense that you want to put metadata in there if you can. But the first thing you can do, if you have a custom file format, is to write an importer. This is what enables sophisticated searches for your documents.

You want to put useful metadata in your documents. That's what I was saying: if you can, for example with the EXIF data that comes from a camera, preserve it, make sure it stays in there, or put additional information in there that we can extract, because a lot of file formats already have support for a variety of metadata. And then, if you need it, the final level of integration is to actually use Spotlight
queries for tracking documents or displaying results.

Now we're going to switch to talking about importers, and in this section of the talk we will actually go through and create one, write it, install it, and show you how it works. What are the rules of the game? Importers need to publish metadata that helps users search. I'm kind of harping on this, but it's what you want. You want to allow for richer previews, in the sense that some attributes are difficult to compute. Take the length of a song: if you have a variable bitrate file, computing the duration is difficult, so you would want to compute that once and store it as an attribute, so that we can say, find me any songs that are longer than three minutes. That's useful.

You want to avoid publishing metadata that is private data, binary data, or icon previews. This is not what Spotlight is about. Spotlight is a fast and efficient way to search for user-oriented data, things that users remember: the labels of layers in a Photoshop document, the names of tracks in a multitrack audio editor or movie editor. These are things that people would remember and would want to search for. A chunk of some data structure that's internal to your app, that's binary, that the user has no connection to, is not something that they want to search for. And at the other end of the spectrum, too much noise, too many attributes, can confuse the user. If you have 500 attributes, that's probably not the right approach.

So what are attributes? Examples of good attributes: copyright, title, author, dimensions. There's a special attribute called kMDItemTextContent, which we use to represent the text content of a document, and this is how we do the full-text searches. That can take a couple of different forms, and I'll cover that in a couple of minutes. Some bad attributes would be specific implementation details, or binary data that the user can't easily search on.
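
As a rough illustration of those rules of thumb, here is a minimal sketch in plain C. It is not Spotlight API: every type, field, and function name below is a hypothetical stand-in invented for this example, and a real importer would work with CoreFoundation values instead. It only shows the filtering idea: keep user-oriented attributes, drop binary or private ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kinds of values an importer might
   consider publishing; not part of any Apple header. */
typedef enum {
    ATTR_STRING,       /* e.g. a title or a copyright notice */
    ATTR_NUMBER,       /* e.g. a duration or a pixel dimension */
    ATTR_STRING_LIST,  /* e.g. authors or layer names */
    ATTR_BINARY        /* opaque blobs: icon previews, internal structs */
} AttrKind;

/* Hypothetical description of one candidate attribute. */
typedef struct {
    const char *name;        /* attribute name, e.g. "kMDItemTitle" */
    AttrKind    kind;        /* what sort of value it carries */
    bool        user_facing; /* would a user remember and search for it? */
} CandidateAttr;

/* Returns true when the candidate follows the talk's guidance:
   it is user-oriented and it is not binary data. */
bool worth_publishing(const CandidateAttr *attr)
{
    assert(attr != NULL && attr->name != NULL);
    if (attr->kind == ATTR_BINARY) {
        return false;
    }
    return attr->user_facing;
}

/* Stores pointers to the publishable candidates in out[] (which must
   have room for n entries) and returns how many were kept. */
size_t select_attributes(const CandidateAttr *in, size_t n,
                         const CandidateAttr **out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (worth_publishing(&in[i])) {
            out[kept++] = &in[i];
        }
    }
    return kept;
}
```

Given a title, a text-content attribute, an icon preview blob, and an internal cache string, a filter like this would keep only the first two, which is the shape of the decision the talk asks importer authors to make.
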
We've predefined a whole bunch of attributes, so you can see the list here: kMDItemTitle, authors, keywords, projects, and so on. It's quite an extensive list. It hasn't covered everything, of course, but it's a fairly broad set of things, and if you look in the include file Metadata/MDItem.h you will see the full list of these attributes.

So, writing an importer. This is where we're going to actually step through the process of writing one. What do you have to do? Do it in Xcode; we've got a metadata importer template. There's one function to implement, so it's not that difficult. You can use your existing document reading code, with the caveat that you don't want some piece of code that goes and inflates the entire data structure. If you have some multi-megabyte image or whatever, and it gets pulled into memory and exploded and decompressed, that's not what you would want to do. You would want to scrape the file, get the interesting bits as metadata, and then publish that. So, a lightweight version of your document reading code. The idea is, if you have a custom file format, you know how to read and write it, so you probably have code that you can use. And you return a CFDictionary of the attributes that you would like to publish for that document. That, at a high level, is what it takes to write an importer.

There are three steps using the MDImporter template: you have to create and define a GUID, edit the Info.plist, and then implement the code. One, two, three. Defining the GUID: there's a command-line tool called uuidgen. Type that in the terminal, run it, you get a string, and put that into the code. Edit the Info.plist to associate that GUID with the code, and then identify the UTI types that your plug-in handles. So if you have a custom file format with a custom file type, you would put the file type for that document into the LSItemContentTypes key in the Info.plist, and that's how the
system knows to associate your importer with that data type; we're going to go through the code. And then of course you have to implement the code. There's a function, GetMetadataForFile, and that's that.

So let's write an importer, and you'll see this doesn't actually take too much. Run Xcode. Can we switch over to the code machine? Great. Okay, we'll create a new project. This is a standard Apple plug-in; we have metadata importer predefined, and we'll call this source importer, although that's just because that's what I've been typing. It's not actually a source importer; well, it's like one, let's say. If we pull up main.c and take a quick look here, we have a template, and there are the three steps that I talked about.

The first thing it says is create a unique UUID for your importer. So I fire up Terminal, I type uuidgen, and I get this very beautiful 128-bit ID, and I paste that in there. Okay. Following the instructions, go to step 2: edit the Info.plist. All right, I can do that. I come back over here, and there is the metadata importer plug-in ID, and I will paste it in there, and once again in this other part down here. The last thing that it said to do was to change the UTI types. In this case I'm going to say that we support public.c-header files. Save that.

Now the third step is to write the code. So of course I'm going to sit down and write a big chunk of code right now, right? It parses C header files. No, we're just going to cut and paste a little bit. There's a couple of header files that I need to throw in here; I'll put those up here at the top. I will put in a prototype for my function, which I wrote ahead of time, and I'm just going to cut and paste a nice big chunk of code down here below. It's very rudimentary, but there's enough there to make this demo work, to parse a header file. Now the last thing it said: implement GetMetadataForFile. So here we
have that piece of code, which is empty at the moment. You're passed in a couple of different arguments. The main one, of course, is the attributes dictionary ref that you get; this is what we're going to fill out with the information that we would like to publish for this file. We're also told the content type UTI, so if you have an importer that handles multiple types, you'll know what we think the type of the file is. And the last thing, most important, is the path to the file that we would like you to parse.

Now, I've already gone and filled in the bit of code that does all this. I'm just going to cut and paste this into here, replacing this empty bit, and then we'll go through it really briefly. We get the full path, and then we have this function, get typedef names, which given a path returns to us the number of typedefs, in a C array, and then does the magic to convert that into a CFArray, which we then add as a dictionary value. This is the attribute name, com_apple_source_typedef; there's one thing I have to do here because it's a custom attribute name. And we pass in the CF typedefs, which is a CFArray. So I can save this.

Now the last thing that I have to do: because we've got a custom attribute name, we have to define it. So I copy this and paste that in there, and then this is where we actually say what it is. I'll talk about this again in a second, so I'm just going to gloss over it for the moment. That's all taken care of. We've saved this, and we're going to build it, and if I didn't screw anything up... okay, good, it's built.

Now what do we do with it? We go into the source importer directory and into the build directory; there's the source importer .mdimporter. I copy, with cp -R, the built source importer .mdimporter into ~/Library/MDImporters, and I'm
just putting it in my home directory here. I'll talk about where else you can install it; I'm just putting it here for the moment. We drop it in there, and basically that's all it took to install it.

Now, in /Developer/Tools there's a program called mdimport. We do -L, and if we did everything properly it should show up in the list. Which it did not. OK, so I need to run mdcheckschema on schema.xml: successfully parsed. Oh, that's right, thank you. Yeah, so clearly I've used Unix for only about six months, and of course being up on stage helps a lot. Whoever said that, I really appreciate it; I would have spent another 10 minutes realizing that. Okay, so now we've installed it properly, and if we run /Developer/Tools/mdimport -L, ah, there we go. Beautiful.

Okay, now on the desktop I had a sample header file in this test directory. I run /Developer/Tools/mdimport again, with the -d3 option so it will print out loads of information, on this file, myheader.h. Let me just go ahead and run it. What we can see happens here is it says importing data from file, and it tells me exactly what file and what type it thinks it is, public.c-header, which is useful to see because it matches what we declared ourselves as. And then we can see com_apple_source_typedef; that's the name that we defined for our attribute. There are three typedefs: my integer, my big integer, and crew struct. If we were to look at myheader.h, you can see that there are three typedefs in here that were extracted properly and published for the header file.

So we've just defined a new importer, installed it in the system, and successfully had it publish metadata. If we go into the Finder and I ask what file defines my big integer, we see that myheader.h shows up. So it's fully plugged into the system, without a lot of extra code. Let me cancel that. Okay, we can
go back to the slides. So we wrote an importer. There's a couple of things we still need to talk about, though.

MDImporters run in several different contexts. In the case that I showed, we ran mdimport by hand on a single file; it ran, extracted the metadata, and that's all well and good. However, mdimport can run in a couple of different scenarios. For example, if someone plugs in a FireWire hard drive with a hundred thousand files on it, we've got a lot of work to do, and so the importer can be part of a longer running process. This of course has implications. When you run it once and it works, you're happy, that's great, and you don't necessarily notice anything, because even if it goes and allocates a lot of memory you may not feel the impact. However, when it's running as part of a longer process, if it has leaks or trashes memory, you're going to start to notice these things. So you need to be a good citizen; you need to pay attention to this. We're also taking defensive measures, so if you're not a good citizen, we'll make you be one. Still, you want to avoid using a lot of memory as much as you can, and you have to pay attention to things like leaks; we have a lot of tools to do this.

You also want to use some caution when reading large files. In some cases, like I said, when you plug in a drive with a whole bunch of files on it, you don't want to just read the file like you normally would, because you can pollute the buffer cache of the computer, which can cause a lot of unnecessary paging activity, since in that scenario the data is not likely to be used again. So if you're working with standard POSIX file descriptors, you can call fcntl with F_NOCACHE. If the data is in the cache because it was a recently saved document, you'll get it from there; if it's not in the cache, you won't waste time polluting the cache with data that you're never going to read again. So that's always a win. If you're using Carbon,
you can use the noCacheMask; if you're using Cocoa, you can get at the raw file descriptor and call fcntl again.

Tips for importers. You want to use standard attribute names; avoid inventing new ones if you can. We have a lot of common ones, so saying, well, my document file format has an author field, so I'm going to call it com_mycompany_authors, is not the right thing. Use kMDItemAuthors. Take a look at the list that I mentioned earlier, see what's there, and make use of those names when you can.

Don't forget text content when it's applicable. This can take a lot of different forms. For example, if you had a Keynote presentation importer, what does text content mean for a presentation? Well, there are all the bullet items; for example, all of this text right here is something that could be published as part of kMDItemTextContent. You want to avoid letting that get too big. There's not much point in storing more than about 100K of text, since that's what Google does and it seems to work for them. You can accumulate text from various pieces of your document; it may not be straight big chunks of text, it may be strings that come from a variety of places, and you publish it that way.

You don't want to publish too much. It's not helpful to publish 500 attributes, as I mentioned earlier. Publish things that the user thinks about and interacts with, and that will make it easier for them to find things.

If you need to remove an attribute, for example you decided that an attribute no longer applies to this document, add it to the dictionary with a CFNull for the value, and that will cause the system to delete it.

As you saw, I installed the importer into ~/Library/MDImporters, which works pretty well for initial testing and debugging, but most likely you would want to install your importer into /Library/MDImporters. For debugging, like I used, there's mdimport -L, which is the list of what importers are
installed. That's the quick test to see that it got there, which is where I had that heart attack earlier. Then, when you're testing it to see what's happening, you can use mdimport with the -d option to get different levels of debugging; -d4 is probably way too much. You can give it a path to a hierarchy of files, or you can give it a specific file as well. It's in /Developer/Tools, and it's the way you would test things out.

If you need to define new attributes, and this is what I glossed over in the code walkthrough, there's a schema.xml file that's part of the project, and you can define new attributes in a couple of different ways, depending on what your needs are. In the first one, we have a string attribute, which we define as type CFString, and we give it a name. You have number, CFNumber. The last one is kind of an interesting one, and this is what I used in the source importer: the multivalued string. What this means is you can think of the attribute as an array of individual values, so foo, bar, blah are all separate entities that are in an array for that attribute.

Then you can provide localization for your attribute with the schema.strings file. Again, it's just a standard convention, a UTF-16 file, and you map the name you gave the attribute in the schema file, which is not something you would display to the user, to what you would want to display it as; here it's in my favorite language, the only other language I know, Italian. You can check this with mdcheckschema.

You'll notice that we're using a kind of funky naming convention here: reverse-DNS-style naming, but with underbars instead of periods. That's because we wanted to keep these attribute names compatible with the Cocoa key-value coding scheme, which doesn't allow for periods in the name. So that's why we did it that way.

Apple has written a whole bunch of, a whole
bunch of importers for the standard file formats that we support natively, and you can expect us to continue to do that. Things like JPEG, PNG, TIFF, and so on, we have that covered. QuickTime, of course, you would expect that. PDF, and then things that the Application Kit can open for text documents, which includes RTF and RTFD and Word documents; we support those as well, so you don't have to do them.

So in summary, importers are pretty simple to write. There's not a lot to it. There's a bit of glue code that you have to get together, to CFPlugIn, and we provide that in the template, so it's not any great magic. It makes your documents easier to find. This is the connection: the whole Spotlight system lives and dies by the quality of the metadata that's there and how easy it is for users to search for things. So it makes your documents easier to find, and it's in everybody's best interest to do it. It handles full-text indexing with the kMDItemTextContent attribute. And it's the sort of thing where you could go home and write one tonight.

With that, let's talk about queries and searching. Who needs queries? Well, not a lot of people, actually. There aren't that many Finder-like applications that need to be written. But apps that have a custom UI where the focus is working with groups of files, take some of these things and do some of this stuff over here, a UI where you're not going through a traditional open/save panel, those are the kinds of things that would benefit from working with queries. So asset management, workflow, or file management applications. Even something like Soundtrack, which you may not think of as working with files. I don't know if you're familiar with the Soundtrack application, but it lets you select different sets of instruments, and it issues queries to do this. This is something that could actually take advantage of the Spotlight system to do
queries on the attributes of the instruments that it's searching for.

Queries find items based on their attributes. Attributes that you can search on are the metadata that's published by the importers; of course file system attributes, which is what we've always been able to search on, the file size, last modification time, all those boring things that you don't always think about but that are useful sometimes; and of course full-text content.

What does the query language look like? It's a simple C-like expression with standard operators like equals, not equals, greater than, what you would expect, and parentheses for grouping. So what does it look like in an actual expression? I have two of them there. kMDItemKeywords equals star "to" star, and that's how you would do a substring match. The bottom example is slightly more complex. When I did the example searching for Frederic in the Finder before, you noticed that it matched even though the name had an accented e and I hadn't typed an accented e. What you see at the end here, oh, I went too far, okay, that little cd at the end stands for case insensitive and diacritic insensitive, and because we have the asterisks around both ends of it, it's a case- and diacritic-insensitive substring match.

Now, how do you write a query? There are three parts to it, really. You first create an MDQueryRef. You have the standard kCFAllocatorDefault, and the string you pass in is the expression that we had on the previous screen; in this case we're saying kMDItemTitle equals "*Tiger*", and we have the cd. Then we have some additional options for grouping and sorting, which I showed in Bull Search but which we're not going to cover here today; we'll pass NULL for those. Then you start the query running with MDQueryExecute, and in this case we've specified that we want updates, which makes it a live query. If you just want to issue a one-shot query, pass zero, I
believe for that argument and then you don't get any updates that's the result end of story then you read the results when you get notifications that there's results available you get the result at query index i and there you have it queries are designed to work with CFRunLoops so there's three phases really there's progress okay you're getting results things are coming in from the initial set we're going through then you get a finish notification that says okay that's the initial set if you selected for live queries then you'll start to get updates as things come and go from the query set so when you saw like the liveness thing I did a query and nothing matched then something popped in that's an update notification coming in saying hey there's a new result you have like I mentioned one-shot or live queries and the sorting and grouping features which again unfortunately we're not going to cover today because we will not have time so if I can come back over here we're going to go through a little sample program that we have that does queries I'm not going to write the code but we will go through it briefly to kind of see what it looks like i have a nickel i should just pull it up okay I have a little application that looks like this guy right here and there's a search field which is hooked up in the code to search now so when I type in a string that gets plugged into I didn't actually change it that's plugged into this function here search now in the code go down a little bit first thing we do is set the title of the window not very interesting we create an NSString and this is a Cocoa application and we do a star equals and then we put the string that they typed that's the %@ and it was typed into the search field and we put it in quotes and put it as a substring match with the stars on either end and we say cd for case and diacritic insensitive then we pass that on to start query which is another method down below and that's right here and here we take then 
we add notifications if this is the very first query that we've run we add some notification observers for progress finish and update and then we call MDQueryExecute just like I mentioned on the slides earlier now when we run this program go ahead and build it and run it here's what it looks like and if I type HTML I get the same results we'd get in any of the other applications we have 779 results alright pretty straightforward where did that all come from well when we got updates we asked the table view to reload its data and I'm going to talk about how we actually display the data later on in the second last bit of this talk and when we get the done notification we don't have to we just note that it's done and that's basically all there is to issuing a query when you get updates you tell yourself to process them and in this case like I said we just ask the table view to reload the data which is where we actually go and display it and that's that okay so so actually the Holies I don't think that back cuz i'm going to you next if we can go back to the slides actually so really few apps need to perform queries you know if you need to do it it's not that hard but it's not the sort of thing where you have to think what do I have to do to adopt spotlight I need to do this not everybody needs to it's great if you do it's not very hard but only if it's appropriate for your application queries are C-like expressions about the attributes that you want to search so you saw we had some very simple expressions with standard equals you can build much more complex ones with parentheses for grouping so you can do ORs and ANDs and so on to build you know date modified is between this date and it's less than this other date or it's in this other date range so you can build some fairly sophisticated things if you'd like it's well integrated with CFRunLoops it would be kind of a pain if this was bolted on the side and you had 
to jump through hoops and do contortions to make it work with your application but you know we've worked very closely with the finder team to meet their needs and other applications like the bowl search demo or that ask Mac demo so we kind of understand how it should be integrated properly there's options for doing live queries so if you want to continue to receive updates and notifications on the fly we support that and as I mentioned or alluded to and sort of demoed with full search there are sorting and grouping features which can provide you with some pretty advanced functionality if you require it now displaying metadata as you saw when I ran ask Mac it displayed some information about the files and it was displaying the name that it got back through the spotlight system it's pretty straightforward to display metadata you have to have an item reference you can get an item reference in one of two ways you can either first get it as a result from a query or you can create it for an explicit path so if you know the path through some other mechanism it was something returned to you via a file open/save panel or what have you you can just explicitly create the item once you have the item reference then you can get a list of attribute names about the item that's a dictionary an array I'm sorry of the names of attributes that exist for that item so if you don't know anything about it and you want to display arbitrarily what's there you can get that list and then go through and get the actual values or if you know exactly what you want you can use the MDItemCopyAttribute family of calls and I say a family because there are different variants depending on whether you want to get one a few or all the attributes for an item and then you can use that information and we all work with standard CF types so if you have a multivalued string you'll get back a CFArray with all of the values for that attribute name so in the case of going back to the 
source importer if I call MDItemCopyAttribute for this specific attribute com apple source typedef I would get back a CFArray that contains the values for that attribute name of course if the attribute doesn't exist you'll get a NULL so you need to be aware of that you want to use the one that's most appropriate of course bulk calls are better in the sense that if you're going to get five attributes and you know you're always going to get five attributes build the CFArray with those five attribute names and get all five of them at once sort of standard best practices that way you avoid round trips back and forth to the server so let's show you how we display metadata back in the ask Mac application so going back down to that mysterious function that I alluded to about reloading the data so here we have the table view object value for table column and in this case you see we come in and we take the identifier field of the column that was passed in and we look at what it is so in this case I'm just calling MDItemCopyAttribute as you can all see for the kMDItemPath for that file and that's the information that I will return that should be displayed for that column I also have if there's a display name we'll take that and I also have a second column kMDItemTitle now if I go ahead and run this again if I search for jpg you're going to see that some of these things that match are as I mentioned before movies that are compressed with JPEG compression photo JPEG compression so they actually have titles whereas a normal jpg file wouldn't necessarily have a title so in the cases where there is no title we don't display anything it's just empty in the case where there is a title we've successfully retrieved it with MDItemCopyAttribute which is this line right here to get the title for the object so when it exists we get it when it doesn't exist we don't display anything and that's all 
there is to it really for displaying data now you of course can do more sophisticated things caching data if you need to and so on or as I alluded to calling a more sophisticated function to get a set of attributes all at once okay if we can go back to the slides okay so in summary items consist of a list of attributes items are the representation of a file in the spotlight system and it's a list of attributes and attributes are a name a type and a value you can get a list of all the attributes for an item so if you know nothing at all about it and you want to find out everything that's there you can get the full list of names and then you can go through and retrieve the actual values for each one there's calls in the MDItemCopyAttribute family for one some or all of the attributes that are associated with a file and you want to use the bulk calls when possible one attribute that I should mention that is special is kMDItemTextContent you can't retrieve that in the sense of give me the text content for this document it doesn't work that way just so that you're aware of it now we've talked a bit about the lower level C API and there's also a Cocoa API that I'd like to mention although we're not going to go into any code samples for it as you might expect the lower level Core Services API is very straightforward and procedural there's the Cocoa API which is based on NSMetadataQuery and as expected is higher level object oriented manages queries and results you use the NSPredicate class which I should have in blue but anyway to populate or to initialize an NSMetadataQuery and the predicate is an expression about the attributes that you want to find and that's how you would build the expression as opposed to just using a straightforward string NSMetadataQuery also offers a grouping feature and it's key value coding and observing 
compatible so you can hook things up to NSArray and NSTree controllers sort of for automatic kind of connections between queries their results and their display another factor that I'd like to talk about is full-text indexing I've mentioned it a few times throughout the presentation that spotlight uses full-text indexing Search Kit has undergone some dramatic improvements for Tiger content indexing is considerably faster incremental search which is something that wasn't really doable before is up to 20x faster so you don't have to wait for all the results to be relevance ranked before you get results we can start getting results on the fly which kind of gives you that find as you type functionality and when you do want relevance ranking they've improved the relevance ranking quite a bit now why am I mentioning this in some cases it's appropriate to use Search Kit directly for your own private index such as the help content or the Xcode documentation these are things that are sort of more appropriate to a private or a specific index and the Search Kit APIs which were made public in Panther and have been enhanced in Tiger are there for you to use and fully documented what's the current state of things so obviously this is not a final release so we're not done yet there's going to be issues and things that you'll run into there's some limits on attribute size that we sort of kind of self-imposed we're sort of proceeding very cautiously with this whole project because this is sort of new territory in a lot of ways I mean I know some of these things have existed before but we don't want to put ourselves into a situation where we wind up with something that's not sustainable in the future so we're kind of defining a fairly tight envelope and then where we bump into it we look at it well why did we bump into that limit there is this the right place to expand things to push the boundary and when appropriate yeah we'll push it 
so like I said there are some limits on the attribute size and number of attributes when you run into these talk to us let us know what it is that you're trying to do why is it not working sometimes it's the right thing to increase the limits and sometimes it's like no maybe that is an indication that things should be done in a different way we need your feedback like I said this is kind of new territory for it to be in such a broadly available general purpose operating system you know this kind of metadata functionality and so on so we want to hear what people are looking for what they need what they're missing what doesn't work for them with what we have today so that we can build the system better expand the system to meet those needs so summarizing what we've talked about today importers are the main connection from your application file format to the spotlight system so importers publish metadata from files spotlight takes that metadata makes the documents easier to find and allows them to be displayed more richly spotlight also allows applications to interact in more sophisticated ways so as I mentioned before you don't have applications that have to know about M other applications and all their file formats they can just sort of ask for the attributes about the file they don't have to bother going to parse it because the data has been published what do you need to do what is the end result of this if you have a custom file format write an importer that's the biggest thing that's the biggest favor you can do for your users for us and for yourself put useful metadata in your documents so a lot of file formats already have support for various types of metadata like I showed with Word there's that property sheet make sure to populate that where you can with things that are interesting things that would help the user find that document later on and when you're doing things if you're doing special things manipulating a 
document and copying it or doing the save as preserve the metadata where possible or when appropriate so if there's an EXIF chunk in a JPEG file and it makes sense and you haven't completely modified the document so it no longer makes sense preserve that copy it as part of the file format so now where you can find out more about this because I've gone through this pretty quickly and you know it's not like you're going to necessarily have everything in your head right now there's a whole bunch of example code and documentation online as well as some updates to what's online and that disk image is on connect.apple.com ok so there's an additional disk image of documentation on connect.apple.com so here we have the spotlight importers where you would find that documentation the MDImporter reference and you know the template that's there pretty much describes it all so there's not too much that you have to worry about MDItem is the right place to find out you know how you would manipulate that what the functions of that class are not class what the family of calls are how you would make use of them query reference schemas and so on and adding search to your application so there's quite a bit of documentation already available and out there
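
Editor's sketch, not part of the session: the talk describes query expressions of the form attribute, equals, a quoted term with asterisks on both ends for a substring match, and a trailing cd for case and diacritic insensitivity. The C helper below only assembles such a string. The function name build_substring_query is hypothetical and is not an Apple API, and the backslash escaping of quotes is an assumption about how a literal quote in the term should be handled. Asterisks inside the term are left alone, so they would still act as wildcards. On Mac OS X the resulting string would be what you hand to the query creation call mentioned in the talk.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an Apple API): writes an expression such as
 *   kMDItemTitle == "*Tiger*"cd
 * into out. Returns 0 on success, -1 if out is too small. */
int build_substring_query(const char *attribute, const char *term,
                          char *out, size_t out_size)
{
    /* attribute name, operator, opening quote and leading wildcard */
    int n = snprintf(out, out_size, "%s == \"*", attribute);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    size_t pos = (size_t)n;

    for (const char *p = term; *p != '\0'; ++p) {
        /* Assumption: quotes and backslashes in the term are escaped
         * with a backslash so they cannot end the string early. */
        if (*p == '"' || *p == '\\') {
            if (pos + 1 >= out_size)
                return -1;
            out[pos++] = '\\';
        }
        if (pos + 1 >= out_size)
            return -1;
        out[pos++] = *p;
    }

    /* trailing wildcard, closing quote, and the c and d modifiers */
    n = snprintf(out + pos, out_size - pos, "*\"cd");
    if (n < 0 || (size_t)n >= out_size - pos)
        return -1;
    return 0;
}
```

Passing "*" as the attribute gives the match-any-attribute form that the sample application in the demo builds from the search field text.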