WWDC2004 Session 221
Transcript
Kind: captions
Language: en
hello everyone and welcome
to this year's image capture show
actually by just looking at this slide I
said well I guess we have a
problem in showing a really pretty cool demo
later on because this slide says we
have to refrain from taking
pictures and I guess we haven't set up
everything here but I guess we'll just follow
it through so let's talk about image
capture we will start with a short
overview of what image capture does and the
easiest way really to describe image
capture is that it is the framework that
allows you to work with still image
capturing devices like a camera or scanner
and the nice thing really is we allow a
wide range of devices and we support
multiple communication protocols it's
funny like a couple of years ago when we
did the first image capture session we
were like forcing or at least asking
you as device manufacturers well wouldn't
it be great if we had a few maybe just a
handful of protocols that you can use
and then really by just using a handful
of protocols which means a handful of
drivers or device modules be able to
talk to as many devices as possible the
industry trend has really shown that
most of the new devices are really mass
storage or PTP picture transfer
protocol based which is really really
great and so later on in the session we
will show you a couple of new APIs that
we are going to add for Tiger to make
the usage of new devices new device
classes even easier let's start with a
short overview of the image capture
framework and its pieces at the bottom
level we have the hardware the device
that you connect and this has actually a
device module that is responsible for
handling the communication with that
device
this device module is a user-level
background application so there's
no need to add any kernel drivers or so
and it talks to the image capture
extension the image capture background app the
image capture extension can handle
multiple devices at the same time and
can actually handle multiple clients at
the same time so in theory you could use
iPhoto you could use the image capture
application FileMaker or whatever your
application is and
then at the same time access the single
device this is a bit different from if
you're familiar with the Twain model
where we have a layering stack
that's very similar to
what we have but they all run in the
same process we have different processes
and really the image capture background
application as a bottleneck will serialize
the device requests and really
make it easy and straightforward for
a client application to talk to like one
or multiple devices at the same time now
if we look at this structure here
actually everything inside image capture
is made up of very small building blocks
first building block is the notion of an
ICA object an ICA object just has a type and
subtype and may contain a reference to
another ICA object and it may contain
properties or actually a collection of
properties metadata in a dictionary
which leads us to the second building
block that's the properties ICA
properties here again we have a type and
subtype and these properties contain
actually the real data with real data we
mean the kind of image data and
if available thumbnail image data or
if the device supports
like mp3 files or movies then of
course we have the music or the movie
data and the last thing we added for I
guess Jaguar was the addition of
dictionaries when we came up with the
initial design of image capture we were
thinking well we could put all the
metadata all the properties into
separate objects ICA properties it
turned out it's kind of cumbersome to
get all the metadata collected so it was
much easier and you will see that in a
little bit much easier to use
CFDictionaryRefs or NSDictionaries
that you can pass around and they will
contain all the metadata for example for
a given image so all the EXIF data
will be in one of these dictionaries if
you look at these three building blocks
they form kind of a unit and so once
again the object has basically type and
subtype the properties have the entire
image and thumbnail and the dictionary
contains in this case EXIF metadata
that allows you to work with images in a
very convenient way out of these you can
build up your tree and the top-level
object that you see here the device list
is actually always available even if you
do not have any device connected what
was new with Panther was the addition of
a dictionary to that device list object
that dictionary of course does not
contain any metadata because there's no
image for the device list object but
what it contains is a list a reference
to all devices that are connected so two
calls give me the device list and then
give me the dictionary for that device
list if you do these two calls
you will have information of what kind
of devices are connected and again by
just using these building blocks that's
what we do inside the image capture
framework so whenever for example
iPhoto wants to talk to a device first
thing as all other clients have to do
first thing is really give me the
device list and then query the device
list object for example get the
dictionary get out of that dictionary
the number of devices for each device
you can get the dictionary and that
dictionary actually will contain either
flattened out or as a tree both variants
are in there all the images all
the files that are on that device
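The three building blocks and the device list tree described here can be sketched as a small data model. This is a minimal illustration only, not the Image Capture API: the class names, the string type and subtype values, and the "devices" dictionary key are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ICAProperty:
    """Hypothetical stand-in for a property: typed raw data such as image or thumbnail bytes."""
    type: str
    subtype: str
    data: bytes = b""


@dataclass
class ICAObject:
    """Hypothetical stand-in for an object: type, subtype, properties, a metadata dictionary, children."""
    type: str
    subtype: str
    properties: List[ICAProperty] = field(default_factory=list)
    dictionary: Dict[str, object] = field(default_factory=dict)
    children: List["ICAObject"] = field(default_factory=list)


def flatten(obj: ICAObject) -> List[ICAObject]:
    """Return all descendants depth-first, i.e. the flattened view of the tree."""
    out: List[ICAObject] = []
    for child in obj.children:
        out.append(child)
        out.extend(flatten(child))
    return out


def build_demo_tree() -> ICAObject:
    """Build device list -> camera -> folder -> image, with made-up values."""
    image = ICAObject(
        "file", "image",
        properties=[ICAProperty("image", "data"), ICAProperty("image", "thumbnail")],
        dictionary={"exif": {"ExposureTime": 0.01}},
    )
    folder = ICAObject("directory", "folder", children=[image])
    camera = ICAObject("device", "camera", children=[folder])
    device_list = ICAObject("devicelist", "devicelist", children=[camera])
    # The device list carries no image metadata, only references to connected devices.
    device_list.dictionary["devices"] = [camera]
    return device_list


if __name__ == "__main__":
    devices = build_demo_tree().dictionary["devices"]
    print(len(devices), [o.subtype for o in flatten(devices[0])])
```

A client following the two calls from the talk would first obtain the device list, then read its dictionary, and only then walk each device either as a tree or through the flattened view.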
you can use that on a single machine but
what if you have two machines well of
course you can have the same thing
running on two machines but with Tiger we
were introducing file or device
sharing for image capture devices so
by just setting up the device
share on one side and listening on the
other side you can take a standard
camera or scanner and just hook it up
to one device but use it on the second
device the detection is fully automatic
using Rendezvous and you just pick and
choose but an even cooler thing on
Panther was the image capture web server
I guess this was like the best kept secret
for Panther because a lot of people
were later on in reviews
saying well I didn't know about that
feature it's a really cool feature the
image capture web server is a faceless
background application a regular client
of the image capture
framework and what it does
is it takes the attached camera and
publishes that over the net
so what you could do you can connect at
home you can connect the camera and from
work you could actually look at the
images on that camera or if you have a
camera that supports PTP and allows you
to take pictures you can switch to a
mode where it actually allows you to
take pictures over the internet and it
has a monitoring mode so that every so
and so many seconds it will automatically
take a picture and display it and that's
a really convenient way to get to your
data now that was really the kind of
quick overview of image capture and
the image capture components one thing
that we are really handling in the
image capture framework is devices that
are connected via USB and FireWire we
handle it pretty well because like out
of the box just a fresh install of Mac OS X
then you connect the camera and whoops
the camera shows up iPhoto launches
or the image capture application launches
and allows you to download the images so
this hot plugging is done through an
ICA notification framework completely
invisible to the users but what is
happening really is the device gets
connected at IOKit level we learn oh
there's a new device we do a matching
based on information that we have inside
the device modules and we then launch
actually the appropriate device module
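The hot plug matching step can be sketched as a lookup over per-module matching rules. This is a hypothetical illustration: the module names, the rule keys, and the vendor and product numbers are made up, and the only real-world value used is the USB still image class code (class 6, subclass 1).

```python
# Each module lists matching rules; a rule matches when every key equals the device's value.
MODULES = [
    # A vendor specific module comes first so it wins over the generic one (IDs are made up).
    {"name": "VendorCamera", "matching": [{"vendor_id": 0x1234, "product_id": 0x0001}]},
    # A generic module matching the USB still image class.
    {"name": "PTPCamera", "matching": [{"interface_class": 6, "interface_subclass": 1}]},
]


def match_device_module(device, modules=MODULES):
    """Return the name of the first module with a rule that accepts the device, or None."""
    for module in modules:
        for rule in module["matching"]:
            if all(device.get(key) == value for key, value in rule.items()):
                return module["name"]
    return None


if __name__ == "__main__":
    plugged_in = {"vendor_id": 0x1111, "product_id": 0x0002,
                  "interface_class": 6, "interface_subclass": 1}
    # The framework would launch the module named here.
    print(match_device_module(plugged_in))
```

Ordering the list from most specific to most generic is one simple way to let a vendor module take precedence; the talk does not say how the real framework resolves such ties.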
well that's all great if you have USB
and FireWire devices but what about
devices that are not hot pluggable well
up to now that was the problem and for
Tiger we want to have a solution for
that so we really see that there are
lots of new devices for example cell
phones coming out that have Bluetooth
and a built-in camera
what about these devices or PDAs some of
the PDAs have a built-in camera others
have just an attachment so you can plug in
the camera and it is a fairly decent
resolution camera nowadays and one thing
that has probably a very bright future
is the area of wireless cameras so
by end of the year or early next year
you will see a lot of new devices
showing up that have wireless support
directly built-in so these devices well
they may use Rendezvous or may use like
worst case just an edit field where you type
in an IP address but there will be a way and
there needs to be a way to use these
devices as well and of course the same
thing for SCSI based devices good old
SCSI not hot pluggable I mean you have
to really make the connection upfront
before you turn on the machine and those
devices it would be nice if we had a
simple way for the user to use those as
well so in order to support these non
hot pluggable devices we have to
introduce new API and it's enough to
just introduce like an ICA load device
module and unload device module for the
load and unload you can imagine then we
probably don't want everyone to write
code to do that so the idea is to add a
device browser to the image capture
application so you would go to the
devices menu in image capture and then you
could bring up a browser to select the
non hot pluggable device that you want
to use and ideally and that is really
dependent on requests so if you would
want to have something like a common
panel common UI for that we were
thinking about implementing that as well
that would just allow you again to do
the device browsing and handle the
connection so if we look at the ICA load
device module you see we have the same
structure of course it's just a parameter
block and an optional callback proc if
that callback proc is null it is a
synchronous call otherwise an asynchronous
call and then the ICA load device
module parameter block has besides
the header that's in all parameter
blocks fields that allow
you to specify a device module via URL
and we have an OSType that specifies the
transport type so that could be
Bluetooth it could be USB firewire
whatever and then a set of parameters
that are specific to the transport type
ideally you wouldn't have to worry about
the device module URL so you could just
pass in null you specify the transport
type and then you specify a Bluetooth
address that you get via any IOKit call
the unload device module is a lot
simpler because all you have to
do is really just pass in the ICA object
that you want to unload so let me show
you very quickly how we do that in image
capture can we please switch to demo 5
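The shape of the load and unload calls described above can be modeled in a few lines. This is a sketch under stated assumptions, not the real API: the field names, the "blue" transport code, the "address" key, and the FakeExtension class are all hypothetical. It only illustrates the parameter block with an optional module URL, a transport type, transport specific parameters, and the rule that a missing callback means a synchronous call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class LoadDeviceModulePB:
    """Hypothetical parameter block for loading a device module."""
    transport_type: str                      # e.g. "blue" for Bluetooth (made-up code)
    module_url: Optional[str] = None         # None: let the framework pick the module
    transport_params: Dict[str, str] = field(default_factory=dict)


class FakeExtension:
    """Toy stand-in for the background extension that tracks loaded device objects."""

    def __init__(self) -> None:
        self.loaded: Dict[int, LoadDeviceModulePB] = {}
        self._next_object = 1

    def load_device_module(self, pb: LoadDeviceModulePB,
                           completion: Optional[Callable[[int], None]] = None) -> Optional[int]:
        device_object = self._next_object
        self._next_object += 1
        self.loaded[device_object] = pb
        if completion is None:
            return device_object             # no callback: synchronous, result returned directly
        # With a callback the result is delivered through it. A real asynchronous
        # call would return before the work finishes; this toy calls back at once.
        completion(device_object)
        return None

    def unload_device_module(self, device_object: int) -> None:
        # Unloading only needs the object that the load produced.
        del self.loaded[device_object]


if __name__ == "__main__":
    ext = FakeExtension()
    pb = LoadDeviceModulePB("blue", transport_params={"address": "00-11-22-33-44-55"})
    phone = ext.load_device_module(pb)
    ext.unload_device_module(phone)
    print(ext.loaded)
```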
so I will launch again a small helper
app that shows you really what's going
on on the image capture side it has
entries for all image capture components
and will put a check mark next to
the name whenever one of these
components is running so let me launch
image capture you see image capture
was launched and got the check mark but also
the very first entry the image capture
extension image capture extension is
running now but we do not have any
device module well there's no device
connected so let me go to the devices
menu and we see we have a connect item
in here and what will happen and this is
really not the final UI but that's
just a temporary UI to show you what
direction we are going is it will list
in this case the Bluetooth devices that
I have right next to this machine and
that's a PDA and the phone down here it
shows you similar information that you
will get from the regular Bluetooth
control panel so let me select the phone
and let's do a connect see what's
happening the phone was recognized and
yeah we do see the items on the phone
and we can download those and by the way
a nice thing we added for Tiger
whenever you want to specify a download
folder you can actually drag it up here
and that should be a shortcut to the
pop-up so very easy to get to the
download folder and here we could
specify preview to open it and then just
try to download and you see here is the
image that we have from the phone
okay so that was the like manual way of
loading the device module let me quit
image capture and show you another
small application that will actually
hopefully make it to the SDK that's an
application that allows you to just test
all the image capture APIs and you see
on the left you see all the image
capture APIs we have and you see down
here the new one the ICA unload device
module and up here you see what it takes
as parameters there's only one input
parameter that's the ICA object and if
we open the tree so what do we see on
the left side here at the right side
here we see all the objects that we have
on the device we have up front the
device list we have the device itself
and then we have three images we could
now select the device object copy that
and just paste that in here and then
execute the command either synchronous
or asynchronous way and then you see the
device is gone and the device module as
well image capture extension will quit
in just a bit can we switch back to the
slide please
so that was just a quick demo loading
unloading device modules and really just
a very first preliminary UI for doing
that so if we look at this cell phone if
you want to really use it as a digital
wallet it's a really nice way to say hey
that's my cat that's my kid that's my
dog and the cell phone as I said will
probably have a way to take pictures or
download pictures but there is something
missing so what about uploading well
again wouldn't it be nice if you could
upload to a device and if you look at
the APIs that we have for image
capture we used to have ICA set property
data and with set property data you
would think that this is the right way
to do the upload unfortunately that's
just not enough because we don't just
want to set the actual image data we
probably need a little bit more
information and so we will have a new
API to handle the uploading device
uploading and it's ICA upload file and
it's really more than just a file
copy well it's a file copy call for non
image files but for image files it also
allows you to scale the image
and the parameter block for this looks
like that we specify a parent object and
the file FSRef and a couple of
flags there's one thing about the parent
object that's just the kind of suggested
parent object so really depending on the
device you may not be able to really
upload to the specified folder so it's
always good I mean you could try it it's
always good to just upload to the device
itself and then normally the device
takes care of the positioning within
the device file structure so let me just
give you a quick demo of the uploading
switch back to the same application
so
we will connect to the device
and now you have two ways of doing the
the upload one way is to drag it to the
icon or whenever you are in the download
mode just drag it in here and then you
get a sheet popping up that allows you
to specify the option you want to scale
it yes or no yeah in this case we want
to scale it and see actually it suggests
the size it knows that this device has
the screen size 120 x 160 but actually
the device also supports as native
size also 288 by 352 so it could choose
either one let's stick with the first
one and do the upload now really
depending on the speed of the device you
see we got a new entry in here and it was
scaled to 107 by 160 the
original image was larger what about
this and you see it was 429 x 640
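The numbers in the demo are consistent with scaling the image to fit inside the device size while keeping the aspect ratio. The function below is a minimal sketch of that arithmetic; the name fit_within and the choice to truncate rather than round are assumptions, and the talk does not say how Image Capture itself computes the size.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit inside the box, keeping the aspect ratio.

    Uses integer arithmetic and truncates fractional pixels. Images that
    already fit are returned unchanged.
    """
    if width <= max_width and height <= max_height:
        return width, height
    # Compare width/max_width with height/max_height by cross multiplication.
    if width * max_height >= height * max_width:
        # Width is the limiting side.
        return max_width, height * max_width // width
    # Height is the limiting side.
    return width * max_height // height, max_height


if __name__ == "__main__":
    # The 429 x 640 original from the demo, fitted to the 120 x 160 screen size.
    print(fit_within(429, 640, 120, 160))
```

For the demo image the height is the limiting side: 160 / 640 is a factor of one quarter, and a quarter of 429 is 107.25, which truncates to 107.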
and again let me show that using the API
test app the upload file where we just specify
again a parent object and in this case
what we want to do is even upload a text
file and if that works then we should
see it it takes a while in theory we
should see it on the right side it
should pop up we are waiting for that but
far too long okay can we switch back to
the slide please
another area where a new
feature was introduced with Panther is
fast user switching for image
capture there was really no problem if
you had a scanner or a mass storage
device connected mass storage was very
simple like all your FireWire hard
drives you would just expect that it
should work if you are user A and switch to user
B then both users should have access
to the device for a scanner it was also
not a problem because for scanners we from
the beginning had an ICA scanner open
session and close session and there was
only one client allowed for a session so
that simply means if you are user a you
are working with the scanner you have a
session open you switch to user B and
there was no way that user B could get
the scanner if you are user a and you
are done with your session then user B
will have easy access to the device
for cameras that are not mass storage
for PTP cameras we do not have the
concept of a session so what's happening
is really user A uses the camera then you
switch to user B now the question is
what should happen should user A or B
have the camera well instead of
just like unplugging the device and
reconnecting the device which actually
would give user B access we want to have a
better solution and what we are going to
do is introduce session handling for
cameras so we will have an open and
close session and these calls are
really similar to the ICA scanner
open and close session but they are
different because they are optional well
we do not want to break existing apps
so the current iPhoto should just work
the current image capture app
or your app should just work and the
other thing which is much different is
really the number of clients well for
cameras there's no need to limit that
down to a single client so multiple
clients can still use the device
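The optional camera sessions can be modeled as simple bookkeeping: an open call takes a device object and hands back a session ID, a close call takes that ID, and several sessions may exist for one camera. The sketch below is hypothetical throughout; the class and method names are made up, and the rule that a device counts as busy for another user while any session is open is an assumption used for illustration.

```python
import itertools


class CameraSessions:
    """Toy bookkeeping for optional camera sessions (not the real API)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._open = {}  # session id -> (device object, user)

    def open_session(self, device_object, user):
        """Register a session and return its ID. Several sessions per device are allowed."""
        session_id = next(self._ids)
        self._open[session_id] = (device_object, user)
        return session_id

    def close_session(self, session_id):
        """Close the session identified by the ID that open_session returned."""
        self._open.pop(session_id, None)

    def in_use_by_other_user(self, device_object, user):
        """True while some other user still holds an open session on the device."""
        return any(dev == device_object and owner != user
                   for dev, owner in self._open.values())


if __name__ == "__main__":
    sessions = CameraSessions()
    importing = sessions.open_session("camera", "user A")
    print(sessions.in_use_by_other_user("camera", "user B"))
    sessions.close_session(importing)
    print(sessions.in_use_by_other_user("camera", "user B"))
```

A client that never opens a session is simply not tracked here, which mirrors the point that existing applications keep working unchanged.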
so these optional calls are just to make
fast user switching a lot easier what's
happening then is for example whenever the
future iPhoto is done with importing or
you're not at the import tab then there
is no need to block the device from a
second user so ICA open session just
takes the device object and returns a
session ID ICA close session just
takes the session ID that you got on the
open call these are the main changes we
have API-wise for Tiger let's
quickly look at the scanner
support side before we go to some real
camera related demos so on the scanner
support since Jaguar we were
supporting native scanner modules and
also Twain data sources for Tiger sorry
for Panther we added support for
transparency units and also added a very
simple to use UI that allows you to do
some image enhancements automatically
or with a manual correction and for
tiger we are going to support something
that was also requested by a lot of
users the support of a document feeder
and let me just show you that on demo 6
so this is the standard image capture
UI if you have a scanner connected
you see in this case we are using the
Twain bridge which means we are using
a Twain data source using in this
case the Epson Twain data source that
is just rerouted through the Twain bridge to
image capture and since this device here
supports as a Twain capability a
document feeder we will modify the UI
and add this extra checkbox here so you
can select the document feeder then the
main view will change because when you
are in the document feeder mode you
are probably not in this workflow where
you do a prescan first then select your
scan rectangle and then do the final
scan what you want to do in the document
feeder mode is really scan the entire
page and do it quickly and do it often
so again this is probably not the final
UI because the one thing that you
could not select right now is let's say
do you have A4 or US letter
or US legal so we need some way to
allow the user to choose between
different paper formats right now in this
demo we will just scan the entire
document no put the scanners working
what was probably in the sleep hopefully
for exact
light is flashing and yeah it is
scanning the first page so this is not a
professional 24 pages or 60 pages per
minute scanner this is just a low-cost
easy to use kind of home type scanner
that you would use in order to 24 for
scanning multiple documents see
what's happening the first page is scanned
in the background preview is launched
and as soon as a page comes in now it
will be added to preview and yeah each
page will be displayed actually one
nice thing you could do in preview since
these pages come in one at a time what
you can do in preview you could set the
preferences for images to say well
create only one you don't want to create
multiple windows just create a
single window and then all the pages
that come in will be added to the same
window then for example in preview you could
do a print and print to PDF to get a
PDF document of whatever you have okay
so I guess we're done with the four
pages yeah also new just one minor
remark in the past we were just
generating like a cryptic name based on
scan and then date and time in this case
you can really specify a prefix and we
will just add a page number to it well i
think it's a little bit user friendlier
than before okay back to the slides
please
so looking at the device developers what
do we have new for them well first of all
thumbnail handling we were introducing
an API ICA copy object thumbnail in
Panther and the initial support was for
ICA thumbnail format with Tiger we are
going to extend that and really allow
you to download the thumbnails directly
in JPEG or TIFF format and for the
developer of a device module this means
well not that much of a change because
you can directly give us the native
thumbnail that you have or fall back to
the old way and just return an ICA
thumbnail so if you are not going to change
that's fine because we will do the
conversion from ICA thumbnail to
JPEG or TIFF so the user will get whatever
thumbnail format you see one limitation
we had in the past with ICA thumbnails
was really the 128 x 128 pixel
limitation and we want to loosen that
a little bit and allow really as described
in the PIMA 15740 standard allow
larger thumbnails up to a maximum of
160 by 120
one thing that we got a lot of questions
about because I guess the documentation
was lacking a little bit on that was
what's up with the Info.plist and
DeviceInfo.plist well for a device module that
we have on image capture we need since
this is a regular bundle we need an
Info.plist that has all the things
that are required for the Info.plist of
a bundle executable icon name and these
things but we needed in the past also
information that would allow us to do
the matching of the hot plug event with
this device module and this information
this matching information was kept in
the Info.plist we in addition needed
the DeviceInfo.plist which was in
the resource path of the device module
and that DeviceInfo.plist was
containing information about the current
profile or device icon for example for
Tiger we are going to change that we
will have all that information just
combined in a single plist that's the
DeviceInfo.plist you still need
the Info.plist but with no image
capture related entry in there and it
will be at the same location as the current
DeviceInfo.plist and really combine
the entries of the two plus some new
entries we will for example specify in
that DeviceInfo.plist we will specify the
physical transport or media information
for example does this device support
JPEG TIFF movie AVI whatever format and
its sizes and you already saw why we are
going to do that you saw on the upload
we have this pop-up that was giving us
size information
so image capture uses really this for
the upload panel and allows the user to
choose the kind of native size it has
information on the screen size of
devices or sizes that this device
produces and since we are also extending
that to add the device capabilities we
will have the information for example is
this device able to handle uploads of
JPEGs is this device able to handle an
erase or take picture let me just show
you the current format on demo 5 so
if we dig into the device modules so
they are in System Library Image Capture
Devices and here we have a whole set of
devices so if we show the package
contents go into the resources here we
have the DeviceInfo.plist and now
this has some USB related information
and some device related information and
now you can go in and look at all these
entries or instead of using the property
list editor we can also use a small tool
that we are also going to add to the SDK
the DeviceInfo.plist viewer and
this allows you to just double click
on for example the PTP module and what
is happening is it is just collecting all that
information actually it looks up the
icon and displays that as well and you
can pick whatever device you have
and go to the device and you see okay
this one supports JPEG and these
resolutions this one supports download
and erase it does not support capture
for example and we see the protocol
level it's there all PTP cameras these
are all the vendors so you see all main
vendors from Canon Kodak Nikon and
Sony they have PTP cameras that we do
support and this is just a convenient
tool to get to that information and as I
said we are using these sizes heavily in
the future okay back to the slides
please
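The kind of summary the viewer tool shows (formats, sizes, and which of download, erase, and capture a device supports) can be sketched with Python's standard plistlib. The property list below is entirely made up for illustration: the key names transport, formats, sizes, and capabilities are assumptions and do not reflect the real DeviceInfo.plist layout.

```python
import plistlib

# A made-up device description; the key names are hypothetical.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
  <key>transport</key><string>USB</string>
  <key>formats</key><array><string>JPEG</string></array>
  <key>sizes</key><array><string>120x160</string><string>288x352</string></array>
  <key>capabilities</key>
  <dict>
    <key>download</key><true/>
    <key>erase</key><true/>
    <key>capture</key><false/>
  </dict>
</dict>
</plist>"""


def summarize(data):
    """Parse a device description and return transport, formats, sizes, and enabled capabilities."""
    info = plistlib.loads(data)
    sizes = [tuple(int(n) for n in size.split("x")) for size in info["sizes"]]
    enabled = sorted(name for name, supported in info["capabilities"].items() if supported)
    return {
        "transport": info["transport"],
        "formats": list(info["formats"]),
        "sizes": sizes,
        "capabilities": enabled,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

An upload panel could take the sizes list from such a summary to offer the native sizes as scaling targets.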
and finally I come to the main demo part
of this session and the idea was to just
give you kind of ideas of new
applications that you could do with image
capture first thing is high dynamic
range images for those of you who
attended session 207 on Wednesday you
learned everything about high dynamic range
images you probably will ever need to
know Luke and Gabriel were doing a
great job introducing that and to
just keep it very short one of the
problems you will have and Gabriel was
showing that if you look at Apple's
garage and take a picture of that scene
you see with the short exposure you get
pretty much good detail in the back
everything in the front you have no idea
what's that if you have a too long
exposure you see all the cars these are
not being w snow okay and using the
secret sauce of tone compression you can
take multiple exposures and do a tone
compression and you really get a result
that's like that so that is the best of
all worlds up to this point it was kind
of complicated to take these high
dynamic range images because you had to
go out put your camera on a tripod and
then take an image change the exposure
time take an image change exposure time
do that a couple of times and then
run that through an algorithm well if
you have a camera that allows you to
do that automatically you could just use
image capture calls to really do one
quick high dynamic
range image capture I really wish I
could say that you could go out and take
any camera and you could do it but for now
we have one
and really hope that next year more
vendors will open the protocol to support
more of these kind of things so the
Canon D70 that's the one we are going to
use now and oh sorry Nikon correct
okay what i want to do is create a scene
here for a camera that's kind of
difficult to capture and we have a
bright light the box there's something
in the box and the camera is just
pointing to that so let me turn the
camera on and you see the D70 is
recognized and what we're going to do
now is we take the first image this is
automatic exposure and we see underneath
the image it's the exposure time so what
we do is we take out of the metadata we
display the exposure time so basically
verify that whatever we set the exposure
time to before we do the take picture
these times are really in the image we
have to still have it here and now we
take a couple of exposures and then take
Gabriel's secret sauce and combine
that and it's already done because the
progress bar has stopped moving and we
will be able to open this that's the
current
time
and we captured the scene now if you look
at this it's actually pretty amazing
because what we will see up here we can
actually see the inside of this lamp
desk lamp but we can also see inside the
box I can actually make it a little bit
brighter and yeah we can really see
what's inside the box and also see
what's inside the lamp nice thing these
are really floating point images so the
information is really not frozen to this
or whatever you see now we still
will always have the slider to go either
into the dark or in the lighter side
so just to give you an idea how these
images look without this processing well
that's what you normally see as soon as
you have the tigers visible if you
really try guys coming heavily and yeah
you can see the tigers but you have no
details in the desk lamp or if
you have the detail on the desk lamp the
tiger is gone
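The general idea of combining several exposures into one floating point image and then compressing the tones can be sketched as follows. This is a generic textbook style illustration and explicitly not the algorithm used in the demo, which the talk does not describe: the hat shaped weighting, the linear sensor assumption, and the simple x / (1 + x) compression curve are all assumptions.

```python
def merge_exposures(exposures):
    """Merge exposures into a per-pixel radiance estimate.

    exposures: list of (exposure_time_seconds, pixels), pixels being floats in 0..1
    from an assumed linear sensor. Mid-tones get the most weight, clipped values none.
    """
    count = len(exposures[0][1])
    merged = []
    for i in range(count):
        weighted_sum = 0.0
        weight_total = 0.0
        for time, pixels in exposures:
            value = pixels[i]
            weight = 1.0 - abs(2.0 * value - 1.0)  # 0 at black or white, 1 at mid-gray
            weighted_sum += weight * value / time  # value / time estimates scene radiance
            weight_total += weight
        if weight_total > 0.0:
            merged.append(weighted_sum / weight_total)
        else:
            # Every exposure was clipped here; fall back to the first exposure.
            merged.append(exposures[0][1][i] / exposures[0][0])
    return merged


def tone_compress(radiance):
    """Map unbounded radiance into 0..1 with a simple x / (1 + x) curve."""
    return [r / (1.0 + r) for r in radiance]


def simulate(radiance, time):
    """Simulate a linear sensor that clips at 1.0 (used only for the example)."""
    return [min(r * time, 1.0) for r in radiance]


if __name__ == "__main__":
    scene = [0.5, 40.0]  # a dark box interior and a bright lamp, in made-up units
    shots = [(t, simulate(scene, t)) for t in (1.0, 0.01)]
    print(tone_compress(merge_exposures(shots)))
```

Because the merged values stay in floating point, a brightness slider like the one in the demo could rescale the radiance before compression without losing detail at either end.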
so this is a pretty nice way to capture
high dynamic range images and really
it's simple to do the only thing
you have to do is click one button and
it happens automatically okay so the
next demo is time lapse and here I guess I
have to thank all of you because you
were all really contributing to that
demo without knowing though so let me
just show you what we did yesterday
so we took a time lapse from
yesterday afternoon till yesterday late
night of the campus event see this
group down there they are quite busy
the clouds come in getting darker
I Fallon show begins even the moon come
back
in the back you see scent of the airport
or eppinger coming in
ok I guess but now leave the last one
scar all right
who's right
okay so how did we do that well
a small application sitting on top of
image capture that just has two sliders
very easy to use you can specify the
time you want to capture in this case
whatever 12 hours and how far do you
want to compress it so if you want to
play 12 hours within 60 seconds you take
this setting otherwise yeah you can
change it like say I want 24 hours and
want to play it back in like two minutes
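The two sliders determine how often a picture has to be taken. The sketch below works through that arithmetic; the function name and the playback rate of 30 frames per second are assumptions, since the talk does not mention the frame rate of the resulting movie.

```python
def capture_plan(capture_seconds, playback_seconds, fps=30):
    """Return (number_of_frames, seconds_between_shots) for a time lapse.

    fps is an assumed playback rate, not a value taken from the talk.
    """
    frames = playback_seconds * fps
    interval = capture_seconds / frames
    return frames, interval


if __name__ == "__main__":
    # 12 hours of capture played back in 60 seconds.
    print(capture_plan(12 * 3600, 60))
    # 24 hours of capture played back in two minutes.
    print(capture_plan(24 * 3600, 120))
```

Under the 30 frames per second assumption both settings from the talk come out to one picture every 24 seconds, because doubling the capture time and doubling the playback time cancel out.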
and then just hit start and it starts
and takes the images creates the movie
and two hours later or so
obviously you're done okay if you
have a little bit of time so last
thing I want to show you is just for fun
a small application that actually allows
you to take 3D images and the idea is
very simple if you have a camera that
flips and so basically the distance of
the eyes and we flip the camera and just
doing that we could mount it on a
tripod I just don't want to do that
because I already have those images
captured connect the device and then
bring up a UI that will just look at
the images on the device and the nice
thing since there's a sensor in the
device really showing whether you
rotated this way or that way we
can pre-sort the images and actually
pre-sort them in a way that whenever I
change one image here you see that the
one on the right side changes as well
because they are kind of grouped by date
so we know it's the left and right side
and then we can very easily group it and
yeah just select whatever you want and
then you could say process and then what
should happen is wait two large images
and what should happen is it will
display like that and you can do some
fine adjustment and using one of the red
green glasses you can look at it and you
have your 3D image also very simple to
do and without any extra effort okay
go back to the slides
we see we did the demo and we are now up
to Q&A