WWDC2001 Session 201
Transcript
Kind: captions
Language: en
Good morning, and welcome to session 201, "How Threading Can Benefit Your Application on Mac OS X." My name is Mark Tozer-Vilchez; I'm the desktop technology manager in Developer Relations. Before we start: I've gotten some feedback that our session title is the longest one at the conference this year, so I've been asked to take input at the end of the session for title suggestions. Make sure you give me some input; I need to shorten it, or so I'm being told. What I'd
like to do is go over some of the ways Apple Computer has been working to optimize system performance, through both hardware and software. Think about how we introduce systems at keynotes, at shows like Macworld and so forth: aside from new features, functionality, and design, performance is one of the key messages we deliver to our customers. Customers want to know that when they buy a new piece of hardware, their applications are going to run faster. So how does Apple address performance, aside from increasing the megahertz? There are different ways we can do that. One is processor technology: the G4, which we've migrated throughout our product line, the most recent addition being the titanium PowerBook G4. Another is multiprocessing: last year at Macworld New York we introduced dual-processor systems, so you now have two G4 processors, with two G4 Velocity Engines, available to you as well. The unified chip architecture for I/O has also improved the bus, so information can be passed along it much more efficiently. So what are the ways that you as a developer can take advantage of these technologies? Well, there are APIs that we put out there.
The MP API has been out for at least three or four years; the Velocity Engine API has also been out there. Those are ways you can optimize your code to take advantage of some of these hardware technologies. Multitasking and multithreading are also ways you can increase performance purely within the application, without waiting for the processor to get faster. At the root of all of this, obviously, is Mac OS X, and that's what you're going to learn throughout the whole week: how Mac OS X provides the functionality and abilities for you to optimize your code without having to worry about when Apple is going to deliver a gigahertz processor. What you should really be asking yourself today is: when are you going to optimize for the APIs we have today, and increase performance on today's processors? I also want to clarify some of the definitions we use in this kind of conversation about optimizing for hardware. Multitasking is the ability to handle several different tasks at the same time. Multiprocessing is the ability of the OS to actually utilize more than one processor at the same time. SMP, or symmetric multiprocessing, is the ability of the OS to be a little bit intelligent about how it schedules the execution of code across multiple processors, to balance the load, so to speak. Can we go to demo machine three to give you an example of what that looks like?
Here's QuickTime running three different movies. If you look, the system is actually utilizing both processors, and they're pretty well balanced. QuickTime itself is threaded, so because of SMP the OS knows how to hand each processor the task of pushing pixels to the screen. If QuickTime were not threaded, if it were a single-threaded application or technology, you would see one processor being overloaded and the second processor only getting some time when the operating system divvied up some of the tasks.
Can we go to slides, please? What I'd like to do now is introduce Matt Watson, a Core OS engineer, who will deliver today's presentation on why you should be threading your application for Mac OS X. Thank you.

Thanks, Mark. I work in the Core OS group at Apple; we're responsible for most of the lower-level Darwin source code, and I specifically work on the user-level implementation of Darwin. So why would you want to use threads in your application? What Mark alluded to earlier is the user perception of the marketing message we've been sending out for Mac OS X: it's fully preemptive, and it supports symmetric multiprocessing. Customers are going to expect that their applications will run more efficiently and scale better on multiprocessor machines. So if your application can perform tasks simultaneously and you're running on a multiprocessor, the customers are going to expect that added performance benefit. Since Mac OS X is fully preemptive, you'll notice that all the applications are getting time-sliced simultaneously, and if your application has multiple threads, it's going to get more than another application's share of that time, so you'll see the benefit directly in your application. In some applications you might notice that synchronous requests block the UI. We want to avoid that: if you're using an API that could block for an indeterminate amount of time, an extra thread in your application could keep the UI from blocking and let that other request complete. Also, polling is considered bad on Mac OS X; we don't want applications constantly checking for state. One way to avoid that is through some of the synchronization mechanisms I'll discuss in a little bit, where you can have a thread waiting in the background for an event to occur, and it will only continue after that event has actually fired.
In most cases, though, your application may not need to be multithreaded. There are times when the added complexity of a thread can change the logic of your application: if it was written originally to be single-threaded and you have lots of global data, your locking mechanism may not be as robust as you originally designed. There's also a little bit of added overhead for every extra thread that comes into your application; there are kernel resources associated with each one, and you basically need to decide, when you're designing your app, which portions of your application make sense to be multithreaded. There may also be other options that make sense: in a GUI application you always have a run loop going that's polling for events, and if you can use a timer for a short-lived task, that might be a better option in certain cases. About that overhead: in a preemptive multitasking operating system, a context switch is what occurs when the kernel decides to switch between different threads running on the system. There's data associated with every thread, basically the register state, that needs to be saved and restored between threads. There may also be extra register state, depending on whether that thread has used floating point or even the Velocity Engine, and that amount of data obviously makes it a little more expensive to switch between extra threads in your application. The thread memory footprint may also be significant: in Mac OS X every thread gets half a megabyte of virtual stack space, for instance, which is controllable when you create the thread, but you have to be aware of it. So if you're creating a thread that only calls one function, you may want to reduce the stack space, and I'll discuss APIs to do that a little bit later. Thread creation in and of itself is a bit of overhead you need to worry about, because the act of creating a thread allocates those kernel resources. There is
a set of Mach APIs used under the covers to create threads, and the calls that create the stack and the thread itself take a little bit of overhead. Across the APIs that Mac OS X provides for multithreading there are some common concepts. All of these APIs let you create a thread and let a thread exit itself. There are synchronization primitives that let you coordinate the events occurring between multiple threads. And every API has a set of thread-safe services; the documentation describes the things you can and can't do from multiple threads. We're working toward making this a lot easier, so that you don't have to worry about which APIs are thread safe because everything you call will be, but that's not quite there yet. In Mac OS X, threads are the scheduling primitive: the unit the kernel uses when divvying up the work that needs to be done at every time slice. Our threads are fully preemptive, so the kernel will interrupt a thread that's running to start the next thread. There are some exceptions to that: we have scheduling modes where a real-time-like API can be used to specify somewhat of a deadline schedule, where you can say "I want my thread to run this long," but in most cases the default threads that get created are normally preempted. We use a priority-based scheduling model, so by default all threads get the same priority. When you create a thread, or after it has been created, you can change or modify that priority. So if you have a thread that needs to be more important than other threads, either in your application or system-wide, you can change that; you can also depress the priority of a thread if you want it more in the background, if it's just doing some low-level work that doesn't need to be in the user's face. We use a one-to-one threading model, which means that as the APIs I've described call into the low-level kernel threads, we use a single kernel thread per high-level thread. So as I discuss these APIs, keep that in mind.
There are other implementations out there where you have multiplexing of threads in user space per kernel thread, but the added complexity of that, and its scalability on MP machines, makes it a little bit more difficult to justify on Mac OS X. Mach threads, as I said, are the basis of our threading model. In general you can inspect a Mach thread by using the Mach API, but you really want to use the API that's appropriate for your application: if you're writing a high-level Carbon or Cocoa application, stay in those APIs if you want to inspect thread state. If you're writing a performance tool or something along those lines, you can use the Mach API directly, but we discourage it, because if, for instance, you change a thread's priority through the Mach API and not through the API you're using, that API may not notice it, and later on your thread scheduling might behave unexpectedly. So this is a slide
we've shown before, and what it's trying to show is an example of what happens if you take an application and run it on a multiprocessor system with more than one thread. The numbers along the edge of the bars are the multiplier factors for the improvement in performance. You can see the first, third, and fourth numbers are all under two, but that second number is above two, which is kind of interesting. It's an exceptional case, where you might wonder how you could get more than a 2x performance improvement from a multithreaded application. Well, if you have two processors in the system, you also happen to have two caches. If your data set fit in one primary cache, you wouldn't see more than a 2x improvement, but what happened in this case was that the data set could be split across the two caches, and you actually got a little bit better than a 2x performance improvement. So, depending on the size of the data you're manipulating, you may be surprised at how good your performance can actually be. In the Mach
implementation, it was designed from the ground up to be fully symmetric multiprocessing, so we don't have any special-case code for a single processor versus a dual processor; everything on the system is designed to be run on a multiprocessor machine. The Velocity Engine context I mentioned earlier is something that has to be saved and restored across processors, and the context that gets saved is a little bit expensive; that's the overhead I was mentioning that you might need to worry about. On Mac OS X we use the same kernel binary for either a single-processor or a multiprocessor system; we don't have a special install for a UP or an MP system. This basically makes our own development a little bit easier, since we don't have to QA two different kernels, and it's a little easier for us to do the install, as far as making sure that customers all have the same bits on their machines. The Mach
scheduler was inherited from the OSF, the Open Software Foundation: we use a scheduling framework that they designed, but we've modified it heavily for our own use. It has a global run queue, meaning that all the runnable threads on the system get switched through on every context switch. The system notices when you're on a multiprocessor system with an idle processor, and it will go and schedule a thread onto the most idle processor at the time. So we don't really have a notion of thread affinity, a term you may have heard before, where you can have a thread running on a single processor for a long time; the kernel will balance the resources of the system as best it can. And as I've said before, preemption is key here: every thread gets preempted as it goes through the run queue. The user
frameworks that you'll see when writing a multithreaded application are probably just the same frameworks you're already using for your application; there's nothing really special. Most of the frameworks we have have an API for threading. In Carbon it's the MP, or MP Task, API that has been out for a few years. In Cocoa there's NSThread, which is used in both the non-GUI and GUI frameworks. And in Java we have Java threads, which are a special implementation built on the underlying primitives in Darwin. All three of those APIs depend on the Darwin pthreads, or POSIX threads, APIs, and since I work in the Core OS group, that's the API set I'm most familiar with; I'll talk about it in a second. pthreads, as I said, is the basis for all of the threading models; every API I've described goes through the pthreads layer, so when we make changes and enhancements to the pthreads layer, all of those API sets take advantage of them. I put in quotes there
that it's a "light" implementation. The history behind that is, when we were trying to decide on an API we could use to implement those higher-level threading models, we chose pthreads, and the implementation decisions were driven by those higher-level APIs. So if you look in the header file and our documentation, you might see that some API calls are missing. In general we're working toward fleshing out that API, but the design goal was to help ensure that the higher-level Carbon, Cocoa, and Java APIs could do their job. We use a one-to-one Mach-to-pthread implementation; as I said, that reduces the complexity of our user-level code and helps scaling on an MP system. In pthreads there are some common
API uses and misuses. If you haven't done threading before, pthreads is a fairly well-defined standard; you can go to your local bookstore and get a book on it. I'll talk about some of the ways you'll need to be careful when using pthreads on our implementation. For instance, one case: we don't have any of the system-wide types. The pthreads specification provides for global, shared-memory-based mutexes, and condition variables for signaling on them; those can be implemented within your own application, but we don't provide those APIs right now. One thing you need to remember is that synchronization is not cheap. In the model pthreads uses, a mutex lock is associated with a condition variable, and for the two to be used properly, in an atomic fashion, we have to use some kernel resources to do the signaling when you're
do the signaling when you're
synchronizing between threads the
default behavior for a pea thread as
specified by POSIX is that threads are
what's called joinable which means that
when you create a thread the operating
system will hang around and wait for
that thread to finish if you want your
thread to go away and do its job and you
don't want to worry about it anymore
there's an API to detach that thread
which just means let it do its job I
don't want to hear about it
let it finish on its own as I mentioned
earlier, the stack space for a thread is by default half a megabyte. Now, that sounds like a lot, but it's all virtual memory that is used on demand. If you look at a process listing you might see your application using a lot of virtual memory, but unless that memory has actually been touched, the system hasn't allocated it for your application yet, so it won't cost you as much as you might think. Even given that, if you have a lot of threads running in your application, you may want to create them with an attribute saying "I want my thread stack size to be smaller," and if you do that, you can limit the visible virtual memory used by your application.
In the POSIX specification, for condition variables and signaling, there's the notion of a predicate. When you want to signal that something has happened, there's a mutex and a condition variable associated with that, but there's also some external condition that needs to be checked: you can have, say, a global volatile variable that says "this thing happened." If you just use the condition variable and mutex without having that variable, the API won't work properly; this is discussed in a lot of POSIX threads texts, and you can look it up. The most common error people make is trying to do signaling without a predicate.
Our implementation of pthread_cancel: when you start using threads, or if you've used threads on other systems, you might want to try to cancel or kill a thread that's currently running, and this is dangerous for a few reasons. One is that if you're using asynchronous cancellation, where you're basically telling the system "I don't care what the thread is doing, I just want it to go away," there could be kernel resources associated with that thread that may not be properly cleaned up. There may also be other data associated with that thread, in some other task, that doesn't get cleaned up, like a file descriptor that's left open, or a file that's taking up space on the disk. So we recommend using pthread_cancel in its deferred, or synchronous, model, where you basically make a request to the system: "I'd like this thread to quit." In our API we have a pthread_testcancel call, so if you're in a long-running, compute-bound loop, you can check whether the thread has a pending cancellation request, and if so, the thread will exit at the pthread_testcancel point. The POSIX
specification also provides for system-defined cancellation points, like most system calls, and if you look in the Darwin source base you'll see that we're busily working on implementing those, because it's actually a better way of doing cancellation: if I'm in an open or read system call and I tell that thread to cancel, it should just break out of that open or read system call, as if I had interrupted it. The pthreads documentation right now is a little sparse on our system; I usually point people to the Open Group site, because they happen to have pretty extensive documentation on both the UNIX 98 standard and pthreads specifically, and in general we use that as our model for our implementation. We will be providing more pthreads documentation on the system at some point in the future.
So in the Carbon APIs, as I said, the MP Task API is what you'd probably look at for your multithreading needs. A quick overview: in Mac OS X there's the notion of tasks and MP tasks. Tasks have classically been the process notion, where you have an application, its address space, and all of its threads; MP tasks are threads within a Carbon application. And as you know, in Mac OS X all applications are in separate address spaces, so some of the APIs you may have used in classic Mac OS 9 are not going to work the same way: you can't do signaling between applications right now using the MP Tasks API. The API here is pretty rich; there are a lot of mechanisms to do synchronization between MP tasks. There's a semaphore model, there are message queues, and there are event groups. All three of these can be used in different ways depending on your application: whether you have a client-server model or a worker-thread model, you can decide which one works best for you.
The other API that's a little bit unique here is the critical region API: it basically lets you do mutual exclusion, and it also allows recursive entry into those regions. There are atomic operations present in this API that are kind of handy for atomic increment and decrement and test-and-set type instructions; these are all done very efficiently, and they're probably very close to the same implementation as on Mac OS 9. Some of the APIs on Mac OS X that use MP tasks under the covers are the synchronous File Manager APIs and the Open Transport APIs. If you'd like to see an example of some of these APIs, the technote we usually refer to is 1104; the URL is kind of hard to read there, but in general it will give you background on what we're doing with the MP Tasks APIs, along with some examples of the thread-safe services you can use in Carbon. The documentation specifically for Multiprocessing Services is on developer.apple.com, under Tech Pubs, MP Services, and all of these documents are being evolved to reflect the current Carbon API. The second framework I'll
talk about is Cocoa, the high-level GUI framework Apple has presented as an object-oriented environment for doing application development. Its NSThread API is very simple to use; there aren't very many entry points to it. Basically, you can create a thread, and you can get the thread's state at any time. The preemptive nature of the thread isn't unique; all the threads in the threading models I've described are preemptive. There's an exit notification for NSThread, so even though the thread is detached, meaning it can go away and you can forget about it, you can also register for a notification saying "I'd like to know when this thread goes away," to clean up resources. Common to most thread APIs is the notion of per-thread data, and when you extend that into an object-oriented environment, the per-thread data becomes a per-thread NSDictionary, so you can have keys and values associated with your thread using the nice high-level Cocoa APIs.
In NSThread there's an AppKit extension: even though NSThread is defined in the Foundation classes, if you look in AppKit, our Cocoa UI framework, you'll notice there's a method in there called detachDrawingThread, and what that lets you do is give the system a hint: "I'm going to be creating a thread here that might be interacting with the window manager," and AppKit will set up special state to make sure all of that thread's interactions with the window manager are thread safe. Now, if you're writing a pure Quartz application, that interaction is already thread safe; in fact, the way you can write a Quartz application without using any of the Cocoa frameworks is to use one connection to Quartz per thread, and that provides all the synchronization you need. You still have to protect your global data: if you have multiple threads running that are all trying to communicate with the same connection, you need to use the synchronization mechanisms I've described to make that work. But in general the AppKit extension does all you need for a Cocoa thread to let it know you're going to be doing drawing. NSThreads are self-aware, meaning there's no global notion of all the NSThreads on the system, and you're not really going to be handing NSThreads off to other objects in the system. All of the APIs I've described, since they're layered on top of POSIX threads, can ask for their POSIX thread, and once they've done that they can perform any of the POSIX thread APIs, like changing their priority or finding out what the stack size is for the thread. But in general NSThreads kind of stay in their own realm. They have a separate run loop,
so usually, when you're creating an NSThread, since you're already in a GUI application if you're using Cocoa, there's a main run loop associated with the first thread, the one created on behalf of the system. New threads that get created are going to have their own run loop, because those threads will probably have a different signaling mechanism, with different events occurring that need to signal back and forth between other threads. The other Cocoa APIs that have to do with run loops are timers and notifications, and you can send those between threads with a little bit of synchronization. There's also the concept of an autorelease pool; if you haven't done any Cocoa programming yet, this is a wrapper for objects that get created and destroyed kind of on the fly. You call a method and it returns you an object, and one notion Cocoa tries to help out with is memory management: if you didn't explicitly create an object, it just came back from a call, it gets what's called autoreleased. The way the autorelease pool works is on a per-thread basis: the main run loop has an autorelease pool, and every time through the event loop it releases all the objects that may have been created on behalf of method calls. In a separate thread, you need to make sure you're maintaining an autorelease pool as well, because all the messages that get sent on that separate thread will need their returned objects autoreleased eventually. The documentation for NSThread is also on developer.apple.com, at a very long URL; the Cocoa documentation is actually very well done, and I believe there's an O'Reilly book out that you can go get now. For future
developments: basically, follow Darwin. Since I'm in the Core OS group, we have the ability to put all of the work we're doing daily into the public CVS repository, so if you go to the Darwin web pages, they'll tell you how to check out any of the CVS repositories for all the work we're doing. Specifically, the Libc project is where the pthreads code lives, and the xnu project is where the Mach kernel and the Mach threading model live. As for the things we're working on, and we're also getting help here, because the Darwin community is actually very interested in our threading implementation: priority inheritance is an issue where, if you have multiple threads and lock contention becomes a problem, a higher-priority thread can be blocked waiting for a lower-priority thread to run. There are some solutions you can use in your application to work around this problem, but in general the system should help out, by temporarily raising the priority of the thread that needs to run so the higher-priority thread can continue. That concept hasn't been implemented yet, but we're working on it. Then there are the general API expansion issues for the pthreads spec we've provided; we've gotten a lot of requests for specific functionality that's missing, and we know about that and we're working on it.
The other thing we've been focusing on is performance. As I said in the first couple of slides, when you're deciding whether you should use multiple threads in your application, or how you should use them, we don't want to be an impediment there; we don't want the choice to use more than one thread in your application to have to be made because there's a performance issue. In general, you should use multiple threads if your application has a data model that works well with that: if you have lots of data whose processing can be parallelized, where you're splitting up chunks of data, say in a graphical application that's doing tiling. The best example we've used a lot is Photoshop, with some of its filters; those things lend themselves very well to multithreading. And we don't want you as a developer to have to worry about whether you're even on a multiprocessor system; in fact, if your application is written properly, you should just automatically take advantage of that second processor, and your customers will be happy, because they paid the money for the multiprocessor box and they're actually getting the performance they would expect. We're also working on providing more thread-safe services, as I've mentioned, in the Carbon APIs, and even in the lower-level Darwin APIs we've been working pretty hard on making sure you can call the APIs you want without having to create your own locking mechanisms around the parts of the API that are not thread safe. So right now we have a couple of demos. First I'd like to bring up Robert Bowdidge, who's going to show an example of some of the developer tools we're working on. Thanks, Matt.
so what Matt has done is explained to us
what the api's are for using threads and
a little about the reasons about why we
might use them what I'd like to do is
give us some case studies and explain
how some of Apple's own tools use
threading now the interesting problem so
let's start this up is that this is
going to be a somewhat unauthorized talk
I've talked with the groups a bit but in
general what we're going to do is we're
going to reverse engineer on the fly
these apps and try to understand what's
going on the way I'm going to do it is
with a handy dandy little performance
tool this is a pet project it's not yet
on the developer CD that tries to
visualize how threading goes on now the
first step I'm going to look at is the
finder okay let's get some action here
and what the time what we're seeing here
is a timeline view in thread viewer so
each of the little blocks there
represents about a 50 millisecond
interval and each of the bars represents
each thread and so the idea here is that
you can see that there's three threads
currently in the finder click around
and the colors change according to
what's going on so for example the green
represents that there's currently
execution on that thread the yellow
represents that execution occurred but
is not currently occurring so there was
some execution during the last sample
the green as in here represents that the
program was waiting in the run loop the
red represents that the thread was
waiting on the lock now the first thing
I'll show you is that you can see that
there's basically one thread that does
most of the action that is the main
thread and that's how most applications
run so most of the drawing most the UI
logic is going on there the second
thread from the bottom the one that's
locked represents what the finder people
call a sync thread so the idea is that
every time that they cache a lot of the
information about what's going on in
each folder however when you enter a
folder that you've already seen you need
something that will go and make sure
that nothing has changed in that folder
and that's what the sync thread does so
the idea is it quickly goes and it looks
in that directory in a separate thread
notice that it's always existing it's
always locked so it can be started up
quickly and by having on a second
separate thread you not only have the
ability to basically have scalability
and do things in multiprocessing way so
that you get quick response but also
since you're accessing the disk you know
that if that blocks because let's say
it's an iDisk you're going to be able
to go and access that and wait for the
response without actually stopping the
UI thread and so that improves the user
experience now let's
start the finder up again
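the pattern described here, a sync worker that always exists but sits blocked until the main thread hands it a folder to re-check, can be sketched in Java roughly like this. All the names are invented for illustration; this is not Finder source code, just the shape of the technique:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

// A long-lived worker that sleeps until the UI thread hands it a folder,
// mirroring the always-present, usually-blocked sync thread described above.
class SyncThread extends Thread {
    private final Queue<String> pending = new ArrayDeque<>();
    private boolean done = false;
    final List<String> synced = Collections.synchronizedList(new ArrayList<>());

    SyncThread() { setDaemon(true); }

    // called from the UI thread: cheap, never touches the disk
    synchronized void requestSync(String folder) {
        pending.add(folder);
        notify();                          // wake the worker immediately
    }

    synchronized void shutdown() { done = true; notify(); }

    @Override public void run() {
        while (true) {
            String folder;
            synchronized (this) {
                while (pending.isEmpty() && !done) {
                    try { wait(); }        // blocked, like the red bar in the viewer
                    catch (InterruptedException e) { return; }
                }
                if (pending.isEmpty()) return;
                folder = pending.poll();
            }
            checkFolder(folder);           // slow disk work, off the UI thread
        }
    }

    // stand-in for re-reading the directory and comparing it to the cache
    void checkFolder(String folder) { synced.add(folder); }
}
```

because the thread already exists and is merely blocked, waking it with notify is far cheaper than creating a fresh thread each time the user enters a folder.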
the other thing that you'll find as we
click around is that each time that we
enter a new folder we can see that new
threads get started so there was the
remnants of one here and then we have a
thread here and that thread only stays
around for a small instant and then
goes away again and what's happening
there is that once again the finder guys
don't want the UI to block the
finder would look miserable if every
time that you went into a new folder and
it went out to touch the disk if the
entire finder locked up
it's a horrible user experience and so
to avoid that as many of you probably
will want to do in your own apps what
they do is they have the disk access is
going on in a separate thread when they
enter the new folder they go and
catalog it they do that on a
separate thread and that way they know
that the finder is not going to block
while they collect data a third example
is if we do a copy so let's go to the
home directory and let's duplicate a
folder or duplicate an application and
what we find is again up here we've
created a new thread the idea is that
the copy is something that's going to be
a long-running app or a long-running
thread it's something that's going to be
touching the disk and so it's going to
be blocking so you don't want it on the
normal thread and it's something that
should be running in the background
because it's probably something that the
user doesn't care a huge amount
whether it completes
immediately and wants to be able to do
other activities and so the idea of
being able to split off this background
task or background process is an
important idea so what you've seen here
is that the finder is using threading to
avoid blocking on IO in case they're
accessing disks that are remote for
example you've seen trying to improve
the user experience by making sure that
blocking stuff happens on separate
threads and you've seen the idea of
using the threads for background actions
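the background copy just shown can be sketched the same way. Again these names are hypothetical, not the Finder's actual code: a long-running disk task lives on its own thread, the UI keeps only a handle to cancel it and a progress counter to poll, so the main thread never blocks on the copy:

```java
// A Finder-style background copy: slow, blocking disk work on its own thread,
// with a cancel flag and a progress counter the UI thread can read safely.
class CopyTask extends Thread {
    private volatile boolean cancelled = false;
    private final String[] items;
    volatile int copied = 0;               // progress the UI can poll

    CopyTask(String[] items) { this.items = items; setDaemon(true); }

    void cancel() { cancelled = true; }    // called from the UI thread

    @Override public void run() {
        for (String item : items) {
            if (cancelled) return;         // checked between items, not mid-write
            copyOne(item);                 // slow, blocking disk work
            copied++;
        }
    }

    // stand-in for actually reading and writing the file's data
    void copyOne(String item) { /* open, read, write, close… */ }
}
```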
ok the second example I'd like to show
is iTunes so let's go back to thread
viewer
and we'll see there's a few more threads
in this so let's start out by getting
some music playing okay so we know it's
actually playing you know that this
isn't canned and what you see here is
the activity going on in the threads the
thread at the bottom again is the main
thread so it's sitting there and it's
doing all the UI work and if we actually
had the visualization part of iTunes
going you'd see that thread basically
pegging out constantly running the next
thread up represents basically the
data decompression which is actually
running as a deferred task and as a
carbon thread what you also will see is
occasionally on this third thread
there'll be some green blocks that go by
and what's happening there is that
there's actually a separate thread
that's being used for accessing the disk
and so the idea is that this thread goes
off every now and then and when it
needs data it grabs as much data as it
can stores it in a buffer then there's a
separate thread that actually does the
decompression and then the thread at the
very top where you see all the activity
is actually the thread that sends data
off to the audio device now on the disk side we
want to do large accesses and we want to
grab a huge amount of data and then do
the decompression with the audio device
we want to minimize latency and so the
idea is we want to throw a few little
bytes at a time out to that audio device
and so by having that on a separate
thread that means that we can easily
send the data that's needed and do it in
as timely a manner as possible so we
have no dropouts in actuality the three
threads that you see there the
decompression the disk reads and the
audio playback are actually all
controlled by the Sound Manager so
they're not threads that iTunes
went to any trouble to
create but once again you're seeing
cases of iTunes using threading to avoid
blocking on disk I/O you
see it trying to take the tasks that are
time critical and put them on separate
threads and you're seeing the idea of
trying to make sure that you can
parallelize the code by breaking the
major parts of sort of this pipe and
filter model into separate threads okay
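the pipe and filter shape just described, big disk reads on one thread feeding small low-latency device writes on another, reduces to a classic producer-consumer pair. The sketch below is hypothetical (the decompression stage is omitted and the "device write" is just a counter), not how the Sound Manager is actually implemented:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Producer-consumer pipeline: a reader thread pulls big chunks off the "disk",
// a player thread drains the shared buffer in small chunks, the way the
// playback thread feeds the audio device a little data at a time.
class AudioPipeline {
    static final int DISK_CHUNK = 64 * 1024;   // big reads amortize seek cost
    static final int AUDIO_CHUNK = 512;        // small writes keep latency low
    final BlockingQueue<byte[]> buffer = new ArrayBlockingQueue<>(8);
    final AtomicInteger bytesPlayed = new AtomicInteger();

    Thread reader(final byte[] file) {
        return new Thread(() -> {
            for (int off = 0; off < file.length; off += DISK_CHUNK) {
                int len = Math.min(DISK_CHUNK, file.length - off);
                byte[] chunk = new byte[len];
                System.arraycopy(file, off, chunk, 0, len);
                try { buffer.put(chunk); }     // blocks if the buffer is full
                catch (InterruptedException e) { return; }
            }
            try { buffer.put(new byte[0]); }   // empty chunk = end of stream
            catch (InterruptedException e) { }
        });
    }

    Thread player() {
        return new Thread(() -> {
            while (true) {
                byte[] chunk;
                try { chunk = buffer.take(); } // blocks until data arrives
                catch (InterruptedException e) { return; }
                if (chunk.length == 0) return; // end of stream
                for (int off = 0; off < chunk.length; off += AUDIO_CHUNK) {
                    int len = Math.min(AUDIO_CHUNK, chunk.length - off);
                    bytesPlayed.addAndGet(len);  // stand-in for the device write
                }
            }
        });
    }
}
```

the bounded buffer is what decouples the two rates: the reader can stall on a slow disk without starving the player, and the player can drain steadily without waiting on each read.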
these were just two examples about how
Apple is actually using threading
hopefully the people who are available
for Q&A can give you some more ideas
about when to use threading and when not
to thank you thanks Robert
so the second demo we have is Ivan
Posva who works on the Java virtual
machine and as I said earlier there are
definitely cases where your application
is more inclined to be multi-threaded
especially if you have lots of data that
can be operated on in parallel so he's
going to give us an example of an
application where this is the case and
helps out a lot okay good morning
what I have is basically a digital
elevation model from Switzerland and I
wrote a swing app that basically renders
scenes within Switzerland so I have a UI
element down here in swing that says one so
I'll spawn this rendering in one thread
so it goes off it tiles the image
into small pieces and does that on one
thread you see in the task manager one
CPU is usually used and the other CPU is
the Java UI making use of the spare
processing power to display so we saw
that it took fifteen point seven seconds
to display this image if I go to two or
even three threads on this machine when
I restart this image we see that the UI
updates much quicker
it's pegging both the CPUs calculating
so what else do I have I can do the same
thing with the satellite image just
taking the same data model
and while it's calculating I can scroll
in the UI here and still use
the UI while it's calculating the stuff so
Java itself has built-in support for
threading in the language it's rather easy
to use threading and rather easy
for us to update the UI at the same time
as you do calculation that's
about it
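the tiling scheme Ivan describes, splitting the image into pieces and rendering them on several threads, might look roughly like this in Java. The renderer and shader names are invented for illustration; the point is that each thread owns its own rows, so no locking is needed on the shared pixel array:

```java
// Tile an image across N worker threads: each thread fills an interleaved
// band of rows independently, then the caller joins them all before using
// the result, as in the elevation-model rendering demo.
class TiledRenderer {
    static int[] render(int width, int height, int nThreads) {
        int[] pixels = new int[width * height];
        Thread[] workers = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            final int band = t;
            workers[t] = new Thread(() -> {
                // this thread owns rows band, band + nThreads, band + 2n…
                for (int y = band; y < height; y += nThreads)
                    for (int x = 0; x < width; x++)
                        pixels[y * width + x] = shade(x, y);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try { w.join(); }              // wait for every band to finish
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return pixels;
    }

    // hypothetical per-pixel shader standing in for the real terrain renderer
    static int shade(int x, int y) { return (x * 31 + y) & 0xFF; }
}
```

because no two threads ever write the same pixel, the result is identical whatever the thread count; only the elapsed time changes with the number of processors.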
so what was the performance
benefit you saw with your multiprocessor
example for this calculation it was
about a one point eight scale
factor so that's pretty good if you
consider that part of the update is
happening while you're calculating it's
pretty much using the processors at
max speed so it's also a feature of
the swing implementation that the
developer doesn't have to worry about
the locking for drawing to the UI that
just happens for them behind the scenes for the user
you just say repaint this area here's
your new image that's it great thanks
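the point about not needing your own locks for drawing deserves a small sketch. In Swing, repaint() is safe to call from any thread because it only posts an event; any other UI mutation should be funneled onto the event-dispatch thread with SwingUtilities.invokeLater. The worker class and the tooltip update below are hypothetical, just to show the shape:

```java
import javax.swing.JPanel;
import javax.swing.SwingUtilities;

// A rendering worker that touches the UI safely from a background thread:
// repaint() merely schedules a paint event, and every other mutation is
// handed to the event-dispatch thread via invokeLater.
class RenderWorker extends Thread {
    private final JPanel view;

    RenderWorker(JPanel view) { this.view = view; }

    @Override public void run() {
        // …compute the next image tile off the UI thread…
        view.repaint();                          // thread-safe: just posts an event
        SwingUtilities.invokeLater(() ->         // everything else runs on the EDT
            view.setToolTipText("tile done"));   // hypothetical status update
    }
}
```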
Yvonne so I'll bring up mark to finish
off the information about this session
and other sessions so what I wanted to
do is bring up the related sessions that
you might be interested in for furthering
your knowledge about tuning and specific
areas of how to use the hardware
advancements here so session 121 is
carbon performance tuning which Robert
will be at advanced cocoa topics is
session 123 and the Darwin kernel is
session 140 I also want to add that
there will be a birds of a feather
session at the end of the evening today
at 6 p.m. here in Hall C I believe it is
on velocity engine so we will
actually discuss the updates on
velocity engine we wanted to kind of
separate both of these topics and
allocate as much time as we could to
threading you know one of the things
that last year I was up here on stage
promoting to you developers and at that
time we didn't have dual processor
systems I was promoting the fact that
you should thread your application kind
of not being able to tell you that hey
we're coming out with multiprocessor
systems but you know if you thread today
as OS 10 matures
and is actually released
your application will run faster simply
because the threading model within OS 10
is a lot more efficient than what is in
OS 9 so today with multiprocessor
systems you can see now why there's even
more of a reason why you want to
optimize your applications so taking a
single threaded application bringing it
over to Mac OS 10 the customer
experience is going to be well there may
be some performance enhancements but
really it's more perceptual because OS
10 is actually doing all the work for
you if you multi thread your application
you get additional benefits because OS
10 now says okay great I can you
know move things around a lot quicker
and use the thread model within the mach
kernel now the next step is really
optimizing your application for MP
because just providing threads is
not going to give you the efficiency
of using that second processor that's
where SMP comes into play so actually
parallelizing your code and seeing where it
makes sense to actually have your
thread go off to the second processor is
what factoring or parallelizing your code
involves and that's what we term as
optimization for MP hardware and that's
what you kind of saw in the example with
QuickTime itself actually balancing the
pegging of each processor so that MP
itself can take care of the housekeeping
there
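a note on the numbers: the Java demo earlier reported about a 1.8x speedup on two processors. Amdahl's law relates an observed speedup to the fraction p of the work that actually parallelizes, speedup = 1 / ((1 - p) + p / n) on n processors, so seeing 1.8x with n = 2 suggests roughly 8/9 of that workload runs in parallel. A two-line sketch:

```java
// Amdahl's law: the upper bound on speedup when a fraction p of the work
// is spread across n processors and the rest stays serial.
class Amdahl {
    static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }
}
```

the serial remainder is what keeps a dual-processor machine from ever quite doubling throughput, which is why factoring out as much parallel work as possible matters.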