WWDC2010 Session 309

Transcript

>> Lynne: Good morning, everyone.
My name is Lynne Salameh, and I'd like to welcome you
to Advanced Performance Analysis with Instruments.
Before we start, let's take a quick look at today's agenda.
As you can see, we've got a lot of ground to cover, because
in today's session we're going to be showing you how
to use efficient and advanced profiling
tools such as Time Profiling and Data Mining,
in order to find out where your
code is spending most of its time.
We're also going to be introducing new Instrumentation
that will allow you to understand the power characteristics
of your iPhone applications and understand how your
applications interact with the system as a whole.
So let's start off, let's talk about Time Profiler.
Time Profiling is a statistical sampling
mechanism that was first pioneered by Shark,
and today in Instruments it is the most efficient
sampling mechanism for finding out where your iPhone
and Mac applications are spending most of their time.
How does Time Profiler work?
Time Profiler looks at the running threads in your code,
and it basically takes samples at regular time intervals.
And shown here are those backtraces that
are collected when those samples occur.
So these are example backtraces and we'll be
taking a look at them in more detail later.
But one thing to note here is Time Profiler also
collects backtraces from your kernel and drivers in order
to show you the time spent in that code as well.
So Instruments takes these backtraces and constructs what is
called the call tree in order to simplify and make it easier
to understand what's going on in your code and make it
easier to find where you're spending most of your time.
So I'm going to show you how Instruments
constructs an inverted call tree,
which means the deepest stack frame is going
to be at the root of the tree, and we're going
to work backwards and view our call history in reverse.
So to start off, let's assume that we're spending
1 millisecond in each of these backtraces.
So we see that we're spending 2 milliseconds in the
frame interpolate, and that's our deepest frame.
And then we can work backwards: in acquireImage
we're spending 4 milliseconds.
But 2 of those milliseconds go ahead and call
interpolate, so we have 2 different nodes in our tree,
and so forth with drawImage, and then copyImage,
which is actually a frame occurring
in our code, the Mipmap application.
And then finally main, which is our entry point.
So this is how Instruments combines
backtraces to construct an inverted call tree.
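The merge just described can be sketched in a few lines. This is an illustrative sketch, not Instruments' actual implementation; the frame names and the 1-millisecond-per-sample figure follow the slide example, and the data layout is assumed:

```python
def build_inverted_tree(backtraces):
    """Merge sampled backtraces into an inverted call tree.

    Each backtrace is ordered from the entry point (main) down to the
    deepest frame; each sample is assumed to cost 1 ms, as on the slides.
    The tree is nested dicts: {frame: (milliseconds, children)}.
    """
    root = {}
    for trace in backtraces:
        node = root
        # Walk each stack from the deepest frame back toward main,
        # so the deepest frame ends up at the root of the tree.
        for frame in reversed(trace):
            ms, children = node.setdefault(frame, (0, {}))
            node[frame] = (ms + 1, children)
            node = children
    return root

# Five 1 ms samples, loosely following the slide example.
samples = [
    ["main", "copyImage", "drawImage", "acquireImage", "interpolate"],
    ["main", "copyImage", "drawImage", "acquireImage", "interpolate"],
    ["main", "copyImage", "drawImage", "acquireImage"],
    ["main", "copyImage", "drawImage", "acquireImage"],
    ["main", "copyImage", "drawImage"],
]
tree = build_inverted_tree(samples)
print(tree["interpolate"][0])   # 2 ms where interpolate was the deepest frame
print(tree["acquireImage"][0])  # 2 ms where acquireImage was the deepest frame
# The other 2 ms of acquireImage sit under interpolate, a separate node:
print(tree["interpolate"][1]["acquireImage"][0])  # 2
```

Note how acquireImage's 4 total milliseconds are split across two nodes, exactly as in the slide: 2 ms as a root of its own, and 2 ms as a caller of interpolate.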
But the way that you view these in Instruments,
or how Instruments actually displays
these, is using an outline view.
So if we take the same call tree over here, it's
actually shown in an outline format,
shown to the right in Instruments.
All right.
So really you should use Time Profiler for
your sampling needs because of 3 reasons.
First of all, the profiling is
performed from within the kernel.
Which means Time Profiler has very low impact
on the applications you're trying to look at,
and it really collects much more precise timing
information about the applications you're trying to target.
Secondly, you can use Time Profiler to look
at not only the threads that are running
in your applications, but also to target all thread states.
And what this means is that you can also
find where your threads are waiting
in the code that you've written,
in addition to when they're running.
And finally, you can use Time Profiler to
target all the processes on your system
and also to collect kernel backtraces
when you're looking at Mac applications.
So we talked about Time Profiler in
relation to overhead when recording.
And let's actually talk about overhead when we're
dealing with the recording styles that Instruments uses.
So the traditional recording mode of
Instruments is called the immediate mode.
And this is very useful because it allows
you to correlate events that you perform
in your applications,
like, for example, clicking a button,
with events that you see on a track,
like a spike on the track or a change
in the tables while you're recording.
Unfortunately, because Instruments is actually using the
same CPU that your app is executing on,
it's really consuming some of the time
that could be used by your application.
So this is what we call the observer effect,
where the timing information that we collect
is not as precise as it could be.
So new in Instruments as part of the iPhone 4
SDK, we've introduced the deferred recording mode.
In the deferred mode, Instruments disables your UI, disables
the track and the detail views while you're recording,
and it defers all of the data processing
until you basically hit Stop Recording.
And what this does is it minimizes the
observer effect, which is what we want.
So how do you access the deferred mode?
You access it from the file menu in Instruments, and
you go to Record Options and select Deferred Display.
After you do that, if you hit Start Recording,
Instruments will immediately disable the UI and gray it
out like you see here, and you
start recording in deferred mode.
And the UI will come up once you hit Stop Recording.
All right.
So also new in Instruments as part of the iPhone 4
SDK, we've allowed Time Profiler
to collect kernel backtraces for
applications running on the Mac.
So this allows you to peek under the covers, so
to speak, of what your applications are doing.
For example, if you see that your code is making
a lot of system calls and you're spending a lot
of time making system calls, only when you uncover the
kernel backtraces can you see why these system calls are
actually costly and what they're doing.
And this is very useful for optimization of
drivers and kernel extensions that you might write.
So, for example, let's take a look at this
stack trace we have here on the right.
We can see that, you know, the lowest
frame we have is performing a memcpy.
And only when we uncover the kernel
stack traces can we see that some
of these memcpys do incur virtual memory faults,
where the memory manager pages in a
page for memcpy to use.
We've also unified the backtraces in the
call tree view, so that the top frames
in this stack trace are kernel frames
and the lower ones are user frames,
to make it easier for you to
view your code in a unified fashion.
Additional backtrace improvements, or backtrace viewing
improvements, is the introduction of stack compression.
And as you can see here, I can just move the slider
to simplify the stack traces that I see
in the extended detail view, in order to
collapse them along framework boundaries and collapse them
down to just the code executing in my application.
So we talked about improvements to stack viewing.
We've also improved the way you view your
source code by introducing the disassembly view.
The disassembly view shows you the instructions that
your code is using for the ARM and Intel architectures.
And in this case, using the disassembly view can
be useful, for example, when you have a bunch
of nested function calls, and you see
that your code is spending a lot of time
on the line with a nested function call.
Only when you navigate to the
disassembly view can you figure out precisely
which function is actually consuming all the time.
The disassembly view can also show you the effectiveness
of your template expansions and your inlined code.
And finally, when every instruction counts,
and you're trying to optimize tight loops,
the disassembly view lets you do optimization
on the smallest level of granularity.
You can access the disassembly view via 2 methods.
So this is the source view that
we know and love in Instruments.
And you can see that it's annotated with some
profiling information from Time Profiler.
If you actually select the Information button over here,
Instruments will pull up the instructions corresponding
to the code that you see in the source view.
You can also access the disassembly view by
selecting the top right button in the corner.
And it will toggle between the
source views and disassembly views.
And as you can see, my disassembly
actually has my source annotated as well
with the profiling information from Time Profiler.
Now new in Instruments as part of Xcode 4 is
the availability of the documentation for the
ARM instruction set,
which you can access by selecting this gear
and selecting Show ISA
Reference Guide, which will then take you
to the documentation for these instructions.
All right, so we've talked about Time Profiler.
Let's show you Time Profiler in action.
I'm going to switch over to the demo machine.
And here I have Instruments.
And the first thing you see is a template chooser over here.
So if I zoom in I can see that over here on
the left-hand side I have several categories.
And I've selected the Mac categories.
And I'm going to select the Time Profiler template to add
the Time Profiler Instrument to my trace document.
I'm also going to target an application that you might have
seen during the State of the Union demos called Mipmap.
And I'm going to hit record to bring up that application.
So Mipmap basically takes a bunch of images
and renders thumbnails of those images onto the screen.
And then it continues to create scaled down
versions of those images, which are called mipmaps.
And what I'm going to do is I'm going to start recording.
And you can see Instruments actually
collecting information in the background.
And I'm going to try to scroll while
I'm trying to render these thumbnails.
And as you can see my application is
actually hung, it starts spinning.
The animations are choppy.
It's not really doing what I want it to do.
And I really want to figure out
what's going on in this case.
So it's finished processing the images, and I can
just switch back to Instruments and hit Stop Recording
and start looking at the data I've collected.
So the first thing I'm going to draw attention
to is the configuration view over here.
This basically controls the settings
for your backtraces and your call trees.
And in this case I've selected Invert Call Tree,
so that I can see my call tree in inverted fashion.
I'm also going to select Separate by Threads.
And over here to the right you see that I'm spending a
lot of time, about 16 seconds, in this frame right here,
resample_byte_h_4cpp_vector,
which was happening in Core Graphics.
This is not really my code.
And neither is this where I'm spending
about 8 seconds, decode MCU optimize.
So let's take a look at, let's investigate this further.
I'm going to zoom out and bring in the extended
detail view from the top right corner over here.
It pulls out this drawer.
And the first thing I see when I look at the stack
trace, if I zoom in again, is this code the resample
by 84CPP Vector is actually
getting called from my code in ScaleImage.
I'm kind of expecting that; I'm spending
a lot of time scaling my images, in this case,
on the separate threads that I'm using to do this process.
The next frame down, what I see here is
that a whole bunch of code is going on.
Let me zoom in again.
Sorry. There we go.
There's a whole bunch of code that's
doing image decompression.
And this is happening in frameworks that are not my code.
And if we look at this part right here, it's
trying to acquire the image to draw it.
If we scroll back up, we're seeing that what we're
doing here is attempting to do a lot of work to create
and decompress images when we're
in the thumbnail view's drawRect:.
We're trying to render the thumbnails.
And if we actually scroll all the way down and zoom
in on this backtrace, we can see that we're doing a lot
of this work, this decompression
of images, on the main thread.
In fact, I'm going to go ahead and select Uninvert
Call Tree to see the call stacks in a top-down manner.
And you can see that in this case,
oh, let me zoom in again, I'm sorry.
You know, I have all these threads
that are supposed to be doing the image creation,
the thumbnail creation and initialization, but I'm
actually spending all of my time on the main thread.
And there is a disproportionate amount of time
spent there in comparison with the other threads.
And this thumbnail view drawRect: is
appearing as the heaviest stack trace on my main thread.
So let me take a look at my code
and see what I'm doing there.
I'm going to bring up Xcode.
And this method right here, let me actually
scroll up to it, the startPThreads method,
is doing all the work.
It's creating the thumbnails and going ahead and
creating the scaled-down versions, the mipmaps.
And what I had assumed when I first wrote this code
was that since this part of the code is executing
on a secondary thread, my call to NSImage's alloc
and initWithContentsOfURL:, to create the thumbnails,
would actually do all the work of
initializing the images.
But in fact, Time Profiler has told
me that all this work was getting done
on the main thread; this call is actually lazy.
It doesn't do any work until it
actually needs to render the images.
So to optimize this, I'm actually going to replace this
call with code that I know takes the initialization stage
and actually does it on the thread
that I intended it to run on.
So I'm going to delete these two lines of code.
And I'm going to bring in some new ones.
All right.
So instead of using the original
call to create the thumbnails,
I scale my image down using scaleImageClipRectForResizeMode,
and then create the thumbnails
over here with initWithCGImage:,
and then assign them to mipmaps.thumbnail
on the main thread.
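The pattern behind this fix, doing the expensive decode eagerly on the worker thread and handing only the finished result to the main thread, can be sketched generically. A minimal Python analogy follows, with a queue standing in for dispatching work to the main thread; all names here are hypothetical, not the Mipmap code:

```python
import queue
import threading

main_thread_work = queue.Queue()  # stands in for the main thread's run loop

def expensive_decode(url):
    # Placeholder for the costly decompress-and-scale step that the
    # lazy initializer was deferring until draw time.
    return "bitmap(%s)" % url

def worker(urls):
    for url in urls:
        # Do the expensive work eagerly HERE, on the secondary thread.
        bitmap = expensive_decode(url)
        # Hand only the cheap assignment back to the main thread.
        main_thread_work.put(bitmap)

t = threading.Thread(target=worker, args=(["a.jpg", "b.jpg"],))
t.start()
t.join()
print(main_thread_work.get())  # bitmap(a.jpg)
```

The bug in the original demo code was that the "work" enqueued on the worker thread was itself lazy, so the decode silently migrated to the main thread at draw time.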
So hopefully this should do what I expect it to do.
I'm going to save, hit build, and I'm going to verify that
this change actually improved my performance by going back
to Time Profiler and running Time Profiler again.
So here's Time Profiler.
So since I've already selected mipmap
I'm just going to hit Record again.
And it'll bring back my mipmap application.
And now I'm going to hit Start.
I'm going to scroll.
It seems smooth.
It's actually running through without any
choppiness, and it's done in 5 seconds
as opposed to 11 when we first started off.
So to verify this actually did
change what we wanted it to change,
I'm going to go back to Instruments, hit Stop Recording.
And if we zoom in actually you can see that when
we were originally spending about 11 seconds
on the main thread trying to initialize these
images, we're now spending about 3 seconds.
And the time is now more evenly distributed
amongst the threads that are creating
the thumbnails and creating the mipmaps.
All right, so this is how you use Time Profiler to
find where your code is spending most of its time,
and then try to optimize it.
And now I'm going to switch back to
the slides and talk about Data Mining.
So one thing we saw during our demo was
that we were looking at a lot of frames
and stacks that were not coming from our code.
There were, you know, operations that
were getting performed in Core Graphics,
all these image-rendering methods
that we have no control over.
So I'm going to tell you about a very useful tool
that allows you to charge the time that you spent
in those libraries and in those frames to your code,
because they're doing work on your behalf.
And this is called Data Mining.
So Data Mining involves a bunch of operations.
And the first of these is charging a symbol to its caller.
For example, in this case we have an example call tree,
and we have some example times shown on that call tree.
If I charge acquireImage to its
caller, I take the time spent there,
the 98 milliseconds, and attribute it to its callers.
And I can then combine the far ends of my backtraces
together to see that I spent 105 milliseconds
in drawImage in that part of the call tree.
In fact, you might find it useful to
charge an entire library to its callers.
And in this case, if I charge
Core Graphics to its callers,
I take all the time spent there and
attribute it to its calling frames.
And once again, I can combine the
furthest branches of the tree to see
that I'm spending 139 milliseconds in acquireImage.
A lot of times you might want to take
away the time spent in a certain region
of your code from the rest of your call tree.
For example, if you're not interested in that
region, because you can't really optimize it further.
So this is pruning a symbol in its subtree.
And what you do, if for example we prune away powerOf2Image,
is take the time spent in powerOf2Image and all the frames
that it called, and remove it from our tree.
Which means that instead of 98 milliseconds spent in
interpolate, we're now spending 76 milliseconds there.
And finally, you might find some cases where
your code is making a lot of calls into frames
that are happening in a different library.
And there's a lot of these frames.
Then it calls back into your code.
And really you only care about
the boundaries, where you know,
the transition happens from your code,
into that library, back to your code.
In this case, you should flatten
the library to its boundaries.
So if we do that in this case with Core
Graphics, we just remove the extra frames
and keep the ones that occur at the transition points.
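These Data Mining operations, charging a symbol to its callers and pruning a subtree, can be sketched on a toy call tree. This is an illustrative model, not Instruments' internals; the frame names and times loosely follow the slide examples:

```python
def frame(ms, children=None):
    """A call tree node: self time in ms plus named children."""
    return {"self": ms, "children": children or {}}

def charge_to_callers(node, symbol):
    """Charge `symbol` to its callers: its self time moves to the calling
    frame, and its callees are adopted by that caller. Assumes the adopted
    callees don't collide with existing children (fine for this toy tree)."""
    for name in list(node["children"]):
        charge_to_callers(node["children"][name], symbol)
    if symbol in node["children"]:
        charged = node["children"].pop(symbol)
        node["self"] += charged["self"]
        node["children"].update(charged["children"])

def prune(node, symbol):
    """Remove `symbol` and its whole subtree from the tree."""
    node["children"].pop(symbol, None)
    for child in node["children"].values():
        prune(child, symbol)

def total(node):
    """Total time of a subtree: self time plus all descendants."""
    return node["self"] + sum(total(c) for c in node["children"].values())

# main(1 ms) -> drawImage(5 ms) -> {acquireImage(98 ms) -> interpolate(22 ms),
#                                   powerOf2Image(22 ms)}
tree = frame(1, {"drawImage": frame(5, {
    "acquireImage": frame(98, {"interpolate": frame(22)}),
    "powerOf2Image": frame(22),
})})

charge_to_callers(tree, "acquireImage")
# drawImage now carries acquireImage's 98 ms and adopts interpolate.
print(tree["children"]["drawImage"]["self"])  # 103
prune(tree, "powerOf2Image")
print(total(tree))  # 126
```

Charging never changes the total time in the tree, it only moves attribution; pruning actually removes time, which is why the totals shrink after the prune.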
So that basically tells you how to use Data Mining.
And I recommend you use it to simplify
your backtraces and find out precisely
where in your code you're spending most of your time.
And with that I'd like to introduce my colleague Chad Woolf.
>> Chad: All right.
Good morning everyone.
My name is Chad Woolf as Lynne mentioned.
I work with Lynne on the Instruments team.
And this morning I want to talk to you guys
about the new iOS 4 and how it uses power.
Now it's hard to talk about mobile devices these
days and not talk about the topic of power.
And this is because battery life is such an
important part of the user experience these days.
And the way that we get the most battery life on
our devices is to use as little power as we can.
Now in order for you guys to help us do that,
we need to give you both a way to measure power
and a way to determine where that power
is being spent in the device.
So I'm happy to announce in the iPhone SDK 4
we have a new suite of power analysis tools.
We have within Instruments an Energy Usage Instrument,
which actually measures energy usage directly
from the hardware in your mobile device.
This comes from as close as we can get to the battery,
and it measures the energy flow
from the battery into the device.
Now our next Instrument is a CPU Activity Instrument,
which measures the CPU activity in a given interval.
We have a new Sleep-Wake Instrument,
which records our sleep and wake events.
And we have a Display Brightness Instrument, which will
record when the display has come on and when it's gone off,
and we record the user setting for
the display brightness there as well.
We also have a few radio Instruments, which monitor
the states of the radios in the iPhone and the iPod.
And we've taken all those Instruments and we've
put them into a single energy diagnostics template,
which will create a single document for
you with all these Instruments populated.
Now one of the things you'll notice when you start
doing power measurement on a mobile device is
that mobile devices tend to function differently when
they're at home versus when they're in the field.
Now in this context what I mean by home
is really connected via the USB cable
to your host machine in your development environment.
Now when you connect that device up to your host, a
couple of things happen differently in the device.
First, the sleep and wake is disabled, so you're
not going to see sleep and wake events occurring.
Energy is actually flowing from the
host into the battery to charge it.
So any of the measurements taken at the battery
to determine the energy are going to
be fundamentally affected by that.
And finally there's a little bit of data that flows
between the device and the host when they're connected,
and that can generate some activity in
the device that you would normally not see
if you were disconnected and in the field.
Now when you do take your device into the field and you
test it in the domain that it's going to be tested in,
if it's a social networking application maybe a cafe, or
if it's a turn-by-turn direction system out, you know,
maybe in your car, you're going to notice that you're
entering into a more diverse radio environment.
And what I mean by that is that as you move with
your device from cell-to-cell on a cellular network,
you're going to get different levels
of power performance for your radios.
As you move within range of different Wi-Fi hotspots,
some of those you will be able to make connections to,
and some of those you will not.
And that will affect how your device is using power as well.
And then furthermore, if you are developing
something that uses Core Location,
the best way to test your Core Location app is to
actually take the device and move it from place to place,
so you can see those events coming in and
how they interact in your application.
So in the iPhone 4 SDK we have a new way
of recording while you're in the field,
and that's actually being able to
turn on power logging on the device.
So when you have your device and you want to do the testing
of your application, you can go to the Settings application.
We have a new developer settings subsection
there, and a switch for power logging.
What you'd do is you'd turn that logging switch
on and that enables the power logging service.
And the power logging service can
record data all day if you like.
It's a very efficient mechanism.
And then at the end of the day you can turn off
your power logging, go back to your host machine,
connect it up to Instruments, and pull all of that data
down and analyze it in the Instruments application.
Now of course we always support the standard
way of doing recording in Instruments,
and that's to have the device connected to your host.
And push the Record button in
Instruments, and then you can see the data
as it comes in and as you interact with the device.
Now let's talk about that first Instrument.
The first Instrument's the Energy Usage Instrument.
And as I mentioned that is a true power measurement.
We're actually going right to the hardware as close
as we can to the battery to get these measurements.
Now the Instrument is taking the
measurements it's receiving,
and it's converting them to an energy usage level between 1 and 20.
Low numbers mean lower energy usage, which
means you're going to get longer battery life.
And higher numbers mean that you're at a higher energy
usage and you're going to have shorter battery life.
So how short or how long?
Well if you're able to sustain an energy
level of 1, it would take over 20 hours
to fully discharge a fully charged battery.
And if you're able to sustain an energy usage level of 20,
then you'll be able to discharge a fully
charged battery in a little less than an hour.
If you're somewhere in the middle,
let's say an energy usage of about 10,
that's about 10 hours of projected battery life.
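Those three data points suggest a roughly linear relationship between the energy usage level and projected battery life. The sketch below is my own interpolation for illustration, not Apple's actual model:

```python
def projected_battery_hours(energy_level):
    """Project battery life from the 1-20 energy usage level.

    A hypothetical linear fit through the figures quoted in the session:
    level 1 lasts about 20 hours, level 20 about 1 hour, and level 10
    lands near 10 hours. Illustration only, not Apple's actual model.
    """
    if not 1 <= energy_level <= 20:
        raise ValueError("the energy usage level is reported from 1 to 20")
    # Interpolate linearly between (level 1, 20 h) and (level 20, 1 h).
    return 20.0 - (energy_level - 1) * (20.0 - 1.0) / (20 - 1)

print(projected_battery_hours(1))   # 20.0
print(projected_battery_hours(20))  # 1.0
print(projected_battery_hours(10))  # 11.0, close to the 10 hours quoted
```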
Now because this is being measured from the hardware,
the sensors are currently only in the iPhone 3GS,
the iPhone 4, and the iPod touch second and third generations.
And if you have a device that's not on this
list, the energy usage Instrument will show zero,
but the rest of the Instrumentation will
be working just the way it normally would.
So let's talk about the second Instrument in there.
That's the CPU activity Instrument.
Now the CPU Activity Instrument is like Activity
Monitor, but it's a much more compact, lightweight version.
CPU activity actually ends up being a significant portion of
your power profile and where you spend a lot of your energy.
And that's because power usage from the CPU subsystem
scales with how much work you're actually giving the CPU.
So the more work you give it, and the more activity
that you see in the tools, the more power
it takes to drive that.
So within the Instrument we measure the total activity,
and that's the amount of time the CPU is
active in a given window.
But we also break it down into the
current foreground applications activity.
We record that.
And we also record the graphics and render
server, and the code that's executing in there,
and the audio/video server, otherwise
known as mediaserverd, which is listed
in the audio processing column here in the Instrument.
Now, the reason that we record these and break
these down in this fashion is because the workload
of your application tends to move around.
If you're using Core Animation, for example,
your foreground application activity
may show a very small number, 1 or 2%.
But you may see the graphics column showing 20 or 30%.
And this occurs when Core Animation has sent a lot of the
compositing work over to the graphics and render server.
Now there's also another service
for playing movies and audio,
and that's accessed via AV Foundation,
the Media Player framework, and OpenAL.
And if your applications are doing
work with that type of media,
you'll see the audio processing column
show up with some activity as well.
So in addition to the CPU activity, this
Instrument also tracks application state changes.
And it puts flags on the Instruments track view, so you
can see when an application has moved to the foreground.
When it's moved to the background, and when it's
in the background, if it's become suspended.
And also there's a flag for foreground obscured.
And that's what happens when your foreground application
is running and then the lock screen comes down
and prevents the user from further interacting with it.
Now moving on, we have 3 radio Instruments.
I have 2 on this slide, which are similar and
they're related, so I'll talk about those first.
The Bluetooth and Wi-Fi Instruments are actually
tracking the user preference for these radios.
And it's pretty important to understand,
when you're recording a trace
and comparing two traces, what the
current state of your radios was.
So if the user had turned on Bluetooth, your device would be
acting differently than it would be if that were disabled.
For example, with Bluetooth enabled, you
could be communicating with external devices.
With Wi-Fi enabled, a couple of interesting things happen.
Core Location is able to use the Wi-Fi hotspots
that are around you for a more efficient form
of location information, so it
doesn't have to turn the GPS on.
And if you're in a Wi-Fi hotspot that you're able to connect
to, you could be sending your data over that Wi-Fi hotspot
in a more efficient method than
using the cellular technology.
Now if it's disabled, what you're
doing is you're forcing all of your data
from your application to travel over that cellular network.
So if you're optimizing applications for data transfers
over the cellular network, then you should turn off Wi-Fi
and you'll see that appear in the Instrument.
Now the third radio is the GPS.
I want to talk about this one a little
bit differently, because your application
and how you use Core Location can actually
drive whether the GPS is on or off.
Now when it's enabled, it is one of the higher power radios
on the iPhone, but it is needed for a high resolution
and high accuracy when you're using Core Location.
So when you're using Core Location and you have an instance
of CLLocationManager, if you set the desiredAccuracy
to a lower value, so that you're not asking for
meter accuracy but maybe kilometer accuracy,
then Core Location can bypass turning on the GPS and
potentially use the cellular technology or the Wi-Fi.
The same goes for the distanceFilter.
The distanceFilter tells Core Location how often to
update your application as you move from place to place.
If you set that appropriately for your application
you might not need to turn on GPS as well.
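The rule of thumb here, that a looser desiredAccuracy lets Core Location satisfy requests without powering the GPS, can be modeled as choosing the cheapest provider that meets the request. A toy sketch; the thresholds and provider list are hypothetical illustrations, not Core Location's real internals:

```python
# Cheapest-first list of location providers and the accuracy (in meters)
# each can deliver. Hypothetical values for illustration only.
PROVIDERS = [
    ("cell", 1000.0),  # roughly kilometer accuracy, cheapest
    ("wifi", 100.0),   # hotspot-based, still avoids the GPS
    ("gps", 10.0),     # most accurate, but the highest-power radio
]

def pick_provider(desired_accuracy_m):
    """Return the cheapest provider that meets the desired accuracy."""
    for name, accuracy in PROVIDERS:
        if accuracy <= desired_accuracy_m:
            return name
    return "gps"  # nothing cheaper is good enough

print(pick_provider(1000.0))  # cell: kilometer accuracy skips the GPS
print(pick_provider(10.0))    # gps: meter-level accuracy needs it
```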
And the last and most important APIs that you
guys will be interested in on the Core Location side,
on the CLLocationManager object, are
stopUpdatingLocation and startUpdatingLocation.
As your user moves from the screen that shows
the location information you're trying to look at
to a screen that doesn't, let's say a web page,
or a text view, or something like that,
you should call stopUpdatingLocation.
At that point you will actually
disable the GPS if it's turned on.
And in another Instrument here, we're recording brightness.
Now, we're recording the user-selected brightness level.
And when that Instrument shows
zero, that means the display is off.
The backlight does consume quite a bit of energy.
So when you see that Instrument showing a
positive value you'll know that a certain portion
of your energy is going to be due to the brightness.
And as you go from trace to trace you want to make sure
that the users are using the same brightness, roughly,
so that you can have comparable power traces.
Now what's important about this Instrument
is that when that screen goes off,
you know that the user is no
longer interacting with your device.
And this can happen for two reasons.
One, they hit the Sleep/Wake button on the top of the
phone to turn the display off and put it in their pocket.
Or they're not physically interacting with the device,
and it's gone to idle dim and then turned the screen off.
Now the last Instrument I want to talk
about here is the Sleep/Wake Instrument.
And this is one of our most important
Instruments, especially today in the world
of the multitasking API's and background applications.
The Sleep/Wake Instrument is going to
help you get the most out of your battery life.
And the reason is because sleep is the most efficient
mode of the phone, and where it spends most of its time.
So in a healthy situation here the sleep should
occur about 15 seconds after the screen turns off.
The phone will finish all of its work
and then drop down into a sleep mode.
Now if you have a background application that continues
to run, that'll obviously push that sleep out.
Now with local notifications and push
notifications and location updates,
all of these technologies will cause the device to wake up.
So if you're using multitasking APIs and
background notifications,
you're going to want to watch that Sleep/Wake Instrument
very carefully to see if those notifications are waking
up your device more than they need to.
All right, so I want to show you
all of this in a single demo here.
So what I want to show you is turning
on the logging on an actual device.
I want to show you how to import a trace into Instruments.
And then I want to show you, with a trace
I've already recorded, a couple of things.
In this trace I sent myself an
email, so that generated a push notification.
Then I watched a little bit of video on YouTube,
and then I used a few applications that use the GPS.
So I want to show you how to find all
of these events in an Instruments trace.
Let me start with a device here.
Now you'll notice here that the device is disconnected.
I don't have this connected.
This is how you're going to want
to turn this on in the field.
So let's turn on our phone.
Now here we have our settings application, that's
where you're going to find the developer settings.
And if you scroll down here towards the bottom,
you're going to see the developer setting.
Now if you get to the point where you don't see the
developer setting, there are two possibilities for that.
The first is that the device hasn't been set up for
development purposes, and that's done through Xcode.
You usually do that once.
But the other possibility is that you've rebooted the phone
and you haven't yet connected it to Xcode or Instruments.
And if that's the case then you won't see the
developer panel until you first connect it up again
from the reboot into Instruments or Xcode.
Now this is due to a failsafe that
we have in developer tools.
If anything were to go wrong while
you're in the development mode,
simply resetting your device will take all the
development tools off of your device, as well as the data.
So you have to be careful when you're recording information
here that you don't run the battery completely dry,
because when you reboot it, unfortunately,
all the data will be gone.
So the normal mode here would be to go
to the developer panel, flip logging on.
So there you go, on.
And then we can turn it off.
Now when you turn it on and off, you're
going to be creating a new session.
And so once you turn it on again,
you're going to end up removing that last session.
So be aware of that.
And be careful when you're flipping that switch
back and forth because you may lose your data.
So now what I'm going to do is instead of sitting here
for 20 minutes recording power while I talk to you guys,
I have another device, which I'm going to connect
up, which has the power trace already on it here.
So the visual demonstration of me connecting the device.
[ applause ]
>> Chad: All right.
So let's move over to our demo machine.
[ silence ]
>> Chad: Okay.
So I'm going to start a new Instruments document here.
In the iPhone section of the template chooser.
Try and zoom in on here for you.
If you scroll down we have a new,
here's the energy diagnostics template.
Start with a new document here.
We have all of our Instruments populated.
Now the way you choose which device you want
to import the document from is the
standard target chooser.
That's our iPhone.
Now to import the data under the file menu we
have an Import Energy Diagnostics From Device.
We'll do that and the data will come.
Now during this demonstration as I navigate the trace,
I'm going to be using this track scale control down here.
I don't know if you guys have had a chance to use that.
But this allows you to zoom out, so you
can see all of your data on the same track,
or on the same screen, and zoom in for more detail.
So I'll be playing with that, if you see
this scale changing that's what I'm doing.
So now I promised you that I'd show you a couple
of things, the push notification to start with.
So what does a push notification look like?
Well, I can tell you the push notification is
actually right here.
I'm going to scroll that over, so you can see Instruments
with the actual Instrument name next to it here.
There we go.
Now this is actually the push notification.
The reason I know this is a push notification is
because I looked at our Sleep/Wake Instrument,
and this dark band means the device is asleep.
The light band means the device woke up and then
it went to sleep again some short time after that.
Now, the Display Brightness Instrument is telling me
that the display did not come on for that wake up.
So that's indicative of a push notification
or some other type of notification going on.
Now since I sent myself a mail, when you get a mail on
the phone, you have that friendly mail tone that plays
to tell you that you have an email waiting.
Maybe it's not friendly for you to receive too much email.
But if we go to the Activity Instrument
we can kind of confirm that.
So if I move my inspection head to this block of activity
here that I see, the inspection head will line up.
Let's see if I can get it all on one
screen here, maybe not for you guys.
But you might notice that when I move the
inspection head it will move to the line
in the table view of where that activity is.
So if we look at that line in the activity, we will
see that we had 13% activity during that interval.
And sure enough the audio processing went to 4.3%.
So that is our push notification.
Now next thing I said is that I had watched a YouTube video.
Now how do I find that in all of this data?
Well, we have these flags on the top
here that record the interesting events.
And this one's coming from the Sleep/Wake Instrument
telling me that the device has gone through a running state.
And I can navigate the flags here with these arrows.
We'll see that Mail came to the
foreground when I checked that mail I sent.
Then it goes into the background.
And ah, YouTube is started.
So now I see that YouTube is the
foreground running application.
And I can move to the next one and see that where it goes
into the background and then becomes eventually suspended.
So I know in this window of activity is my YouTube activity.
Try to fit all that in here.
Scale out.
OK. So now let's take a look at some of that activity.
If we look at the Activity Instrument again here, we'll
notice what I was talking about the workload moving around.
And this is what I wanted to show you guys.
If you look here we have 30% activity
while we're playing the video.
The foreground app is actually doing .4%.
That's the YouTube application itself.
It's doing very little work.
Basically just maintaining the controls.
Now the bulk of the work is being done in the audio
processing daemon, and that's also the media daemon.
So as it's playing the video and running through the
codecs, that workload occurs in the audio processing.
And we're doing a little bit of work in compositing in
the graphics process, the graphics and render server.
So as I move out here, now another
interesting thing I wanted to show you
about this power trace is, you might
be able to see from the monitor here
that the activity level has been fairly consistent.
But I'm going to move along the
energy trace up here at the top.
And if you guys want to take a look at that tool
tip that shows the energy level at 13 out of 20.
As I move along here at the beginning of this trace I
see it's 13, 14, 13, 14, and we continue to move on.
Now we move towards the end of
the trace it starts to go down.
We're actually using less power here.
Now we can speculate that that's due to the
way that we watch video on the Internet.
And the way that we do that is we start
watching the video while we're buffering.
So as we're buffering the end of
that movie, we're watching the video.
So you're going to have double activity during that period,
now that activity's mostly occurring over the radios.
And when I reach about the halfway point in my movie,
the radios are going to stop being active
and you're going to see that energy come down.
Now the other thing I told you is that we're
going to have some GPS activity in this trace.
If I zoom out you'll see on this bottom band, which is the
GPS that we do have some sporadic activity happening here.
Now the question is why.
Why is that occurring?
Well, we'll use the flags again to do some
navigation in here around where the GPS is.
I'll zoom in a bit.
And we'll look at these flags.
And the flag that's closest to the GPS
turning on is Maps coming into the foreground.
Now this makes a lot of sense, because the Maps application,
at least the way I was using it was
doing constant location updating.
So you can see for the entire duration that
Maps was in the foreground that GPS was active.
Now there's some GPS activity a little bit later,
so let's continue to look through our flags here
to find out what might have triggered that.
And we see that the next event that
happened closest to the GPS turning
on is the camera application came to the foreground.
And for the duration of the camera application as it was
recording a little bit of video, the GPS came on and off.
And this is due to the feature in the
camera application for geotagging.
It uses Core Location, which uses the GPS to find out
your location, so it can associate that with the video.
So this is a full power trace; it's about 18 minutes long.
You want to record traces of similar length so
you can get a significant amount of data to see
how well you're doing and how your
applications are interacting with the device.
So I'm going to go back to slides.
Here we go.
Ah, demo. OK, so now one of the things
I want to talk about before I move on is
that when you measure power the way we measure power,
which is very close to the battery,
it's both a blessing and a curse.
It's a blessing because we're getting a
measurement that shows where all the power is going.
There's nothing that's sneaking by the sensor.
And it's a curse in a way, because that's everything,
including a lot of things that you
don't have direct control over.
So when you want to see how your application is specifically
impacting the device, you may want to disable as many
of the features as you can on the device.
And a few recommendations here: mail fetching,
if you don't want the activity for mail fetching to occur;
push notifications, if your application's not using those;
and the auto-dimming on the display.
The ambient light sensor inside the device
will vary the power being sent to the display,
which could widely affect your measurements.
Especially if you're in an office
that's changing lighting conditions.
Furthermore, the radios can be disabled.
Wi-Fi, as I mentioned can be disabled.
But also there are settings in the phone where
you can turn off the 3G and data connection.
If you want to test the effect that
the data's having on your power usage.
And also you can disable the Bluetooth.
Now a smaller point here, but one I felt was still worth
mentioning, is that temperature does affect energy usage.
As the device warms up, the resistance internal
to the electronics is going to increase.
It's going to require more power to drive the device.
Up to a certain point, of course, where the device becomes
too warm and it starts throttling its power consumption back
to try to bring that temperature down to
a level that's comfortable for the user.
Now unless you're really driving the
device hard you're not going to see this.
But if you have some
difficulty explaining some of the measurements that you got,
you may want to consider the effect
that temperature has on the device.
So in summary, we have a new set of
power analysis tools in Instruments.
We have a new template, which takes all of those tools,
wraps them together in a nice single document for you.
Now the whole goal of this is to help you
guys extend the battery life for the end user.
And so after you've done optimization in the Time
Profiler and you have a nice crisp responsive display,
you might want to push even further and optimize
your application just a little bit more to try to reduce
that CPU activity that could be
contributing to a shorter battery life.
Now also what's new here is that we encourage
you guys to do the recording in the field.
This is sort of a new mode for Instruments.
In the past you had to have been connected to the device.
But in order to get the most accurate representation
of how your device uses power, you need to take that device
into the field and do your recording there.
And this is new on iPhone SDK 4.
All right.
Thanks. I hope you guys have fun using that.
It's been a lot of fun to develop.
Next we want to talk about something
completely different, and that's re-symbolication.
Now if you've taken a trace sometimes you might end
up with a series of what we call missing symbols.
If you see inside the trace numeric addresses rather
than the symbol names you're familiar with, like printf
and drawRect:, etc., you'll just see these addresses.
And that's basically Instruments telling you that it
doesn't have the information, the symbol information,
or rather the way to map addresses onto symbol names.
Now the way we used to fix this before the
iPhone SDK 4 is we would rebuild the application,
reinstall it on the device, and then rerun our test.
Now why did that work?
Well, the symbol information is actually
a byproduct of the build process,
and it stays on the host with your application binary.
And of course, what I mean by "stays on the host"
is that it actually does not get installed
on the device for testing.
So every time you get a new build, each
binary and its symbol information has a UUID,
or Universally Unique Identifier,
that shows how to match the two.
And when we get data from the device into Instruments,
the Instruments on the phone sends the UUID
of the app being profiled back to Instruments on the host.
And then Instruments on the host goes through the
process of locating the binary somewhere on your host
with your symbol information to try to find it.
Now if you've done a clean, then you've removed the binary
from the host and you've removed the symbol information
from the host, and you end up with
a lot of these missing symbols.
Additionally, if you rebuild the binary
and you don't reload it or take another sample,
then Instruments is still not going to
be able to find that symbol information,
because every time you build you get
a new UUID for that particular binary.
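To make that matching rule concrete, here's a small sketch in Python. This is purely illustrative, my own example rather than Instruments' actual code, and the UUIDs and paths are made up; but the lookup logic is the same idea: the UUID reported by the device must exactly match the UUID carried by a build's symbol information, so stale or rebuilt dSYMs simply never match.

```python
# Illustrative sketch only: Instruments' real lookup goes through
# Spotlight, but the matching rule is the same -- UUIDs must be
# identical between the profiled binary and the symbol information.

def find_symbols(reported_uuid, known_builds):
    """Return the dSYM path whose UUID matches the binary on the device.

    known_builds maps a UUID string to the path of that build's dSYM.
    A rebuild produces a new UUID, so an old dSYM never matches.
    """
    return known_builds.get(reported_uuid)  # None -> "missing symbols"

# Hypothetical UUIDs and paths, for illustration only.
builds = {
    "1B2M2Y8A": "~/builds/MyApp-42.dSYM",
    "9F8E7D6C": "~/builds/MyApp-43.dSYM",
}

print(find_symbols("9F8E7D6C", builds))  # the matching dSYM path
print(find_symbols("DEADBEEF", builds))  # no match: addresses stay numeric
```

If the lookup returns nothing, you see exactly the numeric-address situation described above, which is what the re-symbolicate feature lets you fix after the fact.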
Now for some of you out there this still
happens, even if you didn't do a clean.
And this could be due to the way that Instruments
looks for that symbol information based on the UUID.
It uses Spotlight to do that.
And Spotlight can be tricked in a few different ways.
First, if you've moved all of your binaries and symbol
information into a compressed archive, like a zip file,
then obviously Spotlight is not going to find it in there.
But also, Spotlight doesn't index
certain types of network shares.
So if you use a build server or a symbol
server, Instruments is not going to be able
to find your symbol information out there on the server.
Also, Spotlight can be intentionally disabled.
You can tell it not to look in certain places, and that
may have been where you decided to put your build folder.
And also if you put your build folder in
temp, temp is not indexed in Spotlight.
So all of those can contribute to
Instruments not finding that information,
and then you have a lot of missing symbols in your document.
But, in the iPhone SDK 4 we have
a new option under Instruments
to take a trace that you recorded and re-symbolicate it.
Symbolication is effectively where we take our
addresses and map them back to symbols.
If you go into the file menu you see
a re-symbolicate document feature.
In there you'll see this panel that drops down.
It's actually one panel, but I'm
showing you two tabs on there.
It allows you to locate the dSYM file.
And that's a file with a .dSYM extension
on it that usually gets built next
to your binary, and associated back with the binary.
So if you save those aside for your different
builds that's how you can reassociate them.
Also you can add a dSYM search path.
So if you are using a symbol server or build server,
you can put that path in of the network share in here,
and Instruments will, again, find
and locate your symbol information.
So once you do that you will see the happy, healthy
symbolication and symbol information that you're used to.
And you can get back to work solving your problem.
All right, that's all I have to tell you guys today.
But Lynne is going to come back up here
and talk to you about the Xcode 4 preview
and some of the neat new features, including system trace.
[ applause ]
>> Lynne: Thank you Chad.
So earlier in this session we saw how to use Time Profiler
to see where your threads were spending most of their time.
But it's also useful for you to know how your
threads interact with other threads on the system.
For example, you might want to know whether your threads are
getting preempted by threads trying to service an interrupt.
Or whether your threads are blocking because they're
exhibiting lock contention, and therefore serializing
when you don't want them to.
So, new in Instruments as part of the Xcode 4
developer preview, we've introduced system trace.
System trace records all the transitions
that your threads make into the kernel.
And these transitions can happen for three reasons.
First, system calls.
Second, virtual memory operations.
And third, thread context switches.
So before we go ahead and explain how Instruments
records this data and what it shows you,
let me go ahead and explain a couple of terms.
So as your thread runs over time, it is context switched
onto a CPU, or core, when it's scheduled on that CPU.
And it gets context switched out
when it's scheduled off of the CPU.
The time which it's on the CPU is called
a thread tenure, or a thread interval.
Now during the time a thread is on a CPU, it
can actually make a couple of system calls
or virtual memory operations, as shown here.
Now some of these system calls might be blocking,
and cause the thread to get context switched out.
And other virtual memory operations and system calls can
actually execute within the duration of the thread tenure.
So, system trace is a template with three Instruments.
The first of these Instruments is the Scheduling Instrument.
And it records when your threads are on the CPU,
that is, when the thread tenures are executing.
And it records why your threads get off
the CPU, for example, if they get blocked,
if they get preempted by another thread, or if they
explicitly yield their quanta to another thread.
And the scheduling Instrument records the thread priorities.
So really a good rule of thumb here is you want your threads
to run on their respective cores for as long as possible.
And to help you apply certain rules of thumb that you might
like to use when using system trace, new in Instruments
as part of the Xcode 4 developer preview, we've
introduced the trace highlights view.
This is a top-level statistics view that lets you
understand what's going on with your threads or processes.
In this case, it's showing me the top five processes
that are showing the largest amount of context switches.
So it really helps you understand what's
going on and apply the rules of thumb
that we're going to be talking about here.
So if your application requires a lot of throughput,
it makes sense to maximize the time your threads are
on the CPU, so maximize the thread tenures.
And if your app is looking for responsiveness, you want
to minimize the amount of time your thread is on the CPU,
so that the thread tenures are short: the thread
gets on the CPU, does the work it needs to do,
and then gets off to allow
other threads to actually execute.
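As a rough programmatic analogue of what the Scheduling Instrument counts, a process can read its own context switch totals through `getrusage`. This is my own quick sketch (it assumes a Unix-like system where Python's `resource` module is available), not a substitute for the Instrument itself:

```python
import resource

# ru_nvcsw counts voluntary context switches: the thread blocked or
# yielded its quantum and was scheduled off the CPU.
# ru_nivcsw counts involuntary ones: the scheduler preempted the
# thread, for example for a higher-priority thread or an interrupt.
usage = resource.getrusage(resource.RUSAGE_SELF)
voluntary = usage.ru_nvcsw
involuntary = usage.ru_nivcsw
print("voluntary:", voluntary, "involuntary:", involuntary)
```

A throughput-oriented workload would hope to see both numbers stay small relative to work done; a responsiveness-oriented one expects many short tenures, hence more voluntary switches.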
The second Instrument is the System Calls Instrument.
And the System Calls Instrument records all
the requests that you make into the kernel
so that it does work on your behalf and accesses resources.
So because system calls access resources, they can
actually incur a context switch; they can block.
And this might cause unwanted serialization
of your otherwise parallel code.
So the rule of thumb is to minimize the
number of system calls that you make
and to avoid using slow blocking
system calls that you might know about.
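To see why the number of system calls matters, here's a small sketch of my own (not from the session) that writes the same data two ways: one kernel crossing per chunk, versus batching in user space and making a single crossing. Each crossing is a chance to block and be context switched out:

```python
import os
import tempfile

chunks = [b"x" * 64 for _ in range(1000)]

# Unbatched: one write(2) per chunk -- 1000 trips into the kernel.
fd, path = tempfile.mkstemp()
unbatched_calls = 0
for c in chunks:
    os.write(fd, c)          # each call crosses into the kernel
    unbatched_calls += 1
os.close(fd)

# Batched: accumulate in user space, then a single write(2).
fd2, path2 = tempfile.mkstemp()
os.write(fd2, b"".join(chunks))
batched_calls = 1
os.close(fd2)

os.unlink(path)
os.unlink(path2)
print(unbatched_calls, "calls vs", batched_calls)
```

Same bytes end up on disk either way; the batched version just gives the System Calls Instrument a thousand fewer events to show you, and your threads a thousand fewer chances to block.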
And once again, you can use the trace highlights
view in order to find and see the top statistics
for your applications and threads
that are making system calls.
The third Instrument is the Virtual
Memory Operations Instrument.
And it records when your code accesses new pages of memory.
This can happen in three ways.
Instruments can record when
your code first accesses a new page of memory,
when the memory manager allocates a
new page in your code's address space.
Or if your application is using file-mapped
resident pages, Instruments is going to record
when the memory manager makes a copy of a
memory page into your process's address space.
And Instruments will also record page-ins and page-outs
from disks, which are the most expensive memory operations.
So as a result, you want to try to minimize
the amount of paging that your application does.
And to do that, generally minimizing the total memory
footprint of the application actually reduces the number
of virtual memory faults it incurs due to paging.
You might also want to reuse your memory allocations.
And if your application is time sensitive, it's
good to also preallocate memory and front-load all
of the memory faults to the initialization
stage of your application,
as opposed to during the application's
execution, which might be costly.
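One way to front-load those faults, sketched here in Python as an illustration of the idea (on the phone you'd do the equivalent in your own allocation code): allocate the working buffer once at startup and touch every page, so the faults land during initialization rather than during time-sensitive work.

```python
import mmap

PAGE = mmap.PAGESIZE

# Allocate the working buffer once, at initialization time.
# Anonymous memory like this is typically faulted in lazily,
# one page at a time, on first access.
buf = mmap.mmap(-1, 256 * PAGE)

# Touch one byte per page now, so every first-access fault
# happens up front instead of mid-execution.
for offset in range(0, len(buf), PAGE):
    buf[offset] = 0

# Later, reuse `buf` across passes instead of allocating
# fresh memory each time -- reused pages don't fault again.
```

The same pattern covers the "reuse your memory allocations" advice: a buffer that's already resident costs nothing to touch again, while a fresh allocation starts the faulting over.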
Right, with that let's go to a quick demo of system trace.
I'm going to bring up a demo machine over here.
And hit command New to bring up Instruments again.
I'm going to select the system trace template over here.
Double-click it.
And the first thing it does is it adds 3 Instruments onto
my trace: scheduling, system calls, and VM operations.
And by default it's targeted to profile all processes.
So for this demo I'm going to be using mipmaps again.
And I'm going to be comparing 2 different
algorithms for creating the mipmap thumbnails.
Well, actually, let me start recording
while I'm talking and explain this as I'm recording.
So what I'm going to do is I'm going to hit Record
and Instruments should pull up the deferred mode.
I'm going to start recording.
So the first and faster method of creating these mipmaps
is to have a single thread go through the largest
to smallest scales of my images, and
create the thumbnails and the mipmaps.
So I just hit Stop Recording.
The second method, which I'm going to perform now if I go to
Mipmaps Preferences and change this to smallest to largest,
is that I'm going to have the threads that I'm using go
through all the smallest mipmaps and create those first,
and then the next, the largest up, and so forth.
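The two orderings just described can be sketched like this. This is my own illustration of the demo's setup, not the actual mipmaps sample code; it only computes the chain of level dimensions, with each level half the previous one:

```python
def mipmap_chain(width, height):
    """Largest-to-smallest mipmap dimensions, halving each level
    until both dimensions reach 1."""
    sizes = [(width, height)]
    while width > 1 or height > 1:
        width, height = max(1, width // 2), max(1, height // 2)
        sizes.append((width, height))
    return sizes

# The first, faster method walks this list front to back;
# the second method walks it back to front.
largest_first = mipmap_chain(512, 512)
smallest_first = list(reversed(largest_first))
print(largest_first[:3], "...", smallest_first[:3])
```

Either order produces the same set of images; as the trace will show, the difference lies in how the two orders make the threads block and fault, not in the output.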
So back in Instruments, I'm going
to hit Start Recording again.
And Start over here.
So this second algorithm we notice is slightly slower.
It's going to take more than 5 seconds to complete.
So I'm going to use system trace
to find out why this is the case.
All right, so we're done, hit Stop Recording.
First thing we see here is the highlights view.
And the highlights view is showing some
information that, you know, is pretty useful.
It's showing me that on my system most of the context
switches are happening in mipmap, which I expect,
and as I scroll down my system call wait
times are attributed to a lot of file
and directory operations, like open, close, and open-nocancel.
And a lot of memory faults are
happening on behalf of mipmap.
And my process total system calls, a lot
of them are happening in mipmap as well.
Let's zoom out and take a look
at the system call wait time.
I'm going to click on this and it'll
take me to my System Calls Instruments.
And if we zoom in we see in the second algorithm we're
spending around 230 milliseconds waiting in our threads,
as opposed to 181 milliseconds executing on the CPU.
So if we go back to the original run and
take a look at the first one,
we can see that the first algorithm is
actually waiting less time in its threads,
so the threads are blocking less.
It's 120 milliseconds waiting, and 150
milliseconds executing on the CPU.
And we're actually making a lower
number of system calls in this case.
So this is an example of how to use system trace
to understand why your threads are behaving the way they are,
and to compare algorithms as we did in this case.
So let's switch back to the slides.
So, in closing, what did you learn today?
We learned how to use Time Profiler
and Data Mining to find out precisely
where in your code you're spending
most of your time and resources.
We've also introduced some new Instrumentation that
allows you to understand the power characteristics
of your iPhone apps and to understand how your
application interacts with the system as a whole.
And I would like to point out that if you
need more information or have any questions,
please feel free to contact Michael Jurewitz, our Developer
Tools Evangelist, whose email is shown on this slide.
So I highly recommend attending our sister session,
which is in about an hour or so in Presidio.
It's called Advanced Memory Analysis with
Instruments, which will talk about memory management
and how to use Instruments to find problems with that.
And also, you should definitely attend the
Performance Optimization on iPhone talks,
and there's two of those, today and tomorrow.
And thank you very much for attending our talk.