WWDC2010 Session 405

Transcript

>> Kevin Calhoun: Hello, and welcome to
Session 405, "Discovering AVFoundation."
Now and for the next sixty minutes you
will join me on a voyage of discovery.
My name is Kevin Calhoun of Apple's
Media Systems Engineering Group,
and we'll be discovering the vastly
expanded AVFoundation framework in iOS 4.
We'll talk about why you would want to use this
framework and for what purpose; we'll talk in some depth
about concepts underlying the use of timed media that
inform the design of the APIs we'll be discussing,
concepts you'll want to be familiar with to aid in your
adoption of the APIs; and, of course, we'll talk specifically
about tasks that you can accomplish with this API set.
Now, where are we going in our
voyage of discovery this afternoon?
We are going beneath the blue line.
Underneath MediaPlayer, underneath UIKit, where
the pressures are great and the media takes time,
we are at the level of AVFoundation,
the Objective-C framework
which gives you a great degree of control over timed media.
We sit on top of frameworks that are familiar to you, such
as Core Animation and Core Audio, and also a framework
which is newly public in iOS 4, Core Media.
So that's where we are in the system today:
underneath the level of UI.
Now, there have been technologies on iPhone in the
past, dating back to the original release and enhanced
through iPhone OS 3.0 and even after that,
which are useful for timed media operations.
There are several very easy ways that you can use timed
media in frameworks that sit atop the level of AVFoundation.
For example, for playback the MediaPlayer
framework offers MPMoviePlayerController
and MPMoviePlayerViewController, well integrated
with UIKit and successfully used by nearly every app
on the platform that currently plays timed media.
In addition, starting with iPhone OS 3.0, the browser
has offered support for the HTML5 video and audio tags.
So if you have web-based content that you want to
integrate timed media with, that's a great solution for you.
Now, I mentioned that AVFoundation
is vastly expanded in iOS 4.
It initially shipped with iPhone OS 3.0,
and in that version it offered a class called
AVAudioPlayer, which is useful for playing audio files.
So, those are very easy ways to play
timed media and may still be appropriate
for your apps even now with the release of iOS 4.
Similarly, for capture there are
solutions already extant on the platform.
UIImagePickerController in UIKit supports
video capture as well as still image capture,
and AVFoundation, starting with iPhone OS 3.0, supports
AVAudioRecorder for recording audio files.
So that's great stuff to be aware of and may be appropriate
for your app even after we survey AVFoundation together.
So, the question is, why would you use AVFoundation in
iOS 4 if these other great technologies are available?
The basic answer is that we give you a much
larger measure of control over timed media
in AVFoundation, and a lot more features as well.
In particular, if you need to inspect the contents of
a timed media resource, you can get deep information.
You can glean what media is available and what
metadata is present in a timed media resource.
If you need to play timed media in ways that are more
sophisticated than you can with the other frameworks,
in particular if you want to implement
a totally custom user interface
to control playback, you can do that with this API set.
In addition, if you wish to pull together media from
multiple sources, as iMovie does, pulling some video
from this resource and some audio from
another resource, arranging it temporally,
and performing editing operations, this is the API set for you.
If you want to take existing media resources, either simple
ones such as plain timed media files or complex resources
such as compositions that you've edited
together, and re-encode them in order
to create new timed media resources,
you can do that through this API set.
And, finally, if you want full
control over the input devices
that are present, including the camera,
this API set has the features for you.
So these are the five areas of functionality that
we're going to be talking about in this session
and the two subsequent sessions, so keep these five in mind.
There's a bargain we're going to strike.
We give you in this framework this vastly greater degree
of control over timed media, but as you accept that power,
that ability to control timed media, you also accept
the responsibility for handling timed media in ways
that are appropriate for this type of content.
So it's important to be aware of the challenges of
working with timed media as you adopt these APIs.
The most important point to make about timed
media is an obvious one: it's intended
to be processed and consumed incrementally.
You see a sequence of video frames
or hear a sequence of audio samples.
Timed media takes time, and that point is fundamental to the
design of the APIs we're going to be talking about today.
It will be necessary for you to code some forbearance
into your apps, because timed media takes time to process,
not just to play but also to perform other operations
as well, even operations as simple as inspection.
You might not be surprised to learn that, because timed
media is intended to be presented over a period of time,
some of the formats in which it's delivered are not
optimized to provide summary information about the
resource as a whole. As a result, in order to get
information about these resources, answering even a
simple question like "What's the duration of this resource?"
may require a good deal of work
to be done on your behalf to deliver the answer.
So, timed media takes time, but it shouldn't monopolize
the device while it's performing an operation.
You want your apps to remain responsive to
your end user while these things are going on.
It's possible for the user to turn his or her attention
to some other task, or for the device to handle an event
such as an incoming phone call, that changes the
circumstances of operation you were enjoying
when you kicked off the operation you're currently
performing, playback or re-encoding, for example.
So it's necessary for the APIs to give you the
opportunity to respond to these changes in circumstance.
Finally, the last important point to make about
timed media is that the variety of formats
and the variety of delivery protocols is great, and if
you wish to handle timed media resources uniformly,
it may be necessary for your app to do a little bit of
additional work, to take some extra steps, in order
to provide a uniform processing path
for the operations you have in mind.
We'll get back to details about this when we
discuss playback a little bit later in the hour.
Okay, so that's what to keep in mind:
the five areas of functionality we offer in
AVFoundation, and the challenges associated
with timed media that inform the API design.
Let me give you an overview first
before we go into detail of the classes
that we are making available to
you in AVFoundation in iOS 4.
I mentioned that AVFoundation first shipped in
iPhone OS 3.0, and with that version
of the framework there were audio-related classes,
very useful; you'll still want to use these today.
AVAudioSession is present.
This is the class that you use in order to inform the
underlying audio system on the platform of the type
of audio processing that you're performing.
If you tell the audio subsystem what you're doing,
then it can arbitrate resources appropriately
for what your app is doing and what
else is going on in the system.
So, you're going to use this class in
connection with AVFoundation even in 4.0.
Other classes in the audio category present in AVFoundation
are the ones mentioned earlier, AVAudioPlayer and
AVAudioRecorder. But now for the expansion, the
annex, which is quite large in iOS 4:
AVFoundation now covers the five areas of
functionality we mentioned earlier.
First, inspection.
In order to provide uniform inspection of timed media
resources, we offer a model object known as AVAsset,
audiovisual asset, that allows you to get
information about the contents of a resource.
Assets can contain multiple streams of media,
each of which is represented by an AVAssetTrack.
They also can contain collections of metadata, which we
make available to you as collections of AVMetadataItems.
These are the basic inspection classes, but remember
the challenge I mentioned earlier: it can take time
to provide you with information about a resource.
So these classes implement a protocol known
as AVAsynchronousKeyValueLoading, which
extends key-value coding to allow you
to request the value of a property to be loaded on demand.
In addition, we allow you to represent resources as
still images, thumbnails, other types of visual previews,
you would create those thumbnails and other still
image previews by means of AVAssetImageGenerator.
The second area of functionality I mentioned is playback.
Obviously, to play timed media you
require a class, a controller class,
that has basic play and pause types of methods.
That class in this framework is
known as AVPlayer, audiovisual player.
A strangely apt name, if I do say so myself.
AVPlayer plays AVPlayerItems.
Note that an asset is merely a representation
of the contents of a timed media resource,
but does not itself carry presentation state.
The presentation state for an asset is carried by
AVPlayerItem, and similarly the presentation state
of an asset track is carried by AVPlayerItemTrack.
Now, at this level we do not have UI affordances, no views,
but it would be kind of pointless for us to allow you
to play a video with no way to display it to the end
user. So what we do have is a subclass of CALayer,
the Core Animation layer, known as AVPlayerLayer, which is
capable of displaying the visual output of a player.
Core Animation, of course, is not only useful
for visual display; it's also useful for timing,
for animations that are timed. So we also offer in our
little bag of tricks another subclass of CALayer, known
as AVSynchronizedLayer, which is capable of
synchronizing a layer subtree with the playback of an item.
Very useful for synchronization.
The third area of functionality, editing.
I mentioned earlier that we have this
wonderful asset model that we use uniformly.
It would be very nice to be able to extend that
asset model in order to describe the composition
of media from multiple sources, multiple URLs.
If I can do that with an asset, then like any other asset
I would like that composition to have multiple tracks,
perhaps multiple video tracks and multiple audio
tracks. I would call those AVCompositionTracks,
and they would extend the track model by allowing me
to describe the sequence of media from multiple sources
that a particular track can present. But it's not
sufficient just to be able to describe these compositions;
I want to be able to create them.
So, it would be nice to have mutable subclasses of
AVComposition and AVCompositionTrack that have methods
for the insertion of media and other editing operations,
and that should give me enough to
be able to do temporal composition.
It isn't quite enough to describe how
to display a temporal composition, so to
that end we'll add a couple of additional classes.
If I have multiple audio sources in my composition,
I want to describe the way they are mixed together.
I might want to set their volumes relative to each
other or I might want to control the ramping of audio
from one source down while another ramps up.
The ability to describe that could be
available in a class known as AVAudioMix.
Similarly, we can offer a means to describe
how video should be composed together.
What's the front to back ordering of my video sources?
What's the opacity of each of the layers?
Perhaps I can ramp the opacity,
fade one down and another one up.
That's AVVideoComposition.
So, I have the ability to pull media
together from multiple sources and edit it.
I would like to create new assets from existing ones,
either a complex asset like a composition or a simple one.
Start with an asset, then create an object known as an
AVAssetExportSession that manages the process of export,
which is going to take time; this is a controller
class that I can kick off, describing the type of export
that I want to perform by means of one or
another preset that's available in the framework.
Once I have the export session configured,
I can run it, create the new asset,
have it written out to an output URL,
and be told when the job is done.
The last area of functionality in our high-level tour
of AVFoundation pertains to capture, to input devices.
I want to be able to survey the input devices
that are available in my current context.
For that purpose, we can offer the AVCaptureDevice class.
You can enumerate the capture devices that are available
by type, the audio devices, the cameras and so forth,
and you can find out what features they offer as well
to determine what features you
should make available in your app.
Once you've chosen a capture device to use, you would
like to set up a capture session, with the inputs
to the capture session specifying where the media is
coming from, and the outputs of the session specifying
where the media is going, in order for you to process
media coming off of the input devices. And it's helpful
to have the option to process the media in
your application, maybe you actually want to examine
and process video frames yourself, with
or without the option of recording that media to a file.
Finally, in order to allow your end user to know
where the camera is pointing in your application,
it would be helpful to have yet another subclass of
CALayer, AVCaptureVideoPreviewLayer we might choose to call it
if we're long-winded, that allows you to display to your
user a preview of what the camera is currently looking at.
So, there you go.
We've just designed the AVFoundation Framework.
Thank you for coming.
[ Applause ]
I will be splitting my proceeds
from this job with you all equally.
Sign up in the lobby.
All right, so one thing I need to mention before we go into
greater detail about the use of these APIs: I want to point
out the fundamental framework that we
depend on in AVFoundation, Core Media.
Down at this very deep level the pressures are so
great that times are represented as a rational value.
That's pressure.
The Core Media Framework that underlies
AVFoundation defines a number of primitives
that you'll find as you survey the AVFoundation APIs.
Essentially any type that starts with CM
is defined in the Core Media framework.
The one that I want to point out to
you now is a representation of time,
which, as I mentioned, is a rational value known as CMTime.
Have a look at the header file CoreMedia/CMTime.h
to survey the means for creating CMTimes,
for performing arithmetic operations on
them, for comparing them, and so forth.
Similarly, CMTimeRange is another data structure
in Core Media that you'll want to be familiar with.
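To make that concrete, here is a minimal sketch of working with these primitives; the variable names are my own, but the functions are the ones declared in CoreMedia/CMTime.h and CMTimeRange.h.

    #import <CoreMedia/CMTime.h>
    #import <CoreMedia/CMTimeRange.h>

    // A rational time: 600 units at a timescale of 600 units per second is 1 second.
    CMTime oneSecond = CMTimeMake(600, 600);

    // Times can also be created from seconds, given a preferred timescale.
    CMTime twoAndAHalfSeconds = CMTimeMakeWithSeconds(2.5, 600);

    // Arithmetic and comparison operate on the rational values directly.
    CMTime total = CMTimeAdd(oneSecond, twoAndAHalfSeconds); // 3.5 seconds
    if (CMTimeCompare(total, oneSecond) > 0) {
        // total is the later of the two times
    }

    // A time range: starts at one second, lasts two and a half seconds.
    CMTimeRange range = CMTimeRangeMake(oneSecond, twoAndAHalfSeconds);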
The speakers who follow me this afternoon will
be highlighting more of the details of Core Media
that you'll need to become familiar
with as you adopt the API set.
That's a good start.
So let's rise up from that very great depth so
that we can all breathe more easily, me at least,
and talk in detail for the remainder of our hour
about two of the areas of functionality that we offer.
We'll talk about inspection in AVFoundation,
how you find out about timed media resources,
and we'll talk about playback, how you play them.
That will carry us through to the end of the hour,
and the remaining three areas of functionality will
be covered over the remainder of the afternoon.
So, how do you inspect?
Well, obviously you want one of these AVAsset things.
That's the model object.
An AVAsset is the model for time-based resources that we use
uniformly, and it provides information about an asset as a whole.
What's the asset's duration?
Also, though an asset doesn't carry presentation state
itself (that's the job of AVPlayerItem), AVAsset does
carry presentation hints: information about the way an
asset likes to be displayed, for example, its natural size.
Note that an asset is not constrained in any way in the
number and type of sequences of media that it can present.
It can present one or more streams of audio, one
or more streams of video, and we designed it this way
so that we can apply this uniform model to any number
of the variety of media formats that we support:
the audio-only ones, the video-only ones, and so forth.
Here on the slide are some examples of formats
that we can represent by AVAsset and in addition,
a couple of other objects available in the
OS that can work together with AVAsset.
Objects from the MediaPlayer framework, for
example, and from the AssetsLibrary.framework.
I'll give you details about that a little later.
Now, if an AVAsset can contain multiple sequences
of media data, how do we represent them?
Each one is an instance of AVAssetTrack.
Each track represents a sequence of uniform type.
A track will be all video or all audio, for example,
and each track not only will have its own uniform type,
it will also have its own set of format descriptions.
The format descriptions tell you about the encoding of the
media: is it H.264 video, for example, or something else?
It's possible for there to be more than one
format description represented in a single track,
so the encoding does not need to be
uniform across the whole thing.
A track also has a timeline, expressed in
terms of the timeline of its parent asset.
A track doesn't need to start at time zero of its
parent asset, nor does it need to play all the way
through to the duration of its parent
asset; it can occupy any segment
of the parent's timeline that's convenient for authoring.
Other information about tracks is available via AVAssetTrack as well.
Now a typical asset will contain a single audio track
and video track that are synchronized with each other,
but as mentioned before, there's really no constraint.
Any number of tracks is possible.
Some of you are going to immediately rush out and author
assets with, say, seventeen closed-caption tracks just
to prove a point, and I say go
ahead; the model will support it.
What use you might put it to, other than
as a conversation piece, remains to be discussed.
Now, let's go through some specific workflow examples,
some code, some pseudocode that you might wish to write
to inspect assets. Here in the next several
slides I intend to give you the basic flavor
of what it's like to work with this framework.
Remember, I mentioned that you need to
code some forbearance into your apps.
Why? Because timed media takes time.
That's what I want you to take away from this session.
I'll give you a very concrete example
of what that means in just a minute.
To inspect an asset, I will start by initializing an
AVURLAsset object, a concrete subclass of AVAsset
that presents the timed media model for any asset that can
be referenced by URL. But note that just because I have
that instance of AVURLAsset in hand does not mean that
any work has been done on my behalf.
Remember, timed media takes time, which has a corollary
as applied to AVFoundation: initialization of an object
in this framework does not guarantee suitability
or readiness for any particular purpose.
Specifically, once you have initialized
an AVURLAsset, what is it ready to do?
The answer is: nothing yet.
We have not examined the resource at all.
We have not even attempted to find the host.
Initialization of an AVURLAsset
from any URL will always succeed.
So how do you find out what you need to know;
how do you get us to do some work on your behalf?
You use the AVAsynchronousKeyValueLoading protocol that
I mentioned earlier in order to tell the framework,
to tell the asset, which values for its keys, its
properties, you wish to have loaded on your behalf.
This protocol has two methods.
First of all, in order to find out whether the
value for any particular key, such as duration
or tracks, is already available, use
the method statusOfValueForKey:error:,
and this method will tell you whether the information
you seek has already been loaded on your behalf.
At initialization time, since we've done no
work, the status for virtually all the keys
of AVAsset and AVAssetTrack will be unknown.
You have to request the loading of a particular key in
order for the status to change to loading and subsequently
to arrive at one of the three terminal statuses.
Ideally, it will arrive at the status loaded, which tells you
that, okay, now you can call the getter and get the value
that you wish, the array of tracks, the
CMTime for the duration, et cetera.
But it's also possible for loading to fail.
Remember, we're doing nothing to vet the
URL when you initialize the AVURLAsset.
It's only after you've requested some loading
that we may arrive at the decision that, well,
the URL you referenced is not a timed media resource
at all, or, oh, by the way, the network's down.
That failure will occur as a result of a request to load.
All right, so with no more suspense, how do you request
the loading of a value for a key on one of these classes,
a value for a declared property?
Use the other method in the protocol:
loadValuesAsynchronouslyForKeys:completionHandler:.
Load values, plural, asynchronously, adverb
(and you have no idea how hard it was to work
an adverb into an API name), for keys.
The idea here is that you decide what collection of values
you require for the operation that you're performing.
Put them all together into an array of
strings representing the names of the keys you want,
and present them all at once to the asset
via this method; the asset will
in turn do the work that's necessary
to do the loading.
When all of the keys in the collection have
reached a terminal status, the block that you pass
to loadValuesAsynchronouslyForKeys:completionHandler: will be
invoked, and at that point you can test their statuses
and then move on to do the appropriate thing.
What does it look like in a code example?
Here's how you put it all together.
The first thing you do with a URL is to
initialize an instance of AVURLAsset,
which at that point is not ready to tell you anything.
It needs to do work.
Then you would say, here's the array of keys whose
values I require to be loaded
in order to perform the operation I'm interested in.
In this particular case, I'm going to prepare an
asset for playback, and what I need to load in order
to play something back is its array of tracks.
So, I'm going to tell the asset, please load your tracks key,
by invoking loadValuesAsynchronouslyForKeys:completionHandler:
with an array that in this case contains just the one
key, and I'll supply a block that I wish
to be called when the loading is complete.
You can tell this is a block because
it starts with the funny hat.
This particular block that you pass to this method
takes no parameters, so the code that's executed
when loading completes is all inside the braces there.
The first thing that it does is check
the status of the key I wished to have loaded,
and then act according to the terminal state that was reached:
if it's loaded, I want to update my user interface
with the tracks information; if it failed to
load, I wish to report the error to the end user;
or if I've canceled the loading of the values for
keys on an asset, I want to do some bookkeeping.
So, that's basically what it would look like, and this is
the way that you prepare assets for operations in your app.
This is the means by which you code the
forbearance for the time that timed media takes.
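In plain Objective-C, the example just walked through might look something like the following sketch. The file path and the updateUserInterfaceForTracks: and reportError: methods are hypothetical stand-ins for whatever your app actually does; the protocol methods and status constants are the real ones from AVFoundation.

    #import <AVFoundation/AVFoundation.h>

    NSURL *url = [NSURL fileURLWithPath:@"/path/to/movie.m4v"]; // hypothetical URL
    AVURLAsset *asset = [AVURLAsset URLAssetWithURL:url options:nil];

    NSArray *keys = [NSArray arrayWithObject:@"tracks"];
    [asset loadValuesAsynchronouslyForKeys:keys completionHandler:^{
        NSError *error = nil;
        AVKeyValueStatus status = [asset statusOfValueForKey:@"tracks" error:&error];
        switch (status) {
            case AVKeyValueStatusLoaded:
                // The getter is now safe to call.
                [self updateUserInterfaceForTracks:[asset tracks]]; // hypothetical method
                break;
            case AVKeyValueStatusFailed:
                [self reportError:error]; // hypothetical method
                break;
            case AVKeyValueStatusCancelled:
                // Do whatever bookkeeping cancellation requires.
                break;
            default:
                break;
        }
    }];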
So, let me review this really quickly.
How do you inspect and load assets
in order to prepare them for use?
You have information you want to find out.
You know that it may take time.
I'll give you a concrete example.
Something as innocuous as an MP3 file can take an
enormous amount of time to deliver even a very simple piece
of information: determining the duration of
an MP3 file may require parsing
every single audio packet in
the file in order to calculate it.
It's not necessarily the case, in other words, that MP3
files contain summary information about their contents.
You do not want your user to have
to wait while that work goes on.
You can ask for the work to be done synchronously,
but there are significant downsides to doing that.
First of all, you risk having your app become unresponsive
to the user, which, of course, is a total no-no.
Users expect apps to respond to their control.
But there's an even worse consequence that's possible
if you request these pieces of
information to be loaded synchronously.
There's a watchdog on iOS 4 that watches interaction
with the media services available on the platform.
If any one of the clients of media services
requests an operation to be performed synchronously,
and that operation takes longer than a timeout value
that the watchdog manages, the watchdog will come along
and kill the application that took all that time. This
may also have the side effect of causing media services
that are in use by all of the entities
on the system to be reset.
Don't let this happen to your app.
What would this result in?
Well, fewer stars in the App Store, that's for sure.
How do you avoid this calamity?
Come to WWDC 2010, meet me in Presidio, and I will
tell you all about how to use the AV, oh, wait,
I'm having one of those time-shift things, right?
Sorry. Use the protocol just described, load the
values for keys asynchronously, and you will be good.
Now, this duration thing, this troublesome duration thing.
I told you how expensive it is sometimes
to calculate the duration of an MP3 file.
Even if you do it right and you load
that value asynchronously, you may be sitting there saying, I
don't really need you to do all of that work on my behalf,
I just want to play the darned thing; I don't need
you to find out the duration exactly in advance.
Well, yes, that's actually true.
It turns out that for duration, which is a special value
defined by AVAsset, it's usually sufficient, particularly
in playback scenarios, to use an estimated value.
You don't need to know exactly how long something
takes unless you're trying to coordinate something else
with its playback, which is not typically the case.
So, by default the behavior of AVURLAsset is to
provide enough accuracy for a playback scenario.
Note that if the underlying format that stores the media
offers summary information about the timing and duration
of the resource, the information that we
provide you will be completely accurate.
For example, if the file is a QuickTime movie
file or an MPEG-4 file, those formats do contain
that summary information, and we will give it to
you; you can find out whether any particular instance
of AVURLAsset provides precise duration
and timing by examining its providesPreciseDurationAndTiming
property. But if you require precise duration and timing from every
asset that you're working with, if you're doing something
that requires that degree of precision, you can request it
at initialization time by setting the key
AVURLAssetPreferPreciseDurationAndTimingKey
in the options dictionary when you initialize the AVURLAsset,
and we will give you an instance of AVURLAsset
that will be accurate regardless of cost.
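A minimal sketch of requesting precise timing at initialization time; the url variable is assumed to hold the URL of the resource you care about:

    NSDictionary *options = [NSDictionary
        dictionaryWithObject:[NSNumber numberWithBool:YES]
                      forKey:AVURLAssetPreferPreciseDurationAndTimingKey];
    AVURLAsset *asset = [AVURLAsset URLAssetWithURL:url options:options];

    // Any given asset reports what it actually provides:
    BOOL precise = [asset providesPreciseDurationAndTiming];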
Okay. So that is the fundamental interaction
that you will have with this framework,
and you'll note as we discuss these classes
this afternoon that there are similar stages
of operation with each of the chief classes.
You initialize something, you prepare it for the purpose
that you want to use it for, you observe its status in order
to determine whether it's ready
for that purpose, and then you move on.
Playback, not surprisingly, is very
similar in behavior to inspection.
I mentioned earlier that the chief class
for controlling playback is AVPlayer.
It has the methods on it for controlling rate and
so forth that you would expect in such a class,
but beyond control it's extremely rich in
the facilities that it provides to allow you
to observe the presentation state, the playback state,
as it changes so that you can synchronize
a UI to playback state, for example.
I mentioned that the AVPlayer plays items; it has a
property known as its current item, so you can find
out what it's playing at any given
time. And the AVPlayerItem,
as I mentioned earlier, confers
presentation state upon an asset.
It describes how an asset should be presented, so it's
possible to play an asset with more than one player item.
For example, in one context you may wish to play a
particular time segment of an asset with one instance
of AVPlayerItem, and in another context play a
different AVPlayerItem associated with the same asset
to play a different time range of interest.
You can initialize an AVPlayerItem with an
existing asset that you have, or directly from a URL.
If you initialize it with a URL, the
AVPlayerItem will prepare the instance
of AVAsset for you, which you can then use for inspection.
AVPlayerItem is also the class that you use in
order to control time as playback progresses.
This is what you'd use to seek, for example, or to step.
In addition, AVPlayerItems have one or more
AVPlayerItemTracks that correspond with the tracks
of the asset you're playing, and those
carry presentation state as well,
in particular, whether a track
is enabled for playback or not.
So, having given you the overview
of the player-related classes,
let's have a look at how you would code
up preparation for playback in your app.
One of the challenges I mentioned right at
the start of our talk is that it's difficult,
or it may require a little extra work,
for you to treat assets uniformly,
because of differences particularly in delivery protocol.
There are two main classes of assets that
you need to be aware of for playback.
The first one is a file-based asset.
Essentially an asset that we have random access to
in order to read information out of its container.
The second one is a stream-based asset.
We have a lesser degree of control over
what we can read out of that asset.
It's essentially being beamed to us by a server.
There's a little bit of a difference in the way
that you set up playback of an asset depending
on whether it's file-based or stream-based.
So, let's talk about the workflows for each,
and then talk about what you would do
if you don't know which type you have.
To start with, suppose you have a file-based asset: in other
words, something like a video from the camera roll
that was shot with the camera, or an item from the iPod
library that you can access via the MediaPlayer framework,
or even a file that resides on a remote HTTP server.
Here's the workflow that you would use to
play back any one of those file-based assets.
As I mentioned earlier, initialize an instance
of AVURLAsset with the URL of interest;
then it is your responsibility, in preparing
that asset for playback, to load its tracks.
The player is going to want to know
what media is in there, so go ahead
and do the job using the AVAsynchronousKeyValueLoading
protocol to load the tracks of that asset.
If that succeeds, then you can go on to initialize an
AVPlayerItem with that asset. And remember the corollary
to our basic tenet that timed media takes time:
initialization of an object does not guarantee readiness
or suitability for any particular purpose.
I've just initialized an AVPlayerItem,
but when initialization completes,
it is not yet ready to play.
What you'll want to do once you've created an AVPlayerItem
is observe its status key via key-value observing.
Observing the status key, you can be informed
of when the player item becomes ready to play,
and you initiate the process by which it becomes ready
to play by associating the AVPlayerItem with an AVPlayer.
In this example, I'm initializing the
AVPlayer with the AVPlayerItem I created.
That kicks off the process that prepares all of the
chutes and ladders to get the thing ready to play,
and via key-value observing you'll soon discover that the
status of the player item has changed to ready to play.
When that occurs, it's possible for you to
survey the presentation state of the player item,
which tracks are enabled, for example, and to choose a track,
in this particular case, to be disabled for playback.
Other customization of presentation state is
also possible, of course. But once you've prepared the item
for playback with whatever customization you
desire, go ahead and tell the AVPlayer to play,
and that's essentially the workflow that
you follow to play a file-based asset.
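Sketched in code, under the assumption that this runs in a view controller with a player property, the file-based workflow looks roughly like this; fileURL is a hypothetical URL to a file-based resource:

    AVURLAsset *asset = [AVURLAsset URLAssetWithURL:fileURL options:nil];
    [asset loadValuesAsynchronouslyForKeys:[NSArray arrayWithObject:@"tracks"]
                         completionHandler:^{
        NSError *error = nil;
        if ([asset statusOfValueForKey:@"tracks" error:&error] == AVKeyValueStatusLoaded) {
            dispatch_async(dispatch_get_main_queue(), ^{
                AVPlayerItem *item = [AVPlayerItem playerItemWithAsset:asset];
                // Not ready to play yet; register for KVO on the main thread,
                // per the recommendation later in this session.
                [item addObserver:self forKeyPath:@"status" options:0 context:NULL];
                // Associating the item with a player kicks off preparation.
                self.player = [AVPlayer playerWithPlayerItem:item];
            });
        }
    }];

    // In the key-value observing callback:
    - (void)observeValueForKeyPath:(NSString *)keyPath ofObject:(id)object
                            change:(NSDictionary *)change context:(void *)context
    {
        AVPlayerItem *item = (AVPlayerItem *)object;
        if ([keyPath isEqualToString:@"status"] &&
            item.status == AVPlayerItemStatusReadyToPlay) {
            // Customize presentation state here if desired, then:
            [self.player play];
        }
    }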
The second example, as I mentioned
earlier, would be for stream-based assets.
As you know, iOS 4 supports the HTTP Live Streaming protocol,
and HTTP live streams are essentially
a playback-only technology.
It is not possible for you to create an AVURLAsset
from an HTTP live stream URL from scratch.
What you need to do if you fall into this category,
if you have an HTTP live stream that you wish
to play, is go directly to the player-related classes.
Start with an AVPlayerItem and initialize it with the URL for
your HTTP live stream. Do not spill water on the hardware;
you just weren't aware of the safety tips
you're going to receive at this session.
I'm glad you're enjoying them.
Associate the player item with an AVPlayer in order to
start the process of making that player item ready to play.
Then a little magic happens.
Once that player item becomes ready to play, the AVPlayerItem
will create on your behalf an AVAsset that you can use
to inspect the contents of that HTTP live stream,
so you can find out what tracks are present,
for example. If necessary, you can customize the
presentation state as well, but presuming that you just want
to play it, then, of course, you can move on once
it's ready to play to tell the AVPlayer to play.
As a side note, it is possible for you to take a shortcut
for HTTP live streams and simply initialize an instance
of AVPlayer with the URL that you wish to play; the
AVPlayer will create the AVPlayerItem on your behalf,
and the whole chain of events will be kicked off for you.
So, if you know that you're playing HTTP live streams,
you can take this shortcut.
But here's how we put it all together.
If you don't know in advance the type of resource
that you wish to play, it could be a file-based asset
or it could be a stream-based one, here's what we recommend.
Essentially, you have to concatenate the two workflows, and
we recommend that you start with the file-based workflow.
Try the URL as a file-based asset:
create an AVURLAsset with that URL and
attempt to load its tracks key as described previously.
If that succeeds, move on with
the file-based playback scenario.
If it fails, it's possible that the
URL refers to a valid HTTP live stream.
So try it then by initializing an AVPlayerItem with the URL,
and move on with the stream-based
workflow as mentioned earlier.
The two code paths converge, and you can treat them
uniformly once the player item becomes ready to play.
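A sketch of that concatenation, assuming the same hypothetical view-controller context as before:

    AVURLAsset *asset = [AVURLAsset URLAssetWithURL:url options:nil];
    [asset loadValuesAsynchronouslyForKeys:[NSArray arrayWithObject:@"tracks"]
                         completionHandler:^{
        NSError *error = nil;
        AVKeyValueStatus status = [asset statusOfValueForKey:@"tracks" error:&error];
        dispatch_async(dispatch_get_main_queue(), ^{
            AVPlayerItem *item;
            if (status == AVKeyValueStatusLoaded) {
                // File-based: proceed with the asset we just inspected.
                item = [AVPlayerItem playerItemWithAsset:asset];
            } else {
                // Possibly an HTTP live stream: go directly to the player item.
                item = [AVPlayerItem playerItemWithURL:url];
            }
            // From here the two paths converge.
            [item addObserver:self forKeyPath:@"status" options:0 context:NULL];
            self.player = [AVPlayer playerWithPlayerItem:item];
        });
    }];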
All right, so now you're preparing items for
playback and you've told the AVPlayer to play;
how does your app stay in sync with time, and control time?
First of all, AVPlayerItem, as I mentioned earlier,
is the class that provides control over time.
Could I have a mic down for a second, please?
I have a little housekeeping to take care of.
Are we ready?
Now? No. Okay.
Sorry.
[ Coughing ]
[ Laughter ]
>> Kevin Calhoun: Okay, thank you.
That impromptu performance was
rehearsed endlessly for weeks at a time.
[Laughter] Until it was perfected in San
Francisco, California, I'll skip that.
So, control over time.
seekToTime: is the method that you use to move
in time within the time range of an AVPlayerItem.
You should note that seekToTime:
is not necessarily precise.
One of the things that you'll note
in the design of this framework is
that we place a very high value
on responsiveness to the end user.
So, seeking to a time can be an extremely expensive
operation if you wish it to be precise.
Seeking to a specific time can require the decoding
of an arbitrarily long sequence of dependent video frames.
You don't necessarily need that work
to be done; arriving at a time near
the time you wished for is usually sufficient,
and that's the behavior of seekToTime:.
It will give you good responsiveness and good
enough results for typical playback scenarios.
However, if you need more precise control over
time as you seek around in an AVPlayerItem,
you can use the variant method
seekToTime:toleranceBefore:toleranceAfter:.
These tolerances allow you
essentially to define the time range
within which you'll be satisfied for the playhead
to arrive when the seek operation is complete.
You can set these tolerances to zero to
arrive at precisely the time that you desire,
but you should note, as I mentioned just earlier,
that this operation can be expensive and, in fact,
it can be detrimental to the responsiveness of your
application to the end user, so use it with caution.
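For example, a quick sketch; playerItem is assumed to be a prepared AVPlayerItem:

    // Fast, approximate seek; favors responsiveness.
    [playerItem seekToTime:CMTimeMakeWithSeconds(30.0, 600)];

    // Precise seek: zero tolerance on both sides, potentially expensive.
    [playerItem seekToTime:CMTimeMakeWithSeconds(30.0, 600)
           toleranceBefore:kCMTimeZero
            toleranceAfter:kCMTimeZero];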
Where does the media come from?
The question I know we all ask each other.
You can play file-based assets from
the camera roll as I mentioned earlier.
How do you do that?
The framework that you want to become familiar with in
order to play video that you can shoot with the camera
on the device is the AssetsLibrary.framework.
That framework has facilities that allow you to
survey the groups of assets that are available,
essentially the camera rolls that have been
recorded, and within each group it allows you
to enumerate the assets that are present.
You can filter them by type, such as showing you
just the videos. Once you have a specific instance of ALAsset
from the AssetsLibrary framework that you wish
to play, you obtain the URL from that ALAsset
by asking it; we missed this particular code
point up here, so I'll tell you what it is:
use the ALAsset method defaultRepresentation
to get its default representation,
and ask the default representation for its URL.
Once you have that URL in hand, you can
initialize an AVURLAsset and proceed
with the file-based playback workflow as described earlier.
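As a sketch, with the enumeration condensed, obtaining a playable URL from the AssetsLibrary framework might look like this; the block bodies are placeholders for your app's logic:

    #import <AssetsLibrary/AssetsLibrary.h>

    ALAssetsLibrary *library = [[ALAssetsLibrary alloc] init];
    [library enumerateGroupsWithTypes:ALAssetsGroupSavedPhotos
                           usingBlock:^(ALAssetsGroup *group, BOOL *stop) {
        if (group == nil) return; // a nil group signals the end of enumeration
        [group setAssetsFilter:[ALAssetsFilter allVideos]]; // videos only
        [group enumerateAssetsUsingBlock:^(ALAsset *asset, NSUInteger index, BOOL *innerStop) {
            if (asset) {
                NSURL *assetURL = [[asset defaultRepresentation] url];
                // Initialize an AVURLAsset with assetURL and proceed with the
                // file-based playback workflow described above.
            }
        }];
    } failureBlock:^(NSError *error) {
        // The user may have denied the app access to the library.
    }];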
Similarly, if you wish to play media from the iPod library,
the MediaPlayer framework has facilities
that allow you to query for any particular
piece of media of interest.
Essentially, you create an instance of MPMediaQuery and
resolve it against the media library, and it will give you one
or more MPMediaItems that satisfy your query.
Once you have the MPMediaItem in hand
that you wish to play via AVPlayer,
you obtain its URL by requesting the value of
its MPMediaItemPropertyAssetURL property.
That's the URL that you would use
to initialize an AVURLAsset,
and then you can proceed again with
the file-based playback workflow.
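A sketch of that query, here just taking the first matching item:

    #import <MediaPlayer/MediaPlayer.h>

    MPMediaQuery *query = [MPMediaQuery songsQuery];
    // Optionally add MPMediaPropertyPredicates to narrow the query here.
    MPMediaItem *item = [[query items] lastObject];
    NSURL *assetURL = [item valueForProperty:MPMediaItemPropertyAssetURL];
    if (assetURL) {
        // Initialize an AVURLAsset with assetURL and proceed with the
        // file-based workflow. (The URL is nil for protected content.)
    }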
So, now you have a source of media, you know how to set up
playback and get it going, even to seek around in time;
how do you keep your app in sync
with playback as it proceeds?
Well, let's talk about the things that you can
observe and respond to while playback occurs.
First of all, you can track presentation state.
A prime example here is the rate of playback.
As it changes, you can use key-value observation in
order for your app to respond to changes to properties
of playback, both in AVPlayer and in AVPlayerItem.
So key-value observing is your friend.
Register for observation of these keys and you'll
be able to discover not only changes that you,
the application, initiate, perhaps in response to
user input, but also, and equally important, changes
that are initiated underneath you
by the framework or by the system.
Well, what kinds of changes can occur in playback that are
not initiated by you, the application, in response to your user?
For example, if you are playing visual media and
the user multitasks out of your app,
you'll observe a change in the rate, because that playback
item, that AVPlayerItem, will automatically be paused,
and you would observe that via key-value
observation of the rate key.
Also, if you're playing remote media, you can observe
changes in the AVPlayerItem properties loadedTimeRanges
and seekableTimeRanges, which will tell you what portions
of the timeline of the AVPlayerItem are currently available.
Similarly, if you are playing an HTTP live
stream that has alternate encodings,
higher data rates for greater network bandwidth, lower
data rates for conditions that are not quite as promising,
you can actually observe when the HTTP live stream
changes from one of the alternate encodings to another
by observing the tracks key of AVPlayerItem,
and as a switch occurs you can see, ah-ha,
now this AVPlayerItem is playing this encoding.
One last thing you'll want to observe on AVPlayer and
AVPlayerItem: I've already recommended that as you set things
up you observe their status keys.
Are they ready to play?
That's the first thing you'll need to know before you
kick off playback. But it's possible for these objects
to arrive at other interesting statuses as
well. In particular, if you or your neighbor
or the guy three rows behind you writes one of those
applications that requests information synchronously
of an AVAsset, and that takes longer than the timeout value,
his app is killed and media services are reset for everyone
on the system. After we have a moment to hang our heads
in shame, the first thing to realize is
that every other client of media services on
the system will be responsible for setting
up their media operations all over again.
How do you know that you have to do this?
If someone has caused this calamity to occur, you
observe the status key on AVPlayer and AVPlayerItem.
The status can change to a failure status, and the
error key of either of those classes can report
that media services were reset via the error code
of the NSError that's available from the error key.
When you see that that has occurred, you know:
oh, no, someone has done the wrong thing,
media services have been reset, I need to create
new instances of my playback objects and so forth,
put them into the state that I was
in before, and then I can proceed.
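What that might look like in the key-value observing callback, as a sketch; the tear-down and rebuild is left as a comment:

    - (void)observeValueForKeyPath:(NSString *)keyPath ofObject:(id)object
                            change:(NSDictionary *)change context:(void *)context
    {
        if ([object isKindOfClass:[AVPlayerItem class]] &&
            [keyPath isEqualToString:@"status"]) {
            AVPlayerItem *item = (AVPlayerItem *)object;
            if (item.status == AVPlayerItemStatusFailed &&
                [item.error.domain isEqualToString:AVFoundationErrorDomain] &&
                item.error.code == AVErrorMediaServicesWereReset) {
                // Media services were reset: discard and recreate the player
                // and player item, then restore your previous playback state.
            }
        }
    }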
Now, if you are observing changes that you
initiate, and you are also observing changes
that the framework initiates underneath you, there
needs to be some way of serializing your registration
and unregistration of interest
with notifications in flight.
We don't wish for you to have to deal with any
possible race conditions if you're trying to disengage
from key-value observation while
there's a notification about to arrive.
So, in order to avoid that, our recommendation for
iOS 4, for your key-value observation of AVPlayer
and the other player-related classes, is to register
for key-value observation and unregister from it
on the main thread, and that will guarantee
a sensible serialization of registration
and unregistration with notifications in flight.
Now, note that we still very highly
value responsiveness to the end user.
We're not actually going to perform any of the work
associated with these state changes on the main thread;
it's only the notifications of those
changes that we deliver on that thread,
and that's the thread we recommend you register and unregister on.
You can also track the readiness
for visual display of a visual item.
Perhaps you have an AVPlayer playing audiovisual
content, and you want to know when the AVPlayerLayer
that you have set up to display the visual output of
the player is ready for display in your layer tree.
You can observe the readyForDisplay key
on AVPlayerLayer, and when that becomes YES,
you know that you can insert the AVPlayerLayer into
your layer tree and it has something ready to draw.
This is particularly useful if you want to code a
Core Animation transition from a tree in a state
in which it lacks the AVPlayerLayer displaying
the visual output to one that has it,
and you can do quite a few interesting effects with this.
You can also track the progression of time.
Now, because time progresses incrementally during playback,
it's not something that you can
observe via key-value observing;
it doesn't work for that model.
So, we have a different model available to
you for tracking the progression of time.
AVPlayer offers you the option of creating one or more
periodic time observers, and you get your hands on one
of these by using the AVPlayer method
addPeriodicTimeObserverForInterval:queue:usingBlock:.
You supply an interval at which you want to be invoked
as time progresses, and the block that you supply
to this method will be called at that interval.
That should allow you, for example,
to keep a UI that's tracking the
current time in sync with playback.
The block will also be called when time
jumps, and also when playback starts or stops.
So you'll have full information, full disclosure, about the
progression of time, whether it's moving smoothly or whether it jumps.
Also, if you're trying to perform some manual, programmatic
synchronization of something going on in your app
with playback, we offer a boundary time observer.
You create one of these on AVPlayer and give it a list of
times of interest, an array of CMTimes stored in NSValues.
When any one of those times is
traversed, when it's crossed during playback,
the block that you supply will be invoked,
and you can respond appropriately according
to the current time that has been reached at that point.
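As a sketch, with player assumed to be a playing AVPlayer, the two kinds of observers look like this:

    // Periodic: the block runs every half second of playback, on the main queue.
    id periodicObserver = [player
        addPeriodicTimeObserverForInterval:CMTimeMakeWithSeconds(0.5, 600)
                                     queue:dispatch_get_main_queue()
                                usingBlock:^(CMTime time) {
            // Update a time display or scrubber with the current time.
        }];

    // Boundary: the block runs when playback crosses the ten-second mark.
    NSArray *times = [NSArray arrayWithObject:
        [NSValue valueWithCMTime:CMTimeMakeWithSeconds(10.0, 600)]];
    id boundaryObserver = [player
        addBoundaryTimeObserverForTimes:times
                                  queue:dispatch_get_main_queue()
                             usingBlock:^{
            // Respond to the boundary being crossed.
        }];

    // Later, when you're done with them:
    [player removeTimeObserver:periodicObserver];
    [player removeTimeObserver:boundaryObserver];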
Now, note that if you do a lot of expensive operations
in one of these blocks, in response to either a periodic
or a boundary time observer, we don't guarantee
delivery of all of the invocations of the block.
It's up to you, of course, to code your apps to fit in the
operations that you perform in connection with playback
so that you don't swamp the CPU, so that you can coexist with
the timed media operation that's going on at the same time.
Finally, the last thing that you
can track as playback progresses is
when an AVPlayerItem reaches its end time and stops.
We offer a good old NSNotification for that purpose,
known as AVPlayerItemDidPlayToEndTimeNotification.
You can listen for this; it will fire when the
AVPlayerItem has played all the way through to the end.
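Listening for it is ordinary notification-center code; itemDidPlayToEnd: is a hypothetical handler method of your own:

    [[NSNotificationCenter defaultCenter]
        addObserver:self
           selector:@selector(itemDidPlayToEnd:)
               name:AVPlayerItemDidPlayToEndTimeNotification
             object:playerItem];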
All right, so those are basically the playback classes:
what you need to do in order to initiate playback, how you
can get media from the various sources available to you
in order to play it, and how you can
keep in sync with playback as it occurs.
A couple of best practices to cover before we move on.
How do you become a good citizen of the platform
now that you are taking control over timed media?
Well, use the AVAsynchronousKeyValueLoading protocol
described earlier; that's number one.
Also, tell the audio subsystem the type of
audio processing that you're performing.
To do that, use AVAudioSession in AVFoundation and set the
category of audio processing that you are performing.
If you're playing, tell it that your category is playback.
This allows the audio subsystem on the device to
arbitrate audio-related resources properly
among the various applications that
are trying to make use of them.
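Setting the category is a couple of lines; a minimal sketch:

    NSError *error = nil;
    AVAudioSession *session = [AVAudioSession sharedInstance];
    [session setCategory:AVAudioSessionCategoryPlayback error:&error];
    [session setActive:YES error:&error];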
There's more that you can do with AVAudioSession.
For example, you can use it to become aware of interruptions
that may arise during playback
or other timed media operations.
More details about AVAudioSession are available in the Core
Audio sessions that I'll point you to in just a few minutes.
Special multi-tasking note.
You are already aware from other sessions that
you can create your application and register it
to get processing time in the background
and even to play audio in the background.
What I need to make clear to you is the specific
behavior that occurs when you're playing visual media
and the user switches you from
the foreground to the background.
When that happens, with no intervention
necessary on your part,
the playback of a visual item will
automatically be paused and its display
in the layer tree will automatically be disengaged;
you need do nothing in order to accomplish this.
That's the standard user experience for what should
happen when visual items are playing on the platform
and the app is switched to the background.
Now, if you set up your application to get processing
time in the background and play audio, you can,
if it's appropriate for your content
and the workflow of your app,
continue playing the audio portion
of the AVPlayerItem in the background.
One other note about multitasking, referring
back to the earlier functionality of inspection,
the loading of the values of an asset in
order to inspect information about it:
any loading that you have initiated will continue
to progress even if your app is in the background.
If the loading completes while your app is in the
background and you've set up your app to get processing time
in the background, you will be notified at the completion
of that loading while you're still in the background,
so you can set yourself up on the basis of what an asset
contains even while you await return to the foreground.
If your application is not set up to get processing time in
the background, then you'll be notified of the completion
of any loading that has occurred in the
interim when you return to the foreground.
Okay. Let's review what we've covered
during these 55 or so minutes so far.
I've given you an overview of the AVFoundation framework
in iOS 4; told you the ways in which we've expanded it;
and covered the main areas of functionality that it offers.
I have also given you the flavor of the API and what
you need to do in your apps: as I said earlier,
code a little forbearance into your applications, because
the processes that you apply to timed media will take time.
You want your apps to remain responsive
and good citizens of the platform.
I've told you in detail how to use the
inspection-related classes; AVAsset, AVAssetTrack,
and how to use the playback-related
classes AVPlayer, et cetera.
Remember that in AVFoundation, because most of the
operations occur asynchronously, together
with other things going on on the platform, we're making
full use of programming paradigms that permit this level
of asynchronicity, this level of cooperation.
In particular, we're extending key-value coding with
something we're calling asynchronous key-value loading
in AVFoundation, so that you can request
specifically the information you require;
the framework can tailor the work that it does on
your behalf to those things that you need,
notify you when the information is available,
and allow you to proceed with your tasks, all
without reducing responsiveness to the end user.
Remember, even simple questions can take time to answer.
We're using other very typical paradigms available in
Objective-C 2.0: our classes have declared properties,
and you call their getters, after checking whether the
information is available, in order to obtain values
of interest to you; you use key-value observing to note
changes that occur in state; and you use blocks as callbacks.
Lots of information about blocks
available at the conference.
I'll give you a couple of other sessions that you
can go to to learn more about block-based programming.
In the context of AVFoundation, a block is simply
a piece of code that you wish to have invoked
when a certain operation is complete or a certain state
is reached. The block is invoked at some time later
when that occurs, or perhaps inline if that state has
already been reached, and the block is the code
that you supply that says what to do as
a result of the operation that has completed.
So, where else can you go for more information?
Eryk Vershen is your Evangelist, the Media
Technologies Evangelist, whom you can contact.
You heard him talk about HTTP live streaming and
know that he's well informed on these topics.
Documentation for the AVFoundation Framework is available
together with the other iPhone documentation for iOS 4.
And, as usual, the Apple Developer Forums are
excellent sources of information and contacts.
There are other sessions for you to attend, not just the
sessions that follow this one about editing and use of the
camera. Also, I mentioned a Core Audio session.
Let's see: tomorrow in Mission at 10:15 AM,
Fundamentals of Digital Audio.
There are other audio-related sessions that you may wish to
attend to learn more about audio processing on the platform.
There will be a repeat of this session hopefully
delivered with expanded lung power on Thursday.
If you have a colleague who wished to attend this
session but needed to go to one of the others,
you can tell him exactly what will be said there and,
in fact, you can let him in on all the jokes as well.
I'll come up with new ones by then.
A couple of sessions about block-based programming
are available to you as well to learn more
about structuring your application around the use of blocks.
Very powerful technology there.
I recommend that you become familiar with it.
So, to summarize: timed media takes time.
Initialization of an object does not guarantee
suitability or fitness for any particular purpose.
Observe the status of those things that you're interested in,
and as the status becomes ready, you can move forward
with the operation that you wish to undertake.
So by following these best practices,
your apps can stay current
and have the full media processing
power of the platform available to them.
All of the things you've seen in the demos
over the last day and some, including iMovie
and other applications you've seen demoed, you can
do in your applications as well with these APIs.
Make sure your apps stay responsive
by using the asynchronous facilities.
Stay in time by tracking time as it progresses,
and above all make sure your app stays
alive; don't let the watchdog get you.
And with that, please stay tuned for the remainder of
the AVFoundation sessions later this afternoon.
Thank you very much.
[ Applause ]