WWDC2010 Session 403

Transcript

>> Welcome to this year's HTTP Live Streaming Session.
My name is Roger Pantos, and I work on our
streaming client and also I am one of the people
who defined the protocol and continues to do that.
So it was exactly one year ago that I stood up here at WWDC
and introduced this new thing called HTTP Live Streaming.
And how are we doing?
Well, apparently a lot of people have been
waiting for a simple, standards-based way
to cost effectively broadcast live
video to wireless devices.
People are jumping on this.
We started with our launch partners, Major
League Baseball, CNN, Akamai, Inlet, Envivio.
And since then we have seen everyone from top-tier networks
like ABC, Sky, to start-up companies use this technology
to produce incredible applications, incredible.
We worked with some of the most
innovative companies in media.
Folks like Netflix and they have helped
us focus on the most important aspects
of delivering their content, your content to mobile devices.
Over the past year we have refined the protocol
and our implementation of it.
And our goal has been to give you the tools that
you need to take your application from incredible
to mind blowing App Store chart topping
oh my God I am living in Star Trek!
[ Laughter ]
[ Applause ]
That's the goal right?
We want to help you guys build some cool applications.
And so today what we're going to do is give you a quick
overview of how the technology works and walk you
through the new things that we've added to it in iOS 4.0.
And to do that what I would like to do is introduce our
media technology evangelist for HTTP Live Streaming.
His name is Eryk Vershen.
Eryk.
[ Applause ]
>> Thanks Roger.
Good morning everybody!
So it's been a busy year and I want to talk,
my talk is going to cover four main things.
First of all we are going to have a technology
walk-through for those of you who may not be that familiar
with the technology or haven't been looking at the details
recently, then we are going to talk about the new features
and functionality that we've added over the past year.
Some of that was added early in the year.
Some of that's been added very recently.
And I'll have Roger back up for a demo and
we'll talk about the tools that we provide
so that you can create your own HTTP Live Streams.
And then lastly, I will talk about some tips and tricks.
So let's get started.
Now, before we get started with the technology walk-through
I want you to think about how we do streaming with HTTP.
I mean, HTTP doesn't do streaming, right?
It just delivers discrete files.
So essentially what we're doing is we're turning your
long movie or your live content into discrete files.
Now, you've probably seen this slide last year or
some version of it, workflow or architecture now.
Whenever we're doing-- converting video, what we are
going to do-- we have to initially have some video.
Now, there is two disparate things here.
I might have live material or I might
have some video on demand material.
And the workflow is slightly different between those two.
So we're going to start out with an audio/video
source that might be in the case of live
that might be an SDI connection that's coming in
to my desktop machine, or in the case of video
on demand I have already got a
movie file around on my file system.
Now the first thing we have to do
is run it through a media encoder.
That means turning it into H.264 and
AAC so we can play it on an iOS device
and it also means wrapping it in an MPEG-2 Transport Stream.
Now once we wrapped it in an MPEG-2 Transport
Stream we are going to pass it into our segmenter.
And what the segmenter is doing is it's chopping that
movie into discrete chunks of roughly equal length.
Now the segmenter is also going to create at the
same time a playlist that lists those segments,
and it's going to put those files
somewhere that the web server can see them.
Now, there is nothing special about
the web server in this context.
It is an ordinary web server.
The web server would make it available in the
cloud and so you can download it on your devices.
Now remember I talked to you about,
we've got these two different worlds,
either I've got a live stream or I've got video on demand.
In the case of live streaming that first part is going
to happen at the same time as I am serving it out.
I am going to be continually getting new content.
I am going to have to create new
segments and serve those out.
But in the case of video on demand, I'll have done that
initial phase once and I'll be serving it out repeatedly.
Now, the center of this is segments and playlist.
So, I'm going to be creating segments, here let's imagine
I've got several segments that are 10 seconds long,
and you'll see that 10-second number a lot because that's
the number we tend to recommend for the length of a segment.
Now, segments by themselves aren't any good.
I need to have a playlist and that
playlist has to point at the segments.
And the playlist is really the centerpiece of
how the client finds out about the material.
So the playlist does several things.
First of all, it lists the segments in playback order and if
I am doing a live stream this defines the playback window.
That is, in a live stream I'm not necessarily showing you
everything because in fact the stream might exist 24/7.
I can't have all that content around all the time.
So I am going to give you a window
into the content that rolls along.
Now, you are going to want to protect your content
in some cases and so we can encrypt the segments,
but in order for the client to decrypt them
it has to know how to-- what key to use.
So we have to have in the playlist the
key associated with this encryption.
And an important feature of HTTP live streaming
is that I can adapt to different bit rates.
So as my network characteristics change I can go ahead
and the client can fetch a version
that's suitable for that bit rate.
And so the playlist has to be able to
define multiple variations of the content.
Now first, let's look at what's a fairly straight
forward playlist, a video on demand playlist.
Now, when you look at this the main thing
to notice is it's just a list of URLs.
Those are URLs of segments.
Now, initially they can be absolute URLs but generally
speaking you're going to want to make them relative URLs.
That's more portable, it's just going to work better.
Now the other lines in the file, you know, they've all got
a hash mark in front of them and that makes them comments.
But if those comments start with a word that we recognize
they become tags that actually affect what is going on.
Now the first tag the EXTM3U tells the
client what the format of the file is
and at the moment we only support the one format so
that is always going to be the first tag in your file.
The second tag, TARGETDURATION indicates the maximum
duration of any segment in seconds, 10 seconds here.
And you'll notice that there is this EXTINF tag in front
of each segment indicating the
duration in seconds of that segment.
In this case they are all 10-second
segments, but they could be shorter.
The next tag to notice is the MEDIA-SEQUENCE tag.
This indicates the sequence numbers associated
with the segments that will become more important
when I show you a live playlist later on.
Only thing I want you to notice now is that
sequence number does not have anything to do
with the filenames associated with the segments.
The last tag I want you to notice is the ENDLIST tag.
Now, the ENDLIST tag indicates to the
client that this playlist is complete.
It is never going to change.
So the client is going to fetch this
playlist and it knows-- it never changes.
It is complete.
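
As a rough illustration of the playlist just described, here
is a minimal Python sketch that writes out a video on demand
playlist. The function name and the segment filenames are made
up for the example, and the full tag spellings (the EXT-X-
prefixes) follow the published HLS draft rather than anything
read off the slide.

```python
def build_vod_playlist(segments, media_sequence=0):
    """Return playlist text for a complete (video on demand) stream.

    segments: list of (relative_url, duration_in_seconds) in playback order.
    """
    # TARGETDURATION is the maximum duration of any segment.
    target = max(duration for _, duration in segments)
    lines = [
        "#EXTM3U",
        f"#EXT-X-TARGETDURATION:{target}",
        f"#EXT-X-MEDIA-SEQUENCE:{media_sequence}",
    ]
    for url, duration in segments:
        lines.append(f"#EXTINF:{duration},")  # duration of the next segment
        lines.append(url)                     # relative URL of the segment
    lines.append("#EXT-X-ENDLIST")            # playlist will never change
    return "\n".join(lines) + "\n"


# Hypothetical filenames; the last segment is shorter than the rest.
vod = build_vod_playlist([
    ("fileSequence0.ts", 10),
    ("fileSequence1.ts", 10),
    ("fileSequence2.ts", 7),
])
print(vod)
```

Note that the sequence number is passed in separately from the
filenames, since the two are unrelated.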
Now, if I am doing live I can't
give you a complete playlist.
So here is an example of a live
playlist and it looks a lot like my video
on demand playlist except there is no ENDLIST tag
and when there is no ENDLIST tag the client knows
that he has to re-fetch this periodically.
Well, how often?
TARGETDURATION is going to indicate that.
Although if I have, one of my segments is
shorter that is going to vary that a little bit.
And let us say that I've highlighted the sequence
number here because the client knowing he has
to re-fetch this also implies that the server
has promised that he is going to update it.
So let's imagine that I pulled the next time.
Now it has been updated: my sequence number has changed and
the set of files that are in the playlist has changed.
Let's do it again.
OK, so now my sequence number is 3
and the list of files have changed.
And what we're giving you is a rolling list of the content.
Now, the sequence number has to
stay consistent between one version
of the playlist that I download and the next version.
So every time I get that playlist, if
I am dropping a file off the front,
I've got to bump that sequence number to keep it consistent.
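
The rolling window can be sketched in a few lines of Python.
This is only a model of the bookkeeping, assuming a
hypothetical LiveWindow class and made-up filenames: each time
a segment falls off the front, the media sequence number goes
up by one.

```python
from collections import deque


class LiveWindow:
    """Keeps the most recent segments and the matching sequence number."""

    def __init__(self, window_size, target_duration=10):
        self.window_size = window_size
        self.target_duration = target_duration
        self.media_sequence = 0   # sequence number of the first listed segment
        self.segments = deque()

    def add_segment(self, url, duration):
        self.segments.append((url, duration))
        # Drop old segments off the front, bumping the sequence number
        # once for every segment removed.
        while len(self.segments) > self.window_size:
            self.segments.popleft()
            self.media_sequence += 1

    def render(self):
        lines = [
            "#EXTM3U",
            f"#EXT-X-TARGETDURATION:{self.target_duration}",
            f"#EXT-X-MEDIA-SEQUENCE:{self.media_sequence}",
        ]
        for url, duration in self.segments:
            lines += [f"#EXTINF:{duration},", url]
        # No ENDLIST tag: the client must keep re-fetching.
        return "\n".join(lines) + "\n"


window = LiveWindow(window_size=3)
for name in ["a.ts", "b.ts", "c.ts", "d.ts", "e.ts"]:
    window.add_segment(name, 10)
live = window.render()
print(live)
```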
Now, we can keep doing this as long as
we want and naturally I am not limited
to having only 5 or 6 segments in the playlist.
I could have 10, I could have 100, I could have 500.
When I have that on the live playlist the client
is going to come in and the client is going to start
by default, near the end of that playlist.
But the client can seek around and so what I am giving
you is a window into what's happening as it goes along.
Now there is a third kind of basic playlist
which is what we call an Event Playlist.
Now the difference with an Event Playlist, it looks a
lot like, to start out it looks like a Live Playlist.
It does not have an ENDLIST tag which means
the client knows he's got to fetch it again.
This time when the client fetches it we're
just going to add a segment on to the end.
We're not going to get rid of the segments on the front.
We are going to keep on adding segments.
We can do this again as long as we
want until at some point it is over.
Now why would I do this with an Event Playlist?
Well, for an event, for a sporting event, for a rock concert
something like that I'd want to deliver it while it was live
but I'd want it to complete at the end, alright.
So once the client sees that ENDLIST the client knows that
this playlist is done, I do not have to fetch it anymore.
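
A small Python sketch of that behavior, under the assumption
of a hypothetical EventPlaylist class: segments are only ever
appended, the sequence number stays put, and the ENDLIST tag
shows up once the event is over. The is_complete helper is the
check a client could use to decide whether to fetch again.

```python
def is_complete(playlist_text):
    """True once the playlist carries the ENDLIST tag."""
    return any(line.strip() == "#EXT-X-ENDLIST"
               for line in playlist_text.splitlines())


class EventPlaylist:
    """Append-only playlist that is closed when the event finishes."""

    def __init__(self, target_duration=10):
        self.target_duration = target_duration
        self.segments = []
        self.ended = False

    def add_segment(self, url, duration):
        if self.ended:
            raise ValueError("event is over; the playlist is complete")
        self.segments.append((url, duration))

    def finish(self):
        self.ended = True

    def render(self):
        lines = [
            "#EXTM3U",
            f"#EXT-X-TARGETDURATION:{self.target_duration}",
            "#EXT-X-MEDIA-SEQUENCE:0",  # nothing is removed from the front
        ]
        for url, duration in self.segments:
            lines += [f"#EXTINF:{duration},", url]
        if self.ended:
            lines.append("#EXT-X-ENDLIST")
        return "\n".join(lines) + "\n"


event = EventPlaylist()
event.add_segment("concert0.ts", 10)
event.add_segment("concert1.ts", 10)
during = event.render()      # client sees no ENDLIST, fetches again later
event.add_segment("concert2.ts", 10)
event.finish()
after = event.render()       # client sees ENDLIST, stops fetching
```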
Now, if I am delivering an event like
that I really want to protect my content
so I'm going to want to turn on encryption.
So here's a playlist that has some encryption
and what we've added is this key tag.
Now, the key tag indicates the
method of encryption we're using.
At the moment we only support AES-128 although you can, if
you have a portion of your content that is not encrypted,
you can switch out of encryption by
specifying the encryption method as being none.
Now the other point to notice here is the URI.
The URI indicates where the client should go to fetch
the key that he needs in order to decrypt the segment.
Now, this key is going to apply to all subsequent
segments until a point where I specify another key.
At this point, the client's going to go and fetch that key
and that going to apply to subsequent segments going on.
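
To make the "applies to all subsequent segments" rule
concrete, here is a minimal Python sketch that walks a playlist
and reports which key URI covers each segment. The sample
playlist, the example.com URLs, and the function name are all
invented for illustration; this is not how the client itself
is implemented.

```python
import re

METHOD = re.compile(r"METHOD=([A-Z0-9-]+)")
KEY_URI = re.compile(r'URI="([^"]*)"')


def keys_for_segments(playlist_text):
    """Return a list of (segment_url, key_uri_or_None) in playlist order."""
    current_key = None
    result = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-KEY:"):
            attributes = line[len("#EXT-X-KEY:"):]
            method = METHOD.search(attributes)
            uri = KEY_URI.search(attributes)
            if method and method.group(1) == "NONE":
                current_key = None          # encryption switched off
            else:
                current_key = uri.group(1) if uri else None
        elif line and not line.startswith("#"):
            # A segment URL: it uses whatever key tag came last.
            result.append((line, current_key))
    return result


SAMPLE = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-KEY:METHOD=AES-128,URI="https://example.com/key1"
#EXTINF:10,
seg0.ts
#EXTINF:10,
seg1.ts
#EXT-X-KEY:METHOD=AES-128,URI="https://example.com/key2"
#EXTINF:10,
seg2.ts
#EXT-X-KEY:METHOD=NONE
#EXTINF:10,
seg3.ts
"""

assignments = keys_for_segments(SAMPLE)
```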
Now, at this point we've only been
talking about one kind of, you know,
one example of a playlist, it's only got one data rate.
We want to be able to handle variants.
We want to be able to handle multiple
data rates at the same time.
So what's a variant?
A variant is a version of the stream
at a particular bit rate.
Now, each variant is in a separate playlist
and what we call the Variant Playlist
or the Master Variant Playlist describes
all of the variants that I have available.
Now the client is going to be given that variant
playlist and the client is going to switch based
on the measured bit rate-- on the bit rate that it is actually
seeing over the network which variant he should play
and the client's player has been
tuned to minimize stalling playback.
We want to give the user a good experience.
We don't want him to have the video
drop out if we can help it.
Now, I've got a nice picture here of a variant playlist.
Let us imagine that I have a playlist at some
kind of a medium resolution and bit rate,
and let's make two more variants available, one
at a lower bit rate and one at a higher bit rate.
Now, what I do is I create a variant playlist
that points at the individual playlist.
I hand that to the client and the client can play it.
Notice I put the audio here and
the audio hasn't changed size.
The reason I am doing that is you want your audio to be
identical between all the streams and I mean identical,
not just the same bit rate and sample
rate but actually the very same audio.
And the reason you want to do that is if you don't do that
you can get pops and clicks when you switch between streams.
It is a bad user experience and you'd
rather, the best way to go about this is
to make the audio completely consistent
between the variants.
Now, here is an example of what
a variant playlist looks like.
It looks a lot like another playlist.
It is just a list of URLs, but in this
case those URLs are to other playlists.
And instead of being preceded by that
EXTINF tag that we saw with one
of the segments it is preceded by the STREAM-INF tag.
The STREAM-INF tag ties the individual variants
together and in particular specifies the bandwidth
that is the maximum data rate that
this version of the stream can take.
I want to call out two of these variants in particular.
The first one, because the first one is the default,
when the client picks up the variant playlist
it's going to start out with the first stream.
The other one I want to call out is the last
one, that's a 64 kilobit stream, audio only.
You want to have a low data rate stream for
fallback if you happen to be on cellular.
You want to have something that the client
can go down to that you're basically going
to be able to serve no matter what happens.
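
Here is a hedged Python sketch of a variant playlist and of
one simple way a client might choose among the variants. The
bandwidth numbers other than the 64 kilobit audio stream, the
paths, and the selection rule (highest bandwidth that fits the
measured rate, otherwise the lowest) are assumptions for the
example; the real player's switching heuristics are not
described in this session.

```python
def build_variant_playlist(variants):
    """variants: list of (bandwidth_bits_per_second, playlist_url).

    The first entry is the one the client starts with.
    """
    lines = ["#EXTM3U"]
    for bandwidth, url in variants:
        lines.append(f"#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH={bandwidth}")
        lines.append(url)
    return "\n".join(lines) + "\n"


def pick_variant(variants, measured_bps):
    """Illustrative rule only: best variant that fits, else the lowest."""
    affordable = [v for v in variants if v[0] <= measured_bps]
    if affordable:
        return max(affordable, key=lambda v: v[0])
    return min(variants, key=lambda v: v[0])


VARIANTS = [
    (600000, "mid/prog_index.m3u8"),        # listed first, so it is the default
    (200000, "low/prog_index.m3u8"),
    (1200000, "high/prog_index.m3u8"),
    (64000, "audio_only/prog_index.m3u8"),  # low data rate fallback
]

master = build_variant_playlist(VARIANTS)
print(master)
print(pick_variant(VARIANTS, 700000))
print(pick_variant(VARIANTS, 10000))
```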
Oh yes, one last thing I wanted to
point out about variant playlists.
The variant playlist, even though it does
not have an ENDLIST tag, is not reread.
Once you've read the variant-- the client
has read the variant playlist, it assumes
that the set of variations isn't changing.
Now, if the individual variations are Live or Event
Playlist, as soon as it sees an ENDLIST tag on one
of the individual variants, as soon as
it hits that point, that ends the stream.
It's not the case that you could say, oh well, I
just do not want to serve that bit rate anymore.
I'll put an ENDLIST on it, it is like, no, these
streams are all supposed to be the same content.
Now, in terms of playback you can play back using Safari
either on the web or mobile Safari and the best way to do
that these days is with the HTML5 video element but
in-- that's the wrong version it's supposed to say iOS.
In iOS there are several possibilities,
UIWebView which gives you something like HTML5
but also MPMoviePlayerController which has existed
for a while and now in iOS 4 AVPlayerItem which is part
of AV Foundation, allows you to play back HTTP live streams.
Now, let's start talking about new features.
I've got four main new features that I want to talk about.
The first is Stream Discontinuities.
Streams aren't continuous.
That is, they aren't always the same.
I might be wanting to deliver something
where I am delivering a set
of different short movies, TV shows, or whatever.
And I'm going to be stitching those together
into my stream that I am actually delivering
and those might be encoded at different times.
There might be variations in between them.
So we've got to handle those discontinuities.
We also want to provide metadata that
goes along with the streams and we want
that metadata to be associated with particular times.
The third thing is custom protocols.
I will go into more detail when I talk a little bit later
but basically this allows you to have a greater degree
of control over how your keys are delivered to the client.
I'll talk about performance improvements and then I
have a few odds and ends I'll talk about after that.
OK, discontinuity.
So let's say I have a whole bunch of
movies that I am delivering and I really
like to put some kind of bumper at the front of the movie.
Some sort of idents or some sort of branding
that indicates these are coming from my site.
Now, how am I going to do that?
Cause I won't have just three movies.
I may have hundreds or thousands of movies.
Well, I could take that bumper and I could merge it into the
movie but then if I decided to change the bumper I am going
to have to re-encode all those things
in order to make it work right.
And further, I've used that space: if I've got
a thousand movies, I've got a thousand copies
of that bumper that I stuck on my server.
What was the point of that?
So it is a brittle solution.
Now, you could say, well what if we
just delivered the bumper as one movie.
You know, we'll play one movie and then play the next movie.
And you think that might work but there is a problem
with that, and that is that the client when you switch
to a new movie forgets about what's going on, what it was
getting in terms of data rate, because it doesn't know
that essentially we're going to be
getting this from the same place.
And so what will happen as I start playing my bumper I'll
have a low data rate which will start out because we want
to start out conservatively and make sure
the client's likely to be able to read it.
And then it is going to go up and then when I
hit the end of the bumper I am going to go back
to my movie and I have to start ramping up.
So I am going to get this break in quality.
Now, further if I'm deciding to do these things in the
middle, if it's TV shows and I am doing a station ident
in between shows, then again I am getting
these drops in quality as I go along.
So we really want a different solution.
Because our streams can change we can have timecode breaks.
We can change the encoding parameters.
So the solution is to let the client
know that there is a change coming up.
We do that with a discontinuity tag.
So here is an example of a stream
that has a discontinuity tag in it.
And I think that, well that is perfectly fine.
OK we're done.
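
A minimal Python sketch of stitching a bumper in front of a
movie with a discontinuity tag. The stitch function and the
filenames are hypothetical; the 18-second bumper is modeled as
one 10-second and one 8-second segment.

```python
def stitch(clips, target_duration=10):
    """clips: list of clips, each a list of (url, duration) segments.

    A discontinuity tag is written before the first segment of every
    clip after the first, telling the client the encoding may change.
    """
    lines = [
        "#EXTM3U",
        f"#EXT-X-TARGETDURATION:{target_duration}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    for index, clip in enumerate(clips):
        if index > 0:
            lines.append("#EXT-X-DISCONTINUITY")
        for url, duration in clip:
            lines += [f"#EXTINF:{duration},", url]
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


bumper = [("bumper0.ts", 10), ("bumper1.ts", 8)]
movie = [("movie0.ts", 10), ("movie1.ts", 10)]
stitched = stitch([bumper, movie])
print(stitched)
```

Because the bumper segments live in their own files, a
thousand movie playlists can all point at the same bumper.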
Well, what if we want to encrypt it?
There we go.
So if I am encrypting, OK it still looks
straightforward, what's the problem with this?
The problem is the default Initialization
Vector for encryption is the sequence number,
and you're going, I know you're going What?
What's an Initialization Vector and why do I care?
OK, what is encryption trying to do?
Encryption is trying to make your data,
which is definitely not random, look random.
And the problem is at the beginning of a
segment it's hard to make it look random.
It's just not that easy.
And what an Initialization Vector does in essence is
make the beginning of the segment look more random.
Now, ideally an Initialization Vector should be a
random sequence of bits that changes often enough.
So, OK our default Initialization Vector
for encryption is the sequence number.
So what are our sequence numbers?
0,1,2,3. OK, so where's the problem here?
Well, the first problem is, I've got this bumper, right?
And what if I decide to change the bumper?
Right now it's 18 seconds.
What if I decided in the future to make that 22 seconds?
Then it's going to take three segments, right?
So the sequence number for the movie
is going to start out at 3 instead of 2
and now I'd have to re-encrypt all my movies.
Well, that's not good.
The other problem is these aren't really
outstanding Initialization Vectors.
They've got lots of zeros in them.
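
Both problems are easy to see in a short Python sketch. It
assumes the default IV is the sequence number written as a
128-bit big-endian value, which is how the published HLS draft
describes it; the helper names are made up.

```python
import math


def default_iv(sequence_number):
    """Sequence number as a 16-byte, big-endian, zero-padded value."""
    return sequence_number.to_bytes(16, byteorder="big")


def first_movie_sequence(bumper_seconds, segment_seconds=10):
    """Sequence number of the first movie segment after the bumper."""
    return math.ceil(bumper_seconds / segment_seconds)


# Problem 1: lengthening the bumper shifts every movie sequence number,
# so every default IV for the movie changes too.
print(first_movie_sequence(18))   # bumper takes two segments
print(first_movie_sequence(22))   # bumper takes three segments

# Problem 2: the default IVs are almost entirely zero bits.
for sequence in range(4):
    print(default_iv(sequence).hex())
```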
The solution is to add an attribute
to the key that allows us
to specify an Initialization Vector
which is a 128 bit number.
And that Initialization Vector is going
to apply to all the subsequent segments
until I specify another Initialization Vector.
Now those of you who were looking closely might have noticed
that there was a new tag in these
playlists, the VERSION tag.
And the VERSION tag, we have to add the VERSION:2
because the Initialization Vector is not compatible
with the previous version, the old client wouldn't
be able to understand Initialization Vector.
So we have added the VERSION tag and that is required.
Now, you can put the VERSION tag, you can put a
VERSION tag with version number 1 into your playlist
and that will be fine with an old
client because old clients,
if they don't recognize the tag it just becomes a comment.
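
As a sketch only, this is roughly what writing such a key tag
could look like in Python. The attribute layout (IV as a
hexadecimal number with a 0x prefix) follows the published HLS
draft; the function names and the example.com URI are
assumptions.

```python
import os


def key_tag(uri, iv=None):
    """Build a key tag with an explicit 128-bit Initialization Vector."""
    if iv is None:
        iv = os.urandom(16)   # random bits, as an IV ideally should be
    if len(iv) != 16:
        raise ValueError("IV must be exactly 16 bytes (128 bits)")
    return f'#EXT-X-KEY:METHOD=AES-128,URI="{uri}",IV=0x{iv.hex().upper()}'


def playlist_header(target_duration=10, media_sequence=0):
    """Header lines for a playlist that uses the IV attribute."""
    return [
        "#EXTM3U",
        "#EXT-X-VERSION:2",   # required once the IV attribute is used
        f"#EXT-X-TARGETDURATION:{target_duration}",
        f"#EXT-X-MEDIA-SEQUENCE:{media_sequence}",
    ]


fixed_iv = bytes(range(16))   # deterministic value, for the example only
tag = key_tag("https://example.com/key", fixed_iv)
print("\n".join(playlist_header() + [tag]))
```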
There is one other point I wanted to make
about Initialization Vectors and that's
that you can continue to specify the Initialization Vector.
You can re-specify it with the same
encryption key that you already have.
So in this case, I've specified a new Initialization Vector
starting with the third segment and another one with the fourth.
Now notice that it's the same encryption key.
Now, we're not going to re-fetch that encryption key.
If we look at the URI and see it's a URI that we
already know about, we are not going to re-fetch it.
And that also holds true across, across multiple variants.
If I have a playlist and I am using the same encryption
key on the variant when I switch to a different variant,
if I've already seen the key I don't have to re-fetch it.
Now, Timed Metadata.
So if you were at the, the graphics
and media state of the union yesterday,
they would have talked about synchronized metadata.
Well we're talking about the same
thing when we say Timed Metadata.
The reason I'm saying Timed Metadata is
that's what, what we call it in the code.
And what's Timed Metadata?
So it is metadata so it is data
about the video and it's timed.
It occurs at a specific movie time.
We want to communicate this info about a specific
moment in time and we want to communicate it
to our particular player, our dedicated player app.
This is not totally generic in the sense
that I can just send arbitrary metadata
and an arbitrary client will be able to understand it.
I can say well, why do we have to add this into the movie?
We could do it as an independent channel.
I can do that already.
It is like, well you could but it's kinda
hard to synchronize if I am getting this
through another TCP connection or something, it becomes
harder to rewind, to seek, to replay that stuff properly
because now I've got to seek in
these two independent channels.
So we add a time stamped information stream
into the movie and I'll give you an example.
So, here we've got our movie playing and
some metadata is coming along with it.
Now, it doesn't have to be text like we're seeing here.
It could be just a number like 92 miles
per hours or it could be a picture.
And we already use this to time stamp an audio only stream.
We also use it to add pictures to an audio only stream.
But now we're making it available
in iOS 4 to your apps as well.
So what can you do with it?
Besides text, you can use images to overlay.
Maybe I've got a bug, you know, a station
ident bug that I want to put over my stream
and I don't want to actually encode it into the movie.
Maybe I've got text to display like we saw.
I could even use this to do subtitling
although it's not ideal for that.
I can use it to mark points in the movie, things
of interest, chapters, or other things I could mark
where I am doing insertions, where
my bumper or where an ad occurs.
Or to give you a more complicated example I could
use this when, say, I'm filming a lecture like this,
and I've got a camera on me and
I've also got the slide deck.
Now, I could feed the slide deck as discrete
pictures and I could add metadata along
that said whether I should display the slide by
itself, the slide with me as a picture in picture,
or me as the main screen and the
slide is a picture in picture.
So there's a lot of flexibility here.
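
The lecture example can be modeled in a few lines of Python.
This sketch says nothing about how the metadata is carried in
the stream; it only shows why having metadata keyed to movie
time makes seeking simple. The class name, times, and layout
strings are all hypothetical.

```python
import bisect


class TimedMetadataTrack:
    """Metadata entries keyed by movie time, looked up by playback time."""

    def __init__(self, entries):
        # entries: iterable of (movie_time_seconds, payload)
        ordered = sorted(entries, key=lambda entry: entry[0])
        self.times = [time for time, _ in ordered]
        self.payloads = [payload for _, payload in ordered]

    def active_at(self, playback_time):
        """Payload of the latest entry at or before playback_time."""
        index = bisect.bisect_right(self.times, playback_time) - 1
        return self.payloads[index] if index >= 0 else None


layout = TimedMetadataTrack([
    (0, "slide only"),
    (30, "slide with speaker picture-in-picture"),
    (95, "speaker with slide picture-in-picture"),
])

print(layout.active_at(40))   # playing forward
print(layout.active_at(10))   # after seeking backward
print(layout.active_at(200))  # after seeking forward
```

Since the lookup depends only on the movie time, rewinding or
seeking needs no second channel to be repositioned.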
Now, in order to provide the metadata we're using ID3 tags.
Pretty well known standard.
And this exists in the movie as
a separate elementary stream.
So in our MPEG-2 Transport Stream
it's a separate elementary stream--
except when we are doing an audio only stream.
Then it's actually piggy-backed into the audio stream.
Now, I can add this with one of our tools
mediafilesegmenter and with mediastreamsegmenter.
I'll talk about that more later on.
And it's supported starting in iOS 4 in both
MPMoviePlayerController and in AVPlayerItem.
And you find this using the timedMetadata property.
Now, we also had some things with the encryption keys.
Now, it's tricky to get certificates to
work right, especially on iOS and a number
of our clients wanted more secure
key delivery than just HTTPS.
So we decided to add private protocols for keys.
Now, how does that work?
OK, we're using the custom URL scheme.
Some of you may have used this in
some of your apps for other reasons.
It uses the NSURLProtocol class
and if you're not familiar with it,
the URL Loading System Programming
Guide gives you good explanation.
In terms of the way it looks in
a playlist, it looks like this.
There is my key and I just specify my protocol,
and what happens is the player framework, when it sees
my protocol, says OK, I'll go and ask the
app, and your code responds and says, oh yeah,
yeah I know how to handle my protocol and you
go off and fetch that key however you want.
Now, the only gotcha there is because
you are giving that key to us, right?
That key is going to be one of these 128-bit
integers, so you've got to abide by the rules.
You got to give us the same key every time or at least
that key has to apply to subsequent segments just
like it would if it was a file being fetched.
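
The shape of that arrangement can be sketched in Python. This
is a language-neutral model, not the NSURLProtocol API: the
KeyLoader class, the "myprotocol" scheme, and the handler are
all hypothetical. It shows the two rules from the talk: the
app's handler must hand back a 128-bit key, and a key URI that
has already been seen is not fetched again.

```python
from urllib.parse import urlsplit


class KeyLoader:
    """Dispatches key requests to app-registered handlers by URL scheme."""

    def __init__(self):
        self.handlers = {}   # scheme -> callable(uri) returning key bytes
        self.cache = {}      # uri -> key bytes already fetched

    def register(self, scheme, handler):
        self.handlers[scheme.lower()] = handler

    def load(self, uri):
        if uri in self.cache:
            return self.cache[uri]          # same URI: no re-fetch
        scheme = urlsplit(uri).scheme
        handler = self.handlers.get(scheme)
        if handler is None:
            raise LookupError(f"no handler registered for scheme {scheme!r}")
        key = handler(uri)
        if len(key) != 16:
            raise ValueError("an AES-128 key must be exactly 16 bytes")
        self.cache[uri] = key
        return key


calls = []


def app_key_handler(uri):
    """Stand-in for app code that obtains the key however it wants."""
    calls.append(uri)
    return bytes(16)   # placeholder key for the example, not a real secret


loader = KeyLoader()
loader.register("myprotocol", app_key_handler)
first = loader.load("myprotocol://keys/movie1")
second = loader.load("myprotocol://keys/movie1")
```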
Now, we've made a number of performance improvements.
In particular, faster stream switching.
So when I want to pop up to a higher bit
rate because my network's gotten better
that transition happens much faster
in iOS 4 than it did previously.
We also get faster startup, so the movie starts up
initially much faster when you have a fast connection.
We also added, because some of our clients who were
delivering really long video on demand playlists found
that the playlist was taking a little
bit longer than they liked to download.
So what we did was we added support in the client
to un-gzip compressed playlists and
you can turn that on in your server.
For example if you're using Apache, you
just turn on the mod_deflate module.
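
To get a feel for why this helps, here is a small Python
sketch that gzips a long, made-up video on demand playlist and
checks that it round-trips. The segment count and filenames
are arbitrary; actual savings depend on the playlist.

```python
import gzip


def long_vod_playlist(segment_count):
    """Build a long, repetitive playlist of the kind that compresses well."""
    lines = ["#EXTM3U", "#EXT-X-TARGETDURATION:10", "#EXT-X-MEDIA-SEQUENCE:0"]
    for index in range(segment_count):
        lines += ["#EXTINF:10,", f"fileSequence{index}.ts"]
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


plain = long_vod_playlist(2000).encode("utf-8")
compressed = gzip.compress(plain)        # what the server would send
restored = gzip.decompress(compressed)   # what the client would recover

print(len(plain), "bytes uncompressed")
print(len(compressed), "bytes gzipped")
```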
[Noise] OK, so now I've got to my odds and ends.
Failover is the first one.
Now when you're delivering a variant playlist, you're
not required to just supply one variant at each bit rate.
You can supply multiple at each bit rate.
So in this case, I've got two variants at a lower
bit rate and two variants at a higher bit rate.
Now the client when he comes in, he's
just going to pick the first one.
So the client's going to be fetching from server 1.
Now what happens if server 1 goes
down, because, let's face it,
I mean even the best servers aren't
a hundred percent up time.
So if the client tries to fetch the Playlist
from server 1 and server 1 doesn't respond,
the client's going to failover to server 2.
Now you'll notice that in my higher bit rate
example, I'm actually getting from different servers.
There's no requirement that it'd be the same set of
servers serving the different copies of the same stream.
There's no requirement that each bit
rate have the same number of variations.
I could have 3 for the low and 2 for the high.
I could even only have 1 for one of the others.
Now, one point I want to make is this only fails
over if the server doesn't supply the file.
Now if the server supplies the
file but something's gone wrong,
and the server isn't updating the file anymore
it's not going to failover in that case
but hopefully we can fix that at some future date.
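
A hedged Python sketch of that failover rule, with the network
simulated. The server names, bandwidths, and the fake_fetch
function are invented; the point is only the ordering: try the
first variant listed at a bit rate, and move to the next one
only when the server does not supply the file.

```python
def group_by_bandwidth(variants):
    """Group playlist URLs by bandwidth, keeping their listed order."""
    groups = {}
    for bandwidth, url in variants:
        groups.setdefault(bandwidth, []).append(url)
    return groups


def fetch_with_failover(urls, fetch):
    """Return (url, body) from the first URL whose server supplies the file."""
    last_error = None
    for url in urls:
        try:
            return url, fetch(url)
        except OSError as error:   # server did not supply the file
            last_error = error
    raise OSError("no server supplied the playlist") from last_error


FAILOVER_VARIANTS = [
    (200000, "http://server1.example.com/low.m3u8"),
    (200000, "http://server2.example.com/low.m3u8"),
    (600000, "http://server3.example.com/high.m3u8"),
    (600000, "http://server4.example.com/high.m3u8"),
]

DOWN = {"server1.example.com"}   # simulate server 1 going down


def fake_fetch(url):
    """Simulated network fetch used in place of a real HTTP request."""
    if any(host in url for host in DOWN):
        raise ConnectionError(f"{url} did not respond")
    return "#EXTM3U\n"


groups = group_by_bandwidth(FAILOVER_VARIANTS)
used_url, body = fetch_with_failover(groups[200000], fake_fetch)
print(used_url)
```

A server that keeps returning a stale file never raises an
error here, which mirrors the limitation just mentioned.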
Now the last thing I want-- new feature I want to
point out is actually something that's been around.
Program Date-Time is a tag that allows you to associate
a wall clock time, real calendar time like, you know,
June 8th, 2010 at 11 o'clock in the morning and
it associates that with the start of a segment.
So it's associating a wall clock time with a point in
the movie and that association is going to carry forward
as you go through subsequent segments in your movie.
And in iOS 4, AVPlayerItem lets you seek to dates.
And the seeking is very straight forward.
It's just an NSDate that you pass.
Now one thing I want to point out is if
you use this and you have a discontinuity,
the discontinuity said, well things have changed.
Well one of the things that changed is it says I don't know
that the program date is still valid at a discontinuity.
So after each discontinuity, if you
want that program date association,
you're going to have to re-insert one of those tags.
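
The carry-forward rule, and the reset at a discontinuity, can
be sketched in Python as follows. The sample playlist and
function name are made up, and the tag spelling and ISO 8601
date format follow the published HLS draft.

```python
from datetime import datetime, timedelta


def segment_dates(playlist_text):
    """Return (segment_url, start_datetime_or_None) for each segment."""
    current = None    # wall clock time at the start of the next segment
    duration = 0.0
    result = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-PROGRAM-DATE-TIME:"):
            stamp = line.split(":", 1)[1].replace("Z", "+00:00")
            current = datetime.fromisoformat(stamp)
        elif line == "#EXT-X-DISCONTINUITY":
            current = None   # the association no longer holds
        elif line.startswith("#EXTINF:"):
            duration = float(line.split(":", 1)[1].split(",", 1)[0])
        elif line and not line.startswith("#"):
            result.append((line, current))
            if current is not None:
                # Carry the date forward by the segment's duration.
                current += timedelta(seconds=duration)
    return result


DATED = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-PROGRAM-DATE-TIME:2010-06-08T11:00:00Z
#EXTINF:10,
show0.ts
#EXTINF:10,
show1.ts
#EXT-X-DISCONTINUITY
#EXTINF:10,
ident0.ts
"""

dates = segment_dates(DATED)
for url, start in dates:
    print(url, start)
```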
Now at this point, I'd like to invite Roger back up
on stage to give you a demo that ties together some
of the things that we've been talking about.
[ Applause ]
>> Thank you Eryk.
Just before I do the demo, one thing I'd like to mention is
that with regard to seek to date, if you've had experience
of HTTP streaming and seeking before, you'll
know that the seek is a little bit rough
and it'll seek to the beginning of a segment.
One difference with seek to date is its subsecond accuracy.
You can seek very, very finely within a-- so
that's [laughs] some folks here are happy about that.
Great, because it's hard to write. So what we have for you
today for the demo, it's a very simple little application
and it is designed to show you how you can use two
of the features that Eryk talked about just now,
the discontinuity tag and Timed Metadata to
stitch together two different types of content
and use that to support kind of
a custom playback user interface.
So let's launch the app here, OK now let's play the movie.
So what we've got here is a video that's taken at a
park and what we've done is drop in some bonus content
into the middle of it and so what you can see here, we
have a custom controller and the bonus content is marked
with those different colors, the
red, the green, and the purple.
So what we'll do is start the playback here.
So there's my cat.
The first thing you'll notice is as the controller
reaches the first discontinuity, the red area,
you'll see a transition take place and so here it comes.
And so, here I am there was a discontinuity tag between
that last segment that had the cat in it and me over here.
The next thing you'll notice is we've disabled
seeking while you're in some of this bonus content.
Obviously you could implement any kind of policy you wanted
but that's kind of a simple one to show as an example.
You can still, you know, the play-pause
still works but seeking is disabled.
The next thing we have here is
some back to back bonus content.
And what's happening is that every-- as it plays forward,
your application is getting a callback which is synchronized
to playback and it's-- the callback
carries a little bit of Timed Metadata.
In this case, all it is, is a URL
to kind of an imaginary ad server.
And so the application here is using
that callback to trigger the enabling
and disabling of that playback controller.
So we can seek, we hit the bonus content, there's me again.
You must listen to me, you cannot seek away from me.
But OK, now we're back and so now we
can run back here and we can seek again.
We can seek back in the bonus content and there we go.
So, that's it.
This sample code is actually available.
It's associated with the session.
You can find it through the WWDC site.
So the content is up there on a public
server, in case you want to download
and take a look at how the metadata is embedded into it.
And we'll be available in the lab tomorrow if you'd like
to come by or even maybe for a little bit later today,
if you want to come by and ask us
questions about that sample code.
So I'll hand it back to Eryk.
[ Applause ]
>> OK, so I want to talk some about the
Tools that we used to create that sample,
particularly to create the streams that are in that sample.
So we have a set of Tools and I'm happy to announce
today that we've added a fifth tool into our Tool set
and these Tools as always are available at
connect.apple.com in the downloads iPhone folder.
So I'm going to talk about each of these
Tools, and to create the content for this demo,
we used mediafilesegmenter and the id3taggenerator.
So first point I want to make about
mediafilesegmenter is it's really easy to use.
I mean, if you want to get started with
HTTP live streaming, use mediafilesegmenter.
I mean honestly, this is how easy it is to use.
All you need is a movie file that's already H.264
and AAC and you pass it to mediafilesegmenter.
Boom, you're done.
You've created a playlist and segments.
Because mediafilesegmenter will do the
transport, MPEG-2 transport stream wrapping for you.
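To give a feel for what "a playlist and segments" means, here is a minimal Python sketch that writes the kind of on-demand playlist a segmenter produces. It is not the output of mediafilesegmenter itself: the function name, the segment file names, and the durations are assumptions chosen for illustration, and only the basic playlist tags are used.

```python
import math


def build_vod_playlist(segments):
    """Return playlist text for a finished (on-demand) presentation.

    segments: list of (uri, duration_in_seconds) pairs, a hypothetical input.
    """
    # The target duration must be at least as long as the longest segment.
    target = max(math.ceil(duration) for _, duration in segments)
    lines = [
        "#EXTM3U",
        f"#EXT-X-TARGETDURATION:{target}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    for uri, duration in segments:
        # Each segment is announced by an EXTINF line followed by its URI.
        lines.append(f"#EXTINF:{round(duration)},")
        lines.append(uri)
    # ENDLIST marks the playlist as complete: no more segments will be added.
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


print(build_vod_playlist([
    ("fileSequence0.ts", 10),
    ("fileSequence1.ts", 10),
    ("fileSequence2.ts", 7),
]))
```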
And now some people get a little
scared of mediafilesegmenter
because it's got a few options, you know, 20 or so.
Not that many.
In fact they break into four categories.
So it's really not as complicated as it might look.
So there's the main options, really important ones
like if I want to create just an audio only stream
or if I'm generating Variant Playlists in
particular, what's my target duration going to be.
The next set of options is-- those are
associated with names and locations.
This is where I want to put the files on the file system.
What URL they're going to be located in?
If it's an absolute URL, what's the prefix?
Things like that.
The third set is encryption tags.
These are things that specify how often I
want to rotate my Initialization Vector.
How often I want to rotate my key and also the
same sorts of names and location things that I have
with the segment files I also have for the key files.
What's the prefix going to be on the key files?
Where am I putting the key files on my file system?
That sort of thing.
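As a rough picture of what key rotation looks like in a playlist, the following Python sketch computes where a new key tag would appear if the key changes every few segments. It is a sketch under stated assumptions, not the segmenter's behavior: the function name, the key file naming scheme, and the example prefix are hypothetical, and the IV attribute is left out for brevity.

```python
def key_tags(segment_count, key_period, key_url_prefix):
    """Return (segment_index, tag) pairs, one per key change.

    key_period: number of segments encrypted with the same key (assumed input).
    key_url_prefix: prefix placed in front of each key file name (assumed input).
    """
    tags = []
    for index in range(0, segment_count, key_period):
        key_number = index // key_period
        # A key tag applies to every segment after it, until the next key tag.
        uri = f"{key_url_prefix}key{key_number}.key"
        tags.append((index, f'#EXT-X-KEY:METHOD=AES-128,URI="{uri}"'))
    return tags


for index, tag in key_tags(10, 4, "https://example.com/keys/"):
    print(f"before segment {index}: {tag}")
```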
And the fourth set of options are
those associated with metadata.
And I'll talk about those a little bit
more when I talk about the id3taggenerator.
Basically, there's a few odds and ends that aren't important
but those are the basic options on mediafilesegmenter.
Now once I've messed around with
mediafilesegmenter, I really want to try
and find out about how to create Variant Playlists.
So if I'm creating a variant playlist, then I want to have
several variants of my movie at different data rates, right?
So in this case, I'm starting out
with one variation of my movie.
I'll make a directory, I'll tell mediafilesegmenter
to put the files in that subdirectory
and by passing the generate variant playlist
option, I'm telling mediafilesegmenter
to create a plist that describes that variant.
So it's going to create that on the side.
It's going to use the name and in
fact the location of the movie file.
So if that movie file was in some relative directory,
the plist is going to be created alongside it.
Now I have another variation.
In this case, I'm just going to have two variations.
This one let's say it's my cellular.
I'm going to make a subdirectory and call
mediafilesegmenter again to segment that version
and then it's going to again create a plist.
Now I can call Variant Playlist Creator and
what I do is I tell it for each variant,
where's the playlist for that and
what's the plist that describes it?
I give those in the order I want them
to be in my variant and it creates it.
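The result of that step is a small playlist that points at the per-variant playlists. A minimal Python sketch of its shape follows; this is not the Variant Playlist Creator tool, and the function name, directory names, and bandwidth figures are made up for illustration.

```python
def build_variant_playlist(variants):
    """Return variant playlist text.

    variants: list of (playlist_uri, bandwidth_in_bits_per_second) pairs,
    given in the order they should appear (hypothetical input).
    """
    lines = ["#EXTM3U"]
    for uri, bandwidth in variants:
        # One STREAM-INF line describes the variant whose URI follows it.
        lines.append(f"#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH={bandwidth}")
        lines.append(uri)
    return "\n".join(lines) + "\n"


print(build_variant_playlist([
    ("wifi/prog_index.m3u8", 600000),
    ("cellular/prog_index.m3u8", 150000),
]))
```

The order of the input list is preserved, which matters because the first entry is the one a client starts with.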
So Variant Playlist Creator is great.
Once you've gotten started with that, you
can work your way up to mediastreamsegmenter.
Now mediastreamsegmenter is very
similar to mediafilesegmenter.
The big difference is it's not taking it from a file.
It's taking it from a pipe or a UDP port and it's
not expecting to get a movie, it's expecting--
or not an H.264 and AAC, it's expecting
to get a transport stream.
That's what it wants as input.
And it has even more options than mediafilesegmenter
but once you understand mediafilesegmenter,
you'll understand most of the options because
you've got that same basic set that you had.
With the exception, you don't have
generate variant playlists anymore.
Now you're going to have to create
your variant playlist on your own
and there's some slight differences
in the way the metadata options work.
But the big add-ons are Playlist Structure.
OK so, Playlist Structure is-- because with
mediastreamsegmenter I'm going to be creating Live
or Event Playlists, I need to be able to
tell like, is this a Live Playlist?
Is this an Event Playlist and how big is my
sliding window of content and how soon do I want
to start dropping playlists on, you know, do I want to
wait until I have a whole window of content or, you know,
will I start once I have a minute
or even 30 seconds of content.
And also if I'm doing that rolling window, what do
I want to do with the files after I get rid of them?
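The sliding window idea can be sketched in Python as follows. This is an illustration only, not how mediastreamsegmenter is implemented: the LivePlaylist class, its parameters, and the segment names are hypothetical. A window size of None stands in for an event playlist, which keeps every segment.

```python
from collections import deque


class LivePlaylist:
    """Keeps the most recent segments and tracks what has slid out of the window."""

    def __init__(self, target_duration, window_size=None):
        self.target_duration = target_duration
        self.window_size = window_size  # None: never drop (event-style playlist)
        self.window = deque()
        self.media_sequence = 0  # sequence number of the first segment listed
        self.expired = []  # files the caller may now delete or archive

    def add_segment(self, uri, duration):
        self.window.append((uri, duration))
        while self.window_size is not None and len(self.window) > self.window_size:
            old_uri, _ = self.window.popleft()
            self.media_sequence += 1
            self.expired.append(old_uri)

    def render(self):
        lines = [
            "#EXTM3U",
            f"#EXT-X-TARGETDURATION:{self.target_duration}",
            f"#EXT-X-MEDIA-SEQUENCE:{self.media_sequence}",
        ]
        for uri, duration in self.window:
            lines.append(f"#EXTINF:{duration},")
            lines.append(uri)
        # No ENDLIST tag: the client keeps reloading a live playlist.
        return "\n".join(lines) + "\n"


live = LivePlaylist(target_duration=10, window_size=3)
for n in range(5):
    live.add_segment(f"fileSequence{n}.ts", 10)
print(live.render())
print("expired:", live.expired)
```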
The last group of options of mediastreamsegmenter
is what I call Actions.
Particularly important ones there are because I'm getting my
data through a UDP port or a pipe, I could get a timeout.
I could not get data.
What do I want to do when I don't get data?
So there, I've created my streams.
The next thing I really kind of want
to do is I'd like to validate my streams.
Now if you're using our tools, you don't really need to
validate them 'cause we already went through a lot of effort
to make sure that they do the right thing.
But if you're creating your own playlist, you can use
the Media Stream Validator to validate your playlist.
If you use the parse option, what it does is it
simply looks at the playlist not at the segments
and checks to see whether it's following the rules.
If you use the validate option, what it's doing is
it will actually look at the segments and in fact
if it's a variant playlist that you're passing it, it will
look at all the individual variants and check them as well.
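A playlist-only check of the kind described can be sketched in Python. This is not the Media Stream Validator and covers only three rules for a media playlist (the header line, the presence of a target duration, and segment durations that do not exceed it); the function name and the sample playlists are assumptions for illustration.

```python
def check_playlist(text):
    """Return a list of problems found in a media playlist (empty if none)."""
    problems = []
    lines = [line.strip() for line in text.splitlines() if line.strip()]

    # Rule 1: the playlist must start with the EXTM3U header.
    if not lines or lines[0] != "#EXTM3U":
        problems.append("first line is not #EXTM3U")

    # Rule 2: a media playlist must declare a target duration.
    target = None
    for line in lines:
        if line.startswith("#EXT-X-TARGETDURATION:"):
            target = int(line.split(":", 1)[1])
    if target is None:
        problems.append("missing #EXT-X-TARGETDURATION")

    # Rule 3: no segment may be longer than the target duration.
    for line in lines:
        if line.startswith("#EXTINF:") and target is not None:
            duration = float(line.split(":", 1)[1].split(",", 1)[0])
            if duration > target:
                problems.append(f"segment duration {duration} exceeds target {target}")
    return problems


sample = "#EXTM3U\n#EXT-X-TARGETDURATION:10\n#EXTINF:12,\nfileSequence0.ts\n"
print(check_playlist(sample))
```

Checking the segments themselves, as the validate option does, would mean downloading and inspecting each file, which this sketch does not attempt.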
The last Tool I want to talk about is the ID3 Tag Generator.
This is a new tool that we just added.
It creates ID3 files and you use it with the
mediafilesegmenter with the meta-macro file option.
What does a meta-macro file look like?
Well there's a sample and basically you're
saying at this point in time in seconds,
I want you to pull in this content, this file.
So it's either an ID3 file that I generated
with the generator or it can be a picture.
Now with mediastreamsegmenter, it's a little bit different.
With mediastreamsegmenter-- the mediastreamsegmenter
is actually listening on a port for metadata
and you can tell ID3 Tag Generator to send it to a port.
And what it's going to do, it's going
to send right at that moment in time.
So you actually can insert
the metadata wherever you want.
Now some tips and tricks.
OK, so for variant playlists, you need to remember that the
first alternative is the one that's going to play initially,
and when you're delivering over both
cellular and Wi-Fi, it's really a good idea
to create two variants of your variant playlists.
One that you'll deliver on cellular
and one that you'll deliver on Wi-Fi
and you use the Reachability APIs to decide.
The reason why you do that is because it's going
to play the first variant initially and you want
to have it be a good data rate for whatever network
you're going over, whether it be cellular or Wi-Fi.
Now the set of variations that you should have in that
playlist should be identical between cellular and Wi-Fi,
the only difference should be which one is first.
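The "same set, different first entry" rule is easy to express in code. Here is a small Python sketch, assuming a list of variants with made-up names and data rates; the function name and the choice of which variant goes first on each network are illustrative, not recommendations from the session.

```python
def order_for_network(variants, first_uri):
    """Return the same variants with the one named first_uri moved to the front.

    variants: list of (playlist_uri, bandwidth) pairs (hypothetical input).
    """
    first = [v for v in variants if v[0] == first_uri]
    if not first:
        raise ValueError(f"{first_uri} is not one of the variants")
    rest = [v for v in variants if v[0] != first_uri]
    return first + rest


variants = [
    ("low/prog_index.m3u8", 150000),
    ("mid/prog_index.m3u8", 400000),
    ("high/prog_index.m3u8", 800000),
]

# Both lists contain every variant; only the starting point differs.
cellular_order = order_for_network(variants, "low/prog_index.m3u8")
wifi_order = order_for_network(variants, "mid/prog_index.m3u8")
print([uri for uri, _ in cellular_order])
print([uri for uri, _ in wifi_order])
```

The application would then pick which of the two variant playlists to request based on what the Reachability APIs report.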
The reason you want them identical is
because you're going to move around.
Your client is going to move around networks.
I might start off in here on Wi-Fi and go
outside in the street and now I'm on cellular
or vice versa and then I come back in, right.
The network's going to be changing all the time.
So you want to have the full range of
possibilities available to the client.
Now, if you're delivering these movies via web delivery, you
can use makerefmovie which is a tool that we make available
and that can target cellular or Wi-Fi and it
can also target desktop versus iPhone and iPad.
Now, encoding.
File size is very important over mobile, right?
And if you look at our recommendations,
I'll give you a pointer to the tech note
that has our recommendations a little later,
you'll see that we're pretty conservative
about what data rates we think you can support.
And when you're doing that, don't
forget the container overhead.
The transport stream is going to
add some overhead into your--
on your data rate and also now that you've got metadata,
the metadata is going to add some overhead as well.
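To put a number on the container overhead, here is a rough Python estimate. It relies on one fact about MPEG-2 transport streams, that packets are 188 bytes with a 4-byte header, and ignores everything else (PES headers, program tables, stuffing), so the result is a lower bound rather than a measurement. The function name and the example rates are assumptions.

```python
TS_PACKET_SIZE = 188  # bytes per transport stream packet
TS_HEADER_SIZE = 4    # bytes of header in each packet


def transport_stream_rate(media_bps, metadata_bps=0):
    """Lower-bound estimate of the on-the-wire rate after transport stream wrapping."""
    payload_per_packet = TS_PACKET_SIZE - TS_HEADER_SIZE
    return (media_bps + metadata_bps) * TS_PACKET_SIZE / payload_per_packet


# Hypothetical stream: 150 kbit/s of audio and video plus 2 kbit/s of metadata.
total = transport_stream_rate(150_000, 2_000)
print(f"at least {total:.0f} bits per second after wrapping")
```

In practice the real overhead is higher than this estimate, so it is worth measuring the segmented output against the data rate you are targeting.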
And you don't need to encode to the full screen dimensions.
You can encode-- we've got a very good video
scaler on our iOS devices, so you can encode at 2/3
or 3/4 of the screen size and still
get a very, very good experience.
Now because you're trying to minimize your data rate,
you can trade off frames per second versus video quality.
You've got this option-- do I make the image
a little worse and keep this frame rate up
or do I decrease the frame rate and
keep the quality of the images up?
People have different opinions about
how they should make that trade off.
Now when you're doing this, you want to
have multiple IDR frames per segment.
The more IDR frames you have-- if you have more IDR
frames per segment, we're going to do a better job
of stream switching and I want to reinforce
the point that the audio needs to be identical
across all the variants so that
you won't get audio artifacts.
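The IDR spacing that gives several IDR frames per segment is simple arithmetic, sketched here in Python. The segment length, frame rate, and target count in the example are assumptions, not figures from the session, and the sketch assumes every segment starts on an IDR frame.

```python
import math


def keyframe_interval_frames(segment_seconds, frames_per_second, idr_per_segment):
    """Largest IDR spacing, in frames, that still gives idr_per_segment per segment."""
    frames_in_segment = segment_seconds * frames_per_second
    return max(1, math.floor(frames_in_segment / idr_per_segment))


def idr_frames_in_segment(segment_seconds, frames_per_second, interval_frames):
    """Number of IDR frames in a segment that begins with an IDR frame."""
    frames_in_segment = segment_seconds * frames_per_second
    return math.ceil(frames_in_segment / interval_frames)


# Hypothetical encode: 10-second segments at 30 frames per second, 3 IDR frames each.
interval = keyframe_interval_frames(10, 30, 3)
print("IDR every", interval, "frames")
print("IDR frames per segment:", idr_frames_in_segment(10, 30, interval))
```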
Now if you're doing your encoding with something
like QuickTime Player 7, you want to use the movie
to MPEG-4 exporter because it gives
you more control over the encoding.
If you're just using export to web, it
gives you a very restricted set of options.
Now the 3 important things I want you to take away
from this: We're continuing to evolve HTTP streaming.
We've changed-- made a bunch of changes over this year.
We're anticipating making more changes in the future.
So you want to stay current.
You want to go and check on connect.apple.com
if we've updated the tools.
We try and announce that on the
dev forums but sometimes we miss.
And also, give us your feedback.
The changes we made in key delivery were the result
of feedback from people who were trying to use this,
trying to do various things with the system.
Again, my name is Eryk Vershen. I'm
the media technologies evangelist.
So if you have any questions about HTTP
live streaming, you can send me e-mail.
The big points on the documentation, you can
just go to the iPhone developer site and search
for HTTP live streaming and you'll find these.
First one is the HTTP Live Streaming Overview.
The second one is our best practices for creating
and deploying HTTP live streaming for the iPhone
and iPad which outlines what we recommend in
terms of data rates, in terms of resolutions.
And lastly, I want to mention that we've made
the specification for HTTP live streaming public.
It's available.
We do update it.
We've been through three versions in the last year.
We'll probably go through more.
And lastly, the dev forums are a great place to go.
The engineers who work on HTTP live
streaming do answer questions on the dev forum.
That pretty much wraps it up for us today.
We're not going to do a stand-up Q&A.
If you have questions, you can come up and talk to us for
the few minutes we have before we have to vacate the room
and I invite you to come to the labs tomorrow.
Thank you.
[ Applause ]