WWDC2014 Session 501

Transcript

>> Good morning everyone.
And welcome to Session 501,
"What's New with Core Audio".
I'm the first emcee
on the mic today.
My name is Torrey.
And we have been very busy.
We have a lot of interesting
things to share with you today.
We're going to start off by
talking about some enhancements
that we've made to
Core MIDI and how
that affects you and your apps.
Then we'll move on to
Inter-App Audio views,
and then we will have a large section on the new and enhanced AV Foundation audio.
And that will include a talk
about the Audio Unit Manager,
AVAudioSession, some
Utility classes,
and that last bullet point
there, AVAudioEngine,
is such a large topic
that it gets a session all
to itself directly
following this one
in the same room
starting at 10:15 a.m.
So without further ado,
let's talk about what's
new with Core MIDI.
If you have a studio, a
music studio, that you use
to make music, it may
look something like this.
So maybe there's a Mac in the
center of it, or an iOS device,
which are also very capable of being the center of your studio.
And connected to it may be
several USB MIDI devices,
controllers, break out
boxes that are connected
by a 5-pin DIN to legacy
equipment, musical instruments,
and then also you may have
a network session going.
Well, beginning in iOS
8 and Mac OS X Yosemite,
your studio can start
to look like this.
So imagine after making a very
quick Bluetooth connection
and sitting down on a couch
on the other side of the room
of your studio and
controlling all of your music.
That's what you'll be able to
do with MIDI over Bluetooth.
So starting in iOS 8 and in Mac
OS X Yosemite, you'll be able
to send and receive MIDI data
over Bluetooth Low Energy
connections on any device or Mac
that has native Bluetooth
Low Energy support.
The connection you establish is secure, meaning that pairing is enforced.
No one can connect
to your devices
without your explicit consent,
and after the connection is
established, it just appears
as an ordinary MIDI device that
any application that knows how
to communicate with a
MIDI device can talk to.
So to talk a little bit more
about how this connection works
over Bluetooth, I want to talk
about the two key roles involved
in a Bluetooth connection.
There's the Central
and the Peripheral.
You already have some
familiarity with this.
Maybe not with these names.
You can view your Central
as like your iPhone
and your Peripheral as like
your Bluetooth earpiece.
The Peripheral's job is to
become discoverable and say,
"Hey, I can do something.
You can connect to me."
So for Bluetooth MIDI,
the peripheral side will
advertise the MIDI capabilities.
It'll say, "Hey, I can do MIDI.
You can connect to me now."
And then that side waits.
The Central can scan
for a device
that says they can do MIDI and
then establish a connection.
After that Bluetooth
connection has been established,
MIDI data can be
shuttled bi-directionally
between both of these.
Now in order to have a
Bluetooth connection you have
to have one Central, and you
have to have one Peripheral.
And we allow Macs and iOS
devices to play either role.
So you can connect Mac
to Mac, iOS to iOS,
Mac to iOS, and vice versa.
So what does this mean for
you and your application?
If you are writing a
Mac OS X application,
the good news is you are already ready.
This is the MIDI Studio
Panel from Audio MIDI Setup,
which I'm sure you're
all familiar with.
If you look there
you'll see a new icon,
the Bluetooth Configuration
icon.
If you double click that icon,
you are going to
get a new window.
And this window will allow
you to play either the Central
or the Peripheral role.
If you look at kind of the
top third of the window,
you'll see where there's a
button that says Advertise.
And click Advertise to become
discoverable as Fresh Air.
That's a name that
you can modify.
Fresh Air is the name of my
MacBook Air because it's fresh.
Then the bottom two thirds
of it is the central view.
If someone is advertising, "Hey,
I can do MIDI," it will show
up in the bottom,
you click Connect
to establish the connection.
The pairing will happen,
and then a new MIDI device
will appear in the setup
that any application that uses MIDI devices can see and communicate with.
Now on iOS, there is
no audio MIDI setup.
So how do you manage your
Bluetooth MIDI connections?
You'll be using new
CoreAudioKit View Controllers.
There are 2 new CoreAudioKit
View Controllers
that you can add to
your application.
One of them that allows you to
play the role of the Central,
which means you scan and connect
and another that allows you
to play the role of Peripheral,
which means you advertise
and wait.
If you establish a
connection between 2 devices
over Bluetooth MIDI, and they're
not communicating for a while
and they are unused
by the application,
after several minutes we
will terminate the Bluetooth
connection to save power.
So what does all of this
look like in practice?
I'm going to show
you a short UI demo
of how users would use
this in their studios.
OK. I've got my demo
machine ready here.
And what I'm going
to do is I'm going
to launch audio MIDI setup.
This is the audio window.
We'll close this, and we
will go to the MIDI window.
Now if you'll notice here
in the MIDI window
there's this new Bluetooth
Configuration panel.
So if I double click
this, then I will see
that there are currently
no advertising Bluetooth
MIDI devices.
I want my Mac to play
the role of Central.
So I'm going to wait for someone
to become available
to connect to.
And I'm going to use
my iPad for that.
So here's my iPad.
And this is a little test
application that we wrote
to implement the
CoreAudioKit View Controllers
that I talked about earlier.
I am going to go to
Advertisement Setup,
and this will give me
the Peripheral view.
If you look here at the
top, you see the name
of this iPad Air
is iPad Air MIDI.
If I want to change this
name I could tap the "i",
but I'm OK with that name.
And then I will say
Advertise the MIDI Service.
Now after I'm advertising
the MIDI service,
back on the Mac OS X machine
you'll see iPad Air MIDI has
shown up here.
If I click Connect, after a few moments you'll see a new device
appear in the MIDI setup.
I'm going to launch MainStage because MainStage can receive MIDI notes and play back audio.
Go into Performance
Mode [music playing].
OK. So a big confession
here, I don't play keys.
But I do have an application
that plays keys really well
called Arpeggionome Pro.
So I'm going to launch that,
and I'm going to use it
to send MIDI data over to MainStage 3.
OK. Now one thing I want
to do really quickly here is
check my connection status
because I left it inactive
for quite a while here.
So I'm going to go back and make
myself advertise one more time.
[ Music Playing ]
So now this is live MIDI data
being sent over Bluetooth.
If I could get that volume
a little louder please.
Thank you.
So if I wanted to do this
preset, it's called Epic Fall.
And it is epic.
So that's MIDI being sent over Bluetooth.
And also this sends not
only the controller data,
but it also sends the SysEx
data that you may have
or any other type of MIDI.
A few final words before
I turn the mic over.
This ability to connect with Bluetooth MIDI connections will work on both OS X Yosemite
and iOS 8 using those
view controllers
that I told you about.
And it will work
on any Mac, iPhone,
or iPad that has native
Bluetooth Low Energy support.
So now I'm going to tell
you which ones those are.
For Macs, any Mac that was
manufactured in 2012 or later
and a mixed bag of
Macs that were released
in 2011 also have native
Bluetooth Low Energy support.
For the iPhone, the iPhone 4S
and greater have Bluetooth LE.
For the iPad, the first iPad with the Retina display and later, and all iPad minis, have native Bluetooth Low Energy support.
So this will work on
all of those systems.
Also, the connection
is really low latency.
It's very sensitive.
And the Bluetooth LE bandwidth
greatly exceeds the minimum MIDI bandwidth requirement of 3,125 bytes per second.
Standardization is in the works.
We're working with standards
bodies to standardize this
so more people can get in on it.
And the key takeaway for you is
if you're making your
iOS applications,
please start adding these
Bluetooth UI View controllers
immediately to your applications
so that users can manage
Bluetooth MIDI connections using
your app.
And the person who is
going to show you how to do
that is my colleague and
homeboy Michael Hopkins.
And I'll turn the
mic over to him.
>> Thank you very much, Torrey.
I'd like to talk to you this
morning about a new framework
for iOS called CoreAudioKit.
This framework provides
standardized user interface
elements for you to add to
your application to do things
like show the MIDI
over Bluetooth LE UI
that Torrey just demonstrated as
well as some new views for those
of you that are doing
Inter-App Audio.
We've designed these so that
we do all the heavy lifting
so that you don't have to worry
about rolling your own UI,
and you can just concentrate
on what makes your app unique.
Therefore, they are very easy
to adopt with a minimal amount
of source code, and they
work on both iPhone and iPad.
Looking specifically at these
interface elements for MIDI
over Bluetooth LE, as Torrey
showed you we have separated
these into two different
view controllers
so that you can choose
which one is appropriate
for your own application
or you can use both.
For example, if you use the UI
split view controller you can
have those both visible
at the same time.
The first one is the CABTMIDILocalPeripheralViewController.
That's quite a mouthful
this early in the morning.
If you want to advertise
your iOS device
as a Peripheral,
you use this class.
The source code for adopting
this is very straightforward.
You create a new instance
of that view controller,
get the navigation controller
object for your app and push
that view controller
onto the stack.
The CABTMIDICentralViewController is required if you want to discover and connect to Bluetooth Peripherals.
And you use that
in the same way.
You create the view
controller and push it
onto your view controller stack.
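As a minimal sketch (not code from the session), both view controllers can be pushed onto an existing navigation stack like this; the action method names are assumptions:

    #import <CoreAudioKit/CoreAudioKit.h>

    // Advertise this device as a Bluetooth MIDI peripheral.
    - (IBAction)showPeripheralSetup:(id)sender {
        CABTMIDILocalPeripheralViewController *peripheralVC =
            [[CABTMIDILocalPeripheralViewController alloc] init];
        [self.navigationController pushViewController:peripheralVC animated:YES];
    }

    // Scan for, and connect to, advertising Bluetooth MIDI peripherals.
    - (IBAction)showCentralSetup:(id)sender {
        CABTMIDICentralViewController *centralVC =
            [[CABTMIDICentralViewController alloc] init];
        [self.navigationController pushViewController:centralVC animated:YES];
    }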
Now I'd like to switch over
and talk about Inter-App Audio.
For those of you that weren't
present last year at WWDC,
we had a session talking
about this new technology
that we released with iOS 7.
In review, Inter-App Audio
allows you to stream audio
between one or more
apps in real time.
A host application can discover
available node apps even
if they are not running.
And please refer to
last year's session,
"What's New in Core Audio"
Session 206 for further details.
But looking at how this
works with the Host App
and a connected Node App, the
Node App can be an instrument,
an effect, or a generator.
And the Host App and Node App
can send audio back and forth.
In the case of an instrument,
the Host App can also send a
MIDI to that instrument app
and receive audio back.
The two user interface
elements that we provide
in iOS 8 are, firstly, the Inter-App Audio Switcher view,
which provides an easy way to
see all the Inter-App Audio apps
that are connected
together and switch
between them using a
simple tap gesture.
We also provide an Inter-App
Audio Host Transport view.
This displays the transport
of the host you're connected
to in your Node App
and allows you
to control the transport
playback, rewind,
and record in addition
to displaying
where in the Host Transport you
are via that numeric time code.
And I'd like to show a
demo of this in action.
I have 3 different
applications here
that we'll be using together in
our Inter-App Audio Scenario.
The first of which
is GarageBand,
which is the current
version of that application
that I've downloaded
from the iTunes store.
I also have a Delay
application and a Sampler.
Let's take a look at
the Sampler first.
This allows me to trigger sample
playback via the keyboard.
[ Music ]
So now let's go ahead and
connect this to GarageBand.
I'm going to launch GarageBand.
I'm going to connect
to that Sampler app,
and now this is connected
to GarageBand.
So the first thing I'd like to
demonstrate is the Inter-App Audio Switcher view in action, which this application has made visible via a button.
I press that, and
you can see now
that we have two Nodes shown.
The Host, as well as
our current application.
And I can switch over to
GarageBand simply by tapping.
I'm going to add
in an additional Inter-App
Audio App, the Delay effect.
And now if I was to switch
over to this application
without using the Switcher view, I could double tap
on my Home key, look, and
try to find that application.
Where is it?
It's difficult to find.
And that's why we've provided the Inter-App Audio Switcher view.
In this application, the Delay,
you can see that on the
lower right-hand corner.
And now that is showing
our Host, the Sampler,
as well as our current effect.
So it's very easy to
switch back and forth.
And you can see that it
just showed up there.
So that's the first view
I'd like to demonstrate.
And if I play back
on my keyboard,
we can hear that we're
now getting that Delay.
And this is interesting because we're sending audio from the host to our Sampler, then through an effect playing that delay, and then back.
Now the second view, the
Transport view, you'll see just
above that view, let me hide
that for you, and that allows me
to control the transport of
the Host [music playing].
I can do recording.
[ Music Playing ]
Sorry. I'm no Dr. Dre.
It's too early in the morning
for that, but you get the idea.
And these are the views
that we're providing
for your benefit.
So please adopt those
to add this functionality
to your application.
OK. So the goal of these user interface elements is to provide a consistent experience for your customers.
You do have some flexibility
in controlling some
of the visual appearances
of those controls.
They support a number
of different sizes.
So if you want a ginormous
UI you can have that,
or if you want them very
small you can do that.
The source code, as I'm going
to show you, is very easy
to add to your application.
And because these are subclasses
of UIView, you can choose
to create a view controller
if you want to add them
to a UI popover view on your iPad, or, as the example demonstrated, if you want to embed them directly in the content of your app you can do that as well.
Let's take a look at the code.
We import the umbrella header.
In this case, I'm
demonstrating how
to add the Switcher
View from a nib file.
So you go into Interface Builder, drag out your UIView, assign its class to be CAInterAppAudioSwitcherView, create an outlet for that view, and then in the viewDidLoad method we specify the visual appearance of that view.
And then we need to associate
an audio unit with that view
so that it can automatically
find the other apps
that are connected.
And that's all there is to it.
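A minimal sketch of what that viewDidLoad configuration might look like; the outlet name, the outputUnit property, and the particular appearance call are assumptions, not the session's exact code:

    // In the view controller's class extension:
    @property (nonatomic, weak) IBOutlet CAInterAppAudioSwitcherView *switcherView;

    // In the implementation:
    - (void)viewDidLoad {
        [super viewDidLoad];
        [self.switcherView setShowingAppNames:YES];               // visual appearance
        [self.switcherView setOutputAudioUnit:self.outputUnit];   // lets the view discover connected apps
    }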
And creating the Transport
view programmatically,
as this example shows, we create
the view, specify initial size
and location of that view,
configure its visual appearance,
associate an output
audio unit with a view,
and then finally we
add that transport view
as a subview of our
main content.
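Roughly, in code, that might look like the following sketch; the frame, the button color, and the outputUnit variable are illustrative assumptions:

    CAInterAppAudioTransportView *transportView =
        [[CAInterAppAudioTransportView alloc]
            initWithFrame:CGRectMake(20.0, 20.0, 300.0, 40.0)];  // initial size and location
    transportView.playButtonColor = [UIColor greenColor];        // one of its appearance properties
    [transportView setOutputAudioUnit:outputUnit];               // associate the host connection
    [self.view addSubview:transportView];                        // add to our main content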
OK. Now I'd like to switch
gears a little bit now back
to AV Foundation.
The rest of my presenters
including myself will be
focusing on this framework and
some of the new enhancements
and abilities that
we've added for you
to add to your application.
The first new feature is
for Audio Unit Management.
And that's the
AVAudioUnitComponentManager.
This is a Mac OS X Yosemite
API, Objective C-based.
And it's primarily designed for
Audio Unit host applications.
However, as you'll see,
we do have some end-user
features as well.
We provide a number of
different querying methods,
which enable your host
to find the Audio Units
on the system given some
criteria, for example,
number of supported channels.
We have a simple API
for getting information
about each individual
Audio Unit.
We have some new
tagging facilities
that I'll demonstrate
in a moment.
And finally we have a
centralized Audio Unit cache
so that if you have
multiple host applications
on your system, once one
host has scanned Audio Units,
and for a lot of people they
have a large number of them
so this can take quite some
time, all the other hosts
on the system share that
information so they don't have
to perform that exhaustive
scan again.
Let's take a look at
the API in more detail.
As I said, these are in AV
Foundation, and they're new.
The first class is the
AVAudioUnitComponentManager.
And this provides three
different search mechanisms
for finding Audio Units.
The first of which is
based on the NSPredicates.
We can use a SQL-based
language to provide strings,
which I'll show you in a
source code example later
for finding audio units
matching the given criteria.
We also have a block-based
mechanism
for finer programmatic control.
And for those of you
using older host apps
with our current
audio component API,
we have a backwards-compatible
mode as well.
Each of these search
methodologies returns an NSArray
of AVAudioUnitComponents.
And that class can be used
to get information
about the audio unit.
Now using our prior API,
if I wanted to do something
like find all stereo effects
that support two-channel input
and two-channel output, I'd have
to write a great deal of code.
That's OK.
But now with this new API we can
reduce all that to this simple,
elegant four lines of code.
The first of which is
retrieving an instance
of the sharedAudioUnitManager.
And here I'm using the
block-based search mechanism
to find all components
that pass a specific test.
And in this block I'm checking
to see if the type name
of that audio unit is equal
to the preset string
AVAudioUnitTypeEffect.
And then furthermore
we're checking to see
if that Audio Unit supports
stereo input and output.
You'll notice there is a stop parameter, so if you wanted to return only the first audio component matching this criteria, you could set the stop parameter to YES, and the search would stop immediately.
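As a sketch of roughly that search (the variable names are mine, not the session's):

    AVAudioUnitComponentManager *manager =
        [AVAudioUnitComponentManager sharedAudioUnitComponentManager];
    NSArray *stereoEffects =
        [manager componentsPassingTest:^BOOL(AVAudioUnitComponent *comp, BOOL *stop) {
            // Keep components that are effects and support stereo in and stereo out.
            return [comp.typeName isEqualToString:AVAudioUnitTypeEffect] &&
                   [comp supportsNumberInputChannels:2 outputChannels:2];
        }];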
OK. Now I'd like to move
on to talk about tagging.
A lot of people,
especially Dr. Dre
in his studio has a large
number of Audio Units.
So finding the right one
can be a bit challenging
because they're sorted
alphabetically
or by manufacturer.
And there's a much easier way for users
to find these Audio
Units now with tagging.
It's very similar to what
we have done with the Finder in the previous Mac OS X release.
Users can now specify their own
tags with an audio unit in order
to create broad categories
or even specific categories
of how they want to
organize their audio units.
They can apply one or more tags
in two different categories.
The first of which
is a system tag.
This is defined by the
creator of the audio unit.
And, for example, in Mavericks,
excuse me, in Yosemite,
I have to get that name in my
head, I personally liked Weed,
but I didn't get to vote.
The system tags are
defined by the creator.
And we at Apple have
added standard tags
to all the Audio Units that
we feel would be useful
to most of our users.
You can also have user tags.
These are specified by each
individual user on the system.
So if you have three users they
can each have their own set
of tags.
A tag is a localized string
in the user's own language.
Swedish, Swahili,
it doesn't matter.
They can be arbitrary, or they
can be a pre-defined type.
And these are all
in AudioComponent.h. They can
be either based on the type
of Audio Unit, for example a
filter or a distortion effect,
or they can be based
on the intended usage,
for example an audio unit useful
in a guitar or vocal track.
Now I'd like to show
a demo of tagging
in action using a
modified version of AU Lab.
So in AU Lab we can look
at all the tags associated
with all the built-in
audio units.
And here you see
that, for example,
the AU time pitch
has two standard tags
that are associated with
it, time effect and pitch.
And those are defined by us.
In addition you can see
that this distortion effect has
two user tags, one specifying
that it's useful
for a drum track
and another one for
a guitar track.
The API also provides developers
the ability to get a list
of all the system-defined
tags localized in the language
of the running host
as you can see here.
And I can also see
all of the user tags
that the users assigned to all
the Audio Units on this system.
Adding tags is as simple as typing a new one.
Now that's been added
to that Audio Unit.
And I can do a search
using this predicate-based
and other search mechanisms.
And it will search all
the Audio Units looking
for that particular tag.
So this is something
that is really exciting,
and we hope that you'll use this
API to add tagging functionality
to your own host application.
Let's take a look
now at the API.
To find an Audio Unit
with a specific tag,
in this example I'm going
to use the NSPredicate
filtering mechanism.
Here I'm defining a predicate.
It says that the component's allTagNames property has to contain a particular string, in this case, "My Favorite Tag", and this is the identical search that you just saw in my demo.
Once you've defined the
predicate you get an instance
of the shared AU Manager,
and then call
componentsMatchingPredicate,
which returns an array.
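A sketch of that predicate-based search in code (the tag string is just the example value from the demo):

    NSPredicate *predicate =
        [NSPredicate predicateWithFormat:@"allTagNames CONTAINS 'My Favorite Tag'"];
    AVAudioUnitComponentManager *manager =
        [AVAudioUnitComponentManager sharedAudioUnitComponentManager];
    NSArray *matches = [manager componentsMatchingPredicate:predicate];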
To get a list of
the tags associated
with this particular
AVAudioUnitComponent,
you use the userTagNames property.
You can assign to that as well.
And in this example I'm adding
two tags to the audio Unit.
We could get all tags
for a specific component,
and these will include the user
tags as well as the system tags.
That's the allTagNames property.
We can get a localized list of
all the standard system tags
by getting the Component Manager
and then calling the
standardLocalizedTagNames
property.
This is what I was displaying
in the pop up in my demo.
And finally I can get a list
of all the localized tags
that this user has assigned
across all the audio
units on the system.
And that, again,
you saw in my demo.
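Put together, reading and writing those tag collections might look like this sketch, where manager and component carry over from the snippets above and the tag strings are examples:

    component.userTagNames = @[@"Drums", @"Guitar"];             // assign this user's own tags
    NSArray *allTags       = component.allTagNames;              // user tags plus system tags
    NSArray *standardTags  = manager.standardLocalizedTagNames;  // localized system-defined tags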
For those of you that ship
Audio Units, and you want
to add your own built-in
tags to those Audio Units,
you need to go into your
AudioComponent bundle.
And in your info.plist, look at
your Audio Component Dictionary
and add a tag section.
These first two items are
examples of using standard tags,
and the third item
is a custom tag.
So you can have that
be something meaningful
to your own company,
for example,
if you have like the
Silver Effect Package,
you could add that tag.
If you do so, you can
also localize that tag
by adding an
AudioUnitsTag.strings file
into your bundle and then adding
localizations for each language
that you wish to support.
And please do not localize any
of our standard system tags.
We've already done so for you.
So, in summary, if you're a
host developer please adopt the
AVAudioUnitComponentManager API,
so your users can tag
all their Audio Units.
And if you're an
Audio Unit developer,
please add system tags
to your audio units.
So without further
ado I'd like to turn
over this session
to Eric Johnson.
He'll be discussing tips and
tricks and new functionality
in the AVAudioSession.
Eric?
[ Applause ]
>> Good morning.
So I'll be continuing on with
the AV Foundation framework.
This time we're on iOS
only with AVAudioSession.
So today we're just going to
spend a few minutes talking
about some best practices
focusing
on managing your
session's activation state,
and then also talking
about just a little bit
of new things in iOS 8.
Before we dive in I wanted
to call your attention
to an updated Audio Session
Programming Guide that's
available on
developer.apple.com.
Since we saw you all
last year at this time,
this guide has been updated
so that it has been rewritten
in terms of AVAudioSession,
so it's no longer referring
to the deprecated C API.
That's a really great update.
And for those of you who
are maybe not that familiar
with Audio Session, there
was a talk from two years ago
where Torrey talked
about Audio Session
and also Multi-Route
Audio in iOS.
All right.
So let's dive into talking
about managing your
session's activation state.
So there's your application
state.
And then there's
Audio Session state.
And they're separate things.
They're managed independently
of each other.
So if you've been
doing development
for iOS you are probably
familiar with app states.
So this is whether your app is
running or not, whether it's
in the foreground
or the background,
if it's been suspended.
Your Audio Session
activation state is binary.
It's either inactive or active.
Once you've made your
session active you do need
to be prepared to handle
interruptions, and we'll talk
about what that means.
So let's look at an example
of how an Audio Session
state changes over time.
So here we're on an iPhone.
We have our application
on top, our Audio Session.
Let's say that we're
developing a game app.
And then on the bottom we have
the phone's Audio Session.
And right now the user
is not in a phone call,
and they haven't
launched their app yet,
so both sessions
are idle, inactive.
So now the user launches
our app.
When we first come into the
foreground our Audio Session is
still inactive.
And because we're a game app, we
want to make our session active
when we're in the foreground
so that we can be
ready to play audio.
So we'll do that.
And we're going to
just play some music,
so we're now happily playing
music in the foreground
with an active Audio Session.
So then the phone
starts ringing.
We get interrupted; the system sends us an interruption event.
The phone's Audio
Session becomes active
and plays the ringtone.
And the user decides
to accept the call.
So the phone's Audio
Session stays active,
and our Audio Session has been
interrupted, so we're inactive.
And then the user ends the
call, hangs up, says goodbye,
and now the system is going
to deliver an end interruption
event to our Audio Session.
And we're going to
use that as a signal
to make our session active
again and resume playback.
And we continue in this state.
So this is a typical
example of how something
like a game application
interacts with the phone app's Audio Session on an iPhone.
So the way that you need
to manage your application's
Audio Session state is actually
going to depend on
how you use audio.
We've identified a number of
different types of applications
that commonly use audio on iOS.
And we don't have time to talk
about all of these this morning,
and you'd probably be
bored to death if we did.
So we're just going to
talk about a few of these.
So let's continue
on with the idea
that we're developing
a game app.
So for game apps usually what
we recommend is that when you're
in the foreground, you'll want
to have your Audio
Session active.
So a good place to make
your Audio Session active is
in the app delegate's
applicationDidBecomeActive
method.
So that will cover the case
when you're being launched.
If you're coming from the
background into the foreground,
or if you are already
in the foreground
and the user had swiped
up Control Center
and then dismissed it, you'll be
covered in each of those cases.
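As a rough sketch of that, with minimal error handling (the ambient category here is just an example choice for a game):

    - (void)applicationDidBecomeActive:(UIApplication *)application {
        AVAudioSession *session = [AVAudioSession sharedInstance];
        NSError *error = nil;
        [session setCategory:AVAudioSessionCategoryAmbient error:&error];
        if (![session setActive:YES error:&error]) {
            NSLog(@"Could not activate audio session: %@", error);
        }
    }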
So once you've made your session
active you can leave it active,
but you do need to be prepared
to deal with interruptions.
So if you get a begin
interruption event,
you should update
your internal state
so that you know
that you're paused.
And then if you get an
end interruption event,
that's your opportunity to
make your session active
and to resume audio playback.
And this is just like the
example that we looked
at just a few minutes ago.
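A sketch of handling those interruption events; the selector, the paused flag, and resumePlayback are hypothetical names:

    - (void)registerForInterruptions {
        [[NSNotificationCenter defaultCenter]
            addObserver:self
               selector:@selector(handleInterruption:)
                   name:AVAudioSessionInterruptionNotification
                 object:[AVAudioSession sharedInstance]];
    }

    - (void)handleInterruption:(NSNotification *)notification {
        NSUInteger type = [notification.userInfo[AVAudioSessionInterruptionTypeKey]
                              unsignedIntegerValue];
        if (type == AVAudioSessionInterruptionTypeBegan) {
            self.paused = YES;                                // remember that we were interrupted
        } else if (type == AVAudioSessionInterruptionTypeEnded) {
            [[AVAudioSession sharedInstance] setActive:YES error:nil];
            [self resumePlayback];                            // pick up where we left off
        }
    }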
Media playback apps need
to manage their Audio Session
state a little bit differently.
So I'm talking about
applications
like the built-in music app
or podcast or streaming radio.
And these are the
types of applications
that we usually have
a play/pause button,
and they're what we refer
to as non-mixable meaning
that they'll interrupt the audio
of other non-mixable
Audio Sessions.
So for these types of
applications we recommend
that instead of making your
session active immediately
when you enter the
foreground that you wait
until the user presses the Play button.
And the reason that we give you
that advice is sometimes
the user brings your app
into the foreground just to see
if they have a particular
podcast episode downloaded
or to see if they have
a song in their library.
And they don't necessarily want
to interrupt other
audio that was playing.
So it's good to wait
until they press Play.
So like in the case of a game
app once you've made your
session active you
can leave it active.
But, again, you need
to be prepared
to handle interruptions.
So if you get a begin
interruption event,
you should update your UI.
So if you have a play/pause
button it's a good time
to change that state
and also keep track
of your internal states so that
you know that you're paused.
One thing you do not need
to do is you do not need
to make your session inactive
because the system has
already done that for you.
That's what the interruption is.
So then if you get an
end interruption event,
we ask that you honor
the ShouldResume option.
So if this option is part of
the info dictionary that's part
of that notification, that's
the system giving you a hint
that it's OK to make
your session active
and to immediately
resume playback.
If you don't see that option
as part of the notification,
then you should wait for the
user to press play again.
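In the end-interruption branch of a handler like the one sketched earlier, that check might look like this (resumePlayback is again a hypothetical method):

    NSUInteger options = [notification.userInfo[AVAudioSessionInterruptionOptionKey]
                             unsignedIntegerValue];
    if (options & AVAudioSessionInterruptionOptionShouldResume) {
        [[AVAudioSession sharedInstance] setActive:YES error:nil];
        [self resumePlayback];
    } else {
        // Update the play/pause UI and wait for the user to tap Play.
    }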
OK. So we talked about for game
apps and media playback apps
when you would make
your session active.
What about making
your session inactive.
So if you are something like
a navigation or a fitness app,
you're typically going to be
playing short prompts of audio.
And you're going to be using
the duck others category option
which will lower the volume
of other audio applications
on the system.
So it's important when you're
done playing your short prompts
that you make your
session inactive
so that the other audio is
able to resume at full volume.
If you're a voice-over-IP app or a chat app, or maybe one of these apps that has kind of like a browser view
where you're playing
short videos,
then you are usually going to be
what we refer to as non-mixable,
meaning that you're going
to interrupt other audio.
And so it's important that
when you're done playing audio
that you make your
session inactive
so that other sessions
are able to resume.
And it's a good idea to use
the NotifyOthersOnDeactivation
option when you make
your session inactive.
And that way the system is able
to tell an interrupted
Audio Session
that it's OK for them to resume.
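A sketch of that deactivation call when your playback finishes:

    NSError *error = nil;
    [[AVAudioSession sharedInstance]
            setActive:NO
          withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation
                error:&error];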
All right.
So now let's shift gears
a little bit and talk
about managing your secondary
audio in response to other audio
on the system playing.
So first let me explain
what I mean
by secondary audio
and primary audio.
So let's say we're
developing a game application.
Our primary audio is going
to be our sound effects,
our explosions, beeps and
bloops, short bits of dialog.
And it's the kind of audio that
really enhances the gameplay.
And it's also the kind of audio
that, if the user was listening
to music when they launched your
app, you still want it to play.
And it's OK that it mixes
in with the other music.
By secondary audio, I am really
talking about your soundtrack.
This is audio that also enhances the gameplay,
but if the user was previously
listening to their music
or their podcast, you'd just
as soon have your
soundtrack be muted.
And then if the user
stops their music or podcast playback, then you'd like
to have your soundtrack resume.
So in iOS 8 we've added a bit
of new API to help you do this.
We've added a new property
and a new notification.
The property is called secondaryAudioShouldBeSilencedHint.
As the name implies, it's a hint
that the system is giving you
that it's a good time to
mute your secondary audio.
So this is meant to be used by
apps that are in the foreground.
And we recommend that you
would check this property
in applicationDidBecomeActive.
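For example, a sketch of that check, where soundtrackPlayer is a hypothetical AVAudioPlayer for the game's music:

    if ([AVAudioSession sharedInstance].secondaryAudioShouldBeSilencedHint) {
        [self.soundtrackPlayer pause];   // other audio is playing; keep only sound effects
    } else {
        [self.soundtrackPlayer play];    // nothing else is playing; run the soundtrack too
    }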
Going along with the
new property is our
new notification.
This is the AVAudioSessionSilenceSecondaryAudioHintNotification.
Another mouthful for this
early in the morning.
So this notification will be
delivered to apps that are
in the foreground with
active Audio Sessions.
And it's kind of similar to our interruption notification in that it's two-sided.
There's a begin event, there's
an end event all wrapped
up in the same notification.
So when you get a begin SilenceSecondaryAudioHint notification, that means that it's a good time to mute your secondary audio.
And if you get the end
event it's a good time
to resume your soundtrack.
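A sketch of observing and handling that notification; the selector and soundtrackPlayer are hypothetical names:

    - (void)registerForSecondaryAudioHints {
        [[NSNotificationCenter defaultCenter]
            addObserver:self
               selector:@selector(handleSecondaryAudioHint:)
                   name:AVAudioSessionSilenceSecondaryAudioHintNotification
                 object:[AVAudioSession sharedInstance]];
    }

    - (void)handleSecondaryAudioHint:(NSNotification *)note {
        NSUInteger type = [note.userInfo[AVAudioSessionSilenceSecondaryAudioHintTypeKey]
                              unsignedIntegerValue];
        if (type == AVAudioSessionSilenceSecondaryAudioHintTypeBegin) {
            [self.soundtrackPlayer pause];   // other audio started; mute the soundtrack
        } else {
            [self.soundtrackPlayer play];    // other audio stopped; bring the soundtrack back
        }
    }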
So let's look at what
this looks like in action.
So on the far right we have
the built-in music app,
and it's currently
in the background.
It's not playing audio.
On the far left we have our
game app that we're developing.
So we're playing our primary
audio, the sound effects,
and we're also playing
our soundtrack
because there was no
other music playing.
And in the middle we have iOS
helping to negotiate things.
So the user has his
headphones plugged in,
and he presses that
middle button.
And the music app responds
to remote control events.
So it uses that as a
signal to begin playback.
The music app also informs iOS
that it's using its
audio output.
And so then the system is able
to send a begin notification
to our app that's
in the foreground.
And in response to that we
can mute our soundtrack.
So our app is still
in the foreground.
The only thing that's
really changed is
that the user used their middle
button to play their music.
And we've responded
to the notification
that we got from the system.
So now some time passes.
We're in this state for a while,
and the user presses
the middle button again.
So the music app responds
by pausing its playback
and telling the system that
it's pausing its audio output.
And then the system is able
to send the end notification
to our app that's still
in the foreground.
And in response to that,
we resume our soundtrack.
So hopefully this will
be pretty easy to use.
There's one new property and
then the two-sided notification
that you can use to
manage your soundtrack.
So kind of on a similar thread,
in the past we've given advice
about how you could manage
your secondary audio based
on the isOtherAudioPlaying
property.
And we had given
advice about choosing
between the ambient category
or solo ambient based
on the state of this property.
What we're recommending now
is that if you're this type
of application, that you
just use the ambient category
and then use the new property
and the new notification
to manage your soundtrack.
All right.
I'm going to hand things
over to Doug Wyatt.
He's going to tell us about
some new utility classes
in AV Foundation.
>> Thank you.
Good morning.
I'm Doug Wyatt.
I'm an engineer in
the Core Audio Group,
and I'd like to talk to you
about some new audio classes
in the AV Foundation framework.
We'll start out with some background and tell you what we're up to and why.
Then we'll start looking through
these classes one by one.
And I'll tie things up at
the end with an example.
So in the past our CoreAudio
and AudioToolbox APIs,
they're very powerful, but
they're not always easy
for developers to get their
hands around at the beginning.
We've tried to work around this
by providing some C++
utility classes in our SDK,
and that's helped
to some extent,
but example code
gets copied around.
It evolves over time.
And we think it's best in
the long run if we sort
of solidify these things
in the form of API,
and that's what we're
providing now with these classes
in the AV Foundation
framework starting
with Mac OS X 10.10 and iOS 8.
So our goals here, we don't want
to make a complete
break with the past.
We want to build on
what we've already got.
So we're going to,
in many cases,
wrap our existing
low-level C structures inside
Objective-C objects.
And in doing so, these lower
level C structures become easier
to build.
But we can also extract them
from our Objective-C objects
and pass them to the low-level
APIs we might already be using.
And this is a philosophy we used
also with the AVAudioEngine,
which we'll be examining in more
detail in the next session here.
And I should also mention an
overriding goal here is for us
to stay real-time safe,
which isn't always
easy with Objective-C.
We can't call methods or access properties on the audio rendering thread.
So we've taken great care to do that in our implementations, and as we go I'll give you a couple of examples of places where you need to be aware of real-time issues when you're using these classes.
OK. So here are the
classes we'll be looking
at today in this session.
At the bottom in green
we've got AVAudioFormat,
which has an
AVAudioChannelLayout.
In blue we have
AVAudioPCMBuffer,
which has an audio format.
Every buffer has a
format describing it.
And finally we'll be
talking about AVAudioFile,
which uses AVAudioPCMBuffer
for I/O
and as you would expect the
file also has format objects
describing the file's
data format.
So first let's look
at AVAudioFormat.
This class describes the actual
format of data you might find
in an audio file or stream
and also the format of the audio you might be passing
to and from APIs.
So our low-level structure here
for describing an
audio format is an
AudioStreamBasicDescription,
which in retrospect might have
been called "audio stream not
so basic" or "audio stream
complete description"
because there's a
lot of fields there,
and it can be a little
challenging to get them all set
up consistently especially
for PCM formats.
But, you know, the beauty
of this structure is
that it describes just about
everything we would want to use
to describe an audio format.
But, again, it's a
little challenging.
But, in any case, you can
always create an AVAudioFormat
from an
AudioStreamBasicDescription,
which you might have
obtained from a low-level API.
And you can always access
a stream description
from an AVAudioFormat.
But now we can move
on to other ways
to interact with AVAudioFormat.
So in the past we've had this
concept of canonical formats.
And this concept goes
back to about 2002
in Mac OS 10.0 or 10.1 or so.
So this format was
floating-point, 32-bit,
de-interleaved, but then
we got along to iOS,
and we couldn't really
recommend using float everywhere
because we didn't have the
greatest floating-point
performance initially.
So for a while canonical
was 8.24 fixed-point.
But because of that
schism we want to reunite
under something new now.
We've deprecated the
concept of canonical formats.
Now we have what we
call a standard format.
We're back to non-interleaved
32-bit floats on both platforms.
So the simplest way to construct an AVAudioFormat now: you can create a standard format by specifying just a sample rate and a channel count.
You can also query any
AVAudioFormat you might come
across and find out if it is
a standard format using the
standard property.
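A sketch of both of those calls (the 44.1 kHz stereo values are just examples):

    AVAudioFormat *standardFormat =
        [[AVAudioFormat alloc] initStandardFormatWithSampleRate:44100.0 channels:2];
    BOOL isStandard = standardFormat.standard;   // YES: deinterleaved 32-bit float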
We've also provided for
using Common Formats
with AVAudioFormat.
And we define Common Formats
as formats you would often use
in signal processing
such as 16-bit integers
if you've been using that
on iOS or other platforms.
We also provide for
64-bit floats.
And it's very easy to create
an AVAudioFormat in one
of these formats by
specifying which one you want,
the sample rate, channel count, and whether it's interleaved.
You can query any format to
see if it is some common format
or something else using
the Common Format property.
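For example, a 16-bit interleaved format might be created and checked like this sketch:

    AVAudioFormat *int16Format =
        [[AVAudioFormat alloc] initWithCommonFormat:AVAudioPCMFormatInt16
                                         sampleRate:44100.0
                                           channels:2
                                        interleaved:YES];
    if (int16Format.commonFormat == AVAudioPCMFormatInt16) {
        // Safe to treat the underlying samples as 16-bit integers.
    }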
OK. So that's AVAudioFormat.
Let's look at
AVAudioChannelLayout.
Briefly here, this describes
the ordering or the roles
of multiple channels, which
is especially important,
for example, in surround sound.
You might have left, right,
center, or you might have left,
center, right, and so on.
It's important to know the
actual order of the channels.
So every AVAudioFormat may
have an AVAudioChannelLayout.
And, in fact, when
constructing the AVAudioFormat,
if you were describing three
or more channels you have
to tell us what the layout is.
So it becomes unambiguous to
anyplace else in the system
that sees that AVAudioFormat
what the order of the channels is.
So the underlying
AudioChannelLayout object is
pretty much exposed
the way it is here.
You can go look at that
in the CoreAudioTypes.h,
but we have wrapped that up
in the AVAudioChannelLayout
for you.
OK. Moving on let's look
at AVAudioPCMBuffer.
So buffer can be a sort of
funny term when we're dealing
with de-interleaved audio
because of the audioBufferList
structure,
but that aside you can
think of it simply as memory
for storing your audio data
in any format including
non-interleaved formats.
And here are the low-level structures, which in particular can also be a bit of a bother to deal with because AudioBufferList is variable length.
So you can simply create
an AVAudioPCMBuffer.
It'll create an audioBufferList
for you of the right size.
And you can always fetch it back
out of the AVAudioPCMBuffer.
Here's the initializer.
So to create a buffer
you specify the format
and a capacity in
audio sample frames.
You can always fetch back the
buffer's format and the capacity
with which it was constructed.
And unlike audioBufferList,
which has a simple byte size
for every buffer, here
we've separated the concept
of capacity and length.
So there's the fixed
capacity it was created with
and the frame length,
which expresses the number
of currently valid
frames in the buffer.
Some more methods here.
To get to the underlying samples
we provide these simple type-safe accessors.
And this is a good
time now to say a word
about real-time safety
because these are properties.
And as useful as they may be for
actually getting to the data,
since they're properties they
may involve a method lookup,
which can, in principle,
take a miss on the lookup
and cause you to block.
So if you're going to be
using AVAudioPCMBuffers
on audio real-time threads,
it's best to cache these members
in some safe context when you're
first looking at the buffer
and use those cached members
on the real-time thread.
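A sketch of creating a buffer and caching its channel pointers ahead of real-time use (the values are illustrative):

    AVAudioFormat *fmt =
        [[AVAudioFormat alloc] initStandardFormatWithSampleRate:44100.0 channels:2];
    AVAudioPCMBuffer *buffer =
        [[AVAudioPCMBuffer alloc] initWithPCMFormat:fmt frameCapacity:4096];
    buffer.frameLength = 4096;                           // how many frames are currently valid
    float * const *channels = buffer.floatChannelData;   // cache this outside the render thread
    channels[0][0] = 0.0f;                               // first sample of the first channel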
OK. That's PCM Buffer.
Now we can look at AVAudioFile,
which wraps all these
other classes together.
So here we let you
read and write files
of any CoreAudio
supported format.
This ranges from .M4A,
.MP4, .WAV, .CAF, .AIFF,
and more I can't
think of right now.
So in accessing the file, here
we give you a single way to read
and write the file
completely independent
of the file's actual
data format.
So if it's an encoded format
like AAC or Apple Lossless
or MP3, if there's a
codec on the system,
and in most cases there is,
we will, transparently to you,
decode from that format
as you read the file.
Similarly when you're writing
an audio file we will encode
from PCM into that encoded
format if we have an encoder.
So to do this, the
file has this concept
of the processing format.
And the processing format
is simply the PCM format
with which you will
interact with the file.
So you specify the PCM format
when you create the file,
and it has to be either a
standard or common format.
The only limitation here is
that we don't permit sample
rate conversion as you read
from or write to a file.
Your processing format
needs to be
at the same sample rate
as the file itself.
Now, if you're familiar with
the Audio Toolbox Extended Audio
File API, this is
functionally very similar,
and it's just a bit
simpler to use.
So I'm looking now
at the initializers
and some accessors
for AVAudioFile.
Here's the initializer
for reading from a file.
If you don't specify a processing format, you simply get the default behavior,
which is that your
processing format will be a
standard format.
Very similarly to creating
an AVAudioFile for writing,
the only extra information
you need
to give us is a settings
dictionary.
This is the same
settings dictionary passed
to AVAudioRecorder.
And in there there are keys,
which specify the file format
you want to use, and in the case of, for example, AAC,
you can specify the bit rate
and any other encoder settings.
Those are in the
settings dictionary.
So once you've built a file
you can always access back the
actual file format on disk.
So that might be, for example,
AAC, 44 kHz, two channels.
But you can also query
the processing format
with which you created the file.
And in the case of the
two simplest initializers,
this would be floating-point,
32-bit,
same sample rate as the file.
Same channel count as the file.
OK. So to read and write
from AVAudioFiles there's a
simple method, readIntoBuffer,
and that will simply
fill the AVAudioPCMBuffer
to its capacity, assuming you don't hit the end of the file. writeFromBuffer is a little different in that it looks at the buffer's frame length rather than its capacity,
so it writes all
of the valid frames
from that buffer to the file.
And you can do random access I/O
when reading from audio files.
So this is like the
standard C-library's seek
and tell functions,
fseek and ftell.
You can query the frame
position to see where you are
when reading an audio file.
And you can also seek to a
different position in the file
by setting the framePosition property before you read.
And the next read will proceed
sequentially from that point.
OK. I'd like to tie all
these classes together now
with this short example.
And I've got four screens here.
We'll see what it's like
to open an audio file,
extract some basic
information from it and read
through every sample
in the file.
So here we have initForReading.
We simply pass the URL.
I'm using the variant
here that's explicit,
but I'm passing PCM
Float 32 always.
I could have left those off
and gotten a standard format.
I'm going to fetch some basic
information from the file
and print it, including the file's on-disk format
and the processing format.
I can query the audio file's length in sample frames.
And I can convert that length in
frames to a duration by dividing
by the file's sample rate.
OK. Next I'm going to create
a PCM Buffer to read from.
Since the file might be
large, I don't want to try
to read it all into
memory at once.
So I'm going to loop through it
128K sample frames at a time.
So I'm going to create a
buffer with that capacity.
And notice I'm just going to pass the audio file's processing format when allocating this buffer, and that ensures that the buffer is in the same format that the file will be giving me.
And here I'm ready to start
reading through the file.
And I'm going to read one buffer
at a time until I get to the end
so I can query the current frame
position and to see if it's less
than the length I
discovered earlier.
I can read into buffer,
which will again fill
the buffer to capacity.
I can double check to
see if I'm done by seeing
if I got a zero length buffer.
And this is a lot of code, but
it boils down to two for loops.
The outer one is walking
through all of the channels
in the buffer if it's
a multichannel file.
And then the inner-loop
will look
at every sample in that buffer.
So given every sample, I can
look at its absolute level
and see if it's the
loudest, or if it's louder
than the loudest sample I've
found so far, and if so,
I can record that level and
where I found it in the file.
So there, in about four screens
of code, I opened an audio file.
I read through the whole
thing one sample at a time.
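A condensed sketch of that example, under the same assumptions (the fileURL variable and the chunk size are illustrative, and it relies on the float processing format):

    #import <AVFoundation/AVFoundation.h>
    #include <math.h>

    NSError *error = nil;
    AVAudioFile *file = [[AVAudioFile alloc] initForReading:fileURL error:&error];
    AVAudioPCMBuffer *buf =
        [[AVAudioPCMBuffer alloc] initWithPCMFormat:file.processingFormat
                                      frameCapacity:128 * 1024];   // 128K frames per read
    float loudest = 0.0f;
    AVAudioFramePosition loudestPosition = 0;

    while (file.framePosition < file.length) {
        AVAudioFramePosition readStart = file.framePosition;
        if (![file readIntoBuffer:buf error:&error] || buf.frameLength == 0) {
            break;                                    // end of file or read error
        }
        for (AVAudioChannelCount ch = 0; ch < buf.format.channelCount; ch++) {
            const float *samples = buf.floatChannelData[ch];
            for (AVAudioFrameCount i = 0; i < buf.frameLength; i++) {
                float level = fabsf(samples[i]);      // absolute sample level
                if (level > loudest) {
                    loudest = level;
                    loudestPosition = readStart + i;  // frame where the loudest sample lives
                }
            }
        }
    }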
OK. So moving on I'd like to
just sort of foreshadow the uses
of these classes in the
AVAudioEngine session,
which will follow this one.
So at the bottom
we see AVAudioFile
and AVAudioPCMBuffer.
And those are both used
by something called
AVAudioPlayerNode,
which will be your
basic mechanism
for scheduling audio
to play back.
The AVAudioPlayerNode is a subclass of a more generic AVAudioNode
class, which is some unit
of audio processing, and we'll
see how AVAudioFormats are used
when describing how to
connect AVAudioNodes.
So that brings us to the end
of my section of this talk.
We saw the AVAudioFormat, ChannelLayout, PCMBuffer, and File classes.
You can use these without
AVAudioEngine using your
existing code with the
Core Audio, Audio Toolbox,
and Audio Unit C APIs.
If you're careful to stay real-time safe, you can use those accessor methods
to extract the low
level C structures.
And, again, we'll be seeing how
these are used in more detail
in the next session
on AVAudioEngine.
And that's the end
of our hour here.
We've looked at MIDI
over Bluetooth,
the Inter-App Audio UI
Views, lots of features
of AV Foundation audio,
and we hope you'll stick
around for the next
session on AVAudioEngine.
If you need more information,
Filip is our Evangelist,
and there are the
developer forums.
Here's the next session
I keep talking about.