Transcript
[ Silence ]
>> Good morning everyone.
My name is Tony Guetta
and I'm the Manager
in the Core Audio
group at Apple.
And today, I'm going to talk to you about What's New in Core Audio for iOS.
We're going to begin with a
very high level overview of some
of the new audio
features in iOS 7.
And for the majority of this session, we're going to spend our time focused on one new technology in particular that we think you're going to be very excited about.
So, let's dive in to the
list of new features.
First is Audio Input Selection
and with input selection,
your application now has
the ability to specify
which audio input it would like
to use in certain situations.
So for example, if the user had
a wired headset plugged into his
or her device, but your
app wanted to continue
to use the built-in
microphone for input,
you now have the
capability to control that.
With input selection, you can also choose which microphone you'd like to use on our multi-mic platforms. And on devices that support it, such as the iPhone 5, you can take advantage of microphone beamforming processing to set an effective microphone directivity by specifying a polar pattern such as cardioid or subcardioid.
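As a rough sketch of what that might look like with the iOS 7 AVAudioSession API (error handling omitted; picking the first data source is just for illustration):

    #import <AVFoundation/AVFoundation.h>

    AVAudioSession *session = [AVAudioSession sharedInstance];
    for (AVAudioSessionPortDescription *input in session.availableInputs) {
        if ([input.portType isEqualToString:AVAudioSessionPortBuiltInMic]) {
            // Keep using the built-in mic even if a headset is plugged in.
            [session setPreferredInput:input error:nil];
            // On multi-mic devices, pick a data source and set its directivity.
            AVAudioSessionDataSourceDescription *source = input.dataSources.firstObject;
            if ([source.supportedPolarPatterns containsObject:AVAudioSessionPolarPatternCardioid]) {
                [source setPreferredPolarPattern:AVAudioSessionPolarPatternCardioid error:nil];
                [input setPreferredDataSource:source error:nil];
            }
            break;
        }
    }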
We've made some enhancements
to multichannel audio on iOS 7.
And through the use of
the AVAudioSession API,
you can now discover the
maximum number of input
and output channels
that are supported
by the current audio route
as well as being able
to specify your preferred number
of input and output channels.
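A minimal sketch of that channel negotiation (the request for 4 channels is just an example value):

    AVAudioSession *session = [AVAudioSession sharedInstance];
    // Discover what the current route supports...
    NSInteger maxOutputs = session.maximumOutputNumberOfChannels;
    // ...and ask for as many channels as we'd like, up to that maximum.
    [session setPreferredOutputNumberOfChannels:MIN(4, maxOutputs) error:nil];
    NSLog(@"Route now carries %ld output channels", (long)session.outputNumberOfChannels);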
For supported audio outputs such as HDMI,
you can obtain audio
channel labels
which associate a
particular audio channel
with the description of a
physical speaker location
such as front left, front
right, center and so on.
We've added some extensions to OpenAL
to enhance the gaming audio
experience in iOS 7 starting
with the ability to specify a
spatialization rendering quality
on a per-sound source basis.
Now, you might use this
to specify a very high
quality rendering algorithm
for the important sound
sources in your game.
But a less CPU-intensive algorithm for the less important sound sources in your game.
We've also made some
improvements
to our high quality
spatialization
rendering algorithm.
And also added the ability
to support rendering
to multichannel output
hardware when it's available.
Finally, we've added
an extension
to allow capturing the output of the current OpenAL 3D rendering context.
We've added time-pitch capabilities to Audio Queue. So your application can now control the speed-up and slow-down of Audio Queue playback, both in terms of time and in frequency.
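In code, that might look something like this sketch against an AudioQueue you've already created (the rate and pitch values are arbitrary examples):

    #include <AudioToolbox/AudioToolbox.h>

    UInt32 enabled = 1;
    AudioQueueSetProperty(queue, kAudioQueueProperty_EnableTimePitch, &enabled, sizeof(enabled));

    UInt32 algorithm = kAudioQueueTimePitchAlgorithm_Spectral;
    AudioQueueSetProperty(queue, kAudioQueueProperty_TimePitchAlgorithm, &algorithm, sizeof(algorithm));

    AudioQueueSetParameter(queue, kAudioQueueParam_PlayRate, 1.5);  // play 1.5x faster
    AudioQueueSetParameter(queue, kAudioQueueParam_Pitch, 300.0);   // shift up 300 cents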
We've enhanced the security
around audio recording in iOS 7
and we now require explicit user
approval before your application
can do audio input.
Now, the reason for
doing this is
to prevent a malicious
application from being able
to record a user without
him or her knowing it.
The way that this works is very similar to the way that the Location Services permission mechanism works, in that the user is presented with a modal dialog requesting his or her permission to use the audio input.
The decision is made on a per-application basis and it is a one-time decision.
However, if you'd like to
go in and change your mind
at a later time,
you can always go
into the Settings
application to do that.
Until the user has given
your application permission
to use audio input, you
will get silence so you need
to be prepared to handle that.
Now, what actually triggers the dialog being presented to the user is an attempt by your application to use an audio session category that would enable input, such as the Record category or Play and Record.
However, if you'd like to have control over when the user is presented with this dialog, so that it can happen at a more opportune time for your application, we've added some API in AVAudioSession for you to be able to do that.
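For example, a sketch using the iOS 7 request call, so the dialog appears when it suits your app:

    [[AVAudioSession sharedInstance] requestRecordPermission:^(BOOL granted) {
        if (granted) {
            // Safe to use the Record or PlayAndRecord category now.
        } else {
            // Input will be silent; reflect that in your UI.
        }
    }];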
Finally, just a note on
the AudioSession API.
As we mentioned at last year's
conference, the C version
of the AudioSession API is
officially being deprecated
in iOS 7.
So, we hope that over the
course of the past year,
you've all had the opportunity
to move your applications
over to using the
AVAudioSession API.
So, here is a summary of the
features that we just discussed.
We're not going to spend any more time today going over any of these topics in more detail.
So, if you have any questions about these, or would like a more detailed overview of any of these items, we encourage you to come by our labs either later today or tomorrow morning, and we'd be happy to discuss them with you in more detail.
I'd also encourage you to have
a look at the documentation
in the various header files
that I outlined in the course
of going through
each of these topics.
So for the remainder of this
session, we're going to focus
on one new technology in
particular that again,
we think you're going
to be very excited about
and that's Inter-App Audio.
So what is Inter-App Audio?
Well, as the name implies,
Inter-App Audio provides
the ability to stream audio
between applications
in real-time.
So, if you have a really cool
effects application and you want
to integrate that into
your DAW application,
you now have the
ability to do that.
We've built Inter-App Audio on
top of existing Core Audio APIs
so it should be very
easy for you to integrate
into your existing applications
and deploy quickly
to the app store.
Because it's built into
the operating system,
the solution is very efficient
with zero additional latency
and should provide for a stable
platform for the evolution
of the feature over time.
Now, before we get into any
of the technical details
of how Inter-App Audio works,
I'd like to invite up Alec
from the GarageBand
team to give you a demo.
[ Applause ]
>> Thanks Tony.
Am I up?
>> Yeah.
>> My name is Alec, I am a
product designer for GarageBand
and Logic and I'm going to
switch over here to my iPad.
So, what I want to do today
is give a quick demonstration
about how we have been working
with the development version,
kind of a sneak peek into
a development version
of GarageBand and how we're
doing some experiments
with Inter-App Audio.
So, what I have up here is just a simple four-track song in GarageBand. I'm going to play a little bit so you can get an idea of what it sounds like.
[ Music ]
OK. So the first thing
I want to do is I want
to add a little keyboard
part to this.
But instead of using one
of the built-in instruments
in GarageBand, I want to use
an instrument, on the system
that is not part of GarageBand.
So to do that, I'm going to go
out to the GarageBand
instrument browser.
Now, what we see here are
the instruments that ship
with GarageBand,
part of GarageBand.
And then we have a new
icon here, Music Apps.
I'm going to tap on that and
we see the icons of other apps
on the system which
are audio apps.
So, I'm going to click on
sampler one here and we'll see
that the sampler launches
in the background.
Now here it is with the UI in the foreground, and we can hear it.
Now, you see there's a transport here, and this transport is remotely controlling the transport of GarageBand.
So, when I press the record
button, what we're going
to hear is a count off from
GarageBand and then the track
that I just played and I'll
record over the top of it.
[ Music ]
Brilliant musical passage.
So now, if you look up at this transport again, you'll see that there's a GarageBand icon. When I tap on that icon, I switch back to the GarageBand application, and now, in the tracks view, a new track has been added with this little keyboard part that I played. We can listen to it. [Background Music] And that adds some keyboard to it.
[ Music ]
So, that was bringing audio
from another application,
controlling that
application in its interface,
and recording that
in GarageBand.
The next thing I
want to do is I want
to process an input
from GarageBand.
So, I'm going to put on my
little guitar here and we'll go
to the guitar amp in GarageBand.
Now, this guitar amp is one of the instruments built into GarageBand, and I'm going to turn on input monitoring so I can hear myself.
[Background Music] You
guys got it out there?
It's a little phase switch.
OK. Or we're going to--
there you go, that's
more rock and roll.
OK. So, that's a
good sound right?
That's using the guitar amp from GarageBand.
What I want to do though
is I want to process it
with another effect
on my system.
So again, I'm going to go into
the input settings in GarageBand
and if you see about halfway
down this list, it
says Effect App.
I'm going to tap on that and
we can see a list of apps
on my system that are
effects so I'm going
to click on this Audio Delay.
[Music] So, there is the delay, but those aren't really the settings I want.
So, I'm going to tap on
the Effect icon and switch
to the Effects Interface.
I'm going to take the feedback
down here a little
bit and the mix.
[Music] OK.
So that's a little bit better.
So now, what I'm doing is I'm taking the input through GarageBand, sending it out to this effect, and bringing it back into GarageBand.
Then I can hit record.
[ Music ]
Now, if we switch back to the tracks view, we can see that a new region has been recorded in GarageBand, and if I play...
[ Music ]
There is the source with the delay added to it, recorded in GarageBand.
So that's just the
quick overview
of how we're doing some
experiments inside this
development version
of GarageBand
with the new Inter-App
Audio APIs.
And next, we're going
to bring up Doug
to give you a little more
detail about how some
of the stuff works
under the hood.
[ Applause ]
>> Thank you Alec.
Hi, my name is Doug Wyatt, I'm a
plumber in the Core Audio group.
I'd like to present to
you some of the details
of the Inter-App Audio APIs.
So, conceptually here, we
have two kinds of applications
which we call the
host application
and the node application.
The fundamental distinction
between these two
applications is that the host is
where we ultimately
want the audio coming
from the node application
to end up.
So, GarageBand in this
example was a host application.
It was receiving audio from
the sampler application
and from the delay
effect application.
So, given these two
kinds of applications,
we're going to look at APIs
for how node applications
can register themselves
with the system and how host
applications can discover those
registered node applications.
We'll look at how host applications can initiate connections through the system to node applications.
And once those connections
are established,
the two applications can stream
audio between each other.
But, again, primarily the destination has to be the host, which can optionally send audio to the node if the node is providing an effect.
We'll look, furthermore, at how host applications can send MIDI events to node applications to control their audio rendering.
So for example, with
that sampler application,
the host could have actually been sending the MIDI notes to the sampler and receiving the rendered audio back.
We'll look at some interfaces
where the host can
express information
about its transport
controls and transport state
and timeline position
to node applications.
And finally, we'll look
at how node applications
can remotely control
host applications.
So, let's look inside
host applications.
So, this is your
basic standalone music
or audio application on iOS.
We have the AURemoteIO audio unit, and its function is to connect to the audio input and output system with very low latency, using pretty much the same mechanisms as on the desktop, but always through the RemoteIO audio unit.
So, feeding the audio unit, we
have the host's audio engine
and that can be constructed
in a number of ways,
we'll show some examples later.
But for the-- the purposes
of Inter-App Audio,
the host engine connects
to a node application
by instantiating
a node audio unit.
This is another Apple supplied
audio unit that, in effect,
creates the bridge to the
remote node application
to communicate audio with it.
Now, on the node side, a
node application is also
by default a normal application
with an AURemoteIO that can play
and record as always and it's
got its own audio engine.
What's a little different here
is in the Inter-App scenario,
the node application has its
input and output redirected
from the mic and speaker
to the host application.
So, that's the node application.
So, you see then we're
implementing this API
as a series of extensions
to the existing
AudioUnit.framework APIs.
The host sees the node
application as an audio unit
that it communicates with
and the node's AURemoteIO unit gets redirected to the host, so the node's communication to the host application is through that IO unit.
So, to express the capabilities
of these node applications
and to distinguish them a bit
from the existing
audio unit types,
we have these four new types.
They all are the same in that
they produce audio output
but they differ in what input
they receive from the host.
We have remote generators
which require no input.
We have remote instruments
which take MIDI input
to produce output, audio output.
We have effects which
are audio in and out.
And finally, we have music
effects which take both audio
and MIDI input and
produce audio.
So, node applications
use these component types
to describe their capabilities.
And furthermore one node
application may actually have
multiple sets of capabilities
and may wish to present itself
in multiple ways to hosts.
As a simple example,
a node application may produce
audio just fine on its own,
in which case, it's a generator.
It may optionally be able to respond to MIDI in producing that audio, in which case it can also be an instrument.
So, such a node application
could publish itself
as two different
audio components
with separate capabilities.
Another example of
that is an application
like a guitar amp simulator
where the application appears
to the user as an effect
because audio is going in,
it's being processed in some
way and then it comes out.
But from the host's point of view, this application can appear either as a generator or an effect, and the node can publish itself either way.
For example, if a node says "I'm a generator," it can continue to receive microphone or line input from a guitar directly from the underlying AURemoteIO while only sending the audio output to the host.
So again, that's generator mode.
Or a host application like GarageBand might have a prerecorded guitar track and want to process that through the guitar amp simulator.
The guitar amp simulator can
function fully as an effect,
not communicate with the
audio hardware at all,
and just communicate
the two audio streams
between itself and the host.
Let's move on and look at some of the requirements for the Inter-App Audio feature. It's available on most iOS 7-compatible devices, the exception being the iPhone 4.
And on the iPhone 4, you don't really have to deal with this specially, because if node applications attempt to register themselves with the system, those calls will just fail silently; the system will ignore them.
And on the host side, the
host will simply see no node
applications on the system.
Both host and node
applications need
to have a new entitlement called
"inter-app-audio" and this--
you can set this
for your application
in the Xcode Capabilities tab.
Furthermore, most applications
will want to have audio
in their UIBackgroundModes.
Most especially hosts, for obvious reasons, because hosts will keep running their engines when nodes are in the foreground.
Also, nodes like the guitar amp
simulator I just mentioned may
want to continue accessing the
mic and to be able to do that,
they too need to have the
audio background mode.
One final requirement for nodes
in particular is
the MixWithOthers
AudioSessionCategoryOption.
Hosts can go either
way on this one.
We'll get into that
in more detail later.
OK. Getting in to the nuts
and bolts of the APIs here,
let's look at how node
applications can register
themselves with the system.
So, there are two pieces to registering oneself as a node application.
The first is an Info.plist
entry called AudioComponents.
So, the presence of this
Info.plist entry makes the app
discoverable and
launchable to the system.
The system knows, "oh, I've got one of these node applications installed."
The second part of registration is for the node application to call AudioOutputUnitPublish, which checks in the registration that it advertised in its Info.plist. It says, "I've been launched, and here I am, ready to communicate."
here I am ready to communicate.
So let's look at those two
pieces in a little detail.
So here is the AudioComponents
entry in the Info.plist.
Its value is an array
and in that array,
there is a dictionary
for every AudioComponent
that the node wants to register.
And in that dictionary,
if you're familiar
with AudioComponentDescriptions
already,
you'll see some familiar
fields there.
There is the type, subtype and
manufacturer along with the name
and the version number.
So, that completely
describes the AudioComponent
that the node application
is advertising.
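Put together, such an entry might look like this sketch (the subtype, manufacturer, and name here are hypothetical placeholders for your own values):

    <key>AudioComponents</key>
    <array>
        <dict>
            <key>type</key>         <string>auri</string>   <!-- remote instrument -->
            <key>subtype</key>      <string>samp</string>   <!-- hypothetical -->
            <key>manufacturer</key> <string>demo</string>   <!-- hypothetical -->
            <key>name</key>         <string>Demo: Sampler</string>
            <key>version</key>      <integer>1</integer>
        </dict>
    </array>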
So, moving on to the second part of the registration: this is when the node application launches. The first piece of code here is the node's normal process for creating its AURemoteIO.
It creates an AudioComponentDescription describing the Apple AURemoteIO instance, and it uses AudioComponentFindNext to go find the AudioComponent for the AURemoteIO.
And finally, it creates an instance of the AURemoteIO, and this is something just about every audio and music app today will do for creating a low latency IO channel.
What's new is that the node
application, to participate
in Inter-App Audio is now
going to connect that IO Unit
that it just created with
the component description
that was published in
the Info.plist entry.
So, to do that, we're
seeing this code here
that that node creates an
AudioComponentDescription
which matches the one in the
Info.plist we saw a moment ago.
It supplies the name and version
number and passes all that along
with the AURemoteIO instance
to a new API called
AudioOutputUnitPublish.
So again, that connects what
was advertised in the Info.plist
with the actual RemoteIO
instance in the application
to which the host
application will connect
as we'll see in a little bit.
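As a sketch, the two steps together might look like this (the four-character codes are the same hypothetical ones from the Info.plist example above):

    // 1. Create the AURemoteIO as usual.
    AudioComponentDescription ioDesc = {
        kAudioUnitType_Output, kAudioUnitSubType_RemoteIO,
        kAudioUnitManufacturer_Apple, 0, 0
    };
    AudioUnit ioUnit = NULL;
    AudioComponentInstanceNew(AudioComponentFindNext(NULL, &ioDesc), &ioUnit);

    // 2. Publish it under the description advertised in the Info.plist.
    AudioComponentDescription myDesc = {
        kAudioUnitType_RemoteInstrument, 'samp', 'demo', 0, 0
    };
    AudioOutputUnitPublish(&myDesc, CFSTR("Demo: Sampler"), 1, ioUnit);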
So, to make this all
work, a requirement
of the node application is
to publish that RemoteIO unit
when it launches, because the node application is going to get launched by host applications at times when the user wants to use them.
And so, the node application
basically has to acknowledge,
I'm here, I've been launched.
And so, you can see then why the
Info.plist entry and the call
to AudioOutputUnitPublish
must have the same component
descriptions, names,
and versions.
One note here is that, by convention, the component name should contain your manufacturer name and application name, and that lets host applications sort the available node applications by manufacturer name if they like.
So, that's the registration
process for node applications,
let's look at how host
applications can discover
those registrations.
So again, if you've used the
AudioComponent calls before,
this should look
fairly familiar.
What we have here is a loop where we want to iterate through all of the components on the system, because we're looking for nodes and there are multiple types.
So, the simplest
way to do that is
to create a wild card
component description
and that's the searchDesc,
it's full of zeros.
And so then, this loop will call AudioComponentFindNext repeatedly, and that will yield in turn each of the AudioComponents on the system, which land in the local variable comp.
When we find null, then we've gotten to the end of the list of all the components in the system and we're done with our loop; we'll have found them all.
Now, for each component on
the system, what we want
to do is call
AudioComponentGetDescription
and this will supply to us
the AudioComponent description
of the actual unit as
opposed to that wild card
that we used for searching.
So now, in foundDesc, we can look at its component type and see if it's one of the four inter-app audio unit types that we're interested in: RemoteEffect, RemoteGenerator, RemoteInstrument, and RemoteMusicEffect.
If we see one of those, then
we know we found the node.
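That loop might look roughly like this:

    AudioComponentDescription searchDesc = { 0, 0, 0, 0, 0 };  // wildcard: match everything
    AudioComponent comp = NULL;
    while ((comp = AudioComponentFindNext(comp, &searchDesc)) != NULL) {
        AudioComponentDescription foundDesc;
        if (AudioComponentGetDescription(comp, &foundDesc) != noErr) continue;
        switch (foundDesc.componentType) {
            case kAudioUnitType_RemoteEffect:
            case kAudioUnitType_RemoteGenerator:
            case kAudioUnitType_RemoteInstrument:
            case kAudioUnitType_RemoteMusicEffect:
                // Found a published node application; remember comp and foundDesc.
                break;
            default:
                break;  // not an inter-app audio node; keep scanning
        }
    }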
OK. So the host has
found a node.
So now, I'm going to
walk through a little bit
of code here from one
of our sample apps.
It creates an Objective-C object of its own just as a way
of storing information about
the nodes that it's found.
And it calls this class RemoteAU, and it stores away into it the component description and the AudioComponent that were found.
It also fetches the
component's name and stores
that in the field of
the RemoteAU object.
It sets the image from AudioComponentGetIcon, which is a new API call that works with inter-app audio. This gives you the icon of the node application.
We can also discover the time at
which the user last interacted
with the node app and this
can be useful if we want
to sort a list of available
node applications by time
of when they were
most recently used,
the way the home screen does.
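Gathering those pieces might look like this sketch; the remoteAU fields are hypothetical names standing in for the sample app's own class:

    CFStringRef name = NULL;
    AudioComponentCopyName(comp, &name);
    remoteAU.name  = CFBridgingRelease(name);
    remoteAU.desc  = foundDesc;
    remoteAU.comp  = comp;
    remoteAU.image = AudioComponentGetIcon(comp, 44.0);               // icon at a desired point size
    remoteAU.lastActiveTime = AudioComponentGetLastActiveTime(comp);  // for recency sorting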
So we've gathered up
all this information
about the node application,
and now we've built an array
from which we can drive a
table view and present the user
with a choice of node
applications to deal with.
One wrinkle, though, is that having cached all that information in an array, it can become stale and out of sync with the system, most notably when apps are installed and deleted. So if you find yourself caching a list of components like this, you should probably also listen to this new notification that we supply; its name is AudioComponentRegistrationsChangedNotification.
So, you can pass that to NSNotificationCenter to register for a notification. In this example, we're supplying a block to be called, and in that block, which is called when the registrations have changed, we can refresh that cached list of audio units we built.
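A sketch of registering for it; refreshNodeList is a hypothetical method of your own:

    [[NSNotificationCenter defaultCenter]
        addObserverForName:(__bridge NSString *)kAudioComponentRegistrationsChangedNotification
                    object:nil
                     queue:[NSOperationQueue mainQueue]
                usingBlock:^(NSNotification *note) {
                    [self refreshNodeList];  // re-run the discovery loop, rebuild the table
                }];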
So, that's the process of
discovering node apps for host.
So now, we've built up a table, and maybe the user has selected one of them, and in the host application now, we want to actually establish a connection to the node application. So let's look at how that works.
The first step is very
simple because we held
on to the AudioComponent
of that node.
Now, all we have to do is create
an instance of that component
and now, we have an audio unit
through which we can
communicate with the node.
It's worth mentioning that this is the moment at which the node application will get launched into the background if it's not already running. And we'll look at all the mechanics of what happens on the node side of that later.
Right now, we're just going
to focus on the host side.
So, the host has to do a few steps here to get ready to stream audio between itself and the node.
Most importantly, the host must be communicating with the node using the same sample rate as the hardware. So, to be absolutely sure the hardware sample rate is what it's supposed to be, we should be making our audio session active if we haven't already.
So, once having done that,
then we can specify the audio
stream basic description
which is a detailed
description of the audio format
that the host wishes to use
to communicate with the node.
So, we can choose
mono or stereo.
In this example,
I've chosen stereo.
Here is where we're using the hardware sample rate. And these lines of code here are basically specifying 32-bit floating-point, non-interleaved.
Now, the host application can
choose any format it likes here
and the system will perform
whatever conversions are
necessary as long as there's
not a sample rate conversion
being requested.
Again, you must use the sample
rate that matches the hardware.
All right.
Now, we have built up an
audio stream basic description
and we can use
AudioUnitSetProperty
on the node AudioUnit for
the stream format property
and this is specifying-- since
it's in the output scope,
this is specifying the output
format of the audio we need
to receive from the node.
If we're working with a generator or instrument, which don't take audio input, that's it; we're done, we've just specified the output format.
But if we're dealing
with an effect,
then we should also
specify the input format.
And in many cases, it's going to be identical to the output format, and so we can make that same call using the input scope to set the input format that we're going to supply to the node.
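Pulling those steps together, a sketch of the format setup for a stereo, 32-bit float, non-interleaved stream (nodeUnit is the node AudioUnit instance created above):

    AVAudioSession *session = [AVAudioSession sharedInstance];
    [session setActive:YES error:nil];  // be sure the hardware sample rate is settled

    AudioStreamBasicDescription fmt = {0};
    fmt.mSampleRate       = session.sampleRate;   // must match the hardware
    fmt.mFormatID         = kAudioFormatLinearPCM;
    fmt.mFormatFlags      = kAudioFormatFlagsNativeFloatPacked | kAudioFormatFlagIsNonInterleaved;
    fmt.mChannelsPerFrame = 2;
    fmt.mBitsPerChannel   = 32;
    fmt.mBytesPerFrame    = fmt.mBytesPerPacket = sizeof(Float32);
    fmt.mFramesPerPacket  = 1;

    // Output scope: the format we receive from the node.
    AudioUnitSetProperty(nodeUnit, kAudioUnitProperty_StreamFormat,
                         kAudioUnitScope_Output, 0, &fmt, sizeof(fmt));
    // Input scope too, but only for effects that we feed audio into.
    AudioUnitSetProperty(nodeUnit, kAudioUnitProperty_StreamFormat,
                         kAudioUnitScope_Input, 0, &fmt, sizeof(fmt));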
So, having specified formats, we can look at how we're going to get audio from the host into the node, and this is starting to get into the details of the multiple ways that your host may be interacting with the node AudioUnit.
Now, since we're connecting
input, this is only for effects
and the host at this point
can supply input to a node
from another audio unit using
AUGraphConnectNodeInput.
AUGraph is a higher level API
which I'm just going to touch
on a few times today but you can
use AUGraph to build up graphs
or a series of connections
between audio units.
The other way to make a
connection to the node's input
from some other audio unit is
with the AudioUnitProperty
MakeConnection.
Alternatively, a host can simply
supply a callback function
with the SetRenderCallback
property.
This callback function
gets called at render time
and the host supplies the audio
samples to be given to the node.
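A sketch of that callback route, assuming the nodeUnit created earlier:

    static OSStatus HostProvidesInput(void *inRefCon,
                                      AudioUnitRenderActionFlags *ioActionFlags,
                                      const AudioTimeStamp *inTimeStamp,
                                      UInt32 inBusNumber, UInt32 inNumberFrames,
                                      AudioBufferList *ioData)
    {
        // Fill ioData with inNumberFrames of audio for the effect to process.
        return noErr;
    }

    AURenderCallbackStruct cb = { HostProvidesInput, NULL /* your context */ };
    AudioUnitSetProperty(nodeUnit, kAudioUnitProperty_SetRenderCallback,
                         kAudioUnitScope_Input, 0, &cb, sizeof(cb));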
Now, as far as connecting
the output of the node,
this too depends on the way
you built your host engine.
If you're using audio
units, you want to connect
to the node output to
some other audio unit.
You can use the MakeConnection
property again.
If, however, you're pulling audio into a custom engine, then you would call AudioUnitRender, but there's no setup needed at this time for that.
We'll look at the
rendering process
in more detail a little later.
OK. One last bit of mechanics here that a host needs to do to establish a reliable connection to a node, or actually to reliably handle bad things happening with the node, is to look out for what happens when nodes become disconnected. This could happen automatically if the node app crashes, or if the system ejects it from memory when under memory pressure.
Also, if the host fails to render the node application regularly enough, the system will break the connection.
When these things happen, the node AudioUnit becomes, in effect, a zombie, meaning there's still an audio unit there. You can make API calls on it and they won't crash, but you will get errors back, and that's the error that you'll get: the InstanceInvalidated error.
The mechanics of establishing that disconnection callback: we call AudioUnitAddPropertyListener for this new property, IsInterAppConnected.
Here is what you would do in the connection listener: you can fetch the current value of the property, and if the local variable here, connected, has become zero, then you know the node application has become disconnected and you should react accordingly.
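Put together, a sketch of installing and reacting to that listener:

    static void ConnectionListener(void *inRefCon, AudioUnit inUnit,
                                   AudioUnitPropertyID inID,
                                   AudioUnitScope inScope, AudioUnitElement inElement)
    {
        UInt32 connected = 0;
        UInt32 size = sizeof(connected);
        AudioUnitGetProperty(inUnit, kAudioUnitProperty_IsInterAppConnected,
                             kAudioUnitScope_Global, 0, &connected, &size);
        if (!connected) {
            // The node went away: uninitialize/dispose and update the UI.
        }
    }

    AudioUnitAddPropertyListener(nodeUnit, kAudioUnitProperty_IsInterAppConnected,
                                 ConnectionListener, NULL /* your context */);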
So, all of that prep work has led us up to the point where we're ready to actually initialize the node. Now, the AudioUnitInitialize call basically says to the system and the other AudioUnit: allocate all of the resources you need for rendering.
In the case of inter-app audio,
the system at this point is
also allocating some resources
on behalf of that
connection such as the buffers
between the applications and
the real-time rendering thread
in the node application.
So, it's important to realize that this is the point at which you are beginning to consume resources, and as such, you now have the responsibility of calling AudioUnitRender regularly on this node audio unit.
So, that's the process
of setting up a host
to communicate with a node.
You activate your audio session,
you set your stream formats,
you connect your audio input,
add a disconnection listener,
and finally call
AudioUnitInitialize.
So having done that, you're at
the point now where you're ready
to begin streaming audio
between the two applications.
Let's look inside a host
application's engine
in more detail.
This is kind of a wonderfully
simple way to do things
if you can get your work
done using Apple AudioUnits.
So, the green dotted-line box represents a host engine, but those red boxes inside are all Apple-supplied audio units. So, there is the AURemoteIO, and we have a mixer AudioUnit feeding that.
In feeding the mixer, we
have a file player AudioUnit
and the node AudioUnit.
But of course, there are many
things you would want to do
with audio that Apple doesn't
give you AudioUnits for.
If you want to do that, then you're going to write some code of your own, represented by the green box with the squiggly brackets.
So here, your engine is
feeding the AURemoteIO
and if you've written
an app like this before,
you know the way to provide
input to an AURemoteIO
from your own engine is with
the SetRenderCallback property.
And now in this case,
to fetch the audio
from the node AudioUnit, you
would call AudioUnitRender.
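A sketch of that render callback, pulling the node's output so it can be mixed into what goes to the AURemoteIO:

    static OSStatus EngineRenderCallback(void *inRefCon,
                                         AudioUnitRenderActionFlags *ioActionFlags,
                                         const AudioTimeStamp *inTimeStamp,
                                         UInt32 inBusNumber, UInt32 inNumberFrames,
                                         AudioBufferList *ioData)
    {
        AudioUnit nodeUnit = (AudioUnit)inRefCon;  // assuming we passed the node unit in
        // Pull this cycle's audio from the node, then mix it into ioData as needed.
        return AudioUnitRender(nodeUnit, ioActionFlags, inTimeStamp,
                               0, inNumberFrames, ioData);
    }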
OK. So, that's a bunch of stuff about how a host application interacts with a node.
One final nice thing to
do for the user here is
to provide a way for the user
to bring the node
application to the foreground.
So, we can do this by asking
the audio unit for a PeerURL
and this URL is only
valid during the life
of the connection.
You don't want to hold on to it
because it's not going
to be useful later.
But right before the user
wants to switch in response
to that Icon tap or whatever,
you can fetch the PeerURL then
pass that to UIApplication
and ask it to open that URL and
that will accomplish the switch
of bringing the node
application to the foreground.
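The fetch-and-open step, sketched:

    CFURLRef url = NULL;
    UInt32 size = sizeof(url);
    if (AudioUnitGetProperty(nodeUnit, kAudioUnitProperty_PeerURL,
                             kAudioUnitScope_Global, 0, &url, &size) == noErr) {
        // Opening the peer URL brings the node application to the foreground.
        [[UIApplication sharedApplication] openURL:CFBridgingRelease(url)];
    }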
So, let's go back just
a little bit and look
at how node applications
see the process
of becoming connected to hosts.
So, the most important thing to
think about here as the author
of a node application is
that when the user opens
your application explicitly
from the home screen,
you're launched into the foreground state, and you're ready to start making music.
But if you're being
launched from the context
of a host application, you're
actually going to get launched
into the background state and
there are some limitations on what you can do in this state, and there's also a requirement here.
You can't start running from the
background but you must create
and publish your I/O
unit as I showed earlier.
So, it's probably going
to be necessary and useful
in your node application
to ask UIApplication what the state is, "Am I in the background or am I in the foreground?", and proceed accordingly.
So, to find out when they're becoming connected and disconnected, node applications can also listen for the IsInterAppConnected property, just as I described
for host applications earlier.
For a node application,
you listen to this property
on your AURemoteIO instance.
So, in your property listener,
you can notice the transitions
of this property value
from zero to one.
When you see it becoming true, then you know that your output unit has been initialized underneath you, and that you should set your audio session active if you're going to access the microphone.
You should at this time start running, because that's kind of your final step of consent, saying, "My engine is all hooked up and ready to render; start pulling on me."
You can, at this time,
start running even
if you are in the background.
This is the exception
to the rule
about running in the background.
When you are connected
to the host,
you can start running
in the background.
One further note, if you want
to draw an icon representing
the host
that you've become connected to,
there's a new API called
AudioOutputUnitGetHostIcon.
Pertaining further to the
IsInterAppConnected property,
you also want to watch
for the transition to zero
or false meaning that the host
has disconnected from you.
What you want to do at this point is understand that your output unit has been uninitialized and stopped out from underneath you.
Now, if you were
accessing the microphone,
you should set your session
inactive at this time.
However, you might,
in some situations,
find yourself disconnected
while in the foreground.
Maybe the host application
crashed
or the system didn't have enough
memory to keep it running.
So, if that happens,
you probably do want
to start running and keep
your audio session active
or make it active
if it isn't already.
But again, you can only make your session active and start running when you're in the foreground. So, just to reemphasize that:
Your node application can start if you've been connected to the host or you're in the foreground, but you can keep running in the background if you're connected, of course, or if you are in some other standalone, non-inter-app scenario where your app wants to keep running in the background.
Let's look again now at a few
different scenarios involving
how nodes render audio.
This is your normal
standalone mode
when the user has launched you.
You've got your engine
connected to the RemoteIO,
connected to the
audio I/O system.
If you're a generator
or instrument,
you may have your output
completely redirected
to the host.
But if you leave your input bus enabled and advertise yourself as a generator or instrument, then you continue to receive input from the microphone even while your output has been redirected to the host application.
Now, this doesn't add any extra latency, because the system is smart enough to deliver your application the microphone input first, and then in that same I/O cycle, the host application will pull your output.
In the final node rendering scenario, you're an effect: both your input and output streams are connected to the host rather than the audio I/O system.
Node applications can also use that PeerURL property I described earlier to show an icon, as Alec did in his demo. He showed the GarageBand icon in the sampler app. So, you can fetch that icon, from your AURemoteIO instance in this case, and you can fetch that URL to accomplish the switch.
OK. Back on the host
side of things,
there are a few considerations
about stopping audio rendering.
The normal API calls for doing this are AudioOutputUnitStop or AUGraphStop, and what you want to do at this point is promptly uninitialize your AudioUnit representing the node.
That releases the
resources that were allocated
when you initialized it and it
releases you from the promise
to keep rendering frequently.
You can turn around and
reinitialize when the user wants
to start communicating again
or if you're completely done
with that node AudioUnit,
you can call AudioComponentInstanceDispose, and that's what you would do if the user, for example, explicitly breaks the connection, or if you discover that the node application has become invalidated.
So, that's the process
of audio rendering.
Next, I'd like to look at how
we can communicate MIDI events
from host applications
to node applications.
Now, this of course, is
for remote instrument
and remote music effect nodes.
You would want to use this
if you have MIDI events
that are tightly coupled to your
audio that's being rendered.
It lets you sample-accurately
schedule MIDI note-ons,
control events, pitch-bends,
et cetera.
But this is not recommended as a way of communicating clock and time code information. That's sort of a funny way to communicate that, where you're using seven-bit numbers to break up timing information. We actually have a better way to do that.
I should also mention that this does not replace the CoreMIDI framework, which still has a role when you're dealing with USB MIDI input and output devices or, for example, the MIDI network driver.
You might also be
dealing with applications
that don't support inter-app
audio and you still want
to communicate with them.
So, let's look at how a
host application can send
MIDI events.
You might do something like this in the sampler demo app Alec showed.
It had an on-screen keyboard.
So, whenever the user touches
the key, you send a note-on.
When the key is released,
you send a note-off.
So, the APIs for
sending MIDI events are
in the header file MusicDevice.h
and there is a function
in there called
MusicDeviceMIDIEvent.
Here, you pass the node AudioUnit and the three-byte MIDI message.
And here, offsetSampleFrames, the final parameter, would be used for sample-accurate scheduling; but since we're doing this in kind of a UI context, we don't really have that kind of sample accuracy. I'll get into how we do in a moment.
So, we just pass a sample offset of zero, so that the note-on will appear at the beginning of the next rendered buffer.
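For example, a note-on/note-off pair for middle C might look like this:

    // Key down: status 0x90 (note-on, channel 1), note 60, velocity 100,
    // offset 0 = start of the next rendered buffer.
    MusicDeviceMIDIEvent(nodeUnit, 0x90, 60, 100, 0);
    // Key up: status 0x80 (note-off), velocity 0.
    MusicDeviceMIDIEvent(nodeUnit, 0x80, 60, 0, 0);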
Now, if we do want to do sample-accurate scheduling, then we have to schedule our MIDI events on the same thread that we're rendering the audio on, because in that thread context, we can say where the MIDI events need to land relative to the beginning of that audio buffer.
For instance, if that audio buffer is 1,024 frames, we might do some math and figure out, oh, that note-on needs to land 412 samples into that sample buffer, and we can specify that in our call to MusicDeviceMIDIEvent.
Now, of course, we can call
MusicDeviceMIDIEvent any number
of times to schedule any number
of events for one render cycle.
I just put these next to each
other to emphasize that you have
to be in the rendering
thread context to be able
to schedule sample-accurately.
Now, if you're using AUGraph and you want to schedule sample-accurately, it's similar but a little different, because you're not calling AudioUnitRender; the graph is doing that on your behalf.
So, the way to do this,
there's an AUGraph API
that lets you get called
back in the render context
and that's
AUGraphAddRenderNotify.
That gives you a callback
function that the graph calls
at the beginning of the render
cycle before actually pulling
audio from the node.
And that turns out to be precisely the correct time to call MusicDeviceMIDIEvent to schedule events for that render cycle.
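A sketch of that arrangement; the 412-frame offset echoes the earlier example:

    static OSStatus ScheduleMIDINotify(void *inRefCon,
                                       AudioUnitRenderActionFlags *ioActionFlags,
                                       const AudioTimeStamp *inTimeStamp,
                                       UInt32 inBusNumber, UInt32 inNumberFrames,
                                       AudioBufferList *ioData)
    {
        if (*ioActionFlags & kAudioUnitRenderAction_PreRender) {
            AudioUnit instrument = (AudioUnit)inRefCon;
            // Sample-accurate: this note-on lands 412 frames into the buffer.
            MusicDeviceMIDIEvent(instrument, 0x90, 60, 100, 412);
        }
        return noErr;
    }

    AUGraphAddRenderNotify(graph, ScheduleMIDINotify, nodeUnit);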
So, that's the process of sending MIDI events; let's look at how nodes receive MIDI events. So, we have two basic functions for sending: MusicDeviceMIDIEvent and MusicDeviceSysEx.
And we have two corresponding
callback functions for the use
of the node application,
the MIDIEventProc
and the MIDISysExProc.
So, in the node application,
here we have an example
of MIDIEventProc.
Well, it doesn't
do much but here is
where you receive each event
that's coming from the host
and typically, you would just
save it up in a local structure
and use it the next
time you render a buffer
because this function will
get called at the beginning
of each render cycle
with new events
that apply to that render cycle.
So, having created
that callback function,
we can populate a
structure of callbacks.
You'll notice I left the SysExProc null; that just means I'm not going to get called if there is any SysEx.
We use AudioUnitSetProperty to
install those callbacks and now,
on the node application side,
I'm going to receive each
MIDIEvent as it arrives.
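Roughly, in the node application (ioUnit being its AURemoteIO):

    static void MyMIDIEventProc(void *userData, UInt32 inStatus, UInt32 inData1,
                                UInt32 inData2, UInt32 inOffsetSampleFrame)
    {
        // Stash the event; the engine consumes it in the next render cycle
        // at the given sample offset.
    }

    AudioOutputUnitMIDICallbacks callbacks = { NULL /* your context */,
                                               MyMIDIEventProc,
                                               NULL /* no SysEx handler */ };
    AudioUnitSetProperty(ioUnit, kAudioOutputUnitProperty_MIDICallbacks,
                         kAudioUnitScope_Global, 0, &callbacks, sizeof(callbacks));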
So, that's how hosts can send MIDI to nodes.
Let's look now at how host can
communicate their transport
and timeline information
to nodes.
So, the important thing
about this model is
that the host is
always the master here.
The nodes can just find
out where the host is
and synchronize to that.
We'll look at how the host can
communicate its musical position
as well as the state
of its transport.
And all of this is highly precise; it's called in, and pertains to, the render context.
So here too, we have a structure
full of callback functions,
we'll look at each of these.
So, this is probably
the most common one
that a host will implement,
this is called the
BeatAndTempo callback.
Here, the host can say, for the beginning of the current audio buffer, where it is in the track, and that could be in between beats.
The host can also communicate
what the current tempo is.
And so with these two pieces of information, even with only these two pieces of information, the node can do beat-synchronized effects with the host, for instance.
There's also some more detailed
musical location information
supplied by the host such as
the current time signature.
And finally, the host
can communicate some bits
of transport state, most
notably whether it's playing
or recording.
There's also a facility
for the host
to express whether it's
cycling or looping.
So here too, we're installing a set of callback functions on an audio unit. The host populates the HostCallbackInfo structure, installs the callback functions that it implements, and calls AudioUnitSetProperty.
So, once the host does this, the system will call those callbacks at the beginning of each render cycle and communicate that information over to the node process, where the node application will have access to them.
And the way the node
application gets that access is
by fetching the host
callback property.
It will receive that structure
full of function pointers.
They won't actually point to functions in the host process, of course; we can't make a cross-process call there. But the information, as I just said, has been communicated over to the node, and it can access it there within its own process.
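On the node side, a sketch of fetching and using the beat-and-tempo callback:

    HostCallbackInfo hostInfo = {0};
    UInt32 size = sizeof(hostInfo);
    AudioUnitGetProperty(ioUnit, kAudioUnitProperty_HostCallbacks,
                         kAudioUnitScope_Global, 0, &hostInfo, &size);

    // Later, on the render thread:
    Float64 beat = 0, tempo = 0;
    if (hostInfo.beatAndTempoProc != NULL &&
        hostInfo.beatAndTempoProc(hostInfo.hostUserData, &beat, &tempo) == noErr) {
        // beat: musical position at the start of this buffer; tempo: current BPM.
    }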
There are some considerations of thread safety here. Most importantly, since this information is accurate as of the beginning of the render cycle, if you call it in some other context, you might get inconsistent results.
It's easiest if you fetch this
information on the render thread
but of course, there are
some cases where you want
to observe a transport
state for instance.
So, we give you a better
way to receive notifications
of transport state changes on
a non-render thread context.
You can install this
property listener
for the HostTransportState
and get a callback
on a non-render thread.
Okay, so that's the
process of transport
and timeline information.
Finally, I'd like to look
at the whole mechanism
by which node applications
can send remote control events
to host applications.
To accomplish that, we
have something called
AudioUnitRemoteControlEvents.
Now, there's something
called RemoteControlEvents
in UIKit as well.
Those are kind of in
a different world.
These are more specific to the
needs of audio applications.
So with these events, the
node can control the host
application's transport.
And for now, we have these three events defined. You, being a node application, can toggle the host's play/pause state and its recording state, and the node, through an event, can send the host back to the beginning of the song or track.
We do have some sample
applications
where our node applications
have some standard looking
transport controls.
And we'd like to encourage you
to check those out and use them
in your application so that
we can have a consistent look
and feel for these controls.
So, looking at how node
applications can send
RemoteControlEvents,
first, we want to find
out whether the host actually
is listening and is going
to support them because
if it doesn't,
maybe we don't even want
to bother drawing the
transport controls at all.
So to do that, we can
fetch this property
HostReceivesRemoteControlEvents.
And to actually send the RemoteControlEvent, the node calls AudioUnitSetProperty using the RemoteControlToHost property. And the value of that property is the actual event to be sent, ToggleRecord in this example.
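In code, that send might look roughly like this (ioUnit being the node's published AURemoteIO):

    UInt32 event = kAudioUnitRemoteControlEvent_ToggleRecord;
    AudioUnitSetProperty(ioUnit, kAudioOutputUnitProperty_RemoteControlToHost,
                         kAudioUnitScope_Global, 0, &event, sizeof(event));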
So, there's a node sending
a RemoteControlEvent.
Here is a host receiving one, or rather, preparing to receive them, I should say.
So, to do that, the host creates
a block called the listenerBlock
and in that block, the host
simply takes the incoming
AudioUnitRemoteControlEvent
and passes it to one
of its own methods called
handleRemoteControlEvent.
Now, that block is in
turn a property value
for the
RemoteControlEventListener
property so the host only
has to set that property
on the node AudioUnit and that
accomplishes the installation
of the listener for
RemoteControlEvents.
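That host-side installation, sketched (handleRemoteControlEvent: is a hypothetical method of the host's own):

    AudioUnitRemoteControlEventListener listenerBlock =
        ^(AudioUnitRemoteControlEvent event) {
            [self handleRemoteControlEvent:event];  // dispatch to our own handler
        };
    AudioUnitSetProperty(nodeUnit, kAudioUnitProperty_RemoteControlEventListener,
                         kAudioUnitScope_Global, 0, &listenerBlock, sizeof(listenerBlock));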
Next, I'd like to bring up my colleague Harry Tormey to show you some of these other aspects of the inter-app audio API in action.
>> Thanks Doug.
Hey everybody, my name is Harry
Tormey and I work with Doug
in the Core Audio
Group at Apple.
And today, I'm going to be
giving you a demonstration
of some of the sample
applications we're going
to be releasing on
the developer portal
to illustrate how
inter-app audio works.
The first demo I'm going
to be giving you is
of a host application connecting
to a sampler node application
and sending it some MIDI events.
So what you see in the screen
up there is a host application
and I'm going to bring up a list
of all the remote instrument
node applications installed
on this device and
I'm going to do
that by touching the
add instrument button.
So, none of these applications
are currently running.
They have just published
themselves
with their audio
component descriptions.
When I select one of these
applications from the list,
it will launch into the
background and connect
to the host application.
So I'm going to do that, I'm
going to select the sampler.
OK. So, you can see
the sampler's icon
up there underneath
the instrument label.
That means it's connected
to the host application.
So, I'm going to bring
up a keyboard in the host
by touching the show
keyboard button and I'm going
to send some MIDI
events from the host
to the sampler by
playing the keys.
[Music] Totally awesome.
OK. So, what if I want to change the sample bank that the sampler is using? Well, I'm going to have to go to the sampler to do that.
I'm going to do that by
touching the sampler's icon.
We're now in a separate
application and I'm going
to select a different
sample bank to use so how
about something nice
like a harpsichord?
Let me just do that there.
OK. So now, we're
in harpsichord,
I'm going to touch the
host icon there and go back
to the host application.
Touch the show keyboard
again and listen for it.
[ Music ]
That's a harpsichord.
OK. So, the next thing that
I'm going to show you is how
to use the callbacks that the
host application has published
to get the time code of the
host application when it records
and plays back things.
So once again, I'm going to go to the sampler by touching its icon, and I'm going to touch the record button and record some audio in the host, so I'm sending a remote message to the host. [Music] I'm going to stop recording by touching the record button again.
Now, what I want
you to pay attention
to is the blue text
over the play button.
This text is going to be updated with the callbacks that the host application has published, and we're going to use this to display a time code indicating how far into the recording we are.
So, I'm going to touch the play
button and watch that text.
[Music] So, if I do that again and I go to the host application, you'll see the time code is consistent across both applications, so let me do that. Let me press play again and go back to the host application.
[Music] Okay, so
for my grand finale,
I'm going to add an effect
and that effect is going
to be the delay effect
that you saw.
So once again, I touch the add effect button.
It shows you all of the effects
that are installed
on this device.
I'm going to select the delay
one, it's going to launch it
and connect to the host.
OK. So in the host, if I touched
the show keyboard button again
and play a note, it's
going to be delayed.
[Music] How about that?
Much cooler than
remote controlled cars.
Okay everyone, that's me. These demos are all up on the developer portal, and I'm done with my demo, so back over to you, Doug, and thank you very much.
[Applause]
>> Thank you Harry.
Hey, I found the right button.
So, back to some more
mundane matters here.
Dealing with audio session
interruptions, both host
and node applications
need to deal
with audio session
interruptions.
Here, the usual rules
apply namely
that your AURemoteIO gets
stopped underneath you.
But furthermore, in
a host application,
the system will uninitialize
any node AudioUnits
that you have open.
This will reclaim the
resources I've been talking
about that you acquire when you
initialize the node AudioUnit.
One other bit of
housekeeping here,
you can make your
application more robust
if you handle a media
services reset correctly.
It's a little bit hard to test
this sometimes-- oops, but--
let me find my way back.
But if you implement this,
your application will
survive calamities.
So, when this happens, you will find out that all of your inter-app audio connections have been broken and the component instances have been invalidated.
So, in a host application, you should dispose your node AudioUnit and your AURemoteIO.
And in a node application,
you should also dispose
your AURemoteIO.
So in general, it's simplest
to dispose your entire audio
engine including those Apple
Audio objects.
And then, start over
from scratch
as if your app has
just been launched
and that's the simplest way
to robustly handle the
media services being reset.
Some questions have come up in showing this feature to people and talking with them. Can you have multiple host applications? Yes, if they are all mixable.
If one is unmixable, of course,
it will interrupt everything
else as it takes control.
Also, if you have multiple hosts that are mixable and one node application, only one host can connect to that node at a time.
Can you have multiple
node applications?
Yes, Harry just showed us that
that's more than possible.
A couple of debugging tips here: you may find, when creating a node application, that you're having trouble getting it to show up in host applications.
If you see that happening, you
should watch the system log.
We try to leave some
clues for you there
in the form of error messages.
A problem in your Info.plist entry is a little bit easy to create, unfortunately, but if you do see a problem there, we'll tell you that, and I would recommend going and comparing your Info.plist with the one in one of our example applications.
I should also mention here the infamous error 12985, which many people stub their toes on in a lot of different contexts. I can tell you that what it means is "operation denied."
And in the context of inter-app
audio, you're likely to hit it
if you start playing
from the background.
We do hope to in an upcoming
release give that a proper name
and maybe another
value but in any case,
if you do see it,
that's what it means.
So we've looked at how node
applications register themselves
with the system,
hosts discover them.
Hosts create connections
to node applications.
Once that connection is up, host
and node apps can stream audio
to and from each other.
Host apps can send MIDI
to node applications.
Hosts can communicate
their transport
and timeline information.
And finally, we have seen how
nodes can remotely control hosts
so I think if you
have an existing music
or audio application,
it's not that much work
to convert it to a node.
It's mostly adding a
little bit of code to deal
with the transitions to and
from the connected state
and you can look how that works
in the example apps
we have posted.
Creating a host application is a bit more work, but you're using existing API for audio units, and there's a lot of history there, as well as a lot of power and flexibility.
We also encourage you to look at our sample applications.
They'll help you with a lot
of the little ins and outs
and we're really looking forward
to the great music apps
you're going to make.
On to some housekeeping matters
here, if you wish to talk
to an Apple Evangelist,
there's John Geleynse.
Here are some links
to some documentation
and our developer forums.
This is the only Core
Audio Session this year
but here are some other media
sessions later this week
that you might be interested in.
Thank you very much.
[ Silence ]