Transcript
[ Applause ]
>>AKSHATHA NAGESH: Thank you.
Good afternoon everyone, welcome
to the session What's
New in Core Audio.
I am Akshatha Nagesh, and
I will be the first speaker
in this session.
I will be talking about
[unintelligible] AVAudioEngine
and its new features for this
year's iOS and OS X releases.
Later, my colleague
Torrey will be talking
about other exciting
new features we have
for you this year like
inter-device audio
and what's new in our
good old AVAudioSession.
Tomorrow morning we have another
Core Audio presentation called
Audio Unit Extensions and it's
about a whole new set of APIs
which I am sure you will
find very interesting,
so do catch that
session as well.
All right.
Let's begin with a
recap of AVAudioEngine.
If you know about Core
Audio, you may be aware
that we offer a wide
variety of APIs
for implementing
powerful audio features.
Last year, in iOS 8
and OS X Yosemite,
we introduced a new set
of Objective-C APIs called
AVAudioEngine as a part
of AVFoundation framework.
If you are not very
familiar with AVAudioEngine,
I would highly encourage
you to check
out last year's WWDC session,
AVAudioEngine In Practice.
Let's look at some of the
goals behind this effort.
There were three
important goals.
First, to provide a powerful
and a feature-rich API set.
AVAudioEngine is built on top
of our C frameworks; hence,
it supports most of
the powerful features
that our C frameworks
already do.
The second goal was to
enable you to achieve simple,
as well as complex tasks with
only a fraction of the amount
of code that you would write if
using our C frameworks directly.
Now, the task can be as simple
as a playback of an audio file
to as complex as, say,
implementing an entire
audio engine for a game.
The third important goal was
to simplify real-time audio.
AVAudioEngine is a
real-time audio system,
but yet it offers a
non-real-time interface for you
to interact with and, hence,
hides most of the complexities
in dealing with real-time
audio underneath.
This, again, enhances the ease
of usability of AVAudioEngine.
On to some of the features.
It is an Objective-C
API set and, hence,
accessible from Swift as well.
It supports low latency
real-time audio.
Using AVAudioEngine, you will
be able to perform a variety
of audio tasks, like
play and record audio,
connect various audio
processing blocks together
to form your processing chain.
You could capture audio at any
point in this processing chain,
say for your analysis
or debugging.
And also, you could
implement 3D audio for games.
Now, what is the
engine comprised of?
A node is a basic building
block of the engine.
The engine itself
manages a graph of nodes
that you connect together.
A node can be one of
three types, source nodes
that provide data for
rendering, processing nodes
that process this data,
and the destination node,
which is usually the terminating
node in your processing graph
and refers to the
output node connected
to the output hardware.
Now, let's look at a
sample engine setup.
This could represent
a simple karaoke app.
You could be capturing
user's voice
through the microphone
implicitly connected
to the input node, processing
it through an effect node,
which could be a simple delay.
You could also be
tapping the user's voice
through a node tap block,
and analyzing it, say,
to determine how the user is
performing and based on that,
you could be playing some sound
effects through the player node,
and you could have another
player node that's playing a
backing track in your app.
And all these signals can be
mixed together using a mixer
node and finally played
through the speaker connected
to the output node.
Now, in this setup, your input
and the players form
the source nodes.
The effect and the mixer are
your processing nodes input
and the output node is
the destination node.
Now let's look at these mixer
nodes in a little bit of detail.
There are two kinds of
mixer nodes in the engine.
First is the AVAudioMixerNode
that we just saw
in the previous example.
This is mainly used
for sample rate conversion
and up- or down-mixing of channels,
and it supports mono, stereo,
and multichannel inputs.
And the second type of mixer
is called the environment node,
which is mainly used
in gaming applications.
It simulates a 3D space in which
the sources that are connected
to the environment node are
spatialized with respect
to an implicit listener.
And the environment node supports
mono and stereo inputs
and spatializes mono inputs.
Now, associated with the mixer
nodes and the source nodes,
there is something called
AVAudioMixing protocol.
This protocol defines a set of
properties that are applicable
at the input bus
of a mixer node.
And the source nodes conform
to this protocol and, in turn,
control the properties
on the mixers
that they are connected to.
Now, if you set these properties
before making a connection
from the source to the mixer
node, the properties are cached
at the source node level.
And when you actually make a
connection between the source
and the mixer, the properties
take effect on the mixer.
Let's look at some of the
examples for these properties.
There are mainly three types --
common mixing properties
that are applicable
on all the mixer nodes.
An example is volume.
Stereo mixing properties like
pan that's applicable only
on the AV audio mixer node.
And 3D mixing properties
like position, obstruction,
occlusion, which is mainly
used in the gaming use case,
and those are applicable
on the environment node.
Now let's look at
a sample setup.
Suppose you have a player
node and both these mixers,
mixer node and the
environment node in your engine,
and suppose you set a bunch
of mixing properties
on the player node.
Now, at this point, since
the player is not connected
to either of the mixers,
the properties remain
cached at the player level.
Now suppose you make a
connection to the mixer node.
Now the properties like
volume and pan take effect
on the mixer node, while the 3D
mixing property, like position,
does not affect the mixer node.
Now, if you disconnect
from the mixer and connect
to the environment node,
volume and position take
effect while pan has no effect
on the environment node.
So this way you could have a
bunch of mixing properties set
on a player and move the
player from one mixer
to the other mixer
in your application
with the mixing settings intact.
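A short Swift sketch of that caching behavior, assuming engine, player, mixer, and environment nodes have already been created and attached, and monoFormat is a mono AVAudioFormat:

    // Set mixing properties while the player is still unconnected;
    // they are cached on the player (the source node).
    player.volume = 0.5                                   // common property
    player.pan = -1.0                                     // stereo property (mixer node only)
    player.position = AVAudio3DPoint(x: 1, y: 0, z: -2)   // 3D property (environment node only)

    // Connecting to the mixer applies volume and pan; position has no effect here.
    engine.connect(player, to: mixer, format: nil)

    // Moving the player to the environment node applies volume and position instead.
    engine.disconnectNodeOutput(player)
    engine.connect(player, to: environment, format: monoFormat)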
We will revisit this
AVAudioMixing protocol
when we discuss one
of our new features
for this year in a few minutes.
Now, I would like to review
another important aspect
that is how to handle
multichannel audio
with AVAudioEngine.
There are two parts to
the setup involved here.
First is configuring
your hardware to be able
to receive multichannel audio.
Now, the hardware
could be HDMI device
or a USB device and so on.
And the second part is actually
setting the engine itself
to be able to render
multichannel audio
to this hardware.
We will look at these
one by one.
First, hardware setup on OS X.
On OS X, there is a built-in
system tool called Audio MIDI
Setup, with which the user
can configure their
multichannel hardware.
So they could use this
tool to, say,
set up the speaker
configuration,
channel layout, et cetera.
And then the app can
set up AVAudioEngine
to use this hardware for
multichannel rendering.
But on iOS, in order to enable
multichannel on the hardware,
the app needs to configure
its AVAudioSession.
We will assume a
playback use case
and see what are
the steps involved.
The first thing to do would be
to activate your audio session.
Next you need to check for the
maximum number of output channels
that are available
for your session.
Then you set your
preferred number of channels,
and as a final step, you
query back the actual number
of output channels to
verify whether the request
that you just made
went through or not.
Now, note that whenever you make
a request for a certain number
of channels, it does
not guarantee
that the request
is always accepted.
Hence, the final step
of verifying the actual output
number of channels is necessary.
Now, in code, it
looks like this.
We will assume an
audio playback use case
and assume you want
a 5.1 rendering.
So the first thing to do would
be to get a shared instance
of the audio session,
set your category
and make the session active.
Next, check the maximum
number of output channels
that are available
in your session.
And based on that, you
set your preferred number
of channels on the session.
And as a final step, you
query back the actual number
of output channels
and then adapt
to the corresponding
channel count.
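In Swift, that session negotiation might look roughly like this (current API names, error handling trimmed; a sketch for a 5.1 request):

    import AVFoundation

    let session = AVAudioSession.sharedInstance()

    // 1. Set the category and activate the session.
    try session.setCategory(.playback, mode: .default, options: [])
    try session.setActive(true)

    // 2. Check how many output channels the current route can provide.
    let desired = 6  // 5.1
    if session.maximumOutputNumberOfChannels >= desired {
        // 3. Ask for six output channels.
        try session.setPreferredOutputNumberOfChannels(desired)
    }

    // 4. Verify what we actually got and adapt to that channel count.
    let actual = session.outputNumberOfChannels
    print("Rendering to \(actual) channels")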
Okay. So this was all
the hardware setup part.
Now we'll see how to set
up the engine to be able
to render multichannel audio.
Again here there
are two use cases.
First, say you have a
multichannel audio content
available that needs
to be played back
through the multichannel
hardware.
And in this case, you would
use an AVAudioMixerNode.
And in the second case,
as a gaming scenario
where you want your content to
be spatialized and then played
through the multichannel
hardware.
And here you would use
an environment node.
Case one, you have a
multichannel audio content
and a multichannel hardware
that's just been set
up as we discussed
a few minutes back.
Now, note that although
the format of the content
and the hardware are shown
to be identical here,
they could very well differ.
And the mixer node here will
take care of channel mapping
between the content and
the hardware format.
So the first thing you need
to do is propagate the hardware
format to the connection
between the mixer
and the output node.
So in code, it looks like this.
You would query the output
format of the output node,
which is the hardware format,
and then use that format
to make the connection
between the mixer node
and the output node.
The next thing is similar
on the content side.
You propagate the content
format to the connection
between the player
and the mixer node.
So assume you have the
multichannel audio content
in the form of a file.
You can open the
file for reading
and use its processing
format to make the connection
from the player to
the mixer node.
And then you schedule
your file on the player,
you start your engine
and start the player,
and the content will flow
through your processing chain.
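A Swift sketch of case one; the file name multichannel.caf is a placeholder for your multichannel content:

    let engine = AVAudioEngine()
    let player = AVAudioPlayerNode()
    let mixer = AVAudioMixerNode()

    engine.attach(player)
    engine.attach(mixer)

    // Propagate the hardware format to the mixer -> output connection.
    let hwFormat = engine.outputNode.outputFormat(forBus: 0)
    engine.connect(mixer, to: engine.outputNode, format: hwFormat)

    // Propagate the content format to the player -> mixer connection.
    let fileURL = Bundle.main.url(forResource: "multichannel", withExtension: "caf")!
    let file = try AVAudioFile(forReading: fileURL)
    engine.connect(player, to: mixer, format: file.processingFormat)

    // Schedule the file and start.
    player.scheduleFile(file, at: nil, completionHandler: nil)
    try engine.start()
    player.play()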
Now case two, where it's
typically a gaming scenario
and you need your content to
be spatialized and then played
through the multichannel
hardware.
So the steps here are very
much similar, except a couple
of subtle differences.
So the first thing is you
get your hardware format
and set the connection format
between your environment
node and the output node.
Now, since the environment node
supports only specific channel
layouts, you need to map the
hardware format to a layout
that the environment
node supports.
So that's the first difference.
So assuming we have
5.1 hardware,
we can choose the AudioUnit
5.0 layout tag
that is supported
by the environment node.
We can create an
AVAudioChannelLayout using this
layout tag,
and then an AVAudioFormat
using this layout.
And then you make a connection
from the environment node
to the output node
using this format.
The second step is
exactly the same,
propagating your content
format between the connection
from player to the
environment node.
So we open the file for reading
and use its processing format
to make a connection from the
player to the environment.
And the next thing here is
to set a rendering algorithm
on the player to one which
supports multichannel rendering.
This rendering algorithm is one
of the 3D mixing
protocol properties
that we just saw a
few minutes back,
and this will tell
the environment node
that the corresponding
source is requesting a
multichannel rendering.
And then the usual stuff.
You schedule your file on the
player, you start your engine
and the player, and then your
content will be spatialized
by the environment node.
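And a sketch of case two, assuming the failable AVAudioChannelLayout(layoutTag:) initializer and a mono placeholder file named effect.caf:

    import AVFoundation
    import AudioToolbox  // for kAudioChannelLayoutTag_AudioUnit_5_0

    let engine = AVAudioEngine()
    let player = AVAudioPlayerNode()
    let environment = AVAudioEnvironmentNode()

    engine.attach(player)
    engine.attach(environment)

    // Map the hardware format to a layout the environment node supports (5.0 here, for 5.1 hardware).
    let hwFormat = engine.outputNode.outputFormat(forBus: 0)
    guard let layout = AVAudioChannelLayout(layoutTag: kAudioChannelLayoutTag_AudioUnit_5_0) else { fatalError() }
    let envFormat = AVAudioFormat(standardFormatWithSampleRate: hwFormat.sampleRate, channelLayout: layout)
    engine.connect(environment, to: engine.outputNode, format: envFormat)

    // Propagate the (mono) content format to the player -> environment connection.
    let file = try AVAudioFile(forReading: Bundle.main.url(forResource: "effect", withExtension: "caf")!)
    engine.connect(player, to: environment, format: file.processingFormat)

    // Ask for a rendering algorithm that supports multichannel rendering.
    player.renderingAlgorithm = .soundField

    player.scheduleFile(file, at: nil, completionHandler: nil)
    try engine.start()
    player.play()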
Okay. So this was
AVAudioEngine as it existed
in iOS 8 and OS X Yosemite.
Now on to the more
exciting stuff.
What's new for this year?
We have three main new features.
First is the splitting support,
which I will be talking
about in a minute.
Second is the audio
format conversion support,
and we have a couple
of new classes here,
the main one being
AVAudioConverter.
And then finally, we have
another new class called
AVAudioSequencer, which supports
playback of MIDI files.
Moving to the splitting support.
Now, let's consider
this sample setup,
which by now I guess
should be pretty familiar.
So in the API that
existed as of last week,
only one-to-one connections
were supported in the engine.
That is the output of any
node could only be connected
to one other node in the engine.
But now, instead of this, we
have added support to do this.
That is to be able to
split output of a node
into multiple paths in
your processing chain.
So in this example, the
output of the player is split
into three different paths
and then eventually
connected to the mixer node.
Now, splitting is very useful
in use cases like mixing
where you need to blend in some
amount of wet or process signals
with a dry signal all
driven by the same source.
In this example, the
connection from the player
to the mixer node forms your dry
signal path while the other two
paths going through the
effect nodes forms your wet
signal paths.
And all these three signals are
mixed together using a mixer
node to give you the mix.
Now, note that when you
split the output of a node,
the entire output is actually
rendered through multiple paths,
and there is no splitting
of channels involved.
Now let's see in code how
to set these connections up.
As you can see, the
player is connected
to three different nodes.
We will call these
connection points, represented
by a very simple new
class called
AVAudioConnectionPoint.
The first thing to do
is to create an array
of connection points that
you want your player node
to be connected to.
So in this example, we want
connections to input bus 0
of the two effects and input
bus 1 of the mixer node.
Then you use the new
connection API we have
to connect the player to
these connection points.
And that's it, so you are set
up for the split connections.
So you move on and make
your other connections
in the engine as you need.
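A Swift sketch of those split connections, assuming player, effect1, effect2, and mixer are already attached to the engine:

    // The three downstream points the player's output should feed.
    let points = [
        AVAudioConnectionPoint(node: effect1, bus: 0),
        AVAudioConnectionPoint(node: effect2, bus: 0),
        AVAudioConnectionPoint(node: mixer,   bus: 1)
    ]

    // New this year: fan one output bus out to several destinations.
    engine.connect(player, to: points, fromBus: 0, format: nil)

    // Then finish the rest of the graph as usual, e.g. the effects into the mixer.
    engine.connect(effect1, to: mixer, format: nil)
    engine.connect(effect2, to: mixer, format: nil)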
Now let's revisit the
AVAudioMixing protocol
that we discussed sometime back
and see how it affects
splitting use case.
Assume we have a player
node whose output is split
into two different paths,
going through the effect
nodes to a mixer node.
Now, suppose you set a
property on the player node.
In this example,
say you set volume.
Now, at this point, the
property will take effect on all
of its existing mixer
connections, so in this example,
both input bus 0 and input
bus 1 of the mixer node
get a volume of 0.5.
But if you wanted to
overwrite any of the properties
on a particular mixer
connection,
you could still do that.
And the way to do it is
using our new class called
AVAudioMixingDestination.
So you query the player node
to give you the destination
object corresponding
to the mixer that you want,
and you then set a
property on that object.
So in this example, we
are overwriting the volume
on mixer input bus 0 to 0.8.
Now, okay, so similarly, you can
also overwrite the properties
on the other mixer
connection as well.
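A sketch of that override, using the destination(forMixer:bus:) accessor that the AVAudioMixing protocol provides on the source node:

    // Setting a property on the source applies to all current mixer connections.
    player.volume = 0.5

    // Override just the connection feeding mixer input bus 0.
    if let dest = player.destination(forMixer: mixer, bus: 0) {
        dest.volume = 0.8
    }
    // The connection on mixer input bus 1 keeps the player-level value of 0.5.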
Now, let's see what
happens in a disconnection.
Suppose you disconnect
the effect
to mixer input bus 1
connection.
Now, note that the settings
that you may have overwritten
on that particular mixer
connection will not
be preserved.
Hence, the state of mixing
settings will look like this.
The player's mixing
settings remain intact,
and the other connection that is
active will also have its mixing
settings intact.
Now, if you end up making
the connection back again
to mixer input bus 1, since
the earlier settings were not
preserved, the base settings of
the player node will now take
effect in this new connection.
Hence, the volume of input
bus 1 will again be set
to 0.5 based on the
player's mixing settings.
So to summarize, when a
source node is connected
to multiple mixers, the
properties that you set
on the source node
will be applied to all
of its existing mixer
connections as well
as any new mixer
connection that you make.
And the properties
on the individual mixer
connections can be overwritten
if you want to, but remember
that they will not be preserved
on any disconnections.
Final words on the
splitting support.
The engine supports
splitting of any node
in the processing graph,
provided you adhere
to a couple of restrictions.
Now, starting from the node
whose output is split until the
mixer where all the paths
terminate, you cannot have any
of the time-effect nodes.
That is, you cannot have
varispeed or time pitch.
Nor can you have any
rate conversions.
So in other words,
all the split paths
from the base node should be
rendering at the same rate
until they reach a common mixer.
So if you stick to
these restrictions,
then you could split
the output of any node
in the engine into
multiple parts.
Okay. So now moving on to the
second new feature we have
for this year, audio
format conversion support.
So we have a couple
of new classes here,
AVAudioCompressedBuffer
and AVAudioConverter.
Now, in the API that
existed as of last week,
we have an AVAudioBuffer and one
of its subclasses
called AVAudioPCMBuffer,
and as the name suggests,
the PCM buffer holds
uncompressed audio data,
and the data flow
through the engine is
in the form of PCM buffers.
Now, starting this year,
we have another subclass
of AVAudioBuffer called
AVAudioCompressedBuffer,
and this holds compressed
audio data.
And this can be used with
the new class we have called
AVAudioConverter that
I will talk about next.
AVAudioConverter is
a new utility class,
and it's a higher-level
equivalent
of AudioConverter from
the Audio Toolbox framework.
This supports all audio format
conversion, so you could convert
from PCM to PCM format while
changing, say, integer to float,
bit depth, sample
rate, et cetera.
Or you could convert between
PCM and compressed formats;
that is, you can use it for
encoding and decoding purposes.
And AVAudioConverter can
be used in conjunction
with AVAudioEngine, as we
will see in an example.
Okay. So suppose you have your
engine set up for a playback.
So we have a player
node connected
to an effect node
and an output node.
And suppose you have a
compressed audio stream
coming in.
Now, we know that the data
flow through the engine is
in the form of PCM buffers.
So now you could use
an AVAudioConverter
to convert your input compressed
stream into PCM buffers,
and then you can use
these buffers to schedule
on the player node, hence
the playback can happen
through the engine.
Now let's consider a
code example and see how
to use AVAudioConverter
for encoding purposes.
Now, here we want to
convert from a PCM
to an AAC compressed format.
So the first thing to
do is define your input
as well as output format.
So here I have an input
format which is a PCM format,
and I have an output format,
which is a compressed
AAC format.
Next you create an
AVAudioConverter, asking it
to convert from your input
to the output format.
Then you create your
audio buffers.
The input buffer in this
case is a PCM buffer,
and the output buffer is our
new AVAudioCompressedBuffer
in the AAC format.
The next thing to do is
to define something called
AVAudioConverterInputBlock.
This is the block that the
converter will call whenever it
needs input data.
So there are a couple of things
that you need to do here.
First, you need to
inform the converter
about the status of your input.
So suppose when the
block gets called,
you do not have any
input data available.
So at that point, you
can say no data now
and return a nil
buffer to the converter.
Now, suppose you have
reached end of stream,
so you can inform the converter
saying it's end of stream
and again return a nil buffer.
Otherwise, in the normal case,
you say that you
do have data, and fill
and return your input
buffer to the converter.
Now, this is the
main conversion loop.
In every iteration of this loop,
we are asking the converter
to produce one output
buffer of data,
and we are providing the input
block that we just defined
to the converter so that it
can be called by the converter
as many times as it needs input.
Now, the converter will
also return your status,
which you can check to see
the state of conversion.
So if the converter
says it's end of stream
or if it says there
was an error,
you could handle it accordingly.
Otherwise, in the normal cases,
every iteration will provide
you one output buffer of data.
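Put together, the encode path might look roughly like this in Swift; provideNextPCMBuffer and handleEncodedBuffer are hypothetical stand-ins for your input source and output consumer:

    import AVFoundation
    import AudioToolbox

    // Input: standard float PCM. Output: AAC with the same channel count.
    let inputFormat = AVAudioFormat(standardFormatWithSampleRate: 44100, channels: 2)!

    var outDesc = AudioStreamBasicDescription()
    outDesc.mSampleRate = 44100
    outDesc.mFormatID = kAudioFormatMPEG4AAC
    outDesc.mChannelsPerFrame = 2
    let outputFormat = AVAudioFormat(streamDescription: &outDesc)!

    guard let converter = AVAudioConverter(from: inputFormat, to: outputFormat) else { fatalError() }

    // One compressed output buffer, reused every iteration.
    let outputBuffer = AVAudioCompressedBuffer(format: outputFormat,
                                               packetCapacity: 8,
                                               maximumPacketSize: converter.maximumOutputPacketSize)

    // Called by the converter whenever it needs more input.
    let inputBlock: AVAudioConverterInputBlock = { _, statusPtr in
        guard let pcm = provideNextPCMBuffer() else {   // hypothetical PCM source
            statusPtr.pointee = .endOfStream            // or .noDataNow if more data arrives later
            return nil
        }
        statusPtr.pointee = .haveData
        return pcm
    }

    // Main conversion loop: one output buffer per iteration.
    conversion: while true {
        var error: NSError?
        switch converter.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock) {
        case .haveData:
            handleEncodedBuffer(outputBuffer)           // hypothetical consumer of AAC packets
        case .endOfStream, .error, .inputRanDry:
            break conversion
        @unknown default:
            break conversion
        }
    }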
Okay. So coming to
our final new class
for this year, AVAudioSequencer.
This supports playback
of MIDI files,
and AVAudioSequencer is
associated with an AVAudioEngine
at the time of instantiation.
And the sequencer is responsible
for sending MIDI events
to the instrument nodes that you
may have attached in the engine.
Examples of instrument
nodes are audio unit samplers,
MIDI synths, et cetera.
Now let's look at
a sample setup.
Suppose you have your
AVAudioEngine set
up with an instrument node
connected to a mixer node
and to the output node.
You can now create
an AVAudioSequencer
and associate it
with this engine.
And then, when you start your
sequencer and start your engine,
the sequencer will automatically
discover the first instrument
node in the engine and
start sending MIDI events
to that instrument node.
And in code, it looks like this.
So the first part is
your engine setup,
which is outside
of the sequencer.
So here we have an instrument
node that is a sampler,
so you make your required
connections in the engine,
and then you start your engine.
Now, at this point, there
will be no audio playback
because there isn't anything
that is driving the
instrument node.
Then next you create your
sequencer and associate it
with the engine that
you just configured.
You load your MIDI file
onto the sequencer.
And then you can
start your sequencer.
So at this point, the sequencer
will implicitly discover your
sampler node that you have
attached in the engine
and start sending MIDI
events to the sampler node.
And hence, your audio
playback will start.
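A Swift sketch of that setup; song.mid is a placeholder file, and loading an instrument into the sampler is omitted:

    let engine = AVAudioEngine()
    let sampler = AVAudioUnitSampler()          // the instrument node

    engine.attach(sampler)
    engine.connect(sampler, to: engine.mainMixerNode, format: nil)
    try engine.start()                          // no sound yet; nothing drives the sampler

    // Create the sequencer against this engine and load a MIDI file.
    let sequencer = AVAudioSequencer(audioEngine: engine)
    let midiURL = Bundle.main.url(forResource: "song", withExtension: "mid")!
    try sequencer.load(from: midiURL, options: [])

    // Starting the sequencer routes its events to the first instrument node it finds.
    sequencer.prepareToPlay()
    try sequencer.start()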
Now, suppose you had multiple
tracks in your MIDI file.
Now, the default behavior
of the sequencer is
to send all the tracks to
the first instrument node
that it finds in the engine.
But in case you wanted
to direct your tracks
to the individual
instrument nodes,
you can do that with
just a few lines of code.
Now, the creation and
setting up of the engine
in the sequencer is
the same as earlier.
The only additional thing that
you need to do is you need
to get the tracks
from your sequencer
and set the destination
for each of your tracks
to the instrument
node that you want.
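A sketch of that per-track routing, with drumSampler and bassSampler standing in for instrument nodes you have already attached:

    // Direct each MIDI track at a specific instrument node instead of the default.
    let tracks = sequencer.tracks
    tracks[0].destinationAudioUnit = drumSampler
    tracks[1].destinationAudioUnit = bassSampler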
Final few words on
the sequencer.
The sequencer has its own
set of transport controls
for the MIDI events, unlike the
transport controls on the engine
that control the flow of audio.
So here you can prepare
the sequencer for playback,
which basically does prerolling.
You can start/stop
the MIDI events.
You can set the playback
position of the MIDI events
in terms of seconds or beats.
And also, you can
set the playback rate
of the MIDI events.
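Those transport controls correspond roughly to calls like these (a sketch):

    sequencer.prepareToPlay()                 // preroll
    try sequencer.start()                     // start sending MIDI events
    sequencer.currentPositionInSeconds = 30   // or currentPositionInBeats
    sequencer.rate = 1.5                      // playback rate multiplier
    sequencer.stop()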
Okay. So with that, we
have seen the new features
in AVAudioEngine for this
year's iOS and OS X releases.
Now I would like to
show you a quick demo
to see these new
features in action.
And for that, I would like to
invite Torrey onto the stage
to help me with the demo.
[ Applause ]
Okay.
So in this demo, we
have an AVAudioEngine
and an AVAudioSequencer that
is associated with this engine.
So in the engine, we have an
instrument node that you can see
at the top whose output is split
into three different paths.
One of the paths is directly
connected to the main mixer node
in the engine, and two of
the other paths are connected
through two different effects,
and then to the main mixer.
Now, using the AVAudioMixing
protocol properties we
discussed, there are
volume controls on each
of these mixer input buses.
So the sliders that you
see for distortion volume,
direct volume, and reverb
volume are the volume controls,
controlled through the
mixing protocol.
Now, at the top, in
the light gray box,
you can see the transport
controls for the sequencer.
You can see that we have a
play/stop control on the sequencer
as well as sliders for
setting the playback position
and the playback
rate of the MIDI events.
There is also a master
volume on the main mixer
to control the volume
of your mix.
Now let's go ahead and
start the sequencer [music].
So at this point, the
MIDI events are being sent
to the instrument node
that is in the engine.
You can dynamically change
the position of playback
of the MIDI events
and the playback rate.
Using the volume controls, you
can blend in the required amount
of effects that you
want in your mix.
You could increase
the distortion volume.
Or the reverb volume.
So effectively, you can play
around with these volumes
and create the mix
that you desire.
And finally, using the master
volume on the mixer node,
you could control the volume of
the overall mix [audio fading].
Okay. So that was a
demo of the new features
that we just discussed.
[ Applause ]
So the sample code for this demo
should be available sometime
later this year.
Now final words on what we
saw today in AVAudioEngine.
We saw a recap, and we reviewed
how to handle multichannel audio
with AVAudioEngine, and then
we saw three new features
for this year -- first,
splitting support; second,
audio format conversion support,
the main class being
AVAudioConverter; and finally,
we saw another new class
called AVAudioSequencer
that supports playback
of MIDI files.
I hope you will use these
new features in your apps
and provide us feedback.
Thank you.
Over to Torrey to
take it from here.
Thanks, Torrey.
[ Applause ]
>>TORREY: Thanks, Akshatha.
Good afternoon, everyone.
I'm Torrey, and let's
keep it rolling
with Inter-Device
Audio Mode for iOS.
It's no secret that
the iPad is one
of the most versatile musical
instruments on the planet,
and that's primarily
thanks to all of you-all,
with digital audio workstation
apps, synthesizer apps,
drum machine apps, sound toys.
There's an endless
amount of audio content
that you can generate
with the same device.
So how do you get that audio
content from your iOS device
into your project that you
are working on on your Mac?
Well, you could plug in a
cable into the headphone jack,
run that into an audio
breakout box that's connected
to your Mac, but that
would be digital to analog
and back again to digital.
I am sure that we can all
agree that's less than ideal.
So to record audio
digitally from an iOS device,
we attach a lightning-to-USB
adapter.
We attach a USB audio
class-compliant interface that's
capable of doing digital audio
output; a digital audio cable;
another interface that's capable
of receiving digital
audio input;
and we attach that to the Mac.
It works. But it's
a lot of hardware.
There's also third-party
software that attempts
to solve this same problem,
which is great if your app uses
that or if your favorite
app uses that.
But wouldn't it be
great if you didn't have
to use any extra hardware or
install any extra software
and you could just record audio
digitally through the darn cable
that came with the darn thing?
Well, drumroll, please.
Oh, great latency
on that [laughter].
Introducing Inter-Device
Audio Mode,
or as we like to call
it internally, IDAM.
IDAM allows you to record
audio digitally through the USB
to lightning cable that
came with your device.
It records at a hardware
audio stream format
of two-channel 24-bit at
48 kilohertz sample rate,
and it is a USB 2.0 audio
class-compliant implementation
from end to end.
What that means is
on the Mac side,
you are using the built-in
Mac USB audio class driver.
You will get the same great
performance and low latency
that class-compliant USB
audio devices currently get.
Also, on the iOS side,
the implementation is
USB isochronous audio.
That means your bandwidth
is reserved.
If you want to backup
120 gigabytes of data
to your Mac while you are
tapping out a drum beat
and recording it,
you can rest assured
that your audio is not
going to have any artifacts.
Next, my "no" slide.
No additional hardware required.
There's no additional
software required.
You don't need to modify
your OS X application
to take advantage
of this feature.
If your iOS application
already properly adapts
to a two-channel 24-bit
48-kilohertz audio stream
format, you don't have to
modify your iOS application.
And if you get a calendar alert
while you are in the middle
of jamming out a
sample bass line,
that alert is not going
to go out over the USB.
It goes out over the speaker
like it's supposed to.
[ Applause ]
Okay. So Inter-Device Audio
Mode, your device can charge
and sync in this mode; however,
photo import, tethering,
and QuickTime screen capture
will be temporarily disabled.
You want those back,
click a button
or just unplug the device.
Torrey, how do I
use it, you may ask?
So we built support directly
into Audio MIDI Setup.
If you look under the Window
menu, you will see a new option,
Show iOS Device Browser,
and if you click that,
you get this fancy-pants view.
It shows you all your
connected iOS devices.
You want to enter or exit
the IDAM configuration,
you click the button.
Now, you can actually embed this
view into your OS X application
if you choose to do so, and I
am going to show you some code
for how to do that,
but before that,
you've guessed it,
it's demo time.
I've got one of these
fancy iPads up here.
All right.
Here on this iPad, I am running
a synthesizer application
called Nave.
I like this application
because for some
of the patches you actually
get a nice touch interface.
You can potentially
do aftertouch with it,
which is something that's
not on every MIDI controller
that you buy on the market.
It's one reason why iPads are
such great musical instruments.
So I selected a patch here
that sounds like this.
[ Music ]
I like those sounds, so
I want to record them
into a hip hop project that
I am working on on the Mac.
So let's move over to the Mac,
okay here on the Mac I already
have Audio MIDI Setup running.
I am going to go to the Window
menu, Show iOS Device Browser,
and I can see I currently
have no connected iOS devices.
So I will plug in my iPad.
It shows up immediately.
Blink and you miss
it, but click Enable,
and now you've got an extra
audio device right here.
Yes, Logic, we will
get to you shortly.
You can see there's a
two-channel input audio device
that's been added here.
And we do want to use this in
my Logic project, so I am going
to go ahead and say Use Here.
Let's bring Logic back up.
All right.
Here is the beat I
have been working on.
[ Music ]
And I want to record in an
audio track here, so I am going
to create a new audio track.
Record monitor that
and record enable.
Bring the volume down a little
bit because it will be at unity.
And now [music] I can hear that
coming directly into my track.
So let's record a piece.
[ Music ]
And if I play back my
performance [music] the audio's
captured exactly like I expect.
Sick beat, Bro.
[ Applause ]
Now, that concludes my demo,
but I am not a professional.
Please try this at home.
It's in your betas right now.
If you find any bugs
with it whatsoever,
just go to bugreport.apple.com,
and file that bug for us.
Okay. So a few last
words about IDAM.
It requires OS X El
Capitan and iOS 9.
It will work on all iPhones
with a lightning connector.
It works on all iPads
with a lightning connector
except our very first iPad mini.
And if you've got a home iPhone,
a work iPhone, a home iPad,
a work iPad, and an iPad for
the kids and the power hubs
to support it, you can plug all
of those in at the same time,
aggregate them together as a
massive ten-channel input device
if you want to, and record it.
And if you've got USB
ports left over after that,
please point your Safari browser
at store.apple.com [laughter].
[ Applause ]
Okay. I said earlier
that I would tell you how
to embed the view controller
into your OS X application
if you choose to do
that, and I am going
to show you the code
for that now.
This code is pretty boilerplate,
so I took the liberty
of highlighting the
important part,
if you will let your eyes
slide down to the yellow text,
and you will see
CAInterDeviceAudioViewController.
You just want to create one of
those and then add its view as a
subview of your view container.
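A rough Swift equivalent of that boilerplate, assuming the CAInterDeviceAudioViewController class named above behaves like an ordinary NSViewController:

    import Cocoa
    import CoreAudioKit

    class DeviceBrowserHostController: NSViewController {
        // The CoreAudioKit-provided IDAM device browser.
        let idamViewController = CAInterDeviceAudioViewController()

        override func viewDidLoad() {
            super.viewDidLoad()
            addChild(idamViewController)                 // keep it in the view controller hierarchy
            idamViewController.view.frame = view.bounds  // size it to the container
            view.addSubview(idamViewController.view)
        }
    }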
As long as I am talking about
new view controllers that are
in CoreAudioKit, I
also wanted to tell you
about a couple more you
might be interested in.
We've also added
CABTLEMIDIWindowController.
This is the UI for configuring
your Bluetooth Low Energy MIDI
devices; it's a feature
that we debuted last year.
It's an NSWindowController
subclass, and that view looks
like this, so you can
now embed this directly
into your OS X application
if you choose to do so.
One more for you.
We also have
CANetworkBrowserWindowController.
This is the UI for configuring
your Audio Video Bridge
audio devices.
Also an NSWindowController
subclass,
and that view looks like this.
Okay. Let's push in the clutch
and switch gears as we talk
about what's new
in AVAudioSession.
So how many of you listen
to podcasts and audio books?
All right.
A lot of you.
How many of you also
use your iPhone
as your primary navigation
device?
A lot of you too.
Okay. So you may have
seen this problem before.
Let's say you are going
to a soul food restaurant
that you've never been to.
You are navigating.
Then you are listening
to a podcast.
This podcast is hosted
by podcast personality X,
and they are interviewing your
favorite currently relevant
celebrity Y, and in the
middle of this interview,
the conversation is
getting really juicy
when your navigation pipes
up and says in 500 feet,
keep to the right, followed
by a -- keep to the right.
Then after that you
hear raucous laughter.
Some great joke happened,
and you just missed it.
So you back up the audio, and
you start playing it again,
and just as you get to the
joke, "keep to the right."
You pump your fist and say "I am
having a bad user experience."
Okay. So we have a solution
for that [laughter].
I took some liberties
with that one.
We have a solution for
that for you in iOS 9.
So now podcasts and audio
book apps can use a new
AVAudioSession mode called
AVAudioSessionModeSpokenAudio,
and for navigation
and fitness apps or apps
that issue vocal prompts
that interrupt other
applications' audio,
there's a new AVAudioSession
category option called Interrupt
Spoken Audio and
Mix with Others.
Now, Maps already uses
Interrupt Spoken Audio and Mix
with Others, and Podcasts
and iBooks are already
using
AVAudioSessionModeSpokenAudio.
How can you use these
in your application?
Well, let's look at some code.
I am just going to go through
these pieces of code line
by line so you can
see what we are doing.
First we will start with
your audio session setup.
You will get the shared
instance of the audio session,
set your category to
playback, and for your options,
you will use the
option duck others.
Now we are going to use
something new in Swift 2,
this new if #available.
It will allow you to deploy
the same code on iOS 9
that you can also deploy
to earlier iOS versions.
So only if you are on
iOS 9 or greater, you can
OR in this extra option,
Interrupt Spoken Audio
and Mix with Others.
Then you will set your
audio session category.
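A sketch of that setup in current Swift, where the extra option can simply be included once the deployment target is iOS 9 or later (the original Swift 2 code guarded it with if #available):

    import AVFoundation

    let session = AVAudioSession.sharedInstance()

    // Duck other audio while our prompt plays, and tell spoken-audio apps
    // to pause instead of ducking.
    var options: AVAudioSession.CategoryOptions = [.duckOthers]
    options.insert(.interruptSpokenAudioAndMixWithOthers)

    try session.setCategory(.playback, mode: .default, options: options)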
Okay. Let's actually
look at the nav prompt.
Now you are going to play
your navigation prompt
that looks like this.
First get the shared
instance of the audio session.
You will create an AVAudioPlayer
using the URL that's
supplied in the function
prototype.
You set the player's
delegate to self.
What this will do for you is it
will allow the delegate method
to be called on your behalf
when your audio prompt
is finished playing.
Set your audio session
to active,
and then play on, player.
Now, here we go.
Your
audio is done playing,
so your delegate
method is being called.
audioPlayerDidFinishPlaying.
The code looks like this.
Get the shared instance
of the audio session.
Set your audio session
to be inactive.
And use the option
notifyOthersOnDeactivation.
This means any other audio
that was playing before you
from another process
can get notified
that you are done
interrupting them.
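A compact sketch of the prompt playback and its completion delegate, wrapped in a hypothetical helper class:

    import AVFoundation

    class PromptPlayer: NSObject, AVAudioPlayerDelegate {
        private var player: AVAudioPlayer?

        // Play a single navigation prompt from a file URL.
        func playPrompt(at url: URL) throws {
            let session = AVAudioSession.sharedInstance()
            let prompt = try AVAudioPlayer(contentsOf: url)
            prompt.delegate = self                 // so we hear about completion
            try session.setActive(true)            // this is what ducks/interrupts others
            prompt.play()
            player = prompt
        }

        // Called when the prompt finishes playing.
        func audioPlayerDidFinishPlaying(_ player: AVAudioPlayer, successfully flag: Bool) {
            let session = AVAudioSession.sharedInstance()
            // Deactivate and let interrupted apps know they may resume.
            try? session.setActive(false, options: .notifyOthersOnDeactivation)
        }
    }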
Now let's look at
the other side.
Let's go to the podcast or
spoken audio application.
Here's the audio
session setup for that.
Get the shared instance
of the audio session,
set your category to playback.
You let the mode be default for
now, but if you were running
in iOS 9 or later, you
can use this new mode,
AVAudioSessionModeSpokenAudio,
then set your category
and your mode.
The next thing you're
going to want to do is
to add an interruption handler.
We are going to add an interruption
handler called handleInterruption,
and we want it to be called
in response to the AVAudioSession
interruption notification.
This is also a good time to
register for other notifications
of interest; for example, if the
media server died and you want
to be notified about that so
you can reset your audio state.
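On the spoken-audio side, the setup might be sketched like this; PodcastController and its instance are hypothetical, and the handler itself is sketched a little further below:

    import AVFoundation

    // Hypothetical setup on the podcast / audio book side.
    func configureSpokenAudioSession(for controller: PodcastController) throws {
        let session = AVAudioSession.sharedInstance()

        // Playback category with the spoken-audio mode.
        try session.setCategory(.playback, mode: .spokenAudio, options: [])

        // Be told when another app interrupts us.
        NotificationCenter.default.addObserver(controller,
                                               selector: #selector(PodcastController.handleInterruption(_:)),
                                               name: AVAudioSession.interruptionNotification,
                                               object: session)
        // Also a good place to observe AVAudioSession.mediaServicesWereResetNotification.
    }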
Okay. So now let's look at
the interruption handler.
The first thing we'll do here
is we will get the user info
dictionary from the NSNotification
object that's
supplied in the function
prototype and from
that user info dictionary
we will look
at the key
AVAudioSessionInterruptionTypeKey.
Now we are going to
switch on that type.
The first part of this is
going to be what happens
when your audio session is
beginning to be interrupted,
and then the second part
will be on the next slide,
and that's for the end.
So if this is a begin
interruption,
the first thing you will
want to do is update your UI
to indicate your playback
has already been stopped.
Then if your internal state
dictated that you were playing,
then you will set
was playing to true.
That will let you know later
on when this interruption is
over that you are okay to resume
audio, if that's appropriate.
Then, of course, update
your internal state.
So now this is the
end interruption,
so in case this is
the end interruption,
you'll get the flag from
the user info dictionary
at
AVAudioSessionInterruptionOptionKey,
and if that flag is
OptionShouldResume
and you were playing before
when you were interrupted,
rewind the audio
just a little bit
because your audio was
ducked before it was stopped,
and then you can play
your player again,
update your internal state, and
then update the UI to reflect
that the playback has resumed.
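And the handler itself, sketched as part of the hypothetical PodcastController; the player, playback flags, and UI helpers are placeholders for your app's real state:

    import AVFoundation

    class PodcastController: NSObject {
        var player: AVAudioPlayer?
        var isPlaying = false
        var wasPlaying = false

        @objc func handleInterruption(_ notification: Notification) {
            guard let info = notification.userInfo,
                  let typeValue = info[AVAudioSessionInterruptionTypeKey] as? UInt,
                  let type = AVAudioSession.InterruptionType(rawValue: typeValue) else { return }

            switch type {
            case .began:
                // Playback has already been stopped for us; reflect that in the UI.
                updatePlaybackUI(stopped: true)
                wasPlaying = isPlaying      // remember whether to resume later
                isPlaying = false
            case .ended:
                let optionsValue = info[AVAudioSessionInterruptionOptionKey] as? UInt ?? 0
                let options = AVAudioSession.InterruptionOptions(rawValue: optionsValue)
                if options.contains(.shouldResume) && wasPlaying {
                    rewindSlightly()        // the audio was ducked before it was stopped
                    player?.play()
                    isPlaying = true
                    updatePlaybackUI(stopped: false)
                }
            @unknown default:
                break
            }
        }

        func updatePlaybackUI(stopped: Bool) { /* update the play/pause UI */ }
        func rewindSlightly() { player?.currentTime = max(0, (player?.currentTime ?? 0) - 3) }
    }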
And that's all the
code I have for you.
Quick recap.
Today we have told you
about some exciting new
enhancements in AVAudioEngine.
We have introduced
Inter-Device Audio Mode.
We've told you about some new
CoreAudioKit view controllers
that you can embed into
your OS X applications
for Inter-Device Audio Mode,
Bluetooth Low Energy, MIDI,
and Audio Video Bridge.
And we've also introduced
AVAudioSessionModeSpokenAudio
and the new AVAudioSession
category option Interrupt Spoken
Audio and Mix with Others.
But that's not all.
We have another session
coming up tomorrow at 11 a.m.
on all the exciting changes
to Audio Unit Extensions,
so we hope to see you all there.
Also some related sessions
that happened earlier today,
if you want to go
back and look at them,
especially if you are coming
in from the game audio side.
If we are not able to answer
all your questions at the labs,
these are some very useful
websites that you can go to,
and any general inquiries can
be directed to Craig Keithley,
whose contact information is
at the bottom, and we out.
[ Applause ]