WWDC2019 Session 207

Transcript

[ Music ]
[ Applause ]
>> Hi. I'm Danny Mandel.
And welcome to Introducing
SiriKit Media Intents.
We've added media domain support
to SiriKit for audio use cases
and we're super excited to tell
you all about it.
So what are we going to cover in
this session?
The first thing we'll do is
introduce the new SiriKit Media
Intents and talk about their
capabilities.
Then we'll talk about what's
required for you to handle
SiriKit Media requests on your
app.
And finally, we'll talk about
some best practices you're going
to want to follow to provide the
best user experience possible
when you add SiriKit Media
Support to your app.
This year, we're allowing you to
control audio in a whole new way
with SiriKit Media Intents.
And we think people are going to
love using the Siri Media
capabilities you build into your
apps.
With SiriKit Media Intents,
people will be able to do things
like play audio, update taste
profiles, add to collections,
and search.
This means that they'll be able
to use the rich natural language
processing capabilities of Siri
to say things like "Play Khalid
on my app," to immediately begin
playing Khalid in your app.
Let's take a look at the new
SiriKit Media Intents and their
capabilities.
There are four SiriKit Media
Intents.
The first intent is
INPlayMediaIntent.
INPlayMediaIntent allows people
to play audio by saying
something like "Play Outer Peace
in my app."
Now, you might remember that we
launched Media Playback Support
in iOS 12 with Shortcuts.
And this year, we're adding
SiriKit features to
INPlayMediaIntent.
The next intent is
INAddMediaIntent.
INAddMediaIntent allows people
to add items to their playlists
and libraries.
An example of this could be "add
this song to my road trip
playlist in my app."
We have
INUpdateMediaAffinityIntent,
which allows people to express
affinity for media items.
People can say this by saying
something as simple as "I like
this song."
And finally, we have
INSearchForMediaIntent, which
lets people search your app for
a particular media item.
For example, "Find Billie Eilish
in my app."
Let's talk about the supported
media types in SiriKit Media.
SiriKit Media supports a number
of different audio types, with
the first one being music.
And music support is going to
let you say things like "Play
the song Awesome Song in my
app."
In addition to songs, we have
support for albums, artists,
playlists, genres, and many
more.
So you're going to want to check
out the documentation for
INMediaSearch to get the full
list of supported search terms.
And we want you to adopt as many
of them as possible to provide
the best Siri user experience in
your app.
Additionally, we have playback
controls, like shuffle, repeat,
and playback queues.
And this lets people say things
like "Play Khalid shuffle in my
app," or, "Play Outer Peace next
in my app."
The next supported audio type is
Podcasts.
People can begin playing
podcasts by saying something
like "Put on the Stuff You
Should Know podcast for my app."
Additionally, people can also
control the playback order and
speeds of podcast episodes.
This lets people say something
like "Play the newest episode of
Stuff You Should Know podcast in
my app," or, "Play the Stuff You
Should Know podcast in my app at
double speed."
Moving on, we have audiobook
support.
This lets people say things like
"Play the audiobook Becoming in
my app."
And like podcasts, people can
begin playback at a specific
speed when asking to play
audiobooks.
And finally, we have radio
support.
Radio support allows people to
ask for a specific radio station
in your Radio Playback app.
For example, "Play 89.1 FM in my
app."
And don't worry if your app
doesn't fall into one of the
previous media types, you can
still adopt the SiriKit Media
Intent and get the full power.
People will be able to say
things like "Play search term in
my app," and you'll be able to
look up search term in your app
and play it.
The only thing that will be
missing will be support for
strongly parsed media types.
So say you had a nature sounds
app and you said "Play reptile
sounds in my app," or, "Play
mammal sounds in my app," Siri
is not going to know that those
are two different types of
animal sounds.
So you'd get a string of mammal
sounds or reptile sounds, and
you could look it up and play
it.
So not quite as structured as
the other types but still
supported.
Let's look at how we handle
these intents in SiriKit.
So the first thing to know about
how to handle requests with SiriKit
Media is that SiriKit Media
Intents are just like any other
SiriKit domain.
So all the intent-handling
happens in your Intents app
extension, where you conform to
the SiriKit Media Intent
handling protocols.
The details of SiriKit request
handling have been covered
really well in previous WWDC
talks, so I'd refer you to those
talks and to the developer
documentation online for more
general details about SiriKit
Request Processing.
Now let's look at what happens
for a typical request in the
SiriKit Media domain.
The request processing begins
when someone says "Play cool
song in my app."
And Siri is going to recognize
that this is a request for your
app, and it's going to launch your
Intents Extension.
Now, there are three steps in
SiriKit Request Processing:
resolve, confirm, and handle.
The first step in request
processing is the resolve step.
In the media domain, the resolve
step is where we take the
intent's INMediaSearch object
and we run a search against our
app catalog.
The output of resolve is one or
more concrete media item objects
to play.
Alternatively, if we didn't find
anything that matched or another
error occurred, we can return an
unsupported result, which will
tell Siri to display the
appropriate error dialog.
The next step in Request
Processing is the confirm step.
And typically, we discourage use
of the confirm step in the Media
domain.
In looking at usage in our own
apps, we find that using confirm
actually lowers the likelihood
that people will continue on to
play media.
So we don't recommend using the
confirm step in the Media
domain.
The final step in Request
Processing is the handle step.
Now, for INPlayMediaIntent, this
ends up being really simple,
because we're going to return
the response code handleInApp,
which is going to do a
background app launch.
And inside of our background app
launch, we're just going to play
media like we normally do in our
app.
The only tricky part here is
testing.
You're going to want to make
sure everything plays because
there's not going to be any UI
on screen.
You're also going to want to
make sure that you test in a
variety of situations.
For example, in CarPlay or when
you're wearing headphones.
So now that we've seen an
overview of SiriKit Request
Processing, let's take a look at
a simplified resolve media items
method.
And the first thing to note is
that the parameters are going to
be slightly different but the
resolve logic is going to
be the same for all four media
intents.
Resolve's job is to search the
app catalog.
And you're going to want to do
it the same way for all four
intents.
So we'll initialize a result to
the unsupported result.
And this is going to tell Siri
to say the appropriate error
dialogue if we don't find
anything in our app catalog.
INMediaSearch is the intent
field that contains the details
about what the user asks to
play.
INMediaSearch represents the
universe of all the
audio-related queries that Siri
supports.
Our job in Resolve is to go from
that universe of possibilities
to a single item to play.
And in this example, the first
thing we're going to do is read
the media name off the
INMediaSearch.
Then we're going to retrieve a
list of items from our app
catalog.
And we're going to use the media
name property off the media
search to compare against the
item's name property.
And we'll talk a little bit more
about this later on, but this
isn't really something you're
going to want to do in your
shipping app.
But if we had an exact match on
the name, we found the item to
play, and we're going to create
a success result using that
item's properties.
Then we'll call a completion
handler and move on to handle.
And in this case, as we said
before, handle ends up being
very short, since all we're
going to do is return the
handleInApp success response
code.
And this is going to start the
process of background launching
our app.
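The resolve and handle steps just described can be sketched in Swift roughly as follows. This is a minimal sketch, not the code shown on stage: `CatalogItem`, `appCatalog`, `findMediaItems(named:)` and `PlayMediaIntentHandler` are hypothetical names standing in for an app's own catalog and handler class, and the exact-match comparison is only suitable for a demo.

```swift
import Intents

// Hypothetical catalog entry; a real app would use its own model type.
struct CatalogItem {
    let identifier: String
    let name: String
    let artist: String
}

// Hypothetical in-memory catalog standing in for the app's real search.
let appCatalog = [
    CatalogItem(identifier: "song-1", name: "Cool Song", artist: "Some Artist")
]

// Looks up catalog entries whose name exactly equals the spoken media name.
func findMediaItems(named mediaName: String?) -> [INMediaItem] {
    guard let mediaName = mediaName else { return [] }
    return appCatalog
        .filter { $0.name == mediaName }  // exact match: too strict for a shipping app
        .map {
            INMediaItem(identifier: $0.identifier, title: $0.name,
                        type: .song, artwork: nil, artist: $0.artist)
        }
}

class PlayMediaIntentHandler: NSObject, INPlayMediaIntentHandling {
    // Resolve: go from the INMediaSearch to concrete media items.
    func resolveMediaItems(for intent: INPlayMediaIntent,
                           with completion: @escaping ([INPlayMediaMediaItemResolutionResult]) -> Void) {
        let items = findMediaItems(named: intent.mediaSearch?.mediaName)
        if items.isEmpty {
            // Nothing matched, so Siri speaks its error dialog.
            completion([INPlayMediaMediaItemResolutionResult.unsupported()])
        } else {
            completion(INPlayMediaMediaItemResolutionResult.successes(with: items))
        }
    }

    // Handle: ask Siri to background launch the app, which does the playback.
    func handle(intent: INPlayMediaIntent,
                completion: @escaping (INPlayMediaIntentResponse) -> Void) {
        completion(INPlayMediaIntentResponse(code: .handleInApp, userActivity: nil))
    }
}
```

Keeping the lookup in a free function makes it possible to exercise the matching logic without going through Siri.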
Now let's take a look at
background app launch.
The method that we implement in
our app delegate to support
background app launch is
application(_:handle:completionHandler:).
Again, this is a pretty short
implementation.
We're going to read the first
media item to play out of the
intent and then we're going to
just play it in our app the way
we normally do.
And finally, we'll call the
completion handler with our
success response code.
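A sketch of that app delegate method might look like the following. `AudioPlayer` is a hypothetical stand-in for the app's existing playback code, and returning a failure code for an unexpected intent is an assumption of this sketch rather than something stated in the session.

```swift
import UIKit
import Intents

// Hypothetical stand-in for the app's existing playback engine.
final class AudioPlayer {
    static let shared = AudioPlayer()
    private(set) var lastPlayedIdentifier: String?

    func play(identifier: String?) {
        // A real app would start audio playback here.
        lastPlayedIdentifier = identifier
    }
}

class AppDelegate: UIResponder, UIApplicationDelegate {
    // Called on background launch after the extension returns handleInApp.
    func application(_ application: UIApplication,
                     handle intent: INIntent,
                     completionHandler: @escaping (INIntentResponse) -> Void) {
        guard let playIntent = intent as? INPlayMediaIntent,
              let firstItem = playIntent.mediaItems?.first else {
            completionHandler(INPlayMediaIntentResponse(code: .failure, userActivity: nil))
            return
        }
        // Play the first resolved item the way the app normally does.
        AudioPlayer.shared.play(identifier: firstItem.identifier)
        completionHandler(INPlayMediaIntentResponse(code: .success, userActivity: nil))
    }
}
```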
So now that we've seen the new
intents and how they all fit
together, let's hand it over to
Ryan Klems, who's going to show
us how this all works in a real
app.
[ Applause ]
>> Thanks, Danny.
Adding SiriKit Media Intent
handling to your existing media
application is easy.
Here we have our music
application and all we need to
do to add Siri support is to add
the Siri extension target, add a
few methods, and then we'll be up
and handling Siri requests in no
time.
To add the Intents
extension, all we have to do is
go to file, new, target, select
the intents extension and click
next.
Give it a name and then click
finish.
It will go ahead and create our
intent handler for us.
We'll want to go ahead and add
the Siri capability to our
application.
And then we'll come over to our
control extension and we will
add our intents that we support,
and in this example, we'll just
go ahead and support the
INPlayMediaIntent and the
INAddMediaIntent.
We'll go ahead and select the
music type.
We have a few methods here that
we want to add to make sure that
we build in our extension.
And we'll want to make sure to
turn on our proper code signing.
Now we'll come over to our
intent handler and all we need
to do is add support for the
INPlayMediaIntent handling
protocol.
And then we'll drop in some
stubs for our resolve and handle
methods.
For this beginning example,
we're just going to return the
unsupported result from resolve
media items, which will cause
Siri to speak to the fact that
it couldn't find the item.
So we'll go ahead and try that
out.
This is what that would've
looked like.
So Siri speaks to the fact that
it couldn't find the item due to
the fact that we returned the
unsupported from the intent
handler.
So what we'll do now is we'll go
ahead and hook this up to our
existing search implementation.
And in this case, the first
thing that we're going to want
to do is determine what the user
is searching for.
So for this simple example,
we'll just search for an artist.
And in the method, we'll go
ahead and resolve the media
item.
And once we resolve the media
item, then we will return to our
handle method and we're just
going to return the handleInApp
response code, which will
cause us to background launch
the application.
So in order to do that, we'll
switch over to our app delegate
and we need to add the handle
intent method.
And this will just extract the
INMediaItem that we passed, that
we resolved, in the previous
step, and pass that to playback.
And so we'll go ahead and play
what that would look like.
So you can see, we return the
INMediaItem and it has resolved
that, handed it back to the
application, and begun playback.
So now that we've done that, why
don't we go ahead and add
support for the add method.
So to do that, we'll just extend
this by adding the
INAddMediaIntent handling
protocol.
And then we'll add our methods
for resolving and handling for
the add method.
So you notice here that the
resolve media items for add
looks pretty much identical to
the resolve method for play.
Additionally, now for add, we'll
also have a resolve media
destination.
And this is where we're going to
determine whether the user's
trying to add to the library or
to a playlist.
And in the case of a playlist,
you might want to do something
like "Return playlist name not
found," if the playlist that the
user specified was not present
in their library.
And also, what's different in
the add is there's no reason to
go back to the application
to begin playback like we do in
the Play Media Intent.
So in this case what we would do
is we would actually handle
everything in the extension
itself.
So we have the resolved media
item, and we'll just go ahead
and use our application's methods
to add to the library or to the
playlist.
And in this case, we're just
going to use the media player
utilities to add to it.
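The add flow described here could be sketched like this. `LibraryStore` is a hypothetical in-memory stand-in for the app's library (the demo used the media player utilities instead), and falling back to the library when no destination is given is an assumption of this sketch. Media item resolution is omitted because it mirrors the play case.

```swift
import Intents

// Hypothetical in-memory library; a real app would call its own storage code.
final class LibraryStore {
    static let shared = LibraryStore()
    private(set) var playlists: [String: [String]] = ["Road Trip": []]
    private(set) var library: [String] = []

    func hasPlaylist(named name: String) -> Bool { return playlists[name] != nil }
    func add(_ identifier: String, toPlaylist name: String) {
        playlists[name, default: []].append(identifier)
    }
    func addToLibrary(_ identifier: String) { library.append(identifier) }
}

class AddMediaIntentHandler: NSObject, INAddMediaIntentHandling {
    // Decide whether the user is adding to the library or to a playlist.
    func resolveMediaDestination(for intent: INAddMediaIntent,
                                 with completion: @escaping (INAddMediaMediaDestinationResolutionResult) -> Void) {
        guard let destination = intent.mediaDestination else {
            // Assumption: default to the library when nothing was specified.
            completion(INAddMediaMediaDestinationResolutionResult.success(with: .library))
            return
        }
        if case .playlist(let name) = destination,
           !LibraryStore.shared.hasPlaylist(named: name) {
            completion(INAddMediaMediaDestinationResolutionResult
                .unsupported(forReason: .playlistNameNotFound))
            return
        }
        completion(INAddMediaMediaDestinationResolutionResult.success(with: destination))
    }

    // Everything happens in the extension; there is no app launch.
    func handle(intent: INAddMediaIntent,
                completion: @escaping (INAddMediaIntentResponse) -> Void) {
        guard let identifier = intent.mediaItems?.first?.identifier else {
            completion(INAddMediaIntentResponse(code: .failure, userActivity: nil))
            return
        }
        if case .playlist(let name)? = intent.mediaDestination {
            LibraryStore.shared.add(identifier, toPlaylist: name)
        } else {
            LibraryStore.shared.addToLibrary(identifier)
        }
        completion(INAddMediaIntentResponse(code: .success, userActivity: nil))
    }
}
```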
So let's go ahead and take a
look at what that would look
like.
So it speaks to the item that is
added to the playlist as well as
the playlist name, because those
are specified in the request.
So as you can see, it's
relatively easy to add support
to your application.
And we really look forward to
seeing what you do in your
application.
Thank you.
[ Applause ]
Danny, back to you.
>> Thanks, Ryan.
So what did Ryan show us?
First, he showed us how to add
our intents extension to our
app.
Then he showed us how to specify
our supported intents and
supported media types.
And finally, he showed us how to
implement resolve and handle for
INPlayMediaIntent and
INAddMediaIntent so we could
immediately begin playing and
adding media using Siri.
So let's look at some best
practices you're going to want
to follow when you adopt SiriKit
Media.
We have great news if you've
already implemented shortcut
support for media playback.
SiriKit Media uses the same code
in handle and for background
app launch.
While Shortcuts operates on
previously donated intents which
don't require intent resolution,
SiriKit does require a resolve
step.
So the two things you're going
to need to add are your resolve
method in your intent handler,
and you're going to need to
update your intents extension
supported media types in Xcode
so Siri knows what content types
your app supports.
The extension's handle method
should be the same between both
implementations and the
background app launch in the app
delegate's handle intent can be
the same, as long as you use the
same identifiers for your media
items.
So let's see what this looks
like.
Here's our Shortcuts
implementation from last year.
You'll notice that there's no
resolve method, but the rest of
it's the same.
So to go from Shortcuts to
SiriKit, we just add in our
resolve method and we're good to
go.
Now let's talk about what you
need to do to bring SiriKit
Media Support to the Apple
Watch.
On watchOS, apps launch in the
foreground.
And the way you do that is by
returning INPlayMediaIntent
response code continueInApp from
your handle method in your
Intents extension.
This code is going to do a
foreground app launch and
forward the intent to your
WKExtensionDelegate in your
app.
You'll note that the app's handle
method looks pretty similar to
the one in iOS.
The method signature is slightly
different, but you're going to
want to read the intent off the
NSUserActivity's interaction
property.
And then, just like on iOS, you
read the media item to play and
start playback in your app.
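On the watch side, the two pieces described above might be sketched as follows. `WatchAudioPlayer` and `makeWatchHandleResponse()` are hypothetical names. In a real project the response would be built in the Intents extension target and the delegate would live in the WatchKit extension; they are shown together here only for brevity.

```swift
import WatchKit
import Intents

// Hypothetical stand-in for the watch app's playback code.
final class WatchAudioPlayer {
    static let shared = WatchAudioPlayer()
    private(set) var lastPlayedIdentifier: String?
    func play(identifier: String?) { lastPlayedIdentifier = identifier }
}

// Intents extension side: the response handle(intent:completion:) would pass
// to its completion handler to request a foreground app launch.
func makeWatchHandleResponse() -> INPlayMediaIntentResponse {
    return INPlayMediaIntentResponse(code: .continueInApp, userActivity: nil)
}

// App side: the intent arrives on the user activity's interaction property.
class ExtensionDelegate: NSObject, WKExtensionDelegate {
    func handle(_ userActivity: NSUserActivity) {
        guard let intent = userActivity.interaction?.intent as? INPlayMediaIntent,
              let firstItem = intent.mediaItems?.first else { return }
        WatchAudioPlayer.shared.play(identifier: firstItem.identifier)
    }
}
```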
One word of caution.
You're going to want to use the
on-device cache in your resolve
method on watchOS.
Only go over the network if it's
absolutely necessary.
So we know that when someone
says "Play Awesome Song in my
app," the first step in Request
Processing is to resolve the
media items to play.
And we looked at a previous
implementation where we checked
the value of the media item's
name against the intent's media
name property.
And it was an exact match.
So what are some edge cases that
we didn't cover in that
implementation?
The first place that our
previous method won't do the
right thing is if we have a
mismatch on either case or
punctuation.
So in this example, "Play hello
in my app," we can see a few
cases where exact string
comparison will fail.
The exact song title is
uppercase HELLO with an
exclamation point.
But it's possible that the Siri
speech engine could give us
lowercase hello.
Or maybe it gives us uppercase
HELLO without the exclamation
point.
So it's really important that we
ignore case and punctuation in
our resolve method.
Similarly, a lot of music
entities have things in the
title that people either won't
say or they're going to say in a
way that doesn't exactly match
with the item's title.
For instance, a lot of albums
come in a deluxe edition
variant.
And people aren't going to say
this when they ask to play the
album.
They aren't going to say "Play
the album Outer Peace deluxe
edition in my app," they'll say
"Play the album Outer Peace in
my app."
And soundtracks are another
example.
People will say "Play the Rocket Man
soundtrack in my app."
They aren't going to say
"Music from the motion picture."
And finally, a lot of hip-hop
songs have this featuring
abbreviation in their title.
So people either won't say it or
they'll say the word
"featuring."
So exact match isn't going to
work here either.
And podcasts also have some
cases where there is a mismatch
between what people say and what
entity titles are.
So some podcasts have the word
"podcast" in their title.
So if someone said "Play the
Stuff You Should Know podcast in
my app," Siri could parse it as
Stuff You Should Know with a media
item type of podcast.
So an exact match isn't going to
work here either.
And additionally, some podcasts
come in audio or video variants,
and that audio or video word
appears in the title.
But SiriKit Media implies the
audio variety, so people aren't
going to say that either.
Finally, remember that you're
working with a speech
recognizer, and the speech
recognizer can come with word
formatting variations.
So if someone asks to play the
song 81st, you can get the
number 81st or you can get the
hyphenated eighty-first.
Or if someone asks to play the
song I Love You Son, you could
get sun or son.
Now, Siri is going to do the
best job it can to give you the
entity titles for the things
that it knows about.
But it's better for you to be
flexible in your resolve method.
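One way to get some of that flexibility is to compare normalized keys rather than raw titles. The sketch below is one possible approach, not an Apple-provided API: `matchKey(_:)` and `looselyMatches(spoken:title:)` are hypothetical helpers that ignore case, punctuation, parenthetical qualifiers such as "(Deluxe Edition)", and trailing "feat." credits. Spoken-number variants (81st versus eighty-first) and homophones (sun versus son) are not covered and would need something like alias tables or fuzzy matching.

```swift
import Foundation

/// Reduces a title to a loose comparison key.
func matchKey(_ title: String) -> String {
    // Lowercase and strip diacritics.
    var text = title.folding(options: [.caseInsensitive, .diacriticInsensitive], locale: nil)
    // Drop parenthetical or bracketed qualifiers, e.g. "(Deluxe Edition)".
    text = text.replacingOccurrences(of: "[\\(\\[][^\\)\\]]*[\\)\\]]",
                                     with: " ", options: .regularExpression)
    // Drop a trailing "feat. ..." or "featuring ..." credit.
    text = text.replacingOccurrences(of: "\\s(feat\\.?|ft\\.?|featuring)\\s.*$",
                                     with: " ", options: .regularExpression)
    // Treat punctuation as a separator and collapse whitespace.
    let words = text.components(separatedBy: CharacterSet.alphanumerics.inverted)
        .filter { !$0.isEmpty }
    return words.joined(separator: " ")
}

/// True when the spoken phrase and the catalog title share a comparison key.
func looselyMatches(spoken: String, title: String) -> Bool {
    return matchKey(spoken) == matchKey(title)
}
```

For example, `looselyMatches(spoken: "hello", title: "HELLO!")` and `looselyMatches(spoken: "Outer Peace", title: "Outer Peace (Deluxe Edition)")` both come out true under this key.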
When you implement your SiriKit
Media Support, you control what
Siri says by the INMediaItem
objects you return from your
resolve method.
As you can see here, the user
asks to play the song Maybe
Sometime by Special Disaster
Team in Control Audio.
And Siri said, "Here's Maybe
Sometime by Special Disaster
Team from Control Audio."
In this case the returned
INMediaItem had a title property
of Maybe Sometime and an artist
property of Special Disaster
Team.
So make sure you always populate
title, artist, and type in the
returned media items, as these
can all influence what Siri
says.
And one thing to note, if you
return more than one item in
your resolve method, Siri is
going to speak to the first item
in the list.
It's super important that you
handle error cases appropriately
in SiriKit Media.
When you're interacting with an
intelligent assistant like Siri,
it can be unclear why something
happened when it happened.
And handling error cases
appropriately is going to allow
you to give the user the best
idea of what happened when
something goes wrong.
So the most common case you're
going to run into is when you
don't find something in your app
catalog that someone's asked
for.
And you handle this case by just
returning the unsupported
resolution result from resolve
media items.
But there's a lot of other
errors that could happen.
Maybe someone asks to play
something that requires cellular
data but they have the cellular
data switch turned off for your
app.
Or maybe they ask to play
something that requires a
subscription and they don't have
one.
The full list is in
INPlayMediaMediaItemUnsupportedReason.
And there's generally similar
naming for all four intents, so
make sure you adopt them for all
the intents you support.
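As a rough sketch of how those reasons might be returned for the play intent: `PlaybackBlocker` below is a hypothetical app-side enum describing why an item cannot be played, and the mapping is an illustration rather than a complete list of reasons.

```swift
import Intents

// Hypothetical app-side description of why playback cannot start.
enum PlaybackBlocker {
    case loginRequired
    case subscriptionRequired
    case cellularDataOff
    case explicitContentRestricted
}

// Maps the app's own blocker to the reason Siri understands.
func unsupportedReason(for blocker: PlaybackBlocker) -> INPlayMediaMediaItemUnsupportedReason {
    switch blocker {
    case .loginRequired: return .loginRequired
    case .subscriptionRequired: return .subscriptionRequired
    case .cellularDataOff: return .cellularDataSettings
    case .explicitContentRestricted: return .explicitContentSettings
    }
}

// Builds the resolution result a resolve method would pass to its completion.
func resolutionResult(blockedBy blocker: PlaybackBlocker) -> INPlayMediaMediaItemResolutionResult {
    return INPlayMediaMediaItemResolutionResult
        .unsupported(forReason: unsupportedReason(for: blocker))
}
```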
Now let's talk about some of the
variety of things that people
say to Siri and how you can do a
good job of handling them in
your SiriKit Media
implementation.
So one of the most popular
things that people say to Siri
is "Play my app."
They don't tell you exactly what
it is that they want to play,
and it's your job as a SiriKit
Media developer to choose the
right thing to do for your app.
Now, this might be something as
simple as resuming an existing
queue.
If you're not an audiobooks or
podcasting app, this is probably
the most reasonable behavior to
implement.
But you can make the behavior as
dynamic as you like.
Maybe you want to direct them to
a recommended playlist or some
hot new trending music.
The choice is yours.
And how do you know that someone
said, "Play my app," well, there
will be no search criteria
specified in the INMediaSearch
object.
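A check for that situation could look something like the sketch below. Which fields Siri leaves empty for a bare "Play my app" request is an assumption here; `hasNoSearchCriteria(_:)` is a hypothetical helper that simply treats a search with no names and no identifier as unspecified.

```swift
import Intents

/// True when the search carries no name-like criteria to look up.
/// Assumption: a bare "Play my app" request leaves all of these fields empty.
func hasNoSearchCriteria(_ search: INMediaSearch?) -> Bool {
    guard let search = search else { return true }
    return search.mediaName == nil
        && search.artistName == nil
        && search.albumName == nil
        && (search.genreNames ?? []).isEmpty
        && (search.moodNames ?? []).isEmpty
        && search.mediaIdentifier == nil
}
```

When this returns true, a resolve method could return whatever the app considers its default, such as the existing queue.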
One thing that might seem like a
good idea is to ask someone what
they want to play.
But for the same reason that we
don't recommend using confirm,
we don't recommend this
approach.
Putting dialogue prompts in
front of people is one of the
most common ways that they'll
quit the SiriKit Media
experience.
People can ask to initiate
playback with additional
controls, and some of the
supported options are repeat,
shuffle, resume, and playback
queue location.
And if you support these in your
app, make sure you support them
in your INPlayMediaIntent
implementation as well.
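Reading those playback controls off the intent might look like this. `PlaybackOptions` is a hypothetical app-side struct, and the defaults used when Siri supplies no value are assumptions of this sketch.

```swift
import Intents

// Hypothetical app-side summary of the requested playback controls.
struct PlaybackOptions: Equatable {
    var shuffle = false
    var repeatAll = false
    var resume = false
    var playNext = false
    var speed = 1.0
}

// Copies the controls from the intent, falling back to defaults when unset.
func playbackOptions(from intent: INPlayMediaIntent) -> PlaybackOptions {
    var options = PlaybackOptions()
    options.shuffle = intent.playShuffled ?? false
    options.repeatAll = intent.playbackRepeatMode == .all
    options.resume = intent.resumePlayback ?? false
    options.playNext = intent.playbackQueueLocation == .next
    options.speed = intent.playbackSpeed ?? 1.0
    return options
}
```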
And people can also ask to play
content with a variety of search
options.
And one of the most useful
search options is the sort
parameter.
You can say things like "Play
the new Stuff You Should Know
podcast in my app," and you'll
get INMediaSortOrder newest.
Or you can ask for the best
album by an artist, and you'll
get INMediaSortOrder best.
Or you can ask your app for a
recommendation, and you'll get
INMediaSortOrder recommended.
Check out the full list in
INMediaSortOrder.
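A resolve method could use the sort order roughly as sketched below. `Episode` and `pickEpisode(from:sortOrder:)` are hypothetical, and using a rating to decide "best" is just one possible interpretation.

```swift
import Foundation
import Intents

// Hypothetical podcast episode model.
struct Episode {
    let title: String
    let published: Date
    let rating: Double
}

// Chooses an episode according to the sort order Siri parsed.
func pickEpisode(from episodes: [Episode], sortOrder: INMediaSortOrder) -> Episode? {
    switch sortOrder {
    case .newest:
        return episodes.max { $0.published < $1.published }
    case .oldest:
        return episodes.min { $0.published < $1.published }
    case .best:
        return episodes.max { $0.rating < $1.rating }
    default:
        // Other sort orders fall back to catalog order in this sketch.
        return episodes.first
    }
}
```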
And another powerful search
option is the reference
property, which can have
INMediaReferenceCurrentlyPlaying.
This is really useful for
INAddMediaIntent or
INUpdateMediaAffinityIntent, because
people can easily add the
currently playing item to a
library or a playlist, or they
can tell your app that they love
or they hate their currently
playing item.
And one thing to note too is if
you've populated the external
content identifier in
MPNowPlayingInfoCenter, that
identifier is going to be in
INMediaSearch's mediaIdentifier
property, so you know exactly
what item to find.
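The two halves of that round trip might be sketched as follows: the app publishes its own identifier with the now playing info, and the extension later reads it back from the search. `publishNowPlaying(identifier:title:)` and `currentlyPlayingIdentifier(from:)` are hypothetical helper names.

```swift
import Intents
import MediaPlayer

// App side: include the app's own identifier in the now playing info.
func publishNowPlaying(identifier: String, title: String) {
    MPNowPlayingInfoCenter.default().nowPlayingInfo = [
        MPMediaItemPropertyTitle: title,
        MPNowPlayingInfoPropertyExternalContentIdentifier: identifier
    ]
}

// Extension side: when the request refers to the currently playing item,
// return the identifier carried on the search, if any.
func currentlyPlayingIdentifier(from search: INMediaSearch?) -> String? {
    guard let search = search, search.reference == .currentlyPlaying else {
        return nil
    }
    return search.mediaIdentifier
}
```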
Now, telling Siri more about how
your customer uses your app is
going to help Siri provide a
wonderful SiriKit Media
experience.
So when you give user vocabulary
to Siri, this helps Siri
understand the parts of your
catalog that are interesting to
your customer.
It's important to note it's not
the entire contents of your app
catalog, it's only the pieces of
content that your customer is
specifically interested in.
It's also important to note that
the vocabulary is ordered, so
make sure that you include the
most important items at the
beginning of the collection.
And depending on the type of
media your app supports, you're
going to send different types to
Siri.
Music apps should use playlist
title and music artist name.
Audiobook apps should use
audiobook title and audiobook
author.
And podcasting apps should use
show title.
And for those things that are
applicable to everybody that
uses your app, check out the
global vocabulary support in
SiriKit.
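For a music app, supplying user vocabulary might look like the sketch below. `PlayedItem` and `vocabularyOrder(_:)` are hypothetical, and ordering by most recently played is only one assumption about what counts as "most important" for a given customer.

```swift
import Foundation
import Intents

// Hypothetical record of something the customer has played.
struct PlayedItem {
    let name: String
    let lastPlayed: Date
}

// Orders names most recently played first and removes duplicates.
func vocabularyOrder(_ items: [PlayedItem]) -> [String] {
    var seen = Set<String>()
    return items
        .sorted { $0.lastPlayed > $1.lastPlayed }
        .map { $0.name }
        .filter { seen.insert($0).inserted }
}

// Hands the ordered playlist titles and artist names to Siri.
func updateSiriVocabulary(playlists: [PlayedItem], artists: [PlayedItem]) {
    let vocabulary = INVocabulary.shared()
    vocabulary.setVocabularyStrings(NSOrderedSet(array: vocabularyOrder(playlists)),
                                    of: .mediaPlaylistTitle)
    vocabulary.setVocabularyStrings(NSOrderedSet(array: vocabularyOrder(artists)),
                                    of: .mediaMusicArtistName)
}
```

Audiobook and podcast apps would pass the audiobook title, audiobook author, or show title string types in the same way.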
So in conclusion, we're
launching new SiriKit Media
Intents to allow you to use Siri
to control your audio apps.
You'll be able to play, add,
update taste profiles, and
search for media using the new
intents.
It's very important that you
provide the best experience
possible.
And you can do this by embracing
search flexibility.
Because people don't say exactly
what they want to play.
You can handle errors
appropriately, so people know
what's happened when something
goes wrong.
And you're going to want to
make sure that you construct
your INMediaItem objects
appropriately so Siri can speak
the best dialogue possible.
And finally, make sure you give
Siri all the important contextual
information possible so Siri can
make the best choices possible
on your customer's behalf.
So come see us at the labs and
check out the documentation
online.
We think you're going to love
building apps with the new
SiriKit Media Intents and we
can't wait to see what you'll
build.
Thank you.
[ Applause ]