WWDC2017 Session 710

Transcript

>> Good morning, everyone.
My name is Krishna and I'm from
the Core ML Engineering team,
and today we're going to talk
about Core ML in Depth.
This year, we introduced Core
ML.
It's the easiest way for you to
integrate machine learning
models in your applications.
Core ML is available on macOS,
iOS, watchOS, and tvOS.
We had a session on Tuesday that
introduced Core ML.
For those of you that missed
that session, let's just take a
couple moments to recap some of
the key things we learned in
that session.
Now the first and the most
important thing about Core ML is
that you can think of your
machine learning models just
like code, and you interact with
them just like any other Swift
class.
Your workflow looks a bit like
this.
You start with a machine
learning model, you drag and
drop that model into Xcode,
Xcode will automatically
generate a Swift or an Objective-C interface for you to program
against that model.
You write your application code,
you build it.
Xcode will bundle both the code
as well as the model in your
app.
In that session, we also saw a
little demo of a flower
predictor.
It was an application where
given a picture, let's say this
pink rose, the application is
supposed to tell you what kind
of a flower it was and how
confident it was.
We saw that in order to use that
or build that application, it
just took a few lines of code.
One line of code to instantiate
the model, and one line of code
to make a prediction from that
model.
So in this session, we're going
to pick up from where we left
off and we're going to talk a
little bit more about the
different kinds of use cases,
all the cool stuff you guys can
do with Core ML.
We're then going to talk about how Core ML is optimized for the hardware on which it runs, and what that means for you as a developer.
Finally, we're going to talk
about how you can obtain Core ML
models for use in all your
applications.
So it's going to be a fun
session with a couple demos,
let's get started.
So you've already seen an
example of an app that used
images, the flower predictor.
But with Core ML you can do a
lot more than just images.
You can work with gestures,
let's say handwriting detection
on the watch.
You can work with video, let's
say you want to do credit card
detection.
You can work with audio, and you
can even work with text.
Now by taking inputs of all of
these different types, you can
build a large variety of
applications.
Let's say you want to build an application where, as you type some text, it tells you whether it's a happy text, a sad text, a passive-aggressive text, or an angry one.
You can do that with Sentiment
Analysis, and we'll see a little
demo of that today.
With Style Transfer, you can even make pictures of your family look like Vincent Van Gogh paintings.
And with Gesture Recognition,
you can have a whole new way to
take inputs to your
applications.
Now all of these are amazing
possibilities because with Core
ML you can use a large variety
of models.
You can use Classical Machine
Learning Models like Generalized
Linear Models, Trees, and
Support Vector Machines.
Now these are great because they are small, you can make fast predictions with them, and they run on any device.
But you can also work with a
large variety of Neural
Networks.
We have support for over 30 different layer types, and that's huge. You can do things like feedforward and convolutional neural networks for all your image- and video-based applications, and you can also do things like recurrent neural networks or LSTMs for all of your text-based applications.
We'll take a look at Recurrent
Neural Networks in today's
session.
In fact, with Core ML you can
also combine models of various
different types.
So you can take, let's say, a
Neural Network and combine it
with a Tree and then you can get
one big model with that.
So this concept is called a
Pipeline.
But most importantly for you as
a developer, we want you to
focus on the code that you're
writing in your apps and not on
the specific complexities of the
model that's running there.
And we achieve that by giving you a functional abstraction over your models. So all you need to care about is that your models are prediction functions that take in some inputs and give out some outputs.
And these inputs and outputs can
be of five different types;
numeric, categorical, images,
arrays, and dictionaries.
Now let's just take a little
look at each of these five
different types.
So numerics and categories are
exposed to you in Swift as
doubles, integers or strings.
So it's very natural.
We have a little example of an
application that uses these two
types on developer.apple.com.
It's an application that does
house price prediction.
So some of the inputs to this
model are numeric, that is
they're continuous so you can go
from zero to infinity.
And some of them are categorical
or discrete, so you go zero, 1,
2, 3, 4.
You've already seen an example
of using images.
Now these images are exposed to
you as CVPixelBuffers.
For the more complex things like
gestures, audio and video, we
have a new type called
MLMultiArray to encapsulate a
multidimensional array.
And for a lot of your text-based
applications, you'll be
interacting with dictionaries.
Here, the dictionaries' keys are either strings or integers, and the values are doubles.
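To make those five types concrete, here is a hedged Swift sketch of how they show up in app code; the values and shapes are hypothetical, but MLMultiArray and CVPixelBuffer are the actual Core ML types for arrays and images:

```swift
import CoreML

let bedrooms: Int = 3                                    // categorical (discrete)
let price: Double = 450_000.0                            // numeric (continuous)
let wordCounts: [String: Double] = ["core": 1, "ml": 1]  // dictionary
let gesture = try? MLMultiArray(shape: [3, 64], dataType: .double) // multi-array
// Images are passed as CVPixelBuffer, typically created from a camera
// frame or a CGImage before going into the model.
```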
Let's take a little look at
using Text and working with
Dictionaries.
And we're going to do that with
an application of Sentiment
Analysis.
Now I've always wanted this app
where if I type some text I want
the UI to pop and reflect the
mood that I'm in.
So if I say Core ML is awesome!
I love using it!
I want it to go green and I
want, like, a happy face.
And if I talk to you about, say,
how bad the lunch was.
Like, today's lunch was
disappointing and sad.
I want it to go red.
So that's what I want to do.
So what does it take to build an
application like this?
So I'm going to start with an
app shell where the user can
type in some text.
As soon as the user hits the
space bar, I'm going to take all
of that text, give it to a
machine learning model, and get
back a sentiment prediction.
So the sentiment prediction is
either going to be happy,
neutral or sad.
As soon as I get back this
prediction, I'm going to go back
and quickly update the UI to
reflect the mood I'm in.
The most important thing for you
to note is all of this can
happen real time on the device
as the user is typing, so it
makes for an amazing experience.
So let's see how you can go
ahead and build that app.
Well, the most important thing
here is the model, and for this
we're going to use a Sentiment
Analysis model.
But this Sentiment Analysis model is going to operate not on the raw text, but on word counts.
So these word counts will be
represented as dictionaries
where the keys are the words and
the values are the number of
times that word appears in a
sentence.
So "Core ML is awesome. I love using it." translates to a dictionary that looks a bit like this.
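(A hedged reconstruction of the slide's dictionary, assuming simple lowercased tokenization:)

```swift
let wordCounts: [String: Double] = [
    "core": 1, "ml": 1, "is": 1, "awesome": 1,
    "i": 1, "love": 1, "using": 1, "it": 1
]
```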
So once I have my Word Counts I
can pass that to my Sentiment
Analysis model and I can get
back a prediction.
In this case it's "happy".
But you might wonder, okay, how do I go from my raw text to word counts? Well, you might have already seen the session on NLP, but you can use the already existing tools in the NSLinguisticTagger to tokenize and count the number of words.
So I'll use NLP to preprocess my
texts, specifically the
NSLinguisticTagger, and then I'm
going to get my word counts
which I'll then give to a model
and get a prediction out of it.
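A minimal sketch of that preprocessing step, assuming the iOS 11 NSLinguisticTagger API; the function name matches the one used in the demo, though the implementation shown on stage may differ:

```swift
import Foundation

func tokenizeAndCountWords(_ sentence: String) -> [String: Double] {
    let tagger = NSLinguisticTagger(tagSchemes: [.tokenType], options: 0)
    tagger.string = sentence
    let range = NSRange(location: 0, length: sentence.utf16.count)
    let options: NSLinguisticTagger.Options = [.omitWhitespace, .omitPunctuation]

    var wordCounts = [String: Double]()
    tagger.enumerateTags(in: range, unit: .word,
                         scheme: .tokenType, options: options) { _, tokenRange, _ in
        let word = (sentence as NSString).substring(with: tokenRange).lowercased()
        wordCounts[word, default: 0] += 1  // count each occurrence of the token
    }
    return wordCounts
}
```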
So with that, you can build an
application like this.
But let's not just talk about
it, let's go ahead and do it.
So I'm going to walk over and
open up Xcode.
So what I have here is Xcode and
a simulator.
The simulator is now running an
app shell.
The app shell doesn't have the
model in there, so if I type
something, let's say, Core ML is
amazing, amazing fun, nothing
really happens to the UI.
So what we're going to do right
now is we're going to go ahead
and incorporate the machine
learning model in here so that
this app is going to become much
more vibrant.
So the first thing I'm going to
do is open up Finder and drag
the Sentiment Analysis model
into Xcode.
Let's go take a look at what
this model is.
So as you can see, this is a
Sentiment Analysis model.
The type of this model is a
Pipeline Classifier, so it does
a couple different things before
it gives the final prediction.
It's only 167 kilobytes, so it's
pretty tiny.
And the inputs to this model are
word counts, and the word counts
are dictionaries where the key
is the word and the value is the
number of times that word appeared.
And I get two outputs from this
model, a Sentiment Label, that's
one of two things.
It's either good or bad.
And a Sentiment Score, which is
a probability associated with
the good sentiment or the bad
sentiment.
So that's a dictionary.
So what I'm going to use in this
application is I'm going to use
the Sentiment Score to determine
within a range of zero to 1 how
nice this text was, and that's
what I'm going to be using to
update my UI.
So let me go ahead and include
this model in the target of my
application, and then Xcode will
automatically generate a nice
interface for me.
So I can go back to my
ViewController and now I'm going
to implement the logic to
incorporate this Machine
Learning Model in there.
So for that, I'm going to
implement this function, predictSentimentScoreFromRawText.
So this function is going to
take a sentence which is the
entire string.
It gets called every time the
user types a space, and what it
returns is a double value
between zero and 1, where zero
is really, really sad and 1 is
really, really happy.
So the first thing I want to do is instantiate this model, and I can simply do that by saying let model = SentimentAnalysis(). And then I'm going to use this model to make a prediction.
But as you can see, the input
here is a sentence but what I
really want is a word count.
So I've already implemented this
function called
tokenizeAndCountWords.
This function uses the
NSLinguisticTagger to tokenize
the sentence and then count the
number of tokens in that
sentence.
So I'm going to skip over that
and I'm just going to call that
function.
So I'm going to say let wordCounts = tokenizeAndCountWords(sentence). And then I'm going to use this word count and provide it to my model. So I'm going to say if let prediction = try? model.prediction(wordCounts: wordCounts), and simply pass those wordCounts there.
And if this succeeds, I'm going
to use the prediction object to
get out the sentiment score, but
because I want a value between
zero and 1 I'm going to get the
score and not the label.
So I'll take the sentiment score
associated with the sentiment
"good" and I'll return that to
my UI.
And if this fails by any chance, I'm going to fall back to 0.5.
So I have a little error
handling here as well.
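Put together, the function from the demo looks roughly like this; SentimentAnalysis is the Xcode-generated class, while the input and property names (wordCounts, sentimentScore) are assumptions based on the on-stage description:

```swift
func predictSentimentScoreFromRawText(_ sentence: String) -> Double {
    let model = SentimentAnalysis()
    let wordCounts = tokenizeAndCountWords(sentence)
    if let prediction = try? model.prediction(wordCounts: wordCounts) {
        // Return the probability of the "good" sentiment:
        // 0 is really sad, 1 is really happy.
        return prediction.sentimentScore["good"] ?? 0.5
    }
    return 0.5  // neutral fallback if the prediction fails
}
```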
So I'm going to go ahead and
build that application.
So during this process, as you
might be aware, the model and
the code are both getting
packaged and getting shipped to
the device.
Another thing to note is that the compiler, the Core ML compiler, gets shipped as part of the Xcode toolchain. So if you want to compile the model or run the code generator yourself, you can use the compiler directly.
So now let's go use this app and
let's type something nice.
Let's type "Core ML is amazing
fun."
That's what I want to write.
"Core ML is amazing fun and I
love using it."
So immediately you saw the UI
popped and I got a little green
and I'm happy and, you know,
this is great.
[ Applause ]
But now I want to type something
bad, but I don't really want to
make fun of anything or anyone
so I want to talk about how my
life is terrible without CoreML.
"Life without CoreML is sloppy,
terrible and sad."
So obviously the UI is really
sad because, you know, life
without Core ML is really sad.
So what we really saw was a
seamless integration between NLP
and Core ML, so I built this
Sentiment Analysis Model and I
was able to make my application
a lot more vibrant.
And all of this was happening in
real time on the device as the
user typed it.
So that was pretty cool.
Let's go recap the two main
things that we talked about in
this demo.
So the first thing was that to preprocess text, we used the NSLinguisticTagger.
The second thing was that once I
got those word counts I could
then give it to a model and get
a prediction out of it.
And this is a pattern you're going to encounter a lot with text-based applications, because most text-based applications do not work directly on raw text. There's always a little bit of preprocessing that you have to do first.
But that was really a simple
example.
It was an introductory example.
Let's step up our game, let's get to the next level.
Let's talk about something
you've all interacted with on a
daily basis.
This is the Apple keyboard.
Now, when you type words in the
Apple keyboard, as you might all
be aware, you get very
contextual predictions of what's
the next most likely word you're
going to type.
So if I say, "I'm not sure if
Oliver will eat oysters, but he
will."
The keyboard tells you "so",
"totally" and "love" are three
likely words you're going to
type next.
So how do you go about building
something as sophisticated as
this?
So this is a predictive
keyboard, and the machine
learning task here is to make a
prediction for the next word.
The model that's being used here, or that would be used in an application like this, is usually a model that takes a sequence of words. So "I'm not sure if Oliver will eat oysters, but he will." is a sequence of words.
I give that as input to the
model and I get a prediction.
So you might wonder, okay,
what's the difference between
this model that we just saw and
the Sentiment Analysis one?
They look the same to me.
The key difference is that here
the input is a sequence of
words.
So if you jumble up the words
and give it to the model you're
going to get a completely
different prediction.
And to do something like this,
most machine learning models
will have a notion of state
that's associated with them, and
that's how they get this
behavior.
And the state gets passed along
as every prediction is made.
So it's like a baton in a relay
race.
Every time you make a
prediction, take the state and
pass it along.
We're going to do something like this using an LSTM, usually. Specifically, like a [inaudible] network.
But with Core ML, all of this is
going to be a lot easier.
So, let's take a look at what
you do but we'll do it with a
little more fun application.
We'll do it with a Shakespeare
Keyboard.
So instead of a regular
keyboard, this keyboard is going
to make me sound like
Shakespeare.
So if I say, "Shall I compare?"
It should say "thee", "summers",
"day" are the next three words
I'm likely to type.
So, what's really the difference
between the Shakespeare Keyboard
and the regular keyboard?
It's really the model that's
predicting the next word.
So one of those models is
trained on Shakespeare data and
another one is just trained on
regular English data.
So this concept is the Language
Model.
So I just threw so many new
concepts at you, a Language
Model, Sequences, LSTM, but
don't worry, with Core ML this
should be a lot easier.
Let's see how you would do
something like this.
So I start with a model and I'm
going to give it the first word,
let's say in this case, "Shall".
That's the current word.
And what I'll get back from the
model are two things.
A set of choices for the next
word.
So in this case I get basically
a probability associated with
all the set of next words, and
I'm also going to get a state
associated with this prediction.
So I'll take these next word
choices and I'll give them to
the user.
The user will either select one
of those three words or maybe
they'll type their own word.
Either way I get a next word.
I'll use that next word, pass it
back to the model for the next
prediction, and I'm also going
to take the state, pass it back
to the model for the next
prediction.
So in steady state every time
you're going to do two things.
You're going to take the current
word in the state and give it to
the model, and what you'll get
back are the set of choices for
the next word and some state.
And the second thing you're
going to do is you're going to
pass that all back to the model
for the next prediction.
So, it's pretty simple.
Let's see what the code would
look like to do something like
that.
So I'm going to start by saying
let output =
model.prediction(input).
I'll take the probabilities associated with the next word and I'll give them to a function, say displayTopPredictions, which selects the top 3 and gives them to the user.
The user is either going to
select one of those 3 words or
maybe type their own, either way
I'll get that from this function
getWordFromUser and I'll pass
that back to the input as the
current word.
I'll take the state, pass it
along to the model again.
So in just a few lines of code, you can integrate a model as complex as an LSTM that involves state, a language model, a keyboard, all sorts of things.
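In sketch form, that loop looks something like this; the LanguageModel class and every input and output name here are hypothetical stand-ins for whatever the generated interface of your converted model exposes:

```swift
// Hypothetical stateful next-word loop: feed the current word and the
// previous state in; get next-word probabilities and new state out.
func runKeyboardLoop(model: LanguageModel) throws {
    var currentWord = "Shall"
    var state = model.initialState
    while userIsStillTyping() {
        let output = try model.prediction(currentWord: currentWord, state: state)
        displayTopPredictions(output.nextWordProbabilities) // surface the top 3 choices
        currentWord = getWordFromUser() // the user's pick, or a word they typed
        state = output.state            // pass the state along, like a relay baton
    }
}
```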
So that was about the different
sets of use cases and a little
bit about text.
Now let's talk about how Core ML
is optimized for the hardware on
which it runs and most
importantly what that means for
all of you when you're building
apps.
So we're going to motivate that
with a little video of real time
object detection.
What's important to note here is
that the camera feed is live
going to a model, and a
relatively powerful model, and
you're getting accurate
predictions as you see, live.
And this is only possible
because Core ML is super
optimized for the hardware on
which it runs.
And in this case, the model runs
in about, say, under 50
milliseconds.
I was hoping nobody would laugh because this joke has been said 7 times already.
[ Laughter and Applause ]
So what really matters for you
is that Core ML is built on top
of the performance primitives,
Accelerate and MPS.
But more importantly, it
completely hides the hardware
from you so you don't have to
worry about whether it's running
on the CPU or the GPU.
So that demo that you saw, you
might ask, okay, how many knobs
did I have to turn to get that
to work?
Well it's zero.
That's the performance you'll
get out of the box.
[ Applause ]
So specifically that demo that
you saw and flower predictor
that you saw earlier, those two
were compute heavy tasks, and we
knew that so we showed you them
on the GPU.
Whereas some of the text based
demos like Sentiment Analysis
and Next Word Prediction, these
were memory heavy tasks and
that's why we showed you them on
the CPU.
But most importantly, they all
just run on Core ML so you don't
have to worry about where it's
running.
We've got your back.
And this kind of abstraction
lets us do powerful things.
So for the Image Captioning, for
example, where part of that
model is compute heavy and part
of that model is memory heavy,
we automatically context-switch
from the GPU to the CPU so that
you can get the best of both
worlds.
So this was all about use cases
and performance, but Core ML is
much more than just the
framework.
It's a file format and a
collection of tools to help you
get more and more models that
you can use in your apps.
And to talk about that, I'd like
to invite my friend and
colleague, Zach.
[ Applause ]
>> Thanks, Krishna.
Hi. My name is Zach and I'm an
engineer on the Core ML
Engineering Team.
[ Applause ]
And I'm really excited to talk
to you today about the Core ML
Model Format and where you can
get models in this format for
use in your apps.
So by now, you've seen this
diagram many times and this
shows how easy it is to use a
Machine Learning Model.
Simply drag and drop it into
Xcode and you get a code
interface.
But by now you're probably
wondering where do these models
come from?
Well, there are really two
places you can look for Machine
Learning Models in the Core ML
Model Format.
The first is the example models
on developer.apple.com.
These are a variety of
pre-trained models already in
the Core ML Model Format and
this is the easiest way to get
started if you're new to machine
learning.
But we also know there's a whole
wide world of machine learning
out there.
There are a lot of existing
popular training tools and a lot
of existing models out there in
these formats already.
So we want to make it possible
to take machine learning models,
trained using the most popular
tools, and use them in your apps
today.
So to that end, we've created
Core ML Tools.
It's a converter package that
takes models in a variety of
popular formats and converts
them into the Core ML Model
Format and it's open source.
[ Applause ]
We've released Core ML Tools
under the permissive BSD license
so that there are no barriers to
adoption.
So to get a model off the
internet somewhere, you're going
to start with a model in a
different format.
Let's say Caffe.
So Caffe is a really popular
deep learning training library.
If you're starting with a model
in, say, Caffe Format, the way
that you get it into Core ML
Model Format and to use it in
your application is to run it
through a converter from Core ML
Tools.
Or if you don't find a model out
there that does what you want,
you can start with your own
training data and use any of a
variety of these popular tools
to train your own model in that
format.
From there, again, you run the
converter to produce a model in
Core ML Model Format and the
rest of the workflow stays
exactly the same.
Simply drag and drop the model
into Xcode and you get a code
interface.
[ Applause ]
Using Core ML Tools is as easy
as pip install coremltools which
downloads and installs the
package.
This is a python package with
converters for a variety of
popular training tools, and most
of these tools are already in
python.
So to be part of that machine
learning ecosystem, this is a
python library as well.
Let's look at the breakdown of
what's inside this package.
At the very top are each of the
converters, and this is a set of
converters, one for each popular
training library.
Underneath that, we have Core ML
bindings and a converter
library, and this is what we've
used to build all of the
converters.
So here the Core ML bindings
allow you to call directly into
Core ML from python and get
backup prediction, and that's
really useful to verify that the
prediction you get for a
converted model is exactly the
same as the prediction you would
get with the original training
framework.
We also have a converter
library, which is a high-level
API for building converters, and
its shared code among all of
these converters so that make it
really easy to build new
converters for new formats.
Underneath all of that is the
Core ML Specification.
This is a read and write API to
the Core ML Model Format
directly.
So all of the individual fields
can be accessed here.
We've designed the package this
way so that it's compatible and
extensible.
At the top level, the converters
give Core ML compatibility with
a variety of popular tools.
Underneath that, the Core ML
Bindings, Converter Library, and
the Core ML Specification make
this package extensible so it's
easy to build new converters and
to build converters for a lot of
existing formats that aren't
there yet.
And because this is open source,
it's easy to take this package
and even integrate it into
another open source library and
build new converters on top.
And there are no restrictions
here.
This is BSD licensed.
[ Applause ]
The Core ML Model Format is a
single document format and it
encapsulates both the functional
description of the model in
terms of its inputs and outputs,
as well as the trained
parameters of the model.
So to look at an example, for a
simple model like a Linear
Regression, this would be the
set of weights and offset that
are learned at training time.
And for a more complex model
like a Neural Network, this
actually encapsulates both the
structure of the network as well
as the learned weights at
training time.
And this is a public file format
and it's fully documented on
developer.apple.com.
When you look at a Machine
Learning Model in Xcode, you see
a view something like this.
You get all of the metadata and
the functional interface, and
what we now see is that that's
entirely powered by this Core ML
Model Format.
So the Single Document Format
contains all the information
Xcode needs to give you a UI on
top of this model and to let
your code call it and then to
execute it on device.
The Core ML converters all work
in the same way.
They start by taking a model in
a source format, for instance
Caffe, and converting it into
the Core ML Model Format.
There's a set of unified APIs to
convert these models from a
variety of formats into Core ML
Format.
So if you know how to convert
from one format, you know how to
convert from all formats.
Let's take an example and look
more closely at how we would
convert a Caffe model.
Caffe works a bit like this.
It has several files to
represent the model.
The .caffemodel file represents the learned weights in that model, while the .prototxt file represents the structure of the Neural Network.
When Caffe is doing inference,
you would start by taking an
image, say, like a rose like
this, and you'd pass it into
Caffe with these two files.
And Caffe would give back an
index of a class label like,
say, 74.
Then there's a third file, a
labels.txt that maps those
indices to string class labels
like "Rose".
So it's important to note that
those 3 files really encapsulate
together all of the information
in the model, and so that's
what's needed for conversion
into the Core ML Format.
So now, let's look at an example
of converting a Caffe model into
a Core ML Model.
[ Applause ]
I'm going to start by opening an
interactive python prompt here.
So we can type python code and
see the output in real time.
So I'm going to start by
importing coremltools.
Again, this is the name of that
package in python.
And as soon as that's done, I
can just type coremltools.
And when I'm working with a new
python package, the first thing
I like to do is tab complete on
it and see what's available in
the API.
So here on tab complete, we can see that coremltools contains converters, which is each of those high-level converters from another format into the Core ML Format; as well as models, the specification version, and utils, which are the framework bindings and the converter library, and together with those you can build a new converter and test that it gives the same predictions as the original training framework; and the proto namespace, which contains the read and write APIs for the Core ML Model Format.
Today we're going to focus on
the converters.
So again, if I do .converters
and then tab complete once more,
I can see all of the converters
that are available in this name
space right now.
So there's caffe, keras, libsvm,
scikit-learn, and xgboost.
And today we're going to focus
on Caffe.
So once more I'll do .caffe and
tab complete and see what's
available.
And it just tab completed for
me.
Convert. So it's that simple.
There's just one function in
here called convert, and let's
look at how we would use that.
So first I'm going to start by
setting up the inputs.
So I know I have a Caffe model
defined by two files.
So I'm going to say caffemodel =
and I'm going to give a tuple of two strings pointing to the
file names of that Caffe Model,
which are flowers.caffemodel and
flowers.prototxt.
And together those represent the
learned weights in the network
as well as the network
structure.
Next I'm going to set up a class
label file.
So I'm going to say labels = and
I have labels.txt here.
And that represents the mapping
of numeric indices to string
class labels.
Now to run the converter.
It's very simple.
All I have to do is say coremlmodel = coremltools.converters.caffe.convert, and it all tab completes, and I'm just going to pass in that caffemodel and class_labels = labels.
So I'm just passing in those
three files.
And when the converter finishes,
what I get back is a Core ML
Model.
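As a consolidated, hedged sketch of what was just typed (file names as used on stage; the two-string tuple and the class_labels keyword are part of the coremltools Caffe converter API):

```python
import coremltools

# The .caffemodel file holds the learned weights; the .prototxt file holds
# the network structure; labels.txt maps class indices to string labels.
caffemodel = ('flowers.caffemodel', 'flowers.prototxt')
labels = 'labels.txt'

coremlmodel = coremltools.converters.caffe.convert(
    caffemodel,
    class_labels=labels,
)
```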
Right here in python, I can
print out the model and see the
interface that I get with that
model.
So I can see that it has one
input named "data" and that
input is a multi-array shaped 3 by 227 by 227 and typed double.
And we'll get back to that in a
minute.
It also has 2 outputs.
One is named "prob" and it's a
dictionary with string keys, so
that's going to represent the
probabilities for each possible
class label.
And another output named
classLabel as a string, and
that's simply going to be the
most likely class label.
So for convenience, you don't
have to look at all the
probabilities to find out which
one is the most likely.
But looking at this, I know that
this input type is not quite the
interface that I wanted the
model to have, so I can go back
and run the converter with
another parameter to modify the
input type because instead of a
multi-array, I would like for
this model to take an image as
input.
So I'm going to go back and add image_input_names = "data".
And again, just looking at the
output here, you can see that
the input name is data, so it's
going to know what to do with
that input.
If I run the converter once more
and then again print out the
model interface, now I see that
the input named data is an image
with width 227, height 227, and
color space RGB.
[ Applause ]
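The re-run is a one-line change to the same call:

```python
# Treat the input named "data" as a 227x227 RGB image rather than a
# 3x227x227 multi-array of doubles.
coremlmodel = coremltools.converters.caffe.convert(
    caffemodel,
    class_labels=labels,
    image_input_names='data',
)
```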
Now that we have a Core ML
Model, let's check and make sure
that the conversion succeeded
and thus we get correct
predictions with this Core ML
Model.
I'm going to start by importing
the python image library so that
I can work with images and pass
an image directly into the
model.
So I'm going to say from PIL import Image, and then rose = Image.open("rose.jpg").
And just to prove to you I've
got nothing up my sleeve, I'm
going to call rose.show and show
you that this is indeed a
picture of a rose.
[ Applause ]
You don't need to clap for that.
[ Laughter ]
So now that I've shown that this
image actually does represent a
rose, let's see if the model
agrees.
So checking the prediction is as
easy as calling
coremlmodel.predict, and this is
that Core ML framework binding
that we talked about earlier.
And we're going to pass the
input named data the value rose.
And immediately we get back a
prediction of class label rose.
[ Applause ]
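The whole check in sketch form (note that the coremltools prediction binding runs on macOS):

```python
from PIL import Image

# Load the test image and visually confirm it really is a rose.
rose = Image.open('rose.jpg')
rose.show()

# Call directly into Core ML from python via the framework bindings.
prediction = coremlmodel.predict({'data': rose})
print(prediction['classLabel'])   # expected: 'rose'
```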
But let's make sure the model --
let's make sure it's not a fluke
and let's make sure the model
really knows that this is a
rose.
So we're going to scroll down
through the class probabilities
until we see rose here.
And we can see that the model is
actually very confident that
this is a rose.
So this is .991 out of 1
confidence that this is a rose.
So I'm going to conclude that
probably this model did convert
correctly.
If I was doing this in a real
application, I would want to
actually test this a bit more
rigorously so I would provide
more than one example and I
would also want to check that
the predictions that I get here
from Core ML are exactly the
same as the predictions that
Caffe would have given me for
the same input, and that's how
we know that the conversion has
succeeded.
But that will take too long for
this demo so let's move on.
What I'm going to do now is save
the model out and then look at
it in Xcode.
So I'm going to say
coremlmodel.save, and I'm going
to call it
FlowerPredictor.mlmodel.
And then I'm just going to open the current directory in Finder and double-click that model to open it in Xcode.
And what we can see here is that
it's a Machine Learning Model
with name FlowerPredictor.
Type is Neural Network
Classifier and its size is 229.1
megabytes.
But there's a lot of missing
information, too.
It doesn't know the author, the
license, the description or the
descriptions for the inputs and
outputs.
So while this is a working
model, it's not necessarily one
I would want to give to a
colleague or put on the internet
because it's not quite as useful
if it doesn't really declare
what it's supposed to do and how
to use it.
So I'm going to be a good
citizen and go back and give
this model some metadata.
Back in the python prompt, I can
just mutate the model right here
by assigning to fields at the
top level.
So I can say author = "Zach
Nation".
coremlmodel.license = "BSD".
coremlmodel.shortdescription = a
flower classifier.
And let's set the help text for
those inputs and outputs as
well, because not only does that
show up in the Xcode view, but
when the generated code is there
and you can call into it that
actually becomes documentation
comments in the generated code.
So when you're tab completing in Xcode, this is what someone consuming this model will see.
So I'm going to say coremlmodel.input_description for "data" is "An image of a flower." And coremlmodel.output_description for "prob" is "The probabilities for each flower type, for the given input." And coremlmodel.output_description for "classLabel" is "The most likely type of flower, for the given input."
And now we can just save that
model once again and I'm just
going to clobber the file that
was there because now I want the
one with metadata.
So I'm going to just save it
right on top of the file that
was there before and open that directory in Finder again.
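Collected together, the metadata assignments and the save look like this (attribute names per the coremltools MLModel API):

```python
coremlmodel.author = 'Zach Nation'
coremlmodel.license = 'BSD'
coremlmodel.short_description = 'A flower classifier'

# These descriptions appear in the Xcode model view and become
# documentation comments in the generated code.
coremlmodel.input_description['data'] = 'An image of a flower.'
coremlmodel.output_description['prob'] = 'The probabilities for each flower type, for the given input.'
coremlmodel.output_description['classLabel'] = 'The most likely type of flower, for the given input.'

# Overwrite the earlier file so the saved model carries the metadata.
coremlmodel.save('FlowerPredictor.mlmodel')
```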
And now when I double-click the
model and open it in Xcode, we
can see that it contains useful
metadata describing how the model is intended to be used
and what the inputs and outputs
do.
[ Applause ]
So to recap, what we saw is that
using Core ML Tools to convert a
model is as easy as import
coremltools, setting up the
inputs, and then calling a
simple high-level convert
function to get back a model in
Core ML Model Format.
And the best part is, let's say
you switch training frameworks
from Caffe to Keras, switching
converters is as easy as updating the namespace, because all of the convert functions share the same high-level API.
[ Applause ]
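For instance, a hypothetical switch to the Keras converter (assuming a model saved as flowers.h5) changes only the namespace:

```python
coremlmodel = coremltools.converters.keras.convert(
    'flowers.h5',
    class_labels='labels.txt',
    image_input_names='data',
)
```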
Core ML Tools supports Caffe and
Keras models as Neural Networks.
Scikit-Learn for Pipelines.
Scikit-Learn and XGBoost for
Tree Ensembles, and LIBSVM and
Scikit-Learn for Linear Models
and Support Vector Machines.
It's also worth noting that
Keras is a really powerful
high-level interface to several
popular deep learning training
tools, including TensorFlow.
So if you're training a
TensorFlow Model in Keras, you
can use Core ML Tools to convert
it into Core ML Model Format.
[ Applause ]
Obtaining models, you want to
look in two places.
One is the set of example models
on developer.apple.com.
And again, these are
pre-trained models already in
the Core ML Model Format and so
this is the easiest way to get
started if you're new to Machine
Learning or if one of these
models does the task that you're
trying to do in your app.
But because there's a whole
world of machine learning out
there and a variety of models and use cases in a variety of
formats, we've created Core ML
Tools to allow you to convert
from any of these popular
formats into the Core ML Model
Format.
[ Applause ]
So, in summary, what we've
learned today is that Core ML
makes it really easy to
integrate Machine Learning
Models into your app.
Simply drag and drop into Xcode
and you get a code interface to
the model.
Core ML has rich datatype
support for a variety of use
cases and it deals with
datatypes that you're already
familiar with from your app
code.
Core ML is hardware optimized
and it's built on top of
performance primitives like
Metal Performance Shaders and
Accelerate so that you get the
best possible performance on
device.
And through Core ML Tools, Core
ML is compatible with the most
popular Machine Learning Formats
and more will be added over
time.
For more information, please see
developer.apple.com with our
session number 710.
We have a couple of related
sessions coming up you may be
interested in.
To look at some low-level
details about how we get such
good performance on this
hardware, check out the
Accelerate and Metal 2 sessions.
Thank you.
[ Applause ]