WWDC2019 Session 428

Transcript

[ Music ]
[ Applause ]
>> Hello, everyone.
Good afternoon.
Welcome to the session.
My name is Tao.
I'm from the Core ML team.
Today we're super excited to
talk about a few new features we
added to Create ML this year.
In the What's New in Machine
Learning session, you were
introduced to the brand-new
Create ML app.
It's a great new tool designed
by Apple to guide you through
the essential steps of training
machine learning models.
We believe this makes your
machine learning workflow easy
and approachable.
Now, let's start with Text
Classification.
What is Text Classification?
It's a machine learning task
that classifies input text to a
set of predefined labels.
You can use it for sentiment
analysis, like classifying text
as positive or negative.
Or doing spam detection.
Or more complicated things like
topic analysis, classifying text
as food, politics, or science.
Now, Text Classification has
been supported since last year,
but this year we're adding
support for state-of-the-art
Transfer Learning.
Today, I'm going to give you a
concrete example of how you can
train a Text Classifier using
Transfer Learning.
To train such a model, the first
thing you'll want to do is
collect some training data.
Once you have the data, you can
organize it in a simple way.
I have three folders: sports,
entertainment, and nature.
Under each folder, I have a few
text files.
And each one of them is a single
training example, and its label
is simply the name of the folder
it is in.
And that's it.
That's all you need to do with
the data.
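The folder layout described above can be sketched in a few lines of Python. This is only an illustration of the convention (one subfolder per label, one text file per training example), not how Create ML reads the data internally. The function name `load_labeled_directories`, the `.txt` extension, and the sample sentences are assumptions made for the example.

```python
from pathlib import Path
import tempfile


def load_labeled_directories(root):
    """Return (text, label) pairs; each subfolder name is used as the label."""
    examples = []
    for label_dir in sorted(Path(root).iterdir()):
        if not label_dir.is_dir():
            continue  # skip stray files that sit next to the label folders
        for text_file in sorted(label_dir.glob("*.txt")):
            text = text_file.read_text(encoding="utf-8").strip()
            examples.append((text, label_dir.name))
    return examples


# Build a tiny, made-up data set to show the expected layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    samples = {
        "sports": "What a comeback win.",
        "entertainment": "The season finale adds a twist to the plot.",
        "nature": "We pitched the tent beside a pine tree.",
    }
    for label, text in samples.items():
        (root / label).mkdir()
        (root / label / "example1.txt").write_text(text, encoding="utf-8")

    for text, label in load_labeled_directories(root):
        print(f"{label}: {text}")
```

Keeping the label in the folder name means there is no separate annotation file to maintain: moving a file to another folder relabels it.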
I'll give you a quick demo to
show how you can do this.
On my desktop, I have a training
data folder.
So, I decided to have a little
bit of fun.
Instead of using plain text as
the label, I used a few emojis.
For example, for entertainment I
use a camcorder.
For sport, I'm using an American
football.
For nature, I'm using this cute
little tent beside a great pine
tree.
You notice there is also this
other folder.
This is the training data that I
actually don't want.
I want my model to learn that
anything that's not one of the
three intended classes should be
classified into this class.
So, let's take a look at a few
examples here.
For example if you read this
sentence, there's something
about writer and IMDB.
It's clearly something related
to entertainment.
If I go into the other folder,
this example is not one of the
three intended classes.
That's about data.
Now, I'll go to the Create ML
app. To do that, I go to
Developer Tools and find Create
ML. I'm going to create a new
document and go to Text.
I'll name it like that.
Create. So, I'm sure you have
seen this user interface in the
earlier sessions on Object
Detection and Sound
Classification.
So, I'm just going to drag my
training data in directly here.
So, it gives you some feedback
about your training data.
But what's new about Text
Classifier is this particular
parameter section.
So, for today's demo, I'm going
to select Transfer Learning.
If we take a look at it, the
first two algorithms actually
have been supported since last
year.
And this year we're adding
Transfer Learning.
And for the demo I'm going to
use this feature extractor
called Dynamic Embedding.
I'll go into the detail about
what is Transfer Learning a bit
later after the demo, but all
you need to know now is Transfer
Learning with Dynamic Embedding
is actually an algorithm that
pays attention to the semantic
meaning of your input text.
I'll start training.
So, in general, Transfer
Learning takes a bit longer to
train.
For this particular data set,
it'll take about five minutes on
this particular machine.
So, I'm not going to wait.
Instead, I have my pre-baked
solution here that's trained on
exactly the same data.
Once the model has been trained,
the accuracy test gives you a
good summary about how your
model performs on different
data, broken down into different
classes.
It looks like my model has done
a good job on training,
validation, and testing data.
So, I will directly jump to the
output tab.
This preview pane gives you some
summary about the model as well
as a few different ways for you
to do some quick experiments.
For example, if you go to the
lower right-hand corner,
you can directly add a file to
do a quick test.
Or you can drag in a folder, and
the model will make a prediction
for each text file within that
folder.
The model actually will use the
entire body of the text in each
file to make a single
prediction.
But there is another way for you
to do a quick experiment, which
is typing manually.
In this text input section,
each time I type a space, or if
I stop typing for a little
while, the model will make a
prediction.
So, let's see how that works.
What a comeback win.
Sport. Let's try something
different.
The season finale adds a twist
to the plot.
Which seems about right.
Entertainment.
[ Applause ]
I actually also want to try
something more recent and more
relevant.
This just came to me last night.
The Raptors are on top of the
mountain.
Nature [laughter].
Well, if I switch context.
[ Applause ]
It predicts sport.
[ Applause ]
So, I'm pretty happy with the
model performance.
Now I'm ready to deploy.
So, I just simply drag it out
and I can put it on my iOS
devices.
That's how you can train a Text
Classifier using Transfer
Learning.
But you might want to ask what
is Transfer Learning?
Transfer Learning is a powerful
machine learning technique where
a model trained on a huge amount
of data for one particular task
can be repurposed for another
task so that you, as developers,
only need to prepare a limited
amount of data and still get
your customized machine learning
model.
Create ML actually uses Transfer
Learning for image
classification today.
And now we're bringing it to
text.
Well, to train a good Transfer
Learning model for text, the
problem is a little bit
different.
If you read this sentence, "I
was able to park my car near the
park entrance."
For a model that's trained with
a conventional technique or
static embedding, these two
occurrences of "park" will look
exactly the same.
But for a model trained with
Transfer Learning with Dynamic
Embedding, because it pays
attention to the semantic
meaning of the overall context,
it is able to figure out that
these two occurrences of "park"
actually mean different things.
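The contrast can be made concrete with a toy sketch. It is not the pretrained model that ships with the OS, and neither function below reflects how Create ML computes embeddings. `static_vector` and `contextual_vectors` are hypothetical helpers: the first derives a vector from the word alone, the second simply averages each word's vector with its immediate neighbors as a stand-in for "paying attention to context".

```python
import hashlib


def static_vector(word, dim=4):
    """Toy static embedding: the vector depends only on the word itself."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:dim]]


def contextual_vectors(tokens, window=1, dim=4):
    """Toy context-aware embedding: average each word with its neighbors."""
    static = [static_vector(token, dim) for token in tokens]
    result = []
    for i in range(len(tokens)):
        neighbors = static[max(0, i - window): i + window + 1]
        result.append([sum(column) / len(neighbors) for column in zip(*neighbors)])
    return result


tokens = "i was able to park my car near the park entrance".split()
first, second = [i for i, token in enumerate(tokens) if token == "park"]

static = [static_vector(token) for token in tokens]
contextual = contextual_vectors(tokens)

# Same word, so the static lookup cannot tell the two occurrences apart.
print("static vectors identical:    ", static[first] == static[second])
# Different neighbors ("to ... my" vs "the ... entrance") change the result.
print("contextual vectors identical:", contextual[first] == contextual[second])
```

A real dynamic embedding is learned from large amounts of text rather than averaged from neighbors, but the property illustrated is the same: the representation of a word depends on the sentence around it.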
To train such a good model, you
need to do it on many texts.
That's exactly what we did.
We trained on billions of text
and shipped that pretrained
model to your mobile devices.
And more importantly, we
optimized its performance so
that you only need to prepare
a limited amount of raw text
and send it to the Create ML app.
Under the hood, it'll interact
with the model that's already
part of your OS and give you
your customized text classifier.
And you can deploy the same
model on your iOS devices
seamlessly because the
pretrained model is already
there for you to use.
Now, that's the workflow to use
the Create ML app to train your
Text Classifier.
If you're also interested in
writing code, it's also very
easy.
To use the Transfer Learning
with Dynamic Embedding, the only
thing you need to specify is
this model parameter.
The rest of the code is exactly
the same as last year.
Now, I've shown you how to train
a Text Classifier, but I'd like to
give you a few simple tips to
get the most out of your Text
Classifier.
Now, choose an algorithm which
suits your use case.
Transfer Learning is only one of
the three algorithms that Text
Classifier supports.
Different algorithms have
different tradeoffs.
To find out which one fits your
use case best, please go to the
Natural Language session that
happens after this.
Provide balanced classes.
This is very important.
In the particular example that I
showed you, each text file is a
single training example, so I
kept roughly the same number of
text files in each of the
folders.
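A quick way to check balance before training is to compare class sizes. The sketch below is an assumption-laden illustration: `imbalance_ratio` is a hypothetical helper, the counts are made up, and the talk does not prescribe any particular threshold for what counts as balanced.

```python
def imbalance_ratio(counts):
    """Largest class size divided by the smallest; 1.0 means perfectly balanced."""
    smallest = min(counts.values())
    if smallest == 0:
        return float("inf")  # an empty class can never be learned
    return max(counts.values()) / smallest


# Made-up file counts per label folder, for illustration only.
counts = {"sports": 120, "entertainment": 110, "nature": 115, "other": 125}

for label, count in sorted(counts.items()):
    print(f"{label:15s} {count}")
print(f"imbalance ratio: {imbalance_ratio(counts):.2f}")
```

A ratio close to 1 matches the advice above; a much larger value suggests collecting more examples for the smallest class or trimming the largest one.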
Data consistency.
If you expect your Text
Classifier to mostly work on
short sentences, like I do, your
training data should mostly
consist of such examples.
If you expect your Text
Classifier to work on short
words, paragraphs, or even an
entire article, you also want to
make sure your training data
cover these cases as well.
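One rough way to compare training data with the input you expect at prediction time is to look at text length. This is a sketch under stated assumptions: `word_count_summary` is a hypothetical helper, whitespace splitting is a crude stand-in for real tokenization, and the sample sentences are invented.

```python
import statistics


def word_count_summary(texts):
    """Summarize how long the texts are, measured in whitespace-separated words."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "median": statistics.median(counts),
        "max": max(counts),
    }


# Invented samples: training examples versus the kind of input expected in the app.
training_texts = [
    "What a comeback win.",
    "The season finale adds a twist to the plot.",
    "We pitched the tent beside a pine tree.",
]
expected_inputs = [
    "Great game last night.",
    "That episode was a surprise.",
]

print("training:", word_count_summary(training_texts))
print("expected:", word_count_summary(expected_inputs))
```

If the two summaries differ widely, for example short sentences in training but whole articles at prediction time, the training set probably does not cover the cases the model will see.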
[ Applause ]