WWDC2018 Session 222

Transcript

[ Music ]
[ Applause ]
>> Good morning, everyone.
Thank you all for joining me,
even before the coffee's kicked
in.
My name is Itai and I work on
the foundation team.
In this session, I'd like to
talk to you about how the data
that flows through your app can
affect it, and how you can
better protect your customers by
building trust in that data.
Let's get started.
Apps don't live in a vacuum.
In order for your apps to do
something useful, they have to
draw in data from external
sources, like the disc, or the
network, or your customers
themselves.
And then do something meaningful
with that data, and then present
it to your customers.
In order for that data to be
consumable, it has to come in
some known format or structure.
What happens when it doesn't?
Usually, this means the data is
corrupted, or invalid in some
way and can be ignored.
But sometimes, this data can
invalidate assumptions that your
app makes, and it can cause your
app to misbehave, or maybe even
crash.
This can be a bad experience for
your customers who might have to
wait for your app to get updated
in the app store.
And it's an even worse
experience if it's a crash on
launch, because they can't even
use the app.
And, in the meantime, while
you're waiting, you'll get a
wave of one-star reviews.
It's a bad experience for
everyone.
This is something to be even
more cognizant of, if you're a
framework author.
Because it's not just one app
that could possibly be affected,
but maybe many apps.
Today, we're going to be talking
about trust.
And specifically, we're going to
be talking about how to build
trust in data, by making sure of
two things.
One, that the data that we're
going to be using, hasn't been
modified from underneath us, and
two that it contains what we
expect it to in the format and
structure that we want.
So, we'll do just that by taking
a look at the lifecycle of our
data, and what we can validate
about that data at every stage
in the lifecycle.
Then, we'll see what sort of
type-level validation we can
apply with the NS Secure coding
protocol.
And then apply those same
concepts to codable types.
Let's get started.
In order to talk about data,
we're going to want to build a
mental model of the forms that
data can take within our app.
At the most basic level, data
makes its way into our app as a
stream of bites.
There's not much we can tell
about this data at this stage
without looking at it, but this
we'll call raw data.
Now, to get working with that
data, we need to make sure it
conforms to that known format or
structure.
And in this case, each of those
code points correspond into a
UTF code point, and sorry, one
moment, let me make that a
little bit more readable.
It looks like this is JSON.
And so, once we've made sure
that that data conforms to some
format that we want to work
with, we'll call this formatted
data.
Now, formatted data, on its own
doesn't mean much, until we
create primitive values out of
it, strings, and arrays, and
dictionaries that we can then
use as building blocks for
further algorithms.
So, this we'll call our
primitive data.
Now, there's building blocks we
most often want to work with,
not as just primitive values,
but as our own model types.
So, once we do that, we'll make
use of this as we're going to
call, structure data.
Now, these forms of data in our
apps, form an abstraction
timeline.
Raw data is the least abstract
data that we're going to work
with, and our own structured
model types are the most
abstract.
So, our goal for today is to
take that data as far along the
spectrum as we can.
Now, our apps can stop at any
point, make use of the data,
however, we see fit.
But we really want to work with
our own model types wherever
possible.
Now, the goal for today is to
not just go as far along the
abstraction spectrum as we can,
but to build trust as we do so.
At every stage, the data is
going to get more complicated,
and there's going to be more
that we need to validate about
it.
But once we do that, there's
also going to be more that we
can trust about it.
Now, for our use case today,
we're not really going to be
talking about formatted data.
Very often it's just a stepping
stone between raw and primitive
data, and you don't work with it
directly.
For instance, given raw data,
foundations JSON serialization
will give you primitive data
back.
You don't see just the formatted
date directly, and you won't
make use of it.
So, today we'll just be talking
about raw, primitive, and
structured data.
Now, let's start by talking
about raw data.
Again, as we mentioned, raw data
is just a stream of bytes that's
made its way into your app.
Until you inspect that data and
you give it meaning, there's not
much you can do with it.
Now, we might care to know what
we can take a look at about that
data before we start
interpreting it.
Is it even safe to do that?
One thing we can validate about
this data before making use of
it is its length.
Say your app expects to load a
1-kilobyte file from disc, but
finds a 1-gigabyte file in disc.
Does it make sense to even load
that data in the first place,
and start reading it?
Almost certainly not.
Now, sometimes we might not be
able to have length expectations
about the data.
Maybe it's external data we
don't own.
We don't' know how much data
there could be.
But in some cases, we might also
be able to verify a checksum, or
a cryptographic signature that
represents what the data might
look like, even if we don't know
what's inside.
Checksum is built by hashing all
of the data.
And if any bit in the data
changes, either due to a
potential malicious third party,
or just regular data corruption,
bad blocks on disc, a bad
network connection.
If any one of the bits flips in
that data, it will invalidate
the checksum, or the signature.
And we'll know before even
reading any of those bytes, that
the data is incorrect, and you
shouldn't trust it.
Now, we also don't always have a
checksum.
Maybe it's data we don't own,
where you can't get that ahead
of time.
So, at this stage, there isn't
much we can do with this data,
besides read it and inspect it.
And so, once we do that, we can
get primitive data out.
Now, as we've mentioned, we can
take that raw data and pass it
through, usually deserialize it,
like foundations JSON
serialization.
When we do that, we'll get inert
strings and dictionaries and
arrays of numbers back out that
we can use.
And if this process exceeds, we
know two things about that data.
One, that the data was indeed in
the correct format that we
expected.
For instance, XML data won't
pass through JSON serialization.
And two, if we trust the
deserializer, we know that the
run-time objects we get back out
are going to be valid.
Again, foundations JSON
serialization will always give
you strings and numbers and
arrays that you can actually
work with.
It's individual values that we
can trust.
But at this stage, we might
wonder okay how can we make use
of this data, or what can we
trust about it, and what
validation do we still need to
do?
Well, we don't actually know
much about the contents of this
data yet until we start looking
at it.
And in fact, we might not know
anything about the structure of
the data until we start
inspecting it.
If you've ever worked with
dynamic deserialization in this
way, you'll know that there's a
lot of downcasting from anys.
There's no upfront expectation
what the data can be because
it's very generalized.
And so, we'll want to check to
see what the data contains and
how we can work with it.
So, let's motivate this with an
example.
I've been working with an app
lately called Sell My Old Junk,
which allows me to sell my old
junk to some friends and family.
And when one of them opens up my
app, my app makes a request to
my server.
Which requests a list of
products that are currently
available for sale to my friends
and family.
When the server receives this
request, it responses with JSON,
that indicates here are the
products available to sell.
Now, this is what an API
response from my server might
look like.
It's an array of product
listings, which have some
interesting fields that you
might care to look at.
For instance, each listing has a
product ID.
A positive integer that uniquely
identifies the product.
And in my case, these are
sequential integer IDs.
Every listing also has a name
and a description which are
strings.
And there are a few other type
fields here that we might care
to look at.
For instance, there's a field
that's a Boolean that indicates
whether or not this listing has
already been sold.
And there's some internal
structure here.
This list of tags, which are
strings, which we might care to
use.
There are also a few fields
here, which come to us as
strings, but really represent
other forms of data that we
might care to look at.
For instance, URLs and dates.
So, let's make use of this data.
In my app, I can fetch that data
from the network say with URL
session, and wherever possible
I'll validate the length.
Maybe my server can produce a
checksum that I can validate.
Or a cryptographic signature.
Once I've done that, I can take
the data and pass it off JSON
serialization.
If deserializing the data fails,
JSON serialization will throw an
error.
Will check and then catch and
handle.
May display dialogue to my
customers.
Again, in our purlins of the
day, we've just taken raw data
and carried along it the
abstraction spectrum to
primitive data.
And if anything went wrong, we
can handle that failure.
So, now we need to make use of
this data.
How can we consume it?
Well, JSON is an any variable
that contains the actual values.
So, I can downcast it to the
array of dictionaries that I
expect it to be.
Now, this part of my app cares
only about product listings that
are related to music.
So, will filter out any products
that don't contain the music
tag.
And here, I have that
substructure.
That list of tags, which will
downcast an array of strings and
make use of it, right.
Whoops. Each of those forced
downcasts, actually contains a
hidden fatal error.
If either of those casts fails
because the API changed or the
data changed along the way
before it made into my app,
again due to data corruption or
malicious changing, those
downcasts will fail.
And when they do fail, they'll
abort, and they'll crash my app.
And again, that's a bad
experience for my customers.
Let's take a look at how this
could happen.
So, here again is that sample
API response.
And we'll take a look at the
list of tags here.
And say that second tag in there
is modified.
Instead of a string, we have a
number.
It's maliciously changed by a
third-party, or maybe again due
to regular data corruption.
We can't always tell.
Downcasting this list of tags
will fail because they're not
strings and we never checked to
make sure they work before we
cast.
So, to avoid this, our main
tenant for the day will always
be validate first, execute
later.
Instead of asserting that you
know what the structure of the
data is, check first.
Don't blindly assume.
So, let's see how we can do
that.
So, here again is that first
forced downcast.
And instead of forcibly
downcasting these values, I can
conditionally downcast.
This allows me to validate that
the data actually contains what
I want and if that cast fails,
well I can handle that error
instead of fatally erroring.
Now, similarly, later
downcasting that list of
strings, instead of forcibly
downcasting, again I can
conditionally downcast.
And in this case, instead of
throwing an error, I can give a
default value that allows
execution to continue.
In this case, I'll simply ignore
any product listings that don't
have a valid list of tags.
I could throw an error, but in
this case, I chose not to.
Now, type validation isn't the
only form of validation that you
want to perform at this stage.
For instance, if that had been
replaced by null, which is
totally valid in JSON, I
would've seen a similar crash.
In Swift strong static type
system nullability is part of
the type.
And indeed, you can't downcast
null to a string.
And so, again, this cast would
have failed.
Now, even if all of these values
are of the correct type and
nullability, there's other forms
of validations that we should
care about here.
For instance, I said that each
product listing has a positive
integer ID.
In my case, they're all
sequential integers.
Does it make sense for one of
these IDs to be negative?
No, it doesn't.
But even if it is always
positive, does it make sense for
it to be such a large positive
integer value?
I'm not selling that many
things.
So, no it doesn't.
and in this case, this might be
due to somebody trying to cause
overflow in my app.
This is something you need to
watch out for.
Now similar to range validation
is length validation.
Again, every product listing has
a description.
Does it make sense for that
description to be empty?
Well, in my case, I know that
anytime I upload a product
listing, I'm always going to put
a description in there.
So, in my case, no it doesn't
make sense for it to be empty.
But even if it's not empty, does
it make sense for it to be the
full length and contents of
"Romeo and Juliet?"
Also no, it doesn't make sense.
Something here's gone wrong and
I need to look for that.
Now, there's additional forms of
validation that we really do
care about here.
Even if all of these fields are
right type, nullability, and fit
within the range and length that
we expect, their values and
contents are also equally
important.
Every product listing has a URL
that I can send a customer to
see more information about that
product listing.
This comes to me as a string,
but it actually has to contain a
URL.
Does it make sense for it to be
any arbitrary string?
No, and in my case, I want to
make sure that it actually
represents a URL.
But more importantly, just
because it looks like a URL,
doesn't mean it will point to my
domain.
And this is something I care
very deeply about.
I really, really want to keep my
customers safe.
I don't want to possibly send
them to a phishing domain, which
could look like mine, but not
really be my site.
So, this is something that I
care about watching out for.
Now, lastly, even if each of
these fields is valid on its
own, sometimes the relationships
between fields that matters.
For instance, each product
listing has a date when it was
created, and a date when it was
last updated.
Each of these can be valid on
their own, but does it make
sense for the date that it was
last updated to come before when
it was created?
No. And in my case, this might
not open a security
vulnerability in my app.
But this is something that maybe
you watch out for because it's a
good indicator that something's
gone wrong, and maybe I
shouldn't trust this data.
So, let's take a look at how we
can start doing that.
Here, I've started writing a
function which will take one
such listing and start
validating all of the contents.
So, I'll take a listing, and
I'll start pulling out the
product ID.
And we've learned here not to
forcibly downcast this ID to an
Int, but conditionally downcast.
And if the cast fails the guard
will fail and will throw an
error.
Now I don't want to stop there,
I want to perform the range
validation that ensures that
product ID is also valid.
That it's positive and not too
large.
And again, if something goes
wrong, I'll throw an error.
Now, later on I might care to
check out that URL.
And again, I'll downcast it to a
string, instead of forcibly
downcast.
And here, I can check the link.
In this case, I know my server
will never produce URLs that are
too long.
So, if I find a really long URL,
I'll know that it's invalid.
Once I've validated that, I can
send it off to the URL type to
perform that domain-specific
validation to make sure it
actually is a URL, and not just
a garbage string.
And again, if anything goes
wrong, I'll throw an error.
But, again, I don't want to stop
there, once I have an actual
URL, I want to make sure it
points to my domain and not a
phishing domain, so I'll keep
working with that.
Now, at this point, I can apply
the same types of validation to
the other fields here.
And again, if anything goes
wrong, I'll throw an error.
But once I have this function, I
can apply it to all the product
listings that I've loaded from
the payload.
And again, I'll stop executing
if anything goes wrong.
Now, this is how we can work to
validate primitive data.
But as we just saw, primitive
data can be very general.
A string can be a string, but it
can also be a date.
And it can also be a URL.
Sometimes we care to work with
data with the semantics that
matter to us.
We wanted to make sure that the
host of that URL was our domain,
and we can't do that with just a
regular string.
Similarly, a dictionary can
represent a model like a listing
here, or it can represent
arbitrary customer data that we
know nothing about.
Instead of performing the same
validations everywhere to make
sure all the fields that we care
about are there, isn't it nicer
to work with our own model
types, where that guarantee is
always present?
It is. And in our case, we want
to work with structured data
wherever possible.
We can use primitive date as a
building block to get there.
But we want to work with that
form of data.
So, let's take a look at how we
can do that.
Elsewhere in my app, I have a
purchase type, which does just
this.
When a customer makes a
purchase, I store that data to
disc, so that later, when they
open the app, even if they're
not connected to the network,
they can view their purchase
history.
Each purchase keeps track of the
product listing it was
associated with, and when the
purchase was made, and a
receipt.
I can save it to disc in this
way using NS coding and NS key
to archiver.
And I'll archive it.
But as we saw, when we unarchive
data, and we handle raw and
primitive data, we want to
validate it.
So, let's do that by taking a
look at how doing it with coder
here could work.
If you've ever written in a note
with coder, this might look
familiar to you.
We'll start by decoding the
product listing.
And again, we've learned to
conditionally downcast, instead
of forcibly downcasting.
And if something goes wrong,
well this is a fail-able
initializer, we'll simply return
nil, right?
if decoding succeeds, I'll
assign this to my property, and
I'll keep going.
I'll do the same thing with the
purchase data, and again,
conditionally downcast to a
date.
If something goes wrong, I'll
fail, and so on.
And repeat this for each of the
fields in my type.
When I want to save one of these
purchases to the history, well I
have a function, which does
this.
It archives a purchase to binary
data.
And then, I can save it out to
disc, or shrill that data off
into database or similar.
When I want to load that data
back, well I can do the same
thing.
I can get that raw data and then
pass it along to
KeyedUnarchiver, to get an
object back out.
Now, as we've said, at every
point here, the data is getting
more complicated.
There might be more to validate
about it before we can trust it,
just as well.
So, you might wonder, okay,
what's the catch here?
What else is there left to
validate?
And this downcast, here is a
good hint.
Note that this downcast happens
after we've unarchived an
object.
How could this ever fail?
It's a good indicator that
something else might be going on
here.
So, let's take a look at that.
This is an abstract
representation of what these
model objects might look like in
my archive.
Here we have all the fields that
we cared about in coding.
And each of them contains their
own structure, and substructure,
and contents, and so on.
But, interestingly here this
representation also contains the
name of the class of this
object.
Let's take a look at how
KeyedUnarchiver can make use of
this information.
So, we have a decode call we're
currently making.
And this under the hood creates
a KeyedUnarchiver, and decodes
an object for the object key.
When you perform this in
KeyedUnarchiver, KeyedUnarchiver
finds that class name in the
object in the archive and pulls
it out.
And dynamically finds a class
with that same name in your app.
It then allocates an instance of
that class and then initializes
it to allow it to decode its own
contents.
Afterwards, it awakens the
object to give it a chance to
finalize its state.
Now, this works great for our
objects.
But now, we might wonder what
happens if the data is
maliciously changed to contain
an object of a class that we
didn't expect?
Well, this whole process that we
just performed happened on a
different type.
We just allocated, initialized
and awoke an object of a class
that we didn't expect.
What kind of effect can this
have in our app?
Well, as we saw before, the
conditional downcast here,
prevents us from accidentally
using an object of this
unexpected class.
We're only going to work with
objects of the types that we did
expect.
The downcast fails, well we'll
fail.
But decoding one such object can
have a lasting impact in our
app.
Say that class has an alloc
method, which changes global
state.
Maybe it allocates a singleton
or changes some global data.
Even though we throw the object
away, if this fails, this can
have a lasting impact in our
app.
And can cause differing behavior
somewhere else and an archive
can be maliciously constructed
in this way to cause this to
happen in our apps.
So, how can we validate the data
to prevent this from happening?
This is exactly where the
NSSecureCoding protocol comes
in.
NSSecureCoding is a protocol
inheriting from NSCoding, whose
goal is to prevent exactly this
sort of attack.
By allowing you to pass in type
information upfront, it prevents
arbitrary code execution by
validating the contents of the
archive to make sure it only
contains the types that you
expect.
The hallmark introduction of
NSSecureCoding were two
alternative methods to decode
object for key, which allow you
to pass that type information
upfront.
And using that type information,
NSKeyedUnarchiver can keep you
safe.
So, let's take a look at the
current decode object for key
call that we have.
This top level decode.
Now, here, if instead we use the
variant that allows us to pass
in the class upfront, in this
case, we want to decode a
purchase, instead of performing
this whole process and whatever
is in the archive, you can first
gate it on a class check.
Let's examine this class check
for a moment.
At every stage in decoding, if
secure coding is on,
NSKeyedUnarchiver maintains a
list of classes, which are valid
to decode.
When we make such a call,
NSKeyedUnarchiver, takes the
object that we used, this class,
and constructs an allowed class
list from it.
When we go ahead and decode an
object from an archive, it's
class is first checked against
the allowed class list.
And if it isn't present, the
decode call will be rejected.
Now, if the class of the object
that we find in the archive is
in the allowed class list,
there's a few checks that we
need to perform.
Specifically, we'll need to make
sure that this class itself also
conforms to NSSecureCoding.
If it doesn't, we can't be sure
that it itself will be able to
further securely decode its own
contents.
And so, we can't safely decode
one of these objects.
In our case, the purchase class
will.
And so, it's safe to decode and
keep track of it.
Now, there's two other checks
that are related to
superclass-subclass
relationships.
If you have two classes, one of
which is a subclass of another,
both conforming to NSCoding, and
the superclass adopts
NSSecureCoding conformance, the
subclass will inherit that
conformance.
Now, the subclass may never have
had a chance to rewrite its init
with coder to do the secure
thing.
And so, we have an escape hatch
here.
The support secure coding
method, allows you to say,
actually I don't really support
secure coding, and you can turn
this off to say, I'm not ready
yet.
Alternatively, if you still say
yes, we have to make sure that
you either inherited the full
conformance to NSSecureCoding
from the superclass, or that you
overrode both of the methods to
indicate yes I really do support
secure coding.
There isn't a mismatch here.
In our case, purchase meets both
of these requirements and so we
can safely decode it from the
archive.
Now, we go ahead and decode a
purchase, it itself decodes a
listing.
And it can make use of the same
type of call to indicate that it
wants a listing.
When it does that
NSKeyedUnarchiver uses that
class to build a new allowed
class list.
And everything is checked
against this new version of the
list.
We go ahead and decode an
object, the same checks are
performed and in this case a
listing is still valid to
decode.
But again, if we try to decode
an object of an unexpected class
that's not in the list, it will
be rejected.
Let's take a look at what that
rejection might look like.
And this is called decoding
failure and there are a few
other types of failures that we
might care to look at.
So, when secure coding is on, we
might be able to see secure
coding violations in cases like
this.
But there's other forms of
failure here, too.
For example, a type mismatch can
happen if you try to decode an
object and instead there's a
primitive value, like an integer
in the archive at that location.
Or, you try to decode a
primitive, like an integer, and
instead we find an object or a
primitive of an incompatible
type like double.
These can cause decoding
failures.
There's another form of failure
that can happen here.
And that's archive corruption.
If the archive itself is too
corrupted and doesn't follow the
expected format for
NSKeyedUnarchiver, well we won't
be able to decode anything, and
so you'll get that same sort of
failure.
Now, what happens on failure is
decided by the decoding failure
policy on the Unarchiver.
There's two options here.
On failure, we can either raise
an exception or store
information about what happened
and continue execution.
Raising exceptions is currently
the default.
So, if we have a call, again
this is our call from earlier,
trying to decode a listing.
And we find an object of an
unexpected class in the archive,
under the hood this calls the
failWithError method, on the
Unarchiver and passes in an
error that indicates what went
wrong and where.
Now, under the hood,
failWithError has a decision to
make.
If the decoding failure policy
is to raise exceptions, it will
raise exceptions.
If you're writing a Swift app,
this is something you have to be
mindful of.
Swift can't catch Objective-C or
C++ exceptions, and so this can
lead to a fatal error in your
app.
Now again, this can lead to a
crash and a bad experience our
customers.
If the decoding failure policy
to set error and return, the
error will be assigned to the
Unarchiver's error property.
And execution has to continue.
And in this case, because
execution does continue, the
decode call has to return
something, and so it will return
nil to indicate that nothing
could be decoded.
And as mentioned, if you're
decoding a primitive type, here
and we find an object or a
primitive type that's
incompatible, the same series of
steps has to happened.
And in this case, instead of
returning nil, we'll simply
return 0.
Now failWithError is API on
NSKeyedUnarchiver, and we urge
you to use in your own code to
indicate when failures happen
and what went wrong.
Instead of simply returning nil,
failWithError first to record
that information.
If you do, there are a few
things to keep in mind.
If the decoding failure policy
is to set an error, and return,
you have to keep in mind, that
once an error is set on
Unarchiver, it won't later be
reset.
And that's because one decoding
failure often leads to a cascade
of decoding failures.
And we don't want to lose sight
of what originally went wrong.
Second, keep in mind that any
given failWithError call, can
either throw an exception or
continue execution, so you have
to keep both of those options in
mind.
Especially if you're working
Objective-C.
Maybe you can catch the
exception.
So, there's things to handle
there.
And lastly, keeping an eye out
for nil, or 0 return values.
This could either happen because
of a decoding failure, if the
decoding failure policy is to
set an error and return.
Or, the data could have just
been missing.
Or, even encoded as nil or 0.
So, to disambiguate between
these cases, check the error
property on the Unarchiver.
All right, so that was a lot of
information.
Let's distill this down into a
checklist to see how we can
adopt NSSecureCoding on our own
types.
We'll start by converting all
decode object for key calls to
the variant which allows us to
pass in that type information up
front.
And then, if something goes
wrong instead of just returning
nil, let's failWithError to
indicate what happened.
And lastly, this is a great
opportunity to audit for further
points of failure, where we
weren't performing validation
before.
So, let's do just that.
So, here again is a decode call
to decode a listing.
And you'll notice that if we
pass in that type information up
front, the conditional downcast
later can go away.
There's a generic overload of
this method, which when given
the type information statically
causes you to not have to
conditionally downcast anymore.
You'll always get an object of
that class.
Now, again, as we said, instead
of returning nil on failure, we
want to fail meaningfully.
So, here we can fail with an
error to indicate what went
wrong and where.
And in this case, we can use one
of the conveniences on
CocoaError to return a
meaningful error that has a good
localized description for our
customers and that indicates
what went wrong.
We can always add a debug
description for ourselves, to
later log if we care to.
But this is the core.
We want to failWithError before
returning nil.
And then, later on, again, we
were decoding the purchase date.
And here, there is a great
opportunity to add further
validation where we weren't
before.
Here, if we can decode a date,
well, I was simply storing the
property.
But instead, I want to make sure
that that date is valid.
For instance, a purchase
couldn't possibly had been made
before my app went live on the
app store.
So, this is something you could
check.
And again, if something goes
wrong here, we want to fail with
a meaningful error.
In this case, it wasn't that the
data was missing, it was that
the data was corrupted or
invalid in some way.
And so, we'll indicate just
that.
Now, that we've gone ahead and
done exactly this on our type,
we can go ahead and claim that
it supports secure coding.
And lastly, conform to the
NSSecureCoding protocol instead
to indicate to the runtime that
this is what we intended.
It really does support secure
coding.
And after that, well,
congratulations.
We've earned our NSSecureCoding
badge.
Physical badge is sold
separately.
Now, we think it's so important
for you to earn your own
NSSecureCoding badges and use
them that this year, we've added
new API and NSKeyedUnarchiver to
make sure that NSSecureCoding is
done wherever possible.
This new initializer and
convenience methods, turns on
secure coding by default and
sets the default decoding
failure policy to set error in
return.
So, unless you change the
decoding failure policy on your
own, you don't have to worry
about exceptions in Swift.
And indeed, the old initializer
and convenience methods are
deprecated in favor of these
versions.
So, we really want you to use
them.
And similarly, we've introduced
the same APIs on
NSKeyedUnarchiver, to make it
easier for you to turn
SecureCoding on when archiving.
And this is equally important
because it makes sure you can't
accidentally archive an object
which doesn't conform to
NSSecureCoding.
And you wouldn't later able to
decode it.
And again, the old initializer
and convenience methods are
deprecated.
And so, this means that if you
have old code that looks like
this, well turn on SecureCoding
when archiving.
And switch to the convenience
methods that allow you to pass
in that type information
upfront.
This way, you can actually make
use of your SecureCoding badge.
Make sure you've earned it.
Now, if you can't yet support
SecureCoding, because your types
can't conform, or they,
themselves depend on types which
don't yet conform, that's okay.
You can still use these methods.
On nCode, simply turn off the
secure coding requirement.
And on decode, these
conveniences always have
SecureCoding enabled.
So, instead use the new
initializer to create a
KeyedUnarchiver, and then turn
SecureCoding off, manually.
This also gives you the
opportunity to reset the
decoding failure policy back to
the old default in case you need
it.
Once you have such an
Unarchiver, well you can decode
as usual.
Now, if you're working on a
Swift app, NSSecureCoding isn't
the only way for you to convert
your own model types to external
representations and back.
Last year with Swift 4, we
introduced the Codable protocols
which make it easier to do just
that.
And importantly, all the design
decisions that made their way
into NSSecureCoding were also
present in Codable from day one.
Specifically, Codable never
writes type information into
archives.
So, there's nothing to trust
later on.
By requiring you to pass in
static type information upfront
about what you expect to decode,
you can prevent exactly this
sort of attack.
Now, Codable also comes with
conveniences.
If you have a type whose fields
are all themselves codable, well
you'll get a synthesized
implementation of the init from
and encode to requirements.
And the synthesized
implementation gives you type
and nullability checking for
free.
But as we saw, most types
require additional validation if
they come from external sources,
just like ours do.
So, we need to further validate
those.
So, we can do that for our own
types by overwriting in it with
init from decoder.
And in this case, again, here's
that JSON response from earlier.
And I can trivially turn it into
a Codable type by simply
creating a type with all the
same name fields.
Because all these fields are
codable, I get that free
implementation of init from and
encode to.
But, I want to override init
from to make sure I'm performing
the same validation that I was
before with the primitive
values.
So, I can do that here, the same
way.
Where my old code decoded the ID
from the payload by downcasting
to an int, I can do similarly
here by decoding an int from the
decoder.
If the type found in the payload
is of a different type or is
missing, this will throw an
error that indicates what
happened.
Now, more important than that is
my own validation that I added
in that validate method.
Here, I want to make sure I keep
performing that same validation
to make sure the ID is valid.
And here, I can use a
convenience method to throw a
meaningful error that indicates
what happened.
Now, later in that validate
function I might've pulled that
creation date out as a string
and then had to pass it along to
a date formatter to get a valid
date back out.
Here, because we can use JSON
decoder, we can simply decode a
date directly, we don't have to
worry about that type
conversion.
We can use JSON decoders date
decoding policy to indicate what
sort of conversion can happen
here.
And this is nice because this
conversion became a one-liner.
And, the other decode call is
also a one-liner, which makes it
easy for me to focus on the
types of validation that I care
about.
Now, later on in the validate
function, I might have also
pulled out that substructure of
tags out as an array of strings,
and later had to map those
strings to my own type, maybe
for further validation later on.
Here, again with Codable, I have
the benefit of tag conforming to
Codable itself.
And so, I can just decode an
array of tags directly.
This happens automatically.
And instead of focusing on the
type conversion, allows me to
focus on the validation that
matters to me.
This lets me make sure that this
data is data I can trust.
We talked about a lot today.
We started with raw data and
carried it all away across the
abstraction spectrum to our own
model types.
And along the way we saw how to
build trust into that data by
validating a checksum, or the
length on that raw data, we made
sure that it was valid to work
with an inspect, and even check
whether it conforms to some
format.
Once we've made sure that it
conforms to a known format, we
knew that that formatted data
was valid to work with and pull
primitive values out of.
By validating the contents and
structure of those primitive
values, we made sure that it was
valid to work with them to turn
them into our own model types.
By validating the semantics and
relationships between those
model types, we made sure that
they were valid to work with and
that they were data we could
trust.
So, where to go from here?
Well, now that you know all of
this, I encourage you to go and
look at your code and do just
this.
Validate data at every stage in
the lifecycle of that data.
Check for types and nullability.
But more importantly, range, and
length, and domain-specific
applications.
If you have NSCoding types,
audit those types and earn your
NSSecureCoding badge.
And don't just earn it, use it.
Turn on secure coding wherever
you can.
And lastly, for new data types,
adopt Codable where possible and
make sure to perform that
validation.
Make sure you're only ever
working with data you can trust.
For more information about
Codable, specifically, see last
year's talk about What's New in
Foundation.
But if you have more questions
about this topic or want to come
to us for help on how to apply
this in your own apps, come to
our lab later today, we have a
foundation lab, where we can
help you do just that.
With that, I'd like to say thank
you very much.
Have a great day.
[ Applause ]