WWDC2019 Session 710

Transcript

[ Music ]
[ Applause ]
>> Good afternoon, ladies and
gentlemen.
Welcome to the Filesystem
session.
In this session, we're going to
cover several topics related to
the filesystems.
First, we're going to have a
look at the role a filesystem
plays in protecting system
software on macOS.
We're going to describe how
volume replication works in
APFS.
And finally, we are going to
talk about the new and very
exciting feature coming to the
devices running iOS and iPadOS,
the access to the external files
on the external media.
But before we start, let's rehab
what happened to APFS recently.
APFS has been the default
filesystem it -- on iOS and tvOS
since 10.3 and on macOS since
High Sierra.
One of the features which APFS
introduced is a built-in volume
manager.
The concept of a volume is not
new to APFS.
They existed in HFS Plus.
The HFS Plus volume is a
one-to-one map to -- on this
partition.
And it occupies contiguous range
of blocks on disk.
Because of this one-to-one
mapping to partition and HFS
Plus volumes cannot be added
easily.
In order to add a new volume to
partitioned disk, you need first
to shrink the existing volume.
And then you can add a new
volume.
APFS volumes are a lot more
flexible because they share
their own disk partition space
with their siblings.
And that flexibility allows
higher levels of a system to
implement features which would
not otherwise be possible to
implement with the old-style
volumes.
One of us -- one of the features
which we'll implement is
protecting system software from
malicious or accidental updates.
You may remember back in the
macOS El Capitan days, we
introduced the concept of system
integrity protection.
It was used to control access to
a part of the directory
hierarchy.
Some directories would be write
protected, while changes would
be allowed in the other parts of
the filesystems.
This year, we're taking this one
step further.
We're making entire root
filesystem read-only.
[ Cheering and Applause ]
Obviously, a system where you
cannot install new software or
where user cannot save the data
is not particularly useful.
In order to understand how we
can reconcile those two
conflicting goals, read-only and
writeable at the same time,
let's look at what's happening
when we upgrade from macOS
Mojave to macOS Catalina.
A typical container on macOS
Mojave will have one main
volume, and few service volumes.
For example, VM.
The main volume is used to store
both user's data and system
software.
When we start upgrade--
When we start the upgrade to
macOS Catalina, we first change
the role of the main volume and
mark it as data volume.
We then can prune some parts of
the directive hierarchy which
contains only system software.
Once this is done, we can move
to the next stage where we
create a new empty volume, which
will be used only to store
system software.
We'll populate that volume with
the system content.
And once this process is done,
we can declare victory on the
protection front.
We are read-only.
We're protected.
It's good, but it's not enough.
We still need to somehow tie the
new system content with the user
content.
And for that, we introduce the
concept of a volume group.
A volume group is one data
volume and one system volume.
And they treat it as it as a
single entity.
The UI present it as a single
disk.
They share encryption state.
If your volumes are encrypted,
then the same password can be
used to unlock both volumes.
Almost everything looks as a
single unified entity.
There is one thing which is
missing.
We still need to somehow provide
an illusion of a single
directory hierarchy.
Traditionally, it was done by
mounting filesystems on top of
directories in the root
filesystem.
With a number of crossing
points, which we need to
introduce, and the number of
volumes which are required in
the filesystem, that approach
becomes rather expensive.
So to deal with this, we
introduced a new concept called
Firmlink.
The Firmlink is a new filesystem
object.
It's similar to Symlink.
But unlike Symlink, Firmlink
support backwards and forwards
path reversals, which is
consistent in its
representation.
And that consistency is rather
important.
If you ever had to deal with an
application, which absolutely
insists it has to live in a
particular directory, for
example, such as applications.
You know that you have to be
able to walk both from the top
of a filesystem from the road
down to the leaves and backwards
and get the same path.
We can do that with Firmlinks.
The Firmlinks are a traversal
point from a directory on the
system volume to a directory on
the data volume.
They only have one-to-one
mapping.
One source.
One target.
You cannot use a Firmlink to
cross a boundary of a volume
group.
The Firmlinks are rather
transparent to the user and the
application.
They are created by the install
at the installation time.
And they are not supposed to be
noticed.
Once we get this new tool, we
can use it to stitch up volumes
together.
The installer will create entry
on the system volume.
And point them to the
corresponding volumes on the
data volume.
And once it's done, we have a
unified directory hierarchy.
We can reboot, mount root as
read-only, and enjoy the
protection which it gives us.
Life is good.
Everything's protected.
Everything's running.
But you still have to remember
that the volumes are split
during the install process.
There is no way to avoid it.
It will happen.
In the developer's preview, we
elected to leave the root
filesystem writeable to make it
easier for you to test your
application.
If you want to mimic your
behavior, which will be
implemented in the future, you
can create a special file in the
root directory and on reboot,
your volumes will be mounted
read-only.
That will change in the next
build over next seed.
In the release build, you would
still be able to mount root
filesystem as read-write if you
disable system integrity
protection.
But this is not a persistent
change.
On the reboot, it will revert
back to read-only state.
You can remount it read-only,
read-write again.
And again, it will revert to the
same state on reboot.
As you can imagine, this is
rather big change in the way
macOS shapes and installs.
And it could catch some
applications.
If, for example, your
application uses a complex
layout on the filesystem or
comes with the installer
package, make sure it works on
the new read-only root
partition.
If you're writing a backup
utility, which cares about inode
numbers, filesystem IDs, and the
like, make sure that you test
it, because the assumptions you
may had previously may not be
true.
So the bottom line is test,
test, test.
And next, I will hand over to
Jon Becker to talk about volume
replication.
[ Applause ]
>> Thank you, Max.
Good afternoon.
My name is Jon, and I'm going to
be talking about volume
replication with APFS.
So let's dive right in.
What is volume replication?
Well, the basic idea is pretty
simple.
We'd want to copy one volume to
another volume.
Sounds simple.
But the important aspect of this
is that we want the fidelity of
this copy to be as high as
possible.
In general, it's not sufficient
to just be doing file-by-file
copies.
We want to be copying all volume
content.
All data, all metadata, volume
attributes.
If the source volume contains a
bootable OS, we want to be
copying the metadata that makes
that volume bootable so that the
target of our copy will also be
bootable.
Now I'm going to be talking
about replication in the context
of the Apple Software Restore
command line utility, or ASR.
ASR has been around for a very
long time.
Many of you may be familiar with
it.
And its primary function is to
do volume replication.
Now, with ASR, it's very common
for the source of our
replication to be a disk image.
And in that context, we often
refer to the procedure of doing
this replication to occur -- to
a target volume as restoring,
hence the name Apple Software
Restore.
But you'll hear me use the terms
restoring and replicating pretty
much interchangeably.
So who wants this?
Who needs this functionality?
Well, if you are in education or
Enterprise IT, if you do things
like set up large labs with lots
of machines, or if you write
backup utilities, then
replication is a function that
you probably need or want to use
with some regularity.
As we'll see, some of the new
features of APFS present a
challenge for how we can do
replication.
On the other hand, as well also
see, these new features also
present an opportunity to make
replication a more powerful and
flexible operation.
So I want to back up for just a
second and talk about how
replication has looked in the
past prior to APFS.
Now as Max demonstrated before,
with previous filesystems, like
HFS Plus, the filesystem, or I
should say the volume, is in a
one-to-one relationship with the
partition that contains it.
And what that means is that the
volume is backed by a contiguous
block device.
So replicating can really
involve just doing a block copy
of the entire partition.
Of course, if we're copying the
entire block device, we are
copying all of the filesystem
information in that volume.
Now there may be some
complications like as in this
diagram.
We have the source and target
partition not being the same
size.
But there are ways that we can
fix that up.
And so by and large, block
copying is a very effective and
relatively simple way of doing
replication.
But of course, with APFS, there
are some new features that
complicate this picture.
So APFS, as Max told you, has
some features, like volume
management and space sharing.
And as we can see, if we take a
look at Volume 1 in the diagram
here, it's spread out around its
container partition.
And so it is not a contiguous
block device.
And it may be -- its data may be
interspersed with the data for
another volume in the same
container, like Volume 2 in this
example.
Furthermore, we of course care
about security and privacy.
And so we have to think about
encryption.
Now with APFS, encryption is
done at the filesystem level.
And what's more, on Macs that
have the T2 security chip for
internal storage devices, that
encryption is on all the time.
And it is tied directly to the
hardware, meaning that it is
specific to that storage device
in that Mac.
And so if we would try to copy
the blocks for that volume and
take them somewhere else, they
won't be decryptable.
So the upshot here is block
copies are really not a possible
way to do replication with APFS
volumes.
So how do we reconcile this with
the needs that we have?
Well, new with macOS Catalina
we're introducing APFS volume
replication with ASR.
ASR and APFS.
Yeah, thank you.
ASR and APFS are tightly
integrated.
And ASR can have APFS generate a
stream from the source volume.
And that stream then gets
written to the target volume.
Now because APFS is generating
this stream, it knows where it
needs to read the parts of the
source volume.
And so it's not a problem that
that source volume is not a
contiguous block device.
As far as encryption goes, if
the source of volume is
encrypted, then it will be --
the data from it will be
decrypted on the fly as the
stream is generated.
And of course, if the source
volume is protected by
FileVault, then it does need to
be unlocked by user action prior
to this replication taking
place.
If the target volume is itself
encrypted then the data is in --
or the data is encrypted as it
is written from the stream to
the target volume.
And so in this case, that data
is encrypted from the get go by
the time it hits the target
volume.
Okay? One other nice feature of
this replication is that as the
stream is generated, the volume
data is defragmented on the fly.
The metadata is compacted, and
so the resulting stream, and
therefore, the resulting target
volume, are very nicely
optimized.
This can be a great way to do --
to master images, for example.
If you're mastering an image,
and your final step is to do a
replication operation, then your
image volume will be very nicely
optimized.
So when we do restores,
replication, however you want to
say it, there are a number of
options that we can use.
I just want to call out a couple
of them.
The first one is really very
much like restores as we're used
to.
We can specify a source volume.
You can specify a target volume.
And the idea is that the target
volume will be completely erased
and replaced.
Or its contents replaced by the
contents of the source volume.
Now in this case, we will have
Volume 2 be our target volume.
And so our restore would look
something like this.
You would see a sample command
line right here.
And the result is Volume 2's
contents are erased, replaced by
the contents of the source
volume.
So the restored volume and
source volume are the same.
Notice, by the way, that in this
example, the target container
also has another volume in it.
And that volume is left alone.
Its contents are preserved.
It does not form in any way a
part of the replication
operation.
But there's another option that
we have, which is instead of
having to specify a target
volume and erase that, we can
instead generate a brand-new
volume to be the target on the
fly.
And we do this by specifying the
entire container as the target.
And this tells ASR that what we
really want is to generate a
brand-new volume and restore to
that.
You can see another sample
command line.
And the result is a new volume
is created and restored to.
So in this case, Volume 1 and
Volume 2 are both left alone.
Now I want to step away for just
a second from replication.
And I want to talk about
snapshots in APFS.
So a snapshot is just a
point-in-time capture of all
volume state.
So for example, we may have a
volume with a number of files on
it.
We can take a snapshot of that
volume.
And the result is a capture, a
freeze frame, of what that
volume looks like at the time
that the snapshot is taken.
If we make subsequent changes to
the volume, like for example,
deleting a file or adding some
new files, the snapshot is still
encompassing all of the state
that existed at the time it was
created.
So in this case, if we look at
the live volume, it appears that
that file that was deleted is
not there.
But in some sense, it is still
there because it's part of that
snapshot.
So you might wonder, "Well, what
does this have to do with
replication?"
Well, once again, new with macOS
Catalina, we can now do
restores, replication of
snapshots.
So to understand what that means
--
[ Applause ]
Thank you.
To understand what that means,
consider this volume here on the
left.
It has two snapshots in it, Snap
1 and Snap 2.
They contain some files in
common, the yellow ones, some
files that are in one snapshot
and not the other, some files
that are in the other and not
the one, and a file that's not
an either snapshot.
I can restore this volume to a
target volume over here on the
right.
It's currently empty.
And the idea there, of course,
is that at the end of that
restore, the target volume looks
like the source volume.
But instead of restoring the
sort -- the live version of the
source volume the way it
currently looks, I can instead
restore a snapshot.
So if, for example, I want to
restore Snap 1, I can specify
that's the snapshot I want to
restore.
And the result is that my target
now looks like Snap 1.
And notice in particular that
those two files, which were
deleted from the source volume,
have come back to life in the
target volume.
Having done that, maybe I want
to add some new files to my
target volume.
But then, maybe now I want to
restore Snap 2 to that target
volume.
And of course, again, the idea
is that at the end of that
operation, the live target
volume should look like Snap 2
from the source volume.
But notice that both the source
volume and the target volume
already have Snap 1 on them.
Wouldn't it be great, if instead
of having to copy all of Snap 2,
I could just copy the things
that are different between Snap
1 and Snap 2?
Well, indeed I can.
We call that difference between
two snapshots a snapshot delta.
And the idea is when I restore a
snapshot delta, I'm specifying
the two snapshots whose
difference I want to take.
I perform the restore.
And of course at the end, the
live target volume looks like
Snap 2 from the source.
But there's three things that I
want you to notice about this
target volume.
Number one, all of those files
which were not part of the
snapshot on the target have been
discarded.
Number two, the files which were
in Snap 1 but not in Snap 2 have
been discarded from the live
target volume.
They, of course, still exist in
Snap 1 on the target volume.
And number three, the only
things that we needed to copy
were those files that were part
of Snap 2 and not part of Snap
1.
Okay? Now this is a really
powerful feature.
It's a great way to do
incremental releases.
Imagine that you are updating a
lab filled with 100 machines.
You can save a lot of time, a
lot of network bandwidth, if
you're only copying the
difference between a couple of
snapshots on your source image.
So that's what I had to say
about replication.
Take home points here.
A more sophisticated filesystem
requires more sophisticated
mechanisms for doing
replication.
APFS volume replication is
really best done using ASR
because it provides the highest
fidelity of copies.
And it handles the encryption as
necessary.
And finally, we can now restore
snapshots in snapshot deltas
using ASR.
And with that, I'm going to turn
it over to Bill, who will talk
about external file access for
iOS.
Thank you.
[ Applause ]
>> Thank you, John.
Good afternoon.
You may recall two years ago, we
introduced Files app and a new
file provider, API.
Together, they offer an
excellent experience for
Cloud-based documents.
This year, we thought we could
do more.
So this year, we're happy to
announce support on iOS for
accessing files on network
shares and on USB storage.
[ Cheering and Applause ]
For USB storage, we support
everything from compact flash
and CF and cards and sticks, up
through USB raid boxes.
We support a number of
filesystems.
We support unencrypted APFS,
unencrypted HFS Plus, and we
support FAT and ExFAT.
[ Applause ]
This feature is available on all
iOS and iPadOS devices.
It's available on iPad Pro with
USB-C.
And for lightning devices, it's
available with the appropriate
adapters.
Moving on for network shares, we
support connecting to SMB 3.0
servers.
We support connecting over
Wi-Fi, over cellular, and over
Ethernet.
There are a lot of exciting
features here.
But one that we wanted to call
out is for our iOS devices,
iPadOS devices.
We're supporting search using
the Windows Search Protocol.
So all these devices can search
your SMB server if it supports
the WSP protocol.
We're also really excited to
announce that that that includes
the SMB server and macOS
Catalina.
Before going on, I wanted to
talk a moment about security.
Security is a feature users have
come to expect from iOS.
Two of the tools we have used to
help deliver this security are
process separation and privilege
separation.
In developing this feature,
we've kept this in mind and
incorporated them from the
ground up.
So for all of our volumes and
shares, the filesystem
manipulations happen not on the
kernel but in a dedicated
process space.
This separation helps us deliver
the security iOS users have come
to expect.
Now, let's see it in an action.
All right, I have an iPad.
And I don't know if -- you
probably can't see it.
But I have a USB stick
connected.
And I have mail.
And in files, on the left, you
can see the locations.
iCloud Drive.
If we had a third-party Cloud
provider, they would be listed
there as well.
And we see a destination for our
USB device.
When we select it, we see our
photos, our documents, all the
files and directories on there.
And we can manipulate them just
like any other file in Files
app.
So to copy one, you just select
it, get it ready for drag and
drop, pick your destination,
drag it, and drop it.
And you can see it's already in
the folder now.
Another thing we like to do with
devices is let's copy photos
onto them.
Photos, I have a picture a
friend took in India of
tomatoes.
Let's save it to the USB.
To do that, we just select the
document, go to Share, go down
the list to save to Files.
You can see, as a destination,
we have the USB stick listed.
We just select it.
It already is.
We hit save.
And it's copied.
[ Cheering and Applause ]
I expect many of you are
application developers.
And you're wondering how your
application can take advantage
of this.
This feature is available to any
and all applications linked on
or after iOS 13.
So rebuild your application.
And you can take advantage.
To see that in action, let's
look at Numbers.
When I open up Numbers, it's
start -- it's starting with my
iCloud Drive.
The USB is listed as a
destination.
We select it.
And there, we can see all of our
documents.
We can see our photos dimmed
because they're not Numbers
documents.
And then we see that we've had
two Numbers documents on this
drive.
Let's open one of them.
It's a loan comparison comparing
two different loan amounts and
two different interest rates.
Just for comparison, let's see
what happens if we raise the
interest rate.
No. Oh.
[ Laughter ]
Set it to 20, not 200.
Oh, no. Well, that's crazy.
[ Laughter ]
And you can see on the -- see
live the interest rate amounts
changing.
So what does this mean for you
as developers?
As I said, it's available if
you're linked on or after iOS
13.
So rebuild your application and
test.
We're adding five different
filesystem types to iOS.
These filesystems act slightly
differently than APFS on the
internal flash storage.
One difference is iOS has always
had case-sensitive filesystems.
FAT and ExFAT are
case-insensitive.
And HFS and APFS can be
configured either way.
The Clone System call may not
always be available.
So as these differences are --
if these are differences are
important to you or as they are,
please pay attention to volume
capabilities.
There are two different APIs or
a couple of APIs that can get
them for you.
One I wanted to call out are the
resourceValues in NSURL.
These can give you parameters
for the filesystem you're
working with.
Another important point is file
movement may take time.
So please put your temporary
files near your working files.
If you don't, right, this is
helpful.
Because Save-Save uses a rename
at the very end so that a user
always sees a good file.
They either see the document
that they started with, or they
see the new save.
For this to work, we need that
rename.
And that only works if they're
both on the same filesystem.
Also, if you're not careful
about your temporary files, they
may end up in your container on
the internal storage.
And that's going to lead to a
lot of unnecessary IO.
File Manager can help you with
this.
If you ask for a URL for the
itemReplacementDirectory
appropriate for your documents,
it will give you a temporary
directory on the same
filesystem.
Another thing is that external
devices can go away.
A network can go out of range.
A file server can go down.
A CAT can disconnect a cable.
These things can happen.
And your application needs to be
robust in face of it.
One thing I especially wanted to
point out is mmap can be
dangerous.
It can be really powerful.
But if the file goes away, the
only way the kernel can tell you
that is with a BUS error.
So one thing, if you're using
NSData, there is a hint you can
give NSData and say mmap this
data from a file if it's safe.
The last point is that external
devices, all of them, have
higher latencies than APFS on
the internal flash storage.
So if you're doing sizable IO,
please keep multiple operations
in-flight at once.
So to summarize today's talk,
Max talked to us about how we're
making the root volume read-only
for enhanced security.
John talked to us about ASR and
how you can use ASR to replicate
volumes, including snapshot
deltas.
And I talked to you about how
we're adding support for
external files to iOS and iPadOS
and how that will let you access
files on USB storage and network
shares.
For more information, we have a
lab after this session.
And tomorrow, there's a talk,
Combine and Advances in
Foundation, where they're going
to talk more about what you can
do with Foundation on external
media.
And there's also a video
session, What's New in File
Management and Quick Look, that
will talk more about the UI
document aspects of this talk.
Thank you.
[ Applause ]