WWDC2018 Session 406

Transcript

[ Music ]
[ Applause ]
Hi, everybody, I'm Ben.
I work on the Swift standard
library.
And together with my colleague
Doug, from the compiler team.
We're going to talk to you about
Swift generics.
So the recent releases of Swift
have added some important new
features.
Including conditional
conformance and recursive
protocol constraints.
And, in fact, with every release
of Swift, we've been refining
the generic system, making it
more expressive.
And we feel that the 4.2 release
marks an important point.
It's the point where we can
finally fully implement a number
of designs that have always been
envisioned for the standard
library.
Something that's critical for us
in achieving our goal of API
stability for Swift.
So, we've given a lot of talks
about generics in the past, but
we haven't taken a step back.
And talked about generics as a
whole for a while.
So today, we're going to take
you through a few different
features of the generics system.
Both new and old, to help
understand how they fit
together.
I'm going to briefly recap the
motivation for generics.
We're going to talk about
designing protocols, given a
number of concrete types, using
examples taken from the standard
library.
We're going to review protocol
inheritance, and talk about the
new feature of conditional
conformance.
And how it interacts with
protocol inheritance.
And finally, we're going to wrap
up with a discussion of classes
and generics.
So why are generics such an
important part of Swift?
Well one way of seeing their
impact is by designing a simple
collection, like type.
We'll call it buffer, and it's
going to be similar to the
standard library's array type.
Now, the simplest possible API
for the reading part of a buffer
might include a count of the
number of elements.
And a way to fetch each element
to the given position in the
index.
But, what do we make of that
return type?
Now, if we didn't have generics,
we'd have to make it some kind
of type that could represent
anything that we'd want to put
inside the buffer.
You can call that type ID or
object or void star.
In Swift we call it Any, which
is a type that can stand in for
any different kind of type in
Swift.
So if you wanted to handle
anything in the buffer, you
could have subscript return an
Any.
But, of course, you probably
know that that leads to a really
unpleasant user experience.
At some point, you've got to get
out that type from inside the
box.
In order to actually use it.
And this isn't just annoying,
it's also error-prone.
What if somewhere in your code,
maybe by accident, you put an
integer into what was supposed
to be a buffer of strings?
But it's not just about ease of
use.
We also want to solve some
problems relating to how these
values are represented in
memory.
Now, the ideal representation
for a buffer of strings, would
be a contiguous block of memory.
With every element held in line
next to each other.
But with an untyped approach,
this doesn't work out quite so
well.
Because the buffer doesn't know
in advance what kind of type
it's going to contain.
And so it has to use a type like
Any that can account for any of
the possibilities.
And, there's a lot of overhead
in tracking, boxing, and
unboxing the types in that Any.
Here, I might have just wanted a
buffer of integers, but I have
no way of expressing that to the
compiler.
And so, I'm paying for
flexibility, even though I'm not
interested in it.
What's more, because Any has to
account for any different kind
of type.
Including types that are too
large to fit inside its own
internal storage, it has to
sometimes use indirection.
It has to hold a pointer to the
values, and that value could be
located all over memory.
And so we really want to solve
these problems, not just for
ease of use and correctness, but
also for performance reasons.
And, we do it using a technique
called parametric polymorphism.
Which is just another term for
what we in Swift refer to as
generics.
With a generic approach, we put
more information on the buffer,
to represent the type that the
buffer is going to contain.
We'll call that type Element.
Element is a generic parameter
of the type, hence the term of
parametric polymorphism.
You can think of it kind of like
a compile-time argument that
tells the buffer what it's going
to contain.
Now it has a way of referring to
that element type.
It can use it wherever it was
previously using Any.
And, that means that there's no
need to do conversions when
you're getting a type out of the
buffer.
And if you make an accidental
assignment of the wrong kind of
type, or some issue similar to
that.
The compiler will catch you.
Now, now there's no such type as
buffer without an associated
element type.
If you try to declare a type
like that, you'll get a
compilation error.
You might find that slightly
surprising.
Because sometimes you'll see
that you can declare types like
buffer without any element type.
But, that's just because the
compiler is able to infer what
the element type ought to be
from the context.
In this case, from the literals
on the right-hand side here.
The element is still there, it's
just implicit.
This knowledge of exactly what
type a buff-- a type like buffer
contains is carried all the way
through both compile and
runtime.
And this means that we can
achieve our goal of holding all
of the elements in a contiguous
block of memory, with no
overhead.
Even if those types are
arbitrarily large.
And because the compiler has
direct knowledge at all times of
exactly what element type the
buffer contains.
It has optimization
opportunities available to it
that it wouldn't otherwise have.
So, in the case here, where I've
declared a buffer of integers.
A loop like this ought to be
compiled down to just a handful
of very efficient CPU
instructions.
Now, if you were writing a loop
like this, on a regular basis.
To sum up a buffer of integers,
it might make sense to extract
it out into a method.
An extension on buffer that's
more unit-testable, and more
readable when you actually call
it.
But, you probably know that if
you've written code like this,
you'll get a compilation issue,
because not all element types
can be summed up like this.
We need to tell the compiler
more about the capabilities the
element needs to have, in order
to make this method available on
a buffer.
Now, the easiest way to do that
is by constraining the element
type to be a specific type like
the Int from our original loop.
If you take this easy approach
to get up and running with your
extension, it's easy to
generalize it later.
When you find that you need to
do something different.
Like sum up a buffer of doubles,
or floats.
Just look at the type that
you've constrained to.
Look at the protocols it
conforms to.
And follow them up until you get
the most general protocol that
gives you everything that you
need to do your work.
In this case, the numeric
protocol, which gives us the two
things we're relying on here.
The ability to create a new
element with a value of zero,
and the ability to add elements
to it.
Which come as part of the
numeric protocol.
Now, let's talk about that
process of factoring out
protocols from various types.
So we've been talking about this
buffer type, and we can make it
generic across different
elements.
But what about writing generic
code that's generic in a
different direction?
Or writing code that works on
any different kind of
collection?
Such as an array that's very
similar to our buffer type.
But also more varied types, like
a dictionary that's a collection
of key value pairs.
Or maybe types that aren't
generic or are the different
element types, like data or
string that returns specific
element types.
We want to create a protocol
that captures all of their
common capabilities.
We're going to create a, a cut
down, simplified version of the
standard library's own
collection protocol.
So notice that we considered a
varied number of concrete types
first.
And now, we're thinking about a
kind of protocol that could join
them all together.
And, it's important to think of
things as this way around.
To start with some concrete
types, and then try and unify
them with a protocol.
What do those types have in
common?
What don't they have in common?
When you're designing a protocol
like this, you can think of it
kind of like a contract
negotiation.
There's a natural push and pull
here, between conforming types
on the one hand.
That want as much flexibility as
possible in fulfilling that
contract.
And users of the protocol, that
want a really nice, tight,
simple protocol in order to do
their extensions.
That's why it's really important
to have both a variety of
different possible conforming
types.
And a number of different use
cases in mind when you're
designing your protocol.
Because it's a balancing act.
So, let's start to flesh out the
collection protocol.
So, first we need to represent
the element type.
Now, in protocols, we use an
associated type for that.
Each conforming type needs to
set element to be something
appropriate.
In the case of buffer, or array,
as of Swift 4.2, this happens
automatically.
Because we also named their
generic parameters to be element
as well.
This is a nice side benefit of
giving your generic arguments
meaningful names that follow
common conventions like the word
element.
Rather than giving them
something arbitrary like T that
you'd have to separately state
was the element type.
For other data types, you might
need to do something slightly
more specific.
For example, a dictionary needs
to set the element type to be
the pair of its key and value
type.
Next, let's talk about adding
the subscript operation.
Now, if we were talking about
just a protocol for types like
array, we might be tempted to
have subscripts take an Int as
its argument.
But making subscript take an Int
would imply a very strong
contract.
Every conforming type would have
to supply the ability to fetch
an element's given position that
was represented by an integer.
And, that works great for types
like array.
It's also definitely easy for
users of the protocol to
understand.
But is it flexible enough for a
slightly more complicated type,
like a dictionary?
Now no matter how you model it,
a dictionary's probably going to
be backed by some fairly
complicated internal data
structure.
That has specific logic for
moving from one element to the
next.
For example, it could be backed
by an internal buffer of some
kind, and it could use an index
type that stored an offset into
that buffer.
That it could then take as the
argument to subscript in order
to fetch an element to the
position, using that offset.
But it would be critical that
the dictionary's index type be
an opaque type that only the
dictionary can control.
You wouldn't want somebody
necessarily just adding one to
your offset.
That wouldn't necessarily move
to the next element in the
dictionary.
It could move some arbitrary,
maybe uninitialized part of the
dictionary's internal storage.
So instead we want the
dictionary to control moving
forward through the collection
by advancing the index.
And so to do that, we add
another method.
That given an index, gives you
the index that marks the
position after it.
Once you take this step, you
need a couple more things.
You need a start index property,
and an end index property.
Because a simple count isn't
going to work anymore in order
to tell us that we've reached
the end.
Now that we're not using Ints as
our index type.
So let's bring those back to the
collection protocol.
So we've got a subscript that
takes some index type to
represent a position, and gives
you an element there.
And, we've got a way of moving
that position forward.
But we also need types to supply
what kind of type they're going
to use for their index.
We do that with another
associated type.
Conforming types would supply
the appropriate types.
So an array or a data would give
an Int as their index type.
Whereas a dictionary would give
its own custom implementation
that handles its own internal
logic.
So let's go back to count that
we dropped a minute ago in order
to generalize our indexing
model.
It's still a really useful
property to have.
So we probably want to add it
back as an extension on
collection.
Something that walks over the
collection, moving the index
forward, incrementing a counter
that it then returns.
Now, if we try and implement
this, we hit another missing
requirement.
Since we moved off of Int to a
general index type, we can no
longer assume that the index
type was equatable.
Ints are, but arbitrary index
types aren't necessarily.
And, we need that in order to
know that we've reached the end.
Now, we could solve this in the
same way that we did earlier, of
constraining our extension.
Say that it only works when the
index type is equatable.
But, that doesn't feel right.
We want a protocol to be easy to
use.
And it's going to get really
irritating, if we have to
always, on every extension we
write, put this constraint on
there.
Because we're nearly always
going to need to be able to
compare two indexes.
Instead, it's probably better
expressed as a requirement of
the protocol.
As a constraint on our
index-associated type.
Putting this constraint on the
protocol means that all types
that conform to the protocol
need to supply an equatable type
for their index.
That way you don't have to
specify it every time you write
the extension.
This is another example of
negotiating the protocol
contract.
Users of the protocol had a
requirement that they really
needed to be able to compare
indexes.
And, conforming types, they did
a check that they can reasonably
accommodate that without giving
up too much flexibility.
In this case, they definitely
can.
Ints, the data, and array are
using are already equatable.
And, with Swift 4.2's new
automatic synthesis of equatable
conformance.
It's easy for dictionary to make
its index type equatable as
well.
Next, let's talk about
optimizing this count operation
with a customization point.
So, we've written a version of
count, that calculates the
number of elements in the
collection by walking over the
entire collection.
But, obviously a lot of
collections can probably do that
a lot faster.
For example, supposing a
dictionary kept internally a
count of the number of elements
it held, for its own purposes.
If it has this information, it
can just serve it up in its own
implementation of count.
That means that when people call
count on a dictionary, they're
getting fast constant time.
Instead of the linear time that
our original version that works
with any collection takes.
But, when adding optimizations
like this, there's something you
need to be aware of.
Which is the difference between
fulfilling protocol
requirements, and just adding
lots of overloads onto specific
types.
Up until now, this new version
of count on dictionary is just
an overload.
That means that when you have a
dictionary, and you know it's a
dictionary.
You'll get the newer, better
version of count.
But, what about calling it
inside a generic algorithm?
So supposing we wanted, for
example, to write a version of
the standard library's map?
If you're not already familiar
with it, it's a really useful
operation that transforms each
element in the collection.
And gives it back to you as a
new array.
The implementation's pretty
simple.
It just creates a new array,
moves over the collection,
transforms each element.
And then appends it to the
array.
Now, as you append elements to
an array like this, the array
automatically grows.
And, as it grows, it needs
sometimes to re-allocate its
internal storage.
In order to make more room to
accommodate the new elements.
In a loop like this, it might
have to do that multiple times
over, depending on how big it
gets.
And, doing that takes time.
Allocating memory can be fairly
expensive.
There is a nice optimization
trick we can do with this
implementation.
We already know exactly how big
the final array is going to be.
It's going to be exactly the
same size as our original
collection.
So we could reserve exactly the
right amount of space in the
array up front, before we start
appending to it, which is a nice
speed-up.
And to do this, we're calling
count.
But, we're calling count here,
in what's referred to as a
generic context.
That is, a context where the
collection type is completely
generic, not specific.
It could be an array, or a
dictionary, or a link list, or
anything.
So, we can't know that it
necessarily has a better
implementation of count
available to it.
When the compiler compiles this
code.
And so, in this case, the
version of count that's going to
be called is actually the
general version of count.
That works on any collection and
iterates over the entire
collection.
If you called map on a
dictionary, it wouldn't call the
better version of count that
we've just written yet.
In order for customized method
or property like this to be
called in a gen-- in a generic
context.
It needs to be declared as a
requirement on the protocol
itself.
We've established that there's
definitely a way in which
certain collections could
provide an optimized version of
count, so it makes sense to add
it as a requirement on the
protocol.
Now, even though we've made it a
requirement to implement it, all
collections don't have to
provide their own
implementation.
Because we've already provided
one via our extension that will
work on any collection.
Adding a requirement to the
protocol, and alongside it
adding a default implementation
via an extension is what we
refer to as a customization
point.
With a customization point, the
compiler can know that there's
potentially a better
implementation of a method or
property available to it.
And so, in a generic context, it
dynamically dispatches to that
implementation through the
protocol.
So now, if you call map on a
dictionary, even though it's a
completely generic function.
You will get the better
implementation of count.
Adding customization points like
this, alongside default
implementations through
extensions.
Is a really powerful way of
getting the same kind of benefit
that you can also get with
classes, implementation
inheritance, and method
overwriting.
But, this technique works on
structs and enums, as well as
classes.
Now, not every method can be
optimized like this.
And, customization points have a
small but non-zero impact on
your binary size, your compiler
runtime performance.
So, it only makes sense to add
customization points when
there's definitely an
opportunity for customization.
For example, in the map
operation that we just wrote.
There's no reasonable way in
which any different kind of
collection could actually
provide a better implementation.
And so, it doesn't make sense to
add it as a customization point.
It can just stay as an
extension.
So, we've created this
collection type, and it's
actually pretty fully-featured
now.
It has lots of different
conforming types possible.
And various different useful
algorithms you can write for it.
But, sometimes you need more
than just a single protocol in
order to categorize your family
of types.
You need protocol inheritance.
And, to talk to you more about
that, here's Doug.
[ Applause ]
>> Thank you, Ben.
So, protocol inheritance has
been around since the beginning
of Swift.
And, to think about where we
need protocol inheritance.
Let's go look at this collection
protocol that we've been
building.
It's a nice protocol.
It's well-designed.
It describes a set of conforming
types, and gives you the ability
to write interesting generic
algorithms on them.
But, we don't have to reach very
far to find other
collection-like algorithms that
we cannot implement in terms of
the collection protocol thus
far.
For example, if we want to find
the index of the last element in
a collection, that matches some
predicate.
The best way to do that would be
to start at the end, and walk
backwards.
Collection protocol doesn't let
us do that.
Or say we want to build a
shuffle operation to randomly
shuffle around the elements in a
collection.
Well, that requires mutation,
and collection doesn't do that.
Now it's not that the collection
protocol is wrong.
But it's that we need something
more to describe these
additional generic algorithms,
and that is the point of
protocol inheritance.
So, here the
bidirectionalCollection protocol
inherits from, or is a
collection.
What that means is that any type
that conforms to the
bidirectionalCollection protocol
also conforms to collection, and
you can use those collection
algorithms.
But bidirectionalCollection adds
this additional requirement, of
being able to step backwards in
the collection.
An important thing to note is
not every collection can
actually implement this
particular requirement.
Think of a singlyLinkedList,
where you only have these
pointers hopping from one
location to the next.
There's no efficient way to walk
backward through this sequence,
so it cannot be a
bidirectionalCollection.
So, once we've introduced
inheritance, you've restricted
the set of conforming types.
But you've allowed yourself to
implement more interesting
algorithms.
So, here's the code behind this
last index where operation.
It's fairly simple.
We're just walking backwards
through the collection, using
this new requirement from the
bidirectionalCollection
protocol.
Let's look at a more interesting
algorithm.
So here's a shuffle operation.
So, it was introduced for, for
collections in Swift 4.2.
You don't have to implement it
yourself, but we're going to
look at the algorithm itself to
see what kinds of requirements
it introduces to figure out how
to categorize those into
protocols meaningfully.
So the Fisher-Yates shuffle
algorithm's a pretty old
algorithm.
It's also fairly simple.
You start with an index to the
first element in the collection.
And then, you select randomly
some other element in the
collection, and swap those two.
In the next iteration, you move
the left index forward one.
Randomly select between there
and the end, swap those
elements.
And so, the algorithm is pretty
simple.
It's just this linear march
through the collection, randomly
selecting another element to
swap with.
And, at the end of this, you end
up with a nicely shuffled
collection.
So, we can actually look at the
code here.
It's a little bit involved.
Don't worry about that.
And, we're going to implement it
on some kind of collection.
So, we'll look at the core
operations in here.
So, first we need to be able to
grab a random number between
where we are in the collection
and the end of the collection,
using this, this random
facility.
But, that's an integer.
And what we need is an index
into the collection.
We know those are different.
So we need some operation.
Let's call it index offsetBy.
To jump from the start index
quickly over to whatever
position we've selected.
The other operation we need is
the ability to swap two
elements.
Great. We have two operations
that we need to add to the
notion of a collection to be
able to implement shuffle.
Therefore, we have a new
shuffleCollection protocol.
Please don't do this.
So this is an anti-pattern that
we see.
And the anti-pattern here is we
had one algorithm.
We found its requirements, and
then we packaged it up into a
protocol that is just that one--
just describes that one
algorithm.
If you do this, you have lots
and lots and lots of protocols
around that don't have any
interesting meaning.
You're not learning anything
from those protocols.
So what you should do is notice
that we actually have distinct
capabilities here.
So shuffle is using random
access, and it's using mutation.
But, these are, these are
separate, and we can categorize
them in separate protocols.
So, for example, the
randomAccessCollection protocol
is something where it allows us
to jump around the collection,
moving indices quickly.
And there are types like
unsafeBufferPointer that can
give you random access.
But, do not allow any mutation.
That's a separate capability.
So, we also have the
mutableCollection protocol here.
And, we can think of types here
that allow mutation, but not
random access, like the
singlyLinkedList that we talked
about earlier.
Now, you notice that we've
essentially split the
inheritance hierarchy here.
We've got the access side for
random access, bidirectional,
and so on.
And then, we've got this
mutation side.
That's perfectly fine, because
clients themselves can compose
multiple protocols to implement
whatever generic algorithm
they're doing.
So, we go back to our shuffle
algorithm.
And it can be written as an
extension on
randomAccessCollection, with a
self-type.
So this is the type that
conforms to
randomAccessCollection also
conforms to the
mutableCollection protocol.
And now, we've pulled together
the capabilities of both of
these.
Now, when you have a bunch of
conforming types, and a bunch of
generic algorithms, you tend to
get protocol hierarchies
forming.
Now, these hierarchies, they
shouldn't be too big.
They should not be too
fine-grained.
Because you really want a small
number of protocols that really
describe the kinds of types that
show up in the domain, right?
And now, there's things, things
that you notice when you do
build these protocol
hierarchies.
So, as you go from the bottom of
the hierarchy to the top, you're
going to protocols that have
fewer requirements.
And therefore, there are more
conforming types that can
implement those requirements.
Now, on the other hand, as
you're moving down the
hierarchy, and combining
different protocols from the
hierarchy.
You get to implement more
intricate, more specialized
algorithms that require more
advanced capabilities.
But naturally work with fewer
conforming types.
Okay. So let's talk about
conditional conformance.
This is, of course, a newer
feature in, in Swift.
And, let's start by looking at
slices again.
So for any collection that you
have, you can form a slice of
that collection by subscripting
with a particular range of
indices.
And, that slice is essentially a
view into some part of the
collection.
Now these are default type that
you get from slicing a
collection, is called slice.
And slice is a generic adaptor
type.
So it is parameterized on a base
collection type, and it is
itself a collection.
So our expectation on a slice is
that you can do anything to a
slice that you can do to the
underlying collection.
It's a reasonable thing to want.
And so, certainly we can go and
use the forward search
operations like indexwhere, to
go find something matching a
predicate.
And that works on the collection
and any slice of that
collection.
So, we'd like to do the same
thing with backwards search, but
here we're going to run into a
problem.
So even if the buffer is a
bidirectionalCollection, nothing
has said that the slice is a
bidirectionalCollection.
We can fix that.
Let's extend slice to make it
conform to the
bidirectionalCollection
protocol.
We need to implement this index
before operation, which we can
implement in terms of the
underlying base collection.
Except the compiler's going to
complain here.
The only thing we knew about
that base collection is that
it's a collection.
It doesn't have an index before
operation on it.
We know how to fix this.
All we need to do is introduce a
requirement into this extension
to say that well, base needs to
be a bidirectionalCollection.
This is conditional conformance.
All it is, is extensions that
declare conformance to a
protocol.
And then the constraints under
which that conformance actually
makes sense.
And the wonderful thing about
conditional conformance, is it
stacks nicely when you have
these protocol hierarchies.
So we can also state that slice
is a randomAccessCollection.
When its underlying base type is
a randomAccessCollection.
Now, notice that I've written
two different extensions here.
Now, it's generally good Swift
style.
Write an extension, have it
conform to one protocol, so you
know what that extension is for,
you know its meaning.
It's particularly important with
conditional requirements,
conformances, because you have
different requirements on these
extensions.
And, this allows for
composability.
Whatever the underlying base
collection can do, the slice
type can also do.
So let's look at another
application of conditional
conformance, also in the
standard library, and these are
ranges.
So, ranges have been around
forever in Swift.
And, you can form a range with,
for example, these dot-dot less
than operations.
And so you can form ranges of
doubles, you can form ranges of
integers.
But some ranges are more
powerful than others.
So, you can iterate over the
elements in a range of integers.
Well, why can you do that?
It was because an intRange
conforms to collection.
Now, if you're actually look at
the type, it's reduced by that
dot-dot less-than operator.
It is aptly named the range
type.
Again, it's generic over the
underlying bound type.
So in this case, we have a range
of doubles, and it merely stores
the lower and upper bounds.
That's fairly simple.
But, prior to Swift 4.2, you
would get from an integer range,
an actually different type.
This is the countableRange type.
Now, notice it's structurally
the same as the range type.
It has one type parameter.
It has lower and upperBound.
But it adds a couple additional
requirements onto that bound
type.
That the bound be stridable,
right?
Meaning you can walk through and
enumerate all the elements.
Now that's the ability you need
so that you can make
countableRange conform to
randomAccessCollection.
That enables the forEach, the
forEach iteration loop, and
other things.
But with conditional
conformance, of course, we can
do better.
So let's turn the basic range
type into a collection, when the
bound type conforms this-- has
these extra stridable
requirements on it.
It's a simple application of
conditional conformance.
But it makes the range type more
powerful when used with better
type parameters.
Now, notice that I'm just
conforming to
randomAccessCollection.
I have not actually mentioned
collection or
bidirectionalCollection.
With unconditional performances,
this is okay.
Declaring conformance to
randomAccessCollection implies
conformances to any protocols
that it inherits.
In this case,
bidirectionalCollection and
collection.
However, with conditional
conformance, this is actually an
error.
Now, if you think back to the
slice example, we needed to have
different constraints for those,
for those different levels of
the hierarchy for collection.
Versus bidirectionalCollection
versus randomAccessCollection.
And so, compiler's enforcing
that you've thought about this,
and made sure that you have the
right set of constraints for
conditional conformance.
In this case, the constraints
across the entire hierarchy are
the same.
So, we can just write out
explicitly collection and
bidirectionalCollection.
To assert that this is where all
these conformances are.
Or we can do the stylistically
better thing, and split out the
different conformances.
Now at this point, our range
type is pretty powerful.
It does everything the
countableRange does.
So what should we do with
countableRange?
We could throw it away.
In this case we're talking about
the standard library, and
there's a lot of code that
actually uses countableRange.
So we can keep it around as a
generic type alias.
This is a really nice solution.
So the generic type alias adds
all of those extra requirements
you need to make the range
countable.
The requirements you need to
turn it into a collection, but
it's just an alternate name for
the underlying range type.
Again, this is great for source
compatibility, because code can
still use countableRange.
On the other hand, it's also
really nice to give a name to
those ranges that have
additional capabilities of being
a randomAccessCollection.
In fact, we can use this to
clean up other code.
To say, well, we know what a
countableRange is.
It's a range with this extra
striding capability, so we can
go extend countableRanges.
And that is a case in which we
have randomAccessCollection
conformance.
So, we've introduced this in
Swift 4.2 to help simplify the
set of types that we're dealing
with.
And make the existing core types
like range more composable and
more flexible.
>> Recursive constraints
describe relationships among
protocols and their associated
types.
This is a topic that we didn't
cover in the WWDC version of
this talk.
But it's an important part of
the standard library's use of
Swift's generic system.
Let's jump right in.
A recursive constraint is
nothing more than a constraint
within a protocol that mentions
that same protocol.
Here, collection has an
associated type named
subsequence.
That is itself a collection.
Why would you need this?
Well, let's look at a generic
algorithm that relies on it.
So here, given an already sorted
collection.
We want to find the index at
which we should insert a new
value.
To maintain that sort order.
We're going to compute the
sorted insertion point for the
value 11.
When we go ahead and insert 11
at that index, the result is
still a sorted array.
The sorted insertion point of
function is implemented in terms
of a binary search.
Binary search is a classic
divide-and-conquer algorithm.
Meaning that at each step, it
makes a decision that allows it
to significantly reduce the
problem size.
For the next step to consider.
For binary search, we first look
at the middle element, 8.
And compare it against the value
that we want to insert.
That's 11.
And because 11 is greater than
8, we know that 11 needs to be
inserted somewhere after the 8.
In the latter half of the
collection.
So we restrict our search space
by half.
In our next step, we find the
new middle, 14, and compare it
against the value we want to
insert.
Eleven is less than 14, so the
insertion point has to come
before the middle.
Divide the remaining collection
in half again.
Continue dividing collection
we're looking at in half.
Until we're pointing at the
proper insertion point.
That's our solution.
Divide-and-conquer algorithms
like this are fantastic.
Because they're extremely fast.
Binary search takes logarithmic
time.
Which means that doubling your
input size doesn't make the
algorithm run twice as slowly
like it would for a linear
algorithm.
With a logarithmic algorithm
like binary search, it only has
to perform one additional step
to cut the problem size in half
again.
So let's turn that into code.
The first thing we need to do is
find the index of the middle
element.
Which we can do using
randomAccessCollections index
offset by a function.
Next, we check whether our value
comes before the middle element.
So we know which half of the
collection contains our
insertion point.
Now in our example, the value to
insert is greater than the
middle element.
So we take a slice of the
collection from the index after
the middle.
All the way until the end.
And recursively call sort and
insertion point of on that
slice.
This is common of
divide-and-conquer algorithms.
Where you reduce the problem
space and then recurse.
Now to make this work, we need
that slicing syntax.
Provide a suitable slice of a
collection.
We can do this for all
collections by introducing a
general operation that takes a
range of indices and produces a
slice.
Like this.
Now remember that the slice
adapter we discussed earlier
works on any collection.
Providing a view of the elements
of the underlying collection
that is itself a collection.
This makes our
divide-and-conquer algorithm
work for any collection.
As well as providing slicing
syntax to all collections.
That's great, but there's just
one problem.
Some collections don't want this
particular slice type.
They really want to provide
their own slicing operations
that produce a different type.
String is the most common
example.
When you slice a string, you get
back a substring.
And so if you apply our
divide-and-conquer algorithms to
the string collection.
You really want those to be in
terms of substring.
Rather than some other type like
the slice of a string.
Range is another interesting
example.
Because its slicing operation
returns an instance of the exact
same range type just with the
new bounds.
So to capture this variation
among different types that
conform to collection, we can
introduce new requirements into
the collection protocol.
Specifically for slicing.
So here we've pulled the slicing
subscript into the collection
protocol itself as a
requirement.
Now note that the result type of
this subscript is described by a
new associated type:
subsequence.
Now both string and range meet
these new requirements of
collection.
With string, the subsequence
type is substring.
For range, the subsequence type
is going to be the range itself.
Now this works well for, for
string and range.
But for all the other collection
types that don't want to
customize the actual subsequence
type.
We can provide default
limitations of slicing.
So the authors of these
collection type don't actually
have to do any extra work to
conform to collection.
They get all the slicing
behavior for free.
So we're going to start with
subsequence.
Associated types themselves can
have default values.
Written after the equals sign.
For subsequence, the slice
adaptor type is a perfect
default because it works for all
collections.
So this default will be used for
any conforming type that doesn't
provide its own subsequence
type.
This pairs well with the
implementation of the slicing
subscript we started with
earlier.
Written in extension on the
collection protocol.
And it can act as a default
implementation, providing the
slicing subscript operation that
returns a slice.
We can even go one step further
and limit the applicability of
our default slicing subscript
implementation to those cases
where we picked the default
subsequence type.
So this pattern prevents the
default implementation from
showing up as an overload on
collection types that have
customized their subsequence.
Like string and range.
So this is all great for
conforming types.
They get slicing for free, or
they can customize it if they
want to.
But remember our goal here.
We're trying to write our
divide-and-conquer algorithms
against the collection protocol.
So we have to answer one really
important question.
What does subsequence do?
All we know about subsequence
right now is that it's the
result type of the slicing
subscript operation.
But we need more to actually use
it.
So to answer this question, we
have to go back to the
algorithms that we want to write
in terms of subsequence.
Our algorithm is recursive.
It forms a slice which is now a
value of the subsequence type.
And then recursively calls sort
insertion point of on that
slice.
Now this only makes sense if the
subsequence type you get back is
itself a collection.
Now when it performs that call,
we're going to pass a value of
the collection's element type.
But the recursive call itself is
going to expect a value of this
subsequence's element type.
The only way this can possibly
make sense is if those element
types are identical.
Now the same issue comes up when
returning an index from the
recursive call.
Which is going to be computed in
terms of the subsequence.
But that index that's returned
also needs to be a valid index
for the current collection.
So we can capture all of these
requirements in the collection
protocol itself.
Now the first thing we want to
do is say that the subsequence
of a collection is itself a
collection.
This is a so-called recursive
constraint.
Because the associated type
conforms to its own enclosing
protocol.
We can then use associated type
where clauses to further
constrain our subsequence.
As we talked about earlier, it
has an element type.
And that element type needs to
be the same as that of the
original collection.
So we can express that here with
the same type constraint.
Subsequent element is the same
as element.
We can do exactly the same thing
for the index type.
Now, these cover all the
properties that we discovered by
looking at the implementation of
the sorted insertion point of
algorithm.
This leads us to an interesting
question.
Can you slice a subsequence?
Well, every subsequence is a
collection and every collection
has a slice operation.
So of course you can slice a
subsequence.
And the result is going to be a
subsequence of the subsequence.
Now you can do this again and
get a subsequence of a
subsequence of a subsequence.
And keep on going on and on and
on.
Now interestingly, at each point
we could have a brand-new type.
And so we have this potentially
infinite tower of types.
That's actually okay.
Each recursive step in our
generic algorithm could
conceivably create a new type.
Based on the current collection
type.
So long as the recursion
eventually terminates at
runtime, there's no problem with
this.
However, it's often the case
divide-and-conquer algorithms
can be implemented more
efficiently by making them
nonrecursive.
So here is the nonrecursive
implementation of the sort and
insertion point of algorithm.
We're going to walk through it.
Because the core algorithm is
the same.
But it's expressed iteratively
with this while loop rather than
recursively.
So the first thing we're going
to do is take a slice of the
whole collection.
This slice variable is going to
represent the part of the
collection that we're looking at
in each iteration.
And now we see the familiar
divide-and-conquer pattern.
Find the middle of the slice.
And then compare the value to
insert against the middle
element in the slice.
We then narrow the search base
by slicing the slice before we
go in loop again.
However, here we have a problem.
We're performing an assignment
to the slice variable.
Which is of the subsequence
type.
On the other hand, the
right-hand side is a slice of a
slice.
And as we talked about before,
this is a subsequence of the
subsequence and could be a
completely different type.
So we're going to get a compiler
error telling us that these two
types are not necessarily the
same.
That's really inconvenient here
because it prevents us from
writing this nonrecursive
algorithm.
And it doesn't really reflect
how specific collection types
behave.
Think about string.
If you slice a string, you get a
substring.
If you slice a substring, you
don't get a sub-substring.
You just get another instance of
the substring type.
So let's go back to how this
slice adapter works to
generalize this notion.
We have a collection.
We're going to call it Self,
that we've sliced from I to J.
Now that's going to build
something of the type slice of
Self.
Which is just a view on the
underlying Self collection.
If we then slice the slice, we
get a slice of slice of self.
Which is a view of a view on
that same underlying Self
collection.
So this is our infinite tower of
types in practice.
However, it doesn't have to be
this way.
Remember that the slice types
use the same indices as their
underlying collection.
And they also know about their
underlying basic collection.
So when we slice the slice, we
can take those new indices, I2
and J2.
Bring them back to the original
base collection and form the new
slice from there.
And what this does is it means
that when you slice a slice, you
get something back of the same
slice type.
Effectively tying off the
recursion.
This is exactly the same
behavior we saw with substring.
And it's completely reasonable
to expect that all subsequence
types behave in this way.
So let's model it as an explicit
part of the collection's
protocol requirements.
So here we're saying that the
subsequence of a subsequence is
the same type as the
subsequence.
In other words, when you slice a
slice, you get back the same
slice.
This makes our nonrecursive
divide-and-conquer algorithm
work.
And simplifies the use of the
collection protocol.
There's no more need to reason
about infinite tower of types.
Now there's one last issue here
involving subsequence.
We've said it's required to be a
collection.
But we need the subsequence type
to be a random access collection
to perform this index offset by
operation.
To describe this, we can use
protocol where clauses.
So when bidirectionalCollection
inherits from collection.
It can add a new constraint on
subsequence, requiring it to
conform to that
bidirectionalCollection
protocol.
This again is a recursive
constraint but now it's
expressed on the
bidirectionalCollection
protocol.
We can do the exact same thing
for randomAccessCollections.
Such as the subsequence of a
random access collection, itself
conforms to
randomAccessCollection.
Note how the constraints on
subsequence follow the enclosing
protocol.
This might sound a little bit
familiar.
Both recursive constraints and
conditional conformance tend to
track the protocol hierarchy
like this.
And the features support each
other.
This is particularly important
because we want the default
associated type for subsequence,
the slice of Self, to work at
every level of the collection
hierarchy.
Slice is always a collection.
When we go ahead and create the
bidirectionalCollection
protocol.
It now requires that the
subsequence type also conform to
bidirectionalCollection.
The slice adapter's conditional
conformance to
bidirectionalCollection which
kicks in anytime itself is known
to be a bidirectionalCollection.
Satisfies that requirement.
RandomAccessCollection works the
same way.
Subsequence gains a
randomAccessCollection
requirement.
And slices conditional
conformance to
randomAccessCollection satisfies
that requirement.
Now itself is known to be a
randomAccessCollection.
This behavior where an
associated type default works
for every protocol within the
hierarchy is a good indicator of
a cohesive design.
If you find yourself needing
different associated type
defaults at different points in
the collection hierarchy.
You might have a problem with
your design.
Recursive restraints are a
powerful tool.
Used with associated type and
protocol where clauses, they
help us write the protocol
requirements we need to express
divide-and-conquer algorithms
naturally in generic code.
And now we return to the final
portion of the WWDC talk.
So, Swift is a multi-paradigm
language.
We've been talking exclusively
about generics right now.
But of course, Swift also
supports object-oriented
programming.
And so, I'd like to take a few
moments to talk about the
interaction between those two
features.
How they work together in the
Swift language.
So with class inheritance, we
know how class inheritance
works.
It's fairly simple.
You can declare a superclass,
like Vehicle.
You can declare some subclasses,
like Taxi and PoliceCar that
both inherit from Vehicle.
And, once you do this, you have
this object-oriented hierarchy.
You have some expectations about
where you can use those
subclasses.
So if I were to extend Vehicle
with a new method, to go let it
drive, I fully expect that I can
call that method on one of my
subclasses, Taxi.
So, this is a fundamental aspect
of object-oriented programming.
And, Barbara Liskov, actually
described this really well in a
lecture back in the '80s.
Since then, we've referred to
this as the Liskov substitution
principle.
And, the idea's actually fairly
simple.
So, if you have someplace in
your program that refers to a
supertype, or superclass, like
Vehicle.
You should be able to take an
instance of any of its subtypes,
or subclasses, like Taxi or
PoliceCar, and use that instead.
And the program should still
continue to type check and run
correctly.
So, the substitution here is an
instance of a subclass should be
able to go in anywhere that the
superclass was expected and
tested.
And, this is a really simple
principle.
We've all internalized it, but
it's also really powerful.
If you think about it.
And at any point in your program
think well, what happens if I
get a different subclass, maybe
a subclass I haven't thought
about here.
So, getting back to generics,
what are our expectations when
applying Liskov substitution
principle to the generic system?
Well, maybe we add a new
protocol, Drivable.
Whatever. And extend Vehicle to
make it Drivable.
What do we expect to happen?
Well, we expect that you can use
that protocol, conformance of
Vehicle to Drivable, for some of
its subclasses as well.
Say, you add a simple generic
algorithm to the Drivable
protocol to go for a
sundayDrive.
Well, now you should be able to
use that API on a PoliceCar,
even if that might not be the
best idea.
So, the protocol conformance
here is effectively being
inherited by subclasses.
And this puts a constraint on
the conformance.
The one conformance that you
write, the thing that makes
Vehicle Drivable.
Has to work for all of the
subclasses of Vehicle now and
anyone that comes up with it
later.
Most of the time, that just
works.
However, there are some cases
where this actually adds new
requirements on the subclasses.
The most common one is when
dealing with initializer
requirements.
So, if you've looked at the
decodable protocol, it has one
interesting requirement.
Which is the initializer
requirement to create a new
instance of the conforming type
from a decoder.
How do we use this?
Well, let's go add a convenience
method to the decodable
protocol.
It's a static method decode that
creates a new instance from a
decoder, essentially a wrapper
for the initializer, making it
easier to use.
And, there's two interesting
things to notice about this
particular method.
First, is it returns Self with a
capital S.
Remember this is the conforming
type.
It's the same type that you're
calling the static method on.
Now, the second interesting
thing is, how are we
implementing this?
Well, we're calling to that
initializer above to create a
brand-new instance of whatever
decodable type we have, and then
return it.
Fair enough.
We can go ahead and make our
Vehicle type Decodable.
And then, what we expect, when
applying the Liskov substitution
principle, is we can use any
subclass of Vehicle.
With these new API's that we've
built through the protocol
conformance.
So, we can call Decode on a
Taxi.
And what we get back is not a
Vehicle not some arbitrary
Vehicle instance, but the Taxi,
an instance of Taxi.
This is great, but how does it
work?
So let's take a look at what
Taxi might have.
Maybe there's an hourly rate
here, and when we call
Taxi.decode from, we're going
through the protocol, going
through the protocol initializer
requirement.
There's only one initializer
this can actually call, and
that's the initializer that's
declared inside the Vehicle
class, in the superclass here.
So that initializer, it knows
how to decode all of the state
of a Vehicle.
But it knows nothing about the
Taxi subclass.
And so, if we were to use this
initializer directly, we would
actually have a problem that the
hourly rate would be completely
uninitialized, which could lead
to some rather unfortunate
misunderstandings when you get
your bill at the end.
So, how do we address this?
Well, it turns out Swift doesn't
let you get into this problem.
It's going to diagnose at the
point where you try to make
Vehicle conform to the decodable
protocol that there's actually a
problem with this initializer.
It needs to be marked required.
Now, a required initializer has
to be implemented in all
subclasses.
Not just the direct subclasses,
but any subclasses of those, any
future subclasses you don't know
about now.
Now by adding that requirement,
it means that when Taxi inherits
from Vehicle, it also needs to
introduce an initializer with
the same name.
Now, this is important because
this initializer's responsible
for decoding the hourly rate.
And then chaining up to the
superclass initializer to decode
the rest of the Vehicle type.
Okay. Now, if you're reading
those red boxes really quickly,
you may have noticed the
subphrase non-final.
So, by definition, final classes
have no subclasses.
So, it essentially exempts them
from being substituted later on.
That means that there's no sense
in having a required initializer
because you know there are no
subclasses.
And so final classes are in a
sense a little easier to work
with when dealing with things
like decodable or other
initializer requirements.
Because they're exempt from
these rules of having required
initializers.
So when you're using classes,
for reference semantics,
consider using final when you no
longer need to customize your
class through the inheritance
mechanism.
Now, this doesn't mean that you
can't customize your class
later.
You can still write an extension
on it.
The same way you can extend a
struct or an enum.
You can also add conformances to
it, to get more dynamic
dispatch.
But final can simplify the
interaction with the generic
system, and also unlock
optimization opportunities for
the compiler in runtime.
So we've talked a bit about
Swift generics today.
The idea behind Swift generics
is to provide the ability to
reuse code while maintaining
static type information.
To make it easier to write
correct programs, and compile
those down into efficient,
efficiently executing programs.
When you're designing protocols,
let this push and pull between
the generic algorithms you want
to write against a protocol.
And the conforming types that
need to implement that protocol
guide your design to meaningful
extractions.
Introduce protocol inheritance
when you need some more
specialized capabilities to
implement new generic algorithms
that are only supportable on a
subset of the conforming types.
And, conditional conformance
when you're writing generic
types, so that they can compose
nicely, especially when working
with protocol hierarchies.
And finally, when you're
reasoning about the tricky
interaction between class
inheritance and the generic
system.
Go back to the Liskov
substitution principle, and
think about what happens here if
I introduce a subclass rather
than a superclass at which I
wrote the conformance.
Well, thank you very much.
There's a couple of related
sessions on embracing algorithms
and understanding how they can
help you build better code.
As well as using Swift
collections effectively in your
everyday programming.
Thank you.
[ Applause ]