WWDC2015 Session 719

Transcript

X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
[Applause]
>> PRABHAKAR LAKHERA:
Thank you and good morning.
Welcome to Your App and Next
Generation Network session.
I am Prabhakar Lakhera and
with me I have my colleague
Stuart Cheshire.
And this session
is in two parts.
For the first topic I
will talk about IPv6,
and for the second topic
Stuart will talk about how
to make your applications
run faster.
We will start off
with IPv6 first.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
We will start off
with IPv6 first.
Now, what's new in IPv6?
IPv6 RFC was published
almost 17 years back.
So you must be wondering, why
are we talking about IPv6 now?
We are seeing more and more of
IPv6 deployment in enterprise
and cellular networks.
And you want to make sure
that your applications
work in those networks.
That is the reason we will also
be mandating your applications
to be IPv6 compliant.
Now, we will talk more about
that and what it means to you
as developers, but
before we do that,
let's begin with
a bit of history.
Now, a long, long time ago,
client devices had real
and unique IPv4 addresses;
these were the good olden days
and you had end-to-end
network connectivity.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
and you had end-to-end
network connectivity.
However, we soon realized
that we were running
out of IPv4 addresses
way too fast.
So we added a NAT in the middle.
Now, this works, but the
larger scale NAT device is both
expensive and fragile.
So carriers are now deploying
IPv6 in their network.
Now, with this, they again have
end-to-end network connectivity
and there's no translation
needed in the data path.
I will now show you how
IPv6 deployment looks
for three major cellular
carriers in the USA.
Now, two things are
obvious here.
One, all the lines are going up.
And second, more than half
of the subscribers
are now connecting
to cellular data
networks over IPv6.
So that's great, right?
Turns out it's actually worse
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
Turns out it's actually worse
for cellular carriers
than it was before.
And the reason is now they are
having to support both IPv4
as well as IPv6 in
their network.
So what they really want
to do is to drop IPv4
from their access network.
Now, when you do that,
you lose connectivity
to the IPv4-only
part of the Internet,
which is still in majority.
So now they have deployed DNS64
and NAT64 in their network,
and the way it works
is when the application
on the client device
makes a hostname query
to get the IPv6 address
for an IPv4-only server,
DNS64 and the network
synthesizes an IPv6 address
and gives it back to
the client device.
Now that the client device has
this IPv6 address to work with,
it can start writing
traffic to the network.
The network itself is
configured in a way
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
that search packets get
shorted to the NAT64 engine,
which then translates
IPv6 traffic to IPv4
and vice versa on the way back.
Now, the important
thing to note here is
that to the applications
running on the client device,
your IPv4-only server looks
like an IPv6-only server.
And this is important
because till now,
some of you in the room
may have been thinking
that my server is only
configured for IPv4,
so I do not need to test
for a client accessing
it over an IPv6 network.
Your assumption just got broken.
Now the transition to this type
of network will happen very
soon, and when it happens,
we want it to be an
absolutely seamless experience
for our consumers.
And that is the reason your
app has to be IPv6 compliant,
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
and this will be an app
submission requirement.
So great, we have a new app
submission requirement, right?
And you must be wondering
how do I test my application
for this network?
Where will I find this
NAT64 type of network?
I have a great news
for all of you.
Starting today, with just
your Mac devices on top
of IPv4 connectivity, you can
create your very own NAT64
networks and start
testing your applications.
Now, this feature is meant
to be used by the developers,
so it's somewhat hidden, and to
make it visible, all you have
to do is to Option-click on
Sharing, then Option-click
on Internet Sharing, and now you
will see everything looks just
the same.
But there's a new
checkbox here that says
"Create NAT64 Network."
So you check that, you
choose your interfaces
for Internet sharing, and now
you can host this NAT64 network
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
for Internet sharing, and now
you can host this NAT64 network
and start testing
your applications.
Now, for the example here,
I have my IPv4 connectivity
on the Internet and
I'm sharing this
as an IPv6-only access
network with NAT64/DNS64
on my Wi-Fi interface.
So when I start it, I see that
the Wi-Fi icon has grayed out
and it has an arrow pointing up.
That means now the
Wi-Fi interface is
in access point mode.
What that really means is now
it's hosting a Wi-Fi hotspot,
and you can connect your
other client devices
and start testing
your applications.
Now, a typical test bed
will look like this.
Now, here I have my IPv4
Internet connectivity
on the WAN side, my iMac has
the DNS64/NAT64 engine running
on it, and it is hosting an IPv6
network on the Wi-Fi interface.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
Now the application that you
want to test is either installed
in one of the client machines,
or you could very
well be testing it
on a simulator that's running
on another Mac device
that's a client
of this Internet
sharing environment.
So now that we have made testing
easy for you, for this type
of network, what we
really want you to do is
to make sure it is part of
your development process.
That is, any time you are
writing a new application,
or you are writing an
update for your application,
make sure you are testing
for NAT64 network
environment before you submit
your applications.
Now, the good news is for
almost 70% of you, you will see
that your applications
are working just fine.
And that's great, right?
Just keep testing
your applications,
release after release and make
sure there's no regression.
But for almost a third
of you, you will see
that either your application
is severely limited
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
that either your application
is severely limited
in a NAT64 network environment
or it does not work at all.
Now, fortunately, most of
the issues are simple to fix,
and here's a sample list.
Now, if you are using only
IPv4-only data structures
or IPv4-only APIs, or
you are using an API
that supports both
IPv4 and IPv6,
but maybe you're passing
an argument that says
that only get me
results for IPv4,
all of these things will make
your application IPv4 only.
And what that means
is it will not work
in an IPv6-only access
network environment.
Now, there's another interesting
thing that we have seen
with some of these
applications that do not work,
that sometimes there's a
preflight check that checks
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
that sometimes there's a
preflight check that checks
for IPv4 connectivity even
before attempting a connection.
And that is the reason sometimes
you get errors like this.
So in this case, my iPhone
was actually connected
to the NAT64 network I had
just created on my Mac device,
and I could browse
Internet with Safari,
I could stream videos, music...
I had my Internet connectivity,
but for some reason this
application thinks I have no
Internet connectivity
whatsoever.
And if you read the
error message it says
that my device is in
airplane mode, but if you look
at the top bar, that's
not the case, right?
So what's going on here?
It's precisely the application
making a preflight check
to check for IPv4 connectivity,
and if you remember,
in IPv6-only access
network like NAT64 network,
you do not have IPv4
connectivity;
the entire world looks like
an IPv6-only world to you.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
the entire world looks like
an IPv6-only world to you.
Even the IPv4-only server
looks like an IPv6-only server.
So if you do this
check, it will fail.
Now, in this case,
application is saying retry.
So I retried and it gave
me the same error message,
and then I retried again,
and the same error message.
And it never went off,
and I was just stuck
with this application.
So, what are the
recommendations?
Well, just attempt a connection.
Right? If it connects,
that's great.
If it does not, then
handle that case gracefully.
The second recommendation
would be
to use higher networking
frameworks like NSURLSession
or CFNetwork API, and
the reason is networking
with multihome devices
like iPhone
and Mac devices can
be somewhat complex.
Like for iPhone, you
have Wi-Fi interface
and you have cellular interface.
And with Mac devices, you
may have multiple Ethernet
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
And with Mac devices, you
may have multiple Ethernet
interfaces, and you have
a Wi-Fi interface also,
and at a given time, you
may have different kinds
of connectivity to
all these interfaces.
Now, which interface you use
and how to connect which type
of connectivity do you use
for a given destination,
writing that code yourself
can be a lot tedious.
So please use higher
networking frameworks.
It will make your application
code a lot cleaner and simpler.
Now if for some reason you
cannot use a higher networking
framework and if you are
having to work with sockets,
we will definitely recommend you
to read RFC 4038, and it talks
about in great length how
to write your applications
in a way that's address-family
agnostic.
Now, our final recommendation
would be to use hostnames
when possible and not to
use IP address literals,
whether IPv4 or IPv6.
When you are writing
your very own protocol,
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
When you are writing
your very own protocol,
a proprietary protocol, or
when you are writing your
application, make sure you are
not using IP address literals
but preferring hostnames.
And the reason is, remember,
in a NAT64/DNS64
network environment,
the client device first has
to make a DNS query, right?
To get the IPv6 address
for the IPv4 server.
So you have to work
with the hostname.
If you are working with
an IPv4 address literal,
the client device will
not make that DNS query,
and the DNS64 network will not
synthesize the IPv6 address
for you.
So, given that, we do understand
that sometimes there's no way
to avoid working with an
IPv4 address, and an example
of that would be Safari.
So you go to a web page,
it loads up just fine.
But within the web page,
there may be some other links,
and some of those links may
have IPv4 addresses embedded.
Now, before today, on
Safari if you clicked on one
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
Now, before today, on
Safari if you clicked on one
such link, it would not load up.
Starting iOS 9 and OS X 10.11,
not only Safari but any user
of NSURLSession or CFNetwork
API will be able to work even
with IPv4 address literals
in a NAT64/DNS64 network.
The way it works is when
you are using hostnames,
the DNS64 in the network is
synthesizing IPv6 addresses
for you.
But when you are working
with an IPv4 address literal
and using one of the
higher networking API,
the OS is discovering what
the network would have done
and how it would have
synthesized an IPv6 address
for you, and it will
do that locally.
So this is another reason
why you should be working
with higher network frameworks.
So please do that.
Now, with these data points
and with the new tool,
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
Now, with these data points
and with the new tool,
we really hope that you will
be able to find and fix issues
with your applications.
Now, what we want you to do
is after this session is over,
install the seed build
on your Mac devices
and start creating your
very own NAT64 networks,
and then use them to
test your applications.
Now, remember this will be an
app submission requirement.
So please take this message to
other developers who are not
in this session and also
take this message back
to your company and make sure
you are doing NAT64 testing
for your applications.
Now, with that, I now invite
Stuart Cheshire to talk
about other networking
features we have for you
that make your applications
faster
and be more responsive
to the users.
Stuart?
[Applause]
>> STUART CHESHIRE:
Thank you, Prabhakar.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
What I want to talk about now
is making your applications
run faster.
In the last couple of decades,
we have seen a phenomenal
increase in network throughput.
I remember a time when the
56-kilobit modem was the latest
technology; now we
live in a world
where 50 megabits per
second is quite commonplace.
But things don't feel a
thousand times faster.
We still spend a lot
of time sitting waiting
for a web page to load.
And why is that?
It's because our industry
has put a huge focus
on increasing throughput
and it has sorely neglected
the other sources of delay.
The speed of light
hasn't gotten any faster
and we can't do anything about
that, but there are other areas
of delay that we can
fix and it's time
for us to start doing that.
So that's what I'm going
to talk about today.
I'm going to talk about
four sources of delay
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
I'm going to talk about
four sources of delay
when users are using
your applications.
The first one is the
delay that happens
when you have weak
Wi-Fi connectivity,
and connection attempts
are not succeeding.
The second area is a technique
called Explicit Congestion
Notification, which along
with smart queuing reduces
delays in the network.
The TCP NOTSENT Low-Water
Mark option reduces delays
in the sending machine, and
then we are going to finish
up with a sneak peek
of an exciting new technology
called TCP Fast Open.
So let's start off with
reliable network fallback.
I'm sure everybody in this
room has had the experience
at the end of the
day: They leave work.
They go out to their car.
They pull out their phone.
You want to check maps, weather
forecast, email, whatever,
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
You want to check maps, weather
forecast, email, whatever,
and you are staring at the
phone and it's not loading
and it's not loading and
you are walking to your car,
and it's still not loading
and you get frustrated,
you go into Settings,
you turn Wi-Fi off.
Now you're on LTE and bam!
The page loads.
And then you forget
to turn Wi-Fi back on,
and a week later you've got
a huge cellular data bill
that you didn't want.
Well, that's not a
good user experience.
What we are doing now is we
have some intelligent logic
about doing parallel
connections.
So if your iPhone
thinks it's on Wi-Fi,
but the TCP connection setup
attempt is not succeeding,
then very rapidly, it will
initiate a second parallel
connection over cellular data.
Now, it won't kill
the Wi-Fi connection.
It won't give up on it.
It will let that one continue
to run in parallel and if
that one completes
first, that's great.
You have a connection
over Wi-Fi.
But if it doesn't, and the
cellular connection completes
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
But if it doesn't, and the
cellular connection completes
first, then that's the
connection your application will
get with a delay so short, the
user won't notice anything odd.
Of course, we only
use this for apps
that are allowed to
use cellular data.
If the user has gone
into the settings
and turned off mobile
data for that app,
we won't do the fallback.
And if we do fall back, then
we hide the Wi-Fi indicator,
so now the user knows they
are not on Wi-Fi anymore.
This is something
that you get for free,
as long as you are using
the higher layer APIs,
and you should see no difference
in your applications apart
from a better user experience.
There's one additional
thing you can do,
and that is when you are
running over cellular,
whether you started on cellular
or whether you fell back
to cellular, doesn't matter.
The user can walk back in range
of Wi-Fi and at that point,
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
The user can walk back in range
of Wi-Fi and at that point,
if you pay attention to the
Better Route notification,
you can then decide
what to do about that.
You may want to tear down
the connections you have
and reconnect over Wi-Fi,
or if you are 99% of the way
through sending an email,
you may want to just let
that complete, but
by paying attention
to the Better Route
notification,
you can make an intelligent
decision
that minimizes the
user's cellular data bill.
My next topic is
delays in the network.
And this is something that came
out of work I did on Apple TV.
We are working on trying to
make Apple TV more responsive
and understanding where
the delays came from.
And I expect everybody
in this room has heard
about bufferbloat.
I did some experiments, and
I want to share the results
of those experiments with you
so you can understand too how
critically important it is
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
so you can understand too how
critically important it is
for all of our applications
and products
that we fix bufferbloat
in the network.
I tested a simulated
network environment
with a 10-megabit downstream
connection, which is plenty
for watching streaming
video, and first I'm going
to show you the results using
a representative network setup,
a simpleminded first-in,
first-out queue
where packets are buffered
until the queue is full
and can't hold anymore, and then
the new arrivals get dropped.
This is very typical of
consumer home gateways today.
And then I will show you a
comparison using smarter queuing
and ECN.
I'm going to show you some
plots generated using tcptrace.
I expect many people in this
room have used tcptrace,
but if you haven't, I
strongly urge you to go
to TCPtrace.org and download it.
If you are working
on networking code,
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
If you are working
on networking code,
and you are not using tcptrace
to look at your packets,
then you have no way to
really know what is going on
or understand the
performance characteristics
of your app and your protocol.
When we write apps, we pay
attention to the memory usage.
We profile the code to find
out what's taking the CPU
and then we optimize
the code that needs it
to improve CPU efficiency
and improve battery life.
To do those things, to care
that much about CPU and memory,
but neglect the networking
part doesn't make any sense.
And tcptrace is the tool
that lets you do similar kind
of profiling and analysis
of your network traffic.
Here's a TCP trace of
the first 10 seconds
of some streaming video.
For those of you who haven't
looked at tcptrace before,
I will give a quick overview.
The little white lines
represent data packets.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
The little white lines
represent data packets.
The horizontal position of that
white line tells you the moment
in time where the
packet was captured.
The height of the white line
tells you how many bytes
of payload are in the packets,
and the vertical position
of where that line appears
tells you where those bytes fall
within the overall logical
TCP sequence number space.
So here we can see a stream of
packets being sent out in order,
spaced a few microseconds apart.
One round trip later, we
get the acknowledgment back
from the receiver
saying it was received.
And that green line is the
cumulative acknowledgment line.
Everything up to and including
the green line has been
acknowledged by the receiver.
So we should never see any white
packets below the green line.
That would indicate a bug,
and we don't see any white
packets below the line,
so that's good.
The yellow line indicates
the receive window.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
The yellow line indicates
the receive window.
When you open up
a TCP connection,
the receiver indicates how
much RAM it has put aside
for your data, and you should
not exceed the amount of RAM
that you have been allocated.
If we see any white packets
above the yellow line,
then that would be
a bug, and we don't.
So that's good.
This looks look a
relatively nice,
straight line data transfer.
The slope of that curve is
exactly 10 megabits per second,
which is what we expect,
but every few seconds,
we see something
like this going on.
So let's zoom in and
take a closer look
at what's going on down there.
There is so much information
on these TCP trace plots,
I could spend an hour just
talking about this one slide.
But we don't have time for that,
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
But we don't have time for that,
so I will just cover some
highlights that jump out just
with a glance at this plot.
One is the white packet
line is kind of pulling away
from the green ack line.
What that means is the rate
that we are injecting data
into the network is faster
than the rate data is coming
out of the other side
and being acknowledged.
Well, if we are putting
it in faster
than we are pulling it
out, something has got
to be going somewhere.
It's sitting in buffers and
we can see that the amount
of stale data sitting in buffers
in the network is growing.
And because the amount
of buffering is growing,
that means the round trip delay
between when a packet is sent
and when it's acknowledged
is getting longer.
When we have so much buffered
that the gateway
can't buffer anymore,
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
that the gateway
can't buffer anymore,
we start losing packets.
And then this mess happens, and
it really is quite a big mess,
because packets are coming into
the tail of the queue faster
than the queue is draining, and
we get a packet, we lose it,
we get another one, we lose it.
The queue drains a bit, we
get a packet, we accept it.
So, at the tail end of
the queue, it's carnage.
It's get a packet,
lose one, lose one,
lose one, get one, lose one.
But over at the front
of the queue,
we've got 200 packets queued up,
in order, they are neatly going
out over that 10-megabit
bottleneck link,
in order, no gaps, no problem.
It's only after that
entire queue has drained
that we actually
witness the results
of that packet loss
carnage at the receiver;
that gets reflected back to the
sender in the selective acks
and the recovery starts.
So this is a big mess.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
Because of the way
networking APIs work,
data has to be delivered
in order.
If you lose one packet, then
all the packets that arrive
after it get delayed
in the kernel
until the gap has
been filled in.
Now, there are good
reasons for this.
People have talked many times
about out-of-order delivery,
but it turns out with almost
all applications it's hard
to use data out of order.
If you are trying to decode
H.264 video, having frames
that depend on an I-frame you
don't have is not helpful.
So it turns out that in-order
delivery really is the model
that applications want.
Because of that in-order
delivery,
we get these long plateaus
where no data is delivered.
And for the Apple TV,
video playback process,
that equals a period
of starvation
where it's getting no data.
And because we don't
want the video to freeze,
that's why all streaming
video applications need a
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
that's why all streaming
video applications need a
playback buffer.
And having a big
playback buffer is why
when you watch a
streaming video,
you see that spinning wheel
saying buffering, buffering,
buffering, because it has to
fill up the playback buffer
so that it can weather the
storms when no data arrives
for a long period of time.
When the missing packet
arrives, we then fill in the gap
and deliver it all at once.
That puts an excessive burden
on the network receiving thread
which takes away CPU time from
the video playback threads,
and that results
in the stuttering
of the smooth video playback.
So that's bad.
So this uneven delivery in the
network has bad consequences
for a device like Apple
TV; when we are trying
to make a very affordable,
cost-effective device,
these long plateaus
of starvation equate
to needing more RAM in the
device for more buffering
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
to needing more RAM in the
device for more buffering
and slower video start-up
and a poor user experience.
And these spikes in delivery
result in needing a faster CPU
in the device than we
otherwise might have needed,
which pushes the price up.
So this uneven delivery is very
damaging for streaming video.
One interesting thing
to note, though:
If you visually track the
slope of the yellow ack line --
the yellow window line
and the green ack line,
you'll see that at the end of
the trace, they pretty much end
up back where they should have
been, if no loss had happened
because TCP does do an
awesome job of filling
in exactly what needs to be
retransmitted, exactly once,
and not retransmitting anything
that didn't need to
be retransmitted.
So it gets back to where
it should have been.
If you measure this
network using Iperf and look
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
If you measure this
network using Iperf and look
at the number that comes out,
it will tell you 10
megabits per second,
and you will say thumbs up, my
network is working perfectly.
But distilling all
of this information
down to a single figure
loses all the subtlety
of what's really going
on on the network.
So now that I understood
what was going on
and causing the sluggish
performance,
I decided to experiment
with a smarter network.
For this experiment, I used a
smart cueing algorithm called
CoDel, which is short
for Controlled Delay.
The way it works is
instead of filling the queue
until it overflows and loses
data, it monitors the state
of the queue and as soon
as a standing queue starts
to build up, then it considers
that to be a sign of congestion.
And when I say congestion, a
lot of people think I'm talking
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
And when I say congestion, a
lot of people think I'm talking
about something that
happens rarely
at peak times, and it's not.
It's important to
understand that in networking,
congestion is what
happens all the time.
It's the steady state
of the network.
It's the job of any
transport protocol like TCP
to maximize its use
of the network,
to find out how much
the network can carry
and make the best use of that.
And the way a transport
protocol does
that is it sends data faster
and faster and faster.
It keeps probing and it
keeps trying a bit more,
until it loses a
packet, and then it knows
that was too much
and it backs off.
So it's constantly doing
this hunting to track
to find the right rate and what
that means is it's
always pushing the network
into congestion, and
then backing off,
and then congestion
and backing off.
What CoDel does is not wait
until things have gotten really,
really bad before it
signals congestion.
As soon as the first
sign of it starts
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
As soon as the first
sign of it starts
to happen, it tells the sender.
The other thing that I did for
this experiment was instead
of indicating congestion
by losing packets,
which requires a retransmission,
we used a new technology
called Explicit Congestion
Notification, and that way
the smart queuing algorithm,
instead of dropping the
packet, it sets a bit
in the IP header saying
congestion experienced.
That is echoed back to
the sender and it responds
by slowing down without
the destructive effects
of a packet loss.
So this is our graph of the
same data transfer, using CoDel
and ECN, and if I zoom in to
the same part we were looking
at before, you can
see an absolutely
phenomenal difference.
When I was doing
these experiments,
I had planned a week to do
the work and gather the data,
and I was finished
after two hours.
I did one plot with the
standard configuration
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
I did one plot with the
standard configuration
and one plot with CoDel.
This is my first experimental
run that I'm showing you.
I was expecting to have to
tweak parameters and retry
and rerun the experiment.
No. This is -- the
difference is that obvious,
that one trial was all it took.
I'm hearing applause.
Thank you.
[Applause]
We have no plateaus
of starvation.
We have no spikes
of peak delivery.
Every time the slightest
hint of a queue builds up,
we get these little
polite nudges
from the CoDel algorithm
saying slow down.
The CWR on that plot is
Congestion Window Reduced.
That's TCP's acknowledgment
saying message received
and understood; I
have slowed down.
Absolutely wonderful!
So, simple summary, CoDel
and other smart cueing
algorithms are great.
ECN are great.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
Put them together,
it's totally awesome.
So if it's so great,
where is it?
Well, historically, packet loss
and retransmission has
not caused big problems
for the traditional
networking applications
like file transfer
and sending email.
When you transfer a
file, the transport layer
in principle could send
the first packet last
and the last packet first and
all the ones in the middle
in a random order; as
long as they all get there
and they are reassembled
in the correct order
you have your file,
and that's all you care about.
But when you are watching
streaming video, you don't want
to see the end first,
and the start last.
You want it in order.
So in order delivery has become
a much more pressing problem,
now that we are using the
Internet for streaming video.
One way I characterize
this is that we used
to have applications like
clicking send on an email,
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
to have applications like
clicking send on an email,
where you have a
predetermined amount of data
and how long it takes
to send it is variable.
Basically, the time you'd
like the network to take
to send your email is
as little as possible.
There's really no such thing
as sending an email too fast.
So the time is variable.
You would like it to be fast.
Now we have applications
where you are watching
a two-hour movie,
streaming over the Internet.
It doesn't help to watch it in
half an hour or in eight hours.
It has to take two hours.
So now we have adaptive
applications
where the time is fixed
but the amount of data
that could be sent in
that time has to adjust
to accommodate the
network conditions.
Where are we now?
Well, amazingly, because
it is in Linux and turned
on by default, more than half
of the top million web servers
in the world already
support ECN,
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
in the world already
support ECN,
which is phenomenal adoption
for a technology
that no one is using.
[Laughter]
Clients. Well, the clients are
not asking for ECN connections.
They are not requesting ECN
because pretty much none
of the Internet supports
ECN marking.
So if you turn on that option,
there might be some risk
of exposing bugs and
there's no immediate benefit.
What are the routers doing?
Well, none of the routers are
doing marking because none
of the clients are asking for
it, so why put engineering
into something that might have
risk that no one is asking for?
Well, I'm happy to
announce today
that Apple is taking
the initiative
to break this log jam.
In the seeds that you all have,
ECN is now turned on by default
for all TCP connections
for all applications.
[Applause]
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
[Applause]
We're not expecting to see
any problems; in our testing,
everything has gone smoothly.
I have been running it on my
own laptop for a long time;
of course, we want to
hear your experience.
Please take the seed builds, run
them on your networks at home,
at work, in your hotel, at
the airport, and as usual,
if you find any bugs,
please report them to Apple.
If we are successful, in a few
months' time we could have a
billion devices running ECN, and
that should be enough incentive
for the ISPs to start
offering that service.
Now we are going to move
on from network delays
to end system delays.
Like many advances in
technology, this was borne
out of a personal
pain point for me.
I use screen sharing to
connect to my Mac at home,
and it's absolutely wonderful.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
and it's absolutely wonderful.
Being able to control
it remotely,
being able to access data on it,
being able to start a long video
transcode going that I want
to have finished
when I get back home:
these are all wonderful things.
And at the time I was
doing these experiments,
I had a fairly slow DSL line.
And, of course, DSL
is asymmetric.
It's typically ten times faster
in the downstream direction
than the upstream direction,
and when you are doing screen
sharing, the data is coming
in the wrong direction
in that sense.
So it's kind of to be
expected it will be a bit slow.
It's like the famous joke
about the dancing bear:
when you see a dancing
bear, you are not supposed
to be impressed that
it dances well.
You are supposed to be
impressed that it dances at all.
So for many years, like many
of us, I kind of suffered
with this painful, barely usable
experience, and I would find
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
with this painful, barely usable
experience, and I would find
that when I clicked on a menu,
it took three or four seconds
for the menu to appear,
and after a few minutes
of using the computer this way,
it's really, really frustrating.
There were times it felt
like it would be quicker
to drive home and do it.
And I had been working
on bufferbloat
and excessive queuing
in the network.
So naturally, that was
the first thing I blamed,
and I started digging in.
I started investigating.
I was all ready to be
thoroughly indignant
about this stupid DSL
modem that had all
of this excessive
bufferbloat in it.
And I pinged the machine,
and the ping time
was 35 milliseconds,
but when I click the mouse,
it takes 3 seconds
for a menu to appear.
So now I'm realizing this is
not as obvious as I thought.
Where is the delay coming from?
Well, I did some investigation.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
The default socket send buffer
at that time was 120 kilobytes,
my throughput was about
50 kilobytes a second,
so that's about 2.5 seconds,
which was about the
delay I was seeing.
Now, the socket send buffer
serves a very important purpose.
When we use a transport
protocol like TCP,
if it just sent one packet and
waited for the acknowledgment
and one packet and waited
for the acknowledgment,
we would get terrible
performance.
We need multiple packets in
flight; we need enough packets
in flight to fill the bandwidth
delay product of that path
to the destination and
back, and those packets have
to be buffered so
that if they are lost,
they can be retransmitted.
That's all good and
useful and necessary
to maximize the throughput
of that connection.
But any excess buffering
above that requirement just
adds delay for no benefit.
It doesn't help the throughput.
It just adds delay.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
It just adds delay.
And we end up with something
that looks like this.
We have a little bit of data
in flight, which is buffered,
in case it needs to
be retransmitted,
and we have a whole lot, just
sitting in the kernel waiting
for its turn to go out.
Well, this was an
eye-opening revelation for me.
There aren't just
delays in the network.
There are big delays in
the hosts themselves.
Screen sharing would grab a
frame, put it in the buffer,
grab a frame, put it in
the buffer, grab a frame,
put it in the buffer, and then
the kernel would let those
frames mature like a fine
wine before it was time
to put them out on the network.
[Laughter]
Because of that, we came up with
the TCP NOTSENT low-water mark
socket option.
When you send that option,
the socket send buffer
remains unchanged.
The difference is that kevent
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
The difference is that kevent
or your run loop will not report
the socket as being writable
until the unsent
data has drained
to some fairly low
threshold, typically 8
or 16 kilobytes works well.
When the socket becomes
writable,
you then write a single useful
atomic chunk into the buffer.
You don't loop, cramming as
much data into the kernel
as it can take, because
RAM is cheap these days,
it can take a lot.
You just write a sensible unit.
And in the case of screen
sharing, that's one frame.
And now the picture
looks like this.
We have data in flight,
that's buffered.
We have a little bit waiting
for its turn to go out.
As that drains and
reaches the threshold,
the socket becomes writable.
We write a single chunk and let
that drain before
we write some more.
So with that, I would
like to show you a demo
of this in action.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
Here I'm using screen sharing
from this machine,
connecting to this.
I'm using a gateway running
CeroWrt
to simulate a DSL
performance connection.
And let's bring up
a Terminal window.
Okay, there we go.
I'm going to move this window.
You can see the mouse
pointer move
because it's generated locally.
The actual graphical updates are
generated by the remote machine.
So let's move this
window over here.
No, not over here.
So let's move it -- actually,
no I like it back where it is.
Hands off the keyboard.
[Laughter]
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
[Laughter]
Let's pull down some
menus: Shell.
There we go.
Let's look at Edit.
No, maybe View.
If this feels like a demo
that's going painfully badly,
this is what my life
was like trying
to use my computer remotely.
It takes the patience of a
saint to put up with this.
Well, now I have a new option.
Let's turn on the not
sent low-water mark,
and once we reconnect,
now I will try dragging
this window around.
[Applause]
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
[Applause]
Thank you.
And that was years of suffering
unusable screen-sharing
connections just for the
sake of a silly oversight
in the BSD networking stack.
The great news is that
we have fixed this.
It is available in -- it's now
being used by screen sharing
in the last software update
in 10.10.3, so if you noticed
that screen sharing seemed
to get a lot more snappy,
then this is the reason why.
It's used by AirPlay, and
it's available in Linux too
because this option
applies at the source
of the data on the sending side.
So for those of you who
are running Linux servers,
this option is available
for your servers too.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
this option is available
for your servers too.
The benefit of this delay
reduction is really obvious
for real-time applications.
And we started making
slides for this presentation
where we had two columns.
We had the apps that should
use the low-water mark option
and the apps that shouldn't.
And we couldn't think of any
to go in the shouldn't column.
Every time we thought
of a traditional app
like file transfer, well,
that doesn't need it.
We realized you've
had that experience
where you change your
mind about a file transfer
and you press Control-C
and it seems to take
about 30 seconds to cancel.
It's because it had
over-committed all of this data
into the kernel and it had
to wait for it to drain
because there's no way
to change your mind.
So, yeah, actually, file
transfer does not benefit
from over-committing data,
and we couldn't think
of any application
that does benefit
from over-stuffing the kernel.
So once we had that
realization, we decided starting
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
So once we had that
realization, we decided starting
in the next seed, this option
will be turned on automatically
for all connections using
the higher layer NSURLSession
and CFNetwork APIs.
All you have to do to
make best use of this is
when your socket becomes
writable, don't loop writing
as much data as you can
until you get an EWOULDBLOCK.
Just write a sensible-sized
unit, and then wait
to be told it's time for more.
And that way if the user
has changed their mind
or something else has
changed about the environment,
next time you find your
socket is writable,
you can make an intelligent
decision.
You can do just-in-time
data generation
where you generate the data
so it is as fresh as possible
and based on the
current information,
not based on the information
from five or ten seconds ago.
And that brings me to our
final part of the presentation,
and that is a sneak peek
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
and that is a sneak peek
of a brand-new technology
called TCP Fast Open.
The way TCP traditionally works
is this: we do one round trip
to set up the TCP connection.
And then we do a
second round trip
to send the request
and get the response.
TCP Fast Open combines
the connection setup
and the data exchange
into one packet exchange.
This is not turned on by
default for all applications
and there's a reason for that.
There is a caveat you
need to be aware of,
and that is that this is only
safe for idempotent data.
I'll explain what that means.
When you use a TFO
operation, the handshake
and data are combined, the
server gets the message,
the server acts on that,
sends you the response,
and then you close
the connection.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
But the service model of the
Internet doesn't guarantee
that packets can't
be duplicated.
The success of the
Internet has been due
to a very simple service model,
which is deliver the
packets fast and cheap.
And don't worry about
them being in order.
Don't worry if some get
corrupted, some get lost,
some get duplicated,
just be fast and cheap.
And the end systems are
smart enough to deal
with that network model.
Packets can be duplicated
for many reasons.
It could just be
a bug in a router,
it could be Wi-Fi link
layer retransmission
that accidentally
sends the packet twice.
It could be end-system
retransmission:
If you send a packet and
don't get an acknowledgment
because the acknowledgment
was lost, perhaps,
you will then retransmit
the packet
and then immediately
you've got two copies
of the same packet
in the network.
Well, if one of those copies is
delayed and shows up much later,
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
Well, if one of those copies is
delayed and shows up much later,
then to the server that looks
like a perfectly
valid TFO request,
and whatever the operation
was, it will do it again.
If that operation was
sending you a JPEG image,
doing it twice may
be no big deal.
If that operation was sending
you a pair of shoes from Zappos,
then doing it twice might
not be what you want.
So this is something where you
have to intelligently decide
when it's appropriate and safe
and when it's not
for your application.
You use this through the
connectx system call.
This is a sneak peek technology
preview for the early adopters.
Later, we will look
at how to expose this
through the higher level APIs
but now it's only
available through connectx.
The server you are talking to
has to support TFO as well,
and the application
has to opt in.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
and the application
has to opt in.
For those of you running Linux
servers, if you have the very,
very latest Linux kernel as
of a couple of weeks ago,
that now supports the
standard ITF, TFO,
TCP option code,
the same as OS X.
So to wrap up what we
would like you to remember
from today's session,
wherever you can,
use the highest layer
network API as possible
and that way you get the
full benefit of all the work
that those APIs can do for you.
You absolutely must be
testing on a NAT64 network
for your applications,
and thankfully we have made
it really, really easy for you
to do that with just
an Option click.
Reliable network fallback will
give your customers a better
user experience with
your applications.
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
What you can do is pay attention
to the Better Route notification
so that you migrate back to
Wi-Fi when it's available.
Explicit Congestion Notification
is a new feature for the seed.
It will enable the
Internet to move
to this much more
responsive mode of operation
with lower queuing
and lower packet loss.
So we would like you to test
that and report any problems.
The TCP NOTSENT Low-Water mark
is something that you can set
for yourself as a socket
option or get for free starting
in the next seed, and that
will reduce the amount
of stagnant data buffered
in the sending machine.
And then finally, for the people
excited about TCP Fast Open,
we've made that available too.
There's a bunch of sources
of good documentation
that you can look at.
I'm not expecting anybody
to write down these URLs;
you can click those
in the PDF version.
There are some good forums
where you can ask questions
X-TIMESTAMP-MAP=MPEGTS:181083,LOCAL:00:00:00.000
There are some good forums
where you can ask questions
and have discussions
with questions you
have about networking.
I encourage you to
watch the videos
of the NSURLSession presentation
and the Network Extensions
presentation and, of course,
come and join us in person after
lunch in the networking lab
and we can answer all
of your questions there.
Thank you.
[Applause]