WWDC2001 Session 304
Transcript
Kind: captions
Language: en
good morning my name is Tom Weyer I'm the
network and communications technology
manager in developer relations I'm going
to introduce session 304 which is
extensible kernel networking services
hopefully many of you have made it over
from the previous session in room A1
so with that I'd like to introduce
Laurent Dumont who is one of the Core OS
networking engineers good morning I'm
going to talk about the extensible
kernel networking services and in our
case we'll see that really
mainly what we're talking about here are
kernel extensions and how to create a
kernel extension to add some networking
functionality to the kernel so as an
introduction the Mac OS X networking
architecture is extensible so we get a
mechanism where you can add firewalls or
VPNs or content filters or network
drivers so the goal here is that
a little bit like in Mac OS 9 where you
get a mechanism for extensions here you
can add your networking extension
without having to recompile the whole
kernel like you would do in a regular
FreeBSD type of environment so what
you'll learn in this session we'll
talk in some detail about the network
kernel architecture so what we have for
Mac OS X and Darwin we kind of use both
terms because everything here is in
Darwin so it's all open source so you're
really welcome to look at all this you
know at the source yourself from
the Darwin kernel and we'll see how
to filter and intercept
packets from different points in the
kernel from the socket layer from the
lower levels and also
how to add network interfaces and
drivers
in the kernel and one of the
interesting things that you may
learn here is some interesting tips
about the specificities of the Mac OS X
kernel and some of the changes you know
coming from Mac OS 9 or even from
FreeBSD I mean things that are
different in Darwin from FreeBSD so
there are different behaviors and some
tips here that you may be interested
in so this is the picture that you
probably have seen many times by now so
networking in Darwin is part of the BSD
kernel so I think it's always
interesting to say that there are really
three parts in the kernel in Mac OS
X you get the BSD subsystem which is a
lot of APIs like sockets
which we'll talk about later and the file
system and we get the Mach kernel which
is a base for VM and a bunch of all the
core services scheduling and so on and
also the I/O Kit subsystem which is
used by the drivers so just because of this
networking has some different interactions
with the rest of the kernel you'll see
at the I/O Kit layer or Mach kernel
we get some differences here so if
we go a little bit deeper in detail
about the networking subsystem we see
that we can roughly decompose it in
like four layers we get the socket part
which as you know and you
probably were in Vincent's session earlier
and you saw that sockets are the main API of
networking in Mac OS X and so we get the
socket layer and we'll see that there is
some mechanism where you can extend the
socket layer you can plug yourself in
the socket layer to do filtering and
things like that the other part the
second layer here from the top is the
network protocols so this is where
things like TCP/IP or AppleTalk or your
protocol live we get another third
layer which is
something added in the Darwin kernel
which you won't find in FreeBSD which
is the data link interface layer we'll
talk about this in more detail but
basically this is an abstraction layer
for protocols and network interfaces to
meet for extensibility and also we get
the network interface layer which on one
side is going to communicate with I/O Kit
and you know pseudo drivers on the
bottom side and on the upper side is
going to talk to DLIL the data link
interface layer so we'll go in detail
on those four layers so networking in
Darwin to summarize this it's
currently based on FreeBSD 3.2 so we get
the TCP/IP stack sockets and a bunch of
other services for networking from the
FreeBSD 3.2 implementation so if you
are a little bit aware of this you know
it's not the latest version of
FreeBSD but it's a very solid base
for what we're doing here so it's a very
robust and proven implementation FreeBSD
is used in many places in servers and
things like that
so what that gives us is a
stack which has a lot of life and
you know a lot of problems have been
flushed out already so we inherit all those
you know improvements and all
these things in the Darwin kernel
however we get some Apple enhancements
there for plenty of reasons that we'll
detail the FreeBSD stack by itself was
not completely meeting our needs so we're
trying to have a more dynamic approach
where people can load and unload without
you know rebooting and without
recompiling so we also have support for
MP the FreeBSD stack right now is not
really MP savvy so we get some mechanism
here for multi-threading and MP
efficiency in the networking
subsystem in Mac OS X so as Vincent
talked about a little bit before we also
tuned some buffer allocation for both
client
and some you know high-performance for
gigabit and things like that so that's
also some of the modifications we've done
to this stack and so we get this famous
data link interface layer which is a
mechanism for extensibility at the
bottom of the stack for adding
protocols adding network families one of
the biggest problems with FreeBSD is that
basically when you get a new driver or a
new class of driver a new family of
driver you more or less have to go
pretty deep into the kernel and
recompile the kernel to get
that to be integrated in your
stack with Mac OS X and DLIL we have
some mechanism to do this on the fly
basically and you can load and
unload your drivers add some type of
different family without having to
restart the machine so and as we said
it's extensible because of those plug-in
architectures and network kernel
extensions we're going to just talk about
so the network kernel extensions what
are they they add extensibility
to the kernel networking they are
basically pieces of code that are going to
link against the kernel dynamically and
that will be part of the kernel so this
is a big responsibility
you as a developer are going to make some
NKE a filter a new protocol something
and using the network kernel extension
you'll link your code to the kernel code
as you know in Mac OS X and it's been
said in all the sessions
there is a pretty hard boundary in
between user mode and kernel mode in
user mode basically whatever you do if
you crash your application you're not
going to bring down the machine when you
run in the kernel you're in protected
mode and you have access to all the
goodies and all the resources and it
means that if your code does
something wrong it's going to panic and
it's going to be a very bad experience
so kernel extensions in that sense are
something that should be used only if
what you're trying to do cannot be
done in user mode if what you are trying
to do as an application using networking
can be done in user space you know
without running in the kernel it's
always best to do so why because if your
application crashes for some reason you'll
bring down your application the user you
know may have to restart your
application but life will be fine
if something goes wrong in the kernel
it's a reboot and that's really
something we're trying to avoid at all
costs so this is a first word of caution
we're going to have several of them
during this talk kernel extensions
have absolute potential for crashing machines
you have to be really careful about what
is done there so what you can do in a
network kernel extension basically you can
do filter NKEs so filter NKEs
as we'll see can modify and inject or
drop packets so at those
different layers we've seen before
we're going to go through
the different types of filter NKEs
you can make but those filters
will intercept packets at some point and
you'll be able to do whatever you want
with a packet so you can decide to let
it go through you can decide to modify
it you can decide to swallow it
duplicate it whatever
you want to do you'll be able to do that
through filter NKEs so this is a very
powerful mechanism however as I said you
have the potential to you know
if you're sending back a packet
which is bad and it's going to crash
somewhere up the stack you know
it's a bad thing so you get to be
really careful about what you're doing
network kernel extensions are using
the I/O Kit functionalities so through
the I/O Kit mechanism you can
dynamically load and unload the NKEs
and there is no need to reboot this
is just something that you can do by
yourself or you can have through a set
of dependencies your NKE be loaded
we'll see that in the next slide I think so
again you're running in the kernel so
big liability be careful about your
pointers you know be careful about what
you're doing when you're unloading you
have some special things to do to make
sure that you do leave a clean state
after your module unloads so proceed with
caution so another important point is
that there is no real API the networking
subsystem is wide open
you're linking against all the symbols
in the kernel so there is no guarantee
of binary compatibility in the future
just a simple example here if we
change a structure or even worse if we
change a macro and you have linked with
an old version of this macro somehow
this is going to work you know I mean
you won't have any resolution problem
because this is a macro but the inline
implementation of the macro is going to
be different from what is in the kernel
and your NKE may work for a while but at
some point when it's going to get to this
macro it's going to do the wrong stuff
and the consequences might be harmful
you know you can potentially panic or
even worse panic two hours later
because you freed the wrong stuff so
this is a problem here that you
have to be aware of binary
compatibility is not guaranteed in the
future because we may have to fix the
stack we may have to fix some of the
things so there is some mechanism
in I/O Kit that lets you declare
dependencies and that will make sure that
your NKE won't load if the kernel
version has changed and things like that
so those are things that you need to
look into when you're doing an NKE Darwin
is not BSD Darwin is based on BSD so
some of the rules apply most of the
rules if you're used to BSD apply
but not all of them and we'll go into some
of them we'll talk about things like
funnels or things like where you have to
be a little bit careful with MP so
don't expect code that you just got
from
open source for the kernel part to
just compile and run in Darwin it
probably will but you need to be
careful to look at some of the aspects
we'll talk about and get a pretty
good eye on it and see what are the
areas that may give you problems in
Darwin so don't expect to just
compile and link it into Darwin and have it
work you may have to do a little bit
more okay so NKE dynamic loading so
as we said before this is provided by
I/O Kit so basically a kernel module
be it a filter or your protocol you know
whatever you do as an NKE will
have two required entry points a start
routine that will be called when your
module is loaded and a stop routine which
is called when your module is unloaded
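The two entry points described above can be modeled in plain user-space C. This is only a sketch under stated assumptions: every name below (`acme_nke_start`, `acme_nke_stop`, `acme_kernel_hook`) is hypothetical, the "kernel" is a single function-pointer slot, and a real module would use the kernel's own types and be called by the loader rather than by application code.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only: every name here is hypothetical. */
typedef int (*acme_hook_fn)(int value);

/* One slot in a pretend kernel that a module can plug a handler into. */
static acme_hook_fn acme_kernel_hook = NULL;

/* The handler the module contributes once it is loaded. */
static int acme_double(int value)
{
    return value * 2;
}

/* Start routine: called once when the module is loaded. */
int acme_nke_start(void)
{
    assert(acme_kernel_hook == NULL); /* loading twice would be a bug */
    acme_kernel_hook = acme_double;
    return 0; /* 0 means success in this model */
}

/* Stop routine: called when the module is unloaded. It removes every
 * pointer into the module before the module's code goes away. */
int acme_nke_stop(void)
{
    acme_kernel_hook = NULL;
    return 0;
}

/* What the pretend kernel does at the hook point: call the handler if
 * one is installed, otherwise pass the value through untouched. */
int acme_kernel_run_hook(int value)
{
    return acme_kernel_hook ? acme_kernel_hook(value) : value;
}
```

The point of the model is the symmetry: whatever the start routine installs, the stop routine has to take out again, so that nothing in the kernel still points at unloaded code.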
this is basically it in terms of
the NKE I/O Kit side
of things all the rest is going to be
the filter or whatever you're going to
use as a mechanism to plug yourself
inside the kernel so the NKEs are not
automatically loaded you
have to load them or invoke their loading
through a startup script or through
dependencies you can have a chain of
dependencies that will require your NKE to
be loaded this is pretty much what's
going on for some of the NKEs we get in
/System/Library/Extensions where
the Apple NKEs live some of them are
loaded through scripts and some of them
are loaded through dependencies so there
are three command-line tools that you
should be aware of you know especially
if you're debugging and doing your own NKEs
kextload is going to load your NKE
kmodstat will give you a list of
all the NKEs loaded and kmodunload
will
unload your NKE your NKE won't load if there is
some symbol collision with what's in the
kernel this is I think in the next slide
no one of the things I wanted to add
here is that when you're going to do
an NKE you should use some prefixing
for your symbols there because
you're going to be in the same address
space in the same symbol space as the
kernel so if you declare a function
which is say TCP connect right you're
going to have a problem because this
function is already inside the stack in
the kernel so your module won't load and
also we would like people to somehow
prefix their symbols with their own you
know kind of prefix I don't know maybe
their vendor code Apple OSType to be
sure that they're not going to conflict
with anything else or any other
vendor NKEs that might load so
something to think about
so different we alluded to this a little
bit but different types of NKE
so four main types socket filters
socket filters are at the socket layer
we'll go into details on this so basically
it lets you intercept socket calls
network protocols network protocols can
be implemented as NKEs also and there is a
mechanism to add those network protocols
to the Darwin stack dynamically
data link interface layer filters so
there are different types of filters here
for more of a lower-level type of
filtering and also network interfaces if
you add a new family a
new type of interface you
might have to have some NKE to
describe that and let the protocols and
DLIL know about your new family so
we'll go into those in detail so first of
all the socket layer so just for people
who are aware of the UNIX
the FreeBSD way of things you know the
socket layer is right at the user
to kernel boundary it's got a pretty
interesting role here so the socket
layer basically is doing all the copying
of buffers in and out of kernel space so
by this when your application under Mac
OS X is trying to
do let's say a send call it's going to try
to send something on the network the
buffers live in user space in your
application and they need to be put into
the kernel memory and this is done at
the socket layer the socket layer here
is going to take your
buffers and put them into the socket
buffer queues we'll go into a
little bit more detail about this and
the copy to the kernel space is done at
the socket layer one thing to note for
people who are more aware of FreeBSD is that
here the execution of your
thread is going to continue in the
kernel so your user thread
will do the send
from that thread so
on the other side the socket layer is
where the protocol stacks are going to
send the buffers and basically by
some mechanism we get the
application's thread
awakened we're going to come from the network
input thread and by some mechanism
we'll get the application awakened and
the buffer will be copied from the
socket layer back to user space if you
remember what Vincent was talking about
this morning about the XTI send
buffer size well this is where they are
basically we get two sets of queues here
one for sending and one for receiving
and those have some you know some
parameterizable size so the socket
filters actually live in the socket
layer so they're a way to intercept
packets both
coming from the application or
going up to the
application so different sets of calls
if you are aware of the socket calls most of
them have some mechanism here for
filters so what we'll do is when you
create an NKE socket filter we'll
have a mechanism to know that your
filter and your code
need to be called and depending on which
call you added your filter to you will
be called you'll receive the packets and
decide either to modify drop you know
do nothing most of the time but
you have the flexibility to do something
with packets coming in and out of the
socket layer so this is one way of
putting in say a content filter I'm
talking about all this in this slide so
sockets again are the API for networking
this is the native API so everything is
going through the socket layer the socket is
the glue in between applications and
network protocols we talked about this
this is where you cross the
kernel to user boundary it sits
above the network protocols so the
network protocols are not in the socket
layer the socket layer is going to
decide which protocol needs to be called
so it's got this kind of a file structure
it's following you know the UNIX
type everything is a file more or
less so it's a subset of that it's
got some specific calls but you know
it's following the same semantics
you can do a read or you can do a write
on your socket so we talked about this
a socket has a pair of packet queues for
incoming and outgoing packets so your
packets are going to sit in those
queues especially on the way up
until basically your
application is awakened and it's going
to grab those packets so as we said
sockets have plugins inside each
of those calls SB receive for
example is going to
look through all the filters that are
associated to it before it actually does
the call so you have the opportunity to
just discard the packet if you want or
change it it's one way
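The idea that a socket call runs through every attached filter before doing its real work can be sketched in user-space C. This is a model under stated assumptions, not the kernel's data structures: `sock_model`, `sock_attach_filter`, `sock_deliver` and the two example filters are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model of a socket with filter plug-ins. */
enum filt_verdict { FILT_CONTINUE, FILT_DROP };

struct pkt {
    char   data[32];
    size_t len;
};

typedef enum filt_verdict (*filt_fn)(struct pkt *p);

#define MAX_FILTERS 4
#define MAX_QUEUED  4

struct sock_model {
    filt_fn    filters[MAX_FILTERS];
    size_t     nfilters;
    struct pkt rcv_queue[MAX_QUEUED]; /* packets waiting for the application */
    size_t     rcv_count;
};

/* Attach a filter; returns 0 on success, -1 if the table is full. */
int sock_attach_filter(struct sock_model *so, filt_fn fn)
{
    assert(fn != NULL);
    if (so->nfilters == MAX_FILTERS)
        return -1;
    so->filters[so->nfilters++] = fn;
    return 0;
}

/* Run the packet through every filter in order, then queue it.
 * Returns 1 if queued, 0 if a filter dropped it or it did not fit. */
int sock_deliver(struct sock_model *so, const char *bytes, size_t len)
{
    struct pkt p = { {0}, 0 };
    if (len > sizeof p.data || so->rcv_count == MAX_QUEUED)
        return 0;
    memcpy(p.data, bytes, len);
    p.len = len;
    for (size_t i = 0; i < so->nfilters; i++) {
        if (so->filters[i](&p) == FILT_DROP)
            return 0; /* swallowed: the application never sees it */
    }
    so->rcv_queue[so->rcv_count++] = p;
    return 1;
}

/* Example filter: swallow anything that starts with 'X'. */
enum filt_verdict filt_block_x(struct pkt *p)
{
    return (p->len > 0 && p->data[0] == 'X') ? FILT_DROP : FILT_CONTINUE;
}

/* Example filter: modify the packet in place, then let it through. */
enum filt_verdict filt_mark(struct pkt *p)
{
    if (p->len > 0)
        p->data[0] = '#';
    return FILT_CONTINUE;
}
```

With both example filters attached, a packet starting with `X` never reaches the receive queue, and any other packet arrives with its first byte rewritten, which is the drop and modify behavior described above.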
okay so socket filter NKEs we
just talked about this oh yeah the other
thing is you have two types of
socket filters here you can have which
probably is the simplest one a global
socket filter the
filter that you create is going to be
invoked for each socket so every socket
is going to run through your filter even
if you don't care you'll say okay well I
don't care and you'll return and the
packet won't be touched and
the processing
will continue there is a little bit
more complex way of doing things where
you can do programmatic socket filters
that will only apply to a certain
type of sockets you know let's say you
just want something for web
traffic or something you have some way
to decide that your socket filter will
apply only to that you have to register
your NKE handle with Apple so those are
for people familiar with Mac OS 9 those
are the OSType kind of types so
that lets us identify the socket filters
and there is no collision when you
insert your socket filter your socket filter
can be run after you know some other
vendor's socket filter so you don't know
that
so we need to have some way to
identify all socket filters so you need
to use the NKE handle so see with DTS to
get your handle and as we said
an example of this is a content
filter where you're going to decide based
on some you know your own criteria
what you want to let go through
in those packets that you receive
each time you get called so
either you change them you swallow
them you do whatever you want you
duplicate them
you know whatever your content
filter or your socket filter is
going to do so important points for the
socket NKEs there is no
built-in reference tracking what we mean
by this is that in your NKE you
need to keep track of where
your socket filter is inserted if you
are inserted in 15 you know different
sockets you need to know that
and basically what it means is if
somebody asks you to unload you get to
keep track of those insertions and I'm
going to explain briefly what's going on
there when we run the socket filter code
basically there is a pointer to your
socket filter handler where you're going
to receive packets and if you don't
correctly remove all this when you
unload we're going to call into some code
that doesn't exist and guess what it's
going to panic so it's all your
responsibility to take care of this and
refuse to unload
it's pretty reasonable if your socket
filter is in a state where it doesn't know
or for some reason cannot really unload
to refuse to unload and people won't
be able to unload your socket
filter but it's much better than
panicking five minutes later so that's
something you need to be aware of and as a
warning you're in the kernel
it's pretty much wide open you know you have
wide open access to all the structures
all the things but you're part of the
kernel you're just a function if you're not
there you get to do the right stuff to
plug the hole basically and don't leave
you know null pointers and things like
that hanging around because
that's going to be a pretty bad user
experience so that's why we have another word
of caution about using kernel extensions
you got to be really careful okay so now
the second layer when we looked
at what basically is networking in
Darwin is the network protocols two
examples that come to mind here are
TCP/IP and AppleTalk so
that's more or less something which is pretty
close to what you'll find in FreeBSD
the way to register protocols and add
protocols is pretty much the same we get
some mechanism to let you do that
dynamically add and remove dynamically
your protocols and your domains so this
is the second layer we're going to talk
about so what's important to know here
is that a domain defines a protocol
family one of the big examples here is
PF_INET this is where all the TCP UDP and
IP live another one that you can think
about that we have in Darwin is PF_INET6
which is for the family of IPv6
so this is something that is
pretty much covered in a bunch of BSD
books the protocol family and how the
protocol handler works so
this is what we have in Darwin here from the
socket layer we're going to decide which
protocol handler depending on the
type of socket you know the address
family you put in your socket and if you
have TCP you know you do a connect
we're going to call TCP connect and this is
done through this structure where
you can add your own protocol if you
come with the next you know killer protocol
family that's going to replace IP this is
where you're going to add it so if you do
that you know please do it and put it in
Darwin we'll take it so this is extensible
same thing you can add your own protocol and
there is a mechanism to declare this
from your NKE again you can remove that
when you remove it be careful
clean up after yourself
otherwise you know very bad things
will happen in the kernel it will panic
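The dispatch and cleanup rules described above can be sketched as a small user-space table keyed by protocol family and socket type. Everything here is an assumption made for illustration: `proto_register`, `proto_lookup`, `proto_unregister`, the `MODEL_` constants and the `open_sockets` counter are hypothetical names, not the kernel's domain and protocol switch structures.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants for this model only. */
#define MODEL_PF_INET     2
#define MODEL_SOCK_STREAM 1

typedef int (*connect_fn)(void);

struct proto_entry {
    int        family;       /* protocol family (the domain) */
    int        type;         /* socket type within that family */
    connect_fn connect;      /* handler the socket layer dispatches to */
    int        open_sockets; /* sockets currently using this protocol */
    int        in_use;       /* nonzero when the slot is occupied */
};

#define MAX_PROTOS 8
static struct proto_entry proto_table[MAX_PROTOS];

/* Find the handler entry for (family, type), or NULL if none. */
struct proto_entry *proto_lookup(int family, int type)
{
    for (size_t i = 0; i < MAX_PROTOS; i++) {
        struct proto_entry *pe = &proto_table[i];
        if (pe->in_use && pe->family == family && pe->type == type)
            return pe;
    }
    return NULL;
}

/* Add a protocol at run time; -1 if it already exists or no room. */
int proto_register(int family, int type, connect_fn fn)
{
    assert(fn != NULL);
    if (proto_lookup(family, type) != NULL)
        return -1;
    for (size_t i = 0; i < MAX_PROTOS; i++) {
        if (!proto_table[i].in_use) {
            proto_table[i] = (struct proto_entry){ family, type, fn, 0, 1 };
            return 0;
        }
    }
    return -1;
}

/* Remove a protocol; refuses (-1) while any socket still uses it,
 * so no caller is left holding a pointer to a handler that is gone. */
int proto_unregister(int family, int type)
{
    struct proto_entry *pe = proto_lookup(family, type);
    if (pe == NULL || pe->open_sockets > 0)
        return -1;
    pe->connect = NULL;
    pe->in_use = 0;
    return 0;
}

/* Stand-in for a TCP connect handler; the return value is arbitrary. */
int model_tcp_connect(void)
{
    return 6;
}
```

Refusing the removal while `open_sockets` is nonzero is one way to express "clean up after yourself": the table never points at a handler that has gone away.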
so another one the third layer here that
we have in the Darwin kernel is
a new one
it's called DLIL the data link
interface layer so we'll go into
detail about this it's in between
the network protocols and the network
interfaces the goal of this layer is to
be basically a central point
basically DLIL lets us do
extensibility you can add protocols you
can add network interfaces network
interface types or families and also put
filters and DLIL is really the central
point that's going to be the abstraction
layer for this
so this is where the protocols attach
and say hey I'm IP and
I'll be interested in receiving IP type
packets right
I'm AppleTalk I want to receive
AppleTalk packets I'm IPv6 I want IPv6 so
all those guys are going to talk
to DLIL and register themselves for
requesting those types of packets on the
other side drivers or pseudo drivers
coming from the network interface so it
can be I/O Kit based drivers
we're not going to
talk in great detail about those here
but they're basically drivers that are
moving stuff on some kind of a wire or
wireless but you know they are moving
some physical bits if we can say that
or you get pseudo drivers like
tunneling devices and things like that
that just add stuff or remove things to
frames that have been generated
somewhere else or they can generate their
own frames but they don't go
to a medium somehow all those go to DLIL
and register themselves and give
some information about the type of
framing they do
the type of you know specificities they
have so DLIL is here as a central point
to handle all those and add to this
if I go back to the previous scheme here
in between those we get two types of
filters again in between DLIL and the
network protocols we get the protocol
filters that register with DLIL and
say I want the IP packets for interface
you know zero your
built-in Ethernet and you also have
interface filters that are registering
with DLIL and they are more
lower-level those guys will see all
packets for this one interface say you
want to see everything coming from en0
or say AirPort on en1 or whatever PPP
you're going to put an interface filter
here that will not be protocol dependent
you will get all your AppleTalk plus IP
plus you know whatever packets here while
as a protocol filter you will specify
that you're interested only in IP
packets or AppleTalk packets whatever
you want so this mechanism here
is different from BSD and this is
something you need to be aware of if you
take something like a driver or
something like that or a
pseudo driver let's say a tunneling
driver it's got to follow some different
rules than it would in FreeBSD 3 or 4 it
will have to declare some DLIL modules
there to register with DLIL and do
things differently there are some examples
in the NKEs and in the Darwin code about
how to do that AppleTalk is an example
IP does that IPv6 does that
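The difference between the two filter kinds can be sketched with a tiny user-space model of the input path. This is an illustration under stated assumptions: `dl_model`, `dl_input` and the field names are hypothetical, interfaces are plain integers (0 standing for en0, 1 for en1), and the only real-world values used are the well-known Ethernet type codes for IP and AppleTalk.

```c
#include <assert.h>
#include <stddef.h>

/* Well-known Ethernet type values, used here only as labels. */
#define MODEL_ETHERTYPE_IP        0x0800u
#define MODEL_ETHERTYPE_APPLETALK 0x809Bu

struct frame {
    int      ifindex;   /* which interface the frame arrived on */
    unsigned ethertype; /* which protocol the frame carries */
};

/* Hypothetical model: one interface filter and one protocol filter. */
struct dl_model {
    int           if_filter_ifindex;    /* interface filter: one interface */
    unsigned long if_filter_seen;
    int           proto_filter_ifindex; /* protocol filter: one interface... */
    unsigned      proto_filter_type;    /* ...and one protocol on it */
    unsigned long proto_filter_seen;
};

/* Input path: the interface filter sees every frame of its interface,
 * the protocol filter only frames of its protocol on its interface. */
void dl_input(struct dl_model *dl, const struct frame *f)
{
    assert(dl != NULL && f != NULL);
    if (f->ifindex == dl->if_filter_ifindex)
        dl->if_filter_seen++;
    if (f->ifindex == dl->proto_filter_ifindex &&
        f->ethertype == dl->proto_filter_type)
        dl->proto_filter_seen++;
}
```

Feeding the model an IP frame and an AppleTalk frame on interface 0 plus an IP frame on interface 1 gives a count of two for the interface filter and one for the protocol filter, matching the description above: neither filter sees traffic on an interface it did not register for.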
so the interface layer the interface
layer is for as I said I/O Kit type
drivers or pseudo drivers so they
attach to DLIL and basically tell the
networking stack that they are available
so dynamically your AirPort is turned on
or something and en1 is going to
basically tell DLIL hey I'm an
Ethernet type of driver and you know
this is where I am so
it's more of a flexible mechanism for
attachment and detachment of interfaces on
the fly in the same way you
load your new protocol and
dynamically it's going to tell DLIL that
hey I'm a protocol of this type and I'll
take you know whatever SNAP ID for my
packets now I want all those packets you
know on this interface or that interface so
you'll see around the DL tag which is
the cookie you use you know the handle
that you'll get for filters and you'll
get for protocol attachment to an
interface and this is what identifies
basically your unique connection to DLIL
so to go a little bit into
detail about DLIL there is a
module per interface family per type of interface
to handle framing what I mean by this is
for Ethernet IP you need the address
resolution protocol and ARP is you know
not really part of IP it's more in
between IP and Ethernet and there is a
way in DLIL to give an interface family
and specify those kinds of
specificities so here
the ARP implementation is
going to be registered with DLIL so
if you're with Ethernet and
you're using IP you don't have to do
anything if you come up with you know
your own medium that needs its own
address resolution protocol you may
have to look into this and declare
your module with DLIL okay so
this is the protocol module and so yeah
in DLIL the modules will also handle the
framing so you have to do this for your type of
interface if you've got a pseudo device it's
pretty simple usually it's just moving
you know your data pointer and
putting in a few bytes but this is
what is handled here
so there are a lot of details in this if
you're doing you know adding a new type of
interface you really have to go into the
documentation and look for this the
filters are a little bit easier to do so
DLIL filters so as we said before there
are protocol filters on top of DLIL
under the protocols so the
difference is that they see
all the datagrams of a protocol for an interface so
the little trick here is that you
register your filter per interface so if
you register for IP on en0 you'll see
valid IP frames for en0 you won't see
any AppleTalk packets there but you
won't see IP packets for en1 for
your AirPort let's say you need to
register on both so that's kind of
low-level
however you get the interface filters
that give you even more flexibility
because basically at this point you'll
see the whole frame so framing looking
at you know if it's an IP packet or an
AppleTalk packet won't be done in the
interface filter so this is between the
interface and the family here you will
get access to the packets as they come
from the driver you'll get access to the
valid packets coming from the driver
so you'll see full frames and this is
where you can if you're trying to do
like a VPN solution or a firewall this
is one place where you could put your
NKE you know
you could be doing that as a
protocol filter or
at the interface filter it really
depends what you're doing but those are
a good point for getting all the traffic
while things at the socket layer are
more for you know getting things destined
to an application you'll get all the
traffic for all the sockets
basically or you know even if they don't
go to any socket if they're just dropped in
the stack you'll see them at this point
it's before any processing
by the stack so a good
solution for VPNs and firewalls network
interfaces they still use the FreeBSD-style ifnet
structure you have one per interface
it's I/O Kit based you have a lot of things
for Ethernet there are a lot
of samples and the family shows how to do
that for I/O Kit so if
you're doing an Ethernet driver or you
know some driver that really talks to a
medium you really need to look into
I/O Kit because those drivers don't live
in these four layers they live in I/O
Kit however if you're doing a pseudo
device or something which is a
little bit in between like PPP you may
have to do some work in I/O Kit for
your driver you know dealing with the media
side and also at the network
interface layer here where you will have
some DLIL work to get your stuff
registered with DLIL and known by the
network stack the networking
subsystem is not part of I/O Kit but
I/O Kit has some way to basically give
the packets and you know call some DLIL
functions to make the interface work
now one case where if you're doing
a pseudo driver like a tunneling device
you may do that only in the
networking subsystem you don't have to
go to I/O Kit if all you do is take
packets you know add a header to them and
do some encapsulation of some sort and
send them back to an Ethernet or another
interface
you won't have to go through I/O Kit
that's the thing to know there is
an answer we just mentioned it sat here
because there's some confusion here but
those filter we're talking about our
different levels and people were coming
from UNIX may may know special in the
BSD side so BPF which is BBF is a is
really an i/o kit kind of tap the
difference it's it's well it's a
standard way and if you're not aware
coming from mine on FreeBSD 2 to get you
know things like sniffer type
application network traffic analyzer and
those kind of things those are tap from
the driver which will copy all the
frames back to BPF and and and get them
you know to your application which which
asking for BPF traffic the big
difference with the GL il filter is that
in GL il when you put like let's say an
interface filter you will get packets
but you'll decide to to let those
packets go through or make a copy and do
your own kind of tap functionality but
by default the packets there's one
instance of the packet here you get a
copy of the packet which is made and
also, for Ethernet (this is pretty true for
Ethernet), opening the BPF device will
put the driver in promiscuous mode, which
means you will get all the packets seen by
your interface, not just the ones for your
MAC address and the various multicasts or
broadcasts that you can get. You will get
everything that is physically seen on
your segment. That's something you won't
see from a DLIL interface filter. A DLIL
interface filter will get only the
packets that are valid for your address;
it's not going to be in promiscuous mode.
And again, there are some standard hooks
in the I/O Kit network interface for this,
so that's a good model to follow if you're
building your own drivers. It's a neat
utility to be able to use tcpdump; there's
a bunch of services on top of that, and
it's pretty little work to get BPF support
in your new driver. So we just mention it
here because there's some confusion between
those and the DLIL interface filters. They
are not at exactly the same level. Note
that new I/O Kit interfaces need to support
the BPF hooks. So that's for BPF. Okay.
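
As a rough illustration of the distinction just described, here is a toy user-space model in Python. It is not the BPF or DLIL API: `ToyInterface`, `open_tap`, `attach_filter` and the MAC addresses are hypothetical names invented for this sketch, and multicast is left out. It only shows the two behaviors from the talk: a BPF-style tap puts the interface in promiscuous mode and receives copies of every frame, while a DLIL-style interface filter sees only frames valid for the interface and decides the fate of the single packet instance.

```python
from dataclasses import dataclass, field

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass
class Frame:
    dst: str
    payload: bytes


@dataclass
class ToyInterface:
    """Hypothetical stand-in for a network interface (not a kernel API)."""
    mac: str
    promiscuous: bool = False
    taps: list = field(default_factory=list)       # BPF-style sinks, get copies
    filters: list = field(default_factory=list)    # DLIL-style filters, give a verdict
    delivered: list = field(default_factory=list)  # frames that reach the stack

    def open_tap(self, sink):
        # Opening the tap switches the interface to promiscuous mode.
        self.promiscuous = True
        self.taps.append(sink)

    def attach_filter(self, fn):
        # fn(frame) returns True to let the frame through, False to swallow it.
        self.filters.append(fn)

    def receive(self, frame):
        for_us = frame.dst in (self.mac, BROADCAST)
        if not (for_us or self.promiscuous):
            return  # not addressed to us and nobody is sniffing
        for sink in self.taps:
            # The tap always works on its own copy of the frame.
            sink.append(Frame(frame.dst, bytes(frame.payload)))
        if not for_us:
            return  # foreign traffic is visible to the tap only
        for fn in self.filters:
            if not fn(frame):
                return  # the filter swallowed the one instance of the packet
        self.delivered.append(frame)


iface = ToyInterface(mac="00:11:22:33:44:55")
seen_by_filter = []


def drop_blocked(frame):
    seen_by_filter.append(frame)
    return not frame.payload.startswith(b"BLOCK")


iface.attach_filter(drop_blocked)
sniffed = []
iface.open_tap(sniffed)

iface.receive(Frame("00:11:22:33:44:55", b"hello"))
iface.receive(Frame("66:77:88:99:aa:bb", b"not ours"))
iface.receive(Frame(BROADCAST, b"BLOCK me"))
```

In this run the tap collects three frames, the filter is consulted for two of them, and only one frame reaches the stack, which mirrors the "copy for the tap, one instance for the filter" point.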
Another important thing here that we
want to talk about in the networking
subsystem is the mbuf. Mbufs are the
memory buffers that we use all over the
networking subsystem to hold network data.
If you're coming from 9, you're pretty
much aware of the M blocks, which are used
in OT or in the STREAMS modules. It's more
or less the same thing in the BSD world.
What we do with mbufs is hold either the
packets coming from the socket layer, or
things that are coming from drivers. So
I/O Kit is going to create some mbufs with
the packets received by, let's say, an
Ethernet driver, and send them up to the
DLIL, and the DLIL will route those packets
to TCP/IP or whatever protocol. But those
are all mbufs that are manipulated.
Mbufs are interesting because, like with M
blocks, you manipulate pointers to data. So
once you're in the kernel there is no
more copy of data; everything is done
through mbufs. The driver copies the data
from its ring buffer to the mbuf (actually
it's in an mbuf already), and the mbufs are
passed up until they get to the socket
layer. Then, when the application is
awakened and gets the data, they will be
copied from kernel space to the user
application memory, and they'll be released
at this point. So what's interesting in the
mbuf is that, if we take the example of
a packet going down, the example of a send,
and you're sending some TCP traffic,
what's going to happen at the socket
layer is that your packets from user
space will be copied in the kernel into
some mbuf clusters, into some mbufs,
and those mbufs will be in the socket queue
(remember the socket queue we were talking
about earlier). What we're going to do to
send TCP/IP packets is add an mbuf for a
part of the data that you sent, and this
will logically point to the data in your
socket buffer. This is going to go down to
the driver, and the driver will send it and
say it's done, but the data in the socket
buffer won't be released until it has been
acknowledged by the other side of the
TCP connection. So if we need to do a
retransmit of this packet that we took
from your data, we'll do that by prepending
a new header and pointing to your data, but
your data will be the same in there. So
there is no copy here. It's only once we
know that all the data has been
acknowledged, and we don't need it at the
socket layer, that it's going to be
released. So that's something to know
about mbufs. For you, as somebody who's
going to write a network kernel extension,
you have a bunch of functions to access
mbufs, to allocate mbufs, to manipulate
mbufs. There is pretty much everything you
can think of doing with an mbuf in mbuf.h;
it's a good place to start looking in this
directory. However, there is a bunch of
macros dealing with mbufs. Try to avoid
using them if possible, for the problem we
talked about before: if we change the
implementation underneath, the macros may
give you a kernel panic if you're using
them. So try to avoid those if possible.
Another thing to know, which is a little
bit different from BSD, is that we have a
different VM subsystem underneath, and the
way we allocate memory is kind of
different.
So in FreeBSD you're pretty much
guaranteed that you'll get memory when
you allocate an mbuf. That's not the case
in Darwin, so be aware that your
allocation of an mbuf can and will fail.
There are two modes for mbufs. You can
ask for an mbuf with "don't wait", which
means: give me an mbuf if you have one to
hold my packet, my data; if you don't,
just return. This is something you use on
the fast path, let's say on your transmit
and receive path in your NKE. Be warned
that you will get a null back, and probably
the best thing in that case is to drop your
packet, or do whatever is good there. You
can also ask for a "wait" mode, but don't do
that on the fast path, because you're
going to block the thread while we're
trying to allocate memory. So you're going
to block, potentially, and that's not very
good. So do that for things that are low
bandwidth, like when you start your
protocol and it needs some mbufs up front,
or something like that. But don't do that
on the fast path.
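
The two allocation modes can be pictured with a small user-space sketch in Python. This is only a toy model of the rule, not the Darwin mbuf allocator: `ToyMbufPool`, `fast_path_input` and the `wait` parameter are hypothetical names for this example (in BSD headers the corresponding flags are, to my knowledge, spelled `M_DONTWAIT` and `M_WAIT`). The point it shows is that a fast-path caller asks without waiting, checks for a null result, and drops the packet instead of blocking.

```python
import threading


class ToyMbufPool:
    """Hypothetical fixed-size buffer pool standing in for mbuf allocation."""

    def __init__(self, capacity):
        self._free = capacity
        self._cond = threading.Condition()

    def get(self, wait=False):
        # wait=False models "don't wait": return None when the pool is empty.
        # wait=True models "wait": block the calling thread until a buffer is freed.
        with self._cond:
            while self._free == 0:
                if not wait:
                    return None
                self._cond.wait()
            self._free -= 1
            return bytearray()

    def free(self, buf):
        # The buffer object itself is not reused in this toy; only the count matters.
        with self._cond:
            self._free += 1
            self._cond.notify()


def fast_path_input(pool, packet, stats):
    """Receive-path handler: never blocks, drops the packet on allocation failure."""
    m = pool.get(wait=False)
    if m is None:
        stats["dropped"] += 1
        return None
    m.extend(packet)
    stats["accepted"] += 1
    return m


pool = ToyMbufPool(capacity=2)
stats = {"accepted": 0, "dropped": 0}
held = [fast_path_input(pool, p, stats) for p in (b"a", b"b", b"c")]

# Off the fast path (for example at protocol start-up) waiting is acceptable.
pool.free(held[0])
setup_buf = pool.get(wait=True)
```

With a pool of two buffers, the third packet is dropped rather than stalling the path; the later `wait=True` call returns immediately here only because a buffer was released first.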
So yeah, the rule is: do not block. With
mbufs, you've got to release them. There
are some rules, but it's not completely
fixed; depending on socket buffers and
everything, you'll see that either you
have to release them or somebody's going to
release them for you. There are no real
preset rules. I think that's it. The
warning is just: be careful with mbufs, and
be careful about the use of macros. There
are some commands, like netstat -m, that
are going to tell you how many mbufs you're
using in your system, and this is
a really good way to see if you have a
leak in your NKE. If you see the number of
mbufs in use going up, it's probably that
somewhere you forgot to release one, so you
might want to check that when you're
debugging your NKEs. From inside the
kernel, from gdb, you can look at mbstat.
It's going to give you some stats about the
allocations, the number of drops and things
like that, and again it's going to give you
some information if you forgot to release
an mbuf in your NKE. Another thing here
that we have
in terms of kernel extensibility is the
kernel events. What we have is basically
a new domain here, PF_SYSTEM, which from
the socket API gives you a way, by
listening on a socket, to see some kernel
events. So it's going to report events from
the kernel to user space, and those are
usually pretty low bandwidth events. The
kind of events we have are things like your
interface: you plug in the link, so the
link is up, the link is down, things like
that, or the IP address changed. You will
receive an event if you're listening on
this, if you have a connected socket.
This is used mainly by System
Configuration; the configuration daemon is
pretty much the user of this. You can also
add your own events to this mechanism. If
you have some specific driver, or some
specific driver families that you added,
and you want to add your events and have an
application listening to those events, you
can do that. It's not meant for high
traffic, it's just low bandwidth stuff.
That's PF_SYSTEM. There is another thing
that we added in Darwin, which is the
network driver, PF_NDRV. I want to mention,
for people coming from 9, that it has
nothing to do with the NDRV, the native
driver from Mac OS 9. It's really the
network driver here. What it gives you,
from the socket level, is access to all the
packets, to the raw packets. As we said
before, try to avoid doing your protocol,
or the things you are trying to do, in the
kernel, for all the reasons we stated
before. If you want to do your own protocol
in userland, you could use PF_NDRV to get
some packets there. As an example, in
Darwin you can look at SharedIP, which is
the name of the kernel extension we're
using for doing the port sharing with
Classic. Classic networking is an example:
basically Classic listens on PF_NDRV
sockets to get its packets back, and
emulates a DLPI driver that way. So the
DLPI driver is really talking through a
socket to PF_NDRV. That's an example for
you if you're trying to do this kind of
thing. Right now it works on Ethernet.
Okay, funnels. Funnels are a mechanism that
we introduced in Darwin; this is not
something you'll find in FreeBSD. Why do
we have this? Basically, as you know, we're
an MP system, we have multiprocessors, and
what we want is a mode where we can have
performance in MP. And the networking stack
in Darwin, coming from FreeBSD, is not
completely MP safe, let's say. So funnels
are a mechanism to give mutual exclusion,
to make sure that when you run in the
networking code (networking coming from
I/O Kit, up to the socket layer, to the
system calls) nobody else is going to be
running in that code at the same time. So
we've got a mutex that we take from the
socket layer or from the packet level, and
that's going to make sure that nobody else
can be running in, let's say, TCP code and
doing something on the socket. One problem
to think about: you're sending some data on
a TCP socket, and at the same time we're
getting a disconnect. If we didn't have a
system like funnels in a multiprocessor
environment, you could be doing your TCP
send while the state of your TCP connection
is being modified by the incoming packet.
We don't want that; we are not prepared for
that. So the way to do this is the funnel.
Basically, in the Darwin kernel there are
two funnels: the network funnel, which is
used by the network stack, and the kernel
funnel, which is used by the rest of the
BSD subsystem. If you remember the diagram
from before,
inside the Mac OS X kernel we have the BSD
subsystem, which is more or less
networking plus file systems, so the kernel
funnel is used by the file systems. What it
means is that in the file system, or in the
networking, you cannot have two processors
at the same time. However, with this
mechanism you can have one processor
dealing with packets, processing packets,
while the other processor is doing a file
system sync. That gives us some good
performance for servers in an MP
environment, where you can have your Apache
or AppleShare server use one processor to
do some networking activity while the other
is flushing stuff to the disk or doing some
file access. The problem with funnels is
that you need to be aware of them. And the
rule I stated, that basically we have one
lock at the top of the networking stack, at
the socket layer (I mean at the system call
layer), and one at the bottom, is not
completely true. So we're going to go into
some of the details of what you need to do
in your NKE to deal with funnels. But
that's the difference from FreeBSD, so
that's something you need to be aware of.
And the thing here is that you need to deal
with funnels. You cannot say "I don't care
about funnels"; your system is going to
have problems if you don't deal with that.
Okay, so when to use them.
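
As a rough picture of the two-funnel model just described, here is a toy user-space sketch in Python. It does not use the Darwin funnel calls, which are not shown in this talk; `ToyFunnel`, `tcp_send`, `tcp_input_disconnect` and `filesystem_sync` are hypothetical names made up for illustration. Each funnel is modeled as one mutex: all "networking" code runs under one lock, all "file system" code under another, so a send and an incoming disconnect can never interleave on the same connection state, while file system work is free to run alongside networking.

```python
import threading


class ToyFunnel:
    """Hypothetical funnel: one mutex that also records how many threads were inside."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self.inside = 0
        self.max_inside = 0

    def __enter__(self):
        self._lock.acquire()
        self.inside += 1
        self.max_inside = max(self.max_inside, self.inside)
        return self

    def __exit__(self, *exc):
        self.inside -= 1
        self._lock.release()
        return False


network_funnel = ToyFunnel("network")
kernel_funnel = ToyFunnel("kernel")

tcp_state = {"connected": True, "bytes_queued": 0}


def tcp_send(nbytes):
    # Runs under the network funnel, so the connection state cannot change mid-send.
    with network_funnel:
        if tcp_state["connected"]:
            tcp_state["bytes_queued"] += nbytes


def tcp_input_disconnect():
    # An incoming disconnect also runs under the network funnel.
    with network_funnel:
        tcp_state["connected"] = False
        tcp_state["bytes_queued"] = 0


def filesystem_sync(log):
    # File system work uses the other funnel and does not contend with networking.
    with kernel_funnel:
        log.append("sync")


fs_log = []
threads = [threading.Thread(target=tcp_send, args=(100,)) for _ in range(8)]
threads.append(threading.Thread(target=tcp_input_disconnect))
threads.append(threading.Thread(target=filesystem_sync, args=(fs_log,)))
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whatever order the threads run in, at most one of them is ever inside the network funnel, and the connection ends up disconnected with nothing queued, because sends that run after the disconnect see the updated state.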
So you need to set the network funnel,
and specify that you want to work in the
network funnel, in your module start
and stop. Why? Because you're called by
I/O Kit, basically, when your module
starts, and I/O Kit is not running under a
funnel. I/O Kit and the Mach part of the
kernel don't need to have a funnel; they're
already completely MP safe, and they don't
need that. So when you're going to be
called, you need to basically say "I'm
going to use the network funnel", and there
are calls to do that. Same thing for the
stop. Timeouts are another case where you
need to be switching funnels, or taking the
network funnel. Why? Because the timeouts
are called as a direct mapping from the
Mach subsystem, and you're not under any
funnel. So if you run your code, you need
to explicitly say "hey, I'm going to run in
the networking code, so I need to grab the
network funnel first". And there are also
things to do when you're working with
threading: if you create a new thread, you
need to explicitly say that this thread is
to run under the network funnel. Yeah, it's a
preemption point: each time you're
trying to grab the funnel, or you're
going to leave the funnel, you can be
preempted, and another thread in the kernel
can run. So be aware that your state might
change when you come back from asking for a
funnel or leaving a funnel. Just be aware
of that. And if you're familiar with
FreeBSD, splnet and splx in Darwin are more
or less no-ops. I'm saying more or less
because all they do right now is make sure
that you're under either the network or the
kernel funnel. They don't have any active
nesting or anything like that. In FreeBSD
you would protect a critical section by
raising splnet and saying nobody else can
enter at this point. On X they are more or
less no-ops, but if you're under the
network funnel, nobody's going to get in
there. So that's it for funnels. That's
something you need to look at, and look at
our kernel extensions in Darwin to see how
to use that appropriately. NKE control:
there is a way, PF_NKE, to control the
NKE from a process, from user mode. Let's
say you inserted a socket filter or a data
link interface layer filter, and you want
to have some control over it. You can have
a special mechanism, like a conduit, to
control your NKE through this NKE control,
the PF_NKE. The NKE manager is not loaded
by default, so it's something you need to
do, and we're kind of in the middle of
changing that, so right now it's work in
progress. It's a character device. And
there are some other ways that we encourage
for talking to your NKE. You can go through
ioctl: we're going to intercept ioctls, and
you can use that as the control mechanism
for your NKE. There are also socket
options, and this is what we do if you look
in SharedIP: we use the socket option as a
control mechanism currently.
So, VPN. We talked about ways to
implement a VPN. You could do that as
a pseudo device, depending on the type of
VPN you're doing, or you could do that as a
DLIL filter. It really depends on how
your code is organized and which level
you think is appropriate for you to
plug in. Be aware that KAME IPsec is
coming to Mac OS X. So if your VPN
solution is using IPsec: right now
KAME IPsec is in Darwin, and you can take
a peek at that, and sometime in the future
this will be part of the Mac OS X kernel.
So you can build your own Darwin kernel
with IPsec right now and use that as a
base for your VPN solution if you're
using IPsec. And talk to us if you're
interested; we're really interested in
knowing how we can help you with that.
Summary. I just want to go over it again,
because the message is about the rules for
NKEs. You have to be really careful about
your dependencies. You have some I/O Kit
mechanism to say "hey, I'm linking against
that kernel, I don't want to be loaded if
it's version 15 of Mac OS X". You've got to
keep track of your resources and usage;
nobody's going to clean up after you, you
have to do it. Do not block, in particular
on the fast path: you're going to block the
whole networking stack. Here you're part of
the networking, so you just have to behave
and follow those rules. You have to know
your split funnels; be really careful about
that. Remember that timeouts, and anything
that is coming out of the kernel funnel or
the rest of the kernel, are probably not
under the network funnel, so just check
that. And be aware of binary compatibility.
In the future, as we said, IPsec and IPv6
are coming. That's the power of Darwin: you
can look at that in the Darwin kernel, and
we're going to be based on that. PPP
extensibility: there will be a way to get
some plugins for PPP, say if you want to
have PPP over ATM, or PPP over some other
new cool media you just invented. The NKE
control socket API is in flux; it's going
to change a little bit. And also, in the
future, we're planning to get, instead of
funnels, some finer-grained locking of
sockets and things, so we don't have to
use a funnel mechanism. So that's pretty
much it.
So I hope you got some information there.
You have additional resources here:
Mac OS X, of course; you've seen those, so
I'm going to go fast. darwin.org and
FreeBSD are also good points for
information. Resources that just
disappeared: the Stevens books that we
talked about, The Implementation, and The
Design and Implementation of 4.4BSD, are
also pretty interesting. However, be aware
of those differences we talked about, like
funnels, DLIL and things like that. The
books give you a pretty good idea about
what's going on, but it's not exactly what
we have in Darwin. And there's the Network
Kernel Extensions PDF file, which has very
complete coverage of all the types of DLIL
filters you can do, and socket filters as
well. This is where you're going to dig
into all the details about how to do your
NKE. And also the kernel extension tutorial
for I/O Kit, which is going to tell you how
to build NKEs with Project Builder and what
the rules are for dependencies and things
like that. And tell us what you need from
us. We're trying to make this extensible.
We've got some mechanisms that we think
cover some ground, so tell us if you need
more, or what you think we should change
here. We're really interested in getting
your feedback on the system. And the
roadmap: this morning was the networking
overview session; it's done already. This
afternoon, pretty interesting, is network
configuration and mobility; we'll see how
those guys are using the events to get some
state information from the stack. And
Thursday morning we'll all be there for the
network feedback forum, which is in room
J1, just next door. Always fun. So with
that, for who to contact: contact my boss.