
About Network Kernel Extensions

Describes how to write a kernel extension for adding or modifying functionality of a networking stack (Mac OS X 10.3 and earlier).

NKE Implementation

Review of 4.4 BSD Network Architecture

Mac OS X is based on the 4.4BSD UNIX operating system. The following structures control the 4.4BSD network architecture:

socket structure, which the kernel uses to keep track of sockets. The socket structure is referenced by file descriptors from user mode.

domain structure, which describes protocol families.

protosw structure, which describes protocol handlers. (A protocol handler is the implementation of a particular protocol in a protocol family.)

ifnet structure, which describes a network device and contains pointers to interface device driver routines.

None of these structures is used uniformly throughout the 4.4BSD networking infrastructure. Instead, each structure is used at a specific level, as shown in Figure 1-1.

Figure 1-1 4.4BSD network architecture

The socket structure is used to manage the socket, while the domain, protosw, and ifnet structures are used to manage packet delivery to and from the network device.

NKE Types

Making the 4.4BSD network architecture dynamically extensible requires several NKE types that are used at specific locations within the kernel.

socket NKEs, which reside between the network layer and protocol handlers and are invoked through a protosw structure. Socket NKEs use a new set of override dispatch vectors that intercept specific socket and socket buffer utility functions.

protocol family NKEs, which are collections of protocols that share a common addressing structure. Internally, a domain structure and a chain of protosw structures describe each protocol.

protocol handler NKEs, which process packets for a particular protocol within the context of a protocol family. A protosw structure describes a protocol handler and provides the mechanism by which the handler is invoked to process incoming and outgoing packets and for invoking various control functions.

data link NKEs, which are inserted below the protocol layer and above the network interface layer. This type of NKE can passively observe traffic as it flows in and out of the system (for example, a sniffer) or can modify the traffic (for example, by encrypting it or performing address translation). Data link NKEs can provide media support functions (performing demultiplexing, framing, and pre-output functions, such as ARP) and can act as "filters" that are inserted between a protocol stack and a device or above a device.

Figure 1-2 summarizes the NKE architecture.

Figure 1-2 NKE architecture

Global and Programmatic NKEs

Socket NKEs can operate in one of two modes: programmatic or global.

A global NKE is an NKE that is automatically enabled for sockets of the type specified for the NKE.

A programmatic NKE is a socket NKE that is enabled only under program control, using socket options, for a specific socket.

Data link "filters" are essentially global in that they can't be accessed by specific sockets.

Tracking NKE Usage

To support the dynamic addition and removal of NKEs in Mac OS X, the kernel keeps track of the use of NKEs by other parts of the system.

Use of protocol family NKEs is tracked by the dom_refs member of the domain structure, which has been added to support NKEs in Mac OS X. The kernel's socreate function increments dom_refs each time socreate is called to create a socket in an NKE domain. The socreate function is called when user-mode applications call socket or when sonewconn successfully connects to a local listening socket. The dom_refs member is decremented each time soclose is called to close a socket connection.

Use of protocol handler NKEs is tracked by the pr_refs member of the protosw structure, which has been added to support NKEs in Mac OS X. Like the dom_refs member of the domain structure, the pr_refs member of the protosw structure tracks the use of the protocol between calls to socreate and sonewconn to create a socket and soclose to close a socket.

The most important aspect of removing an NKE is ensuring that all references to NKE resources are eliminated and that all system resources allocated by the NKE are returned to the system. The NKE must track its use of resources, such as socket structures and protocol control blocks, so that the NKE's termination routine can eliminate references and return system resources.
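The reference counting described above can be illustrated with a small user-space sketch. The rc_ structures and functions are hypothetical stand-ins that model only the counters, and the policy of refusing removal with EBUSY while references remain is an assumption made for illustration, not a statement of what the kernel's removal functions return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins: only the reference counts are modeled. */
struct rc_domain {
    int dom_refs;               /* open sockets in this protocol family */
};

struct rc_protosw {
    int               pr_refs;  /* open sockets using this handler */
    struct rc_domain *pr_domain;
};

/* Socket creation takes a reference on both the handler and its domain. */
static void rc_socreate(struct rc_protosw *pp)
{
    assert(pp != NULL && pp->pr_domain != NULL);
    pp->pr_refs++;
    pp->pr_domain->dom_refs++;
}

/* Closing the socket drops both references. */
static void rc_soclose(struct rc_protosw *pp)
{
    assert(pp != NULL && pp->pr_refs > 0 && pp->pr_domain->dom_refs > 0);
    pp->pr_refs--;
    pp->pr_domain->dom_refs--;
}

/* Assumed policy: removal is refused with EBUSY while references remain. */
static int rc_can_remove_proto(const struct rc_protosw *pp)
{
    return pp->pr_refs != 0 ? EBUSY : 0;
}

static int rc_can_remove_domain(const struct rc_domain *dp)
{
    return dp->dom_refs != 0 ? EBUSY : 0;
}
```

In this sketch an NKE's termination routine would check both counters before releasing anything, which mirrors the requirement that all references be eliminated before removal.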

Modifications to 4.4BSD Networking Architecture

To support NKEs in Mac OS X, the 4.4BSD domain and protosw structures were modified as follows:

The protosw array referenced by the domain structure is now a linked list, thereby removing the array's upper bound. The new dom_maxprotohdr member defines the maximum protocol header size for the domain. The new dom_refs member is a reference count that is incremented when a new socket for this address family is created and is decremented when a socket for this address family is closed.

The protosw structure is no longer an array. The pr_next member has been added to link the structures together. This change has implications for protox usage for AF_INET and AF_ISO input packet processing. The pr_flags member is an unsigned integer instead of a short. NKE hooks have been added to link NKE descriptors together (pr_sfilter).
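To make the effect of the linked-list change concrete, here is a minimal user-space sketch. The pl_ structures are hypothetical, cut-down stand-ins that carry only the members named above (plus a type and protocol number for lookup); they are not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; only the members
 * discussed above are modeled. */
struct pl_protosw {
    int                pr_type;      /* socket type, for example stream or datagram */
    int                pr_protocol;  /* protocol number within the family */
    unsigned int       pr_flags;     /* unsigned int rather than short */
    struct pl_protosw *pr_next;      /* link to the next handler in the domain */
};

struct pl_domain {
    int                dom_family;       /* address family */
    int                dom_maxprotohdr;  /* largest protocol header in the domain */
    int                dom_refs;         /* sockets currently open in this domain */
    struct pl_protosw *dom_protosw;      /* head of the protosw list */
};

/* Push a handler onto the front of the domain's list; no array bound applies. */
static void pl_add_proto(struct pl_domain *dp, struct pl_protosw *pp)
{
    assert(dp != NULL && pp != NULL);
    pp->pr_next = dp->dom_protosw;
    dp->dom_protosw = pp;
}

/* Walk the list to find the handler for a given type and protocol. */
static struct pl_protosw *pl_find_proto(const struct pl_domain *dp,
                                        int type, int protocol)
{
    struct pl_protosw *pp;

    assert(dp != NULL);
    for (pp = dp->dom_protosw; pp != NULL; pp = pp->pr_next) {
        if (pp->pr_type == type && pp->pr_protocol == protocol)
            return pp;
    }
    return NULL;
}
```

Because lookup walks pr_next instead of indexing a fixed array, code that assumed array positions (as the protox tables do) has to be revisited.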

PF_SYSTEM Domain

Mac OS X defines a new domain, the PF_SYSTEM domain, whose purpose is to provide a way for applications to configure and control NKEs. The PF_SYSTEM domain has two protocols, only one of which is of interest for communicating with an NKE:

The SYSPROTO_CONTROL protocol is used for configuring and controlling all NKEs.

Internally, the PF_SYSTEM domain’s initialization function is called when the PF_SYSTEM domain is initially added to the system. The initialization function adds the SYSPROTO_CONTROL protocol to the domain’s protosw list and performs other initialization tasks.

In the NKE's start method, register a Kernel Controller structure using the ctl_register function. The ctl_register function is defined in <sys/kern_control.h>. The ctl_register call is prototyped as follows.

int ctl_register(struct kern_ctl_reg *userctl,
                 void *userdata,
                 kern_ctl_ref *ctlref);

The fields of the kern_ctl_reg structure are defined as follows.

ctl_id - unique 4 byte id for the controller. Enter a registered Creator ID. Go to the Apple Developer Creator ID web page to register a unique ID. See http://developer.apple.com/dev/cftype/ for more information.

ctl_unit - the unit number for the controller. A controller can be registered multiple times with the same ctl_id, but a different unit number must be used for each instance.

ctl_flags - set to CTL_FLAG_PRIVILEGED to require that the user have admin privileges to contact the controller.

ctl_sendsize - size of buffer reserved for sending messages. 0 = default value.

ctl_recvsize - size of buffer reserved for receiving messages. 0 = default value.
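The rule that each registration of the same ctl_id needs its own ctl_unit can be pictured with a small table keyed on the (id, unit) pair. This is only a user-space sketch under stated assumptions: reg_table, reg_init, and reg_add are hypothetical names, and the EEXIST and ENOMEM return values are choices made for the sketch, not documented results of ctl_register.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REG_MAX 8   /* arbitrary table size for the sketch */

struct reg_entry {
    uint32_t ctl_id;    /* 4-byte controller id */
    uint32_t ctl_unit;  /* unit number of this instance */
    int      in_use;
};

struct reg_table {
    struct reg_entry slots[REG_MAX];
};

static void reg_init(struct reg_table *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof(*t));
}

/* Record a controller; the (ctl_id, ctl_unit) pair must be unique. */
static int reg_add(struct reg_table *t, uint32_t ctl_id, uint32_t ctl_unit)
{
    struct reg_entry *free_slot = NULL;
    size_t i;

    assert(t != NULL);
    for (i = 0; i < REG_MAX; i++) {
        struct reg_entry *e = &t->slots[i];
        if (!e->in_use) {
            if (free_slot == NULL)
                free_slot = e;          /* remember the first free slot */
        } else if (e->ctl_id == ctl_id && e->ctl_unit == ctl_unit) {
            return EEXIST;              /* same id and unit already present */
        }
    }
    if (free_slot == NULL)
        return ENOMEM;                  /* table full */
    free_slot->ctl_id = ctl_id;
    free_slot->ctl_unit = ctl_unit;
    free_slot->in_use = 1;
    return 0;
}
```

Registering the same id twice with units 0 and 1 succeeds in this model, while repeating unit 0 is rejected.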

Dispatch Functions

ctl_connect - called when the client process calls connect on the socket with the id/unit number of the registered controller.

ctl_disconnect - called when the user client process closes the control socket.

ctl_write - called when the user client process writes data to the socket.

ctl_set - called when the user client process calls setsockopt to set the controller configuration.

ctl_get - called when the user client process calls getsockopt on the socket.

The following is a code example of this process.

Listing 1-1 Dispatch example

struct kern_ctl_reg ep_ctl;
int error;

// Initialize controller
bzero(&ep_ctl, sizeof(ep_ctl));          // sets ctl_unit to 0
ep_ctl.ctl_id = kEPCommID;               // should be unique; use a registered Creator ID here
ep_ctl.ctl_flags = CTL_FLAG_PRIVILEGED;
ep_ctl.ctl_write = EPHandleWrite;
ep_ctl.ctl_get = EPHandleGet;
ep_ctl.ctl_set = EPHandleSet;
ep_ctl.ctl_connect = EPHandleConnect;
ep_ctl.ctl_disconnect = EPHandleDisconnect;

// gEPState is program-defined state; it is handed back to each handler as userdata
error = ctl_register(&ep_ctl, &gEPState, &gEPState.ctlHandle);

int EPHandleSet(kern_ctl_ref ctlref, void *userdata, int opt, void *data, size_t len)
{
    int error = EINVAL;    // returned for any option that is not recognized

#if DO_LOG
    log(LOG_ERR, "EPHandleSet opt is %d\n", opt);
#endif

    switch (opt)
    {
        case kEPCommand1:    // program defined symbol
            error = Do_First_Thing();
            break;

        case kEPCommand2:    // program defined symbol
            error = Do_Command2();
            break;
    }
    return error;
}

int EPHandleGet(kern_ctl_ref ctlref, void *userdata, int opt, void *data, size_t *len)
{
    int error = EINVAL;

#if DO_LOG
    log(LOG_ERR, "EPHandleGet opt is %d *\n", opt);
#endif

    return error;
}

int
EPHandleConnect(kern_ctl_ref ctlref, void *userdata)
{
#if DO_LOG
    log(LOG_ERR, "EPHandleConnect called\n");
#endif
    return (0);
}

void
EPHandleDisconnect(kern_ctl_ref ctlref, void *userdata)
{
#if DO_LOG
    log(LOG_ERR, "EPHandleDisconnect called\n");
#endif
    return;
}

int EPHandleWrite(kern_ctl_ref ctlref, void *userdata, struct mbuf *m)
{
#if DO_LOG
    log(LOG_ERR, "EPHandleWrite called\n");
#endif
    return (0);
}
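The EPHandleGet handler in the listing only logs and returns EINVAL. As a hedged illustration of what a get handler might do with its data and len arguments, the following user-space sketch copies a status record out to the caller. The option name kEPGetStats, the ep_stats structure, and the function ep_get_sketch are all hypothetical, and the kern_ctl_ref parameter is omitted so that the sketch compiles outside the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define kEPGetStats 100   /* hypothetical program-defined option name */

/* Hypothetical status record kept by the NKE. */
struct ep_stats {
    unsigned int packets_seen;
    unsigned int packets_dropped;
};

/*
 * Sketch of a get handler body. userdata is assumed to point at the
 * NKE's ep_stats; data and *len describe the caller's buffer.
 */
static int ep_get_sketch(void *userdata, int opt, void *data, size_t *len)
{
    const struct ep_stats *stats = userdata;

    assert(stats != NULL && len != NULL);
    if (opt != kEPGetStats)
        return EINVAL;                  /* unknown option name */
    if (data == NULL || *len < sizeof(*stats))
        return EINVAL;                  /* caller's buffer is too small */

    memcpy(data, stats, sizeof(*stats));
    *len = sizeof(*stats);              /* report how much was written */
    return 0;
}
```

Checking the buffer length before copying matters because the length comes from the client's getsockopt call and cannot be trusted.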

Connection from the Client Process

After the NKE registers a Kernel Controller structure, the application-level process opens a PF_SYSTEM socket and sets up a sockaddr_ctl structure with the parameters required to communicate with the NKE's Kernel Controller.

To communicate with the NKE, the client process opens a PF_SYSTEM socket using the socket call.

fd = socket(PF_SYSTEM, SOCK_DGRAM, SYSPROTO_CONTROL);

The client process uses the connect call with the file descriptor returned from the socket call to establish a connection with the NKE. In making the connect call, fill in the sockaddr_ctl structure as follows.

sc_len = sizeof(struct sockaddr_ctl);

sc_family = AF_SYSTEM;

ss_sysaddr = AF_SYS_CONTROL;

sc_id = set to the value of ctl_id registered by the NKE in the ctl_register call described above.

sc_unit = set to the unit number registered by the NKE in the ctl_register call described above.

The client process uses the setsockopt call to send commands to the NKE. Note that the option names are user defined. The NKE defines what option names it will respond to, and the client process must pass only supported option names to the NKE in the setsockopt call.

The client process uses the getsockopt call to get status information from the NKE. Note that the option names are user defined. The NKE defines what option names it will respond to, and the client process must pass only supported option names to the NKE in the getsockopt call.

The following is a code example for opening a PF_SYSTEM socket to communicate with an NKE:

Listing 1-2 Opening a PF_SYSTEM socket

struct sockaddr_ctl addr;
int fd;
int result = 1;

bzero(&addr, sizeof(addr));    // sets the sc_unit field to 0
addr.sc_len = sizeof(addr);
addr.sc_family = AF_SYSTEM;
addr.ss_sysaddr = AF_SYS_CONTROL;
addr.sc_id = kEPCommID;        // should be unique; use a registered Creator ID here

fd = socket(PF_SYSTEM, SOCK_DGRAM, SYSPROTO_CONTROL);
if (fd >= 0)                   // socket returns -1 on failure
{
    result = connect(fd, (struct sockaddr *)&addr, sizeof(addr));
    if (result)
        fprintf(stderr, "connect failed %d\n", result);
}
else
    fprintf(stderr, "failed to open socket\n");

if (!result)
{
    result = setsockopt(fd, SYSPROTO_CONTROL, kEPCommand1, NULL, 0);
    if (result)
        fprintf(stderr, "setsockopt failed on kEPCommand1 call - result was %d\n", result);
    // etc.
}

Implementing a Preference File for NKE

The question arises as to how an NKE can open a "preference file" in its start method. Under the existing architecture, the NKE cannot reliably access a preference file. When the system starts the NKE, there are no APIs that the NKE can use to open a file and read preference information. Although the NKE could read its Info.plist, that file is assumed not to change across startups, because the system caches its contents in order to expedite startup.

The proper way to dynamically configure an NKE is with a startup daemon or other application level process. The daemon finds the NKE using the communication method described above, and passes in configuration information that the NKE may require.
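One way to picture the configuration hand-off is a fixed-size record that the daemon passes through setsockopt and that the NKE validates before accepting. The sketch below runs in user space and is built entirely on assumptions: the ep_config layout, the kEPSetConfig option name, the version check, and the function name are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define kEPSetConfig      200   /* hypothetical program-defined option name */
#define EP_CONFIG_VERSION 1     /* hypothetical layout version */

/* Hypothetical configuration record shared by the daemon and the NKE. */
struct ep_config {
    uint32_t version;           /* must equal EP_CONFIG_VERSION */
    uint32_t max_connections;
    uint32_t verbose;           /* nonzero enables logging */
};

/*
 * Sketch of the set-handler logic: accept the new configuration only if
 * the option name, the length, and the version all match.
 */
static int ep_set_config_sketch(struct ep_config *active, int opt,
                                const void *data, size_t len)
{
    struct ep_config incoming;

    assert(active != NULL);
    if (opt != kEPSetConfig)
        return EINVAL;
    if (data == NULL || len != sizeof(incoming))
        return EINVAL;                      /* wrong size: reject outright */

    memcpy(&incoming, data, sizeof(incoming));
    if (incoming.version != EP_CONFIG_VERSION)
        return EINVAL;                      /* daemon and NKE disagree on layout */

    *active = incoming;                     /* commit only after validation */
    return 0;
}
```

Copying into a local record and committing only after every check passes keeps a malformed request from leaving the NKE half-configured.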

About Protocol Family NKEs

Adding and removing protocol family NKEs is accomplished by calling net_add_domain and net_del_domain, respectively. These calls are described in Protocol Family NKE Functions. For detailed information about implementing protocol families, see The Design and Implementation of the 4.4 BSD Operating System by M. K. McKusick et al. and TCP/IP Illustrated by W. Richard Stevens.

About Protocol Handler NKEs

Adding and removing protocol handler NKEs is accomplished by calling net_add_proto and net_del_proto, respectively. These calls are described in Protocol Handler NKE Functions. For detailed information about implementing protocol handlers, see The Design and Implementation of the 4.4 BSD Operating System by M. K. McKusick et al. and TCP/IP Illustrated by W. Richard Stevens.

About Socket NKEs

Socket NKEs are installed in the kernel by calling register_sockfilter, typically from the NKE's initialization routine. Each socket NKE provides a descriptor structure that is linked into a global list (nf_list). A second chain runs through the filter descriptor to link it to a protosw for global NKEs. Figure 1-3 shows the interconnections for these data structures.

Figure 1-3 Domain structure and protosw interconnections

When you call socreate to create a socket, any global NKEs associated with the corresponding protosw structure are attached to the socket structure using the so_ext field to link together kextcb structures that are allocated when the socket is created. (See Figure 1-3.) These kextcb structures are initialized to point to the extension descriptor and two dispatch vectors of intercept functions (one for socket operations and one for socket buffer utilities).
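The attachment step can be sketched in user space as follows. The at_ structures are hypothetical stand-ins, the member names other than so_ext and pr_sfilter are illustrative, and the two dispatch vectors are left out to keep the sketch short. It shows one control block being allocated per global NKE and chained from so_ext in list order.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct at_filter {
    unsigned int      handle;       /* identifies the NKE */
    struct at_filter *next_global;  /* next global NKE on the same protosw */
};

struct at_kextcb {
    struct at_kextcb       *e_next;    /* next control block on the socket */
    const struct at_filter *e_filter;  /* the NKE's descriptor */
};

struct at_protosw { struct at_filter *pr_sfilter; };
struct at_socket  { struct at_kextcb *so_ext; };

/* Free every control block attached to the socket. */
static void at_detach_all(struct at_socket *so)
{
    assert(so != NULL);
    while (so->so_ext != NULL) {
        struct at_kextcb *kp = so->so_ext;
        so->so_ext = kp->e_next;
        free(kp);
    }
}

/* At socket creation, allocate one control block per global NKE. */
static int at_socreate(struct at_socket *so, const struct at_protosw *pp)
{
    struct at_kextcb **tail;
    const struct at_filter *f;

    assert(so != NULL && pp != NULL);
    so->so_ext = NULL;
    tail = &so->so_ext;
    for (f = pp->pr_sfilter; f != NULL; f = f->next_global) {
        struct at_kextcb *kp = calloc(1, sizeof(*kp));
        if (kp == NULL) {
            at_detach_all(so);      /* undo the partial attachment */
            return ENOMEM;
        }
        kp->e_filter = f;
        *tail = kp;                 /* append, preserving filter order */
        tail = &kp->e_next;
    }
    return 0;
}
```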

The filter descriptor for a programmatic NKE is linked into the nf_list in the same way as for global NKEs, but the filter descriptor does not appear in the list associated with a protosw. A program can call setsockopt (using socket option SO_NKE) to insert a programmatic NKE into its NKE chain in the same way that it would call setsockopt to insert a global NKE.

Each socket NKE has two dispatch vectors, a sockif structure and a sockutil structure, that contain pointers to the NKE's implementation of these functions. The functions are called when the corresponding socket and sockbuf functions are called. The dispatch vectors permit the NKE to selectively intercept socket and socket buffer utilities. Here is an example:

int (*sf_sobind)(struct socket *, struct mbuf *, struct kextcb *);

The kernel's sobind function calls the NKE's bind entry point with the arguments passed to sobind and the kextcb pointer for the NKE. The sockaddr structure contains the name of the local endpoint being bound.

Each of the intercept functions can return an integer value. A return value of zero is interpreted to mean that processing at the call site can continue. A non-zero return value is interpreted as an error (as defined in <sys/errno.h>) that causes the processing of the packet or operation to halt. If the return value is EJUSTRETURN, the calling function (for example, sobind) returns at that point with a value of zero. Otherwise, the function returns the non-zero error code. In this way, an NKE can "swallow" a packet or an operation. An NKE may reinject the packet at a later time. (Note that the injection mechanism is not yet defined.)
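The return-value rules above can be modeled with a short user-space sketch of a call site that walks a socket's NKE chain. Everything prefixed ic_ is hypothetical. IC_EJUSTRETURN is a stand-in constant defined locally because the kernel's EJUSTRETURN is not available to user programs, and the intercept functions take no arguments to keep the example small.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define IC_EJUSTRETURN (-2)   /* local stand-in for the kernel's EJUSTRETURN */

typedef int (*ic_bind_fn)(void);

struct ic_kextcb {
    struct ic_kextcb *e_next;     /* next NKE on the socket */
    ic_bind_fn        sf_sobind;  /* NULL means this NKE does not intercept bind */
};

/* Three sample intercept functions, one per outcome. */
static int ic_pass(void)    { return 0; }               /* let processing continue */
static int ic_swallow(void) { return IC_EJUSTRETURN; }  /* take over the operation */
static int ic_deny(void)    { return EACCES; }          /* fail the operation */

/* Sketch of a call site such as sobind; *bound is set only if the bind runs. */
static int ic_sobind(const struct ic_kextcb *chain, int *bound)
{
    const struct ic_kextcb *kp;

    assert(bound != NULL);
    for (kp = chain; kp != NULL; kp = kp->e_next) {
        int error;

        if (kp->sf_sobind == NULL)
            continue;                  /* not intercepted by this NKE */
        error = kp->sf_sobind();
        if (error == IC_EJUSTRETURN)
            return 0;                  /* swallowed: caller sees success */
        if (error != 0)
            return error;              /* halt and report the error */
    }
    *bound = 1;                        /* normal processing at the call site */
    return 0;
}
```

A swallowed operation and a completed one both return zero to the caller; only the side effect (here, the bound flag) tells them apart.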

A program can insert a socket NKE on an open socket by calling setsockopt as follows:

setsockopt(s, SOL_SOCKET, SO_NKE, &so_nke, sizeof (struct so_nke));

The so_nke structure is defined as follows:

struct so_nke {
    unsigned int nke_handle;
    unsigned int nke_where;
    int nke_flags;
};

The nke_handle specifies the NKE to be linked to the socket (with the so_ext link). It is the programmer's task to locate the appropriate NKE, assure that it is loaded, and retain the returned handle for use in the setsockopt call.

The nke_where value specifies an NKE assumed to be in this linked list. If nke_where is NULL, the NKE represented by nke_handle is linked at the beginning or end of the list, depending on the value of nke_flags.

The nke_flags value specifies where, relative to nke_where, the NKE represented by nke_handle will be placed. Possible values are NFF_BEFORE and NFF_AFTER, defined in <net/kext_net.h>.
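The placement rules for nke_where and nke_flags can be sketched as a linked-list insertion. This user-space model makes several assumptions: the ord_ names are hypothetical, the flag values are stand-ins rather than the values in <net/kext_net.h>, a zero nke_where is taken to mean "beginning" with the BEFORE flag and "end" with the AFTER flag, and ENOENT is an arbitrary choice for a missing nke_where.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ORD_NFF_BEFORE 0x01   /* stand-in values for the sketch */
#define ORD_NFF_AFTER  0x02

struct ord_kextcb {
    unsigned int       handle;  /* identifies the NKE */
    struct ord_kextcb *e_next;  /* next NKE on the socket */
};

/* Link kp into the chain at the position described by where and flags. */
static int ord_insert(struct ord_kextcb **head, struct ord_kextcb *kp,
                      unsigned int where, int flags)
{
    struct ord_kextcb **link = head;

    assert(head != NULL && kp != NULL);
    if (flags != ORD_NFF_BEFORE && flags != ORD_NFF_AFTER)
        return EINVAL;

    if (where == 0) {
        /* No reference NKE: BEFORE means the beginning, AFTER means the end. */
        if (flags == ORD_NFF_AFTER) {
            while (*link != NULL)
                link = &(*link)->e_next;
        }
    } else {
        /* Find the reference NKE; link ends up pointing at its slot. */
        while (*link != NULL && (*link)->handle != where)
            link = &(*link)->e_next;
        if (*link == NULL)
            return ENOENT;             /* reference NKE is not in the chain */
        if (flags == ORD_NFF_AFTER)
            link = &(*link)->e_next;   /* step past the reference NKE */
    }

    kp->e_next = *link;
    *link = kp;
    return 0;
}
```

Using a pointer to the link rather than a pointer to the node lets the same two assignments handle insertion at the head, in the middle, and at the tail.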

The nke_handle and nke_where values are assigned by Apple Computer from the same name space as the type and creator codes used in Mac OS 8 and Mac OS 9 and using the same registration mechanism.

For more information

The following sources provide additional information that may be of interest to developers of network kernel extensions:

The Design and Implementation of the 4.4 BSD Operating System. M. K. McKusick et al., Addison-Wesley, Reading, 1996.

Unix Network Programming, Second Edition, Volume 1. W. Richard Stevens, Prentice Hall, New York, 1998.

TCP/IP Illustrated, Volume 1, The Protocols. W. Richard Stevens, Addison-Wesley, Reading, 1994.

TCP/IP Illustrated, Volume 2, The Implementation. W. Richard Stevens and Gary R. Wright, Addison-Wesley, Reading, 1995.

TCP/IP Illustrated, Volume 3, Other Protocols. W. Richard Stevens, Addison-Wesley, Reading, 1996.

The following websites provide information about the Berkeley Software Distribution (BSD):

http://www.FreeBSD.org

http://www.NetBSD.org

http://www.OpenBSD.org/


Copyright © 2003, 2006 Apple Computer, Inc. All Rights Reserved. Updated: 2006-10-03