Second NMADS Meeting
Abstracts

Paper 10

Operating System Support for Safely and Efficiently Programmable Routers

Prashant Pradhan and Tzi-cker Chiueh

State University of New York at Stony Brook

Placing computation inside the network can yield significant performance benefits to network-based applications. These benefits accrue from topologically strategic placement of computation and from the ability of such computation to exploit global network context. For example, congestion control state can be shared between flows passing through an intranet's portal router, by placing a function in the router that aggregates congestion control state on a path-by-path basis [1].

To support placement of computation in the network, a router operating system should support appropriate abstractions for composing computation on network flows, and provide efficient implementations of these abstractions. More importantly, the core router's integrity and performance should remain unaffected in the presence of such a composable computation framework. Such isolation requires memory protection and performance protection of the router kernel from dynamically added functions. With these goals in mind, we have developed a router operating system that allows safe and efficient composition of computation on network flows. We present the essential features of this OS and describe an example application, Aggregate TCP (ATCP) [1], that can be implemented using these features.

Computation is composed in terms of the following entities:

- Extension functions: These are preemptible functions that can carry state across invocations. Every invocation of an extension function is made in some execution context, called a flow. An extension function may have multiple pending invocations, issued in different flow contexts.

- Flows: Flows are abstract execution contexts. A flow is a unit of scheduling and resource allocation. Flows may allow other flows to share their resources through a simple access control mechanism.
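
The two entities above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the API of the router OS: the names `Flow`, `ExtensionFunction`, `grant`, `shares_with`, `post` and `run_one`, and the `cpu_share` field, are all hypothetical, and preemption is not modeled.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Flow:
    """Abstract execution context: the unit of scheduling and resource allocation."""
    name: str
    cpu_share: float = 1.0                      # hypothetical resource knob
    allowed: set = field(default_factory=set)   # flows allowed to share our resources

    def grant(self, other: "Flow") -> None:
        """Simple access control: let `other` share this flow's resources."""
        self.allowed.add(other.name)

    def shares_with(self, other: "Flow") -> bool:
        return other.name == self.name or other.name in self.allowed


class ExtensionFunction:
    """Carries state across invocations; may have pending invocations in several flows."""

    def __init__(self, name, body):
        self.name = name
        self.body = body
        self.state = {}          # persists across invocations
        self.pending = deque()   # (flow, argument) pairs not yet run

    def post(self, flow, arg):
        self.pending.append((flow, arg))

    def run_one(self):
        flow, arg = self.pending.popleft()
        return self.body(self.state, flow, arg)


def count_packets(state, flow, packet):
    # Per-flow packet counter kept in the function's persistent state.
    state[flow.name] = state.get(flow.name, 0) + 1
    return state[flow.name]


web, voip = Flow("web"), Flow("voip")
web.grant(voip)

counter = ExtensionFunction("count", count_packets)
counter.post(web, b"a")
counter.post(voip, b"b")
counter.post(web, b"c")
results = [counter.run_one() for _ in range(3)]
```

Here one function holds three pending invocations issued in two different flow contexts, and its state survives from one invocation to the next.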

Given the above entities, the following key mechanisms compose computation and determine control flow:

- Asynchronous Invocation: A function, invoked in a given flow context, may pass control to another function by posting an invocation to it. The invocation is asynchronous, and the CPU scheduler determines when to dispatch the invocations pending in the various flow contexts.

- Static Binding: Flows may statically bind themselves to a stream of packets by specifying a packet filtering rule, and an extension function that should get control when a packet matches the rule.

- Dynamic Binding: To pass control to a target function, a given function may reference the target function by names that are strings with semantic connotations. The router OS provides a mechanism to register and query these names.
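
A minimal sketch of how these mechanisms might fit together follows. Everything in it is an assumption made for illustration: the `Router` class, its `register`, `lookup`, `bind`, `post`, `receive` and `run` methods, the dictionary packet format, and the name `"stats/byte-counter"` are hypothetical, and a plain FIFO queue stands in for the preemptive CPU scheduler.

```python
from collections import deque


class Router:
    def __init__(self):
        self.names = {}           # dynamic binding: name -> function
        self.filters = []         # static binding: (rule, flow, function)
        self.run_queue = deque()  # pending invocations: (flow, function, arg)

    def register(self, name, fn):
        self.names[name] = fn

    def lookup(self, name):
        return self.names[name]

    def bind(self, flow, rule, fn):
        """Statically bind `flow` to packets matching `rule`; `fn` gets control."""
        self.filters.append((rule, flow, fn))

    def post(self, flow, fn, arg):
        """Asynchronous invocation: queue the call, do not run it now."""
        self.run_queue.append((flow, fn, arg))

    def receive(self, packet):
        for rule, flow, fn in self.filters:
            if rule(packet):
                self.post(flow, fn, packet)
                return True
        return False

    def run(self):
        # FIFO stand-in for the scheduler that dispatches pending invocations.
        while self.run_queue:
            flow, fn, arg = self.run_queue.popleft()
            fn(self, flow, arg)


log = []


def classify(router, flow, packet):
    log.append(("classify", flow, packet["dst_port"]))
    # Pass control to a target found by its semantic name.
    router.post(flow, router.lookup("stats/byte-counter"), packet)


def count_bytes(router, flow, packet):
    log.append(("count", flow, len(packet["payload"])))


router = Router()
router.register("stats/byte-counter", count_bytes)
router.bind("web-flow", lambda p: p["dst_port"] == 80, classify)

matched = router.receive({"dst_port": 80, "payload": b"GET /"})
unmatched = router.receive({"dst_port": 22, "payload": b""})
log_before_run = list(log)   # nothing has executed yet: invocations are only posted
router.run()
```

Note that `receive` and `post` only enqueue work. No extension function runs until the scheduler loop dispatches it, which is what makes the invocation asynchronous.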

To implement this framework with good invocation performance, extension functions may be co-located with the router kernel to avoid expensive context-switching and TLB flushing overheads. The router kernel's safety is nevertheless not compromised, owing to the use of intra-address space protection [2]. The extension functions are placed in a less privileged subset of the kernel address space, which gives the kernel memory protection while incurring only the overhead of a protected function call on each extension function invocation. Performance protection is ensured through a preemptive CPU scheduler.

Aggregate TCP congestion control (ATCP) [1] is an ideal function for placement inside the network, since it can exploit global information about congestion status on various network paths and allow TCP flows to avoid their cold-start phase in congestion estimation. In ATCP, a router placed at the edge of the network maintains congestion control state for the flows passing through it, grouped by the destination subnet of these flows. An ATCP router, upon receiving a TCP connection request, splits it into a local subconnection (L) and a remote subconnection (R). R starts from a congestion window equal to the warm estimate maintained for the destination subnet. On L, an available credit is maintained, depending upon the congestion window of R and its growth mode (linear or exponential). Since the RTT on L is much smaller than that on the whole L-R path, the congestion window for L can grow to the warm estimate much faster.
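
The per-subnet state and the RTT argument can be illustrated with a short sketch. It is not the ATCP algorithm of [1]: the class `SubnetCongestionTable`, the /24 grouping, the cold-start window of one segment, the idea that the warm estimate is simply the last recorded window, and the RTT values are all assumptions chosen for illustration.

```python
import ipaddress

INITIAL_CWND = 1   # segments; assumed cold-start window
PREFIX_LEN = 24    # assumed granularity of a "destination subnet"


class SubnetCongestionTable:
    """Congestion state kept at the edge router, grouped by destination subnet."""

    def __init__(self, prefix_len=PREFIX_LEN):
        self.prefix_len = prefix_len
        self.warm = {}   # subnet -> last recorded congestion window (assumption)

    def subnet_of(self, dst_ip):
        return ipaddress.ip_network(f"{dst_ip}/{self.prefix_len}", strict=False)

    def record(self, dst_ip, cwnd):
        self.warm[self.subnet_of(dst_ip)] = cwnd

    def initial_cwnd(self, dst_ip):
        # A new remote subconnection starts warm if the subnet is already known.
        return self.warm.get(self.subnet_of(dst_ip), INITIAL_CWND)


def rounds_to_reach(target, start=INITIAL_CWND):
    """Round trips needed to reach `target` when the window doubles every RTT."""
    rounds, cwnd = 0, start
    while cwnd < target:
        cwnd *= 2
        rounds += 1
    return rounds


table = SubnetCongestionTable()
cold = table.initial_cwnd("192.0.2.10")      # subnet not seen yet
table.record("192.0.2.10", 16)
warm = table.initial_cwnd("192.0.2.77")      # same /24, different host
other = table.initial_cwnd("198.51.100.5")   # different subnet stays cold

rounds = rounds_to_reach(warm)
local_ms = rounds * 1      # assumed 1 ms RTT on the local subconnection L
path_ms = rounds * 100     # assumed 100 ms RTT on the whole L-R path
```

Under these assumed numbers the window on L reaches the warm estimate after the same number of round trips as it would end to end, but each round trip is two orders of magnitude shorter.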

ATCP does not require any changes to the end-system TCP implementations, and its evaluation using a real web server trace shows a potential improvement of up to a factor of 2 in normalized HTTP transaction latency. ATCP can be naturally and efficiently implemented using the API exposed by the proposed operating system.

References

[1] P. Pradhan, T. Chiueh, A. Neogi, "Aggregate TCP Congestion Control Using Multiple Network Probing", Proc. ICDCS-2000.

[2] T. Chiueh, G. Venkitachalam, P. Pradhan, "Integrating Segmentation and Paging Protection for Safe, Efficient and Transparent Software Extensions", Proc. ACM SOSP-99.

 
 

 

Last Update: 11/16/2000