Second NMADS Meeting
Abstracts

Paper 11

Transport Layer Support for Highly-Available Network Services

Kiran Srinivasan, Florin Sultan, and Liviu Iftode

Rutgers University

Problem and Motivation

There has been a growing trend in the Internet to view resources as services rather than as servers. The coupling between a resource and its location (IP address) is progressively losing importance. Clients, the recipients of a service, care only about obtaining good quality of service, regardless of the geographical location from which it is provided. Service providers must therefore offer highly available, high-throughput services. The problem is being approached in two ways. One is to build fault-tolerant, high-throughput servers from a cluster of computers connected by a SAN. The other is to distribute servers providing the same service across geographically distinct areas. The current transport layer (TCP) facilitates neither approach, because it does not support easy handoff of a connection from one server to another. The primary emphasis of this project is to explore and design a transport layer mechanism that enables building services based on these architectures.

Background and Related Work

Related work includes TCP/IP connection handoff protocols used either for mobility extensions to TCP/IP [Bakre95, Balakrishnan95, Snoeren00] or for request distribution in clusters [Aron99]. Indirect TCP [Bakre95] supports physical mobility of a connection endpoint over a wireless link by splitting the connection at a base station, maintaining hard state for it there, and transferring that state to the next base station during handoff. [Balakrishnan95] supports mobility by maintaining soft state in the base stations, at the expense of flooding nearby base stations via multicast to build the necessary state before a handoff. [Snoeren00] changes the protocol stack at the endpoints to provide end-to-end mobility when the network attachment point (IP address) changes.

None of these approaches considers migrating a connection endpoint between physically distinct machines. They rely either on directly using the full connection state maintained at the physical endpoints [Bakre95, Balakrishnan95], or on restarting a previously established connection after an IP address change, using an authentication mechanism to reuse the state of the old connection [Snoeren00].

[Aron99] proposes TCP connection handoff in clustered servers to distribute incoming connection requests from a front-end machine to the back-end server nodes. The approach is limited by its specialized nature (a cluster-based web server): in the single handoff scheme, a connection endpoint can migrate only during the connection setup phase. Multiple handoff of persistent HTTP/1.1 connections is mentioned only as an alternative, with no design or implementation described. Even in the multiple handoff scheme, the granularity of migration of live connections is application-dependent: a connection can migrate only after an HTTP request has been fully serviced.

Our mechanism for connection handoff targets migration of the endpoint of an active (live) connection between physically distinct hosts, transparent to the fixed endpoint, at any moment during the connection lifetime.

Approach and Uniqueness

The model of client-server interaction that we envision in our design assumes a number of hosts (the servers), either clustered or distributed across the Internet, that provide the same service (i.e., run the same application). At the beginning of a service session, a client contacts a preferred server over a TCP connection.

During the lifetime of a session, under the control of our scheme, the remote endpoint of the connection may migrate between the servers, transparently to the client application, for example in response to failure of the current server or a loss in the quality of service received by the client. When a migration is triggered, the transport layer at the destination server reincarnates the connection endpoint that existed at the previous server. If it fails to do so, the event is treated as a failure to provide the service and the session/connection is aborted.

Connection migration involves only one endpoint of the connection (the server side), while the other endpoint (client side) is fixed. Migration may occur multiple times and at any moment throughout the lifetime of an active connection. 

Because migration may occur at any time, the application-level state associated with the ongoing data transfer may have to be restored at the destination host so that the transfer continues correctly on the migrated connection. We provide a minimal API for exporting and importing the server-specific state that completely describes the ongoing data transfer at the application level. The writer of a server application uses this API to take advantage of dynamic server-side migration of live TCP connections.
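To make the export/import idea concrete, the following is a minimal sketch under stated assumptions, not the API of our system. The names TransferState, export_state, import_state, and serve_chunk are hypothetical, and the state is assumed to be just a resource name plus a byte offset; a real server may need more. The transport layer that would carry the exported blob to the destination host is not modeled.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class TransferState:
    """Application-level state of one ongoing transfer (hypothetical layout)."""
    resource: str     # what is being served, e.g. a file name
    bytes_sent: int   # how far the transfer has progressed


def export_state(state: TransferState) -> bytes:
    """Serialize the state so it can travel with the migrating connection."""
    return json.dumps(asdict(state)).encode("utf-8")


def import_state(blob: bytes) -> TransferState:
    """Rebuild the state at the destination server."""
    return TransferState(**json.loads(blob.decode("utf-8")))


def serve_chunk(content: bytes, state: TransferState, chunk_size: int) -> bytes:
    """Return the next chunk to send and advance the recorded progress."""
    chunk = content[state.bytes_sent:state.bytes_sent + chunk_size]
    state.bytes_sent += len(chunk)
    return chunk


# Both servers hold the same content, since they provide the same service.
CONTENT = b"0123456789abcdef"

# The origin server sends part of the response, then the connection migrates.
origin_state = TransferState(resource="demo.txt", bytes_sent=0)
received = serve_chunk(CONTENT, origin_state, 6)
blob = export_state(origin_state)

# The destination server imports the state and continues from the same offset.
dest_state = import_state(blob)
while dest_state.bytes_sent < len(CONTENT):
    received += serve_chunk(CONTENT, dest_state, 6)
```

In this sketch the client sees one uninterrupted byte stream even though two servers produced it, which is the property the exported state is meant to preserve.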

Our migration mechanism can be used uniformly both in cluster-based servers and in groups of servers distributed over a wide area. Inside a cluster, we can take advantage of a high-bandwidth, low-latency backend network (SAN) to proactively replicate connection state across the cluster. This provides hot-swappable endpoints for a connection: if the server node fails, the client can transparently continue communicating with a backup node. Over a wide area, the mechanism can be used to recover from events such as network congestion and DoS attacks that degrade the quality of service received by the client.
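The hot-swap behavior inside a cluster can be illustrated with a small in-memory model. This is only a sketch under assumptions: ConnState, Node, and Cluster are hypothetical names, the replicated state is reduced to a single acknowledged-byte counter, and replication is a direct method call standing in for a transfer over the backend network.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ConnState:
    """Hypothetical per-connection state replicated across the cluster."""
    conn_id: str
    bytes_acked: int   # data the client has acknowledged so far


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True
        self.replicas: Dict[str, ConnState] = {}

    def store(self, state: ConnState) -> None:
        self.replicas[state.conn_id] = state


class Cluster:
    def __init__(self, primary: Node, backup: Node) -> None:
        self.primary = primary
        self.backup = backup

    def update(self, state: ConnState) -> None:
        """Record progress on the primary and proactively copy it to the backup."""
        self.primary.store(state)
        if self.backup.alive:
            self.backup.store(state)

    def fail_over(self, conn_id: str) -> Optional[ConnState]:
        """If the primary is down, promote the backup; return the state to resume from."""
        if not self.primary.alive:
            self.primary, self.backup = self.backup, self.primary
        return self.primary.replicas.get(conn_id)


cluster = Cluster(Node("a"), Node("b"))
cluster.update(ConnState("c1", 1000))
cluster.update(ConnState("c1", 2500))

cluster.primary.alive = False          # simulate a hard failure of node "a"
resumed = cluster.fail_over("c1")      # node "b" takes over with the latest copy
```

Because the backup already holds the most recent copy when the failure occurs, nothing needs to be fetched from the failed node, which is the point of replicating proactively.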

Client applications (e.g., a web browser) are unchanged in our system. We change the client and server TCP/IP stacks to accommodate a new type of connection. Although server applications must be changed, we believe the programming effort involved should be fairly low.

Results and Contributions

We are currently designing and implementing our migration scheme by extending the TCP specification.

The connection migration mechanism we propose can be used (i) on the client side, to dynamically shift to another server when the current server does not provide a satisfactory level of service, or (ii) on the server side, to recover connections from hard node failures in cluster-based servers, or to implement a load balancing scheme by shedding load from existing connections to other less loaded servers.
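As an illustration of the load-shedding use, here is one possible policy, offered as an assumption rather than as part of our design: servers above a connection-count threshold move one connection at a time to the least loaded server. The function plan_shedding and its parameters are hypothetical, and the sketch only plans the moves; the migrations themselves would be carried out by the transport-layer mechanism.

```python
from typing import Dict, List, Tuple


def plan_shedding(load: Dict[str, int], threshold: int) -> List[Tuple[str, str]]:
    """Return (source, destination) moves, one live connection per move.

    load maps a server name to its number of live connections. Servers above
    threshold shed connections to the currently least loaded server until they
    reach the threshold or no server has spare capacity.
    """
    load = dict(load)  # work on a copy so the caller's view is untouched
    moves: List[Tuple[str, str]] = []
    for source in sorted(load, key=load.get, reverse=True):
        while load[source] > threshold:
            dest = min(load, key=load.get)
            if load[dest] >= threshold:
                return moves  # no server has room for another connection
            load[source] -= 1
            load[dest] += 1
            moves.append((source, dest))
    return moves


moves = plan_shedding({"a": 5, "b": 1, "c": 2}, threshold=3)
```

For the example input, server "a" is the only one above the threshold, so every planned move has "a" as its source.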

In addition, we believe that our scheme can alleviate the impact of certain types of DoS attacks (SYN flooding, the process table attack) on existing connections by shifting those connections from the host under attack to alternate servers. The migration mechanism may need resources on the attacked machine, which may or may not be available depending on the severity of the attack. Even so, connection migration is a better alternative than losing the existing connections.

References

[Aron99] M. Aron, P. Druschel, W. Zwaenepoel. Efficient Support for P-HTTP in Cluster-Based Web Servers. USENIX '99.

[Bakre95] A. Bakre, B. R. Badrinath. Handoff and System Support for Indirect TCP/IP. Second USENIX Symposium on Mobile and Location-Dependent Computing, April 1995.

[Balakrishnan95] H. Balakrishnan, S. Seshan, E. Amir, R. H. Katz. Improving TCP/IP Performance over Wireless Networks. 1st ACM Conf. on Mobile Computing and Networking, November 1995.

[Snoeren00] A. C. Snoeren, H. Balakrishnan. An End-to-End Approach to Host Mobility. 6th ACM MOBICOM, August 2000.

Last Update: 11/16/2000