Poster 1
Cooperative File System (CFS) for PC clusters
Suresh Gopalakrishnan and Liviu Iftode
Rutgers University
Problem and Motivation:
CFS aggregates the local storage available on each node of a PC cluster into a single global file system that supports not just location transparency but
also location independence. The main goal is to provide scalable I/O throughput and to accommodate variability in load without compromising
manageability and flexibility. File system usage has recently shifted significantly toward server-based applications
and away from traditional interactive applications. These changes in usage and traffic patterns, along with cluster-based server models,
motivate the idea of building a global cluster file system. Location independence allows CFS to manage internally, transparently and
dynamically, file placement and file migration for load balancing and file replication for fault tolerance. The possibility of
using global memory (cooperative caching) and the availability of fast interconnects with user-level communication obviate concerns about
performance.
Background and Related Work:
CFS differs from other DFS projects (Andrew, Sprite, xFS, NFS, Frangipani, Archipelago, etc.) mainly in that it
1. runs at user level
2. runs on top of existing file systems
3. uses volatile metadata that can be rebuilt at run time
4. provides location independence
Projects like Frangipani and xFS were built from scratch and also included storage management (Petal) in the design, which complicates
the whole system. CFS runs on top of any commodity FS and chooses to leverage existing systems rather than attempting to redesign a file
system itself.
Approach and Uniqueness:
The key to providing location independence is maintaining virtual directories on each node that contain global information. There is a
single global (virtual) file system, with a single root, that is the conglomeration of the local file systems on the participating nodes. Directories
are allowed to be replicated, so directories with the same name in the local file systems (e.g., /bin) are considered to be replicas of the
single global directory with that name. The virtual directories are maintained in memory and are volatile; they are built up at run time by a
procedure called 'directory merging' or 'dirmerge'.
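The merging step described above can be illustrated with a minimal Python sketch. It is not the CFS implementation: the function name `dirmerge`, the input layout (a mapping from node name to that node's directory listings), and the node names are assumptions made only for illustration. Same-named directories on different nodes are treated as replicas of one global directory, and each file remembers which nodes hold it.

```python
from collections import defaultdict

def dirmerge(local_trees):
    """Merge per-node directory listings into one virtual namespace.

    local_trees maps a node name to {directory path: set of file names}.
    Returns (virtual_dirs, file_locations). Hypothetical layout, not the CFS API.
    """
    virtual_dirs = defaultdict(set)    # global directory -> union of entries
    file_locations = defaultdict(set)  # global file path -> nodes holding a copy
    for node, tree in local_trees.items():
        for directory, names in tree.items():
            # Same-named directories on different nodes are replicas of one
            # global directory, so their entries are simply unioned.
            virtual_dirs[directory] |= set(names)
            for name in names:
                path = directory.rstrip("/") + "/" + name
                file_locations[path].add(node)
    return dict(virtual_dirs), dict(file_locations)

# Two nodes that both have a local /bin and a local /data.
local_trees = {
    "node1": {"/bin": {"ls", "cat"}, "/data": {"a.log"}},
    "node2": {"/bin": {"ls", "grep"}, "/data": {"b.log"}},
}
virtual_dirs, file_locations = dirmerge(local_trees)
print(sorted(virtual_dirs["/bin"]))       # entries of the global /bin
print(sorted(file_locations["/bin/ls"]))  # nodes that hold /bin/ls
```

Because the merged tables are derived entirely from the local listings, they can be discarded and rebuilt at any time, which is what makes volatile metadata workable.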
CFS is implemented at user level as a library that provides the full file system interface needed by applications. The design of CFS
has two logical layers: the virtual directories, which contain the metadata required to provide location independence, and a
cooperative caching layer with global replacement, which caches file data at block level. CFS exploits high-speed interconnects and user-level
communication, with features such as the memory-mapped communication provided by VIA, to achieve better performance and to allow relaxed
consistency protocols.
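The idea of a block-level cooperative cache with global replacement can be sketched as a single-process simulation. This is an illustrative model under stated assumptions, not the CFS algorithm: the class name `CooperativeCache`, the use of a logical clock for LRU ages, and the policy of forwarding a node's oldest block to a peer are all hypothetical choices.

```python
import itertools

class CooperativeCache:
    """Toy model: every node caches blocks, replacement is cluster-wide LRU."""

    def __init__(self, nodes, blocks_per_node):
        self.capacity = blocks_per_node
        self.caches = {n: {} for n in nodes}  # node -> {block_id: last use}
        self.clock = itertools.count()        # logical time for LRU ordering

    def read(self, node, block_id):
        now = next(self.clock)
        if block_id in self.caches[node]:
            self.caches[node][block_id] = now
            return "local"                    # hit in the node's own memory
        for cache in self.caches.values():
            if block_id in cache:
                cache[block_id] = now
                return "remote"               # hit in a peer's memory
        self._insert(node, block_id, now)
        return "disk"                         # miss everywhere

    def _insert(self, node, block_id, now):
        local = self.caches[node]
        if len(local) >= self.capacity:
            spare = next((n for n, c in self.caches.items()
                          if len(c) < self.capacity), None)
            if spare is None:
                # Cluster is full: drop the least recently used block anywhere.
                spare, victim = min(
                    ((n, b) for n, c in self.caches.items() for b in c),
                    key=lambda nb: self.caches[nb[0]][nb[1]],
                )
                del self.caches[spare][victim]
            if spare != node:
                # Forward this node's oldest block to the peer that has room.
                moved = min(local, key=local.get)
                self.caches[spare][moved] = local.pop(moved)
        local[block_id] = now

cache = CooperativeCache(["A", "B"], blocks_per_node=2)
for block in (1, 2, 3):
    cache.read("A", block)   # third read pushes block 1 into B's memory
print(cache.read("A", 1))    # served from a peer rather than from disk
```

The point of the sketch is the lookup order (local memory, then peer memory, then disk) and the fact that the eviction victim is chosen across the whole cluster rather than per node.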
Results and Contributions:
The basic CFS, consisting of cooperative caching and virtual directories, has been implemented. Using CFS, we have also built a
user-level distributed NFS server that we are currently benchmarking using the SPEC NFS benchmark and traces from the file servers in the department. The
testbed consists of a cluster of 8 PCs (300 MHz) connected by a VIA-based interconnect (Giganet). The next step is to design and evaluate file
replication and migration policies and mechanisms. CFS will also be used as a platform to study other distributed systems issues, such as the impact of
communication on performance, local vs. global cache replacement policies, and metadata consistency models.
Poster 2
Encryption Servers: A Cost-Effective Versatile Solution for Internet Security
Vivek Pathak and Liviu Iftode
Rutgers University
The explosive growth of the Internet has given rise to a number of security concerns, which have become more important with the increasing use of
electronic commerce. Because the Internet and its user community are distributed in nature, any large-scale Internet security solution has to tackle
a number of issues, including scalability, the avoidance of single points of failure, and economic viability. This motivates the
creation of the Encryption Server, a scalable and cost-effective means of encrypting and authenticating Internet traffic.
The Encryption Server provides IP-level security: it encrypts and authenticates forwarded IP packets. Encryption
Servers function as secure gateways to the Internet and support a destination-based security policy. Variable key lifetimes are
used to provide a constant level of security and performance over a wide range of operating loads. Fresh
public keys are authenticated by a vote among the peers, which avoids the use of a single trusted third party. An automatic key exchange protocol
has been developed for authenticated propagation of public keys and secure destinations. It uses lazy key update and optimistic trust to eliminate
overhead in the common operating case. A partial implementation on the Linux operating system has been completed, and preliminary experiments have
shown promising results.
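Authenticating a fresh public key by a vote among peers, rather than through a single trusted third party, might look roughly like the following Python sketch. The abstract does not specify the voting rule, so the strict-majority threshold, the SHA-256 fingerprint, and the names `fingerprint` and `accept_by_vote` are assumptions introduced here for illustration only.

```python
import hashlib
from collections import Counter

def fingerprint(public_key: bytes) -> str:
    """Short identifier for a key; SHA-256 is an assumed choice."""
    return hashlib.sha256(public_key).hexdigest()

def accept_by_vote(candidate_key: bytes, peer_reports, quorum=0.5):
    """Accept the key only if enough polled peers vouch for it.

    peer_reports holds one entry per polled peer: the fingerprint that peer
    associates with the key's owner, or None if the peer has no opinion.
    Abstentions count against the candidate because the threshold is taken
    over all polled peers (an assumed, conservative rule).
    """
    if not peer_reports:
        return False
    votes = Counter(r for r in peer_reports if r is not None)
    return votes[fingerprint(candidate_key)] > quorum * len(peer_reports)

genuine = b"example public key bytes"
forged = b"some other key bytes"
reports = [fingerprint(genuine), fingerprint(genuine), fingerprint(forged)]

print(accept_by_vote(genuine, reports))  # two of three peers agree
print(accept_by_vote(forged, reports))   # only one of three peers agrees
```

With this rule, no single peer can force acceptance of a key on its own, which is the property the voting scheme is meant to provide.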
We intend to explore the performance tradeoffs of the variable key lifetime approach. Issues such as caching of cryptographic processing and
interrupt-free forwarding of IP packets have to be studied. The performance tradeoffs of various encryption methods and key exchange
mechanisms have to be evaluated experimentally. The scalability limits imposed by hardware have to be addressed by creating
encryption clusters.
Poster 3
Cooperative Caching Middleware for Clusters
Matias Cuenca and Thu D. Nguyen
Rutgers University
To harness the power of commodity clusters, Internet
servers sometimes take advantage of the aggregate memory of the cluster nodes by using cooperative caching (CC) mechanisms. Such
mechanisms can reduce the number of accesses to disk, and as a result reduce the service latency and improve overall server throughput. Most of
the past work on CC has been done at the application level, i.e., caching was tailored to a specific server's needs. For example, some WWW servers
have assumed Zipf request distributions, have used (variable-sized) files as the caching unit, and have applied local rather than global replacement
policies. The problem with this approach is that the caching and coherence infrastructure has to be rebuilt for each new server. In
contrast, we believe that cooperative caching should be implemented as a middleware layer that can be reused by a variety of servers. To verify
this claim, we are developing a simulator that can test multiple CC policies and mechanisms. At first, we will focus on a distributed file
system and will compare the performance of hand-optimized CC against that of our general CC infrastructure. With our simulator and file system, we
plan to study the tradeoff between the performance and the generality of the middleware we propose.
Poster 4
A Window Into Your Computing Environment
Christopher Peery and Thu D. Nguyen
Rutgers University
With the advent of Virtual Computing, it has become possible
for users to move the execution of applications off their local machines and onto a better computing base that is usually located
at a remote location. To the users of this form of Virtual Computing, the application should appear as though it were still
running on the local machine. The application could be anything from a simple program to a graphical environment or
even an entire operating system.
Currently, this model has been implemented for simple applications. For example, AT&T has developed an application known as Virtual Network Computing (VNC).
Using this application, users are able to access a simplified X-server that is running on a remote machine. Since the X-server is
running remotely, all the state of the server is kept remotely as well. Thus, users are able to utilize a graphical interface that can
potentially (depending on the reliability of the remote machine) survive a crash of the local host. In addition, a given user can
access the same remotely running X-server from any number of different machines and is therefore not bound to working at a given computer.
In addition to virtual computing, the computing world has seen huge
growth in the development of PDAs, such as the Compaq IPAQ. The goal of this project is to combine the versatility and portability of
hand-held devices with the reliability and availability of virtual computing to allow individuals to access an already existing computing
environment.
This project will attempt to connect the IPAQ to an X server
that is running remotely on a reliable computing base. The IPAQ will use a wireless medium and the VNC application to interface with
this X server. In essence, the IPAQ will be turned into a mobile network terminal with graphical capabilities. This will allow users to
access their computing environments from remote locations using these PDAs.
The purpose of this setup is not to provide the IPAQ with the
complete functionality of a full desktop machine. Instead, it is to allow individuals to access their computing environment for information or to
make small alterations to it. The emphasis is not on functionality but on convenience and look-up.
The current focus of this project is on the implementation
details. These include adapting the VNC application to run efficiently on the IPAQ and determining how best to give the user of
this application an overview of his computing environment.
Poster 5
Continuously Available and Scalable Sorted List Data Structure
Kiran Nagaraja, Richard Martin, and Thu Nguyen
Rutgers University
The emergence of the Internet as the global, ubiquitous networking
infrastructure is driving a new class of highly scalable Internet services. As these services become a part of our everyday life, users will
demand ultra-reliability. In this work, we argue that the complexity of replication, fault tolerance, and consistency necessary to achieve this
reliability could and should be hidden behind data structure abstractions such as in-memory sorted lists, spatial trees, and graphs.
We believe that the data structure should be flexible, resilient to structural imbalances, and allow high
concurrency in order to provide these services efficiently. We are investigating a
distributed sorted list structure sorted on the value field of a 2-tuple (keyid, value); potential applications include multi-item auctions and web
ranking/indexing services.
The data structure is organized in a two-level hierarchy: a
globally replicated 'Splitter' that partitions the value range among participating nodes, and a set of B-trees that hold the data local to each
node. Our list implementation allows for high concurrency and ease of management. We are currently investigating data rebalancing among the B-trees
within a node as well as across nodes.
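The two-level organization can be sketched in a few lines of Python. This is a single-process illustration under simplifying assumptions, not the authors' implementation: plain sorted Python lists stand in for the per-node B-trees, there is one list per node, and the class names, boundary values, and node names are hypothetical.

```python
import bisect

class Splitter:
    """Maps a value to the node that owns its range (replicated on all nodes)."""

    def __init__(self, boundaries, nodes):
        # boundaries[i] is the smallest value owned by nodes[i + 1].
        if len(nodes) != len(boundaries) + 1:
            raise ValueError("need exactly one more node than boundaries")
        self.boundaries = list(boundaries)
        self.nodes = list(nodes)

    def owner(self, value):
        return self.nodes[bisect.bisect_right(self.boundaries, value)]

class DistributedSortedList:
    """List of (keyid, value) tuples kept sorted on the value field."""

    def __init__(self, splitter):
        self.splitter = splitter
        # One sorted list per node stands in for that node's B-trees.
        self.local = {n: [] for n in splitter.nodes}

    def insert(self, keyid, value):
        node = self.splitter.owner(value)
        bisect.insort(self.local[node], (value, keyid))

    def items(self):
        # Nodes own consecutive ranges, so visiting them in order
        # yields the whole list in sorted order.
        for node in self.splitter.nodes:
            for value, keyid in self.local[node]:
                yield keyid, value

bids = DistributedSortedList(Splitter([100, 200], ["n0", "n1", "n2"]))
for keyid, value in [("a", 150), ("b", 20), ("c", 250), ("d", 100)]:
    bids.insert(keyid, value)
print(list(bids.items()))
```

Rebalancing across nodes would, in this model, amount to moving a boundary in the Splitter and shipping the affected entries to the neighboring node.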
Other posters will be included as soon as we get the abstracts from the authors.