Paper 3
Effect of Interrupts in Multiprocessor Servers
Aniruddha Bohra and Liviu Iftode
Rutgers University
Problem and Motivation
Interrupts are a major performance bottleneck in computer systems with very high Input/Output (I/O) activity. With the advent of applications such as webservers, which are not expected to carry
out computationally intensive tasks, the cost of interrupts has become a more pronounced bottleneck. Highly optimized architectures
that rely heavily on pipelining make interrupts costlier than they were before. The problem is compounded on multiprocessor servers,
where servicing interrupts at a very high rate incurs considerable overhead from cache and Translation Lookaside Buffer (TLB) pollution, saving of the system state, and
synchronization. Denial of Service attacks demonstrate that simply
bombarding a system with a very large number of packets at a very high rate can prevent applications from making
progress (livelock). This raises the question of whether the current interrupt-driven mechanism for handling
network events is the best one, or even a correct one.
The present study addresses the interrupt structure on multiprocessors for highly I/O-bound applications such as
the webserver. We investigate the effect of interrupts on the performance of a webserver running on a multiprocessor system under a heavy I/O
workload. We also aim to evaluate alternative mechanisms that have been suggested for asynchronous event processing, and to propose a mechanism that reduces the impact of interrupts on
system performance and eliminates livelock in such systems.
Background and Related Work
Several studies of webservers and their interaction with the underlying operating system identify interrupt processing as the
performance bottleneck. Hu et al. report that the Apache webserver spends around 25%-40% of its execution time handling interrupts.
The same study identifies synchronization and signal handling as severe performance penalties, and predicts that the effect
would be more pronounced on a multiprocessor.
Many researchers have looked at the problem of processing incoming messages synchronously. In particular, the problem of polling versus
interrupts with active messages has received much attention. Several systems use a mixture of polling and interrupts. Some systems use
interrupts only for operating system or protocol-specific messages and poll otherwise. Langendoen et al. use polling whenever the processor
is idle and use interrupts otherwise. Soft Timers extend that idea and artificially interrupt execution at a certain interval to prevent
starvation of the polling thread for the network interface. Smith and Traw initiate polling using clocked interrupts; that is, polling
of the device is initiated by a timer interrupt. Another very popular idea is to interrupt only when we cannot poll fast enough. This has been
implemented in hardware, in the Polling Watchdog, and in software by Mogul et al., where the interrupt is used only to wake up the polling
thread, which then services all outstanding requests and re-enables interrupts.
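The interrupt-triggered polling scheme described above can be sketched as a small simulation. This is only an illustrative model under stated assumptions, not the implementation from any of the cited systems: the `SimulatedNic` class, its fields, and the `run_burst` helper are hypothetical names introduced here. The interrupt handler does minimal work (it masks further interrupts and wakes the poller), and the polling routine drains every outstanding packet before re-enabling interrupts.

```python
from collections import deque


class SimulatedNic:
    """Hypothetical network interface: a receive queue plus an interrupt-enable flag."""

    def __init__(self):
        self.rx_queue = deque()
        self.interrupts_enabled = True
        self.interrupts_raised = 0
        self.poll_requested = False

    def packet_arrives(self, packet):
        self.rx_queue.append(packet)
        # An interrupt is raised only while interrupts are enabled.
        if self.interrupts_enabled:
            self.interrupts_raised += 1
            self.interrupt_handler()

    def interrupt_handler(self):
        # Minimal work: mask further interrupts and wake the polling thread.
        self.interrupts_enabled = False
        self.poll_requested = True


def polling_thread(nic, handle):
    """Service all outstanding packets, then re-enable interrupts."""
    if not nic.poll_requested:
        return 0
    processed = 0
    while nic.rx_queue:
        handle(nic.rx_queue.popleft())
        processed += 1
    nic.poll_requested = False
    nic.interrupts_enabled = True
    return processed


def run_burst(n_packets):
    """Deliver a burst of packets before the poller gets to run."""
    nic = SimulatedNic()
    seen = []
    for i in range(n_packets):
        nic.packet_arrives(i)
    processed = polling_thread(nic, seen.append)
    return nic.interrupts_raised, processed, seen
```

In this model a burst that arrives before the poller runs costs a single interrupt regardless of its size, which is the property that makes the scheme attractive under overload.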
However, none of the above studies investigates the impact of interrupts, or of its own event handling mechanism, on multiprocessor systems. They also
overlook the fact that, for a very important class of applications, computation is not the main purpose of the system. With the advent of the World Wide Web
and the growing importance of applications that are not CPU bound, the effect of interrupt processing needs to be re-evaluated and the
above mechanisms revisited in this new light.
Approach and Uniqueness
Traditional approaches to eliminating livelock typically combine polling and interrupts, with interrupts acting
as triggers for polling. The rationale is that for polling alone to be as effective as interrupts, we would need to poll tens of thousands
of times per second, which even at modern processor speeds would leave few cycles for other processing. Our approach is a logical extension
of the ideas presented in the above studies to multiprocessor environments.
Polling the event source is a way to eliminate livelock. For good performance, we need to poll the source as fast as possible. If we have more than one processor, we can dedicate one processor to polling
the devices and let the other processors carry on with the usual processing work. This ensures that we poll as often as the dedicated processor allows. By letting that processor always poll the event source, we also expect to eliminate the cost of invoking the interrupt
handler with a context switch and to remove the overhead of repeatedly crossing the kernel boundary. We expect this approach to eliminate
livelock without adversely affecting the performance of the system.
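A minimal user-level sketch of the dedicated-poller arrangement follows, assuming a thread can stand in for the dedicated processor. It is not the authors' kernel implementation: `device_ring`, `dedicated_poller`, `worker`, and `run` are hypothetical names, the doubling step is a placeholder for real request processing, and no attempt is made to pin threads to CPUs. One thread spins on the simulated device ring and hands events to the remaining worker threads, so no interrupt handler is involved.

```python
import queue
import threading
from collections import deque


def dedicated_poller(device_ring, work_queue, stop):
    """Busy-poll the simulated device ring; stands in for the dedicated processor."""
    while not stop.is_set() or device_ring:
        try:
            event = device_ring.popleft()  # deque.popleft is thread-safe
        except IndexError:
            continue  # nothing pending: poll again immediately
        work_queue.put(event)


def worker(work_queue, results, lock):
    """Ordinary processing work, done on the other processors."""
    while True:
        event = work_queue.get()
        if event is None:  # sentinel: shut down
            break
        with lock:
            results.append(event * 2)  # placeholder for real request handling


def run(n_events, n_workers=2):
    device_ring = deque()
    work_queue = queue.Queue()
    stop = threading.Event()
    results, lock = [], threading.Lock()

    poller = threading.Thread(
        target=dedicated_poller, args=(device_ring, work_queue, stop)
    )
    workers = [
        threading.Thread(target=worker, args=(work_queue, results, lock))
        for _ in range(n_workers)
    ]
    poller.start()
    for w in workers:
        w.start()

    for i in range(n_events):  # simulated packet arrivals
        device_ring.append(i)

    stop.set()      # the poller drains what is left, then exits
    poller.join()
    for _ in workers:
        work_queue.put(None)
    for w in workers:
        w.join()
    return sorted(results)
```

The busy loop deliberately never blocks, mirroring the idea that the dedicated processor polls as often as it can; the cost is that this thread consumes a full processor even when no events arrive.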
Dedicating a processor to polling the devices does give up a potential processing element. However, based on the observations of
several studies that modern workloads are highly I/O bound and current processors are fast, we expect no loss in the processing
ability of the system as a whole.
Results and Contributions
We have implemented mechanisms to handle network events using some of the above approaches. We are currently working to provide a
comprehensive comparison of all approaches to interrupt handling on multiprocessor systems. We have carried out some performance
measurements on the Apache webserver using the above approaches as the underlying event handling mechanisms.
The preliminary results are promising, and we expect further measurements to strengthen our claim. We also wish to investigate the impact of a dedicated I/O processor and its possible use for certain
performance-enhancing tasks such as prefetching. Studying the effect of having multiple I/O processors in a large multiprocessor
system is another avenue that we plan to explore in the future.