Title: Robustness to Abnormal Events in Networked Systems
Abstract:
More than twenty years of building high-performance protocols and applications with novel functionality has led to an increased reliance on the Internet and networked systems in general, with exponential growth in users and data transferred.
The problem, though, is that there are faults and failures everywhere, ranging from the infrequent yet crippling distributed denial-of-service attacks on web-servers to the niggling yet frequent performance problems in edge networks (e.g., enterprise and university networks).
Recovering from such faults is difficult because networked systems are complex. Functionality is often distributed across multiple servers that interact in complex ways. Moreover, the fundamental value of a system is often tied to its being openly accessible, which means there is little control over requests and traffic. Today, these faults are punted to users or administrators, who are ill-equipped to cope with them.
In this talk, I will present two approaches that make robustness an integral part of networked systems.
When faults are due to uncontrollable external factors such as a denial-of-service attack, I will show Kill-Bots, a web-server protection mechanism that reacts quickly, apportions server resources so that legitimate users continue to be served during the attack, and eventually mitigates the attack by detecting the attackers. Kill-Bots is the first system to address a novel kind of DDoS attack, which we call CyberSlam, and has led to significant follow-on work.
When faults are due to the complicated interactions between the network and various servers within an enterprise, I will show Sherlock, a mechanism that learns the underlying functional dependencies between these components, encodes the dependencies in a probabilistic inference graph, and then uses the graph to quickly identify the causes of performance problems. We deployed Sherlock in a portion of the Microsoft Enterprise Network and demonstrated its practical use.
Bio:
Srikanth Kandula is a PhD candidate at the MIT Computer Science and Artificial Intelligence Laboratory in the Networks and Mobile Systems group. His research interests are in designing, building, and analyzing networked systems and protocols that are easy to manage and robust to abnormal events like attacks and failures. He is the recipient of a best student paper award at USENIX Networked Systems Design and Implementation, NSDI-II (2005), and the Siebel Fellowship (2002). He received a B.Tech. from the Indian Institute of Technology (Kanpur) in 2001 and an M.S. from the University of Illinois at Urbana-Champaign in 2003, both in Computer Science. His research advisor is Prof. Dina Katabi.
http://nms.lcs.mit.edu/~kandula/