Title: Understanding Host Availability in Distributed Systems
Abstract:
As distributed systems become more decentralized, fluctuating host availability becomes increasingly disruptive. Older systems such as AFS used a small set of well-maintained, highly available machines to coordinate access to client state; server uptime (and thus service availability) was expected to be high. Newer systems like the Google search engine scale to larger client populations by increasing the number of servers. In these systems, the responsibility for maintaining the service abstraction is spread across thousands of machines. Lacking constant attention from human operators, these machines may suffer non-trivial downtime due to software misconfiguration or hardware failure. Server downtime is even more pronounced in cooperative or peer-to-peer systems. In these environments, each client is also a server that must respond to requests from its peers. Since hosts can opt in or out of the system at any time, a significant fraction of the servers may be unavailable at any given moment.
In this talk, I provide a thorough investigation of availability dynamics in several distributed systems. I show that host availability does not fluctuate randomly; instead, it exhibits regularity that often arises from the role-driven behavior of users. This regularity can be predicted, and I show how to exploit these predictions to improve performance in various types of distributed applications. For example, I describe how to reduce the network utilization of a cooperative storage system: by biasing data towards hosts that are likely to remain online, we can dramatically reduce the bandwidth needed to regenerate data when replica sites go offline. I also describe several other applications of availability introspection.
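As a rough illustration of the placement idea above, the following minimal sketch ranks candidate hosts by a predicted remaining uptime and stores replicas on the top-ranked ones. It is not the system presented in the talk: the `Host` type, the `predicted_uptime_hours` field, the `choose_replica_sites` function, and the sample values are all hypothetical, and a real predictor would derive its estimates from observed availability histories.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    # Hypothetical output of an availability predictor: how many more
    # hours this host is expected to stay online.
    predicted_uptime_hours: float


def choose_replica_sites(hosts, replicas):
    """Return the names of the hosts predicted to stay online longest."""
    ranked = sorted(hosts, key=lambda h: h.predicted_uptime_hours, reverse=True)
    return [h.name for h in ranked[:replicas]]


# Made-up example values, for illustration only.
hosts = [
    Host("laptop", 2.0),
    Host("lab-server", 400.0),
    Host("desktop", 30.0),
    Host("kiosk", 9.0),
]
sites = choose_replica_sites(hosts, 2)
print(sites)
```

Placing replicas on hosts expected to stay online means fewer replicas are lost per unit time, so less data has to be copied across the network to restore the target replication level.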
Bio:
James Mickens is a Ph.D. candidate in the Department of Electrical Engineering and Computer Science at the University of Michigan. His primary research areas are networking and software systems. In particular, he is interested in devising introspective systems that can reason about the behavior of their constituent components.
http://www.eecs.umich.edu/~jmickens/