|go to week of Apr 28, 2013||28||29||30||1||2||3||4|
|go to week of May 5, 2013||5||6||7||8||9||10||11|
|go to week of May 12, 2013||12||13||14||15||16||17||18|
|go to week of May 19, 2013||19||20||21||22||23||24||25|
|go to week of May 26, 2013||26||27||28||29||30||31||1|
Statistical learning theory is concerned with making accurate predictions on the basis of past observations. One of the main characteristics of any learning problem is its sample complexity: the minimum number of observations needed to ensure a given prediction accuracy at a given confidence level. For the most part, the focus has been on passive learning, in which the learning agent receives independent training samples. However, recently there has been increasing interest in active learning, in which past observations are used to control the process of gathering future observations. The main question is whether active learning is strictly more powerful than its passive counterpart. One way to answer this is to compare the sample complexities of passive and active learning for the same accuracy and confidence.
In this talk, based on joint work with Sasha Rakhlin (Department of Statistics, University of Pennsylvania), I will present a new unified approach to deriving tight lower bounds on the sample complexity of both passive and active learning in the setting of binary classification. This approach is fundamentally rooted in information theory, in particular, the simple but powerful data processing inequality for the f-divergence. I will give a high-level overview of the proof technique and discuss the connections between active learning and hypothesis testing with feedback.