Optimization is one of the foundations of machine learning. It plays a prominent role in training statistical models, which are typically nonlinear, of very large scale, and involve large data sets. First-order stochastic methods have been advocated as the most appropriate algorithms for problems of this type. We argue, however, that in many applications it is advantageous to employ second-order information as an integral part of the iteration, and propose two algorithms designed to exploit the stochastic and nonlinear nature of the problems. The talk concludes with a discussion of the highly non-convex problems arising in deep belief networks, and the opportunities for parallelism.
Jorge Nocedal is a Professor in the Industrial Engineering Department at Northwestern University. His research interests are in optimization algorithms and their application in areas such as machine learning. Much of his current research is being driven by a close collaboration with Google Research. Jorge is passionate about undergraduate education; he was one of the developers of the "Engineering First" curriculum at Northwestern that exposes students to engineering design in their freshman year. He is currently the Editor-in-Chief of the SIAM Journal on Optimization, is a SIAM Fellow, and was awarded the 2012 George B. Dantzig Prize.