Mark Schmidt, University of British Columbia
We first review classic algorithms and complexity results for stochastic methods for convex optimization, and then turn our attention to the wide variety of exponentially-convergent algorithms that have been developed in the last four years. Topics will include finite-time convergence rates of classic stochastic gradient methods, stochastic average/variance-reduced gradient methods, primal-dual methods, proximal operators, acceleration, alternating minimization, non-uniform sampling, and a discussion of parallelization and non-convex problems. Applications in the field of machine learning will be emphasized, but the principles covered in this course are applicable to many fields.
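As a flavour of the starting point for these topics (this sketch is not part of the course materials), below is a minimal NumPy illustration of the plain stochastic gradient update whose classic finite-time rates the course begins with; the function name, step size, and least-squares example are hypothetical choices for illustration, and the variance-reduced methods discussed later modify this update to obtain faster convergence.

```python
import numpy as np

def sgd_least_squares(A, b, step=0.01, iters=5000, seed=0):
    """Plain SGD on f(x) = (1/n) * sum_i (a_i^T x - b_i)^2.

    Illustrative sketch only: fixed step size and uniform sampling,
    as in the classic analysis reviewed at the start of the course.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(iters):
        i = rng.integers(n)                      # sample one example uniformly
        grad_i = 2.0 * (A[i] @ x - b[i]) * A[i]  # stochastic gradient of term i
        x -= step * grad_i                       # SGD update
    return x

# Hypothetical usage: recover a random linear model from noisy data
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5))
x_true = rng.standard_normal(5)
b = A @ x_true + 0.01 * rng.standard_normal(200)
print(sgd_least_squares(A, b))
```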
Material
Slides: L1, L2, L3, L4, L5a, L5b, L6, L7, L8
Videos: 1, 2, 3, 4, 5, 6, 7