11/29: Mahan Tajrobehkar – Perturbed Gradient Descent Adapted with Occupation Time (virtual)
November 29 @ 3:30 pm - 4:30 pm
Abstract: We further develop the idea of perturbed gradient descent (PGD) by adapting the perturbation to the history of the iterates via the notion of occupation time. The proposed algorithm, perturbed gradient descent adapted with occupation time (PGDOT), is shown to converge at least as fast as the PGD algorithm and is guaranteed to avoid getting stuck at saddle points. The analysis is corroborated by empirical studies, in which a mini-batch version of PGDOT outperforms alternatives such as mini-batch gradient descent, Adam, AMSGrad, and RMSProp in training multilayer perceptrons (MLPs). In particular, mini-batch PGDOT manages to escape saddle points where these alternatives fail.
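To make the idea concrete, here is a minimal sketch of one plausible reading of the abstract: plain gradient descent, plus a random perturbation whenever the gradient is nearly zero, with the perturbation radius growing in the occupation time (time already spent) in the current region of the state space. The discretization into grid cells, all constants, and the exact update rule are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def pgdot_sketch(grad, x0, lr=0.1, eps=1e-3, base_radius=0.1,
                 max_scale=5, n_steps=500, seed=0):
    """Hypothetical sketch of occupation-time-adapted perturbed GD.

    Takes gradient steps, and whenever the gradient is nearly zero
    adds a uniform random kick whose radius grows with the occupation
    time of the current (coarsely discretized) region of state space.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    occupation = {}  # visit counts per coarse grid cell = occupation time
    for _ in range(n_steps):
        cell = tuple(np.round(x, 1))  # coarse discretization of the state
        occupation[cell] = occupation.get(cell, 0) + 1
        g = grad(x)
        if np.linalg.norm(g) < eps:
            # kick radius scales with occupation time (capped for stability)
            r = base_radius * min(occupation[cell], max_scale)
            x = x + rng.uniform(-r, r, size=x.shape)
        else:
            x = x - lr * g
    return x

# f(x, y) = x**2 - y**2 + y**4 / 4 has a saddle at the origin and minima
# at (0, +/-sqrt(2)); plain GD started at (1, 0) converges to the saddle,
# while the perturbed iterate gets kicked out and escapes along y.
grad_f = lambda v: np.array([2 * v[0], -2 * v[1] + v[1] ** 3])
x_final = pgdot_sketch(grad_f, [1.0, 0.0])
```

On the toy saddle above, the iterate first slides to the origin, where repeated visits inflate the kick radius until a perturbation pushes it into one of the two basins of attraction.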
Short Bio: Mahan Tajrobehkar is a 4th-year PhD student in the Industrial Engineering and Operations Research department at UC Berkeley. His main research interests lie at the intersection of optimization, probability theory, and machine learning.
This will be a joint talk with Jeffrey Ichnowski. This talk will take place from approximately 4:05 to 4:30 p.m.