Shuffling Gradient Methods for Finite-sum Optimization
Speaker: Trần Huyền Trang (Cornell University)

Time: 9:30, Tuesday, January 3, 2023

Venue: Room 611 (612), Building A6, Institute of Mathematics

Abstract: The finite-sum optimization problem arises in most machine learning tasks, including logistic regression, multi-kernel learning, and some neural networks. Stochastic Gradient Descent (SGD) and its stochastic first-order variants have been widely used to solve this problem thanks to their scalability and efficiency on large-scale tasks. While the traditional version of SGD uses uniformly independent sampling, practical implementations often rely on without-replacement sampling schemes (also known as shuffling schemes). In this work, we propose Nesterov Accelerated Shuffling Gradient (NASG), a new algorithm for convex finite-sum minimization problems. Our method integrates classical Nesterov acceleration momentum with different shuffling sampling schemes. We show that our algorithm achieves an improved rate of O(1/T) under unified shuffling schemes, where T is the number of epochs. This rate is better than that of other shuffling gradient methods in the convex regime.
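
To make the announced idea concrete, here is a minimal Python sketch based only on the description above: each epoch visits the n component functions in a random (without-replacement) order, and a Nesterov-style momentum step is applied at the epoch level. The function name nasg_style_sketch, the per-step learning rate lr/n, and the momentum weight beta = (t-1)/(t+2) are illustrative assumptions, not the exact NASG update rules from the paper.

import numpy as np

def nasg_style_sketch(grad_i, w0, n, epochs, lr):
    """Illustrative sketch of a shuffling gradient method with epoch-level
    Nesterov-style momentum; the exact NASG step sizes and momentum weights
    should be taken from the authors' paper."""
    x_prev = np.asarray(w0, dtype=float).copy()  # iterate produced by the previous epoch
    y = x_prev.copy()                            # extrapolated point where an epoch starts
    for t in range(1, epochs + 1):
        z = y.copy()
        perm = np.random.permutation(n)          # shuffling: without-replacement sampling
        for i in perm:
            z = z - (lr / n) * grad_i(z, i)      # incremental gradient step on component f_i
        beta = (t - 1) / (t + 2)                 # illustrative momentum weight (assumption)
        y = z + beta * (z - x_prev)              # epoch-level Nesterov-style extrapolation
        x_prev = z
    return x_prev

For example, on a least-squares instance with data (a_i, b_i), grad_i(w, i) would return a_i * (a_i @ w - b_i), and the routine could be called as nasg_style_sketch(grad_i, w0=np.zeros(d), n=len(b), epochs=50, lr=0.1).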

(Co-authors/collaborators: Lam M. Nguyen, IBM Research; Quoc Tran-Dinh, University of North Carolina at Chapel Hill; Katya Scheinberg, Cornell University)
