1. Robbins H, Monro S. A stochastic approximation method. Annals of Mathematical Statistics, 1951, 22(3): 400-407
doi: 10.1214/aoms/1177729586
2. Polyak B T, Juditsky A B. Acceleration of stochastic approximation by averaging. SIAM Journal on Control and Optimization, 1992, 30(4): 838-855
doi: 10.1137/0330046
3. Bottou L, Bousquet O. The tradeoffs of large scale learning//Platt J C, Koller D, Singer Y, et al. Advances in Neural Information Processing Systems 20. Cambridge: MIT Press, 2008
4. Shalev-Shwartz S, Singer Y, Srebro N, Cotter A. Pegasos: primal estimated sub-gradient solver for SVM. Mathematical Programming, 2011, 127(1): 3-30
5. Nemirovski A, Juditsky A, Lan G, Shapiro A. Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 2009, 19(4): 1574-1609
doi: 10.1137/070704277
6. Lan G, Monteiro R D C. Iteration-complexity of first-order penalty methods for convex programming. Mathematical Programming, 2013, 138(1/2): 115-139
doi: 10.1007/s10107-012-0588-x
7. Ghadimi S, Lan G. Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Mathematical Programming, 2016, 156(1/2): 59-99
8. Bach F, Moulines E. Non-asymptotic analysis of stochastic approximation algorithms for machine learning//Advances in Neural Information Processing Systems 24. 2011: 451-459
9. Bach F, Moulines E. Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n)//Advances in Neural Information Processing Systems 26. 2013
10. Duchi J, Hazan E, Singer Y. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 2011, 12: 2121-2159
11. Gasnikov A V, Nesterov Y E, Spokoiny V G. On the efficiency of a randomized mirror descent algorithm in online optimization problems. Computational Mathematics and Mathematical Physics, 2015, 55: 580-596
doi: 10.1134/S0965542515040041
12. Beck A, Teboulle M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2009, 2(1): 183-202
doi: 10.1137/080716542
13. Tseng P, Yun S. Incrementally updated gradient methods for constrained and regularized optimization. Journal of Optimization Theory and Applications, 2014, 160(3): 832-853
doi: 10.1007/s10957-013-0409-2
14. Nesterov Y. Smooth minimization of non-smooth functions. Mathematical Programming, 2005, 103(1): 127-152
doi: 10.1007/s10107-004-0552-5
15. Nesterov Y. Subgradient methods for huge-scale optimization problems. Mathematical Programming, 2014, 146(1/2): 275-297
16. Lan G. An optimal method for stochastic composite optimization. Mathematical Programming, 2012, 133(1/2): 365-397
17. Ghadimi S, Lan G. Optimal stochastic approximation algorithms for strongly convex stochastic composite optimization, I: a generic algorithmic framework. SIAM Journal on Optimization, 2012, 22(4): 1469-1492
doi: 10.1137/110848864
18. Ghadimi S, Lan G. Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Mathematical Programming, 2016, 156(1/2): 59-99
19. Ghadimi S, Lan G, Zhang H C. Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization. Mathematical Programming, 2016, 155(1/2): 267-305
20. Lan G. Bundle-level type methods uniformly optimal for smooth and nonsmooth convex optimization. Mathematical Programming, 2015, 149(1/2): 1-45