[1] Vapnik V, Chervonenkis A. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications, 1971, 16: 264--280
[2] Wu Q, Ying Y M, Zhou D X. Learning rates of least-square regularized regression. Found Comput Math, 2006, 6: 171--192
[3] Cucker F, Smale S. On the mathematical foundations of learning. Bull Amer Math Soc, 2001, 39: 1--49
[4] Caponnetto A, De Vito E. Optimal rates for the regularized least-squares algorithm. Found Comput Math, 2007, 7: 331--368
[5] Cybenko G. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals, and Systems, 1989, 2: 303--314
[6] Esposito A, Marinaro M, Oricchio D, Scarpetta S. Approximation of continuous and discontinuous mappings by a growing neural RBF-based algorithm. Neural Networks, 2000, 13: 651--665
[7] Smale S, Zhou D X. Estimating the approximation error in learning theory. Anal Appl, 2003, 1: 17--41
[8] Smale S, Zhou D X. Shannon sampling and function reconstruction from point values. Bull Amer Math Soc, 2004, 41: 279--305
[9] Tong H Z, Chen D R, Li Z P. Learning rates for regularized classifiers using multivariate polynomial kernel. J Complexity, 2008, 24: 619--631
[10] Wu Q, Ying Y M, Zhou D X. Learning theory: from regression to classification. Studies in Computational Mathematics, 2006, 12: 257--290
[11] Zhou D X, Jetter K. Approximation with polynomial kernels and SVM classifiers. Adv Comput Math, 2006, 25: 323--344
[12] Aronszajn N. Theory of reproducing kernels. Trans Amer Math Soc, 1950, 68: 337--404
[13] Xiang D H, Zhou D X. Classification with Gaussians and convex loss. J Mach Learning Res, 2009, 10: 1447--1468
[14] Steinwart I. Support vector machines are universally consistent. J Complexity, 2002, 18: 768--791
[15] Ye G B, Zhou D X. Learning and approximation by Gaussians on Riemannian manifolds. Adv Comput Math, 2008, 29: 291--310
[16] Maiorov V. Approximation by neural networks and learning theory. J Complexity, 2006, 22: 102--117
[17] Wu Q, Zhou D X. Support vector machine classifiers: linear programming versus quadratic programming. Neural Computation, 2005, 17: 1160--1187
[18] Xie T F, Cao F L. The rate of approximation of Gaussian radial basis neural networks in continuous function space. Acta Mathematica Sinica, 2013, 29: 295--302
[19] Cucker F, Zhou D X. Learning Theory: An Approximation Theory Viewpoint. Cambridge: Cambridge University Press, 2007
[20] Bartlett P L. The sample complexity of pattern classification with neural networks: The size of the weights is more important than the size of the network. IEEE Trans Inform Theory, 1998, 44: 525--536
[21] Zhou D X. The covering number in learning theory. J Complexity, 2002, 18: 739--767
[22] Zhou D X. Capacity of reproducing kernel spaces in learning theory. IEEE Trans Inform Theory, 2003, 49: 1734--1752
[23] Williamson R C, Smola A J, Sch\"{o}lkopf B. Generalization performance of regularization networks and support vector machines via entropy numbers of compact operators. IEEE Trans Inform Theory, 2001, 47: 2516--2532
[24] Pontil M. A note on different covering numbers in learning theory. J Complexity, 2003, 19: 665--671