Yuan Cao
Title
Cited by
Year
Gradient descent optimizes over-parameterized deep ReLU networks
D Zou, Y Cao, D Zhou, Q Gu
Machine Learning 109 (3), 467-492, 2020
348 2020
Generalization bounds of stochastic gradient descent for wide and deep neural networks
Y Cao, Q Gu
Advances in Neural Information Processing Systems 32, 10836-10846, 2019
142 2019
Generalization error bounds of gradient descent for learning over-parameterized deep relu networks
Y Cao, Q Gu
Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 3349-3356, 2020
113* 2020
Closing the generalization gap of adaptive gradient methods in training deep neural networks
J Chen, D Zhou, Y Tang, Z Yang, Y Cao, Q Gu
arXiv preprint arXiv:1806.06763, 2018
86 2018
On the convergence of adaptive gradient methods for nonconvex optimization
D Zhou, J Chen, Y Cao, Y Tang, Z Yang, Q Gu
arXiv preprint arXiv:1808.05671, 2018
80 2018
How much over-parameterization is sufficient to learn deep relu networks?
Z Chen, Y Cao, D Zou, Q Gu
arXiv preprint arXiv:1911.12360, 2019
48 2019
Towards understanding the spectral bias of deep learning
Y Cao, Z Fang, Y Wu, DX Zhou, Q Gu
arXiv preprint arXiv:1912.01198, 2019
41 2019
Local and global inference for high dimensional nonparanormal graphical models
Q Gu, Y Cao, Y Ning, H Liu
arXiv preprint arXiv:1502.02347, 2015
30* 2015
A generalized neural tangent kernel analysis for two-layer neural networks
Z Chen, Y Cao, Q Gu, T Zhang
arXiv preprint arXiv:2002.04026, 2020
26* 2020
Agnostic learning of a single neuron with gradient descent
S Frei, Y Cao, Q Gu
arXiv preprint arXiv:2005.14426, 2020
17 2020
Algorithm-dependent generalization bounds for overparameterized deep residual networks
S Frei, Y Cao, Q Gu
arXiv preprint arXiv:1910.02934, 2019
14 2019
Tight sample complexity of learning one-hidden-layer convolutional neural networks
Y Cao, Q Gu
arXiv preprint arXiv:1911.05059, 2019
13 2019
The edge density barrier: Computational-statistical tradeoffs in combinatorial inference
H Lu, Y Cao, Z Yang, J Lu, H Liu, Z Wang
International Conference on Machine Learning, 3247-3256, 2018
7 2018
High-temperature structure detection in ferromagnets
Y Cao, M Neykov, H Liu
arXiv preprint arXiv:1809.08204, 2018
6 2018
Risk bounds for over-parameterized maximum margin classification on sub-gaussian mixtures
Y Cao, Q Gu, M Belkin
arXiv preprint arXiv:2104.13628, 2021
5 2021
Agnostic learning of halfspaces with gradient descent via soft margins
S Frei, Y Cao, Q Gu
International Conference on Machine Learning, 3417-3426, 2021
4 2021
Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise
S Frei, Y Cao, Q Gu
arXiv preprint arXiv:2101.01152, 2021
2 2021
Accelerated factored gradient descent for low-rank matrix factorization
D Zhou, Y Cao, Q Gu
International Conference on Artificial Intelligence and Statistics, 4430-4440, 2020
2 2020
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
D Zou, Y Cao, Y Li, Q Gu
arXiv preprint arXiv:2108.11371, 2021
2021
Structure Detection in High Dimensional Graphical Models
Y Cao
Princeton, NJ: Princeton University, 2018
2018