Yuan Cao
Assistant Professor
Division of AI & Data Science
School of Computing & Data Science
The University of Hong Kong
Office: Rm 118, Run Run Shaw Building
Phone: (852) 3917-8315
Email: yuancao@hku.hk
I am an Assistant Professor in the School of Computing & Data Science at the University of Hong Kong. Before joining HKU, I was a postdoctoral researcher in the Department of Computer Science at the University of California, Los Angeles, working with Prof. Quanquan Gu. I obtained my Ph.D. from the Program in Applied and Computational Mathematics at Princeton University, where I worked with Prof. Han Liu and Prof. Weinan E.
I am looking for highly motivated Ph.D. students to work with me on research problems in machine learning and data science. Please drop me an email with your CV if you are interested in joining my group.
Research Interests
My research interests include:
- Machine learning
- Learning theory
- High-dimensional data analysis
- Optimization
Publications and Preprints
(* indicates equal contribution)
- Towards Understanding Generalization in DP-GD: A Case Study in Training Two-Layer CNNs
Zhongjie Shi, Puyu Wang, Chenyang Zhang and Yuan Cao, in Proc. of the 40th AAAI Conference on Artificial Intelligence (AAAI), 2026.
- Towards Understanding Transformers in Learning Random Walks
Wei Shi and Yuan Cao, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 38, 2025.
- Understanding the Generalization of Stochastic Gradient Adam in Learning Neural Networks
Xuan Tang, Han Zhang, Yuan Cao and Difan Zou, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 38, 2025.
- On the Robustness of Transformers against Context Hijacking for Linear Classification
Tianle Li*, Chenyang Zhang*, Xingwu Chen, Yuan Cao and Difan Zou, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 38, 2025.
- Estimation of Out-of-Sample Sharpe Ratio for High Dimensional Portfolio Optimization
Xuran Meng, Yuan Cao and Weichen Wang, Journal of the American Statistical Association (JASA), 2025.
- Transformer Learns Optimal Variable Selection in Group-Sparse Classification
Chenyang Zhang, Xuran Meng and Yuan Cao, in Proc. of the 13th International Conference on Learning Representations (ICLR), 2025.
- On the Feature Learning in Diffusion Models
Andi Han, Wei Huang, Yuan Cao and Difan Zou, in Proc. of the 13th International Conference on Learning Representations (ICLR), 2025.
- On the Power of Multitask Representation Learning with Gradient Descent
Qiaobo Li, Zixiang Chen, Yihe Deng, Yiwen Kou, Yuan Cao and Quanquan Gu, in Proc. of the 28th International Conference on Artificial Intelligence and Statistics (AISTATS), 2025.
- Quantifying the Optimization and Generalization Advantages of Graph Neural Networks Over Multilayer Perceptrons
Wei Huang, Yuan Cao, Haonan Wang, Xin Cao and Taiji Suzuki, in Proc. of the 28th International Conference on Artificial Intelligence and Statistics (AISTATS), 2025.
- Transformers and their roles as time series foundation models
Dennis Wu*, Yihan He*, Yuan Cao*, Jianqing Fan and Han Liu, arXiv:2502.03383, 2025.
- Transformers Simulate MLE for Sequence Generation in Bayesian Networks
Yuan Cao*, Yihan He*, Dennis Wu, Hong-Yu Chen, Jianqing Fan and Han Liu, arXiv:2501.02547, 2025.
- Learning Spectral Methods by Transformers
Yihan He*, Yuan Cao*, Hong-Yu Chen, Dennis Wu, Jianqing Fan and Han Liu, arXiv:2501.01312, 2025.
- The Implicit Bias of Adam on Separable Data
Chenyang Zhang, Difan Zou and Yuan Cao, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 37, 2024.
- Attention Boosted Individualized Regression
Guang Yang, Yuan Cao and Long Feng, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 37, 2024.
- Global Convergence in Training Large-Scale Transformers
Cheng Gao*, Yuan Cao*, Zihao Li, Yihan He, Mengdi Wang, Han Liu, Jason Klusowski and Jianqing Fan, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 37, 2024.
- One-Layer Transformer Provably Learns One-Nearest Neighbor In Context
Zihao Li*, Yuan Cao*, Cheng Gao, Yihan He, Han Liu, Jason Klusowski, Jianqing Fan and Mengdi Wang, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 37, 2024.
- On the Comparison between Multi-modal and Single-modal Contrastive Learning
Wei Huang*, Andi Han*, Yongqiang Chen, Yuan Cao, Zhiqiang Xu and Taiji Suzuki, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 37, 2024.
- Per-Example Gradient Regularization Improves Learning Signals from Noisy Data
Xuran Meng, Yuan Cao and Difan Zou, Machine Learning Journal (MLJ), 2024.
- Towards Simple and Provable Parameter-Free Adaptive Gradient Methods
Yuanzhe Tao, Huizhuo Yuan, Xun Zhou, Yuan Cao and Quanquan Gu, arXiv:2412.19444, 2024.
- Initialization Matters: On the Benign Overfitting of Two-Layer ReLU CNN with Fully Trainable Layers
Shuning Shang, Xuran Meng, Yuan Cao and Difan Zou, arXiv:2410.19139, 2024.
- Understanding the Benefits of SimCLR Pre-Training in Two-Layer Convolutional Neural Network
Han Zhang and Yuan Cao, arXiv:2409.18685, 2024.
- Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data
Xuran Meng, Difan Zou and Yuan Cao, in Proc. of the 41st International Conference on Machine Learning (ICML), 2024.
- On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
Dongruo Zhou*, Jinghui Chen*, Yuan Cao*, Ziyan Yang and Quanquan Gu, Transactions on Machine Learning Research (TMLR), 2024.
- Multiple Descent in the Multiple Random Feature Model
Xuran Meng, Jianfeng Yao and Yuan Cao, Journal of Machine Learning Research (JMLR), 2024.
- Can Overfitted Deep Neural Networks in Adversarial Training Generalize?--An Approximation Viewpoint
Zhongjie Shi, Fanghui Liu, Yuan Cao and Johan A.K. Suykens, arXiv:2401.13624, 2024.
- The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks
Yuan Cao, Difan Zou, Yuanzhi Li and Quanquan Gu, in Proc. of the 36th Annual Conference on Learning Theory (COLT), 2023.
- The Benefits of Mixup for Feature Learning
Difan Zou, Yuan Cao, Yuanzhi Li and Quanquan Gu, in Proc. of the 40th International Conference on Machine Learning (ICML), 2023.
- Benign Overfitting in Adversarially Robust Linear Classification
Jinghui Chen*, Yuan Cao* and Quanquan Gu, in Proc. of the 39th International Conference on Uncertainty in Artificial Intelligence (UAI), 2023.
- Graph over-parameterization: Why the graph helps the training of deep graph convolutional network
Yucong Lin, Silu Li, Jiaxing Xu, Jiawei Xu, Dong Huang, Wendi Zheng, Yuan Cao and Junwei Lu, Neurocomputing, 2023.
- Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
Difan Zou, Yuan Cao, Yuanzhi Li and Quanquan Gu, in Proc. of the 11th International Conference on Learning Representations (ICLR), 2023.
- How Does Semi-supervised Learning with Pseudo-labelers Work? A Case Study
Yiwen Kou, Zixiang Chen, Yuan Cao and Quanquan Gu, in Proc. of the 11th International Conference on Learning Representations (ICLR), 2023.
- Understanding Train-Validation Split in Meta-Learning with Neural Networks
Xinzhe Zuo, Zixiang Chen, Huaxiu Yao, Yuan Cao and Quanquan Gu, in Proc. of the 11th International Conference on Learning Representations (ICLR), 2023.
- Benign Overfitting in Two-layer Convolutional Neural Networks
Yuan Cao*, Zixiang Chen*, Mikhail Belkin and Quanquan Gu, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 35, 2022. (Oral presentation)
- Online Machine Learning Modeling and Predictive Control of Nonlinear Systems With Scheduled Mode Transitions
Cheng Hu, Yuan Cao and Zhe Wu, AIChE Journal, in press.
- Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures
Yuan Cao, Quanquan Gu and Mikhail Belkin, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 34, 2021.
- Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise
Spencer Frei, Yuan Cao and Quanquan Gu, in Proc. of the 38th International Conference on Machine Learning (ICML), 2021.
- Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins
Spencer Frei, Yuan Cao and Quanquan Gu, in Proc. of the 38th International Conference on Machine Learning (ICML), 2021. (Long talk)
- Towards Understanding the Spectral Bias of Deep Learning
Yuan Cao*, Zhiying Fang*, Yue Wu*, Ding-Xuan Zhou and Quanquan Gu, in Proc. of the 30th International Joint Conference on Artificial Intelligence (IJCAI), 2021.
- How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?
Zixiang Chen*, Yuan Cao*, Difan Zou* and Quanquan Gu, in Proc. of the 9th International Conference on Learning Representations (ICLR), 2021.
- High Temperature Structure Detection in Ferromagnets
Yuan Cao, Matey Neykov and Han Liu, Information and Inference: A Journal of the IMA, 2020.
- Agnostic Learning of a Single Neuron with Gradient Descent
Spencer Frei, Yuan Cao and Quanquan Gu, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 33, 2020.
- A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks
Zixiang Chen, Yuan Cao, Quanquan Gu and Tong Zhang, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 33, 2020.
- Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks
Jinghui Chen, Dongruo Zhou, Yiqi Tang, Ziyan Yang, Yuan Cao and Quanquan Gu, in Proc. of the 29th International Joint Conference on Artificial Intelligence (IJCAI), Yokohama, Japan, 2020.
- Accelerated Factored Gradient Descent for Low-Rank Matrix Factorization
Dongruo Zhou, Yuan Cao and Quanquan Gu, in Proc. of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), Palermo, Sicily, Italy, 2020.
- Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks [slides][poster]
Yuan Cao and Quanquan Gu, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 32, 2019. (Spotlight presentation)
- Tight Sample Complexity of Learning One-hidden-layer Convolutional Neural Networks [poster]
Yuan Cao and Quanquan Gu, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 32, 2019.
- Algorithm-dependent generalization bounds for overparameterized deep residual networks
Spencer Frei, Yuan Cao and Quanquan Gu, in Proc. of Advances in Neural Information Processing Systems (NeurIPS) 32, 2019.
- Generalization Error Bounds of Gradient Descent for Learning Over-parameterized Deep ReLU Networks
Yuan Cao and Quanquan Gu, in Proc. of the 34th AAAI Conference on Artificial Intelligence (AAAI), 2020.
- Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
Difan Zou*, Yuan Cao*, Dongruo Zhou and Quanquan Gu, Machine Learning Journal (MLJ), 2019.
- The Edge Density Barrier: Computational-Statistical Tradeoffs in Combinatorial Inference
Hao Lu, Yuan Cao, Zhuoran Yang, Junwei Lu, Han Liu and Zhaoran Wang, in Proc. of the 35th International Conference on Machine Learning (ICML), 2018.
- Local and Global Inference for High Dimensional Nonparanormal Graphical Models
Quanquan Gu, Yuan Cao, Yang Ning and Han Liu, arXiv:1502.02347, 2015.