Research
Research Interests
- High-dimensional statistics and learning
- Optimization and deep learning theory
Publications (*: equal contribution)
- Structured Preconditioners in Adaptive Optimization: A Unified Analysis. In International Conference on Machine Learning (ICML), 2025.
- Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model. In International Conference on Learning Representations (ICLR), 2025 (Oral). Presented at NeurIPS 2024 Workshop on Mathematics of Modern Machine Learning.
- How well can Transformers emulate in-context Newton’s method? In International Conference on Artificial Intelligence and Statistics (AISTATS), 2025. Presented at ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning.
- Training dynamics of multi-head softmax attention for in-context learning: emergence, convergence, and optimality. In Conference on Learning Theory (COLT), 2024. Presented at ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning.
- Noise-adaptive Thompson sampling for linear contextual bandits. In Advances in Neural Information Processing Systems (NeurIPS), 2023.
- Finding regularized competitive equilibria of heterogeneous agent macroeconomic models via reinforcement learning. In International Conference on Artificial Intelligence and Statistics (AISTATS), 2022.
- Fast mixing of stochastic gradient descent with normalization and weight decay. In Advances in Neural Information Processing Systems (NeurIPS), 2022.
- Implicit bias of gradient descent on reparametrized models: On equivalence to mirror descent. In Advances in Neural Information Processing Systems (NeurIPS), 2022. Abridged version accepted for a contributed talk at the ICML 2022 Workshop on Continuous Time Methods for Machine Learning.
- North American biliary stricture management strategies in children after liver transplantation: a multicenter analysis from the Society of Pediatric Liver Transplantation (SPLIT) registry. Liver Transplantation, 2022.
- Continuous and discrete-time accelerated stochastic mirror descent for strongly convex functions. In International Conference on Machine Learning (ICML), 2018.
- Accelerated stochastic mirror descent: From continuous-time dynamics to discrete-time algorithms. In International Conference on Artificial Intelligence and Statistics (AISTATS), 2018.
Preprints (*: equal contribution)
- Implicit regularization of gradient flow on one-layer softmax attention. arXiv:2403.08699, 2024. Presented at ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning.