Research
Research Interests
- High-dimensional statistics and learning
- Deep learning theory
- Foundations of artificial intelligence
Publications (*: equal contribution)
- Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers. In Advances in Neural Information Processing Systems (NeurIPS), 2024. Presented at the ICML 2024 Workshop on Theoretical Foundations of Foundation Models.
- Approximate Message Passing for orthogonally invariant ensembles: Multivariate non-linearities and spectral initialization. Information and Inference: A Journal of the IMA, 2024.
- Training dynamics of multi-head softmax attention for in-context learning: emergence, convergence, and optimality. In Conference on Learning Theory (COLT), 2024. Presented at the ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning.
- Noise-adaptive Thompson sampling for linear contextual bandits. In Advances in Neural Information Processing Systems (NeurIPS), 2023.
- Cooperative multi-agent reinforcement learning: asynchronous communication and linear function approximation. In International Conference on Machine Learning (ICML), 2023.
- Finding regularized competitive equilibria of heterogeneous agent macroeconomic models via reinforcement learning. In International Conference on Artificial Intelligence and Statistics (AISTATS), 2022.
- Fast mixing of stochastic gradient descent with normalization and weight decay. In Advances in Neural Information Processing Systems (NeurIPS), 2022.
- Learn to match with no regret: Reinforcement learning in Markov matching markets. In Advances in Neural Information Processing Systems (NeurIPS), 2022 (Oral).
- A simple and provably efficient algorithm for asynchronous federated contextual linear bandits. In Advances in Neural Information Processing Systems (NeurIPS), 2022.
- Implicit bias of gradient descent on reparametrized models: On equivalence to mirror descent. In Advances in Neural Information Processing Systems (NeurIPS), 2022. Abridged version accepted for a contributed talk at the ICML 2022 Workshop on Continuous Time Methods for Machine Learning.
- North American biliary stricture management strategies in children after liver transplantation: a multicenter analysis from the Society of Pediatric Liver Transplantation (SPLIT) registry. Liver Transplantation, 2022.
- Continuous and discrete-time accelerated stochastic mirror descent for strongly convex functions. In International Conference on Machine Learning (ICML), 2018.
- Accelerated stochastic mirror descent: From continuous-time dynamics to discrete-time algorithms. In International Conference on Artificial Intelligence and Statistics (AISTATS), 2018.
Preprints (*: equal contribution)
- Implicit regularization of gradient flow on one-layer softmax attention. arXiv:2403.08699, 2024. Presented at the ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning.
- How well can Transformers emulate in-context Newton’s method? arXiv:2403.03183, 2024. Presented at the ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning.