Research
Research Interests
- High-dimensional statistics and learning
- Deep learning theory
- Foundations of artificial intelligence
Publications (*: equal contribution)
- Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers. In Advances in Neural Information Processing Systems (NeurIPS), 2024. Presented at the ICML 2024 Workshop on Theoretical Foundations of Foundation Models.
- Approximate Message Passing for orthogonally invariant ensembles: Multivariate non-linearities and spectral initialization. Information and Inference: A Journal of the IMA, 2024.
- Training dynamics of multi-head softmax attention for in-context learning: emergence, convergence, and optimality. In Conference on Learning Theory (COLT), 2024. Presented at the ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning.
- Noise-adaptive Thompson sampling for linear contextual bandits. In Advances in Neural Information Processing Systems (NeurIPS), 2023.
- Cooperative multi-agent reinforcement learning: asynchronous communication and linear function approximation. In International Conference on Machine Learning (ICML), 2023.
- Finding regularized competitive equilibria of heterogeneous agent macroeconomic models via reinforcement learning. In International Conference on Artificial Intelligence and Statistics (AISTATS), 2022.
- Fast mixing of stochastic gradient descent with normalization and weight decay. In Advances in Neural Information Processing Systems (NeurIPS), 2022.
- Learn to match with no regret: Reinforcement learning in Markov matching markets. In Advances in Neural Information Processing Systems (NeurIPS), 2022 (Oral).
- A simple and provably efficient algorithm for asynchronous federated contextual linear bandits. In Advances in Neural Information Processing Systems (NeurIPS), 2022.
- Implicit bias of gradient descent on reparametrized models: On equivalence to mirror descent. In Advances in Neural Information Processing Systems (NeurIPS), 2022. Abridged version accepted for a contributed talk at the ICML 2022 Workshop on Continuous Time Methods for Machine Learning.
- North American biliary stricture management strategies in children after liver transplantation: a multicenter analysis from the Society of Pediatric Liver Transplantation (SPLIT) registry. Liver Transplantation, 2022.
- Continuous and discrete-time accelerated stochastic mirror descent for strongly convex functions. In International Conference on Machine Learning (ICML), 2018.
- Accelerated stochastic mirror descent: From continuous-time dynamics to discrete-time algorithms. In International Conference on Artificial Intelligence and Statistics (AISTATS), 2018.
Preprints (*: equal contribution)
- Implicit regularization of gradient flow on one-layer softmax attention. arXiv:2403.08699, 2024. Presented at the ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning.
- How well can Transformers emulate in-context Newton’s method? arXiv:2403.03183, 2024. Presented at the ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning.