News

Sep 01, 2025 New preprint (Fantastic Pretraining Optimizers and Where to Find Them) on arXiv!
May 01, 2025 WSD-S is used in training Marin 8B, the best open-source 8B model.
Jan 20, 2025 3 papers (River Valley Landscape, RNNs are not Transformers (Yet), Optimization Analysis on Chain-of-Thought) accepted at ICLR 2025!
Jan 20, 2025 New preprint (Global Load Balancing Helps Expert Specialization) on arXiv!
Dec 01, 2024 Residual Permutation Test accepted at AoS!
Oct 01, 2024 New preprints River Valley Landscape and Optimization Analysis on Chain-of-Thought on arXiv!
Sep 01, 2024 Started my Ph.D. at Stanford University! I am currently rotating with Percy Liang.
Jul 01, 2024 Graduated from Tsinghua University with a Bachelor’s degree in Computer Science.
May 01, 2024 Received and accepted the offer from Stanford University! I am honored to receive the Stanford Graduate Fellowship.
Feb 01, 2024 New preprint RNNs are not Transformers (Yet) on arXiv!
Oct 01, 2023 Awarded the National Scholarship (top 0.2%)!
Sep 20, 2023 2 papers ([Sharpness&Generalization](https://arxiv.org/abs/2307.11007), (Un)interpretability of Transformers) accepted at NeurIPS 2023! Sharpness&Generalization was accepted as an oral.
Sep 01, 2023 Received the silver medal of the Yao Award (top 4 in Yao’s pilot class)!
Aug 01, 2023 Returned to China for my senior year at Tsinghua.
Jul 01, 2023 Visited Hawaii for ICML 2023! Always great to see old friends.
Jun 20, 2023 Residual Permutation Test received a Major Revision from AoS.
Jun 01, 2023 Visiting Tengyu Ma at Stanford!
May 01, 2023 Visited Rwanda for ICLR 2023!
Mar 20, 2023 Reviewing for ICML for the first time!
Mar 01, 2023 New preprint Solving LPN with Neural Networks on arXiv!
Feb 01, 2023 Visiting Andrej Risteski at CMU!
Jan 20, 2023 2 papers (Understanding SAM, Not Benign Overfitting) accepted at ICLR 2023!
Dec 20, 2022 New preprint Residual Permutation Test on arXiv!
Dec 01, 2022 New preprint Understanding SAM on arXiv!
Oct 01, 2022 1 paper (Skill Neurons) accepted at EMNLP 2022.
Jun 01, 2022 New preprint Not Benign Overfitting on arXiv!