About Me

Hello! I am Kaiyue Wen. I am a first-year Phd student at Stanford University. I am currently rotating with Tengyu Ma while working together with Percy Liang. I graduated from Tsinghua University, where I was a member of Yao’s pilot class. Here are my CV and Publications. During my undergraduate study, I am fortunate to be advised by Tengyu Ma, Zhiyuan Liu, Andrej Risteski, Jingzhao Zhang, and Zhiyuan Li.

My research interest spreads broadly in deep learning, including theory and applications. My long-term goal is to understand the physics behind deep learning and I believe a combination of theoretical analysis and empirical study is essential for this goal.

Recent News

Jan, 2025 New preprint (Global Load Balancing Helps Expert Specialization) on arxiv!
Jan, 2025 3 papers (River Valley Landscape, RNNs are not Transformers (Yet), Optimization Analysis on Chain-of-Thought) accepted at ICLR 2025!
Dec, 2024 Residual Permutation Test is accepted at AoS!
Oct, 2024 New preprints River Valley Landscape and Optimization Analysis on Chain-of-Thought on arxiv!
Sep, 2024 Start my Ph.D. study at Stanford University! I am currently rotating with Percy Liang.
Jul, 2024 Graduated from Tsinghua University with a Bachelor’s degree in Computer Science.
May, 2024 Receive and accept the offer from Stanford University! I am honored to receive the Stanford Graduate Fellowship.
Feb, 2024 New preprint RNNs are not Transformers (Yet) on arxiv!
Oct, 2023 Awarded the National Scholarship (top 0.2%)!
Sep, 2023 2 papers (Sharpness&Generalization, (Un)interpretability of Transformers) accepted at Neurips 2023! Sharpness&Generalization is received as oral.
Sep, 2023 Receive the silver medal for Yao Award (Top 4 in Yao’s pilot class)!
Aug, 2023 Return to China for my senior year in Tsinghua.
Jul, 2023 Visit Hawaii for ICML 2023! Always great to see old friends.
Jun, 2023 Residual Permutation Test receive Major Revision from AoS.
Jun, 2023 Visiting Tengyu Ma at Stanford!
May, 2023 Visit Rwanda for ICLR 2023!
Mar, 2023 Reviewing ICML for the first time!
Mar, 2023 New preprint Solving LPN with Neural Networks on arxiv!
Feb, 2023 Visiting Andrej Risteski at CMU!
Jan, 2023 2 papers (Understanding SAM, Not Benign Overfitting) accepted at ICLR 2023!
Dec, 2022 New preprint Residual Permutation Test on arxiv!
Dec, 2022 New preprint Understanding SAM on arxiv!
Oct, 2022 1 paper (Skill Neurons) accepted at EMNLP 2022.
Jun, 2022 New preprint Not Benign Overfitting on arxiv!

One More Thing

I keep a firm faith in analytical thinking, hard work, and consistent self-improvement. Any advice or feedback is welcome. You can use this Anonymous Form or discuss with me in person.