Empowering the future driven by curiosity.
I currently work at the Alibaba Qwen Team (since Jan. 2026). Previously, I was a researcher at Ant Group (Jul. 2024–Jan. 2026), working with Wei Wu. I obtained my bachelor's (2015–2019) and Ph.D. (2019–2024) degrees from Tsinghua University, advised by Minlie Huang. I have also interned at several institutions, including the Allen Institute for Artificial Intelligence in Seattle, supervised by Hao Peng and Jesse Dodge (Jan.–Jun. 2023), and the University of Virginia, supervised by Hongning Wang (Jul.–Sep. 2018).
We are actively recruiting research interns to push the boundaries of visual intelligence in Qwen. If you are passionate about advancing multimodal capabilities and building real-world agents, we'd love to hear from you!
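Chinese Information Processing Society of China (CIPS) Doctoral Dissertation Incentive Program (Top 10 in China)
Tsinghua University Outstanding Doctoral Dissertation
Microsoft Research Asia Fellowship Nomination Award (Top 33 in Asia)
National Scholarship
Outstanding Graduate of Beijing
Excellent Graduate of Tsinghua University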
Qwen-VLA is a unified vision-language-action model built on Qwen3.5-4B and a 1.15B DiT action decoder, enabling manipulation, navigation, and trajectory prediction across diverse robot embodiments through a shared action prediction framework.
* indicates equal contribution; † indicates corresponding author(s)
All publications sorted by publication year (newest first)
Advancing reasoning capabilities in large (vision-)language models for complex problem solving
Natural language generation for open-ended tasks and comprehensive evaluation methodologies
Must-read paper list
Developing efficient architectures and training methods for large-scale foundation models