Jian Guan 关健

Empowering the future driven by curiosity.

Beijing, China

Alibaba Qwen Team

I have been with the Alibaba Qwen Team since Jan. 2026. Previously, I was a researcher at Ant Group (Jul. 2024–Jan. 2026), working with Wei Wu. I obtained my bachelor's (2015–2019) and Ph.D. (2019–2024) degrees from Tsinghua University, where I was advised by Minlie Huang. I have also interned at several institutions, including the Allen Institute for Artificial Intelligence in Seattle (Jan.–Jun. 2023), supervised by Hao Peng and Jesse Dodge, and the University of Virginia (Jul.–Sep. 2018), supervised by Hongning Wang.

🚀

Seeking Self-Motivated Interns

We are actively recruiting research interns to push the boundaries of visual intelligence in Qwen. If you are passionate about advancing multimodal capabilities and building real-world agents, we'd love to hear from you!

🏅 Awards

2024

CIPS Doctoral Dissertation Incentive Program

中国中文信息学会博士学位论文激励计划 (Top 10 in China)

2024

Tsinghua University Excellent Doctoral Dissertation

清华大学优秀博士论文

2023

ACL (CCF-A) Area Chair Award

StoryTrans: Non-Parallel Story Author-Style Transfer with Discourse Representations and Content Enhancing

2022

Microsoft Research Asia Fellowship Nomination Award

微软学者提名奖 (Top 33 in Asia)

2022

National Scholarship for Doctoral Students

国家奖学金

2019

Excellent Graduate of Beijing

北京市优秀毕业生

2019

Outstanding Graduate of Tsinghua University

清华大学优良毕业生

News

May 2026

New

Qwen-VLA: A Unified Vision-Language-Action Model

Qwen-VLA is a unified vision-language-action model built on Qwen3.5-4B and a 1.15B DiT action decoder, enabling manipulation, navigation, and trajectory prediction across diverse robot embodiments through a shared action prediction framework.

Tech Report | Blog | Demo

Research & Publications

* indicates equal contribution; † indicates corresponding author(s)

📚

All Publications

All publications sorted by publication year (newest first)

🧠

Advanced Reasoning

Advancing reasoning capabilities in large (vision-)language models for complex problem solving

👤

Personalized Alignment

Aligning AI systems with individual human preferences and values

Survey

📝

Natural Language Generation

Natural language generation for open-ended tasks and comprehensive evaluation methodologies

Must-read paper list

⚡

Efficient Foundation Models

Developing efficient architectures and training methods for large-scale foundation models