I am currently with the Meituan LongCat Team, working on unified multimodal models.
My recent work focuses on multimodal representation learning and unified understanding-generation systems.
Publications
* indicates equal contribution.

Achieving Cross Modal Generalization with Multimodal Unified Representation
Yan Xia, Hai Huang, Jieming Zhu, Zhou Zhao

Bridging Domain Generalization to Multimodal Domain Generalization via Unified Representations
Hai Huang, Yan Xia, Sashuai Zhou, Hanting Wang, Shulei Wang, Zhou Zhao

Open-set Cross Modal Generalization via Multimodal Unified Representation
Hai Huang, Yan Xia, Shulei Wang, Hanting Wang, Minghui Fang, Shengpeng Ji, Sashuai Zhou, Tao Jin, Zhou Zhao

Enhancing Multimodal Unified Representations for Cross Modal Generalization
Hai Huang, Yan Xia, Shengpeng Ji, Shulei Wang, Hanting Wang, Minghui Fang, Jieming Zhu, Zhenhua Dong, Sashuai Zhou, Zhou Zhao

Semantic Residual for Multimodal Unified Discrete Representation
Hai Huang, Shulei Wang, Yan Xia

Overcoming both Domain Shift and Label Shift for Referring Video Segmentation
Hai Huang, Sashuai Zhou, Yan Xia
Honors
- 2025.10 National Scholarship
- 2021.10 First Prize, RoboMaster Robotics Competition
Education
- 2023.09 - 2026.03, M.S. in Artificial Intelligence, Zhejiang University
- 2019.09 - 2023.06, B.S. in Computer Science and Technology, Northeastern University, China
Experience
- 2025.01 - 2025.07, Huawei, pretrained a 6B-parameter image generation model based on DiT (Diffusion Transformer)