Xiaoxuan He   何晓轩

Ph.D. student in Computer Science

College of Computer Science and Technology, Zhejiang University
📍 Hangzhou

Email: xiaoxuanhe@zju.edu.cn

Biography

I am a Ph.D. student in Computer Science at the College of Computer Science and Technology, Zhejiang University, advised by Prof. Bohan Zhuang and Prof. Haoji Hu (2021-2024). My research interests lie in image/video generation and reinforcement learning. During my Ph.D. studies, I have had a wonderful time interning at Kuaishou MMU, Microsoft, Taobao and Tmall Group, Tencent WeChat, and JD Explore Academy.

News

Selected Publications [Google Scholar] (* Equal contribution)

  • Flash-GRPO: Efficient Alignment for Video Diffusion via One-Step Policy Optimization
    Xiaoxuan He, Siming Fu, Zeyue Xue, Weijie Wang, Ruizhe He, Yuming Li, Dacheng Yin, Shuai Dong, Haoyang Huang, Hongfa Wang, Nan Duan, Bohan Zhuang.
    International Conference on Machine Learning (ICML), 2026.
  • World-R1: Reinforcing 3D Constraints for Text-to-Video Generation
    Weijie Wang*, Xiaoxuan He*, Youping Gu*, Yifan Yang, Zeyu Zhang, Yefei He, Yanbo Ding, Xirui Hu, Donny Y. Chen, Zhiyuan He, Yuqing Yang, Bohan Zhuang.
    International Conference on Machine Learning (ICML), 2026.
  • WeMMU: Enhanced Bridging of Vision-Language Models and Diffusion Models via Noisy Query Tokens
    Jian Yang, Dacheng Yin, Xiaoxuan He, Yong Li, Fengyun Rao, Jing Lyu, Wei Zhai, Yang Cao, Zheng-Jun Zha.
    The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026.
  • TempFlow-GRPO: When Timing Matters for GRPO in Flow Models
    Xiaoxuan He, Siming Fu, Yuke Zhao, Wanli Li, Jian Yang, Dacheng Yin, Fengyun Rao, Bo Zhang.
    International Conference on Learning Representations (ICLR), 2026.
  • SAIL: Self-Amplified Iterative Learning for Diffusion Model Alignment with Minimal Human Feedback
    Xiaoxuan He, Siming Fu, Wanli Li, Zhiyuan Li, Dacheng Yin, Kang Rong, Fengyun Rao, Bo Zhang.
    International Conference on Learning Representations (ICLR), 2026.
  • R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization
    Yi Yang*, Xiaoxuan He*, Hongkun Pan*, Xiyan Jiang, Yan Deng, Xingtao Yang, Haoyu Lu, Dacheng Yin, Fengyun Rao, Minfeng Zhu, Bo Zhang, Wei Chen.
    International Conference on Computer Vision (ICCV), 2025.
  • Unified Medical Image Pre-training in Language-Guided Common Semantic Space
    Xiaoxuan He, Yifan Yang, Xinyang Jiang, Xufang Luo, Haoji Hu, Siyun Zhao, Dongsheng Li, Yuqing Yang, Lili Qiu.
    European Conference on Computer Vision (ECCV), 2024.
  • Robustness-guided image synthesis for data-free quantization
    Jianhong Bai, Yuchen Yang, Huanpeng Chu, Hualiang Wang, Zuozhu Liu, Ruizhe Chen, Xiaoxuan He, Lianrui Mu, Chengfei Cai, Haoji Hu.
    AAAI Conference on Artificial Intelligence (AAAI), 2024.
  • Uniformly Distributed Category Prototype-Guided Vision-Language Framework for Long-Tail Recognition
    Xiaoxuan He, Siming Fu, Xinpeng Ding, Yuchen Cao, Hualiang Wang.
    ACM International Conference on Multimedia (ACM MM), 2023.
  • Hierarchical Self-Supervised Learning for 3D Tooth Segmentation in Intra-Oral Mesh Scans
    Zuozhu Liu*, Xiaoxuan He*, Hualiang Wang, Huimin Xiong, Yan Zhang, Gaoang Wang, Jin Hao, Yang Feng, Fudong Zhu, Haoji Hu.
    IEEE Transactions on Medical Imaging (TMI).
  • Towards calibrated hyper-sphere representation via distribution overlap coefficient for long-tailed learning
    Hualiang Wang, Siming Fu, Xiaoxuan He, Hangxiang Fang, Zuozhu Liu, Haoji Hu.
    European Conference on Computer Vision (ECCV Oral), 2022.
  • Unsupervised Pre-training Improves Tooth Segmentation in 3-Dimensional Intraoral Mesh Scans
    Xiaoxuan He, Hualiang Wang, Haoji Hu, Jianfei Yang, Yang Feng, Gaoang Wang, Zuozhu Liu.
    International Conference on Medical Imaging with Deep Learning (MIDL), 2022.