Xian Liu

Research Scientist
NVIDIA Research
Santa Clara, CA

E-mail  /  CV  /  Google Scholar  /  Github  /  Twitter  /  LinkedIn

Biography

I am a research scientist at NVIDIA Research, Deep Imagination Research Group, working on foundation generative models, with a special focus on GenAI pre-/mid-/post-training, visual tokenizers, and their applications in digital humans and embodied AI.

I am a core contributor to the NVIDIA Cosmos series, an ensemble of open-source world foundation models including visual tokenizers, image/video foundation models, vision-language models, and their post-trained variants.

I obtained my Ph.D. degree at CUHK MMLab, supervised by Prof. Dahua Lin, Prof. Ziwei Liu, and Prof. Xihui Liu. Before that, I received my Bachelor's degree at Zhejiang University in 2021, advised by Prof. Xiaowei Zhou. During my research journey, I was glad to take on multiple industrial internships at NVIDIA Research, Snap Research, Tencent AI Lab, SenseTime Research, and Shanghai AI Lab.

I am always open to discussions and collaborations. Feel free to drop me an email if you are interested :)

News

  • [07/2025] One paper is accepted to ICCV 2025 as an oral presentation.
  • [06/2025] Passed my Ph.D. defense and officially became Dr. Liu!
  • [06/2025] We release Cosmos-Predict2, a world foundation model with improved quality. Models open-sourced on Github and HF.
  • [03/2025] We release Cosmos-Transfer1, a world model with multi-modal controllability. Models open-sourced on Github and HF.
  • [02/2025] Two papers are accepted to CVPR 2025.
  • [01/2025] Cosmos won the Best of CES, Best of AI, and Best Overall Awards in CNET 2025!
  • [01/2025] We release Cosmos, a world foundation model platform for Physical AI. Models open-sourced on Github and HF.

Industrial Research

Cosmos World Foundation Model Platform for Physical AI
NVIDIA Research: Xian Liu (Core Contributor).
Contributions: Auto-Regressive Foundation Model Pre-Training & Post-Training. (CES'25 Best of AI, Best Overall)
Cosmos Tokenizer: A Suite of Image and Video Neural Tokenizers
NVIDIA Research: Xian Liu (Core Contributor).
Contributions: Continuous/Discrete Image/Video Tokenizers.
Cosmos-Transfer1: World Generation with Adaptive Multimodal Control
NVIDIA Research: Xian Liu (Core Contributor).
Contributions: Adaptive Multi-Modal Control, Data Processing Pipelines, Open-Source Repo.
Cosmos-Predict2: World Foundation Model Platform for Physical AI
NVIDIA Research: Xian Liu (Core Contributor).
Contributions: Data Processing Pipelines, Captioning, Long Video Generation, Transfer Post-training.

Selected Publications [ Full List ] (* indicates equal contribution)

DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior
International Conference on Computer Vision (ICCV), 2025. (Oral)
HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. (Highlight, Top 2.8%)
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
International Conference on Learning Representations (ICLR), 2024. (Review Score 6, 6, 8, 10, Top 1.6%, Rank)
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation
European Conference on Computer Vision (ECCV), 2022. (Oral, Top 2.7%)
Audio-Driven Co-Speech Gesture Video Generation
Advances in Neural Information Processing Systems (NeurIPS), 2022. (Spotlight, Top 5%)
Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
Also appears at CVPR 2022 Sight and Sound Workshop. [5-min Invited Talk]
Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
Xian Liu*, Lingting Zhu*, Xuanyu Liu, Rui Qian, Ziwei Liu, Lequan Yu.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Visual Sound Localization in the Wild by Cross-Modal Interference Erasing
AAAI Conference on Artificial Intelligence (AAAI), 2022.
HMAR: Efficient Hierarchical Masked AutoRegressive Image Generation
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025.
TC4D: Trajectory-Conditioned Text-to-4D Generation
European Conference on Computer Vision (ECCV), 2024.
3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation
International Conference on Learning Representations (ICLR), 2025.
High-Quality Joint Image and Video Tokenization with Causal VAE
International Conference on Learning Representations (ICLR), 2025.
BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion
European Conference on Computer Vision (ECCV), 2024.
TextCraftor: Your Text Encoder Can be Image Quality Controller
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

Experiences

Research Scientist.
Jun. 2024 - Now
NVIDIA Research, Deep Imagination Research Group.
Manager: Ming-Yu Liu.
Generative AI Research Intern, Deep Imagination Research, NVIDIA Research.
Mar. 2024 - Jun. 2024
Topic: Image/Video Foundation Models, Tokenizers, Multi-Modal Language Models.
Research Visiting Student, Toronto Computational Imaging Group.
Dec. 2023 - Mar. 2024
Topic: Text-to-4D Generation.
Research Intern, Tencent AI Laboratory.
Sept. 2023 - Dec. 2023
Topic: Text-Driven 3D Human Generation.
Supervised by: Xiaohang Zhan, Ying Shan.
Research Intern, Creative Vision Group, Snap Research.
May 2023 - Sept. 2023
Topic: Human Generation Foundation Model.
Research Intern, Digital Content Group, Shanghai AI Laboratory.
Jul. 2021 - Feb. 2022
Topic: Digital Human, Gesture Generation.
Supervised by: Hang Zhou, Wayne Wu.
Research Intern, Intelligent Video Group, SenseTime Research.
Aug. 2020 - Jun. 2021
Topic: Digital Human, Face Animation.
Supervised by: Qianyi Wu, Bo Dai.

Invited Talks

Professional Services

  • Conference Program Committee / Reviewer: CVPR, ECCV, ICCV, WACV, SIGGRAPH, SIGGRAPH Asia, NeurIPS, ICML, ICLR, AISTATS, AAAI, ACM MM.
  • Journal Reviewer: TPAMI, IJCV, TVCG, EG, CGF, PG.

Selected Honors & Awards

  • CNET 2025 Best of CES, Best of AI, and Best Overall.
    2025
  • ECCV Outstanding Reviewer Award.
    2024
  • CVPR Travel Award.
    2024
  • ICLR Travel Award.
    2024
  • National Scholarship.
    2019, 2020
  • Hong Kong Ph.D. Fellowship Scheme (HKPFS).
    2021 - 2025
  • Outstanding Graduate of Zhejiang Province.
    2021
  • Outstanding Bachelor Thesis Award of Zhejiang University, Top 1%.
    2021
  • UCLA CSST Scholarship Program.
    2020
  • SenseTime Scholarship.
    2020
  • Tang Lixin Scholarship.
    2019
  • First Class Scholarship for Academic Excellence.
    2019, 2020

Teaching Experience

  • ENGG 1120, Linear Algebra for Engineers.
    Spring 2022.
  • ENGG 2440, Discrete Mathematics for Engineers.
    Fall 2021.