Xian Liu

Ph.D. Candidate
Department of Information Engineering
The Chinese University of Hong Kong

E-mail  /  CV  /  Google Scholar  /  GitHub  /  Twitter  /  LinkedIn

Biography

Xian Liu is a final-year Ph.D. candidate at the CUHK Multi-Media Lab (MMLab). He is supervised by Prof. Dahua Lin and Prof. Ziwei Liu, and works closely with Prof. Xihui Liu. His research interests include computer vision and generative models, especially foundation GenAI (image/video/3D/4D generation), large-scale vision-language models, multi-modal tokenizers, and their applications in digital humans. Before that, he received his Bachelor's degree from Zhejiang University in 2021, advised by Prof. Xiaowei Zhou.

He has been fortunate to gain extensive industrial experience during his Ph.D. studies through internships at several leading research institutes, including NVIDIA Research, Snap Research, Tencent AI Lab, SenseTime Research, and Shanghai AI Lab.

I am always open to discussions and collaborations. Feel free to drop me an email if you are interested :)

News

  • [11/2024] We release Cosmos-Tokenizer, a suite of SOTA image/video tokenizers, with models available on GitHub and Hugging Face!
  • [09/2024] Honored to receive the ECCV 2024 Outstanding Reviewer Award. Many thanks for the recognition!
  • [07/2024] Two papers are accepted to ECCV 2024.
  • [05/2024] One paper is accepted to ICML 2024.
  • [03/2024] Started my internship at NVIDIA Research. See you in Santa Clara!
  • [03/2024] Two papers are accepted to CVPR 2024, with HumanGaussian accepted as Highlight (Top 2.8%). See you in Seattle!
  • [01/2024] One paper is accepted to ICLR 2024, with HyperHuman receiving review scores of 6, 6, 8, 10 (Top 1.6%, Rank).

Industrial Research

Cosmos Tokenizer: A suite of image and video neural tokenizers
NVIDIA Research, DIR Team: Xian Liu (Core Contributor).

Selected Preprints

EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation
Preprint. Under Review.
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Yao Teng, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui Liu.
Preprint. Under Review.

Selected Publications [ Full List ] (* indicates equal contribution)

HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. (Highlight, Top 2.8%)
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
International Conference on Learning Representations (ICLR), 2024. (Review Scores 6, 6, 8, 10, Top 1.6%, Rank)
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation
European Conference on Computer Vision (ECCV), 2022. (Oral, Top 2.7%)
Audio-Driven Co-Speech Gesture Video Generation
Advances in Neural Information Processing Systems (NeurIPS), 2022. (Spotlight, Top 5%)
Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
Also appears at the CVPR 2022 Sight and Sound Workshop. [5-min Invited Talk] (link)
Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
Xian Liu*, Lingting Zhu*, Xuanyu Liu, Rui Qian, Ziwei Liu, Lequan Yu.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Visual Sound Localization in the Wild by Cross-Modal Interference Erasing
AAAI Conference on Artificial Intelligence (AAAI), 2022.
TC4D: Trajectory-Conditioned Text-to-4D Generation
European Conference on Computer Vision (ECCV), 2024.
Object-Compositional Neural Implicit Surfaces
European Conference on Computer Vision (ECCV), 2022.
BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion
European Conference on Computer Vision (ECCV), 2024.
TextCraftor: Your Text Encoder Can be Image Quality Controller
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

Experiences

Generative AI Research Intern, Deep Imagination Research, NVIDIA Research.
Mar. 2024 - Now
Topic: Image/Video Foundation Models, Tokenizers, Multi-Modal Language Models.
Research Visiting Student, Toronto Computational Imaging Group.
Dec. 2023 - Mar. 2024
Topic: Text-to-4D Generation.
Research Intern, Tencent AI Laboratory.
Sept. 2023 - Dec. 2023
Topic: Text-Driven 3D Human Generation.
Supervised by: Xiaohang Zhan, Ying Shan.
Research Intern, Creative Vision Group, Snap Research.
May 2023 - Sept. 2023
Topic: Human Generation Foundation Model.
Research Intern, Digital Content Group, Shanghai AI Laboratory.
Jul. 2021 - Feb. 2022
Topic: Digital Human, Gesture Generation.
Supervised by: Hang Zhou, Wayne Wu.
Research Intern, Intelligent Video Group, SenseTime Research.
Aug. 2020 - Jun. 2021
Topic: Digital Human, Face Animation.
Supervised by: Qianyi Wu, Bo Dai.

Invited Talks

Professional Services

  • Conference Program Committee / Reviewer: CVPR (2022-2025), ECCV (2022-2024), ICCV (2023), SIGGRAPH (2024), SIGGRAPH Asia (2022, 2023), NeurIPS (2022-2024), ICML (2022-2024), ICLR (2023-2025), AISTATS (2025), AAAI (2022-2025).
  • Journal Reviewer: IJCV, TVCG, EG, CGF, PG.

Selected Honors & Awards

  • ECCV Outstanding Reviewer Award.
    2024
  • CVPR Travel Award.
    2024
  • ICLR Travel Award.
    2024
  • National Scholarship.
    2019, 2020
  • Hong Kong Ph.D. Fellowship Scheme (HKPFS).
    2021-2025
  • Outstanding Graduate of Zhejiang Province.
    2021
  • Outstanding Bachelor Thesis Award of Zhejiang University, Top 1%.
    2021
  • UCLA CSST Scholarship Program.
    2020
  • SenseTime Scholarship.
    2020
  • Tang Lixin Scholarship.
    2019
  • First Class Scholarship for Academic Excellence.
    2019, 2020

Teaching Experience

  • ENGG 1120, Linear Algebra for Engineers.
    Spring 2022.
  • ENGG 2440, Discrete Mathematics for Engineers.
    Fall 2021.