|
Biography
I am a research scientist at NVIDIA Research, Deep Imagination Research Group, working on foundation generative models, with a special focus on GenAI pre-/mid-/post-training, visual tokenizers, and their applications in digital humans and embodied AI.
I am a core contributor to the NVIDIA Cosmos series, an ensemble of open-source world foundation models including visual tokenizers, image / video foundation models, vision-language models, and their post-trained variants.
I obtained my Ph.D. degree from CUHK MMLab, supervised by Prof. Dahua Lin, Prof. Ziwei Liu, and Prof. Xihui Liu.
Before that, I received my Bachelor's degree from Zhejiang University in 2021, advised by Prof. Xiaowei Zhou.
During my research journey, I was glad to take multiple industrial internships at NVIDIA Research, Snap Research, Tencent AI Lab, SenseTime Research, and Shanghai AI Lab.
I am always open to discussions and collaborations; feel free to drop me an email if you are interested :)
News
- [07/2025] One paper is accepted to ICCV 2025 as an oral presentation.
- [06/2025] Passed my Ph.D. defense and officially became Dr. Liu!
- [06/2025] We release Cosmos-Predict2, a world foundation model with improved quality. Models open-sourced on GitHub and HF.
- [03/2025] We release Cosmos-Transfer1, a world model with multi-modal controllability. Models open-sourced on GitHub and HF.
- [02/2025] Two papers are accepted to CVPR 2025.
- [01/2025] Cosmos won the Best of CES, Best of AI, and Best Overall Awards in CNET 2025!
- [01/2025] We release Cosmos, a world foundation model platform for Physical AI. Models open-sourced on GitHub and HF.
- [01/2025] Four papers are accepted to ICLR 2025.
- [12/2024] One paper is accepted to AAAI 2025.
- [11/2024] We release Cosmos-Tokenizer, a suite of SOTA image/video tokenizers with models available on GitHub and HF.
- [09/2024] Honored to receive the ECCV 2024 Outstanding Reviewer Award. Many thanks for the recognition!
- [07/2024] Two papers are accepted to ECCV 2024.
- [06/2024] Joined NVIDIA Research as a full-time research scientist, building large-scale foundation models. Stay tuned for our release!
- [05/2024] One paper is accepted to ICML 2024.
- [03/2024] Started my internship at NVIDIA Research. See you in Santa Clara!
- [03/2024] Two papers are accepted to CVPR 2024, with HumanGaussian accepted as Highlight (Top 2.8%). See you in Seattle!
- [01/2024] One paper is accepted to ICLR 2024, with HyperHuman receiving review scores of 6, 6, 8, 10 (Top 1.6%, Rank).
- [01/2024] I will intern at GenAI Team @ Meta AI Research in 2024 Fall. See you in Menlo Park!
- [11/2023] I will intern at Deep Imagination Research @ NVIDIA Research in 2024 Spring with Ming-Yu Liu. See you in Santa Clara!
- [11/2023] HumanGaussian, a high-quality 3D human generation framework, is released, with all the code and models available!
- [10/2023] HyperHuman, a hyper-realistic human generation foundation model developed in collaboration with Snap Research, is on arXiv!
- [07/2023] One paper is accepted to ICCV 2023.
- [05/2023] Started my internship at Snap Research. See you in Los Angeles!
- [03/2023] Two papers are accepted to CVPR 2023.
- [03/2023] One paper is accepted to TMLR 2023.
- [09/2022] One paper is accepted to NeurIPS 2022, with ANGIE accepted as Spotlight (Top 5%)!
- [07/2022] Three papers are accepted to ECCV 2022, with SSP-NeRF accepted as Oral (Top 2.7%)!
- [03/2022] One paper is accepted to CVPR 2022.
- [12/2021] One paper is accepted to AAAI 2022.
Industrial Research
|
Cosmos World Foundation Model Platform for Physical AI
Contributions: Auto-Regressive Foundation Model Pre-Training & Post-Training. (CES'25 Best of AI, Best Overall)
|
|
Cosmos Tokenizer: A Suite of Image and Video Neural Tokenizers
Contributions: Continuous/Discrete Image/Video Tokenizers.
|
|
Cosmos-Transfer1: World Generation with Adaptive Multimodal Control
Contributions: Adaptive Multi-Modal Control, Data Processing Pipelines, Open-Source Repo.
|
|
Cosmos-Predict2: World Foundation Model Platform for Physical AI
Contributions: Data Processing Pipelines, Captioning, Long Video Generation, Transfer Post-training.
|
Selected Publications [ Full List ] (* indicates equal contribution)
|
DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior
Junzhe Lu,
Jing Lin,
Hongkun Dou,
Ailing Zeng,
Yue Deng,
Xian Liu,
Zhongang Cai,
Lei Yang,
Yulun Zhang,
Haoqian Wang,
Ziwei Liu.
International Conference on Computer Vision (ICCV), 2025. (Oral)
|
|
HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. (Highlight, Top 2.8%)
|
|
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
International Conference on Learning Representations (ICLR), 2024. (Review Score 6, 6, 8, 10, Top 1.6%, Rank)
|
|
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation
European Conference on Computer Vision (ECCV), 2022. (Oral, Top 2.7%)
|
|
Audio-Driven Co-Speech Gesture Video Generation
Advances in Neural Information Processing Systems (NeurIPS), 2022. (Spotlight, Top 5%)
|
|
Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
|
|
Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
|
|
Visual Sound Localization in the Wild by Cross-Modal Interference Erasing
AAAI Conference on Artificial Intelligence (AAAI), 2022.
|
|
HMAR: Efficient Hierarchical Masked AutoRegressive Image Generation
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025.
|
|
TC4D: Trajectory-Conditioned Text-to-4D Generation
Sherwin Bahmani*,
Xian Liu*,
Yifan Wang*,
Ivan Skorokhodov,
Victor Rong,
Ziwei Liu,
Xihui Liu,
Jeong Joon Park,
Sergey Tulyakov,
Gordon Wetzstein,
Andrea Tagliasacchi,
David B. Lindell.
European Conference on Computer Vision (ECCV), 2024.
|
|
3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation
International Conference on Learning Representations (ICLR), 2025.
|
|
High-Quality Joint Image and Video Tokenization with Causal VAE
International Conference on Learning Representations (ICLR), 2025.
|
|
BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion
European Conference on Computer Vision (ECCV), 2024.
|
|
TextCraftor: Your Text Encoder Can be Image Quality Controller
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
|
Experiences
|
Research Scientist.
Jun. 2024 - Now
NVIDIA Research, Deep Imagination Research Group.
|
|
Generative AI Research Intern, Deep Imagination Research, NVIDIA Research.
Mar. 2024 - Jun. 2024
Topic: Image/Video Foundation Models, Tokenizers, Multi-Modal Language Models.
|
|
Research Visiting Student, Toronto Computational Imaging Group.
Dec. 2023 - Mar. 2024
Topic: Text-to-4D Generation.
|
|
Research Intern, Tencent AI Laboratory.
Sept. 2023 - Dec. 2023
Topic: Text-Driven 3D Human Generation.
|
|
Research Intern, Creative Vision Group, Snap Research.
May 2023 - Sept. 2023
Topic: Human Generation Foundation Model.
|
|
Research Intern, Digital Content Group, Shanghai AI Laboratory.
Jul. 2021 - Feb. 2022
Topic: Digital Human, Gesture Generation.
|
|
Research Intern, Intelligent Video Group, SenseTime Research.
Aug. 2020 - Jun. 2021
Topic: Digital Human, Face Animation.
|
Invited Talks
Professional Services
- Conference Program Committee / Reviewer: CVPR, ECCV, ICCV, WACV, SIGGRAPH, SIGGRAPH Asia, NeurIPS, ICML, ICLR, AISTATS, AAAI, ACM MM.
- Journal Reviewer: TPAMI, IJCV, TVCG, EG, CGF, PG.
Selected Honors & Awards
- CNET 2025 Best of CES, Best of AI, and Best Overall.
2025
- ECCV Outstanding Reviewer Award.
2024
- CVPR Travel Award.
2024
- ICLR Travel Award.
2024
- National Scholarship.
2019, 2020
- Hong Kong Ph.D. Fellowship Scheme (HKPFS).
2021 - 2025
- Outstanding Graduate of Zhejiang Province.
2021
- Outstanding Bachelor Thesis Award of Zhejiang University, Top 1%.
2021
- UCLA CSST Scholarship Program.
2020
- SenseTime Scholarship.
2020
- Tang Lixin Scholarship.
2019
- First Class Scholarship for Academic Excellence.
2019, 2020
Teaching Experience
- ENGG 1120, Linear Algebra for Engineers.
Spring 2022.
- ENGG 2440, Discrete Mathematics for Engineers.
Fall 2021.
|