Yuanzhi Liang

Mail: liangyzh18 [at] outlook [dot] com

Bilibili zhihu Xiaohongshu


About Me

I am a research scientist specializing in generative AI at the Institute of Artificial Intelligence (TeleAI), China Telecom. I received my Ph.D. from the University of Technology Sydney in 2024, advised by Dr. Linchao Zhu and Prof. Yi Yang.

I received a Master's degree from Xi'an Jiaotong University in 2020 and was a member of the SMILES LAB, advised by Prof. Xueming Qian and Prof. Li Zhu.

My academic and professional work is driven by two goals: developing machines that perceive real-world environments and that interpret complex semantics. My research interests include multimodal large models, video generation, 3D synthesis, and human-like AI agents. I am also interested in theoretical innovations in generative models.

I am always looking for highly motivated research interns and long-term collaborators. We currently have multiple open positions focusing on, but not limited to, multimodal large models, video generation/editing, and 3D generation. If you are interested in these areas or in discussing potential research collaborations, please feel free to contact me via email. (Internship applicants are encouraged to include a CV.)


Work Experience

  • Jul 2021 - Dec 2021, Alibaba DAMO Academy
    • Research intern working on virtual human synthesis.

  • Jul 2020 - Jul 2021, Baidu Research
    • Research intern working on visual knowledge embedding, object recognition, and multi-modal representation.

  • Mar 2020 - Jun 2020, JD AI Research
    • Research intern working on product recognition.

  • Aug 2018 - Jun 2019, JD AI Research
    • Research intern working on visual-language representation learning.


Selected Honors

  • First place in the AliProducts Challenge @ CVPR 2020 RetailVision workshop.
  • First place in iMat Product Competition @ CVPR 2019 FGVC6 workshop.
  • First place in the Fieldguide Challenge: Moths & Butterflies @ CVPR 2019 FGVC6 workshop.
  • Second place in iFood Competition @ CVPR 2019 FGVC6 workshop.
  • Second place in iMet2020 Fine-grained Attributes Classification Competition @ CVPR 2020 FGVC7 workshop.
  • Kaggle Silver Medal in Deepfake Detection Challenge 2020.

Important Preprints

  • VAST 1.0: A Unified Framework for Controllable and Consistent Video Generation.
    Chi Zhang, Yuanzhi Liang, Xi Qiu, Fangqiu Yi, Xuelong Li.
    arXiv, 2024

  • AntEval: Quantitatively Evaluating Informativeness and Expressiveness of Agent Social Interactions.
    Yuanzhi Liang, Linchao Zhu, Yi Yang.
    arXiv, 2024

  • Tachikuma: Understanding Complex Interactions with Multi-Character and Novel Objects by Large Language Models.
    Yuanzhi Liang, Linchao Zhu, Yi Yang.
    arXiv, 2023

Selected Publications

  • MAAL: Multimodality-Aware Autoencoder-based Affordance Learning for 3D Articulated Objects.
    Yuanzhi Liang, Xiaohan Wang, Linchao Zhu, Yi Yang.
    Accepted by ICCV 2023

  • A Simple Episodic Linear Probe Improves Visual Recognition in the Wild.
    Yuanzhi Liang, Linchao Zhu, Xiaohan Wang, Yi Yang.
    Accepted by CVPR 2022 (Score 1/2/2)

  • SEEG: Semantic Energized Co-speech Gesture Generation.
    Yuanzhi Liang, Qianyu Feng, Linchao Zhu, Li Hu, Pan Pan, Yi Yang.
    Accepted by CVPR 2022

  • VrR-VG: Refocusing Visually-Relevant Relationships.
    Yuanzhi Liang, Yalong Bai, Wei Zhang, Xueming Qian, Li Zhu, Tao Mei.
    Accepted by ICCV 2019

  • IcoCap: Improving Video Captioning by Compounding Images.
    Yuanzhi Liang, Linchao Zhu, Xiaohan Wang, Yi Yang.
    Accepted by TMM 2023

  • Penalizing the Hard Example But Not Too Much: A Strong Baseline for Fine-Grained Visual Classification.
    Yuanzhi Liang, Linchao Zhu, Xiaohan Wang, Yi Yang.
    Accepted by TNNLS 2022

  • Towards Better Railway Service: Passengers Counting in Railway Compartment.
    Yuanzhi Liang, Zhu Li, Xueming Qian.
    Accepted by TCSVT 2020

  • FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention.
    Yu Lu, Yuanzhi Liang, Linchao Zhu, Yi Yang.
    Accepted by NeurIPS 2024

  • Removing Raindrops and Rain Streaks in One Go.
    Ruijie Quan, Xin Yu, Yuanzhi Liang, Yi Yang.
    Accepted by CVPR 2021

  • Product Recognition for Unmanned Vending Machines.
    Chengxu Liu, Zongyang Da, Yuanzhi Liang, Yao Xue, Guoshuai Zhao, Xueming Qian.
    Accepted by TNNLS 2022

  • Food and Ingredient Joint Learning for Fine-Grained Recognition.
    Chengxu Liu, Yuanzhi Liang, Yao Xue, Xueming Qian, Jianlong Fu.
    Accepted by TCSVT 2020


Journal Reviewer

  • Reviewer for TPAMI and TIP.

Conference Reviewer / Program Committee Member

  • Reviewer for ICCV, CVPR, ICLR, NeurIPS, ECCV, MM, AAAI, IJCAI, ICME, and CAAI.

Others

  • Member of MNBVC (Massive Never-ending BT Vast Chinese corpus).