I am a PhD student and multimodal agent researcher in the UTS NLP Group, advised by Prof. Ling Chen and Prof. Meng Fang.

I received my bachelor’s degree from Jiangsu University and my master’s degree from the Department of Computer Science and Technology, University of Adelaide, where I was advised by Yutong Xie and Yifan Liu. I also collaborate closely with Zeyu Zhang from the Australian Institute for Machine Learning.

My research interests include multimodal representation learning, LLM-based agents, and multimodal agents. I have published 1 paper at top international AI conferences such as NeurIPS, ICML, ICLR, and KDD.

To promote communication within the Chinese ML & NLP community, we (together with 11 other young scholars worldwide) founded the MLNLP community in 2021. I am honored to be one of the chairs of the MLNLP committee.

If you like the template of this homepage, feel free to star and fork my open-source template, AcadHomepage.

🔥 News

  • 2023.01: I joined the UTS NLP Group in Sydney as a PhD student!
  • 2022.02: I released a modern and responsive academic personal homepage template. Feel free to star and fork it!

📝 Publications

🎙 Speech Synthesis

NeurIPS 2019

FastSpeech: Fast, Robust and Controllable Text to Speech
Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu

Project

  • FastSpeech is the first fully parallel end-to-end speech synthesis model.
  • Academic Impact: This work has been included in many well-known open-source speech synthesis projects, such as ESPNet. It has been covered by more than 20 media outlets and forums, such as 机器之心 and InfoQ.
  • Industry Impact: FastSpeech has been deployed in the Microsoft Azure TTS service and supports 49 more languages with state-of-the-art AI quality. It was also shown as a text-to-speech system acceleration example at NVIDIA GTC 2020.
ICLR 2021

FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu

Project

ICLR 2024

Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis
Ziyue Jiang, Jinglin Liu, Yi Ren, et al.

Project

  • This work has been deployed on many TikTok products.
  • An advanced zero-shot voice cloning model.
AAAI 2022

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Jinglin Liu, Chengxi Li, Yi Ren, Feiyang Chen, Zhou Zhao

NeurIPS 2021

👄 TalkingFace & Avatar

ICLR 2024

Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis (Spotlight)
Zhenhui Ye, Tianyun Zhong, Yi Ren, et al.

Project | Code

📚 Machine Translation

🎼 Music & Dance Generation

🧑‍🎨 Generative Model

Others

🎖 Honors and Awards

📖 Education

  • 2024.06 - present, PhD, University of Technology Sydney, Sydney.
  • 2022.06 - 2024.04, Master, University of Adelaide, Adelaide.
  • 2015.09 - 2019.06, Undergraduate, Jiangsu University, Zhenjiang.

💻 Internships