Lei He
Publication Activity (10 Years)
Years Active: 2022-2024
Publications (10 Years): 17
Top Topics
Human-Level AI
Prosodic Features
Diffusion Models
Speech Synthesis
Top Venues
CoRR
ICASSP
INTERSPEECH
ICLR
Publications
Leying Zhang, Yao Qian, Long Zhou, Shujie Liu, Dongmei Wang, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Lei He, Sheng Zhao, Michael Zeng
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations.
CoRR
(2024)
Xu Tan, Jiawei Chen, Haohe Liu, Jian Cong, Chen Zhang, Yanqing Liu, Xi Wang, Yichong Leng, Yuanhao Yi, Lei He, Sheng Zhao, Tao Qin, Frank K. Soong, Tie-Yan Liu
NaturalSpeech: End-to-End Text-to-Speech Synthesis With Human-Level Quality.
IEEE Trans. Pattern Anal. Mach. Intell.
46 (6) (2024)
Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Yanqing Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiang-Yang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models.
CoRR
(2024)
Xueyuan Chen, Xi Wang, Shaofei Zhang, Lei He, Zhiyong Wu, Xixin Wu, Helen Meng
StyleSpeech: Self-Supervised Style Enhancing with VQ-VAE-Based Pre-Training for Expressive Audiobook Speech Synthesis.
ICASSP
(2024)
Kai Shen, Zeqian Ju, Xu Tan, Eric Liu, Yichong Leng, Lei He, Tao Qin, Sheng Zhao, Jiang Bian
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers.
ICLR
(2024)
Yichong Leng, Zhifang Guo, Kai Shen, Zeqian Ju, Xu Tan, Eric Liu, Yufei Liu, Dongchao Yang, Leying Zhang, Kaitao Song, Lei He, Xiangyang Li, Sheng Zhao, Tao Qin, Jiang Bian
PromptTTS 2: Describing and Generating Voices with Text Prompt.
ICLR
(2024)
Kai Shen, Zeqian Ju, Xu Tan, Yanqing Liu, Yichong Leng, Lei He, Tao Qin, Sheng Zhao, Jiang Bian
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers.
CoRR
(2023)
Chen Zhang, Shubham Bansal, Aakash Lakhera, Jinzhu Li, Gang Wang, Sandeepkumar Satpal, Sheng Zhao, Lei He
LeanSpeech: The Microsoft Lightweight Speech Synthesis System for LIMMITS Challenge 2023.
ICASSP
(2023)
Brendan Walsh, Mark Hamilton, Greg Newby, Xi Wang, Serena Ruan, Sheng Zhao, Lei He, Shaofei Zhang, Eric Dettinger, William T. Freeman, Markus Weimer
Large-Scale Automatic Audiobook Creation.
INTERSPEECH
(2023)
Yujia Xiao, Shaofei Zhang, Xi Wang, Xu Tan, Lei He, Sheng Zhao, Frank K. Soong, Tan Lee
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading.
INTERSPEECH
(2023)
Yuancheng Wang, Zeqian Ju, Xu Tan, Lei He, Zhizheng Wu, Jiang Bian, Sheng Zhao
AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models.
CoRR
(2023)
Kun Wei, Long Zhou, Ziqiang Zhang, Liping Chen, Shujie Liu, Lei He, Jinyu Li, Furu Wei
Joint Pre-Training with Speech and Bilingual Text for Direct Speech to Speech Translation.
ICASSP
(2023)
Yan Deng, Long Zhou, Yuanhao Yi, Shujie Liu, Lei He
Prosody-Aware SpeechT5 for Expressive Neural TTS.
ICASSP
(2023)
Yihan Wu, Junliang Guo, Xu Tan, Chen Zhang, Bohan Li, Ruihua Song, Lei He, Sheng Zhao, Arul Menezes, Jiang Bian
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing.
AAAI
(2023)
Xu Tan, Jiawei Chen, Haohe Liu, Jian Cong, Chen Zhang, Yanqing Liu, Xi Wang, Yichong Leng, Yuanhao Yi, Lei He, Frank K. Soong, Tao Qin, Sheng Zhao, Tie-Yan Liu
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality.
CoRR
(2022)
Yujia Xiao, Xi Wang, Lei He, Frank K. Soong
Improving FastSpeech TTS with Efficient Self-Attention and Compact Feed-Forward Network.
ICASSP
(2022)
Mutian He, Jingzhou Yang, Lei He, Frank K. Soong
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge.
INTERSPEECH
(2022)