Lei Xie
ORCID
Publication Activity (10 Years)
Years Active: 2008-2024
Publications (10 Years): 268
Top Topics
Autoregressive
Neural Network
Speaker Verification
Speech Recognition
Top Venues
CoRR
ICASSP
INTERSPEECH
ASRU
Publications
Zhichao Wang
,
Yuanzhe Chen
,
Xinsheng Wang
,
Zhuo Chen
,
Lei Xie
,
Yuping Wang
,
Yuxuan Wang
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion.
CoRR
(2024)
Qijie Shao
,
Pengcheng Guo
,
Jinghao Yan
,
Pengfei Hu
,
Lei Xie
Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition.
IEEE ACM Trans. Audio Speech Lang. Process.
32 (2024)
Xinfa Zhu
,
Yi Lei
,
Tao Li
,
Yongmao Zhang
,
Hongbin Zhou
,
Heng Lu
,
Lei Xie
METTS: Multilingual Emotional Text-to-Speech by Cross-Speaker and Cross-Lingual Emotion Transfer.
IEEE ACM Trans. Audio Speech Lang. Process.
32 (2024)
Dake Guo
,
Xinfa Zhu
,
Liumeng Xue
,
Tao Li
,
Yuanjun Lv
,
Yuepeng Jiang
,
Lei Xie
HiGNN-TTS: Hierarchical Prosody Modeling With Graph Neural Networks for Expressive Long-Form TTS.
ASRU
(2023)
Zhichao Wang
,
Xinsheng Wang
,
Qicong Xie
,
Tao Li
,
Lei Xie
,
Qiao Tian
,
Yuping Wang
MSM-VC: High-Fidelity Source Style Transfer for Non-Parallel Voice Conversion by Multi-Scale Style Modeling.
IEEE ACM Trans. Audio Speech Lang. Process.
31 (2023)
Jixun Yao
,
Yi Lei
,
Qing Wang
,
Pengcheng Guo
,
Ziqian Ning
,
Lei Xie
,
Hai Li
,
Junhui Liu
,
Danming Xie
Preserving Background Sound in Noise-Robust Voice Conversion Via Multi-Task Learning.
ICASSP
(2023)
Kaixun Huang
,
Ao Zhang
,
Binbin Zhang
,
Tianyi Xu
,
Xingchen Song
,
Lei Xie
Spike-Triggered Contextual Biasing for End-to-End Mandarin Speech Recognition.
ASRU
(2023)
Jixun Yao
,
Qing Wang
,
Yi Lei
,
Pengcheng Guo
,
Lei Xie
,
Namin Wang
,
Jie Liu
Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling.
ICASSP
(2023)
Li Zhang
,
Huan Zhao
,
Yue Li
,
Bowen Pang
,
Yannan Wang
,
Hongji Wang
,
Wei Rao
,
Qing Wang
,
Lei Xie
The FlySpeech Audio-Visual Speaker Diarization System for MISP Challenge 2022.
CoRR
(2023)
Qing Wang
,
Jixun Yao
,
Li Zhang
,
Pengcheng Guo
,
Lei Xie
Timbre-Reserved Adversarial Attack in Speaker Identification.
IEEE ACM Trans. Audio Speech Lang. Process.
31 (2023)
Dake Guo
,
Xinfa Zhu
,
Liumeng Xue
,
Tao Li
,
Yuanjun Lv
,
Yuepeng Jiang
,
Lei Xie
HiGNN-TTS: Hierarchical Prosody Modeling with Graph Neural Networks for Expressive Long-form TTS.
CoRR
(2023)
Yuke Li
,
Xinfa Zhu
,
Yi Lei
,
Hai Li
,
Junhui Liu
,
Danming Xie
,
Lei Xie
Zero-Shot Emotion Transfer for Cross-Lingual Speech Synthesis.
ASRU
(2023)
Zhichao Wang
,
Xinsheng Wang
,
Qicong Xie
,
Tao Li
,
Lei Xie
,
Qiao Tian
,
Yuping Wang
MSM-VC: High-fidelity Source Style Transfer for Non-Parallel Voice Conversion by Multi-scale Style Modeling.
CoRR
(2023)
Peikun Chen
,
Fan Yu
,
Yuhao Liang
,
Hongfei Xue
,
Xucheng Wan
,
Naijun Zheng
,
Huan Zhou
,
Lei Xie
BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition.
ASRU
(2023)
Qing Wang
,
Jixun Yao
,
Li Zhang
,
Pengcheng Guo
,
Lei Xie
Timbre-reserved Adversarial Attack in Speaker Identification.
CoRR
(2023)
Yongmao Zhang
,
Guanghou Liu
,
Yi Lei
,
Yunlin Chen
,
Hao Yin
,
Lei Xie
,
Zhifei Li
PromptSpeaker: Speaker Generation Based on Text Descriptions.
ASRU
(2023)
Tao Li
,
Chenxu Hu
,
Jian Cong
,
Xinfa Zhu
,
Jingbei Li
,
Qiao Tian
,
Yuping Wang
,
Lei Xie
DiCLET-TTS: Diffusion Model Based Cross-Lingual Emotion Transfer for Text-to-Speech - A Study Between English and Mandarin.
IEEE ACM Trans. Audio Speech Lang. Process.
31 (2023)
Ziqian Wang
,
Qing Wang
,
Jixun Yao
,
Lei Xie
The NPU-ASLP System for Deepfake Algorithm Recognition in ADD 2023 Challenge.
DADA@IJCAI
(2023)
Ao Zhang
,
He Wang
,
Pengcheng Guo
,
Yihui Fu
,
Lei Xie
,
Yingying Gao
,
Shilei Zhang
,
Junlan Feng
VE-KWS: Visual Modality Enhanced End-to-End Keyword Spotting.
ICASSP
(2023)
Yuke Li
,
Xinfa Zhu
,
Yi Lei
,
Hai Li
,
Junhui Liu
,
Danming Xie
,
Lei Xie
Zero-Shot Emotion Transfer For Cross-Lingual Speech Synthesis.
CoRR
(2023)
Kaixun Huang
,
Ao Zhang
,
Binbin Zhang
,
Tianyi Xu
,
Xingchen Song
,
Lei Xie
Spike-Triggered Contextual Biasing for End-to-End Mandarin Speech Recognition.
CoRR
(2023)
Mingshuai Liu
,
Shubo Lv
,
Zihan Zhang
,
Runduo Han
,
Xiang Hao
,
Xianjun Xia
,
Li Chen
,
Yijian Xiao
,
Lei Xie
Two-Stage Neural Network for ICASSP 2023 Speech Signal Improvement Challenge.
ICASSP
(2023)
Peikun Chen
,
Fan Yu
,
Yuhao Liang
,
Hongfei Xue
,
Xucheng Wan
,
Naijun Zheng
,
Huan Zhou
,
Lei Xie
BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition.
CoRR
(2023)
Ziqian Ning
,
Yuepeng Jiang
,
Pengcheng Zhu
,
Shuai Wang
,
Jixun Yao
,
Lei Xie
,
Mengxiao Bi
DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion.
CoRR
(2023)
Jixun Yao
,
Yuguang Yang
,
Yi Lei
,
Ziqian Ning
,
Yanni Hu
,
Yu Pan
,
Jingjing Yin
,
Hongbin Zhou
,
Heng Lu
,
Lei Xie
PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts.
CoRR
(2023)
Tao Li
,
Zhichao Wang
,
Xinfa Zhu
,
Jian Cong
,
Qiao Tian
,
Yuping Wang
,
Lei Xie
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning.
CoRR
(2023)
Zhichao Wang
,
Xinsheng Wang
,
Lei Xie
,
Yuanzhe Chen
,
Qiao Tian
,
Yuping Wang
Delivering Speaking Style in Low-Resource Voice Conversion with Multi-Factor Constraints.
ICASSP
(2023)
Ao Zhang
,
Pan Zhou
,
Kaixun Huang
,
Yong Zou
,
Ming Liu
,
Lei Xie
U2-KWS: Unified Two-Pass Open-Vocabulary Keyword Spotting with Keyword Bias.
ASRU
(2023)
Tao Li
,
Chenxu Hu
,
Jian Cong
,
Xinfa Zhu
,
Jingbei Li
,
Qiao Tian
,
Yuping Wang
,
Lei Xie
DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech - A Study between English and Mandarin.
CoRR
(2023)
Qijie Shao
,
Pengcheng Guo
,
Jinghao Yan
,
Pengfei Hu
,
Lei Xie
Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition.
CoRR
(2023)
Xinfa Zhu
,
Yi Lei
,
Kun Song
,
Yongmao Zhang
,
Tao Li
,
Lei Xie
Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling.
ICASSP
(2023)
Li Zhang
,
Qing Wang
,
Hongji Wang
,
Yue Li
,
Wei Rao
,
Yannan Wang
,
Lei Xie
Distance-Based Weight Transfer for Fine-Tuning From Near-Field to Far-Field Speaker Verification.
ICASSP
(2023)
Ziqian Ning
,
Yuepeng Jiang
,
Zhichao Wang
,
Bin Zhang
,
Lei Xie
VITS-Based Singing Voice Conversion Leveraging Whisper and Multi-Scale F0 Modeling.
ASRU
(2023)
Junwen Xiong
,
Yu Zhou
,
Peng Zhang
,
Lei Xie
,
Wei Huang
,
Yufei Zha
Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement.
IEEE Trans. Multim.
25 (2023)
Hongfei Xue
,
Qijie Shao
,
Kaixun Huang
,
Peikun Chen
,
Lei Xie
,
Jie Liu
SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition.
CoRR
(2023)
Yangze Li
,
Fan Yu
,
Yuhao Liang
,
Pengcheng Guo
,
Mohan Shi
,
Zhihao Du
,
Shiliang Zhang
,
Lei Xie
SA-Paraformer: Non-Autoregressive End-To-End Speaker-Attributed ASR.
ASRU
(2023)
Zihan Zhang
,
Jiayao Sun
,
Xianjun Xia
,
Ziqian Wang
,
Xiaopeng Yan
,
Yijian Xiao
,
Lei Xie
An Exploration of Task-Decoupling on Two-Stage Neural Post Filter for Real-Time Personalized Acoustic Echo Cancellation.
ASRU
(2023)
Yuanjun Lv
,
Jixun Yao
,
Peikun Chen
,
Hongbin Zhou
,
Heng Lu
,
Lei Xie
SALT: Distinguishable Speaker Anonymization Through Latent Space Transformation.
ASRU
(2023)
Yangze Li
,
Fan Yu
,
Yuhao Liang
,
Pengcheng Guo
,
Mohan Shi
,
Zhihao Du
,
Shiliang Zhang
,
Lei Xie
SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR.
CoRR
(2023)
Jie Wang
,
Menglong Xu
,
Jingyong Hou
,
Binbin Zhang
,
Xiao-Lei Zhang
,
Lei Xie
,
Fuping Pan
WeKws: A Production First Small-Footprint End-to-End Keyword Spotting Toolkit.
ICASSP
(2023)
Yuanjun Lv
,
Jixun Yao
,
Peikun Chen
,
Hongbin Zhou
,
Heng Lu
,
Lei Xie
SALT: Distinguishable Speaker Anonymization Through Latent Space Transformation.
CoRR
(2023)
Xiaopeng Yan
,
Yindi Yang
,
Zhihao Guo
,
Liangliang Peng
,
Lei Xie
The NPU-Elevoc Personalized Speech Enhancement System for ICASSP2023 DNS Challenge.
ICASSP
(2023)
Kun Wei
,
Bei Li
,
Hang Lv
,
Quan Lu
,
Ning Jiang
,
Lei Xie
Conversational Speech Recognition by Learning Audio-textual Cross-modal Contextual Representation.
CoRR
(2023)
Xinfa Zhu
,
Yuanjun Lv
,
Yi Lei
,
Tao Li
,
Wendi He
,
Hongbin Zhou
,
Heng Lu
,
Lei Xie
Vec-Tok Speech: speech vectorization and tokenization for neural speech generation.
CoRR
(2023)
Xinfa Zhu
,
Yuke Li
,
Yi Lei
,
Ning Jiang
,
Guoqing Zhao
,
Lei Xie
Multi-Speaker Expressive Speech Synthesis via Semi-supervised Contrastive Learning.
CoRR
(2023)
Zhichao Wang
,
Yuanzhe Chen
,
Lei Xie
,
Qiao Tian
,
Yuping Wang
LM-VC: Zero-Shot Voice Conversion via Speech Generation Based on Language Models.
IEEE Signal Process. Lett.
30 (2023)
Xiang Hao
,
Chenglin Xu
,
Lei Xie
Neural speech enhancement with unsupervised pre-training and mixture training.
Neural Networks
158 (2023)
Li Zhang
,
Qing Wang
,
Hongji Wang
,
Yue Li
,
Wei Rao
,
Yannan Wang
,
Lei Xie
Distance-based Weight Transfer for Fine-tuning from Near-field to Far-field Speaker Verification.
CoRR
(2023)
Kun Song
,
Yongmao Zhang
,
Yi Lei
,
Jian Cong
,
Hanzhao Li
,
Lei Xie
,
Gang He
,
Jinfeng Bai
DSPGAN: A GAN-Based Universal Vocoder for High-Fidelity TTS by Time-Frequency Domain Supervision from DSP.
ICASSP
(2023)
Ziqian Ning
,
Qicong Xie
,
Pengcheng Zhu
,
Zhichao Wang
,
Liumeng Xue
,
Jixun Yao
,
Lei Xie
,
Mengxiao Bi
Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features.
ICASSP
(2023)
Xinsheng Wang
,
Qicong Xie
,
Jihua Zhu
,
Lei Xie
,
Odette Scharenborg
AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Persons.
IEEE Trans. Multim.
25 (2023)
Hongqiang Du
,
Lei Xie
,
Haizhou Li
Noise-robust voice conversion with domain adversarial training.
Neural Networks
148 (2022)
Yi Lei
,
Shan Yang
,
Xinsheng Wang
,
Lei Xie
MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis.
CoRR
(2022)
Kun Song
,
Heyang Xue
,
Xinsheng Wang
,
Jian Cong
,
Yongmao Zhang
,
Lei Xie
,
Bing Yang
,
Xiong Zhang
,
Dan Su
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation.
ISCSLP
(2022)
Qijie Shao
,
Jinghao Yan
,
Jian Kang
,
Pengcheng Guo
,
Xian Shi
,
Pengfei Hu
,
Lei Xie
Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition.
CoRR
(2022)
Xiaochun An
,
Frank K. Soong
,
Lei Xie
Disentangling Style and Speaker Attributes for TTS Style Transfer.
IEEE ACM Trans. Audio Speech Lang. Process.
30 (2022)
Liumeng Xue
,
Shan Yang
,
Na Hu
,
Dan Su
,
Lei Xie
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers.
INTERSPEECH
(2022)
Xiaochun An
,
Frank K. Soong
,
Lei Xie
Disentangling Style and Speaker Attributes for TTS Style Transfer.
CoRR
(2022)
Fan Yu
,
Zhihao Du
,
Shiliang Zhang
,
Yuxiao Lin
,
Lei Xie
A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings.
INTERSPEECH
(2022)
Zhanheng Yang
,
Sining Sun
,
Jin Li
,
Xiaoming Zhang
,
Xiong Wang
,
Long Ma
,
Lei Xie
CaTT-KWS: A Multi-stage Customized Keyword Spotting Framework based on Cascaded Transducer-Transformer.
CoRR
(2022)
Hongqiang Du
,
Lei Xie
,
Haizhou Li
Noise-robust voice conversion with domain adversarial training.
CoRR
(2022)
Binbin Zhang
,
Hang Lv
,
Pengcheng Guo
,
Qijie Shao
,
Chao Yang
,
Lei Xie
,
Xin Xu
,
Hui Bu
,
Xiaoyu Chen
,
Chenchen Zeng
,
Di Wu
,
Zhendong Peng
WenetSpeech: A 10000+ Hours Multi-Domain Mandarin Corpus for Speech Recognition.
ICASSP
(2022)
Ao Zhang
,
Fan Yu
,
Kaixun Huang
,
Lei Xie
,
Longbiao Wang
,
Eng Siong Chng
,
Hui Bu
,
Binbin Zhang
,
Wei Chen
,
Xin Xu
The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results.
CoRR
(2022)
Kun Song
,
Jian Cong
,
Xinsheng Wang
,
Yongmao Zhang
,
Lei Xie
,
Ning Jiang
,
Haiying Wu
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS.
ISCSLP
(2022)
Kun Wei
,
Yike Zhang
,
Sining Sun
,
Lei Xie
,
Long Ma
Conversational Speech Recognition by Learning Conversation-Level Characteristics.
ICASSP
(2022)
Yukai Ju
,
Wei Rao
,
Xiaopeng Yan
,
Yihui Fu
,
Shubo Lv
,
Luyao Cheng
,
Yannan Wang
,
Lei Xie
,
Shidong Shang
TEA-PSE: Tencent-Ethereal-Audio-Lab Personalized Speech Enhancement System for ICASSP 2022 DNS Challenge.
ICASSP
(2022)
Kun Wei
,
Yike Zhang
,
Sining Sun
,
Lei Xie
,
Long Ma
Conversational Speech Recognition By Learning Conversation-level Characteristics.
CoRR
(2022)
Tao Li
,
Xinsheng Wang
,
Qicong Xie
,
Zhichao Wang
,
Mingqi Jiang
,
Lei Xie
Cross-speaker Emotion Transfer Based On Prosody Compensation for End-to-End Speech Synthesis.
INTERSPEECH
(2022)
Tao Li
,
Xinsheng Wang
,
Qicong Xie
,
Zhichao Wang
,
Mingqi Jiang
,
Lei Xie
Cross-speaker Emotion Transfer Based On Prosody Compensation for End-to-End Speech Synthesis.
CoRR
(2022)
Jixun Yao
,
Yi Lei
,
Qing Wang
,
Pengcheng Guo
,
Ziqian Ning
,
Lei Xie
,
Hai Li
,
Junhui Liu
,
Danming Xie
Preserving background sound in noise-robust voice conversion via multi-task learning.
CoRR
(2022)
Shubo Lv
,
Yihui Fu
,
Yukai Jv
,
Lei Xie
,
Weixin Zhu
,
Wei Rao
,
Yannan Wang
Spatial-DCCRN: DCCRN Equipped with Frame-Level Angle Feature and Hybrid Filtering for Multi-Channel Speech Enhancement.
SLT
(2022)
Tao Li
,
Xinsheng Wang
,
Qicong Xie
,
Zhichao Wang
,
Lei Xie
Cross-Speaker Emotion Disentangling and Transfer for End-to-End Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process.
30 (2022)
Yongmao Zhang
,
Jian Cong
,
Heyang Xue
,
Lei Xie
,
Pengcheng Zhu
,
Mengxiao Bi
VISinger: Variational Inference with Adversarial Learning for End-to-End Singing Voice Synthesis.
ICASSP
(2022)
Qicong Xie
,
Shan Yang
,
Yi Lei
,
Lei Xie
,
Dan Su
End-to-End Voice Conversion with Information Perturbation.
CoRR
(2022)
Fan Yu
,
Shiliang Zhang
,
Pengcheng Guo
,
Yihui Fu
,
Zhihao Du
,
Siqi Zheng
,
Weilong Huang
,
Lei Xie
,
Zheng-Hua Tan
,
DeLiang Wang
,
Yanmin Qian
,
Kong Aik Lee
,
Zhijie Yan
,
Bin Ma
,
Xin Xu
,
Hui Bu
Summary on the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge.
ICASSP
(2022)
Fan Yu
,
Zhihao Du
,
Shiliang Zhang
,
Yuxiao Lin
,
Lei Xie
A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings.
CoRR
(2022)
Fan Yu
,
Shiliang Zhang
,
Yihui Fu
,
Lei Xie
,
Siqi Zheng
,
Zhihao Du
,
Weilong Huang
,
Pengcheng Guo
,
Zhijie Yan
,
Bin Ma
,
Xin Xu
,
Hui Bu
M2Met: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge.
ICASSP
(2022)
Qicong Xie
,
Shan Yang
,
Yi Lei
,
Lei Xie
,
Dan Su
End-to-End Voice Conversion with Information Perturbation.
ISCSLP
(2022)
Zhanheng Yang
,
Hang Lv
,
Xiong Wang
,
Ao Zhang
,
Lei Xie
Minimizing Sequential Confusion Error in Speech Command Recognition.
INTERSPEECH
(2022)
Yongmao Zhang
,
Zhichao Wang
,
Peiji Yang
,
Hongshen Sun
,
Zhisheng Wang
,
Lei Xie
AccentSpeech: Learning Accent from Crowd-sourced Data for Target Speaker TTS with Accents.
ISCSLP
(2022)
Fan Yu
,
Shiliang Zhang
,
Pengcheng Guo
,
Yuhao Liang
,
Zhihao Du
,
Yuxiao Lin
,
Lei Xie
MFCCA: Multi-Frame Cross-Channel Attention for Multi-Speaker ASR in Multi-Party Meeting Scenario.
SLT
(2022)
Kun Song
,
Heyang Xue
,
Xinsheng Wang
,
Jian Cong
,
Yongmao Zhang
,
Lei Xie
,
Bing Yang
,
Xiong Zhang
,
Dan Su
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation.
CoRR
(2022)
Yi Lei
,
Shan Yang
,
Xinsheng Wang
,
Lei Xie
MsEmoTTS: Multi-Scale Emotion Transfer, Prediction, and Control for Emotional Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process.
30 (2022)
Binbin Zhang
,
Di Wu
,
Zhendong Peng
,
Xingchen Song
,
Zhuoyuan Yao
,
Hang Lv
,
Lei Xie
,
Chao Yang
,
Fuping Pan
,
Jianwei Niu
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit.
INTERSPEECH
(2022)
Shubo Lv
,
Yihui Fu
,
Yukai Jv
,
Lei Xie
,
Weixin Zhu
,
Wei Rao
,
Yannan Wang
Spatial-DCCRN: DCCRN Equipped with Frame-Level Angle Feature and Hybrid Filtering for Multi-Channel Speech Enhancement.
CoRR
(2022)
Yi Lei
,
Shan Yang
,
Jian Cong
,
Lei Xie
,
Dan Su
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion.
CoRR
(2022)
Liumeng Xue
,
Frank K. Soong
,
Shaofei Zhang
,
Lei Xie
ParaTTS: Learning Linguistic and Prosodic Cross-sentence Information in Paragraph-based TTS.
CoRR
(2022)
Yi Lei
,
Shan Yang
,
Jian Cong
,
Lei Xie
,
Dan Su
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion.
INTERSPEECH
(2022)
Liumeng Xue
,
Frank K. Soong
,
Shaofei Zhang
,
Lei Xie
ParaTTS: Learning Linguistic and Prosodic Cross-Sentence Information in Paragraph-Based TTS.
IEEE ACM Trans. Audio Speech Lang. Process.
30 (2022)
Fan Yu
,
Shiliang Zhang
,
Pengcheng Guo
,
Yuhao Liang
,
Zhihao Du
,
Yuxiao Lin
,
Lei Xie
MFCCA: Multi-Frame Cross-Channel attention for multi-speaker ASR in Multi-party meeting scenario.
CoRR
(2022)
Jingyong Hou
,
Lei Xie
,
Shilei Zhang
Two-stage streaming keyword detection and localization with multi-scale depthwise temporal convolution.
Neural Networks
150 (2022)
Yongmao Zhang
,
Zhichao Wang
,
Peiji Yang
,
Hongshen Sun
,
Zhisheng Wang
,
Lei Xie
AccentSpeech: Learning Accent from Crowd-sourced Data for Target Speaker TTS with Accents.
CoRR
(2022)
Yi Lei
,
Shan Yang
,
Xinfa Zhu
,
Lei Xie
,
Dan Su
Cross-Speaker Emotion Transfer Through Information Perturbation in Emotional Speech Synthesis.
IEEE Signal Process. Lett.
29 (2022)
Liumeng Xue
,
Shan Yang
,
Na Hu
,
Dan Su
,
Lei Xie
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers.
CoRR
(2022)
Bowen Pang
,
Huan Zhao
,
Gaosheng Zhang
,
Xiaoyue Yang
,
Yang Sun
,
Li Zhang
,
Qing Wang
,
Lei Xie
TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge.
ISCSLP
(2022)
Qijie Shao
,
Jinghao Yan
,
Jian Kang
,
Pengcheng Guo
,
Xian Shi
,
Pengfei Hu
,
Lei Xie
Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition.
INTERSPEECH
(2022)
Zhanheng Yang
,
Hang Lv
,
Xiong Wang
,
Ao Zhang
,
Lei Xie
Minimizing Sequential Confusion Error in Speech Command Recognition.
CoRR
(2022)
Fan Yu
,
Shiliang Zhang
,
Pengcheng Guo
,
Yihui Fu
,
Zhihao Du
,
Siqi Zheng
,
Weilong Huang
,
Lei Xie
,
Zheng-Hua Tan
,
DeLiang Wang
,
Yanmin Qian
,
Kong Aik Lee
,
Zhijie Yan
,
Bin Ma
,
Xin Xu
,
Hui Bu
Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge.
CoRR
(2022)
Binbin Zhang
,
Di Wu
,
Zhendong Peng
,
Xingchen Song
,
Zhuoyuan Yao
,
Hang Lv
,
Lei Xie
,
Chao Yang
,
Fuping Pan
,
Jianwei Niu
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit.
CoRR
(2022)
Qicong Xie
,
Tao Li
,
Xinsheng Wang
,
Zhichao Wang
,
Lei Xie
,
Guoqiao Yu
,
Guanglu Wan
Multi-speaker Multi-style Text-to-speech Synthesis with Single-speaker Single-style Training Data Scenarios.
ISCSLP
(2022)