​
Lei Xie
ORCID
Publication Activity (10 Years)
Years Active: 2008-2024
Publications (10 Years): 283
Top Topics
Speech Recognition
Spoken Term Detection
Speaker Verification
Keyword Spotting
Top Venues
CoRR
INTERSPEECH
ICASSP
ASRU
Publications
Linhan Ma
,
Xinfa Zhu
,
Yuanjun Lv
,
Zhichao Wang
,
Ziqian Wang
,
Wendi He
,
Hongbin Zhou
,
Lei Xie
Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy.
CoRR
(2024)
Mingshuai Liu
,
Zhuangqi Chen
,
Xiaopeng Yan
,
Yuanjun Lv
,
Xianjun Xia
,
Chuanzeng Huang
,
Yijian Xiao
,
Lei Xie
RaD-Net 2: A causal two-stage repairing and denoising speech enhancement network with knowledge distillation and complex axial self-attention.
CoRR
(2024)
Jixun Yao
,
Yuguang Yang
,
Yi Lei
,
Ziqian Ning
,
Yanni Hu
,
Yu Pan
,
Jingjing Yin
,
Hongbin Zhou
,
Heng Lu
,
Lei Xie
PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts.
ICASSP
(2024)
Yuepeng Jiang
,
Tao Li
,
Fengyu Yang
,
Lei Xie
,
Meng Meng
,
Yujun Wang
Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling.
CoRR
(2024)
Runduo Han
,
Weiming Xu
,
Zihan Zhang
,
Mingshuai Liu
,
Lei Xie
Distil-DCCRN: A Small-Footprint DCCRN Leveraging Feature-Based Knowledge Distillation in Speech Enhancement.
IEEE Signal Process. Lett.
31 (2024)
Bingshen Mu
,
Yangze Li
,
Qijie Shao
,
Kun Wei
,
Xucheng Wan
,
Naijun Zheng
,
Huan Zhou
,
Lei Xie
MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition.
CoRR
(2024)
Jixun Yao
,
Qing Wang
,
Pengcheng Guo
,
Ziqian Ning
,
Lei Xie
Distinctive and Natural Speaker Anonymization via Singular Value Transformation-Assisted Matrix.
IEEE ACM Trans. Audio Speech Lang. Process.
32 (2024)
Li Zhang
,
Ning Jiang
,
Qing Wang
,
Yue Li
,
Quan Lu
,
Lei Xie
Whisper-SV: Adapting Whisper for Low-data-resource Speaker Verification.
CoRR
(2024)
He Wang
,
Pengcheng Guo
,
Xucheng Wan
,
Huan Zhou
,
Lei Xie
Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder.
CoRR
(2024)
Zhichao Wang
,
Yuanzhe Chen
,
Xinsheng Wang
,
Lei Xie
,
Yuping Wang
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion.
ACL (1)
(2024)
Xuelong Geng
,
Tianyi Xu
,
Kun Wei
,
Bingshen Mu
,
Hongfei Xue
,
He Wang
,
Yangze Li
,
Pengcheng Guo
,
Yuhang Dai
,
Longhao Li
,
Mingchen Shao
,
Lei Xie
Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets.
CoRR
(2024)
Rong Gong
,
Hongfei Xue
,
Lezhi Wang
,
Xin Xu
,
Qisheng Li
,
Lei Xie
,
Hui Bu
,
Shaomei Wu
,
Jiaming Zhou
,
Yong Qin
,
Binbin Zhang
,
Jun Du
,
Jia Bin
,
Ming Li
AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection.
CoRR
(2024)
Zhichao Wang
,
Yuanzhe Chen
,
Xinsheng Wang
,
Zhuo Chen
,
Lei Xie
,
Yuping Wang
,
Yuxuan Wang
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion.
CoRR
(2024)
Bingshen Mu
,
Xucheng Wan
,
Naijun Zheng
,
Huan Zhou
,
Lei Xie
MMGER: Multi-Modal and Multi-Granularity Generative Error Correction With LLM for Joint Accent and Speech Recognition.
IEEE Signal Process. Lett.
31 (2024)
Qijie Shao
,
Pengcheng Guo
,
Jinghao Yan
,
Pengfei Hu
,
Lei Xie
Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition.
IEEE ACM Trans. Audio Speech Lang. Process.
32 (2024)
Xinfa Zhu
,
Yi Lei
,
Tao Li
,
Yongmao Zhang
,
Hongbin Zhou
,
Heng Lu
,
Lei Xie
METTS: Multilingual Emotional Text-to-Speech by Cross-Speaker and Cross-Lingual Emotion Transfer.
IEEE ACM Trans. Audio Speech Lang. Process.
32 (2024)
Yuanjun Lv
,
Hai Li
,
Ying Yan
,
Junhui Liu
,
Danming Xie
,
Lei Xie
FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter.
CoRR
(2024)
Ziqian Wang
,
Xinfa Zhu
,
Zihan Zhang
,
Yuanjun Lv
,
Ning Jiang
,
Guoqing Zhao
,
Lei Xie
SELM: Speech Enhancement using Discrete Tokens and Language Models.
ICASSP
(2024)
Kun Wei
,
Bei Li
,
Hang Lv
,
Quan Lu
,
Ning Jiang
,
Lei Xie
Conversational Speech Recognition by Learning Audio-Textual Cross-Modal Contextual Representation.
IEEE ACM Trans. Audio Speech Lang. Process.
32 (2024)
Peikun Chen
,
Sining Sun
,
Changhao Shan
,
Qing Yang
,
Lei Xie
Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study.
CoRR
(2024)
Li Zhang
,
Ning Jiang
,
Qing Wang
,
Yue Li
,
Quan Lu
,
Lei Xie
Whisper-SV: Adapting Whisper for low-data-resource speaker verification.
Speech Commun.
163 (2024)
Ziqian Ning
,
Yuepeng Jiang
,
Pengcheng Zhu
,
Shuai Wang
,
Jixun Yao
,
Lei Xie
,
Mengxiao Bi
DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion.
ICASSP
(2024)
Zhichao Wang
,
Liumeng Xue
,
Qiuqiang Kong
,
Lei Xie
,
Yuanzhe Chen
,
Qiao Tian
,
Yuping Wang
Multi-Level Temporal-Channel Speaker Retrieval for Zero-Shot Voice Conversion.
IEEE ACM Trans. Audio Speech Lang. Process.
32 (2024)
Dake Guo
,
Xinfa Zhu
,
Liumeng Xue
,
Tao Li
,
Yuanjun Lv
,
Yuepeng Jiang
,
Lei Xie
HiGNN-TTS: Hierarchical Prosody Modeling With Graph Neural Networks for Expressive Long-Form TTS.
ASRU
(2023)
Zhichao Wang
,
Xinsheng Wang
,
Qicong Xie
,
Tao Li
,
Lei Xie
,
Qiao Tian
,
Yuping Wang
MSM-VC: High-Fidelity Source Style Transfer for Non-Parallel Voice Conversion by Multi-Scale Style Modeling.
IEEE ACM Trans. Audio Speech Lang. Process.
31 (2023)
Jixun Yao
,
Yi Lei
,
Qing Wang
,
Pengcheng Guo
,
Ziqian Ning
,
Lei Xie
,
Hai Li
,
Junhui Liu
,
Danming Xie
Preserving Background Sound in Noise-Robust Voice Conversion Via Multi-Task Learning.
ICASSP
(2023)
Kaixun Huang
,
Ao Zhang
,
Zhanheng Yang
,
Pengcheng Guo
,
Bingshen Mu
,
Tianyi Xu
,
Lei Xie
Contextualized End-to-End Speech Recognition with Contextual Phrase Prediction Network.
INTERSPEECH
(2023)
Yongmao Zhang
,
Heyang Xue
,
Hanzhao Li
,
Lei Xie
,
Tingwei Guo
,
Ruixiong Zhang
,
Caixia Gong
VISinger2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer.
INTERSPEECH
(2023)
Kaixun Huang
,
Ao Zhang
,
Binbin Zhang
,
Tianyi Xu
,
Xingchen Song
,
Lei Xie
Spike-Triggered Contextual Biasing for End-to-End Mandarin Speech Recognition.
ASRU
(2023)
Jixun Yao
,
Qing Wang
,
Yi Lei
,
Pengcheng Guo
,
Lei Xie
,
Namin Wang
,
Jie Liu
Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling.
ICASSP
(2023)
Qing Wang
,
Jixun Yao
,
Ziqian Wang
,
Pengcheng Guo
,
Lei Xie
Pseudo-Siamese Network based Timbre-reserved Black-box Adversarial Attack in Speaker Identification.
INTERSPEECH
(2023)
Tianyi Xu
,
Zhanheng Yang
,
Kaixun Huang
,
Pengcheng Guo
,
Ao Zhang
,
Biao Li
,
Changru Chen
,
Chao Li
,
Lei Xie
Adaptive Contextual Biasing for Transducer Based Streaming Speech Recognition.
INTERSPEECH
(2023)
Li Zhang
,
Huan Zhao
,
Yue Li
,
Bowen Pang
,
Yannan Wang
,
Hongji Wang
,
Wei Rao
,
Qing Wang
,
Lei Xie
The FlySpeech Audio-Visual Speaker Diarization System for MISP Challenge 2022.
CoRR
(2023)
Qing Wang
,
Jixun Yao
,
Li Zhang
,
Pengcheng Guo
,
Lei Xie
Timbre-Reserved Adversarial Attack in Speaker Identification.
IEEE ACM Trans. Audio Speech Lang. Process.
31 (2023)
Dake Guo
,
Xinfa Zhu
,
Liumeng Xue
,
Tao Li
,
Yuanjun Lv
,
Yuepeng Jiang
,
Lei Xie
HiGNN-TTS: Hierarchical Prosody Modeling with Graph Neural Networks for Expressive Long-form TTS.
CoRR
(2023)
Yuke Li
,
Xinfa Zhu
,
Yi Lei
,
Hai Li
,
Junhui Liu
,
Danming Xie
,
Lei Xie
Zero-Shot Emotion Transfer for Cross-Lingual Speech Synthesis.
ASRU
(2023)
Zhichao Wang
,
Xinsheng Wang
,
Qicong Xie
,
Tao Li
,
Lei Xie
,
Qiao Tian
,
Yuping Wang
MSM-VC: High-fidelity Source Style Transfer for Non-Parallel Voice Conversion by Multi-scale Style Modeling.
CoRR
(2023)
Guanghou Liu
,
Yongmao Zhang
,
Yi Lei
,
Yunlin Chen
,
Rui Wang
,
Lei Xie
,
Zhifei Li
PromptStyle: Controllable Style Transfer for Text-to-Speech with Natural Language Descriptions.
INTERSPEECH
(2023)
Hongfei Xue
,
Qijie Shao
,
Peikun Chen
,
Pengcheng Guo
,
Lei Xie
,
Jie Liu
TranUSR: Phoneme-to-word Transcoder Based Unified Speech Representation Learning for Cross-lingual Speech Recognition.
INTERSPEECH
(2023)
Peikun Chen
,
Fan Yu
,
Yuhao Liang
,
Hongfei Xue
,
Xucheng Wan
,
Naijun Zheng
,
Huan Zhou
,
Lei Xie
BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition.
ASRU
(2023)
Qing Wang
,
Jixun Yao
,
Li Zhang
,
Pengcheng Guo
,
Lei Xie
Timbre-reserved Adversarial Attack in Speaker Identification.
CoRR
(2023)
Yongmao Zhang
,
Guanghou Liu
,
Yi Lei
,
Yunlin Chen
,
Hao Yin
,
Lei Xie
,
Zhifei Li
PromptSpeaker: Speaker Generation Based on Text Descriptions.
ASRU
(2023)
Tao Li
,
Chenxu Hu
,
Jian Cong
,
Xinfa Zhu
,
Jingbei Li
,
Qiao Tian
,
Yuping Wang
,
Lei Xie
DiCLET-TTS: Diffusion Model Based Cross-Lingual Emotion Transfer for Text-to-Speech - A Study Between English and Mandarin.
IEEE ACM Trans. Audio Speech Lang. Process.
31 (2023)
Ziqian Wang
,
Qing Wang
,
Jixun Yao
,
Lei Xie
The NPU-ASLP System for Deepfake Algorithm Recognition in ADD 2023 Challenge.
DADA@IJCAI
(2023)
Ao Zhang
,
He Wang
,
Pengcheng Guo
,
Yihui Fu
,
Lei Xie
,
Yingying Gao
,
Shilei Zhang
,
Junlan Feng
VE-KWS: Visual Modality Enhanced End-to-End Keyword Spotting.
ICASSP
(2023)
Yuke Li
,
Xinfa Zhu
,
Yi Lei
,
Hai Li
,
Junhui Liu
,
Danming Xie
,
Lei Xie
Zero-Shot Emotion Transfer for Cross-Lingual Speech Synthesis.
CoRR
(2023)
Kaixun Huang
,
Ao Zhang
,
Binbin Zhang
,
Tianyi Xu
,
Xingchen Song
,
Lei Xie
Spike-Triggered Contextual Biasing for End-to-End Mandarin Speech Recognition.
CoRR
(2023)
Mingshuai Liu
,
Shubo Lv
,
Zihan Zhang
,
Runduo Han
,
Xiang Hao
,
Xianjun Xia
,
Li Chen
,
Yijian Xiao
,
Lei Xie
Two-Stage Neural Network for ICASSP 2023 Speech Signal Improvement Challenge.
ICASSP
(2023)
Yuhao Liang
,
Fan Yu
,
Yangze Li
,
Pengcheng Guo
,
Shiliang Zhang
,
Qian Chen
,
Lei Xie
BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR.
INTERSPEECH
(2023)
Peikun Chen
,
Fan Yu
,
Yuhao Liang
,
Hongfei Xue
,
Xucheng Wan
,
Naijun Zheng
,
Huan Zhou
,
Lei Xie
BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition.
CoRR
(2023)
Kun Song
,
Yi Ren
,
Yi Lei
,
Chunfeng Wang
,
Kun Wei
,
Lei Xie
,
Xiang Yin
,
Zejun Ma
StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation.
INTERSPEECH
(2023)
Ziqian Ning
,
Yuepeng Jiang
,
Pengcheng Zhu
,
Shuai Wang
,
Jixun Yao
,
Lei Xie
,
Mengxiao Bi
DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion.
CoRR
(2023)
Jixun Yao
,
Yuguang Yang
,
Yi Lei
,
Ziqian Ning
,
Yanni Hu
,
Yu Pan
,
Jingjing Yin
,
Hongbin Zhou
,
Heng Lu
,
Lei Xie
PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts.
CoRR
(2023)
Tao Li
,
Zhichao Wang
,
Xinfa Zhu
,
Jian Cong
,
Qiao Tian
,
Yuping Wang
,
Lei Xie
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning.
CoRR
(2023)
Zhichao Wang
,
Xinsheng Wang
,
Lei Xie
,
Yuanzhe Chen
,
Qiao Tian
,
Yuping Wang
Delivering Speaking Style in Low-Resource Voice Conversion with Multi-Factor Constraints.
ICASSP
(2023)
Ao Zhang
,
Pan Zhou
,
Kaixun Huang
,
Yong Zou
,
Ming Liu
,
Lei Xie
U2-KWS: Unified Two-Pass Open-Vocabulary Keyword Spotting with Keyword Bias.
ASRU
(2023)
Tao Li
,
Chenxu Hu
,
Jian Cong
,
Xinfa Zhu
,
Jingbei Li
,
Qiao Tian
,
Yuping Wang
,
Lei Xie
DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech - A Study between English and Mandarin.
CoRR
(2023)
Qijie Shao
,
Pengcheng Guo
,
Jinghao Yan
,
Pengfei Hu
,
Lei Xie
Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition.
CoRR
(2023)
Ziqian Ning
,
Yuepeng Jiang
,
Pengcheng Zhu
,
Jixun Yao
,
Shuai Wang
,
Lei Xie
,
Mengxiao Bi
DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding.
INTERSPEECH
(2023)
Xinfa Zhu
,
Yi Lei
,
Kun Song
,
Yongmao Zhang
,
Tao Li
,
Lei Xie
Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling.
ICASSP
(2023)
Li Zhang
,
Qing Wang
,
Hongji Wang
,
Yue Li
,
Wei Rao
,
Yannan Wang
,
Lei Xie
Distance-Based Weight Transfer for Fine-Tuning From Near-Field to Far-Field Speaker Verification.
ICASSP
(2023)
Ziqian Ning
,
Yuepeng Jiang
,
Zhichao Wang
,
Bin Zhang
,
Lei Xie
VITS-Based Singing Voice Conversion Leveraging Whisper and Multi-Scale F0 Modeling.
ASRU
(2023)
Junwen Xiong
,
Yu Zhou
,
Peng Zhang
,
Lei Xie
,
Wei Huang
,
Yufei Zha
Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement.
IEEE Trans. Multim.
25 (2023)
Hongfei Xue
,
Qijie Shao
,
Kaixun Huang
,
Peikun Chen
,
Lei Xie
,
Jie Liu
SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition.
CoRR
(2023)
Yangze Li
,
Fan Yu
,
Yuhao Liang
,
Pengcheng Guo
,
Mohan Shi
,
Zhihao Du
,
Shiliang Zhang
,
Lei Xie
SA-Paraformer: Non-Autoregressive End-to-End Speaker-Attributed ASR.
ASRU
(2023)
Zihan Zhang
,
Jiayao Sun
,
Xianjun Xia
,
Ziqian Wang
,
Xiaopeng Yan
,
Yijian Xiao
,
Lei Xie
An Exploration of Task-Decoupling on Two-Stage Neural Post Filter for Real-Time Personalized Acoustic Echo Cancellation.
ASRU
(2023)
Yuanjun Lv
,
Jixun Yao
,
Peikun Chen
,
Hongbin Zhou
,
Heng Lu
,
Lei Xie
SALT: Distinguishable Speaker Anonymization Through Latent Space Transformation.
ASRU
(2023)
Yangze Li
,
Fan Yu
,
Yuhao Liang
,
Pengcheng Guo
,
Mohan Shi
,
Zhihao Du
,
Shiliang Zhang
,
Lei Xie
SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR.
CoRR
(2023)
Jie Wang
,
Menglong Xu
,
Jingyong Hou
,
Binbin Zhang
,
Xiao-Lei Zhang
,
Lei Xie
,
Fuping Pan
WeKws: A Production First Small-Footprint End-to-End Keyword Spotting Toolkit.
ICASSP
(2023)
Yuanjun Lv
,
Jixun Yao
,
Peikun Chen
,
Hongbin Zhou
,
Heng Lu
,
Lei Xie
SALT: Distinguishable Speaker Anonymization Through Latent Space Transformation.
CoRR
(2023)
Xiaopeng Yan
,
Yindi Yang
,
Zhihao Guo
,
Liangliang Peng
,
Lei Xie
The NPU-Elevoc Personalized Speech Enhancement System for ICASSP 2023 DNS Challenge.
ICASSP
(2023)
Kun Wei
,
Bei Li
,
Hang Lv
,
Quan Lu
,
Ning Jiang
,
Lei Xie
Conversational Speech Recognition by Learning Audio-textual Cross-modal Contextual Representation.
CoRR
(2023)
Shubo Lv
,
Xiong Wang
,
Sining Sun
,
Long Ma
,
Lei Xie
DCCRN-KWS: An Audio Bias Based Model for Noise Robust Small-Footprint Keyword Spotting.
INTERSPEECH
(2023)
Xinfa Zhu
,
Yuanjun Lv
,
Yi Lei
,
Tao Li
,
Wendi He
,
Hongbin Zhou
,
Heng Lu
,
Lei Xie
Vec-Tok Speech: speech vectorization and tokenization for neural speech generation.
CoRR
(2023)
Xinfa Zhu
,
Yuke Li
,
Yi Lei
,
Ning Jiang
,
Guoqing Zhao
,
Lei Xie
Multi-Speaker Expressive Speech Synthesis via Semi-supervised Contrastive Learning.
CoRR
(2023)
Zhichao Wang
,
Yuanzhe Chen
,
Lei Xie
,
Qiao Tian
,
Yuping Wang
LM-VC: Zero-Shot Voice Conversion via Speech Generation Based on Language Models.
IEEE Signal Process. Lett.
30 (2023)
Xiang Hao
,
Chenglin Xu
,
Lei Xie
Neural speech enhancement with unsupervised pre-training and mixture training.
Neural Networks
158 (2023)
Zhanheng Yang
,
Sining Sun
,
Xiong Wang
,
Yike Zhang
,
Long Ma
,
Lei Xie
Two Stage Contextual Word Filtering for Context Bias in Unified Streaming and Non-streaming Transducer.
INTERSPEECH
(2023)
Li Zhang
,
Qing Wang
,
Hongji Wang
,
Yue Li
,
Wei Rao
,
Yannan Wang
,
Lei Xie
Distance-based Weight Transfer for Fine-tuning from Near-field to Far-field Speaker Verification.
CoRR
(2023)
Kun Song
,
Yongmao Zhang
,
Yi Lei
,
Jian Cong
,
Hanzhao Li
,
Lei Xie
,
Gang He
,
Jinfeng Bai
DSPGAN: A GAN-Based Universal Vocoder for High-Fidelity TTS by Time-Frequency Domain Supervision from DSP.
ICASSP
(2023)
Ziqian Ning
,
Qicong Xie
,
Pengcheng Zhu
,
Zhichao Wang
,
Liumeng Xue
,
Jixun Yao
,
Lei Xie
,
Mengxiao Bi
Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features.
ICASSP
(2023)
Xinsheng Wang
,
Qicong Xie
,
Jihua Zhu
,
Lei Xie
,
Odette Scharenborg
AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Persons.
IEEE Trans. Multim.
25 (2023)
Hongqiang Du
,
Lei Xie
,
Haizhou Li
Noise-robust voice conversion with domain adversarial training.
Neural Networks
148 (2022)
Yi Lei
,
Shan Yang
,
Xinsheng Wang
,
Lei Xie
MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis.
CoRR
(2022)
Kun Song
,
Heyang Xue
,
Xinsheng Wang
,
Jian Cong
,
Yongmao Zhang
,
Lei Xie
,
Bing Yang
,
Xiong Zhang
,
Dan Su
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation.
ISCSLP
(2022)
Qijie Shao
,
Jinghao Yan
,
Jian Kang
,
Pengcheng Guo
,
Xian Shi
,
Pengfei Hu
,
Lei Xie
Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition.
CoRR
(2022)
Xiaochun An
,
Frank K. Soong
,
Lei Xie
Disentangling Style and Speaker Attributes for TTS Style Transfer.
IEEE ACM Trans. Audio Speech Lang. Process.
30 (2022)
Liumeng Xue
,
Shan Yang
,
Na Hu
,
Dan Su
,
Lei Xie
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers.
INTERSPEECH
(2022)
Xiaochun An
,
Frank K. Soong
,
Lei Xie
Disentangling Style and Speaker Attributes for TTS Style Transfer.
CoRR
(2022)
Fan Yu
,
Zhihao Du
,
Shiliang Zhang
,
Yuxiao Lin
,
Lei Xie
A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings.
INTERSPEECH
(2022)
Zhanheng Yang
,
Sining Sun
,
Jin Li
,
Xiaoming Zhang
,
Xiong Wang
,
Long Ma
,
Lei Xie
CaTT-KWS: A Multi-stage Customized Keyword Spotting Framework based on Cascaded Transducer-Transformer.
CoRR
(2022)
Hongqiang Du
,
Lei Xie
,
Haizhou Li
Noise-robust voice conversion with domain adversarial training.
CoRR
(2022)
Binbin Zhang
,
Hang Lv
,
Pengcheng Guo
,
Qijie Shao
,
Chao Yang
,
Lei Xie
,
Xin Xu
,
Hui Bu
,
Xiaoyu Chen
,
Chenchen Zeng
,
Di Wu
,
Zhendong Peng
WenetSpeech: A 10000+ Hours Multi-Domain Mandarin Corpus for Speech Recognition.
ICASSP
(2022)
Ao Zhang
,
Fan Yu
,
Kaixun Huang
,
Lei Xie
,
Longbiao Wang
,
Eng Siong Chng
,
Hui Bu
,
Binbin Zhang
,
Wei Chen
,
Xin Xu
The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results.
CoRR
(2022)
Kun Song
,
Jian Cong
,
Xinsheng Wang
,
Yongmao Zhang
,
Lei Xie
,
Ning Jiang
,
Haiying Wu
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS.
ISCSLP
(2022)
Kun Wei
,
Yike Zhang
,
Sining Sun
,
Lei Xie
,
Long Ma
Conversational Speech Recognition by Learning Conversation-Level Characteristics.
ICASSP
(2022)
Yukai Ju
,
Wei Rao
,
Xiaopeng Yan
,
Yihui Fu
,
Shubo Lv
,
Luyao Cheng
,
Yannan Wang
,
Lei Xie
,
Shidong Shang
TEA-PSE: Tencent-Ethereal-Audio-Lab Personalized Speech Enhancement System for ICASSP 2022 DNS Challenge.
ICASSP
(2022)
Kun Wei
,
Yike Zhang
,
Sining Sun
,
Lei Xie
,
Long Ma
Conversational Speech Recognition by Learning Conversation-level Characteristics.
CoRR
(2022)
Tao Li
,
Xinsheng Wang
,
Qicong Xie
,
Zhichao Wang
,
Mingqi Jiang
,
Lei Xie
Cross-speaker Emotion Transfer Based On Prosody Compensation for End-to-End Speech Synthesis.
INTERSPEECH
(2022)
Tao Li
,
Xinsheng Wang
,
Qicong Xie
,
Zhichao Wang
,
Mingqi Jiang
,
Lei Xie
Cross-speaker Emotion Transfer Based On Prosody Compensation for End-to-End Speech Synthesis.
CoRR
(2022)