Yinghao Aaron Li
Publication Activity (10 Years)
Years Active: 2021-2024
Publications (10 Years): 21
Top Topics
Prosodic Features
Knowledge Transfer
Speech Synthesis
Language Model
Top Venues
CoRR
ICASSP
WASPAA
INTERSPEECH
Publications
Gavin Mischler, Yinghao Aaron Li, Stephan Bickel, Ashesh D. Mehta, Nima Mesgarani
Contextual Feature Extraction Hierarchies Converge in Large Language Models and the Brain.
CoRR
(2024)
Xilin Jiang, Cong Han, Yinghao Aaron Li, Nima Mesgarani
Exploring Self-supervised Contrastive Learning of Spatial Sound Event Representation.
ICASSP
(2024)
Xilin Jiang, Yinghao Aaron Li, Adrian Nicolas Florea, Cong Han, Nima Mesgarani
Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis.
CoRR
(2024)
Xilin Jiang, Cong Han, Yinghao Aaron Li, Nima Mesgarani
Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience.
CoRR
(2024)
Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models.
NeurIPS
(2023)
Xilin Jiang, Cong Han, Yinghao Aaron Li, Nima Mesgarani
Exploring Self-Supervised Contrastive Learning of Spatial Sound Event Representation.
CoRR
(2023)
Xilin Jiang, Yinghao Aaron Li, Nima Mesgarani
DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes.
CoRR
(2023)
Yinghao Aaron Li, Cong Han, Xilin Jiang, Nima Mesgarani
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions.
CoRR
(2023)
Yinghao Aaron Li, Cong Han, Nima Mesgarani
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs.
CoRR
(2023)
Cong Han, Vishal Choudhari, Yinghao Aaron Li, Nima Mesgarani
Improved Decoding of Attentional Selection in Multi-Talker Environments with Self-Supervised Learned Speech Representation.
CoRR
(2023)
Yinghao Aaron Li, Cong Han, Xilin Jiang, Nima Mesgarani
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform.
CoRR
(2023)
Yinghao Aaron Li, Cong Han, Xilin Jiang, Nima Mesgarani
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions.
ICASSP
(2023)
Cong Han, Vishal Choudhari, Yinghao Aaron Li, Nima Mesgarani
Improved Decoding of Attentional Selection in Multi-Talker Environments with Self-Supervised Learned Speech Representation.
EMBC
(2023)
Xilin Jiang, Yinghao Aaron Li, Nima Mesgarani
DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes.
INTERSPEECH
(2023)
Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models.
CoRR
(2023)
Yinghao Aaron Li, Cong Han, Nima Mesgarani
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs.
WASPAA
(2023)
Yinghao Aaron Li, Cong Han, Nima Mesgarani
StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models.
SLT
(2022)
Yinghao Aaron Li, Cong Han, Nima Mesgarani
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis.
CoRR
(2022)
Yinghao Aaron Li, Cong Han, Nima Mesgarani
StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models.
CoRR
(2022)
Yinghao Aaron Li, Ali Zare, Nima Mesgarani
StarGANv2-VC: A Diverse, Unsupervised, Non-Parallel Framework for Natural-Sounding Voice Conversion.
INTERSPEECH
(2021)
Yinghao Aaron Li, Ali Zare, Nima Mesgarani
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion.
CoRR
(2021)