Manthan Thakker
Publication Activity (10 Years)
Years Active: 2022-2024
Publications (10 Years): 10
Top Topics
Noise Suppression
Gaussian Filter
Language Model
Speech Enhancement
Top Venues
CoRR
ICASSP
INTERSPEECH
IEEE ACM Trans. Audio Speech Lang. Process.
Publications
Naoyuki Kanda, Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Hemin Yang, Zirun Zhu, Min Tang, Canrun Li, Steven Tsai, Zhen Xiao, Yufei Xia, Jinzhu Li, Yanqing Liu, Sheng Zhao, Michael Zeng
Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like.
CoRR
(2024)
Sefik Emre Eskimez, Xiaofei Wang, Manthan Thakker, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Hemin Yang, Zirun Zhu, Min Tang, Xu Tan, Yanqing Liu, Sheng Zhao, Naoyuki Kanda
E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS.
CoRR
(2024)
Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Hemin Yang, Zirun Zhu, Min Tang, Yufei Xia, Jinzhu Li, Sheng Zhao, Jinyu Li, Naoyuki Kanda
An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS.
CoRR
(2024)
Xiaofei Wang, Manthan Thakker, Zhuo Chen, Naoyuki Kanda, Sefik Emre Eskimez, Sanyuan Chen, Min Tang, Shujie Liu, Jinyu Li, Takuya Yoshioka
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer.
IEEE ACM Trans. Audio Speech Lang. Process.
32 (2024)
Haibin Wu, Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Daniel Tompkins, Chung-Hsien Tsai, Canrun Li, Zhen Xiao, Sheng Zhao, Jinyu Li, Naoyuki Kanda
Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech.
CoRR
(2024)
Xiaofei Wang, Manthan Thakker, Zhuo Chen, Naoyuki Kanda, Sefik Emre Eskimez, Sanyuan Chen, Min Tang, Shujie Liu, Jinyu Li, Takuya Yoshioka
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer.
CoRR
(2023)
Harishchandra Dubey, Vishak Gopal, Ross Cutler, Ashkan Aazami, Sergiy Matusevych, Sebastian Braun, Sefik Emre Eskimez, Manthan Thakker, Takuya Yoshioka, Hannes Gamper, Robert Aichner
ICASSP 2022 Deep Noise Suppression Challenge.
CoRR
(2022)
Manthan Thakker, Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang
Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation.
INTERSPEECH
(2022)
Harishchandra Dubey, Vishak Gopal, Ross Cutler, Ashkan Aazami, Sergiy Matusevych, Sebastian Braun, Sefik Emre Eskimez, Manthan Thakker, Takuya Yoshioka, Hannes Gamper, Robert Aichner
ICASSP 2022 Deep Noise Suppression Challenge.
ICASSP
(2022)
Manthan Thakker, Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang
Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation.
CoRR
(2022)