Yuma Shirahata
Publication Activity (10 Years)
Years Active: 2019-2024
Publications (10 Years): 13
Top Topics
Variational Inference
Natural Language Descriptions
Speech Synthesis
Phrase Structure
Top Venues
CoRR
ICASSP
INTERSPEECH
O-COCOSDA
Publications
Reo Shimizu, Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions.
ICASSP (2024)
Masaya Kawamura, Ryuichi Yamamoto, Yuma Shirahata, Takuya Hasumi, Kentaro Tachibana
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning.
CoRR (2024)
Robin Scheibler, Yusuke Fujita, Yuma Shirahata, Tatsuya Komatsu
Universal Score-based Speech Enhancement with High Content Preservation.
CoRR (2024)
Masaya Kawamura, Yuma Shirahata, Ryuichi Yamamoto, Kentaro Tachibana
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform.
ICASSP (2023)
Yuma Shirahata, Ryuichi Yamamoto, Eunwoo Song, Ryo Terashima, Jae-Min Kim, Kentaro Tachibana
Period VITS: Variational Inference with Explicit Pitch Modeling for End-To-End Emotional Speech Synthesis.
ICASSP (2023)
Reo Shimizu, Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions.
CoRR (2023)
Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song, Yuma Shirahata, Hyun-Wook Yoon, Jae-Min Kim, Kentaro Tachibana
Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation.
CoRR (2022)
Masaya Kawamura, Yuma Shirahata, Ryuichi Yamamoto, Kentaro Tachibana
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform.
CoRR (2022)
Yuma Shirahata, Ryuichi Yamamoto, Eunwoo Song, Ryo Terashima, Jae-Min Kim, Kentaro Tachibana
Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis.
CoRR (2022)
Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song, Yuma Shirahata, Hyun-Wook Yoon, Jae-Min Kim, Kentaro Tachibana
Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation.
INTERSPEECH (2022)
Michiko Watanabe, Yuma Shirahata, Ralph Rose, Kikuo Maekawa
How Do Speakers Pause and Hesitate in English and Japanese? - A Comparison Using Parallel Corpora of English and Japanese Presentation Speeches -.
O-COCOSDA (2021)
Yuma Shirahata, Daisuke Saito, Nobuaki Minematsu
Discriminative Method to Extract Coarse Prosodic Structure and its Application for Statistical Phrase/Accent Command Estimation.
INTERSPEECH (2020)
Yuma Shirahata, Daisuke Saito, Nobuaki Minematsu
Generative Modeling of F0 Contours Leveraged by Phrase Structure and Its Application to Statistical Focus Control.
SSW (2019)