Yuma Shirahata

Publication Activity (10 Years)

Years Active: 2019-2024
Publications (10 Years): 13

Top Topics

Variational Inference

Natural Language Descriptions

Speech Synthesis

Phrase Structure

Publications

Reo Shimizu, Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions. ICASSP (2024)
Masaya Kawamura, Ryuichi Yamamoto, Yuma Shirahata, Takuya Hasumi, Kentaro Tachibana
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning. CoRR (2024)
Robin Scheibler, Yusuke Fujita, Yuma Shirahata, Tatsuya Komatsu
Universal Score-based Speech Enhancement with High Content Preservation. CoRR (2024)
Masaya Kawamura, Yuma Shirahata, Ryuichi Yamamoto, Kentaro Tachibana
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform. ICASSP (2023)
Yuma Shirahata, Ryuichi Yamamoto, Eunwoo Song, Ryo Terashima, Jae-Min Kim, Kentaro Tachibana
Period VITS: Variational Inference with Explicit Pitch Modeling for End-To-End Emotional Speech Synthesis. ICASSP (2023)
Reo Shimizu, Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions. CoRR (2023)
Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song, Yuma Shirahata, Hyun-Wook Yoon, Jae-Min Kim, Kentaro Tachibana
Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation. CoRR (2022)
Masaya Kawamura, Yuma Shirahata, Ryuichi Yamamoto, Kentaro Tachibana
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform. CoRR (2022)
Yuma Shirahata, Ryuichi Yamamoto, Eunwoo Song, Ryo Terashima, Jae-Min Kim, Kentaro Tachibana
Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis. CoRR (2022)
Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song, Yuma Shirahata, Hyun-Wook Yoon, Jae-Min Kim, Kentaro Tachibana
Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation. INTERSPEECH (2022)
Michiko Watanabe, Yuma Shirahata, Ralph Rose, Kikuo Maekawa
How Do Speakers Pause and Hesitate in English and Japanese? A Comparison Using Parallel Corpora of English and Japanese Presentation Speeches. O-COCOSDA (2021)
Yuma Shirahata, Daisuke Saito, Nobuaki Minematsu
Discriminative Method to Extract Coarse Prosodic Structure and its Application for Statistical Phrase/Accent Command Estimation. INTERSPEECH (2020)
Yuma Shirahata, Daisuke Saito, Nobuaki Minematsu
Generative Modeling of F0 Contours Leveraged by Phrase Structure and Its Application to Statistical Focus Control. SSW (2019)