Yui Sudo
Publication Activity (10 Years)
Years Active: 2019-2024
Publications (10 Years): 29
Top Topics
Speech Recognition
Similarity Estimation
Joint Optimization
Beam Search
Top Venues
CoRR
INTERSPEECH
SII
ACL (1)
Publications
Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Yifan Peng, Shinji Watanabe
Contextualized Automatic Speech Recognition with Attention-Based Bias Phrase Boosted Beam Search.
CoRR (2024)

Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Yifan Peng, Shinji Watanabe
Contextualized Automatic Speech Recognition With Attention-Based Bias Phrase Boosted Beam Search.
ICASSP (2024)

Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification.
CoRR (2024)

Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Brian Yan, Jiatong Shi, Yifan Peng, Shinji Watanabe
4D ASR: Joint Beam Search Integrating CTC, Attention, Transducer, and Mask Predict Decoders.
CoRR (2024)

Takahiro Osaki, Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai
Improving Noise Robustness of Automatic Speech Recognition Based on a Parallel Adapter Model with Near-Identity Initialization.
IEA/AIE (2024)

Yifan Peng, Jinchuan Tian, William Chen, Siddhant Arora, Brian Yan, Yui Sudo, Muhammad Shakeel, Kwanghee Choi, Jiatong Shi, Xuankai Chang, Jee-weon Jung, Shinji Watanabe
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer.
CoRR (2024)

Yui Sudo, Yosuke Fukumoto, Muhammad Shakeel, Yifan Peng, Shinji Watanabe
Contextualized Automatic Speech Recognition with Dynamic Vocabulary.
CoRR (2024)

Muhammad Shakeel, Yui Sudo, Yifan Peng, Shinji Watanabe
Joint Optimization of Streaming and Non-Streaming Automatic Speech Recognition with Multi-Decoder and Knowledge Distillation.
CoRR (2024)

Muhammad Shakeel, Yui Sudo, Yifan Peng, Shinji Watanabe
Contextualized End-to-end Automatic Speech Recognition with Intermediate Biasing Loss.
CoRR (2024)

Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification.
ACL (1) (2024)

Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan S. Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-weon Jung, Soumi Maiti, Shinji Watanabe
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data.
CoRR (2023)

Ryu Takeda, Yui Sudo, Kazunori Komatani
Flexible Evidence Model to Reduce Uncertainty Mismatch Between Speech Enhancement and ASR Based on Encoder-Decoder Architecture.
APSIPA ASC (2023)

Yui Sudo, Muhammad Shakeel, Yifan Peng, Shinji Watanabe
Time-synchronous one-pass Beam Search for Parallel Online and Offline Transducers with Dynamic Block Training.
INTERSPEECH (2023)

Yui Sudo, Kazuya Hata, Kazuhiro Nakadai
Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation.
INTERSPEECH (2023)

Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe
DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models.
CoRR (2023)

Yui Sudo, Muhammad Shakeel, Brian Yan, Jiatong Shi, Shinji Watanabe
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders.
INTERSPEECH (2023)

Yui Sudo, Kazuya Hata, Kazuhiro Nakadai
Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation.
CoRR (2023)

Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe
DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models.
INTERSPEECH (2023)

Yui Sudo, Masayuki Takigahira, Hideo Tsuru, Kazuhiro Nakadai, Hirofumi Nakajima
Online Adaptation of Fourier Series Based Acoustic Transfer Function Model to Improve Sound Source Localization and Separation.
RO-MAN (2023)

Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan S. Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-Weon Jung, Soumi Maiti, Shinji Watanabe
Reproducing Whisper-Style Training Using An Open-Source Toolkit And Publicly Available Data.
ASRU (2023)

Yui Sudo, Muhammad Shakeel, Brian Yan, Jiatong Shi, Shinji Watanabe
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders.
CoRR (2022)

Ryu Takeda, Yui Sudo, Kazuhiro Nakadai, Kazunori Komatani
Empirical Sampling from Latent Utterance-wise Evidence Model for Missing Data ASR based on Neural Encoder-Decoder Model.
INTERSPEECH (2022)

Yui Sudo, Muhammad Shakeel, Kazuhiro Nakadai, Jiatong Shi, Shinji Watanabe
Streaming Automatic Speech Recognition with Re-blocking Processing Based on Integrated Voice Activity Detection.
INTERSPEECH (2022)

Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai
Multi-channel Environmental Sound Segmentation utilizing Sound Source Localization and Separation U-Net.
SII (2021)

Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai
Multichannel environmental sound segmentation.
Appl. Intell. 51 (11) (2021)

Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai
Multi-channel Environmental sound segmentation.
SII (2020)

Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai
Sound event aware environmental sound segmentation with Mask U-Net.
Adv. Robotics 34 (20) (2020)

Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai
Improvement of DOA Estimation by using Quaternion Output in Sound Event Localization and Detection.
DCASE (2019)

Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai
Environmental sound segmentation utilizing Mask U-Net.
IROS (2019)