Yui Sudo
Publication Activity (10 Years)
Years Active: 2019-2024
Publications (10 Years): 17
Top Topics
Automatic Speech Recognition
Sound Source
Video Segmentation
Segmentation Errors
Top Venues
CoRR
INTERSPEECH
SII
APSIPA ASC
Publications
Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Yifan Peng, Shinji Watanabe
Contextualized Automatic Speech Recognition with Attention-Based Bias Phrase Boosted Beam Search.
CoRR
(2024)
Yifan Peng, Jinchuan Tian, William Chen, Siddhant Arora, Brian Yan, Yui Sudo, Muhammad Shakeel, Kwanghee Choi, Jiatong Shi, Xuankai Chang, Jee-weon Jung, Shinji Watanabe
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer.
CoRR
(2024)
Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan S. Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-weon Jung, Soumi Maiti, Shinji Watanabe
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data.
CoRR
(2023)
Ryu Takeda, Yui Sudo, Kazunori Komatani
Flexible Evidence Model to Reduce Uncertainty Mismatch Between Speech Enhancement and ASR Based on Encoder-Decoder Architecture.
APSIPA ASC
(2023)
Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe
DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models.
CoRR
(2023)
Yui Sudo, Kazuya Hata, Kazuhiro Nakadai
Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation.
CoRR
(2023)
Yui Sudo, Masayuki Takigahira, Hideo Tsuru, Kazuhiro Nakadai, Hirofumi Nakajima
Online Adaptation of Fourier Series Based Acoustic Transfer Function Model to Improve Sound Source Localization and Separation.
RO-MAN
(2023)
Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan S. Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-Weon Jung, Soumi Maiti, Shinji Watanabe
Reproducing Whisper-Style Training Using An Open-Source Toolkit And Publicly Available Data.
ASRU
(2023)
Yui Sudo, Muhammad Shakeel, Brian Yan, Jiatong Shi, Shinji Watanabe
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders.
CoRR
(2022)
Ryu Takeda, Yui Sudo, Kazuhiro Nakadai, Kazunori Komatani
Empirical Sampling from Latent Utterance-wise Evidence Model for Missing Data ASR based on Neural Encoder-Decoder Model.
INTERSPEECH
(2022)
Yui Sudo, Muhammad Shakeel, Kazuhiro Nakadai, Jiatong Shi, Shinji Watanabe
Streaming Automatic Speech Recognition with Re-blocking Processing Based on Integrated Voice Activity Detection.
INTERSPEECH
(2022)
Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai
Multi-channel Environmental Sound Segmentation utilizing Sound Source Localization and Separation U-Net.
SII
(2021)
Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai
Multichannel environmental sound segmentation.
Appl. Intell.
51 (11) (2021)
Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai
Multi-channel Environmental sound segmentation.
SII
(2020)
Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai
Sound event aware environmental sound segmentation with Mask U-Net.
Adv. Robotics
34 (20) (2020)
Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai
Improvement of DOA Estimation by using Quaternion Output in Sound Event Localization and Detection.
DCASE
(2019)
Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai
Environmental sound segmentation utilizing Mask U-Net.
IROS
(2019)