Ryo Masumura
Publication Activity (10 Years)
Years Active: 2010-2023
Publications (10 Years): 132
Top Topics
Customer Satisfaction
Neural Network
Language Model
Outcome Prediction
Top Venues
INTERSPEECH
CoRR
ICASSP
APSIPA
Publications
Nobukatsu Hojo
,
Saki Mizuno
,
Satoshi Kobashikawa
,
Ryo Masumura
Modeling Lead-Lag Structure in Facial Expression Synchrony for Social-Psychological Outcome Prediction from Negotiation Interaction.
ICASSP Workshops
(2023)
Satoshi Suzuki
,
Taiga Yamane
,
Naoki Makishima
,
Keita Suzuki
,
Atsushi Ando
,
Ryo Masumura
OnDA-DETR: Online Domain Adaptation for Detection Transformers with Self-Training Framework.
ICIP
(2023)
Takafumi Moriya
,
Takanori Ashihara
,
Hiroshi Sato
,
Kohei Matsuura
,
Tomohiro Tanaka
,
Ryo Masumura
Improving Scheduled Sampling for Neural Transducer-Based ASR.
ICASSP
(2023)
Shota Orihashi
,
Yoshihiro Yamazaki
,
Mihiro Uchida
,
Akihiko Takashima
,
Ryo Masumura
Distilling Knowledge of Bidirectional Language Model for Scene Text Recognition.
ICIP
(2023)
Ryo Masumura
,
Naoki Makishima
,
Taiga Yamane
,
Yoshihiko Yamazaki
,
Saki Mizuno
,
Mana Ihori
,
Mihiro Uchida
,
Keita Suzuki
,
Hiroshi Sato
,
Tomohiro Tanaka
,
Akihiko Takashima
,
Satoshi Suzuki
,
Takafumi Moriya
,
Nobukatsu Hojo
,
Atsushi Ando
End-to-End Joint Target and Non-Target Speakers ASR.
CoRR
(2023)
Kohei Matsuura
,
Takanori Ashihara
,
Takafumi Moriya
,
Tomohiro Tanaka
,
Atsunori Ogawa
,
Marc Delcroix
,
Ryo Masumura
Leveraging Large Text Corpora for End-to-End Speech Summarization.
CoRR
(2023)
Ryo Masumura
,
Naoki Makishima
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Shota Orihashi
Text-to-Text Pre-Training with Paraphrasing for Improving Transformer-Based Image Captioning.
EUSIPCO
(2023)
Kohei Matsuura
,
Takanori Ashihara
,
Takafumi Moriya
,
Tomohiro Tanaka
,
Atsunori Ogawa
,
Marc Delcroix
,
Ryo Masumura
Leveraging Large Text Corpora For End-To-End Speech Summarization.
ICASSP
(2023)
Mihiro Uchida
,
Shota Orihashi
,
Akihiko Takashima
,
Yoshihiro Yamazaki
,
Ryo Masumura
Open-Set Recognition for Facial-Expression Recognition.
ICIP
(2023)
Keita Suzuki
,
Satoshi Suzuki
,
Ryo Masumura
,
Atsushi Ando
,
Naoki Makishima
Multi-region CNN-Transformer for Micro-gesture Recognition in Face and Upper Body.
MMAsia
(2023)
Hiroshi Sato
,
Ryo Masumura
,
Tsubasa Ochiai
,
Marc Delcroix
,
Takafumi Moriya
,
Takanori Ashihara
,
Kentaro Shinayama
,
Saki Mizuno
,
Mana Ihori
,
Tomohiro Tanaka
,
Nobukatsu Hojo
Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss.
CoRR
(2023)
Satoshi Suzuki
,
Shin'ya Yamaguchi
,
Shoichiro Takeda
,
Sekitoshi Kanai
,
Naoki Makishima
,
Atsushi Ando
,
Ryo Masumura
Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff.
CoRR
(2023)
Tomohiro Tanaka
,
Ryo Masumura
,
Mana Ihori
,
Hiroshi Sato
,
Taiga Yamane
,
Takanori Ashihara
,
Kohei Matsuura
,
Takafumi Moriya
Leveraging Language Embeddings for Cross-Lingual Self-Supervised Speech Representation Learning.
ICASSP
(2023)
Satoshi Suzuki
,
Shin'ya Yamaguchi
,
Shoichiro Takeda
,
Sekitoshi Kanai
,
Naoki Makishima
,
Atsushi Ando
,
Ryo Masumura
Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff.
ICCV
(2023)
Mana Ihori
,
Hiroshi Sato
,
Tomohiro Tanaka
,
Ryo Masumura
Retrieval, Masking, and Generation: Feedback Comment Generation using Masked Comment Examples.
INLG (Generation Challenges)
(2023)
Saki Mizuno
,
Nobukatsu Hojo
,
Satoshi Kobashikawa
,
Ryo Masumura
Next-Speaker Prediction Based on Non-Verbal Information in Multi-Party Video Conversation.
ICASSP
(2023)
Atsushi Ando
,
Ryo Masumura
,
Akihiko Takashima
,
Satoshi Suzuki
,
Naoki Makishima
,
Keita Suzuki
,
Takafumi Moriya
,
Takanori Ashihara
,
Hiroshi Sato
On the Use of Modality-Specific Large-Scale Pre-Trained Encoders for Multimodal Sentiment Analysis.
SLT
(2022)
Satoshi Suzuki
,
Shoichiro Takeda
,
Naoki Makishima
,
Atsushi Ando
,
Ryo Masumura
,
Hayaru Shouno
Knowledge Transferred Fine-Tuning: Convolutional Neural Network Is Born Again With Anti-Aliasing Even in Data-Limited Situations.
IEEE Access
10 (2022)
Hiroshi Sato
,
Tsubasa Ochiai
,
Marc Delcroix
,
Keisuke Kinoshita
,
Takafumi Moriya
,
Naoki Makishima
,
Mana Ihori
,
Tomohiro Tanaka
,
Ryo Masumura
Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations.
CoRR
(2022)
Takafumi Moriya
,
Takanori Ashihara
,
Atsushi Ando
,
Hiroshi Sato
,
Tomohiro Tanaka
,
Kohei Matsuura
,
Ryo Masumura
,
Marc Delcroix
,
Takahiro Shinozaki
Hybrid RNN-T/Attention-Based Streaming ASR with Triggered Chunkwise Attention and Dual Internal Language Model Integration.
ICASSP
(2022)
Shota Orihashi
,
Yoshihiro Yamazaki
,
Mihiro Uchida
,
Akihiko Takashima
,
Ryo Masumura
Fully Shareable Scene Text Recognition Modeling for Horizontal and Vertical Writing.
ICIP
(2022)
Wataru Nakata
,
Tomoki Koriyama
,
Shinnosuke Takamichi
,
Yuki Saito
,
Yusuke Ijima
,
Ryo Masumura
,
Hiroshi Saruwatari
Predicting VQVAE-based Character Acting Style from Quotation-Annotated Text for Audiobook Speech Synthesis.
INTERSPEECH
(2022)
Naoki Makishima
,
Satoshi Suzuki
,
Atsushi Ando
,
Ryo Masumura
Speaker consistency loss and step-wise optimization for semi-supervised joint training of TTS and ASR using unpaired text data.
INTERSPEECH
(2022)
Atsushi Ando
,
Ryo Masumura
,
Akihiko Takashima
,
Satoshi Suzuki
,
Naoki Makishima
,
Keita Suzuki
,
Takafumi Moriya
,
Takanori Ashihara
,
Hiroshi Sato
On the Use of Modality-Specific Large-Scale Pre-Trained Encoders for Multimodal Sentiment Analysis.
CoRR
(2022)
Mana Ihori
,
Hiroshi Sato
,
Tomohiro Tanaka
,
Ryo Masumura
Multi-Perspective Document Revision.
COLING
(2022)
Yoshihiro Yamazaki
,
Shota Orihashi
,
Ryo Masumura
,
Mihiro Uchida
,
Akihiko Takashima
Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations.
CoRR
(2022)
Atsushi Ando
,
Yumiko Murata
,
Ryo Masumura
,
Satoshi Suzuki
,
Naoki Makishima
,
Takafumi Moriya
,
Takanori Ashihara
,
Hiroshi Sato
Customer Satisfaction Estimation Using Unsupervised Representation Learning with Multi-Format Prediction Loss.
ICASSP
(2022)
Fumio Nihei
,
Ryo Ishii
,
Yukiko I. Nakano
,
Kyosuke Nishida
,
Ryo Masumura
,
Atsushi Fukayama
,
Takao Nakamura
Dialogue Acts Aided Important Utterance Detection Based on Multiparty and Multimodal Information.
INTERSPEECH
(2022)
Naoki Makishima
,
Satoshi Suzuki
,
Atsushi Ando
,
Ryo Masumura
Speaker consistency loss and step-wise optimization for semi-supervised joint training of TTS and ASR using unpaired text data.
CoRR
(2022)
Akihiko Takashima
,
Ryo Masumura
,
Atsushi Ando
,
Yoshihiro Yamazaki
,
Mihiro Uchida
,
Shota Orihashi
Interactive Co-Learning with Cross-Modal Transformer for Audio-Visual Emotion Recognition.
INTERSPEECH
(2022)
Ryo Masumura
,
Yoshihiro Yamazaki
,
Saki Mizuno
,
Naoki Makishima
,
Mana Ihori
,
Mihiro Uchida
,
Hiroshi Sato
,
Tomohiro Tanaka
,
Akihiko Takashima
,
Satoshi Suzuki
,
Shota Orihashi
,
Takafumi Moriya
,
Nobukatsu Hojo
,
Atsushi Ando
End-to-End Joint Modeling of Conversation History-Dependent and Independent ASR Systems with Multi-History Training.
INTERSPEECH
(2022)
Nobukatsu Hojo
,
Satoshi Kobashikawa
,
Saki Mizuno
,
Ryo Masumura
Multimodal Negotiation Corpus with Various Subjective Assessments for Social-Psychological Outcome Prediction from Non-Verbal Cues.
LREC
(2022)
Hiroshi Sato
,
Tsubasa Ochiai
,
Marc Delcroix
,
Keisuke Kinoshita
,
Takafumi Moriya
,
Naoki Makishima
,
Mana Ihori
,
Tomohiro Tanaka
,
Ryo Masumura
Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations.
INTERSPEECH
(2022)
Tomohiro Tanaka
,
Ryo Masumura
,
Hiroshi Sato
,
Mana Ihori
,
Kohei Matsuura
,
Takanori Ashihara
,
Takafumi Moriya
Domain Adversarial Self-Supervised Speech Representation Learning for Improving Unknown Domain Downstream Tasks.
INTERSPEECH
(2022)
Naoki Makishima
,
Mana Ihori
,
Tomohiro Tanaka
,
Akihiko Takashima
,
Shota Orihashi
,
Ryo Masumura
Enrollment-Less Training for Personalized Voice Activity Detection.
Interspeech
(2021)
Shota Orihashi
,
Yoshihiro Yamazaki
,
Naoki Makishima
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Ryo Masumura
Hierarchical Knowledge Distillation for Dialogue Sequence Labeling.
ASRU
(2021)
Mana Ihori
,
Naoki Makishima
,
Tomohiro Tanaka
,
Akihiko Takashima
,
Shota Orihashi
,
Ryo Masumura
MAPGN: MAsked Pointer-Generator Network for sequence-to-sequence pre-training.
CoRR
(2021)
Naoki Makishima
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Shota Orihashi
,
Ryo Masumura
Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss.
ICASSP
(2021)
Takafumi Moriya
,
Tomohiro Tanaka
,
Takanori Ashihara
,
Tsubasa Ochiai
,
Hiroshi Sato
,
Atsushi Ando
,
Ryo Masumura
,
Marc Delcroix
,
Taichi Asami
Streaming End-to-End Speech Recognition for Hybrid RNN-T/Attention Architecture.
Interspeech
(2021)
Tomohiro Tanaka
,
Ryo Masumura
,
Mana Ihori
,
Akihiko Takashima
,
Takafumi Moriya
,
Takanori Ashihara
,
Shota Orihashi
,
Naoki Makishima
Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition.
Interspeech
(2021)
Ryo Masumura
,
Daiki Okamura
,
Naoki Makishima
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Shota Orihashi
Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation.
CoRR
(2021)
Takafumi Moriya
,
Takanori Ashihara
,
Tomohiro Tanaka
,
Tsubasa Ochiai
,
Hiroshi Sato
,
Atsushi Ando
,
Yusuke Ijima
,
Ryo Masumura
,
Yusuke Shinohara
Simpleflat: A Simple Whole-Network Pre-Training Approach for RNN Transducer-Based End-to-End Speech Recognition.
ICASSP
(2021)
Mana Ihori
,
Naoki Makishima
,
Tomohiro Tanaka
,
Akihiko Takashima
,
Shota Orihashi
,
Ryo Masumura
Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks Using Switching Tokens.
Interspeech
(2021)
Shota Orihashi
,
Yoshihiro Yamazaki
,
Naoki Makishima
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Ryo Masumura
Hierarchical Knowledge Distillation for Dialogue Sequence Labeling.
CoRR
(2021)
Ryo Masumura
,
Naoki Makishima
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Shota Orihashi
Large-Context Conversational Representation Learning: Self-Supervised Learning For Conversational Documents.
SLT
(2021)
Ryo Masumura
,
Naoki Makishima
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Shota Orihashi
Large-Context Conversational Representation Learning: Self-Supervised Learning for Conversational Documents.
CoRR
(2021)
Shota Orihashi
,
Yoshihiro Yamazaki
,
Naoki Makishima
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Ryo Masumura
Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages.
MMAsia
(2021)
Shota Orihashi
,
Yoshihiro Yamazaki
,
Naoki Makishima
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Ryo Masumura
Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages.
CoRR
(2021)
Mana Ihori
,
Naoki Makishima
,
Tomohiro Tanaka
,
Akihiko Takashima
,
Shota Orihashi
,
Ryo Masumura
MAPGN: Masked Pointer-Generator Network for Sequence-to-Sequence Pre-Training.
ICASSP
(2021)
Naoki Makishima
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Shota Orihashi
,
Ryo Masumura
Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss.
CoRR
(2021)
Ryo Masumura
,
Naoki Makishima
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Shota Orihashi
Hierarchical Transformer-Based Large-Context End-To-End ASR with Large-Context Knowledge Distillation.
ICASSP
(2021)
Tomohiro Tanaka
,
Ryo Masumura
,
Mana Ihori
,
Akihiko Takashima
,
Takafumi Moriya
,
Takanori Ashihara
,
Shota Orihashi
,
Naoki Makishima
Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition.
CoRR
(2021)
Mana Ihori
,
Naoki Makishima
,
Tomohiro Tanaka
,
Akihiko Takashima
,
Shota Orihashi
,
Ryo Masumura
Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks using Switching Tokens.
CoRR
(2021)
Tomohiro Tanaka
,
Ryo Masumura
,
Takanobu Oba
Neural candidate-aware language models for speech recognition.
Comput. Speech Lang.
66 (2021)
Ryo Masumura
,
Daiki Okamura
,
Naoki Makishima
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Shota Orihashi
Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation.
Interspeech
(2021)
Atsushi Ando
,
Ryo Masumura
,
Hiroshi Sato
,
Takafumi Moriya
,
Takanori Ashihara
,
Yusuke Ijima
,
Tomoki Toda
Speech Emotion Recognition Based on Listener Adaptive Models.
ICASSP
(2021)
Tomohiro Tanaka
,
Ryo Masumura
,
Mana Ihori
,
Akihiko Takashima
,
Shota Orihashi
,
Naoki Makishima
End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning.
Interspeech
(2021)
Ryo Masumura
,
Naoki Makishima
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Shota Orihashi
Hierarchical Transformer-based Large-Context End-to-end ASR with Large-Context Knowledge Distillation.
CoRR
(2021)
Tomohiro Tanaka
,
Ryo Masumura
,
Mana Ihori
,
Akihiko Takashima
,
Shota Orihashi
,
Naoki Makishima
End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning.
CoRR
(2021)
Ryo Masumura
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Takanori Ashihara
End-to-End Automatic Speech Recognition with Deep Mutual Learning.
CoRR
(2021)
Ryo Masumura
,
Taichi Asami
,
Takanobu Oba
,
Sumitaka Sakauchi
Hierarchical Latent Words Language Models for Automatic Speech Recognition.
J. Inf. Process.
29 (2021)
Naoki Makishima
,
Mana Ihori
,
Tomohiro Tanaka
,
Akihiko Takashima
,
Shota Orihashi
,
Ryo Masumura
Enrollment-less training for personalized voice activity detection.
CoRR
(2021)
Akihiko Takashima
,
Naoki Makishima
,
Mana Ihori
,
Tomohiro Tanaka
,
Shota Orihashi
,
Ryo Masumura
Unsupervised Domain Adversarial Training in Angular Space for Facial Expression Recognition.
APSIPA
(2020)
Mana Ihori
,
Akihiko Takashima
,
Ryo Masumura
Large-Context Pointer-Generator Networks for Spoken-to-Written Style Conversion.
ICASSP
(2020)
Takashi Kodama
,
Ryuichiro Higashinaka
,
Koh Mitsuda
,
Ryo Masumura
,
Yushi Aono
,
Ryuta Nakamura
,
Noritake Adachi
,
Hidetoshi Kawabata
Generating Responses that Reflect Meta Information in User-Generated Question Answer Pairs.
LREC
(2020)
Mana Ihori
,
Ryo Masumura
,
Naoki Makishima
,
Tomohiro Tanaka
,
Akihiko Takashima
,
Shota Orihashi
Memory Attentive Fusion: External Language Model Integration for Transformer-based Sequence-to-Sequence Model.
CoRR
(2020)
Yuki Yamashita
,
Tomoki Koriyama
,
Yuki Saito
,
Shinnosuke Takamichi
,
Yusuke Ijima
,
Ryo Masumura
,
Hiroshi Saruwatari
Investigating Effective Additional Contextual Factors in DNN-Based Spontaneous Speech Synthesis.
INTERSPEECH
(2020)
Takafumi Moriya
,
Hiroshi Sato
,
Tomohiro Tanaka
,
Takanori Ashihara
,
Ryo Masumura
,
Yusuke Shinohara
Distilling Attention Weights for CTC-Based ASR Systems.
ICASSP
(2020)
Mana Ihori
,
Akihiko Takashima
,
Ryo Masumura
Parallel Corpus for Japanese Spoken-to-Written Style Conversion.
LREC
(2020)
Ryo Masumura
,
Mana Ihori
,
Akihiko Takashima
,
Takafumi Moriya
,
Atsushi Ando
,
Yusuke Shinohara
Sequence-Level Consistency Training for Semi-Supervised End-to-End Automatic Speech Recognition.
ICASSP
(2020)
Yuma Koizumi
,
Ryo Masumura
,
Kyosuke Nishida
,
Masahiro Yasuda
,
Shoichiro Saito
A Transformer-based Audio Captioning Model with Keyword Estimation.
CoRR
(2020)
Ryo Imaizumi
,
Ryo Masumura
,
Sayaka Shiota
,
Hitoshi Kiya
Sequence-To-One Neural Networks for Japanese Dialect Speech Classification.
GCCE
(2020)
Yuki Yamashita
,
Tomoki Koriyama
,
Yuki Saito
,
Shinnosuke Takamichi
,
Yusuke Ijima
,
Ryo Masumura
,
Hiroshi Saruwatari
DNN-based Speech Synthesis Using Abundant Tags of Spontaneous Speech Corpus.
LREC
(2020)
Atsushi Ando
,
Ryo Masumura
,
Hosana Kamiyama
,
Satoshi Kobashikawa
,
Yushi Aono
,
Tomoki Toda
Customer Satisfaction Estimation in Contact Center Calls Based on a Hierarchical Multi-Task Model.
IEEE ACM Trans. Audio Speech Lang. Process.
28 (2020)
Shota Orihashi
,
Mana Ihori
,
Tomohiro Tanaka
,
Ryo Masumura
Unsupervised Domain Adaptation for Dialogue Sequence Labeling Based on Hierarchical Adversarial Training.
INTERSPEECH
(2020)
Takafumi Moriya
,
Tsubasa Ochiai
,
Shigeki Karita
,
Hiroshi Sato
,
Tomohiro Tanaka
,
Takanori Ashihara
,
Ryo Masumura
,
Yusuke Shinohara
,
Marc Delcroix
Self-Distillation for Improving CTC-Transformer-Based ASR Systems.
INTERSPEECH
(2020)
Mana Ihori
,
Ryo Masumura
,
Naoki Makishima
,
Tomohiro Tanaka
,
Akihiko Takashima
,
Shota Orihashi
Memory Attentive Fusion: External Language Model Integration for Transformer-based Sequence-to-Sequence Model.
INLG
(2020)
Yuma Koizumi
,
Ryo Masumura
,
Kyosuke Nishida
,
Masahiro Yasuda
,
Shoichiro Saito
A Transformer-Based Audio Captioning Model with Keyword Estimation.
INTERSPEECH
(2020)
Ryo Imaizumi
,
Ryo Masumura
,
Sayaka Shiota
,
Hitoshi Kiya
Dialect-Aware Modeling for End-to-End Japanese Dialect Speech Recognition.
APSIPA
(2020)
Ryo Masumura
,
Naoki Makishima
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Shota Orihashi
Phoneme-to-Grapheme Conversion Based Large-Scale Pre-Training for End-to-End Automatic Speech Recognition.
INTERSPEECH
(2020)
Ryo Masumura
,
Mana Ihori
,
Akihiko Takashima
,
Tomohiro Tanaka
,
Takanori Ashihara
End-to-End Automatic Speech Recognition with Deep Mutual Learning.
APSIPA
(2020)
Ryo Masumura
,
Kiyoaki Matsui
,
Yuma Koizumi
,
Takaaki Fukutomi
,
Takanobu Oba
,
Yushi Aono
Context-Aware Neural Voice Activity Detection Using Auxiliary Networks for Phoneme Recognition, Speech Enhancement and Acoustic Scene Classification.
EUSIPCO
(2019)
Atsushi Ando
,
Ryo Masumura
,
Hosana Kamiyama
,
Satoshi Kobashikawa
,
Yushi Aono
Speech Emotion Recognition Based on Multi-Label Emotion Existence Model.
INTERSPEECH
(2019)
Ryo Masumura
,
Taichi Asami
,
Takanobu Oba
,
Sumitaka Sakauchi
,
Akinori Ito
Latent Words Recurrent Neural Network Language Models for Automatic Speech Recognition.
IEICE Trans. Inf. Syst.
(12) (2019)
Ryo Masumura
,
Yusuke Ijima
,
Satoshi Kobashikawa
,
Takanobu Oba
,
Yushi Aono
Can We Simulate Generative Process of Acoustic Modeling Data? Towards Data Restoration for Acoustic Modeling.
APSIPA
(2019)
Hosana Kamiyama
,
Atsushi Ando
,
Ryo Masumura
,
Satoshi Kobashikawa
,
Yushi Aono
Likability Estimation of Call-center Agents by Suppressing Annotator Variability.
APSIPA
(2019)
Ryo Masumura
,
Mana Ihori
,
Tomohiro Tanaka
,
Atsushi Ando
,
Ryo Ishii
,
Takanobu Oba
,
Ryuichiro Higashinaka
Improving Speech-Based End-of-Turn Detection Via Cross-Modal Representation Learning with Punctuated Text Data.
ASRU
(2019)
Takafumi Moriya
,
Jian Wang
,
Tomohiro Tanaka
,
Ryo Masumura
,
Yusuke Shinohara
,
Yoshikazu Yamaguchi
,
Yushi Aono
Joint Maximization Decoder with Neural Converters for Fully Neural Network-Based Japanese Speech Recognition.
INTERSPEECH
(2019)
Ryo Masumura
,
Tomohiro Tanaka
,
Takafumi Moriya
,
Yusuke Shinohara
,
Takanobu Oba
,
Yushi Aono
Large Context End-to-end Automatic Speech Recognition via Extension of Hierarchical Recurrent Encoder-decoder Models.
ICASSP
(2019)
Ryo Masumura
,
Tomohiro Tanaka
,
Atsushi Ando
,
Hosana Kamiyama
,
Takanobu Oba
,
Satoshi Kobashikawa
,
Yushi Aono
Improving Conversation-Context Language Models with Multiple Spoken Language Understanding Models.
INTERSPEECH
(2019)
Hosana Kamiyama
,
Atsushi Ando
,
Ryo Masumura
,
Satoshi Kobashikawa
,
Yushi Aono
Urgent Voicemail Detection Focused on Long-term Temporal Variation.
APSIPA
(2019)
Satoshi Kobashikawa
,
Atushi Odakura
,
Takao Nakamura
,
Takeshi Mori
,
Kimitaka Endo
,
Takafumi Moriya
,
Ryo Masumura
,
Yushi Aono
,
Nobuaki Minematsu
Does Speaking Training Application with Speech Recognition Motivate Junior High School Students in Actual Classroom? - A Case Study.
SLaTE
(2019)
Ryo Masumura
,
Hiroshi Sato
,
Tomohiro Tanaka
,
Takafumi Moriya
,
Yusuke Ijima
,
Takanobu Oba
End-to-End Automatic Speech Recognition with a Reconstruction Criterion Using Speech-to-Text and Text-to-Speech Encoder-Decoders.
INTERSPEECH
(2019)
Tomohiro Tanaka
,
Ryo Masumura
,
Takafumi Moriya
,
Takanobu Oba
,
Yushi Aono
A Joint End-to-End and DNN-HMM Hybrid Automatic Speech Recognition System with Transferring Sharable Knowledge.
INTERSPEECH
(2019)
Ryo Masumura
,
Taichi Asami
,
Takanobu Oba
,
Hirokazu Masataki
,
Sumitaka Sakauchi
Viterbi Approximation of Latent Words Language Models for Automatic Speech Recognition.
J. Inf. Process.
27 (2019)
Tomohiro Tanaka
,
Ryo Masumura
,
Takafumi Moriya
,
Takanobu Oba
,
Yushi Aono
Disfluency Detection Based on Speech-Aware Token-by-Token Sequence Labeling with BLSTM-CRFs and Attention Mechanisms.
APSIPA
(2019)
Hiroshi Sato
,
Takafumi Moriya
,
Yusuke Shinohara
,
Ryo Masumura
,
Takaaki Fukutomi
,
Kiyoaki Matsui
,
Takanori Ashihara
,
Yoshikazu Yamaguchi
,
Yushi Aono
Revisiting Dynamic Adjustment of Language Model Scaling Factor for Automatic Speech Recognition.
APSIPA
(2019)
Taichi Asami
,
Ryo Masumura
,
Yushi Aono
,
Koichi Shinoda
Recurrent out-of-vocabulary word detection based on distribution of features.
Comput. Speech Lang.
58 (2019)
Ryo Masumura
,
Mana Ihori
,
Tomohiro Tanaka
,
Itsumi Saito
,
Kyosuke Nishida
,
Takanobu Oba
Generalized Large-Context Language Models Based on Forward-Backward Hierarchical Recurrent Encoder-Decoder Models.
ASRU
(2019)
Ryo Masumura
,
Setsuo Yamada
,
Tomohiro Tanaka
,
Atsushi Ando
,
Hosana Kamiyama
,
Yushi Aono
Online Call Scene Segmentation of Contact Center Dialogues based on Role Aware Hierarchical LSTM-RNNs.
APSIPA
(2018)