Jeong Hun Yeo
Publication Activity (10 Years)
Years Active: 2022-2024
Publications (10 Years): 15
Top Topics
Fusing Multiple
Uni Modal
Visual Speech Recognition
Visual Speech
Top Venues
CoRR
ICASSP
IEEE Trans. Multim.
AAAI
Publications
Minsu Kim, Jeong Hun Yeo, Jeongsoo Choi, Se Jin Park, Yong Man Ro
Multilingual Visual Speech Recognition with a Single Model by Learning with Discrete Visual Speech Units.
CoRR (2024)
Jeong Hun Yeo, Seunghee Han, Minsu Kim, Yong Man Ro
Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing.
CoRR (2024)
Minsu Kim, Jeongsoo Choi, Soumi Maiti, Jeong Hun Yeo, Shinji Watanabe, Yong Man Ro
Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-Training and Multi-Modal Tokens.
ICASSP (2024)
Se Jin Park, Chae Won Kim, Hyeongseop Rha, Minsu Kim, Joanna Hong, Jeong Hun Yeo, Yong Man Ro
Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation.
CoRR (2024)
Jeong Hun Yeo, Minsu Kim, Jeongsoo Choi, Dae Hoe Kim, Yong Man Ro
AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model.
IEEE Trans. Multim. 26 (2024)
Jeong Hun Yeo, Minsu Kim, Shinji Watanabe, Yong Man Ro
Visual Speech Recognition for Languages with Limited Labeled Data Using Automatic Labels from Whisper.
ICASSP (2024)
Jeong Hun Yeo, Minsu Kim, Yong Man Ro
Multi-Temporal Lip-Audio Memory for Visual Speech Recognition.
ICASSP (2023)
Jeong Hun Yeo, Minsu Kim, Shinji Watanabe, Yong Man Ro
Visual Speech Recognition for Low-resource Languages with Automatic Labels From Whisper Model.
CoRR (2023)
Jeong Hun Yeo, Minsu Kim, Yong Man Ro
Multi-Temporal Lip-Audio Memory for Visual Speech Recognition.
CoRR (2023)
Minsu Kim, Jeong Hun Yeo, Jeongsoo Choi, Yong Man Ro
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge.
ICCV (2023)
Minsu Kim, Jeongsoo Choi, Soumi Maiti, Jeong Hun Yeo, Shinji Watanabe, Yong Man Ro
Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens.
CoRR (2023)
Minsu Kim, Jeong Hun Yeo, Jeongsoo Choi, Yong Man Ro
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge.
CoRR (2023)
Jeong Hun Yeo, Minsu Kim, Jeongsoo Choi, Dae Hoe Kim, Yong Man Ro
AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model.
CoRR (2023)
Minsu Kim, Jeong Hun Yeo, Yong Man Ro
Distinguishing Homophenes Using Multi-Head Visual-Audio Memory for Lip Reading.
AAAI (2022)
Minsu Kim, Jeong Hun Yeo, Yong Man Ro
Distinguishing Homophenes Using Multi-Head Visual-Audio Memory for Lip Reading.
CoRR (2022)