XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception.
HyoJung Han
Mohamed Anwar
Juan Pino
Wei-Ning Hsu
Marine Carpuat
Bowen Shi
Changhan Wang
Published in:
ACL (1) (2024)
Keyphrases
cross lingual
noisy environments
speaker identification
visual speech
reinforcement learning
active learning
supervised learning
broadcast news
visual speech recognition
learning algorithm
training set
information extraction
information retrieval systems
audio visual
speech signal
audio signals