Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers.

Yuan Gong, Sameer Khurana, Leonid Karlinsky, James R. Glass
Published in: INTERSPEECH (2023)