WhisperX: Time-Accurate Speech Transcription of Long-Form Audio.
Max Bain
Jaesung Huh
Tengda Han
Andrew Zisserman
Published in:
INTERSPEECH (2023)
Keyphrases
multimedia
visual information
machine learning
computationally efficient
multiscale
high accuracy
data mining
high quality
signal processing
audio visual
cepstral features