Leveraging State-of-the-art ASR Techniques to Audio Captioning.
Chaitanya Prasad Narisetty, Tomoki Hayashi, Ryunosuke Ishizaki, Shinji Watanabe, Kazuya Takeda
Published in: DCASE (2021)
Keyphrases
- automatic speech recognition
- broadcast news
- multimedia
- speech recognition
- audio video
- audio stream
- speaker identification
- speech signal
- image classification
- noisy environments
- visual information
- audio visual
- signal processing
- music information retrieval
- music score
- audio recordings
- neural network
- audio features
- real time
- acoustic features
- digital video
- audio files
- spontaneous speech
- learning algorithm
- genetic algorithm
- video files