Efficient speech detection in environmental audio using acoustic recognition and knowledge distillation.
Drew Priebe, Burooj Ghani, Dan Stowell
Published in: CoRR (2023)
Keyphrases
- environmental sounds
- acoustic features
- recognition rate
- visual speech
- speech sounds
- domain knowledge
- audio visual
- speech recognition
- recognition engine
- speech recognition systems
- automatic transcription
- object recognition
- object detection
- audio stream
- hidden Markov models
- mel frequency cepstral coefficients
- speaker independent
- prosodic features
- traffic signs
- noisy environments
- pattern recognition
- cepstral coefficients
- digital audio
- speech corpus
- feature extraction
- multi stream
- audio signals
- speaker recognition
- speaker identification
- automatic speech recognition
- recognition algorithm
- knowledge base
- audio signal
- noisy speech
- voice activity detection
- cepstral features