Joint speech and overlap detection: a benchmark over multiple audio setup and speech domains.
Martin Lebourdais, Théo Mariotte, Marie Tahon, Anthony Larcher, Antoine Laurent, Silvio Montrésor, Sylvain Meignier, Jean-Hugh Thomas
Published in: CoRR (2023)
Keyphrases
- audio visual
- audio stream
- broadcast news
- speech recognition
- text to speech
- speech signal
- speaker identification
- automatic speech recognition
- speech processing
- audio signals
- noisy environments
- emotion recognition
- voice activity detection
- spoken language
- recognition engine
- audio recordings
- detection method
- speech synthesis
- cepstral features
- real world
- digital audio
- acoustic signals
- speech music discrimination
- audio features
- multi modal
- automatic detection
- detection algorithm
- spoken documents
- object detection
- multimedia
- acoustic features
- visual data
- face detection
- prosodic features
- anomaly detection
- endpoint detection