Benchmarking Representations for Speech, Music, and Acoustic Events.
Moreno La Quatra, Alkis Koudounas, Lorenzo Vaiani, Elena Baralis, Luca Cagliero, Paolo Garza, Sabato Marco Siniscalchi
Published in: CoRR (2024)
Keyphrases
- acoustic features
- music information retrieval
- audio signals
- audio features
- speech sounds
- audio signal
- speech recognition systems
- speech signal
- speech music discrimination
- event detection
- speaker verification
- speech recognition
- automatic speech recognition
- audio visual
- digital audio
- speech recognizers
- prosodic features
- audio recordings
- music retrieval
- source localization
- music recommendation
- underwater acoustic
- mel frequency cepstral coefficients
- speech recognizer
- speaker identification
- vocal tract
- sound source
- emotional speech
- acoustic signal
- bird species
- speaker recognition
- text to speech
- temporal information
- genre classification
- symbolic representation
- feature set
- formant frequencies
- hidden markov models
- higher level
- temporal patterns
- audio stream