AudioChatLlama: Towards General-Purpose Speech Abilities for LLMs.
Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Ke Li, Junteng Jia, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer
Published in: NAACL-HLT (2024)
Keyphrases
- general purpose
- speech recognition
- special purpose
- domain specific
- speech signal
- programming language
- speech synthesis
- application specific
- spoken language
- endpoint detection
- automatic speech recognition
- audio visual
- text to speech
- recognition engine
- dialogue system
- text to speech synthesis
- automatic speech recognition systems
- language acquisition
- linear prediction
- emotion recognition
- spectral features
- speech processing
- cognitive abilities
- spontaneous speech
- multi lingual
- speaker recognition
- tightly coupled
- data sets
- human beings
- image sequences
- high level
- information systems
- neural network