W2N-AVSC: Audiovisual Extension For Whisper-To-Normal Speech Conversion.
Shogo Seki, Kanami Imamura, Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Noboru Harada
Published in: EUSIPCO (2023)
Keyphrases
- audio visual
- emotion recognition
- visual information
- speech recognition
- speech signal
- machine learning
- speaker identification
- multimedia
- information systems
- multi modal
- non stationary
- user interface
- case study
- video retrieval
- automatic speech recognition
- artificial intelligence
- spoken language
- database
- english text
- recognition engine
- endpoint detection