UniKW-AT: Unified Keyword Spotting and Audio Tagging.
Heinrich Dinkel, Yongqing Wang, Zhiyong Yan, Junbo Zhang, Yujun Wang
Published in: INTERSPEECH (2022)
Keyphrases
- keyword spotting
- speech processing
- speech recognition
- hidden Markov models
- speaker identification
- signal processing
- multimedia
- printed documents
- natural language processing
- visual information
- artificial intelligence
- multimedia systems
- audio visual
- metadata
- neural network
- handwritten documents
- English text
- automatic speech recognition
- multimedia information
- audio features
- character recognition
- Gaussian mixture model
- image analysis