Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection.
Yu-Hsuan Wang, Hung-yi Lee, Lin-Shan Lee. Published in: ICASSP (2018)
Keyphrases
- spoken term detection
- broadcast news
- hidden Markov models
- automatic speech recognition
- language model
- sequence alignment
- out-of-vocabulary
- video search
- n-gram
- multimedia
- pattern recognition
- multi-modal
- natural language
- vector space
- video retrieval
- visual information
- non-stationary
- co-occurrence
- language processing
- speech recognition