Flexible Keyword Spotting based on Homogeneous Audio-Text Embedding.
Kumari Nishu, Minsik Cho, Paul Dixon, Devang Naik
Published in: CoRR (2023)
Keyphrases
- keyword spotting
- printed documents
- speech processing
- handwritten documents
- speech recognition
- multimedia
- text retrieval
- document analysis
- signal processing
- document images
- character recognition
- english text
- text to speech
- information retrieval
- textual data
- speaker identification
- audio visual
- vector space
- web documents
- text mining
- optical character recognition
- automatic speech recognition
- multimedia information
- broadcast news
- hidden markov models
- keywords
- metadata
- video search
- natural language generation