Investigating the Corpus Independence of the Bag-of-Audio-Words Approach.
Mercedes Vetráb, Gábor Gosztolya
Published in: TDS (2020)
Keyphrases
- english words
- spontaneous speech
- word frequencies
- automatic transcription
- multiword
- bag of words
- text corpus
- unknown words
- text corpora
- word pairs
- training corpus
- world knowledge
- word frequency
- linguistic information
- word co occurrence
- n gram
- person names
- multimedia
- lexical features
- noun phrases
- human language
- word sense disambiguation
- human machine interaction
- conversational speech
- parallel texts
- textual features
- related words
- document level
- manually annotated
- audio visual
- parallel corpus
- semantic roles
- visual information
- audio stream
- keywords
- pos tagging
- annotated corpus
- broadcast news
- ambiguous words
- automatic speech recognition
- visual features