Downdating Lexicon and Language Model for Automatic Transcription of Czech Historical Spoken Documents.
Josef Chaloupka, Jan Nouza, Petr Cerva, Jirí Málek
Published in: TSD (2013)
Keyphrases
- language model
- spoken document retrieval
- spontaneous speech
- cross language
- document retrieval
- out of vocabulary
- test collection
- language modeling
- information retrieval
- n-gram
- language independent
- cross lingual
- probabilistic model
- speech recognition
- retrieval model
- query expansion
- text retrieval
- question answering
- cross language information retrieval
- query terms
- vector space model
- natural language
- broadcast news
- multiword
- pseudo relevance feedback
- text mining
- average precision
- information extraction
- query translation
- retrieval systems
- relevance model
- hidden Markov models
- Bayesian networks
- information access
- visual features
- document collections
- relevant documents