Language modeling and transcription of the TED corpus lectures.
Erwin Leeuwis, Marcello Federico, Mauro Cettolo
Published in: ICASSP (1) (2003)
Keyphrases
- language modeling
- language model
- information retrieval
- query expansion
- retrieval model
- comparable corpora
- n-gram
- probabilistic model
- cross-lingual
- test collection
- active learning
- statistical machine translation
- text classification
- statistical language modeling
- statistical language models
- handwriting recognition
- word segmentation
- speech recognition
- multiword
- vector space model
- relevance model
- information retrieval systems
- parallel corpora
- digital libraries
- similarity measure
- multimedia