Unsupervised stemmed text corpus for language modeling and transcription of Telugu broadcast news.
Mythilisharan Pala, Parayitam Laxminarayana, Venkataramana Appala
Published in: Int. J. Speech Technol. (2020)
Keyphrases
- language modeling
- broadcast news
- language model
- information retrieval
- retrieval model
- query expansion
- automatic speech recognition
- text corpora
- n-gram
- cross-lingual
- unsupervised learning
- probabilistic model
- video search
- text classification
- speech recognition
- semi-supervised
- video retrieval
- information retrieval systems
- text documents
- data mining
- image classification
- named entities
- topic modeling
- document collections
- document retrieval
- test collection
- digital libraries
- keywords
- similarity measure
- feature selection
- machine learning