Stochastic Tokenization with a Language Model for Neural Text Classification.
Tatsuya Hiraoka, Hiroyuki Shindo, Yuji Matsumoto
Published in: ACL (1) (2019)
Keyphrases
- language model
- text classification
- n-gram
- language modeling
- probabilistic model
- bag of words
- document retrieval
- language modelling
- query expansion
- speech recognition
- text categorization
- test collection
- information retrieval
- naive bayes
- feature selection
- mixture model
- smoothing methods
- vector space model
- knn
- statistical language modeling
- retrieval model
- context sensitive
- labeled data
- query terms
- text documents
- machine learning
- text mining
- cross lingual
- multi label
- language models for information retrieval
- statistical language models
- language model for information retrieval
- document length
- translation model
- text classifiers
- pseudo relevance feedback
- named entities