You should evaluate your language model on marginal likelihood over tokenisations.
Kris Cao, Laura Rimell
Published in: CoRR (2021)
Keyphrases
- language model
- marginal likelihood
- model selection
- language modeling
- gaussian process
- information criterion
- approximate inference
- closed form
- exponential family
- document retrieval
- speech recognition
- mixture model
- probabilistic model
- information retrieval
- hyperparameters
- bayesian information criterion
- least squares
- machine learning
- cross validation
- message passing
- information extraction