Login / Signup
An Embarrassingly Simple Method to Mitigate Undesirable Properties of Pretrained Language Model Tokenizers.
Valentin Hofmann
Hinrich Schütze
Janet B. Pierrehumbert
Published in:
ACL (2) (2022)
Keyphrases
</>
language model
probabilistic model
context sensitive
information retrieval
clustering method
feature selection
n gram