Tokenization Falling Short: The Curse of Tokenization.
Yekun Chai
Yewei Fang
Qiwei Peng
Xuhong Li
Published in:
CoRR (2024)
Keyphrases
named entities
biomedical text
biomedical information retrieval
high-dimensional
n-gram
dimension reduction
character n-grams
high-dimensional data
real-time
neural network
real-world
evolutionary algorithm
language model
high dimensionality