Empowering 1000 tokens/second on-device LLM prefilling with mllm-NPU.
Daliang Xu
Hao Zhang
Liming Yang
Ruiqi Liu
Gang Huang
Mengwei Xu
Xuanzhe Liu
Published in:
CoRR (2024)
Keyphrases
information retrieval
real time
machine learning
line segments
data acquisition
medical devices
portable devices
data mining
knowledge base
face recognition
wide range
artificial neural networks