EdgeLLM: A Highly Efficient CPU-FPGA Heterogeneous Edge Accelerator for Large Language Models.
Mingqiang Huang, Ao Shen, Kai Li, Haoxiang Peng, Boyu Li, Hao Yu
Published in: CoRR (2024)
Keyphrases
- highly efficient
- language model
- field programmable gate array
- low cost
- multithreading
- language modeling
- n gram
- probabilistic model
- information retrieval
- retrieval model
- speech recognition
- document retrieval
- test collection
- hardware implementation
- query expansion
- ad hoc information retrieval
- embedded systems
- real time
- smoothing methods
- language modelling
- query terms
- relevance model
- statistical language models
- context sensitive
- translation model
- low complexity
- parallel implementation
- parallel computing
- document ranking
- relevant documents
- language models for information retrieval
- spoken term detection