NinjaLLM: Fast, Scalable and Cost-effective RAG using Amazon SageMaker and AWS Trainium and Inferentia2.
Tengfei Xue
Xuefeng Li
Roman Smirnov
Tahir Azim
Arash Sadrieh
Babak Pahlavan
Published in:
CoRR (2024)
Keyphrases
lightweight
cost effective
low cost
cost effectiveness
error tolerant
data center
graph cuts
map reduce
optimal scheduling
database systems
mobile devices
graph matching
highly scalable