NinjaLLM: Fast, Scalable and Cost-effective RAG using Amazon SageMaker and AWS Trainium and Inferentia2.

Tengfei Xue, Xuefeng Li, Roman Smirnov, Tahir Azim, Arash Sadrieh, Babak Pahlavan
Published in: CoRR (2024)
Keyphrases
  • lightweight
  • cost effective
  • low cost
  • cost effectiveness
  • error tolerant
  • data center
  • graph cuts
  • map reduce
  • optimal scheduling
  • database systems
  • mobile devices
  • graph matching
  • highly scalable