Vector-Vector-Matrix Architecture: A Novel Hardware-Aware Framework for Low-Latency Inference in NLP Applications.

Matthew Khoury, Rumen Dangovski, Longwu Ou, Preslav Nakov, Yichen Shen, Li Jing
Published in: EMNLP (1) (2020)
Keyphrases
  • low latency
  • real time
  • high speed
  • computational complexity
  • low cost
  • high throughput
  • database
  • data mining
  • highly efficient