GRNN: Low-Latency and Scalable RNN Inference on GPUs.

Connor Holmes, Daniel Mawhirter, Yuxiong He, Feng Yan, Bo Wu

Published in: EuroSys (2019)

Keyphrases

low latency
continuous query processing
high speed
recurrent neural networks
high throughput
real time
high bandwidth
highly efficient
massive scale
nearest neighbor
virtual machine
general purpose
stream processing
databases
response time
low cost
orders of magnitude
database
graphics processing units
data sets