Ps and Qs: Quantization-aware pruning for efficient low latency neural network inference.
Benjamin Hawks
Javier M. Duarte
Nicholas J. Fraser
Alessandro Pappalardo
Nhan Tran
Yaman Umuroglu
Published in: CoRR (2021)
Keyphrases
low latency
neural network
highly efficient
high bandwidth
search space
high throughput
high speed
sensor networks
real time
data sets
web services
cost effective