Ps and Qs: Quantization-Aware Pruning for Efficient Low Latency Neural Network Inference.
Benjamin Hawks
Javier M. Duarte
Nicholas J. Fraser
Alessandro Pappalardo
Nhan Tran
Yaman Umuroglu
Published in:
Frontiers Artif. Intell. (2021)
Keyphrases
low latency
neural network
high throughput
highly efficient
high bandwidth
cost effective
massive scale
data sets
databases
search space
web services
sensor networks