HALP: Hardware-Aware Latency Pruning.
Maying Shen, Hongxu Yin, Pavlo Molchanov, Lei Mao, Jianna Liu, Jose M. Alvarez
Published in: CoRR (2021)
Keyphrases
- low latency
- low cost
- heterogeneous computing
- real time
- search space
- hardware and software
- pruning method
- parallel hardware
- embedded systems
- computer systems
- high throughput
- hardware implementation
- computing power
- VLSI implementation
- general purpose
- effective pruning
- tree pruning
- pruning methods
- hardware architecture
- single chip
- high end
- response time
- prefetching