Sparse Attention Acceleration with Synergistic In-Memory Pruning and On-Chip Recomputation.
Amir Yazdanbakhsh, Ashkan Moradifirouzabadi, Zheng Li, Mingu Kang
Published in: CoRR (2022)
Keyphrases
- high speed
- random access memory
- memory access
- low cost
- sparse data
- vlsi implementation
- memory space
- memory requirements
- memory usage
- digital signal processors
- single chip
- computational power
- high dimensional
- real time
- memory subsystem
- compressive sensing
- visual attention
- processor core
- sparse representation
- pruning method
- avoid overfitting
- analog vlsi
- search space
- pruning power
- level parallelism
- programmable logic
- pruning algorithm
- main memory
- computing power
- focus of attention