Optimized FPGA-based Deep Learning Accelerator for Sparse CNN using High Bandwidth Memory.
Chao Jiang, David Ojika, Bhavesh Patel, Herman Lam
Published in: FCCM (2021)
Keyphrases
- deep learning
- high bandwidth
- application specific
- end to end
- unsupervised learning
- general purpose
- low latency
- machine learning
- field programmable gate array
- high density
- weakly supervised
- mental models
- high dimensional
- hardware implementation
- sparse representation
- parallel implementation
- ibm eserver
- data analysis
- object recognition
- reinforcement learning
- learning algorithm