An algorithm/hardware co-optimized method to accelerate CNNs with compressed convolutional weights on FPGA.
Jiangwei Shang, Zhan Zhang, Kun Zhang, Chuanyou Li, Lei Qian, Hong-Wei Liu
Published in: Concurr. Comput. Pract. Exp. (2024)
Keyphrases
- high accuracy
- experimental evaluation
- dynamic programming
- preprocessing
- computational cost
- objective function
- computationally efficient
- significant improvement
- cost function
- detection algorithm
- k-means
- improved algorithm
- clustering method
- optimization algorithm
- computational complexity
- detection method
- recognition algorithm
- hardware implementation
- theoretical analysis
- weighted average
- energy function
- segmentation method
- reduce the computational cost
- segmentation algorithm
- support vector machine (SVM)
- input data
- tree structure
- matching algorithm
- convergence rate
- weighting scheme
- optimization method
- low cost
- probabilistic model
- similarity measure
- parallel implementation
- classification algorithm
- nearest neighbour
- cellular neural networks
- hardware architecture
- simulated annealing
- software implementation
- real time
- fpga implementation
- optimal weights