Duplo: Lifting Redundant Memory Accesses of Deep Neural Networks for GPU Tensor Cores.
Hyeonjin Kim, Sungwoo Ahn, Yunho Oh, Bogil Kim, Won Woo Ro, William J. Song
Published in: MICRO (2020)
Keyphrases
- neural network
- auto associative
- level parallelism
- memory bandwidth
- artificial neural networks
- pattern recognition
- high order
- associative memory
- real time
- memory requirements
- memory space
- memory usage
- wavelet transform
- higher order
- multilayer perceptron
- recurrent neural networks
- computational power
- intel xeon
- genetic algorithm
- computing power
- parallel processing
- neural network model
- access patterns
- graphics processors
- main memory
- parallel implementation
- neural nets
- data transfer
- fuzzy logic
- graphics hardware
- parallel architectures
- memory access
- multiresolution
- back propagation