Exploiting Input Tensor Dynamics in Activation Checkpointing for Efficient Training on GPU.

Jianjin Liao, Mingzhen Li, Hailong Yang, Qingxiao Sun, Biao Sun, Jiwei Hao, Tianyu Feng, Fengwei Yu, Shengdong Chen, Ye Tao, Zicheng Zhang, Zhongzhi Luan, Depei Qian
Published in: IPDPS (2023)
Keyphrases
  • real time
  • distributed databases
  • information processing
  • parallel processing
  • recurrent networks
  • graphics processors
  • neural network
  • support vector
  • general purpose
  • parallel computation
  • parallel architectures