M-A3C: A Mean-Asynchronous Advantage Actor-Critic Reinforcement Learning Method for Real-Time Gait Planning of Biped Robot.

Jie Leng, Suozhong Fan, Jun Tang, Haiming Mou, Junxiao Xue, Qingdu Li
Published in: IEEE Access (2022)
Keyphrases
  • reinforcement learning
  • real time
  • dynamic programming
  • temporal difference
  • biped robot
  • machine learning
  • monte carlo
  • input output
  • function approximation
  • negative matrix factorization
  • gradient method