M-A3C: A Mean-Asynchronous Advantage Actor-Critic Reinforcement Learning Method for Real-Time Gait Planning of Biped Robot.
Jie Leng
Suozhong Fan
Jun Tang
Haiming Mou
Junxiao Xue
Qingdu Li
Published in:
IEEE Access (2022)
Keyphrases
reinforcement learning
real time
dynamic programming
temporal difference
biped robot
machine learning
monte carlo
input output
function approximation
negative matrix factorization
gradient method