Modified Retrace for Off-Policy Temporal Difference Learning.
Xingguo Chen
Xingzhou Ma
Yang Li
Guang Yang
Shangdong Yang
Yang Gao
Published in:
UAI (2023)
Keyphrases
temporal difference learning
function approximation
fixed point
game playing
reinforcement learning
approximate value iteration
temporal difference
evaluation function
reinforcement learning algorithms
Markov decision process
learning algorithm
dynamic programming
Monte Carlo