Loosely consistent emphatic temporal-difference learning.
Jiamin He
Fengdi Che
Yi Wan
A. Rupam Mahmood
Published in:
UAI (2023)
Keyphrases
temporal difference learning
fixed point
function approximation
reinforcement learning
game playing
evaluation function
temporal difference
Markov decision process
approximate value iteration
reinforcement learning algorithms
Monte Carlo
policy iteration
function approximators
state space