Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies.

Zihan Zhang, Xiangyang Ji, Simon S. Du

Published in: COLT (2022)

Keyphrases

reinforcement learning
action sets
Markov decision processes
stationary policies
state space
function approximation
special case
reinforcement learning algorithms
optimal policy
Markov decision process
computational complexity
finite state
multi-agent
optimal control
linear programming
linear program
action space
learning algorithm