Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies.
Zihan Zhang, Xiangyang Ji, Simon S. Du
Published in: COLT (2022)
Keyphrases
- reinforcement learning
- action sets
- Markov decision processes
- stationary policies
- state space
- function approximation
- special case
- reinforcement learning algorithms
- optimal policy
- Markov decision process
- computational complexity
- finite state
- multi-agent
- optimal control
- linear programming
- linear program
- action space
- learning algorithm