Optimizing over a Restricted Policy Class in MDPs.
Ershad Banijamali
Yasin Abbasi-Yadkori
Mohammad Ghavamzadeh
Nikos Vlassis
Published in:
AISTATS (2019)
Keyphrases
reinforcement learning problems
markov decision problems
policy iteration
markov decision processes
reinforcement learning
dynamic programming
state space
average reward
optimal policy
factored mdps
state and action spaces