Model-Free Algorithm and Regret Analysis for MDPs with Peak Constraints.
Qinbo Bai
Ather Gattami
Vaneet Aggarwal
Published in:
CoRR (2020)
Keyphrases
model free
reinforcement learning
dynamic programming
learning algorithm
policy iteration
search space
Monte Carlo
Markov decision processes
feature selection
worst case
convergence rate
function approximation
temporal difference
policy evaluation