Provably Sample-Efficient Model-Free Algorithm for MDPs with Peak Constraints.

Qinbo Bai, Vaneet Aggarwal, Ather Gattami

Published in: J. Mach. Learn. Res. (2023)

Keyphrases

model free
dynamic programming
reinforcement learning
learning algorithm
worst case
monte carlo
search space
policy iteration
optimal solution
average reward
machine learning
genetic algorithm
feature extraction
state space
constraint satisfaction problems
reinforcement learning algorithms