On Sample Complexity of Projection-Free Primal-Dual Methods for Learning Mixture Policies in Markov Decision Processes.
Masoud Badiei Khuzani, Varun Vasudevan, Hongyi Ren, Lei Xing
Published in: CoRR (2019)
Keyphrases
- markov decision processes
- primal dual
- reinforcement learning
- optimal policy
- learning algorithm
- sample complexity
- learning problems
- supervised learning
- finite state
- active learning
- markov decision process
- state space
- partially observable
- learning process
- linear programming
- reward function
- markov decision problems
- decision theoretic planning
- policy iteration
- infinite horizon
- generalization error
- approximation algorithms
- convergence rate
- semi supervised
- dynamic programming