On the Performance Bounds of some Policy Search Dynamic Programming Algorithms.
Bruno Scherrer
Published in: CoRR (2013)
Keyphrases
- dynamic programming algorithms
- policy search
- dynamic programming
- Markov decision problems
- optimal policy
- reinforcement learning
- upper bound
- linear programming
- lower bound
- continuous state
- state space
- partially observable Markov decision processes
- Markov decision processes
- partially observable
- reward function
- decision theoretic
- reinforcement learning algorithms
- approximation methods
- function approximation
- optimal control
- decision processes
- utility function
- linear program
- domain independent
- supervised learning