Local Policy Search in a Convex Space and Conservative Policy Iteration as Boosted Policy Search.
Bruno Scherrer, Matthieu Geist
Published in: ECML/PKDD (3) (2014)
Keyphrases
- queueing networks
- policy search
- Markov decision problems
- policy iteration
- reinforcement learning
- approximate policy iteration
- state dependent
- continuous state
- optimal policy
- Markov decision processes
- state space
- dynamic programming
- linear programming
- reward function
- partially observable Markov decision processes
- action space