Policy Gradient Planning for Environmental Decision Making with Existing Simulators.
Mark Crowley, David Poole
Published in: AAAI (2011)
Keyphrases
- policy gradient
- decision making
- action selection
- parametric optimization
- function approximation
- reinforcement learning
- decision makers
- partially observable Markov decision processes
- heuristic search
- single agent
- planning problems
- actor critic
- domain independent
- dynamic programming
- multi agent
- path finding
- data mining
- reinforcement learning methods
- gradient method
- approximation methods
- reinforcement learning algorithms
- optimal control
- Markov decision processes