Global Policy Construction in Modular Reinforcement Learning.
Ruohan Zhang, Zhao Song, Dana H. Ballard
Published in: AAAI (2015)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- markov decision process
- partially observable
- partially observable environments
- function approximation
- state space
- reinforcement learning algorithms
- markov decision problems
- action space
- transition model
- action selection
- reinforcement learning problems
- model free
- dynamic programming
- global information
- neural network
- state and action spaces
- modular structure
- learning process
- markov decision processes
- policy gradient
- control policy
- partially observable domains
- construction process
- control policies
- function approximators
- partially observable markov decision processes
- asymptotically optimal
- reward function
- infinite horizon
- optimal control
- sufficient conditions
- decision problems