Reinforcement Learning for POMDPs Based on Action Values and Stochastic Optimization.
Theodore J. Perkins
Published in: AAAI/IAAI (2002)
Keyphrases
- stochastic optimization
- reinforcement learning
- multistage
- action selection
- state space
- action space
- state action
- partially observable
- Markov decision processes
- function approximation
- continuous state
- partially observable domains
- policy search
- optimal policy
- continuous action
- partially observable Markov decision processes
- reinforcement learning algorithms
- robust optimization
- machine learning
- model free
- learning algorithm
- optimal control
- multi agent
- policy gradient
- agent learns
- finite state
- partial observability
- dynamic programming
- continuous state spaces
- artificial intelligence