On the Existence of Fixed Points for Q-Learning and Sarsa in Partially Observable Domains.
Theodore J. Perkins, Mark D. Pendrith
Published in: ICML (2002)
Keyphrases
- fixed point
- partially observable domains
- reinforcement learning
- temporal difference learning
- policy iteration
- reinforcement learning algorithms
- function approximation
- partially observable
- state space
- dynamical systems
- inverse reinforcement learning
- model free
- temporal difference
- sufficient conditions
- action selection
- optimal policy
- sensing actions
- supervised learning
- action models
- transfer learning
- partially observable Markov decision processes
- learning algorithm
- function approximators
- Markov decision process
- multi agent
- Markov decision processes
- belief propagation
- Monte Carlo
- dynamic programming
- game playing