Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes.
John Loch, Satinder P. Singh
Published in: ICML (1998)
Keyphrases
- partially observable Markov decision processes
- eligibility traces
- policy evaluation
- reinforcement learning
- optimal policy
- finite state
- reinforcement learning algorithms
- dynamical systems
- planning under uncertainty
- Markov decision processes
- state space
- decision problems
- dynamic programming
- belief state
- partially observable
- reinforcement learning methods
- multi-agent
- function approximation
- planning problems
- policy gradient
- model-free
- infinite horizon
- average reward
- Dec-POMDPs
- policy iteration
- temporal difference
- machine learning
- heuristic search
- point-based value iteration
- action selection
- Markov decision process
- learning algorithm