An Improved Policy Iteration Algorithm for Partially Observable MDPs.
Eric A. Hansen
Published in: NIPS (1997)
Keyphrases
- partially observable
- Markov decision processes
- policy iteration algorithm
- reinforcement learning
- policy iteration
- Markov decision problems
- finite state
- infinite horizon
- state space
- dynamic programming
- optimal policy
- reinforcement learning algorithms
- planning under uncertainty
- reward function
- average reward
- average cost
- decision processes
- Markov decision process
- policy evaluation
- state variables
- decision problems
- machine learning