Approximating Optimal Policies for Partially Observable Stochastic Domains.
Ronald Parr, Stuart J. Russell
Published in: IJCAI (1995)
Keyphrases
- optimal policy
- partially observable stochastic domains
- partially observable Markov decision processes
- Markov decision processes
- decision problems
- state space
- reinforcement learning
- dynamic programming
- finite state
- long run
- agent programming
- finite horizon
- infinite horizon
- initial state
- average reward
- dynamical systems
- Markov decision process
- dynamic programming algorithms
- partial observability
- policy iteration
- serial inventory systems
- average reward reinforcement learning
- partially observable
- situation calculus
- sufficient conditions
- belief state
- Markov chain