Optimal policy switching algorithms for reinforcement learning.
Gheorghe Comanici, Doina Precup
Published in: AAMAS (2010)
Keyphrases
- optimal policy
- reinforcement learning
- Markov decision processes
- policy iteration
- learning algorithm
- state space
- control policies
- decision problems
- long run
- infinite horizon
- dynamic programming algorithms
- dynamic programming
- partially observable Markov decision processes
- Markov decision process
- reinforcement learning algorithms
- state dependent
- finite horizon
- model free
- multistage
- function approximation
- average cost
- average reward
- reinforcement learning methods
- policy evaluation
- supply chain
- total reward