Issues concerning realizability of Blackwell optimal policies in reinforcement learning.
Nicholas Denis
Published in: CoRR (2019)
Keyphrases
- optimal policy
- reinforcement learning
- Markov decision processes
- decision problems
- state space
- dynamic programming
- infinite horizon
- finite horizon
- policy iteration
- average reward
- long run
- multistage
- state-dependent
- finite state
- reward function
- Markov decision problems
- machine learning
- sufficient conditions
- average cost
- Markov decision process
- lost sales
- initial state
- Bayesian reinforcement learning
- function approximation
- reinforcement learning algorithms
- multi-agent
- serial inventory systems
- total reward
- average reward reinforcement learning
- temporal difference
- dynamic programming algorithms
- control policies
- model-free
- search space
- learning algorithm