Keyphrases
- markov decision processes
- dynamic programming
- total reward
- finite horizon
- average cost
- average reward
- optimal policy
- reinforcement learning
- model-free reinforcement learning
- worst case
- infinite horizon
- state space
- expected reward
- discounted reward
- optimal solution
- markov decision process
- optimal control
- reward function
- finite state
- partially observable
- stationary policies
- stochastic games
- real-valued
- upper bound