Keyphrases
- policy iteration
- lower bound
- Markov decision processes
- upper bound
- model-free
- average case complexity
- least squares
- fixed point
- reinforcement learning
- sample path
- optimal policy
- temporal difference
- Markov decision process
- policy evaluation
- finite state
- NP-hard
- average reward
- objective function
- convergence rate
- linear programming
- infinite horizon
- Markov decision problems
- optimal control
- optimal solution
- worst case
- average cost
- discounted reward
- supervised learning
- state space
- convergence speed
- linear program
- dynamic programming
- sufficient conditions
- Markov chain
- Bayesian networks