Keyphrases
- policy iteration
- fixed point
- approximate value iteration
- Markov decision processes
- discounted reward
- model-free
- upper bound
- least squares
- sample path
- average reward
- optimal policy
- reinforcement learning
- finite state
- temporal difference
- Markov decision process
- policy evaluation
- lower bound
- sufficient conditions
- optimal control
- convergence rate
- Markov decision problems
- infinite horizon
- belief propagation
- dynamical systems
- active learning
- linear programming
- generalization bounds
- variance reduction
- graphical models
- worst case
- probabilistic model