Keyphrases
- reinforcement learning
- temporal difference
- least squares
- policy evaluation
- policy iteration
- temporal difference learning
- function approximation
- TD learning
- model-free
- Markov decision processes
- step size
- reinforcement learning algorithms
- evaluation function
- Monte Carlo
- linear approximation
- control problems
- fixed point
- multi-step
- state space
- Markov decision process
- supervised learning
- cooperative