Keyphrases
- policy iteration
- reinforcement learning
- Markov decision processes
- model-free
- optimal control
- control problems
- optimal policy
- human operators
- policy evaluation
- temporal difference
- stochastic approximation
- Markov decision process
- approximate dynamic programming
- sample path
- finite state
- control policy
- infinite horizon
- least squares
- state space
- control system
- robotic arm
- reinforcement learning algorithms
- actor-critic
- temporal difference learning
- robotic systems
- function approximation
- average reward
- fixed point
- state and action spaces
- partially observable
- action selection
- control strategy
- dynamic programming
- machine learning
- convergence rate
- transfer learning
- linear programming
- Markov decision problems
- supervised learning
- multi-agent
- learning algorithm