Keyphrases
- policy iteration
- reinforcement learning
- Markov decision processes
- control problems
- model-free
- optimal control
- optimal policy
- human operators
- temporal difference
- approximate dynamic programming
- stochastic approximation
- policy evaluation
- sample path
- Markov decision process
- robotic systems
- fixed point
- average reward
- control policy
- finite state
- function approximation
- state space
- least squares
- actor-critic
- discounted reward
- dynamic programming
- robotic arm
- control system
- temporal difference learning
- infinite horizon
- partially observable
- average cost
- control strategy
- hand-eye
- Markov decision problems
- linear programming
- approximate policy iteration
- action space
- reinforcement learning algorithms
- action selection
- transfer learning
- RL algorithms
- machine learning
- convergence rate
- state and action spaces
- linear program
- supervised learning