An Online Adaptive Policy Iteration-Based Reinforcement Learning for a Class of a Nonlinear 3D Overhead Crane.
Nezar M. Alyazidi, Abdalrahman M. Hassanine, Magdi Sadek Mahmoud
Published in: Appl. Math. Comput. (2023)
Keyphrases
- policy iteration
- reinforcement learning
- markov decision processes
- model free
- linear approximation
- optimal policy
- temporal difference
- actor critic
- policy evaluation
- stochastic approximation
- average reward
- sample path
- approximate dynamic programming
- fixed point
- function approximation
- reinforcement learning algorithms
- finite state
- markov decision process
- temporal difference learning
- optimal control
- least squares
- state space
- learning algorithm
- neural network
- infinite horizon
- convergence rate
- state and action spaces
- policy search
- markov decision problems
- sufficient conditions
- dynamic programming
- search algorithm
- machine learning