Actor-Critic Reinforcement Learning with Phased Actor.
Ruofan Wu, Junmin Zhong, Jennie Si
Published in: CoRR (2024)
Keyphrases
- actor critic
- reinforcement learning
- temporal difference
- reinforcement learning algorithms
- optimal control
- approximate dynamic programming
- policy gradient
- function approximation
- neuro fuzzy
- gradient method
- multi agent
- state space
- policy iteration
- model free
- machine learning
- temporal difference learning
- linear program
- Markov decision processes
- step size
- RL algorithms
- dynamic programming
- control problems
- average reward
- policy gradient methods
- least squares
- learning algorithm
- action space
- function approximators
- partially observable
- convergence rate
- learning problems
- learning tasks
- dynamic environments
- supervised learning