Actor-Critic Alignment for Offline-to-Online Reinforcement Learning.
Zishun Yu, Xinhua Zhang
Published in: ICML (2023)
Keyphrases
- actor critic
- reinforcement learning
- temporal difference
- approximate dynamic programming
- reinforcement learning algorithms
- policy gradient
- optimal control
- function approximation
- real time
- neuro fuzzy
- gradient method
- policy iteration
- policy gradient methods
- control problems
- state space
- multi agent
- learning algorithm
- neural network
- model free
- rl algorithms
- machine learning
- Markov decision process
- natural actor critic