BRAC+: Improved Behavior Regularized Actor Critic for Offline Reinforcement Learning.
Chi Zhang, Sanmukh Rao Kuppannagari, Viktor K. Prasanna
Published in: CoRR (2021)
Keyphrases
- actor critic
- reinforcement learning
- policy gradient
- reinforcement learning algorithms
- temporal difference
- optimal control
- function approximation
- approximate dynamic programming
- neuro fuzzy
- gradient method
- model free
- rl algorithms
- state space
- policy iteration
- dynamical systems
- learning algorithm
- markov decision processes
- multi agent
- optimal policy
- action selection
- control problems
- partially observable markov decision processes
- function approximators
- average reward
- transfer learning
- reinforcement learning methods
- least squares