Improving Generalization of Reinforcement Learning with Minimax Distributional Soft Actor-Critic.
Yangang Ren, Jingliang Duan, Shengbo Eben Li, Yang Guan, Qi Sun
Published in: ITSC (2020)
Keyphrases
- actor-critic
- reinforcement learning
- temporal difference
- policy gradient
- reinforcement learning algorithms
- approximate dynamic programming
- function approximation
- optimal control
- gradient method
- neuro-fuzzy
- evaluation function
- state space
- model-free
- policy iteration
- control problems
- learning algorithm
- dynamic programming
- policy gradient methods
- adaptive control
- multi-agent
- Markov decision processes
- Markov chain
- least squares