Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor.
Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine
Published in: ICML (2018)
Keyphrases
- maximum entropy
- actor critic
- reinforcement learning
- temporal difference
- policy gradient
- approximate dynamic programming
- reinforcement learning algorithms
- optimal control
- gradient method
- maximum entropy principle
- neuro fuzzy
- Markov models
- random fields
- function approximation
- Monte Carlo
- policy iteration
- state space
- Markov decision processes
- dynamic programming
- conditional random fields
- average reward
- evaluation function
- learning algorithm
- temporal difference learning
- machine learning
- transfer learning
- action space
- reinforcement learning methods
- model free
- optimal solution
- multi agent