Entropy-Augmented Entropy-Regularized Reinforcement Learning and a Continuous Path from Policy Gradient to Q-Learning.

Published in: CoRR (2020)

Keyphrases

reinforcement learning
policy gradient
reinforcement learning algorithms
function approximation
actor critic
state space
policy search
model free
learning algorithm
multi agent
continuous state and action spaces
temporal difference
reinforcement learning methods
optimal policy
machine learning
state action
action space
temporal difference learning
RL algorithms
Markov decision processes
dynamic programming
optimal control
single agent
evaluation function
approximate dynamic programming
cooperative
model free reinforcement learning