Controlling Underestimation Bias in Reinforcement Learning via Quasi-median Operation.
Wei Wei
Yujia Zhang
Jiye Liang
Lin Li
Yuze Li
Published in:
AAAI (2022)
Keyphrases
reinforcement learning
function approximation
optimal policy
reinforcement learning algorithms
temporal difference
supervised learning
state space
model free
multi agent
learning process
control system
neural network
decision trees
transfer learning
markov decision processes
learning problems
evaluation function
learning algorithm
information retrieval
machine learning
real world