Deadly triad matters for offline reinforcement learning.

Zhiyong Peng, Yadong Liu, Zongtan Zhou

Published in: Knowl. Based Syst. (2024)

Keyphrases

reinforcement learning
function approximation
state space
reinforcement learning algorithms
robotic control
learning algorithm
real time
model free
Markov decision processes
temporal difference
learning problems
transfer learning
partially observable
database
optimal policy
evolutionary algorithm
expert systems
Bayesian networks
artificial intelligence
temporal difference learning
multi agent reinforcement learning
transition model
policy search
neural network
direct policy search