Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning.

Fan-Ming Luo Tian Xu Xingchen Cao Yang Yu

Published in: CoRR (2023)

Keyphrases

reinforcement learning
multi agent
experimental data
function approximation
dynamic programming
state space
statistical models
reinforcement learning algorithms
neural network
machine learning
probabilistic model
markov decision processes
dynamic model
mathematical models
reward function