Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning.
Fan-Ming Luo
Tian Xu
Xingchen Cao
Yang Yu
Published in:
CoRR (2023)
Keyphrases
reinforcement learning
multi-agent
experimental data
function approximation
dynamic programming
state space
statistical models
reinforcement learning algorithms
neural network
machine learning
probabilistic model
Markov decision processes
dynamic model
mathematical models
reward function