Model-Based Offline Reinforcement Learning with Pessimism-Modulated Dynamics Belief.

Kaiyang Guo, Yunfeng Shao, Yanhui Geng

Published in: NeurIPS (2022)

Keyphrases

reinforcement learning
model-free
function approximation
dynamical systems
learning algorithm
reinforcement learning algorithms
real time
Markov decision processes
belief revision
optimal control
belief space
dynamic model
optimal policy
state space
radial basis function
temporal difference
expected utility
probability theory
evolutionary algorithm
initial conditions
fully unsupervised
temporal difference learning
stochastic approximation
learning environment
continuous state
multi-agent reinforcement learning