Towards Offline Reinforcement Learning with Pessimistic Value Priors.

Filippo Valdettaro, A. Aldo Faisal

Published in: Epi UAI (2024)

Keyphrases

reinforcement learning
function approximation
state space
real time
prior knowledge
machine learning
learning process
temporal difference
optimal policy
markov decision processes
multi agent reinforcement learning
possibility theory
reinforcement learning algorithms
stochastic approximation
learning problems
supervised learning
data sets
transition model
learned from training data
prior model
robotic control
expected utility
optimal control
bayesian framework
prior information
learning algorithm
neural network