Towards Offline Reinforcement Learning with Pessimistic Value Priors.
Filippo Valdettaro, A. Aldo Faisal
Published in: Epi UAI (2024)
Keyphrases
- reinforcement learning
- function approximation
- state space
- real time
- prior knowledge
- machine learning
- learning process
- temporal difference
- optimal policy
- Markov decision processes
- multi agent reinforcement learning
- possibility theory
- reinforcement learning algorithms
- stochastic approximation
- learning problems
- supervised learning
- data sets
- transition model
- learned from training data
- prior model
- robotic control
- expected utility
- optimal control
- Bayesian framework
- prior information
- learning algorithm
- neural network