Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees.
Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Mark Rowland, Michal Valko, Pierre Ménard
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- sampling methods
- Markov chain Monte Carlo
- function approximation
- sampled data
- sparsely sampled
- random sampling
- random samples
- stratified sampling
- sample space
- random sample
- lower bound
- upper bound
- learning algorithm
- sample selection
- sampling strategy
- optimal control
- sampling algorithm
- training samples
- posterior probability
- Bayesian framework
- Metropolis-Hastings
- data sets
- neural network
- sampling strategies
- model free
- optimal policy
- probabilistic model
- temporal difference
- reinforcement learning algorithms
- Gaussian process
- minority class
- image reconstruction
- learning process
- active learning
- state space
- decision trees
- class imbalance
- training data
- supervised learning
- sample size
- sample points
- Bayesian inference