Pessimistic Model-based Offline RL: PAC Bounds and Posterior Sampling under Partial Coverage.
Masatoshi Uehara, Wen Sun
Published in: CoRR (2021)
Keyphrases
- upper bound
- VC dimension
- sample size
- PAC-Bayesian
- model free
- reinforcement learning
- mistake bound
- Markov chain Monte Carlo
- random sampling
- PAC learning
- Metropolis-Hastings
- probability distribution
- sampling algorithm
- sample complexity
- worst case
- lower bound
- real time
- posterior distribution
- variance reduction
- Monte Carlo
- multi agent
- upper and lower bounds
- function approximation
- sampling methods
- linear threshold
- average case
- probabilistic model
- latent variables
- policy iteration
- generalization bounds
- Bayesian framework