Safe Posterior Sampling for Constrained MDPs with Bounded Constraint Violation.
Krishna Chaitanya Kalagarla, Rahul Jain, Pierluigi Nuzzo
Published in: CoRR (2023)
Keyphrases
- markov decision processes
- constraint violations
- markov chain monte carlo
- hard constraints
- constrained problems
- reinforcement learning
- random sampling
- metropolis hastings
- sampling algorithm
- state space
- probability distribution
- sample size
- constraint satisfaction
- parameter space
- bayesian framework
- factored mdps
- finite horizon
- user defined constraints
- global constraints
- posterior distribution
- posterior probability
- monte carlo
- generative model
- linear constraints
- optimal policy
- markov chain
- machine learning