Offline Bayesian Aleatoric and Epistemic Uncertainty Quantification and Posterior Value Optimisation in Finite-State MDPs.
Filippo Valdettaro, A. Aldo Faisal
Published in: CoRR (2024)
Keyphrases
- finite state
- markov decision processes
- posterior distribution
- posterior probability
- average cost
- markov chain
- optimal policy
- decision theory
- markov chain monte carlo
- state space
- action sets
- policy iteration
- bayesian networks
- reinforcement learning
- bayesian framework
- probability distribution
- conditional probabilities
- continuous state
- dynamic programming
- reinforcement learning algorithms
- latent variables
- model checking
- partially observable markov decision processes
- partially observable
- policy iteration algorithm
- tree automata
- markov decision process
- infinite horizon
- bayesian inference
- probabilistic model
- maximum likelihood
- search space
- expected utility
- machine learning
- decision problems