Leveraging Counterfactual Paths for Contrastive Explanations of POMDP Policies.
Benjamin Kraske, Zakariya Laouar, Zachary Sunberg
Published in: CoRR (2024)
Keyphrases
- partially observable markov decision processes
- optimal policy
- markov decision process
- finite state
- markov decision problems
- reinforcement learning
- reward function
- dynamical systems
- decision problems
- continuous state
- state space
- belief state
- markov decision processes
- predictive state representations
- dynamic programming
- partially observable
- partially observable stochastic games
- policy search
- planning under uncertainty
- control policies
- belief space
- optimal path
- shortest path
- partially observable markov decision process
- long run
- infinite horizon
- decision processes
- machine learning
- hidden state
- policy iteration
- multi agent
- dec pomdps
- bayesian reinforcement learning
- model free reinforcement learning
- average reward
- causal reasoning
- policy gradient
- domain theory
- markov chain
- approximate solutions