A Bayesian Approach to Learning Bandit Structure in Markov Decision Processes.
Kelly W. Zhang
Omer Gottesman
Finale Doshi-Velez
Published in:
CoRR (2022)
Keyphrases
markov decision processes
reinforcement learning
partially observable
optimal policy
dynamic programming
finite state
stochastic games
model based reinforcement learning
decision theoretic planning
macro actions
machine learning
learning algorithm
markov decision process
risk sensitive
action sets