ρ-POMDPs have Lipschitz-Continuous ε-Optimal Value Functions.
Mathieu Fehr
Olivier Buffet
Vincent Thomas
Jilles Steeve Dibangoye
Published in:
NeurIPS (2018)
Keyphrases
dynamic programming
reinforcement learning
Markov decision processes
worst case
planning problems
real time
data structure
search algorithm
theoretical analysis