What should be observed for optimal reward in POMDPs?
Alyzia-Maria Konsta
Alberto Lluch-Lafuente
Christoph Matheja
Published in:
CoRR (2024)
Keyphrases
reinforcement learning
dynamic programming
average reward
expected reward
optimal solution
partially observed
Markov decision processes
optimal control
search algorithm
worst case
optimal strategy
reward function
initially unknown