Theoretical Hardness and Tractability of POMDPs in RL with Partial Hindsight State Information.

Ming Shi, Yingbin Liang, Ness B. Shroff

Published in: CoRR (2023)

Keyphrases

state information
reinforcement learning
state space
action space
Markov decision processes
computational complexity
NP-complete
partially observable
action selection
partially observable Markov decision processes
optimal policy
action models
belief state
dynamic programming
constraint satisfaction
multi-agent
orders of magnitude
heuristic search
policy gradient
Markov decision process
machine learning