Theoretical Hardness and Tractability of POMDPs in RL with Partial Hindsight State Information.
Ming Shi, Yingbin Liang, Ness B. Shroff
Published in: CoRR (2023)
Keyphrases
- state information
- reinforcement learning
- state space
- action space
- Markov decision processes
- computational complexity
- NP-complete
- partially observable
- action selection
- partially observable Markov decision processes
- optimal policy
- action models
- belief state
- dynamic programming
- constraint satisfaction
- multi-agent
- orders of magnitude
- heuristic search
- policy gradient
- Markov decision process
- machine learning