Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability.
Dibya Ghosh, Jad Rahme, Aviral Kumar, Amy Zhang, Ryan P. Adams, Sergey Levine
Published in: CoRR (2021)
Keyphrases
- partial observability
- reinforcement learning
- partially observable Markov decision processes
- belief state
- partially observable
- belief space
- Markov decision processes
- state space
- Markov decision process
- fully observable
- planning problems
- optimal policy
- learning agent
- function approximation
- planning under uncertainty
- finite state
- dynamical systems
- planning under partial observability
- temporal difference
- partial information
- belief revision
- decision problems
- multi-agent