Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes.
Miao Lu, Yifei Min, Zhaoran Wang, Zhuoran Yang
Published in: ICLR (2023)
Keyphrases
- reinforcement learning
- partially observable Markov decision processes
- continuous state
- partially observable domains
- Markov decision processes
- state space
- finite state
- partially observable
- multi-agent
- optimal policy
- partial observability
- planning under uncertainty
- partially observable environments
- hidden state
- partially observable stochastic games
- function approximation
- decision problems
- dynamic programming
- belief space
- action selection
- model-free
- stochastic domains
- decision makers