Learning in POMDPs is Sample-Efficient with Hindsight Observability.

Jonathan N. Lee, Alekh Agarwal, Christoph Dann, Tong Zhang

Published in: CoRR (2023)

Keyphrases

reinforcement learning
learning algorithm
learning process
online learning
learning tasks
prior knowledge
learning systems
learning problems
partially observable
data sets
neural network
multi-agent
sufficient conditions
unsupervised learning
background knowledge
predictive state representations