Global Linear Convergence of Online Reinforcement Learning for Partially Observable Systems.

Takumi Hirai, Tomonori Sadamoto

Published in: ECC (2022)

Keyphrases

partially observable
reinforcement learning
state space
partial observability
Markov decision processes
hidden state
partially observable domains
partially observable environments
Markov decision problems
dynamical systems
belief state
complex systems
infinite horizon
decision problems
action models
initially unknown
fully observable
partial observations
optimal policy
NP-hard