The Limits of Pure Exploration in POMDPs: When the Observation Entropy is Enough.
Riccardo Zamboni, Duilio Cirino, Marcello Restelli, Mirco Mutti. Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- information-theoretic
- information theory
- partially observable Markov decision processes
- Markov decision processes
- information entropy
- belief state
- continuous state
- mutual information
- information content
- model-based reinforcement learning
- finite state
- Dec-POMDPs
- point-based value iteration
- real time
- exploration strategy
- belief space
- optimal policy
- feature selection
- neural network