The Limits of Pure Exploration in POMDPs: When the Observation Entropy is Enough.
Riccardo Zamboni, Duilio Cirino, Marcello Restelli, Mirco Mutti. Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- information-theoretic
- information theory
- partially observable Markov decision processes
- Markov decision processes
- information entropy
- belief state
- continuous state
- mutual information
- information content
- model-based reinforcement learning
- finite state
- Dec-POMDPs
- point-based value iteration
- real time
- exploration strategy
- belief space
- optimal policy
- feature selection
- neural network