Combining information-seeking exploration and reward maximization: Unified inference on continuous state and action spaces under partial observability.
Parvin Malekzadeh, Konstantinos N. Plataniotis
Published in: CoRR (2022)
Keyphrases
- information seeking
- partial observability
- reinforcement learning
- continuous state and action spaces
- action selection
- partially observable
- learning agent
- information retrieval
- partially observable markov decision processes
- information resources
- markov decision process
- state space
- optimal policy
- dynamic programming
- reinforcement learning algorithms
- belief state
- learning algorithm
- function approximation
- temporal difference
- reward function
- markov decision processes
- planning problems
- hidden state
- learning process
- knowledge base
- bayesian networks
- markov decision problems
- function approximators
- objective function
- domain specific
- dynamic environments
- dynamical systems
- incomplete information
- model free
- optimal control