Information Gathering and Reward Exploitation of Subgoals for POMDPs.
Hang Ma, Joelle Pineau
Published in: AAAI (2015)
Keyphrases
- information gathering
- reinforcement learning
- expected reward
- policy gradient
- resource bounded
- average reward
- information fusion
- markov decision processes
- partially observable markov decision processes
- decision making
- belief state
- reward function
- optimal policy
- partially observable
- state space
- bayes risk
- predictive state representations
- decision process
- state action
- distributed constraint optimization
- model free
- finite state
- genetic programming
- search algorithm
- multi agent
- point based value iteration