Bootstrapping LPs in Value Iteration for Multi-Objective and Partially Observable MDPs.
Diederik M. Roijers, Erwin Walraven, Matthijs T. J. Spaan
Published in: ICAPS (2018)
Keyphrases
- partially observable
- markov decision processes
- multi objective
- state space
- evolutionary algorithm
- multi objective optimization
- policy iteration
- markov decision problems
- reinforcement learning
- belief space
- dynamic programming
- partial observability
- partial observations
- infinite horizon
- partially observable markov decision processes
- optimal policy
- genetic algorithm
- belief state
- finite state
- markov decision process
- discount factor
- action models
- reinforcement learning algorithms
- reward function
- average reward
- objective function
- planning under uncertainty
- decision processes
- partially observable environments
- stochastic shortest path
- action space
- average cost
- pareto optimal
- factored mdps
- optimal control
- planning problems
- decision problems
- knowledge base