Multi-Objective POMDPs with Lexicographic Reward Preferences.
Kyle Hollins Wray, Shlomo Zilberstein
Published in: IJCAI (2015)
Keyphrases
- multi objective
- reinforcement learning
- multiple objectives
- preference models
- multiple criteria
- expected reward
- multi objective optimization
- evolutionary algorithm
- policy gradient
- optimization algorithm
- partially observable markov decision processes
- user preferences
- markov decision processes
- partially observable
- reward function
- average reward
- belief state
- objective function
- partially observable environments
- conflicting objectives
- multi objective optimization problems
- pareto optimal
- multi criteria
- genetic algorithm
- nsga ii
- soft constraints
- particle swarm optimization
- multi objective evolutionary
- dynamic programming
- continuous state
- state space
- preference elicitation
- multicriteria optimization
- decision making
- long run
- linear programming
- preference relations
- machine learning
- simulated annealing
- optimization problems
- search algorithm
- multi agent
- belief revision
- multi attribute
- learning algorithm