Multi-Objective MDPs with Conditional Lexicographic Reward Preferences.
Kyle Hollins Wray, Shlomo Zilberstein, Abdel-Illah Mouaddib
Published in: AAAI (2015)
Keyphrases
- multi objective
- reinforcement learning
- multiple objectives
- markov decision processes
- average reward
- reward function
- preference models
- multiple criteria
- evolutionary algorithm
- multi objective optimization
- optimization algorithm
- cp nets
- objective function
- state space
- optimal policy
- genetic algorithm
- particle swarm optimization
- nsga ii
- preference elicitation
- multi attribute
- multi criteria
- discounted reward
- pareto optimal
- multiple agents
- factored mdps
- user preferences
- multi objective optimization problems
- conflicting objectives
- soft constraints
- partially observable
- markov decision process
- expected reward
- reinforcement learning algorithms
- policy iteration
- multi objective evolutionary algorithms
- multicriteria optimization
- multi objective evolutionary
- preference relations
- decision making
- finite horizon
- markov decision problems
- decision diagrams
- model free
- markov chain
- linear programming
- least squares
- dynamic programming
- learning algorithm