Pruning Dominated Policies in Multiobjective Pareto Q-Learning.
Lawrence Mandow, José-Luis Pérez-de-la-Cruz
Published in: CAEPIA (2018)
Keyphrases
- multi objective
- optimal policy
- multiobjective optimization
- evolutionary algorithm
- multi objective optimization
- reinforcement learning
- optimization algorithm
- state space
- genetic algorithm
- hierarchical reinforcement learning
- reward function
- pareto optimal
- markov decision processes
- objective function
- cooperative
- particle swarm optimization
- discounted reward
- multi agent
- search space
- nsga ii
- conflicting objectives
- pruning method
- bi objective
- learning algorithm
- multiple objectives
- function approximation
- long run
- multiobjective genetic algorithm
- action selection
- optimum design
- markov decision problems
- control policies
- stochastic approximation
- multiobjective evolutionary algorithms
- multiobjective evolutionary algorithm
- genetic programming
- sufficient conditions
- pareto optimal solutions
- policy iteration
- learning rate
- dynamic programming
- markov decision process
- search algorithm
- infinite horizon