Policy composition in reinforcement learning via multi-objective policy optimization.
Shruti Mishra, Ankit Anand, Jordan Hoffmann, Nicolas Heess, Martin A. Riedmiller, Abbas Abdolmaleki, Doina Precup
Published in: CoRR (2023)
Keyphrases
- multi objective
- optimal policy
- reinforcement learning
- optimization algorithm
- policy search
- evolutionary optimization
- dynamic programming
- partially observable environments
- state space
- evolutionary algorithm
- learning algorithm
- Markov decision process
- reward function
- action selection
- multiple objectives
- model free
- state dependent
- Pareto optimal
- long run
- partially observable
- multi objective optimization
- policy iteration
- control policy
- trade off
- policy evaluation
- actor critic
- multi agent
- reinforcement learning problems
- particle swarm optimization