Anchor-Changing Regularized Natural Policy Gradient for Multi-Objective Reinforcement Learning.
Ruida Zhou, Tao Liu, Dileep M. Kalathil, P. R. Kumar, Chao Tian
Published in: CoRR (2022)
Keyphrases
- policy gradient
- multi objective
- reinforcement learning
- actor critic
- function approximation
- reinforcement learning algorithms
- policy search
- optimization algorithm
- objective function
- evolutionary algorithm
- optimal control
- gradient method
- model free reinforcement learning
- genetic algorithm
- policy gradient methods
- particle swarm optimization
- reinforcement learning methods
- least squares
- average reward
- function approximators
- machine learning
- Markov decision processes
- state action
- state space
- partially observable Markov decision processes
- approximation methods
- variance reduction
- temporal difference learning
- RL algorithms
- learning algorithm
- neural network
- multi agent systems
- transfer learning
- approximate dynamic programming
- multi agent
- control system
- policy iteration
- single agent