Policy-regularized Offline Multi-objective Reinforcement Learning.
Qian Lin, Chao Yu, Zongkai Liu, Zifan Wu
Published in: AAMAS (2024)
Keyphrases
- simulated annealing
- multi objective
- evolutionary algorithm
- reinforcement learning
- optimal policy
- genetic algorithm
- policy search
- markov decision process
- multi objective optimization
- action selection
- optimization algorithm
- partially observable environments
- particle swarm optimization
- markov decision processes
- policy evaluation
- optimization problems
- partially observable
- policy gradient
- function approximation
- reinforcement learning problems
- least squares
- objective function
- actor critic
- reward function
- action space
- state space
- state and action spaces
- function approximators
- policy iteration
- control policies
- multiple objectives
- reinforcement learning algorithms
- control policy
- conflicting objectives
- approximate dynamic programming
- differential evolution
- rl algorithms
- optimal control
- dynamic programming
- state action
- markov decision problems
- inverse reinforcement learning
- partially observable markov decision processes
- model free
- infinite horizon
- pareto optimal
- nsga ii
- multi objective optimization problems
- finite state
- average reward
- state dependent
- optimum design
- partially observable domains
- real time
- policy gradient methods
- decision problems
- sufficient conditions
- multi agent
- neural network
- partially observable markov decision process
- continuous state
- trade off
- learning algorithm
- agent receives