Multi-objective Reinforcement Learning with Path Integral Policy Improvement.
Ryo Ariizumi, Hayato Sago, Toru Asai, Shun-Ichi Azuma
Published in: SICE (2023)
Keyphrases
- multi objective
- reinforcement learning
- optimal policy
- evolutionary algorithm
- policy search
- markov decision process
- multi objective optimization
- action selection
- optimization algorithm
- objective function
- multiple objectives
- function approximation
- partially observable
- actor critic
- function approximators
- control policy
- genetic algorithm
- markov decision processes
- partially observable domains
- control policies
- policy gradient
- state and action spaces
- reinforcement learning problems
- particle swarm optimization
- multi objective optimization problems
- partially observable environments
- policy iteration
- significant improvement
- dynamic programming
- reinforcement learning algorithms
- conflicting objectives
- shortest path
- action space
- multi objective evolutionary algorithms
- approximate dynamic programming
- exploration exploitation tradeoff
- machine learning
- partially observable markov decision processes
- reward function
- model free
- nsga ii
- state space
- learning algorithm
- state dependent
- state action
- rl algorithms
- temporal difference
- infinite horizon
- decision problems
- learning process
- trade off
- agent receives
- neural network