Learning Fair Policies in Multiobjective (Deep) Reinforcement Learning with Average and Discounted Rewards.
Umer Siddique, Paul Weng, Matthieu Zimmer
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- optimal policy
- Markov decision processes
- multi objective
- learning process
- supervised learning
- Markov decision process
- learning algorithm
- learning capabilities
- multi agent
- policy iteration
- macro actions
- learning problems
- policy search
- reinforcement learning algorithms
- autonomous learning
- control policy
- temporal difference learning
- reinforcement learning methods
- action selection
- hierarchical reinforcement learning
- function approximation
- learning tasks
- state space
- genetic algorithm
- neural network
- deep architectures
- actor critic
- total reward
- discounted reward
- Markov decision problems
- deep learning
- multiobjective optimization
- model free
- multiple objectives
- decision problems
- dynamic programming
- machine learning