Learning Fair Policies in Multi-Objective (Deep) Reinforcement Learning with Average and Discounted Rewards.
Umer Siddique, Paul Weng, Matthieu Zimmer
Published in: ICML (2020)
Keyphrases
- reinforcement learning
- multi-objective
- Markov decision processes
- optimal policy
- learning algorithm
- supervised learning
- state space
- Markov decision process
- evolutionary algorithm
- policy search
- dynamic programming
- learning process
- macro actions
- machine learning
- reinforcement learning agents
- state abstraction
- neural network
- hierarchical reinforcement learning
- total reward
- deep learning
- partially observable
- action selection
- model-free
- learning problems
- decision problems
- particle swarm optimization