Fairness in Preference-based Reinforcement Learning.

Umer Siddique, Abhinav Sinha, Yongcan Cao

Published in: CoRR (2023)

Keyphrases

state space
reinforcement learning
reinforcement learning algorithms
Markov decision processes
optimal policy
function approximation
control problems
Markov decision process
machine learning
resource allocation
learning algorithm
game theory
temporal difference
learning process
supervised learning
learning agent
multi agent
model free
database
temporal difference learning
learning capabilities
learning problems
user preferences
real time