Aligning Human Preferences with Baseline Objectives in Reinforcement Learning.

Daniel Marta, Simon Holk, Christian Pek, Jana Tumova, Iolanda Leite

Published in: ICRA (2023)

Keyphrases

reinforcement learning
multiple objectives
decision making
multi-agent
image registration
user preferences
relative improvement
human subjects
function approximation
neural network
temporal difference
transition model
human users
multi-attribute
computational models
learning problems
Markov decision processes
state space