RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences.

Jie Cheng, Gang Xiong, Xingyuan Dai, Qinghai Miao, Yisheng Lv, Fei-Yue Wang

Published in: CoRR (2024)

Keyphrases

reinforcement learning
user preferences
noisy environments
state space
function approximation
neural network
learning process
preference relations
temporal difference learning
data mining
multi agent
collaborative filtering
computationally efficient
missing data
preference models
erroneous data