Deploying Offline Reinforcement Learning with Human Feedback.

Ziniu Li, Ke Xu, Liu Liu, Lanqing Li, Deheng Ye, Peilin Zhao

Published in: CoRR (2023)

Keyphrases

reinforcement learning
function approximation
human operators
relevance feedback
real time
data sets
dynamic programming
optimal policy
human interaction
reward signal
feedback mechanisms
motor skills
sensory inputs
temporal difference
evaluation function
Markov decision processes
state space
information systems