Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks.
Julia Kreutzer
Stefan Riezler
Carolin Lawrence
Published in:
SPNLP@ACL-IJCNLP (2021)
Keyphrases
real world
reinforcement learning
artificial intelligence
data mining
data sets
information retrieval
wide range
multi task
decision making
case study
human interaction
long sequences