Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks.

Julia Kreutzer, Stefan Riezler, Carolin Lawrence
Published in: SPNLP@ACL-IJCNLP (2021)
Keyphrases
  • real world
  • reinforcement learning
  • artificial intelligence
  • data mining
  • data sets
  • information retrieval
  • wide range
  • multi task
  • decision making
  • case study
  • human interaction
  • long sequences