Efficient Two-Phase Offline Deep Reinforcement Learning from Preference Feedback.
Yinglun Xu
Gagandeep Singh
Published in:
CoRR (2024)
Keyphrases
reinforcement learning
data sets
databases
search algorithm
supervised learning
lightweight
database
knowledge base
data structure
learning process
computationally efficient
optimal policy
multi attribute