D2PO: Discriminator-Guided DPO with Response Evaluation Models.
Prasann Singhal
Nathan Lambert
Scott Niekum
Tanya Goyal
Greg Durrett
Published in:
CoRR (2024)
Keyphrases
databases
decision trees
parameter estimation
machine learning algorithms
statistical models
data mining
machine learning
social networks
three dimensional
video sequences
complex systems
classification models
evaluation method
gold standard
evaluation methods
accurate models