Improving Reward Models with Synthetic Critiques.
Zihuiwen Ye
Fraser Greenlee-Scott
Max Bartolo
Phil Blunsom
Jon Ander Campos
Matthias Gallé
Published in:
CoRR (2024)
Keyphrases
real world
probabilistic model
parameter estimation
random fields
data mining
machine learning
decision making
knowledge base
video sequences
artificial neural networks
prior knowledge
language model
model selection
experimental data
real scenes