Bayesian Reward Models for LLM Alignment.

Adam X. Yang, Maxime Robeyns, Thomas Coste, Jun Wang, Haitham Bou-Ammar, Laurence Aitchison

Published in: CoRR (2024)

Keyphrases

statistical models
reinforcement learning
experimental data
data sets
genetic algorithm
Bayesian networks
multi-agent systems
pairwise
prior knowledge
probabilistic model
statistical methods
mathematical models