WARM: On the Benefits of Weight Averaged Reward Models.

Alexandre Ramé, Nino Vieillard, Léonard Hussenot, Robert Dadashi, Geoffrey Cideron, Olivier Bachem, Johan Ferret
Published in: CoRR (2024)
Keyphrases
  • reinforcement learning
  • probabilistic model
  • complex systems
  • experimental data
  • accurate models
  • machine learning
  • database systems
  • search algorithm
  • neural network model
  • bayesian framework
  • learned models