WARP: On the Benefits of Weight Averaged Rewarded Policies.

Alexandre Ramé, Johan Ferret, Nino Vieillard, Robert Dadashi, Léonard Hussenot, Pierre-Louis Cedoz, Pier Giuseppe Sessa, Sertan Girgin, Arthur Douillard, Olivier Bachem
Published in: CoRR (2024)
Keyphrases
  • optimal policy
  • data mining
  • search algorithm
  • weighting scheme
  • control policies
  • neural network
  • artificial intelligence
  • reinforcement learning
  • video sequences
  • weight function