RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs.

Shreyas Chaudhari, Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan, Ameet Deshpande, Bruno Castro da Silva
Published in: CoRR (2024)
Keyphrases
  • reinforcement learning
  • learning process
  • social networks
  • statistical analysis
  • neural network
  • data mining
  • website
  • image analysis
  • active learning
  • human experts
  • function approximation