ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models.
Ziniu Li
Tian Xu
Yushun Zhang
Yang Yu
Ruoyu Sun
Zhi-Quan Luo
Published in:
CoRR (2023)
Keyphrases
language model
probabilistic model
similarity measure
document retrieval
information retrieval
reinforcement learning
classification accuracy
error rate
n gram
statistical model
weighting scheme
translation model
smoothing methods