ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search.

Dan Zhang, Sining Zhoubian, Yisong Yue, Yuxiao Dong, Jie Tang
Published in: CoRR (2024)
Keyphrases
  • tree search
  • search algorithm
  • training set
  • nearest neighbor
  • game tree