Making PPO even better: Value-Guided Monte-Carlo Tree Search decoding.
Jiacheng Liu
Andrew Cohen
Ramakanth Pasunuru
Yejin Choi
Hannaneh Hajishirzi
Asli Celikyilmaz
Published in:
CoRR (2023)
Keyphrases
monte carlo tree search
monte carlo
tree search algorithm
bayesian reinforcement learning
evaluation function
search space
markov chain
optimal policy
game tree
monte carlo search