Making PPO even better: Value-Guided Monte-Carlo Tree Search decoding.

Jiacheng Liu, Andrew Cohen, Ramakanth Pasunuru, Yejin Choi, Hannaneh Hajishirzi, Asli Celikyilmaz
Published in: CoRR (2023)
Keyphrases
  • Monte Carlo tree search
  • Monte Carlo
  • tree search algorithm
  • Bayesian reinforcement learning
  • evaluation function
  • search space
  • Markov chain
  • optimal policy
  • game tree
  • Monte Carlo search