SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search.
Gal Dalal, Assaf Hallak, Gugan Thoppe, Shie Mannor, Gal Chechik
Published in: CoRR (2023)
Keyphrases
- tree search
- variance reduction
- policy gradient
- Monte Carlo
- search algorithm
- branch and bound
- sample size
- actor critic
- search tree
- importance sampling
- mathematical programming
- constraint propagation
- confidence intervals
- Markov chain
- state space
- path finding
- graphical models
- simulated annealing
- probability distribution
- cost function
- search space
- Bayesian networks
- genetic algorithm