SoftTreeMax: Policy Gradient with Tree Search.
Gal Dalal, Assaf Hallak, Shie Mannor, Gal Chechik
Published in: CoRR (2022)
Keyphrases
- tree search
- policy gradient
- branch and bound
- search algorithm
- reinforcement learning
- search tree
- optimal control
- constraint propagation
- reinforcement learning algorithms
- gradient method
- state space
- function approximation
- path finding
- mathematical programming
- approximation methods
- single agent
- variance reduction
- game tree
- orders of magnitude
- reinforcement learning methods
- average reward
- partially observable Markov decision processes
- search space
- convergence rate
- Monte Carlo
- model free
- simulated annealing
- neural network