Scalable Online Planning via Reinforcement Learning Fine-Tuning.

Arnaud Fickinger, Hengyuan Hu, Brandon Amos, Stuart J. Russell, Noam Brown

Published in: NeurIPS (2021)

Keyphrases

fine tuning
reinforcement learning
fine tune
viable alternative
online learning
fine tuned
heuristic search
planning problems
action selection
real time
decision theoretic
reward shaping
deterministic domains
macro actions
partially observable
function approximation
complex domains
learning process
temporal difference
blocks world
decision support
mobile robot
multi agent
learning algorithm