Scalable Online Planning via Reinforcement Learning Fine-Tuning.
Arnaud Fickinger, Hengyuan Hu, Brandon Amos, Stuart Russell, Noam Brown
Published in: CoRR (2021)
Keyphrases
- fine tuning
- reinforcement learning
- viable alternative
- fine tune
- fine tuned
- real time
- function approximation
- decision support
- web scale
- planning problems
- online learning
- state space
- travel planning
- reinforcement learning algorithms
- mixed initiative
- deterministic domains
- partially observable domains
- action selection
- heuristic search
- decision making
- search engine
- machine learning
- single agent
- partially observable markov decision processes
- planning process
- model free
- partial observability
- dynamic programming
- stochastic domains