Sample-Efficient Iterative Lower Bound Optimization of Deep Reactive Policies for Planning in Continuous MDPs.
Siow Meng Low, Akshat Kumar, Scott Sanner
Published in: AAAI (2022)
Keyphrases
- lower bound
- upper bound
- optimal policy
- Markov decision processes
- dynamic programming
- state space
- optimization problems
- stochastic domains
- Markov decision process
- reactive planning
- macro actions
- Markov decision problems
- action space
- efficient computation
- finite horizon
- query plan
- lower and upper bounds
- reinforcement learning
- objective function