Bounding the optimal value function in compositional reinforcement learning.
Jacob Adamczyk, Volodymyr Makarenko, Argenis Arriojas, Stas Tiomkin, Rahul V. Kulkarni
Published in: UAI (2023)
Keyphrases
- reinforcement learning
- dynamic programming
- optimal control
- control policy
- function approximators
- piecewise linear
- state space
- upper bound
- function approximation
- approximate dynamic programming
- semi infinite programming
- lp norm
- optimal strategy
- worst case
- active learning
- learning algorithm
- model free
- optimal design
- markov decision processes
- average reward
- optimal solution
- support vector