The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning.
Kaiwen Wang, Kevin Zhou, Runzhe Wu, Nathan Kallus, Wen Sun
Published in: NeurIPS (2023)
Keyphrases
- reinforcement learning
- temporal difference learning
- loss bounds
- function approximation
- reinforcement learning algorithms
- small number
- worst case
- temporal difference
- machine learning
- state space
- nearest neighbor
- expert advice
- optimal policy
- markov decision processes
- model free
- function approximators
- dynamic programming