Provably Robust Temporal Difference Learning for Heavy-Tailed Rewards.
Semih Cayci, Atilla Eryilmaz
Published in: CoRR (2023)
Keyphrases
- temporal difference learning
- heavy-tailed
- reinforcement learning
- function approximation
- fixed point
- evaluation function
- Markov decision processes
- game playing
- temporal difference
- probability distribution
- worst case
- random variables
- Markov decision process
- function approximators
- support vector
- generalized Gaussian
- machine learning