No-Regret Reinforcement Learning with Heavy-Tailed Rewards.

Vincent Zhuang, Yanan Sui

Published in: CoRR (2021)

Keyphrases

heavy tailed
reinforcement learning
reward function
total reward
Markov decision processes
reinforcement learning algorithms
bandit problems
state space
generalized Gaussian
heavy tails
optimal policy
learning algorithm
average reward
information theory
reward signal