More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning.

Kaiwen Wang, Owen Oertell, Alekh Agarwal, Nathan Kallus, Wen Sun

Published in: CoRR (2024)

Keyphrases

reinforcement learning
upper bound
function approximation
higher order
upper and lower bounds
state space
optimal policy
machine learning
tight bounds
model-free
lower bound
learning process
error bounds
learning algorithm
robotic control
co-occurrence
dynamic programming
Markov decision processes
learning classifier systems
VC dimension
genetic algorithm
reinforcement learning algorithms
robot control
Markov decision process
real-time