Near Minimax-Optimal Distributional Temporal Difference Algorithms and The Freedman Inequality in Hilbert Spaces.
Yang Peng
Liangyu Zhang
Zhihua Zhang
Published in:
CoRR (2024)
Keyphrases
worst case
temporal difference
evaluation function
td learning
reinforcement learning
function approximation
model free
data mining
machine learning algorithms
hilbert spaces
dynamic programming