Near Minimax-Optimal Distributional Temporal Difference Algorithms and The Freedman Inequality in Hilbert Spaces.

Yang Peng, Liangyu Zhang, Zhihua Zhang
Published in: CoRR (2024)
Keyphrases
  • worst case
  • temporal difference
  • evaluation function
  • TD learning
  • reinforcement learning
  • function approximation
  • model free
  • data mining
  • machine learning algorithms
  • Hilbert spaces
  • dynamic programming