Uncertainty-Aware Reward-Free Exploration with General Function Approximation.
Junkai Zhang
Weitong Zhang
Dongruo Zhou
Quanquan Gu
Published in:
CoRR (2024)
Keyphrases
temporal difference learning
function approximation
reinforcement learning
temporal difference
temporal difference learning algorithms
learning tasks
function approximators
radial basis function
model free
state space
Markov decision processes
policy gradient
multi agent
k nearest neighbor