Uncertainty-Aware Reward-Free Exploration with General Function Approximation.

Junkai Zhang, Weitong Zhang, Dongruo Zhou, Quanquan Gu

Published in: CoRR (2024)

Keyphrases

temporal difference learning
function approximation
reinforcement learning
temporal difference
temporal difference learning algorithms
learning tasks
function approximators
radial basis function
model free
state space
markov decision processes
policy gradient
multi agent
k nearest neighbor