VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation.

Thanh Nguyen-Tang, Raman Arora

Published in: ICLR (2023)

Keyphrases

function approximation
reinforcement learning
learning algorithm
model free
mountain car
search space
dynamic programming
function approximators
temporal difference learning
actor critic
supervised learning
learning experience
evaluation function
TD learning