Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation.

Weitong Zhang, Dongruo Zhou, Quanquan Gu

Published in: NeurIPS (2021)

Keyphrases

function approximation
reinforcement learning
model based reinforcement learning
temporal difference learning algorithms
markov decision processes
function approximators
temporal difference
state space
model free
policy gradient
reinforcement learning algorithms
markov decision problems
reward function
learning process
radial basis function
neural network
machine learning
partially observable markov decision processes
learning tasks
supervised learning
dynamic programming
multi agent