Offline Reinforcement Learning: Fundamental Barriers for Value Function Approximation.

Dylan J. Foster, Akshay Krishnamurthy, David Simchi-Levi, Yunzong Xu

Published in: COLT (2022)

Keyphrases

reinforcement learning
temporal difference
state space
temporal difference learning
function approximation
approximate dynamic programming
function approximators
real time
model free
state action
reinforcement learning algorithms
action selection
basis functions
Markov games
Markov decision processes
optimal policy
optimal control
linear combination
supervised learning
partially observable
control problems
robot control
dynamic programming
active learning
machine learning
reinforcement learning methods
stochastic approximation
neural network
database