When is Agnostic Reinforcement Learning Statistically Tractable?

Zeyu Jia, Gene Li, Alexander Rakhlin, Ayush Sekhari, Nathan Srebro

Published in: CoRR (2023)

Keyphrases

reinforcement learning
function approximation
NP-complete
state space
optimal policy
reinforcement learning algorithms
model-free
control problems
computational complexity
Markov decision processes
learning algorithm
NP-hard
action selection
temporal difference learning
computational problems
Markov decision process
stochastic approximation
reinforcement learning methods
database
temporal difference
optimal control
least squares
multi-agent
neural network
real time