Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-free RL.

Ruiquan Huang, Jing Yang, Yingbin Liang

Published in: CoRR (2022)

Keyphrases

sample complexity
reinforcement learning
learning problems
learning algorithm
theoretical analysis
supervised learning
PAC learning
VC dimension
special case
lower bound
generalization error
sequential decision problems
upper bound
active learning
action selection
training examples
concept classes
state space
sample size
training set
multi-agent
active exploration
training data
unsupervised learning
machine learning algorithms
long run
Markov decision processes
PAC model
sufficient conditions
machine learning