Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-free RL.
Ruiquan Huang, Jing Yang, Yingbin Liang
Published in: CoRR (2022)
Keyphrases
- sample complexity
- reinforcement learning
- learning problems
- learning algorithm
- theoretical analysis
- supervised learning
- PAC learning
- VC dimension
- special case
- lower bound
- generalization error
- sequential decision problems
- upper bound
- active learning
- action selection
- training examples
- concept classes
- state space
- sample size
- training set
- multi-agent
- active exploration
- training data
- unsupervised learning
- machine learning algorithms
- long run
- Markov decision processes
- PAC model
- sufficient conditions
- machine learning