Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-Free RL.
Ruiquan Huang, Jing Yang, Yingbin Liang
Published in: ICLR (2023)
Keyphrases
- sample complexity
- reinforcement learning
- learning problems
- learning algorithm
- supervised learning
- theoretical analysis
- VC dimension
- upper bound
- PAC learning
- special case
- active learning
- sequential decision problems
- lower bound
- generalization error
- action selection
- training examples
- concept classes
- number of irrelevant features
- sample size
- Markov decision processes
- optimal policy
- multi-agent
- learning tasks
- kernel methods
- data sets
- small number
- bandit problems
- active exploration
- feature extraction
- machine learning