Sample Complexity Reduction via Policy Difference Estimation in Tabular Reinforcement Learning.
Adhyyan Narang, Andrew Wagenmaker, Lillian J. Ratliff, Kevin G. Jamieson
Published in: CoRR (2024)
Keyphrases
- sample complexity
- reinforcement learning
- learning problems
- policy search
- learning algorithm
- optimal policy
- supervised learning
- theoretical analysis
- VC dimension
- PAC learning
- special case
- upper bound
- active learning
- concept classes
- generalization error
- lower bound
- training examples
- sequential decision problems
- partially observable
- reward function
- Markov decision problems
- sample complexity bounds
- partially observable Markov decision processes
- Markov decision processes
- kernel methods
- sample size
- machine learning algorithms
- state space
- decision problems
- data mining
- function approximators
- training samples
- learning process
- machine learning