Improved Sample Complexity for Reward-free Reinforcement Learning under Low-rank MDPs.
Yuan Cheng, Ruiquan Huang, Yingbin Liang, Jing Yang
Published in: ICLR (2023)
Keyphrases
- reinforcement learning
- low rank
- sample complexity
- learning problems
- learning algorithm
- supervised learning
- markov decision processes
- missing data
- convex optimization
- linear combination
- matrix factorization
- state space
- theoretical analysis
- singular value decomposition
- function approximation
- upper bound
- semi supervised
- kernel matrix
- generalization error
- reward function
- high order
- active learning
- optimal policy
- high dimensional data
- reinforcement learning algorithms
- special case
- machine learning
- transfer learning
- lower bound
- markov decision process
- dynamic programming
- vc dimension
- model free
- learning process
- partially observable
- sample size
- markov decision problems
- training data
- unsupervised learning
- higher order
- feature selection
- pairwise
- temporal difference
- knn
- action space