On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL.
Jinglin Chen, Aditya Modi, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal
Published in: NeurIPS (2022)
Keyphrases
- reinforcement learning
- exploration strategy
- data driven
- balancing exploration and exploitation
- data sets
- action selection
- learning algorithm
- learning process
- information theoretic
- autonomous learning
- high efficiency
- learning agent
- Markov decision processes
- exploration exploitation
- reward function
- function approximation
- evolutionary algorithm
- genetic algorithm
- machine learning