On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL.

Jinglin Chen, Aditya Modi, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal

Published in: CoRR (2022)

Keyphrases

reinforcement learning
high efficiency
balancing exploration and exploitation
statistical analysis
statistical models
exploration exploitation
neural network
information theoretic
Markov decision processes
exploration strategy
data driven
statistical information
long run
action selection
average reward