On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL

Jinglin Chen, Aditya Modi, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal

Published in: NeurIPS (2022)

Keyphrases

reinforcement learning
exploration strategy
data driven
balancing exploration and exploitation
data sets
action selection
learning algorithm
learning process
information theoretic
autonomous learning
high efficiency
learning agent
markov decision processes
exploration exploitation
reward function
function approximation
evolutionary algorithm
genetic algorithm
machine learning