Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon.

Zihan Zhang, Xiangyang Ji, Simon S. Du

Published in: COLT (2021)

Keyphrases

learning algorithm
k-means
experimental evaluation
reinforcement learning
times faster
detection algorithm
high accuracy
dynamic programming
cost function
expectation maximization
optimization algorithm
recognition algorithm
computational cost
high dimensional
preprocessing
optimal solution
model free
stochastic approximation
reinforcement learning algorithms
optimal or near optimal
hybrid algorithm
ant colony optimization
segmentation algorithm
linear programming
simulated annealing