Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs.
Junkai Zhang
Weitong Zhang
Quanquan Gu
Published in:
ICML (2023)
Keyphrases
reinforcement learning
Markov decision processes
dynamic programming
state space
optimal linear
closed form
worst case
average reward
machine learning
mixture model
function approximation
piecewise linear
finite horizon