Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs.
Junkai Zhang
Weitong Zhang
Quanquan Gu
Published in:
CoRR (2023)
Keyphrases
reinforcement learning
Markov decision processes
closed form
average reward
semi-infinite programming
reward function
optimal linear
optimal solution
dynamic programming
mixture model
long run
infinite horizon