Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs.

Junkai Zhang, Weitong Zhang, Quanquan Gu
Published in: CoRR (2023)
Keyphrases
  • reinforcement learning
  • Markov decision processes
  • closed form
  • average reward
  • semi-infinite programming
  • reward function
  • optimal linear
  • optimal solution
  • dynamic programming
  • mixture model
  • long run
  • infinite horizon