Logarithmic regret bounds for continuous-time average-reward Markov decision processes.

Xuefeng Gao, Xun Yu Zhou
Published in: CoRR (2022)