Finite Time Analysis of Temporal Difference Learning for Mean-Variance in a Discounted MDP.
Tejaram Sangadi
Prashanth L. A.
Krishna P. Jagannathan
Published in:
CoRR (2024)
Keyphrases
temporal difference learning
Markov decision processes
Markov decision process
optimal policy
reinforcement learning
function approximation
machine learning
dynamic programming
graph cuts
finite number
game playing
approximate value iteration