Weakly Chained Matrices, Policy Iteration, and Impulse Control.
Parsiad Azimzadeh
Peter A. Forsyth
Published in:
SIAM J. Numer. Anal. (2016)
Keyphrases
policy iteration
Markov decision processes
optimal control
least squares
model free
optimal policy
fixed point
sample path
reinforcement learning
control system
temporal difference
policy evaluation
state space
linear programming
dynamical systems
Markov decision process