Loop Estimator for Discounted Values in Markov Reward Processes.
Falcon Z. Dai
Matthew R. Walter
Published in:
AAAI (2021)
Keyphrases
average reward
Markov chain
dynamic programming
least squares
attribute values
optimality criterion
long run
discounted reward
reinforcement learning
probabilistic model
machine learning
user defined
optimal policy
Markov model
confidence intervals
business processes
Bayesian networks