A Structure-aware Online Learning Algorithm for Markov Decision Processes.
Arghyadip Roy
Vivek S. Borkar
Abhay Karandikar
Prasanna Chaporkar
Published in:
VALUETOOLS (2019)
Keyphrases
markov decision processes
learning algorithm
reinforcement learning
reinforcement learning algorithms
state space
optimal policy
finite state
transition matrices
policy iteration
risk sensitive
dynamic programming
planning under uncertainty
reachability analysis
factored mdps
infinite horizon
reward function
partially observable
action space
average reward
learning tasks
state abstraction
policy evaluation
decision processes
finite horizon
model based reinforcement learning
optimal solution
action sets
state and action spaces
decision theoretic planning
average cost
multistage