Learning parametric policies and transition probability models of Markov decision processes from data.
Tingting Xu
Henghui Zhu
Ioannis Ch. Paschalidis
Published in:
Eur. J. Control (2021)
Keyphrases
Markov decision processes
prior knowledge
reinforcement learning
probability distribution
optimal policy
probability models
macro actions
learning algorithm
training data
average cost
Markov decision problems
decision theoretic planning
hierarchical reinforcement learning