Learning Policies with Zero or Bounded Constraint Violation for Constrained MDPs.
Tao Liu
Ruida Zhou
Dileep Kalathil
Panganamala R. Kumar
Chao Tian
Published in:
NeurIPS (2021)
Keyphrases
reinforcement learning
learning process
learning algorithm
learning systems
markov decision problems
e learning
bayesian networks
multi agent systems
prior knowledge
active learning
markov chain
optimal policy
markov decision processes
learning tasks
partially observable
policy search