Learning Policies with Zero or Bounded Constraint Violation for Constrained MDPs.

Tao Liu, Ruida Zhou, Dileep Kalathil, P. R. Kumar, Chao Tian

Published in: CoRR (2021)

Keyphrases

learning process
learning systems
learning algorithm
reinforcement learning
supervised learning
Bayesian networks
online learning
stochastic domains
active learning
learning tasks