Safe Q-Learning Method Based on Constrained Markov Decision Processes.

Yangyang Ge, Fei Zhu, Xinghong Ling, Quan Liu
Published in: IEEE Access (2019)
Keyphrases
  • Markov decision processes
  • state space
  • dynamic programming
  • reinforcement learning
  • optimal policy
  • objective function
  • finite state
  • convergence rate
  • real valued
  • model free
  • average cost
  • average reward