CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee.
Tengyu Xu, Yingbin Liang, Guanghui Lan
Published in: ICML (2021)
Keyphrases
- reinforcement learning
- stochastic approximation
- function approximation
- learning algorithm
- reinforcement learning algorithms
- machine learning
- policy search
- multi agent
- optimal policy
- neural network
- global convergence
- robot control
- initial conditions
- temporal difference
- transfer learning
- state space
- information retrieval
- model free
- partially observable
- optimal control
- Markov decision process
- faster convergence
- function approximators
- multi agent reinforcement learning
- database
- number of iterations required
- robotic control