Constrained Policy Gradient Method for Safe and Fast Reinforcement Learning: a Neural Tangent Kernel Based Approach.
Balázs Varga, Balázs Kulcsár, Morteza Haghir Chehreghani
Published in: CoRR (2021)
Keyphrases
- gradient method
- actor critic
- policy gradient
- reinforcement learning
- convergence rate
- optimal policy
- negative matrix factorization
- optimization methods
- step size
- neural network
- support vector
- action selection
- reinforcement learning algorithms
- kernel methods
- state space
- rl algorithms
- markov decision process
- machine learning
- temporal difference
- multiple kernel learning
- markov decision processes
- feature space
- action space
- control policy
- text categorization
- approximate dynamic programming
- fitted q iteration