Projection-Based Fast and Safe Policy Optimization for Reinforcement Learning.
Shijun Lin, Hao Wang, Ziyang Chen, Zhen Kan
Published in: ICRA (2024)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- state space
- partially observable
- Markov decision process
- global optimization
- partially observable domains
- multi-agent
- optimization algorithm
- reward function
- optimization methods
- learning algorithm
- optimization problems
- learning process
- actor-critic
- Markov decision problems
- action space
- supervised learning
- policy gradient
- control policies
- RL algorithms
- average reward
- optimization method
- reinforcement learning algorithms
- transfer learning
- action selection
- optimization process
- machine learning
- Markov decision processes
- learning tasks