Fast Stochastic Policy Gradient: Negative Momentum for Reinforcement Learning.
Haobin Zhang, Zhuang Yang
Published in: CoRR (2024)
Keyphrases
- policy gradient
- reinforcement learning
- model free reinforcement learning
- actor critic
- function approximation
- reinforcement learning algorithms
- policy search
- learning automata
- policy gradient methods
- optimal control
- variance reduction
- gradient method
- approximation methods
- learning rate
- function approximators
- learning algorithm
- partially observable markov decision processes
- temporal difference
- model free
- monte carlo
- multi agent
- optimal policy
- reinforcement learning methods
- markov decision processes
- average reward
- state space
- neural network
- single agent
- radial basis function
- sufficient conditions
- continuous state
- control system
- multi agent systems
- machine learning