Meta attention for Off-Policy Actor-Critic.
Jiateng Huang, Wanrong Huang, Long Lan, Dan Wu
Published in: Neural Networks (2023)
Keyphrases
- actor critic
- reinforcement learning
- optimal control
- gradient method
- policy gradient
- temporal difference
- neural network
- function approximation
- average reward
- approximate dynamic programming
- neuro fuzzy
- cost function
- dynamic programming
- state space
- sufficient conditions
- machine learning algorithms
- reinforcement learning algorithms