Cooperative Actor-Critic via TD Error Aggregation.
Martin Figura, Yixuan Lin, Ji Liu, Vijay Gupta
Published in: CoRR (2022)
Keyphrases
- actor critic
- temporal difference
- cooperative
- reinforcement learning
- reinforcement learning algorithms
- evaluation function
- function approximation
- Monte Carlo
- policy gradient
- temporal difference learning
- model free
- learning algorithm
- optimal control
- multi agent systems
- action selection
- multi agent
- Markov decision processes
- approximate dynamic programming
- step size
- game theory
- gradient method
- variance reduction
- state space
- convergence speed
- neuro fuzzy
- policy iteration
- function approximators
- reinforcement learning methods
- sparse representation