Linear-Quadratic Mean-Field Reinforcement Learning: Convergence of Policy Gradient Methods.
René Carmona, Mathieu Laurière, Zongjun Tan
Published in: CoRR (2019)
Keyphrases
- policy gradient methods
- linear quadratic
- optimal control
- reinforcement learning
- policy gradient
- actor critic
- natural actor critic
- Markov random field
- dynamical systems
- function approximation
- robot arm
- vector valued
- closed loop
- function approximators
- reinforcement learning algorithms
- convergence rate
- reinforcement learning problems
- EM algorithm
- dynamic programming
- state space
- real time
- RL algorithms
- temporal difference
- convergence speed
- action selection
- pairwise
- multi agent
- control strategy
- least squares