Convergence results for an averaged LQR problem with applications to reinforcement learning.
Andrea Pesare, Michele Palladino, Maurizio Falcone
Published in: Math. Control. Signals Syst. (2021)
Keyphrases
- reinforcement learning
- optimal control
- stochastic approximation
- function approximation
- model free
- state space
- learning algorithm
- robotic control
- multi agent
- initial conditions
- temporal difference
- reinforcement learning algorithms
- machine learning
- faster convergence
- robot control
- learning capabilities
- convergence speed
- convergence rate
- transfer learning
- learning process
- information systems
- objective function
- temporal difference learning
- markov decision processes
- markov decision problems
- policy search