Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees.
Kishan Panaganti Badrinath, Dileep Kalathil
Published in: ICML (2021)
Keyphrases
- reinforcement learning
- model free
- reinforcement learning algorithms
- temporal difference
- function approximation
- reinforcement learning methods
- policy iteration
- multi agent
- state space
- optimal policy
- continuous state spaces
- partially observable
- partial occlusion
- Markov decision processes
- supervised learning
- nearest neighbor
- dynamic programming
- control system