Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees.
Sharan Vaswani, Amirreza Kazemi, Reza Babanezhad Harikandeh, Nicolas Le Roux
Published in: NeurIPS (2023)
Keyphrases
- function approximation
- theoretical guarantees
- actor critic
- policy iteration
- temporal difference
- reinforcement learning
- policy gradient
- model free
- reinforcement learning algorithms
- Markov decision processes
- decision making
- worst case
- temporal difference learning
- learning tasks
- approximate dynamic programming
- optimal policy
- function approximators
- radial basis function
- state space
- decision problems
- least squares
- optimal control
- Markov chain
- artificial neural networks
- machine learning
- linear programming
- average reward