Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees.
Sharan Vaswani, Amirreza Kazemi, Reza Babanezhad, Nicolas Le Roux
Published in: CoRR (2023)
Keyphrases
- function approximation
- theoretical guarantees
- actor critic
- policy iteration
- temporal difference
- reinforcement learning
- model free
- policy gradient
- Markov decision processes
- temporal difference learning
- worst case
- reinforcement learning algorithms
- decision making
- learning tasks
- approximate dynamic programming
- radial basis function
- optimal policy
- decision problems
- gradient method
- optimal control
- least squares
- upper bound
- function approximators
- artificial neural networks
- average reward