Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees.
Sharan Vaswani
Amirreza Kazemi
Reza Babanezhad
Nicolas Le Roux
Published in:
CoRR (2023)
Keyphrases
function approximation
theoretical guarantees
actor critic
policy iteration
temporal difference
reinforcement learning
model free
policy gradient
Markov decision processes
temporal difference learning
worst case
reinforcement learning algorithms
decision making
learning tasks
approximate dynamic programming
radial basis function
optimal policy
decision problems
gradient method
optimal control
least squares
upper bound
function approximators
artificial neural networks
average reward