A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound.

Gal Dalal, Balázs Szörényi, Gugan Thoppe

Published in: CoRR (2019)

Keyphrases

reinforcement learning
state and action spaces
upper bound
lower bound
Markov decision processes
function approximation
machine learning
unit length
worst case
real valued functions
reinforcement learning algorithms
multi agent
state space
VC dimension
neural network
temporal difference learning
finite number
error bounds
optimal control
Markov decision process
learning agents
learning process
reinforcement learning methods
optimal policy
policy search
learning algorithm
dynamic programming