A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound.

Gal Dalal, Balázs Szörényi, Gugan Thoppe

Published in: AAAI (2020)

Keyphrases

reinforcement learning
state and action spaces
upper bound
function approximation
lower bound
worst case
model free
state space
reinforcement learning algorithms
Markov decision processes
machine learning
real valued functions
multi agent
dynamic programming
temporal difference
data sets
temporal difference learning
action space
learning algorithm
optimal policy
supervised learning
finite number
action selection
real time
finite automata
least squares
reinforcement learning methods
decision trees
support vector