A Reduction from Reinforcement Learning to No-Regret Online Learning.

Ching-An Cheng, Remi Tachet des Combes, Byron Boots, Geoffrey J. Gordon

Published in: CoRR (2019)

Keyphrases

online learning
reinforcement learning
computer mediated
higher education
online course
e learning
dynamic programming
distance education
online algorithms
model-free
function approximation
Markov decision processes
active learning
distance learning
learning algorithm
transfer learning
state space
online learning environments
temporal difference
neural network
online convex optimization
blended learning
optimal control
learning process
machine learning
learning tasks
action space
multi-agent