Unknown mixing times in apprenticeship and reinforcement learning.

Tom Zahavy, Alon Cohen, Haim Kaplan, Yishay Mansour

Published in: UAI (2020)

Keyphrases

reinforcement learning
state space
reinforcement learning algorithms
function approximation
initially unknown
learning process
Markov decision processes
neural network
learning algorithm
multi agent
robotic control
cognitive apprenticeship
policy search
optimal control
orders of magnitude
dynamic programming
machine learning
optimal policy
evolutionary algorithm
model free
action selection
expert systems
partially observable
robot control
information systems
information retrieval
database