Self-Paced Deep Reinforcement Learning.

Pascal Klink, Carlo D'Eramo, Jan Peters, Joni Pajarinen

Published in: NeurIPS (2020)

Keyphrases

reinforcement learning
function approximation
optimal policy
temporal difference
Markov decision processes
direct policy search
reinforcement learning algorithms
model-free
state space
learning algorithm
multi-agent
control problems
machine learning
temporal difference learning
multi-agent reinforcement learning
data sets
supervised learning
active learning
database
learning problems
learning process
action space
stochastic approximation