URLB: Unsupervised Reinforcement Learning Benchmark.

Michael Laskin, Denis Yarats, Hao Liu, Kimin Lee, Albert Zhan, Kevin Lu, Catherine Cang, Lerrel Pinto, Pieter Abbeel

Published in: CoRR (2021)

Keyphrases

reinforcement learning
supervised learning
unsupervised learning
semi-supervised
temporal difference
model-free
temporal difference learning
multi-agent
optimal policy
state space
data driven
supervised classification
completely unsupervised
database
dynamic programming
Markov decision processes
evaluation function
learning process
training data
information retrieval
learning capabilities
robot control
data mining
stochastic approximation
information bottleneck
multi-agent reinforcement learning
real-time