Offline Reinforcement Learning as Anti-Exploration.

Shideh Rezaeifar, Robert Dadashi, Nino Vieillard, Léonard Hussenot, Olivier Bachem, Olivier Pietquin, Matthieu Geist

Published in: CoRR (2021)

Keyphrases

reinforcement learning
active exploration
exploration strategy
action selection
model based reinforcement learning
exploration exploitation
function approximation
reinforcement learning algorithms
learning algorithm
autonomous learning
real time
machine learning
exploration exploitation tradeoff
model free
Markov decision processes
active learning
temporal difference
robotic control
state space
learning capabilities
balancing exploration and exploitation
database
real world
genetic algorithm
information systems
control policy
objective function
multi agent
transfer learning
optimal policy