The Regret of Exploration and the Control of Bad Episodes in Reinforcement Learning.

Victor Boone, Bruno Gaujal

Published in: ICML (2023)

Keyphrases

real robot
reinforcement learning
control problems
action selection
exploration exploitation
exploration strategy
optimal control
adaptive control
control strategies
total reward
bandit problems
model free
control system
robot control
function approximation
lower bound
machine learning
cost sensitive
optimal policy
active learning
multi agent systems
active exploration
robotic control
model based reinforcement learning