Exploration with Multi-Sample Target Values for Distributional Reinforcement Learning.

Michael Teng, Michiel van de Panne, Frank Wood

Published in: CoRR (2022)

Keyphrases

reinforcement learning
active exploration
action selection
co-occurrence
function approximation
exploration strategy
attribute values
reinforcement learning algorithms
feature values
learning process
genetic algorithm
sample size
user-defined
moving target
exploration-exploitation tradeoff
exploration-exploitation
model-free
target detection
optimal control
state space
active learning