Exploration with Multi-Sample Target Values for Distributional Reinforcement Learning.
Michael Teng, Michiel van de Panne, Frank Wood
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- active exploration
- action selection
- co-occurrence
- function approximation
- exploration strategy
- attribute values
- reinforcement learning algorithms
- feature values
- learning process
- genetic algorithm
- sample size
- user-defined
- moving target
- exploration-exploitation tradeoff
- exploration-exploitation
- model-free
- target detection
- optimal control
- state space
- active learning