Tuning continual exploration in reinforcement learning: An optimality property of the Boltzmann strategy.
Youssef Achbany, François Fouss, Luh Yen, Alain Pirotte, Marco Saerens
Published in: Neurocomputing (2008)
Keyphrases
- reinforcement learning
- exploration strategy
- exploration exploitation
- unknown environments
- Markov decision processes
- active exploration
- search strategies
- action selection
- model free
- exhaustive search
- function approximation
- dynamic programming
- learning process
- model based reinforcement learning
- optimal solution
- data mining
- reinforcement learning algorithms
- average reward
- database
- exploration exploitation tradeoff
- optimal strategy
- selection strategy
- parameter tuning
- optimal control
- learning problems
- learning tasks
- state space
- data sets