Adaptive exploration policy for exploration-exploitation tradeoff in continuous action control optimization.
Min Li, Tianyi Huang, William Zhu
Published in: Int. J. Mach. Learn. Cybern. (2021)
Keyphrases
- exploration exploitation tradeoff
- reinforcement learning
- function approximation
- continuous action
- objective function
- policy search
- relevance feedback
- action selection
- continuous state
- control policies
- control system
- control problems
- action space
- optimal policy
- partially observable markov decision processes
- control policy
- image retrieval
- monte carlo
- control strategy
- optimal control
- dynamic programming