Deep Reinforcement Learning for Bandit Arm Localization.
Wenbin Du, Huaqing Jin, Chao Yu, Guosheng Yin
Published in: Big Data (2022)
Keyphrases
- reinforcement learning
- multi-armed bandit problems
- multi-armed bandit
- state space
- function approximation
- learning algorithm
- robotic control
- accurate localization
- robotic arm
- object localization
- optimal policy
- random sampling
- neural network
- dynamic programming
- exploration-exploitation
- bandit problems
- source localization
- localization method
- machine learning
- reinforcement learning methods
- temporal difference learning
- deep learning
- robot control
- learning capabilities
- action selection
- model-free
- Markov chain
- supervised learning
- reinforcement learning algorithms
- Markov decision processes
- upper bound