Learning Neural Contextual Bandits through Perturbed Rewards.
Yiling Jia, Weitong Zhang, Dongruo Zhou, Quanquan Gu, Hongning Wang
Published in: ICLR (2022)
Keyphrases
- learning systems
- reinforcement learning
- learning process
- knowledge acquisition
- data sets
- neural network
- multi-armed bandits
- incremental learning
- learning tasks
- supervised learning
- prior knowledge
- case study
- contextual information
- mobile learning
- decision trees
- learning algorithm
- learning rules
- motor control
- machine learning