Bandit Learning with Delayed Impact of Actions.
Wei Tang, Chien-Ju Ho, Yang Liu
Published in: NeurIPS (2021)
Keyphrases
- learning process
- learning problems
- learning algorithm
- neural network
- learning scheme
- active learning
- intelligent behavior
- state space
- supervised learning
- knowledge acquisition
- decision theoretic
- learning systems
- background knowledge
- mobile learning
- learning mechanisms
- goal directed
- data sets
- simulated robot
- activity recognition
- unsupervised learning
- markov chain
- online learning
- prior knowledge
- reinforcement learning