Bandit Learning with Delayed Impact of Actions.

Wei Tang, Chien-Ju Ho, Yang Liu

Published in: NeurIPS (2021)

Keyphrases

learning process
learning problems
learning algorithm
neural network
learning scheme
active learning
intelligent behavior
state space
supervised learning
knowledge acquisition
decision theoretic
learning systems
background knowledge
mobile learning
learning mechanisms
goal directed
data sets
simulated robot
activity recognition
unsupervised learning
markov chain
online learning
prior knowledge
reinforcement learning