An efficient algorithm for learning with semi-bandit feedback
Gergely Neu, Gábor Bartók
Published in: CoRR (2013)
Keyphrases
- learning algorithm
- theoretical analysis
- cost function
- dynamic programming
- search space
- matching algorithm
- incremental learning
- learning scheme
- learning speed
- learning phase
- computational cost
- learning tasks
- learning process
- prior knowledge
- NP-hard
- experimental evaluation
- objective function
- simulated annealing
- similarity measure
- times faster
- learning rules
- recognition algorithm
- neural network
- upper confidence bound
- detection algorithm
- optimization algorithm
- computationally efficient
- expectation maximization
- high accuracy
- active learning
- association rules
- search algorithm
- reinforcement learning