Bandit problems and the exploration/exploitation tradeoff.
William G. Macready, David H. Wolpert
Published in: IEEE Trans. Evol. Comput. (1998)
Keyphrases
- bandit problems
- exploration/exploitation tradeoff
- objective function
- relevance feedback
- multi-armed bandits
- reinforcement learning
- function approximation
- decision problems
- active learning
- optimal solution
- influence diagrams
- decision makers
- expected utility
- Markov chain
- data mining
- artificial neural networks
- artificial intelligence
- information retrieval