Linear Programming for Finite State Multi-Armed Bandit Problems.
Yih Ren Chen, Michael N. Katehakis
Published in: Math. Oper. Res. (1986)
Keyphrases
- finite state
- linear programming
- multi-armed bandit problems
- Markov chain
- Markov decision processes
- linear program
- bandit problems
- model checking
- dynamic programming
- optimal policy
- average cost
- partially observable Markov decision processes
- NP-hard
- objective function
- tree automata
- optimal solution
- context-free
- decision problems
- policy iteration
- Markov decision problems
- continuous-time Bayesian networks
- reinforcement learning
- decision making
- evolutionary algorithm
- state space