Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions.
Yasin Abbasi-Yadkori, Peter L. Bartlett, Varun Kanade, Yevgeny Seldin, Csaba Szepesvári
Published in: NIPS (2013)
Keyphrases
- markov decision processes
- online learning
- probability distribution
- finite state
- optimal policy
- reinforcement learning
- transition matrices
- state space
- random variables
- reachability analysis
- dynamic programming
- state transition
- decision theoretic planning
- planning under uncertainty
- average cost
- action space
- model based reinforcement learning
- policy iteration
- average reward
- partially observable
- e learning
- risk sensitive
- finite horizon
- factored mdps
- reinforcement learning algorithms
- decision processes
- markov decision process
- active learning
- semi markov decision processes
- bayesian networks
- action sets
- infinite horizon
- state and action spaces
- probabilistic planning
- partially observable markov decision processes
- decision problems