Sampling Based Approaches for Minimizing Regret in Uncertain Markov Decision Processes (MDPs).
Asrar Ahmed, Pradeep Varakantham, Meghna Lowalekar, Yossiri Adulyasak, Patrick Jaillet
Published in: J. Artif. Intell. Res. (2017)
Keyphrases
- markov decision processes
- finite state
- policy iteration
- state space
- optimal policy
- reinforcement learning
- dynamic programming
- decision theoretic planning
- reward function
- planning under uncertainty
- transition matrices
- factored mdps
- finite horizon
- model based reinforcement learning
- average reward
- partially observable
- reachability analysis
- average cost
- infinite horizon
- action sets
- reinforcement learning algorithms
- markov decision process
- total reward
- action space
- expected reward
- continuous state spaces
- semi markov decision processes
- state and action spaces
- probabilistic planning
- markov games
- decision making