Optimising darts strategy using Markov decision processes and reinforcement learning.
Graham Baird
Published in: J. Oper. Res. Soc. (2020)
Keyphrases
- markov decision processes
- reinforcement learning
- reinforcement learning algorithms
- optimal policy
- state space
- dynamic programming
- policy iteration
- state and action spaces
- finite state
- model based reinforcement learning
- decision theoretic planning
- state abstraction
- partially observable
- finite horizon
- reachability analysis
- planning under uncertainty
- function approximation
- transition matrices
- action space
- decision processes
- action sets
- average reward
- markov decision process
- factored mdps
- reward function
- temporal difference
- model free
- total reward
- policy iteration algorithm
- machine learning
- control policy
- optimal strategy
- decision problems
- sufficient conditions
- markov chain
- learning algorithm