An Evolutionary Random Policy Search Algorithm for Solving Markov Decision Processes.
Jiaqiao Hu, Michael C. Fu, Vahid Reza Ramezani, Steven I. Marcus
Published in: INFORMS J. Comput. (2007)
Keyphrases
- markov decision processes
- optimal policy
- policy iteration
- search algorithm for solving
- markov decision process
- average reward
- infinite horizon
- state space
- finite horizon
- reward function
- partially observable
- decision processes
- average cost
- action space
- reinforcement learning
- finite state
- dynamic programming
- transition matrices
- state and action spaces
- total reward
- search algorithm
- policy evaluation
- reinforcement learning algorithms
- discounted reward
- decision theoretic planning
- long run
- control policies
- partially observable markov decision processes
- decision problems
- lower bound
- planning under uncertainty
- model based reinforcement learning
- reachability analysis
- markov decision problems
- factored mdps
- stationary policies
- sufficient conditions
- expected reward
- least squares
- risk sensitive
- tabu search
- action selection