Multi-policy iteration with a distributed voting.
Hyeong Soo Chang
Published in: Math. Methods Oper. Res. (2004)
Keyphrases
- policy iteration
- Markov decision processes
- fixed point
- model free
- reinforcement learning
- least squares
- infinite horizon
- optimal policy
- multi agent
- optimal control
- search algorithm
- average reward
- finite state
- sample path
- Markov decision problems
- temporal difference
- convergence rate
- probability distribution
- cost function