Playing games with scenario- and resource-aware SDF graphs through policy iteration.
Yang Yang, Marc Geilen, Twan Basten, Sander Stuijk, Henk Corporaal
Published in: DATE (2012)
Keyphrases
- policy iteration
- Markov decision processes
- playing games
- least squares
- model free
- reinforcement learning
- fixed point
- optimal policy
- sample path
- temporal difference
- game design
- infinite horizon
- Markov decision process
- finite state
- policy evaluation
- average reward
- linear programming
- optimal control
- convergence rate
- state space
- Markov chain
- Markov decision problems
- search algorithm