A reward field model generation in Q-learning by dynamic programming.
Yunsick Sung, Kyungeun Cho, Kyhyun Um
Published in: ICIS (2009)
Keyphrases
- dynamic programming
- reinforcement learning
- state space
- function approximation
- optimal policy
- single machine
- multi agent
- cooperative
- Markov decision processes
- reward function
- linear programming
- learning agent
- dp matching
- decision making
- discounted reward
- eligibility traces
- Markov decision problems
- coarse to fine
- greedy algorithm
- stereo matching