Biasing Approximate Dynamic Programming with a Lower Discount Factor.
Marek Petrik, Bruno Scherrer
Published in: NIPS (2008)
Keyphrases
- approximate dynamic programming
- linear program
- reinforcement learning
- dynamic programming
- step size
- average cost
- Markov decision processes
- Markov decision problems
- policy iteration
- linear programming
- optimal policy
- convergence rate
- action selection
- control policy
- learning rate
- long run
- decision making
- average reward
- control system
- least squares