Reducing policy degradation in neuro-dynamic programming.
Thomas Gabel, Martin A. Riedmiller
Published in: ESANN (2006)
Keyphrases
- dynamic programming
- optimal policy
- infinite horizon
- Markov decision problems
- state space
- policy search
- linear programming
- artificial neural networks
- Markov decision process
- single machine
- policy making
- Markov decision processes
- partially observable Markov decision processes
- neural network
- neuro-fuzzy
- significantly reduced
- greedy algorithm
- optimal control
- decision problems
- stereo matching
- reinforcement learning
- finite state
- information technology
- machine learning