Integrating Temporal Difference Methods and Self-Organizing Neural Networks for Reinforcement Learning With Delayed Evaluative Feedback.
Ah-Hwee Tan, N. Lu, Dan Xiao
Published in: IEEE Trans. Neural Networks (2008)
Keyphrases
- temporal difference methods
- reinforcement learning
- function approximation
- temporal difference
- function approximators
- policy search
- evolutionary methods
- reinforcement learning problems
- policy evaluation
- reinforcement learning algorithms
- TD learning
- evaluation function
- evolutionary computation
- TD methods
- evolutionary algorithm
- Monte Carlo
- model free
- learning process
- policy iteration
- action selection
- dynamic programming
- machine learning
- Markov decision problems
- Markov decision processes
- optimal policy
- computational intelligence
- least squares
- multi-agent
- learning algorithm