Restricted gradient-descent algorithm for value-function approximation in reinforcement learning.
André da Motta Salles Barreto, Charles W. Anderson
Published in: Artif. Intell. (2008)
Keyphrases
- dynamic programming
- learning algorithm
- reinforcement learning
- objective function
- preprocessing
- computational cost
- times faster
- detection algorithm
- experimental evaluation
- cost function
- search space
- state space
- NP-hard
- significant improvement
- worst case
- simulated annealing
- optimization algorithm
- computational complexity
- temporal difference learning
- neural network
- high accuracy
- particle swarm optimization
- optimal policy
- approximate dynamic programming