PIRANHA: Policy iteration for recurrent artificial neural networks with hidden activities.
István Szita, András Lörincz
Published in: Neurocomputing (2006)
Keyphrases
- policy iteration
- artificial neural networks
- Markov decision processes
- model free
- optimal policy
- fixed point
- reinforcement learning
- recurrent neural networks
- neural network
- least squares
- sample path
- finite state
- Markov decision process
- temporal difference
- infinite horizon
- Markov decision problems
- optimal control
- linear programming
- policy evaluation
- convergence rate
- average cost
- state space
- genetic algorithm
- radial basis function
- sufficient conditions
- average reward