Adaptive maximum-lifetime routing in mobile ad-hoc networks using temporal difference reinforcement learning.
Saloua Chettibi, Salim Chikhi
Published in: Evol. Syst. (2014)
Keyphrases
- temporal difference
- reinforcement learning
- maximum lifetime
- function approximation
- TD learning
- model free
- reinforcement learning algorithms
- actor critic
- temporal difference learning
- wireless ad hoc networks
- evaluation function
- data gathering
- policy evaluation
- step size
- Monte Carlo
- function approximators
- action selection
- wireless sensor networks
- state space
- mobile ad hoc networks
- ad hoc networks
- policy iteration
- end to end
- base station
- least squares
- routing algorithm
- temporal difference methods
- learning algorithm
- transfer learning
- Markov decision processes
- supervised learning
- sensor networks
- neural network