rl4dtn: Q-Learning for Opportunistic Networks.
Jorge Visca, Javier Baliosian
Published in: Future Internet (2022)
Keyphrases
- reinforcement learning
- function approximation
- delay tolerant
- reinforcement learning algorithms
- optimal policy
- model free
- state space
- multi agent
- learning algorithm
- action selection
- temporal difference learning
- multi agent reinforcement learning
- Markov decision processes
- cooperative
- social networks
- network structure
- policy iteration
- learning agent
- stochastic approximation
- hierarchical reinforcement learning
- learning rate
- end to end delay