MDPo: Offline Reinforcement Learning Based on Mixture Density Policy Network.
Chen Liu, Yizhuo Wang
Published in: GAIIS (2024)
Keyphrases
- reinforcement learning
- optimal policy
- computer networks
- network structure
- reinforcement learning problems
- real time
- policy search
- network traffic
- network model
- peer to peer
- complex networks
- function approximation
- markov decision process
- action selection
- reinforcement learning algorithms
- partially observable domains
- reward function
- temporal difference
- model free
- dynamic programming
- wireless sensor networks
- social networks