Deep Reinforcement Learning-Assisted Age-optimal Transmission Policy for HARQ-aided NOMA Networks.
Kunpeng Liu, Aimin Li, Shaohua Wu
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- optimal policy
- control policy
- dynamic programming
- optimal control
- control policies
- policy search
- approximate dynamic programming
- expected cost
- finite horizon
- total reward
- state space
- Markov decision process
- partially observable
- state and action spaces
- worst case
- Markov decision processes
- function approximation
- action selection
- state dependent
- action space
- average reward
- network coding
- continuous state spaces
- transition model
- learning algorithm
- asymptotically optimal
- allocation policy
- cellular networks
- partially observable environments