Efficient Massive-Device Orchestration Through Reinforcement Learning With Boosted Deep Deterministic Policy Gradient.
Haowei Shi, Jiadao Zou, Qingxue Zhang
Published in: IEEE Internet Things J. (2024)
Keyphrases
- reinforcement learning
- policy gradient
- function approximation
- reinforcement learning algorithms
- actor critic
- state space
- optimal control
- temporal difference
- cost function
- learning problems
- optimal policy
- learning algorithm
- learning capabilities
- function approximators
- approximation methods
- temporal difference learning
- policy search
- policy gradient methods