Learning RL-Policies for Joint Beamforming Without Exploration: A Batch Constrained Off-Policy Approach.
Heasung Kim, Sravan Ankireddy
Published in: CoRR (2023)
Keyphrases
- learning process
- reinforcement learning
- learning algorithm
- autonomous learning
- active learning
- learning systems
- knowledge acquisition
- machine learning
- supervised learning
- optimal policy
- learning tasks
- action selection
- learning agents
- action models
- reinforcement learning agents
- exploration exploitation
- exploration exploitation tradeoff