Learning RL-Policies for Joint Beamforming Without Exploration: A Batch Constrained Off-Policy Approach.

Heasung Kim, Sravan Ankireddy

Published in: CoRR (2023)

Keyphrases

learning process
reinforcement learning
learning algorithm
autonomous learning
active learning
learning systems
knowledge acquisition
machine learning
supervised learning
optimal policy
learning tasks
action selection
learning agents
action models
reinforcement learning agents
exploration exploitation
exploration exploitation tradeoff