POTEC: Off-Policy Learning for Large Action Spaces via Two-Stage Policy Decomposition.
Yuta Saito
Jihan Yao
Thorsten Joachims
Published in:
CoRR (2024)
Keyphrases
learning algorithm
action space
reinforcement learning
supervised learning
prior knowledge
state action
state and action spaces
action selection
pairwise
probability distribution
state space
domain independent
partially observable
continuous state
reinforcement learning problems
skill learning