Login / Signup
f-Policy Gradients: A General Framework for Goal-Conditioned RL using f-Divergences.
Siddhant Agarwal
Ishan Durugkar
Peter Stone
Amy Zhang
Published in:
NeurIPS (2023)
Keyphrases
</>
optimal policy
reinforcement learning
function approximation
markov decision process
action selection
policy gradient
neural network
control policy
agent learns
reward signal
policy search