Login / Signup
Diversity-Inducing Policy Gradient: Using Maximum Mean Discrepancy to Find a Set of Diverse Policies.
Muhammad A. Masood
Finale Doshi-Velez
Published in:
CoRR (2019)
Keyphrases
</>
np hard
policy gradient
machine learning
simulated annealing
markov chain
matrix factorization
function approximation