Learning Optimal Policies in Mean Field Models with Kullback-Leibler Regularization.
Ana Busic
Sean P. Meyn
Neil Cammardella
Published in:
CDC (2023)
Keyphrases
optimal policy
reinforcement learning
Markov decision processes
Kullback-Leibler
prior knowledge
probabilistic model
learning algorithm
dynamic programming
state space
KL divergence
machine learning
distance measure
hidden variables
cross entropy
average reward reinforcement learning