Geometric Value Iteration: Dynamic Error-Aware KL Regularization for Reinforcement Learning.
Toshinori Kitamura
Lingwei Zhu
Takamitsu Matsubara
Published in:
ACML (2021)
Keyphrases
reinforcement learning
markov decision processes
state space
optimal policy
error rate
dynamic programming
markov decision process
heuristic search
dynamic environments
semi supervised
reinforcement learning algorithms
kl divergence
kullback leibler
markov decision chains