Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation.
Yuhuai Wu
Elman Mansimov
Shun Liao
Roger B. Grosse
Jimmy Ba
Published in:
CoRR (2017)
Keyphrases
trust region
dynamic programming
reinforcement learning
log likelihood
cost function
objective function
state space
particle swarm optimization
em algorithm
optimization method
approximation algorithms