Backprop-MPDM: Faster Risk-Aware Policy Evaluation Through Efficient Gradient Optimization.
Dhanvin Mehta
Gonzalo Ferrer
Edwin Olson
Published in:
ICRA (2018)
Keyphrases
policy evaluation
least squares
neural network
machine learning
multi agent
evolutionary algorithm
cost function
graphical models
monte carlo
decision problems
fixed point
learning tasks
function approximation
temporal difference