Multi-step Off-policy Learning Without Importance Sampling Ratios.
Ashique Rupam Mahmood
Huizhen Yu
Richard S. Sutton
Published in:
CoRR (2017)
Keyphrases
multi step
importance sampling
learning process
learning algorithm
neural network
objective function
active learning
monte carlo
similarity measure
information extraction
supervised learning
linear combination