Expected Policy Gradients.
Kamil Ciosek
Shimon Whiteson
Published in:
CoRR (2017)
Keyphrases
optimal policy
data mining
machine learning
reinforcement learning
opportunity cost
search engine
state space
sufficient conditions
Markov decision processes
total reward