Multiple-Step Greedy Policies in Online and Approximate Reinforcement Learning.
Yonathan Efroni
Gal Dalal
Bruno Scherrer
Shie Mannor
Published in:
CoRR (2018)
Keyphrases
reinforcement learning
optimal policy
function approximation
machine learning
search algorithm
real time
multi step
dynamic programming
state space
post processing
Markov decision process
multi agent
least squares
greedy algorithm
policy search
hierarchical reinforcement learning