Efficient Evaluation of Natural Stochastic Policies in Offline Reinforcement Learning.
Nathan Kallus
Masatoshi Uehara
Published in:
CoRR (2020)
Keyphrases
reinforcement learning
optimal policy
cost effective
learning automata
control policies
direct policy search
sufficient conditions
markov decision processes
reward function
stochastic approximation
learning process
dynamic programming
state space
evaluation method
policy search