Policy Search using Paired Comparisons.
Malcolm J. A. Strens
Andrew W. Moore
Published in:
J. Mach. Learn. Res. (2002)
Keyphrases
policy search
reinforcement learning
continuous state
dynamic programming
reinforcement learning algorithms
continuous action
reward function
search algorithm
multi-agent systems
state space
Markov decision problems
policy gradient