Sigmoidally Preconditioned Off-policy Learning: a new exploration method for reinforcement learning.
Xing Chen
Dongcui Diao
Hechang Chen
Hengshuai Yao
Jielong Yang
Haiyin Piao
Zhixiao Sun
Bei Jiang
Yi Chang
Published in:
CoRR (2022)
Keyphrases
reinforcement learning
prior knowledge
high accuracy
learning process
learning mechanism
active learning
learning algorithm
unsupervised learning
dynamic programming
objective function
similarity measure
significant improvement
cost function
multi agent
policy search
autonomous learning
detection method
markov decision processes
model free
machine learning
learning capabilities
active exploration
action selection
online learning
least squares
state space
pairwise