Sigmoidally Preconditioned Off-policy Learning: a new exploration method for reinforcement learning.
Xing Chen, Dongcui Diao, Hechang Chen, Hengshuai Yao, Jielong Yang, Haiyin Piao, Zhixiao Sun, Bei Jiang, Yi Chang
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- prior knowledge
- high accuracy
- learning process
- learning mechanism
- active learning
- learning algorithm
- unsupervised learning
- dynamic programming
- objective function
- similarity measure
- significant improvement
- cost function
- multi-agent
- policy search
- autonomous learning
- detection method
- Markov decision processes
- model-free
- machine learning
- learning capabilities
- active exploration
- action selection
- online learning
- least squares
- state space
- pairwise