A novel multi-step reinforcement learning method for solving reward hacking.
Yinlong Yuan
Zhu Liang Yu
Zhenghui Gu
Xiaoyan Deng
Yuanqing Li
Published in:
Appl. Intell. (2019)
Keyphrases
multi step
reinforcement learning
dynamic programming
combinatorial optimization
pairwise
covariance matrix
single step
data sets
machine learning
objective function
support vector machine
mutual information
function approximation