Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity.

Yan Yang, Bin Gao, Ya-xiang Yuan

Published in: CoRR (2024)

Keyphrases

lower level
higher level
reinforcement learning
upper level
low level
high level
optimality conditions
development process
case study
information processing
machine learning
software engineering
function approximation
data mining
learning algorithm
edge detection