Diffusion Policies creating a Trust Region for Offline Reinforcement Learning.
Tianyu Chen, Zhendong Wang, Mingyuan Zhou
Published in: CoRR (2024)
Keyphrases
- trust region
- reinforcement learning
- optimal policy
- optimization methods
- global optimum
- column generation
- Newton method
- state space
- log likelihood
- Hessian matrix
- function approximation
- Levenberg-Marquardt
- dynamic programming
- anisotropic diffusion
- line search
- feature selection
- mean shift
- temporal difference
- genetic algorithm
- search algorithm
- step size
- machine learning
- simulated annealing
- least squares
- support vector machine