UDQL: Bridging The Gap between MSE Loss and The Optimal Value Function in Offline Reinforcement Learning.
Yu Zhang, Rui Yu, Zhipeng Yao, Wenyuan Zhang, Jun Wang, Liming Zhang
Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- control policy
- optimal control
- dynamic programming
- piecewise linear
- function approximators
- optimality criterion
- machine learning
- optimal solution
- image quality
- learning algorithm
- optimal policy
- scaling factors
- real time
- Markov decision processes
- temporal difference learning
- optimal design
- model free
- optimal strategy
- rate distortion
- vector quantization
- neural network