Deep reinforcement learning using least-squares truncated temporal-difference.
Junkai Ren, Yixing Lan, Xin Xu, Yichuan Zhang, Qiang Fang, Yujun Zeng
Published in: CAAI Trans. Intell. Technol. (2024)
Keyphrases
- temporal difference
- least squares
- reinforcement learning
- policy evaluation
- function approximation
- TD learning
- policy iteration
- reinforcement learning algorithms
- evaluation function
- model free
- temporal difference learning
- Monte Carlo
- action selection
- step size
- state space
- actor critic
- temporal difference methods
- function approximators
- optical flow
- learning process
- optimal policy
- optimal control
- Markov decision processes
- linear combination
- multi agent
- machine learning
- transfer learning
- TD methods