Re-attentive experience replay in off-policy reinforcement learning.
Wei Wei, Da Wang, Lin Li, Jiye Liang
Published in: Mach. Learn. (2024)
Keyphrases
- reinforcement learning
- function approximation
- learning algorithm
- markov decision processes
- machine learning
- optimal policy
- model free
- temporal difference learning
- learning process
- reinforcement learning algorithms
- dynamic programming
- state space
- temporal difference
- partially observable
- user experience
- learning capabilities