Balanced prioritized experience replay in off-policy reinforcement learning.
Zhouwei Lou, Yiye Wang, Shuo Shan, Kanjian Zhang, Haikun Wei
Published in: Neural Comput. Appl. (2024)
Keyphrases
- reinforcement learning
- state space
- function approximation
- machine learning
- learning algorithm
- neural network
- reinforcement learning algorithms
- Markov decision processes
- decision making
- artificial intelligence
- real world
- case study
- data mining
- learning classifier systems
- action selection
- temporal difference
- databases
- partially observable
- possibilistic logic
- temporal difference learning
- robotic control