On the Reuse Bias in Off-Policy Reinforcement Learning.
Chengyang Ying, Zhongkai Hao, Xinning Zhou, Hang Su, Dong Yan, Jun Zhu
Published in: IJCAI (2023)
Keyphrases
- reinforcement learning
- function approximation
- learning algorithm
- learning objects
- software reuse
- reinforcement learning algorithms
- temporal difference
- machine learning
- learning process
- state space
- multi agent reinforcement learning
- robotic control
- learning problems
- multi agent
- temporal difference learning
- learning classifier systems
- direct policy search
- reinforcement learning methods
- fitted q iteration
- function approximators
- robot control
- action selection
- real time
- knowledge management
- trade off
- data mining