Optimization of music education strategy guided by the temporal-difference reinforcement learning algorithm.
Yingwei Su, Yuan Wang
Published in: Soft Comput. (2024)
Keyphrases
- temporal difference
- TD learning
- reinforcement learning
- function approximation
- evaluation function
- global optimization
- temporal difference learning
- reinforcement learning algorithms
- Monte Carlo
- optimization algorithm
- model-free
- action selection
- temporal difference methods
- policy evaluation
- supervised learning
- training set
- e-learning
- step size
- function approximators
- search space
- feature selection