Episode-Experience Replay Based Tree-Backup Method for Off-Policy Actor-Critic Algorithm.
Haobo Jiang, Jianjun Qian, Jin Xie, Jian Yang. Published in: PRCV (1) (2018)
Keyphrases
- cost function
- dynamic programming
- clustering method
- optimization algorithm
- computational complexity
- k-means
- objective function
- optimization method
- actor critic
- gradient method
- mathematical model
- support vector machine (SVM)
- reinforcement learning
- learning algorithm
- optimal solution
- convergence rate
- recursive least squares
- linear programming
- search space
- NP-hard
- evolutionary algorithm
- basis functions
- Kalman filter
- non-negative matrix factorization
- multi agent