Model-Free Temporal Difference Learning for Non-Zero-Sum Games.
Liming Wang, Yongliang Yang, Dawei Ding, Yixin Yin, Zhishan Guo, Donald C. Wunsch
Published in: IJCNN (2019)
Keyphrases
- temporal difference learning
- model-free
- function approximation
- temporal difference
- reinforcement learning algorithms
- reinforcement learning
- policy iteration
- radial basis function
- function approximators
- fixed point
- optimal policy
- learning tasks
- Markov chain
- dynamic programming
- learning process
- Markov decision process
- learning algorithm