Discrete-Time Multi-Player Games Based on Off-Policy Q-Learning.
Jinna Li, Zhenfei Xiao, Ping Li
Published in: IEEE Access (2019)
Keyphrases
- reinforcement learning
- state space
- Markov chain
- function approximation
- cooperative
- multi agent
- reinforcement learning algorithms
- learning algorithm
- action selection
- multi agent reinforcement learning
- Markov processes
- finite state
- model free
- bucket brigade
- stochastic approximation
- learning rate
- temporal difference learning
- potential field
- optimal policy
- data sets
- reinforcement learning methods
- genetic algorithm
- neural network
- continuous state spaces
- hierarchical reinforcement learning
- relational reinforcement learning