Reinforcement Learning for the N-Persons Iterated Prisoners' Dilemma.
Juan Enrique Agudo, Colin Fyfe
Published in: CIS (2011)
Keyphrases
- reinforcement learning
- exploration exploitation dilemma
- function approximation
- optimal policy
- robotic control
- state space
- reinforcement learning algorithms
- Markov decision processes
- temporal difference
- machine learning
- direct policy search
- control problems
- model free
- multi agent
- relational reinforcement learning
- neural network
- reward function
- transfer learning
- artificial intelligence
- action space
- reinforcement learning methods
- dynamic programming
- decision making