Reinforcement Learning Produces Dominant Strategies for the Iterated Prisoner's Dilemma.
Marc Harper, Vincent A. Knight, Martin Jones, Georgios Koutsovoulos, Nikoleta E. Glynatsi, Owen Campbell
Published in: CoRR (2017)
Keyphrases
- reinforcement learning
- databases
- online auctions
- function approximation
- Markov decision processes
- exploration exploitation dilemma
- learning agents
- selection strategies
- reinforcement learning algorithms
- temporal difference
- state space
- website
- real time
- optimal control
- least squares
- decision trees
- machine learning
- data sets