Exploring selfish reinforcement learning in repeated games with stochastic rewards.
Katja Verbeeck, Ann Nowé, Johan Parent, Karl Tuyls
Published in: Auton. Agents Multi Agent Syst. (2007)
Keyphrases
- reinforcement learning
- repeated games
- nash equilibrium
- stochastic games
- average reward
- learning automata
- markov decision processes
- incomplete information
- function approximation
- reinforcement learning algorithms
- game theoretic
- game theory
- state space
- model free
- monte carlo
- learning algorithm
- temporal difference
- partially observable
- multi agent
- reward shaping
- optimal policy
- dynamic programming
- reward function
- action selection
- supervised learning
- learning agent
- policy iteration
- worst case
- resource allocation
- optimal control
- partially observable markov decision processes
- markov chain
- learning capabilities
- control policy
- np hard
- temporal difference learning
- special case