Intrinsic fluctuations of reinforcement learning promote cooperation.
Wolfram Barfuss, Janusz Meylahn
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- multi agent
- cooperative
- function approximation
- learning algorithm
- state space
- model free
- learning process
- machine learning
- markov decision processes
- function approximators
- reinforcement learning algorithms
- action selection
- optimal control
- multi agent systems
- data sets
- learning problems
- transfer learning
- optimal policy
- temporal difference
- temporal difference learning
- information sharing
- information exchange
- dynamic programming
- control system
- learning capabilities
- real robot
- multi agent reinforcement learning
- direct policy search