It's All about Reward: Contrasting Joint Rewards and Individual Reward in Centralized Learning Decentralized Execution Algorithms.
Peter Atrazhev, Petr Musilek
Published in: Syst. (2023)
Keyphrases
- reinforcement learning
- bandit problems
- learning algorithm
- inverse reinforcement learning
- noise-tolerant
- decentralized decision making
- partially observable environments
- multi-armed bandits
- Markov decision processes
- optimal policy
- eligibility traces
- automatically learned
- reinforcement learning methods
- reward function
- learning models
- multi-agent
- cooperative
- online learning
- optimization problems
- theoretical analysis
- significant improvement
- supervised learning
- machine learning
- discounted reward
- computational complexity
- prior knowledge
- active learning
- policy gradient
- peer-to-peer
- inductive inference
- distributed environment
- learning tasks
- computationally efficient
- neural network
- machine learning algorithms