Balanced Q-learning: Combining the Influence of Optimistic and Pessimistic Targets.
Thommen George Karimpanal, Hung Le, Majid Abdolshah, Santu Rana, Sunil Gupta, Truyen Tran, Svetha Venkatesh
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- state space
- multi agent
- cooperative
- learning algorithm
- function approximation
- combining multiple
- multiple targets
- learning rate
- dynamic programming
- social influence
- multi agent reinforcement learning
- prior studies
- potential field
- stochastic approximation
- multi target
- information systems
- factors influencing
- action selection
- model free
- target detection
- target tracking
- real time
- knowledge base
- artificial neural networks
- optimal policy