Dynamic Weights and Prior Reward in Policy Fusion for Compound Agent Learning.
Meng Xu, Yechao She, Yang Jin, Jianping Wang
Published in: ACM Trans. Intell. Syst. Technol. (2023)
Keyphrases
- prior knowledge
- inverse reinforcement learning
- reinforcement learning
- learning algorithm
- learning process
- action selection
- online learning
- dynamic environments
- multi agent
- partially observable environments
- intelligent agents
- learning systems
- active learning
- multiagent systems
- solving problems
- state space
- reward function
- partially observable
- learning agent
- eligibility traces
- multi agent systems