Time-Varying Weights in Multi-Reward Architecture for Deep Reinforcement Learning.
Meng Xu, Xinhong Chen, Yechao She, Yang Jin, Jianping Wang
Published in: IEEE Trans. Emerg. Top. Comput. Intell. (2024)
Keyphrases
- reinforcement learning
- function approximation
- learning capabilities
- state space
- model free
- real time
- reward function
- control policy
- relative importance
- learning agent
- optimal policy
- dynamic programming
- learning process
- multi agent
- machine learning
- partially observable environments
- reinforcement learning algorithms
- network architecture
- supervised learning
- management system
- weighting scheme
- weighted sum
- optimal control
- Markov decision process
- Markov decision processes
- function approximators
- state action
- robotic control