Distributional Reward Decomposition for Reinforcement Learning.
Zichuan Lin, Li Zhao, Derek Yang, Tao Qin, Tie-Yan Liu, Guangwen Yang
Published in: NeurIPS (2019)
Keyphrases
- reinforcement learning
- function approximation
- state space
- Markov decision processes
- eligibility traces
- reinforcement learning algorithms
- dynamic programming
- learning algorithm
- decomposition method
- model-free
- partially observable environments
- optimal policy
- machine learning
- multi-agent
- reward function
- co-occurrence
- supervised learning
- temporal difference
- action selection
- optimal control
- learning problems
- temporal difference learning
- decomposition methods
- Monte Carlo
- long run
- robotic control
- hierarchical decomposition
- image decomposition
- control policy
- decomposition algorithm
- Markov decision process
- partially observable
- learning process