Episodic Return Decomposition by Difference of Implicitly Assigned Sub-trajectory Reward.

Haoxin Lin, Hongqiu Wu, Jiaji Zhang, Yihao Sun, Junyin Ye, Yang Yu
Published in: AAAI (2024)
Keyphrases
  • reinforcement learning
  • decomposition method
  • neural network
  • long run
  • trajectory data
  • hierarchical decomposition
  • real time
  • social networks
  • spatio temporal
  • reward function