Reward Advancement: Transforming Policy under Maximum Causal Entropy Principle.
Guojun Wu, Yanhua Li, Zhenming Liu, Jie Bao, Yu Zheng, Jieping Ye, Jun Luo
Published in: CoRR (2019)
Keyphrases
- inverse reinforcement learning
- partially observable environments
- reward function
- policy gradient
- average reward
- reinforcement learning
- optimal policy
- expected reward
- information theory
- information entropy
- total reward
- causal relationships
- control policy
- bayesian networks
- information theoretic
- causal relations
- causal networks
- agent receives
- causal discovery
- causal reasoning
- internet technology
- causal knowledge
- long run
- causal models