Reward Advancement: Transforming Policy under Maximum Causal Entropy Principle.

Guojun Wu, Yanhua Li, Zhenming Liu, Jie Bao, Yu Zheng, Jieping Ye, Jun Luo

Published in: CoRR (2019)

Keyphrases

inverse reinforcement learning
partially observable environments
reward function
policy gradient
average reward
reinforcement learning
optimal policy
expected reward
information theory
information entropy
total reward
causal relationships
control policy
bayesian networks
information theoretic
causal relations
causal networks
agent receives
causal discovery
causal reasoning
internet technology
causal knowledge
long run
causal models