DARA: Dynamics-Aware Reward Augmentation in Offline Reinforcement Learning.
Jinxin Liu, Hongyin Zhang, Donglin Wang
Published in: ICLR (2022)
Keyphrases
- reinforcement learning
- function approximation
- multi relational
- eligibility traces
- state space
- reinforcement learning algorithms
- Markov decision processes
- temporal difference
- dynamic model
- partially observable
- partially observable environments
- reward function
- optimal policy
- learning problems
- learning process
- machine learning
- real time
- supervised learning
- model free
- dynamic programming
- multi agent
- learning agent
- reinforcement learning methods
- total reward
- reward shaping
- learning algorithm
- policy gradient