Reinforcement Learning from Imperfect Demonstrations under Soft Expert Guidance.
Mingxuan Jing, Xiaojian Ma, Wenbing Huang, Fuchun Sun, Chao Yang, Bin Fang, Huaping Liu
Published in: CoRR (2019)
Keyphrases
- reinforcement learning
- function approximation
- machine learning
- learning algorithm
- temporal difference
- model free
- state space
- robotic control
- domain experts
- optimal policy
- reinforcement learning algorithms
- multi agent reinforcement learning
- knowledge acquisition
- dynamic programming
- optimal control
- domain knowledge
- search algorithm
- imperfect information
- temporal difference learning
- data sets