Introspective Q-learning and learning from demonstration.
Mao Li, Tim Brys, Daniel Kudenko
Published in: Knowl. Eng. Rev. (2019)
Keyphrases
- reinforcement learning
- cooperative
- function approximation
- state space
- model free
- action selection
- stochastic approximation
- learning algorithm
- multi agent
- multi agent reinforcement learning
- optimal policy
- learning rate
- bucket brigade
- reinforcement learning algorithms
- potential field
- credit assignment
- temporal difference learning
- dynamic programming
- TD learning
- data sets
- single agent
- dynamic environments
- genetic algorithm
- machine learning